## Summary

Rolling hash is a common technique to compare if two substrings are equal or not. Given `O(n)`

preprocessing time, the comparison becomes `O(1)`

at the best case.

## OJ

Trick | Difficulty | |
---|---|---|

LC 1316. Distinct Echo Substrings | Rolling Hash + string_view or DP in Linear Structure + string_view | 6 points |

## Details

We use two large coprime numbers for base and for mod. We use a polynomial to calculate the hash value. Typical values are and

Given a substring `s[i, j]`

, it is not difficult to get that the hash value of it is,

We could preprocess tables for all the and in `O(n)`

. After that, the complexity of getting a hash value for any substring is `O(1)`

.

We then have the following truth,

```
int b = 31;
int m = 1e9 + 9;
vector<int> hash;
vector<int> bases;
void preprocess(string_view text) {
ll sum = 0;
ll base = 1;
for (auto ch : text) {
sum += (ch - '0') * base;
sum %= m;
hash.push_back(sum);
bases.push_back(base);
base = (base * b) % m;
}
}
ll get_hash(int left, int right) {
return (hash[right] - ((left >= 1) ? hash[left - 1] : 0) + m) % m;
}
// s[i, i + len - 1], s[j, j + len - 1]
bool equal(int i, int j, int len, string_view const &text) {
int hash0 = (get_hash(i, i + len - 1) * bases[j]) % m, hash1 = (get_hash(j, j + len - 1) * bases[i]) % m;
if (hash0 != hash1) {
return false;
} else {
return text.substr(i, len) == text.substr(j, len);
}
}
```

### Two Hash Functions

In LC 1044. Longest Duplicate Substring, you might find out one hash rolling hash function causes collisions. Another technique for rolling hash is to use Two hash functions (two bases and two mods).

Use `f_1 ^ f_2`

directly here will cause collisions. Instead, we use a hash set of pairs to store these two values. HashSet in the standard library helps us handle hash collisions, which is sufficient to pass the OJ.

```
struct PairHash {
size_t operator() (pair<ll, ll> const &p) const {
return hash<int>()(p.first) ^ hash<int>()(p.second);
}
};
unordered_set<pair<int, int>, PairHash> visited;
```

Another technique you might use here is the **Modular** **Multiplicative Inverse**. Please refer to this post.