Consider the problem of implementing a spelling checker. It is
fairly easy to break a text file up into words, but how do you check
to see if all of the words are spelled correctly? Because comparing
strings can take a long time, even an O(log n) binary search of a decent-sized
dictionary is prohibitively slow. On the other hand, there isn't
nearly enough memory in the world to give each sequence of ten characters
or fewer its own bit.

The solution is to use a hash function, a function that takes
a string and returns an integer between 0 and m-1, where m is at least as
large as the number of strings in the dictionary but small enough that an
array of size m fits into memory. A good hash function sends all of the
strings in the dictionary to different integers. (Such a function
is called a perfect hash function. If, in addition, m is
equal to the size of the dictionary, it is called a minimal perfect
hash function.)

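
To make the idea concrete, here is a minimal sketch of how a spelling checker might use such a function: one bit per hash value, set for every dictionary word and tested for every word in the text. The names TABLE_BITS, toy_hash, add_word and maybe_in_dictionary are hypothetical and appear nowhere in the original; the table size is picked only for illustration, and toy_hash is a stand-in for whatever hash function is actually used.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical table size m, chosen only for illustration. */
#define TABLE_BITS 100003u

/* One bit per possible hash value, packed into bytes (zero-initialized). */
static unsigned char bit_table[TABLE_BITS / CHAR_BIT + 1];

/* Stand-in hash: maps a string to an integer in 0..TABLE_BITS-1. */
static unsigned int toy_hash(const char *s)
{
    unsigned int h = 0;
    while (*s != '\0')
        h = (h * 31u + (unsigned char)*s++) % TABLE_BITS;
    return h;
}

/* Record a dictionary word by setting the bit at its hash value. */
static void add_word(const char *word)
{
    unsigned int h;
    assert(word != NULL);
    h = toy_hash(word);
    bit_table[h / CHAR_BIT] |= (unsigned char)(1u << (h % CHAR_BIT));
}

/* Nonzero if the bit for this word's hash value is set.
 * With a perfect hash this is exact for dictionary words; in general
 * a misspelled word can still share a bit with a real word. */
static int maybe_in_dictionary(const char *word)
{
    unsigned int h;
    assert(word != NULL);
    h = toy_hash(word);
    return (int)((bit_table[h / CHAR_BIT] >> (h % CHAR_BIT)) & 1u);
}
```

Under these assumptions, loading the dictionary is one add_word call per entry, and checking the text is one maybe_in_dictionary call per word, with no string comparisons at all.
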
Here's an example hash function:
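/* StrTableSize is the table size m; it is assumed to be defined elsewhere. */
unsigned int StrHash(const char *lstr)
{
    unsigned int hval = 0;

    /* Fold each character into the running value, keeping it below StrTableSize. */
    while (*lstr != '\0')
        hval = (hval * 48 + 45 + (unsigned char)*lstr++) % StrTableSize;
    return hval;
}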
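
As a rough illustration of how this function behaves, the sketch below repeats the same recurrence with an assumed table size and adds a small helper for comparing two hash values. The value 1009 for StrTableSize and the helper name Collides are assumptions made for this example only; the text does not specify either.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed table size; the text does not give a value for StrTableSize. */
static const unsigned int StrTableSize = 1009u;

/* Same recurrence as the example, with each character read as unsigned char. */
static unsigned int StrHash(const char *lstr)
{
    unsigned int hval = 0;
    assert(lstr != NULL);
    while (*lstr != '\0')
        hval = (hval * 48u + 45u + (unsigned char)*lstr++) % StrTableSize;
    return hval;
}

/* Hypothetical helper: nonzero when the two strings hash to the same slot. */
static int Collides(const char *a, const char *b)
{
    return StrHash(a) == StrHash(b);
}
```

Every result lies between 0 and StrTableSize - 1, so it can be used directly as an array index. The function is not a perfect hash, though: in ASCII, 'q' and 'A' differ by exactly 48, the multiplier, so Collides("aq", "bA") is nonzero whatever the table size.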