Abstract:C library does not have a string hash table, how to operate.

The C library does not have string hash table operations. While none of the exam questions have to use a C hash table (at least I can do it in regular C), I do need to be prepared. Here’s a typical question to try: leetcode-cn.com/problems/su…

Given a string s and some words of the same length. Find the starting position of the substring in S that happens to be formed by concatenating all the words in words.


Note that the substring must match exactly the words in words, without any other characters in the middle, but you do not need to consider the sequence in which words are concatenated.

Example: Enter s ="barfoothefoobarman",
  words = ["foo"."bar"] output: [0,9] Explanation: the substring starting from index 0 and 9 is"barfoo""foobar". The order of the outputs is not important, [9,0] is also a valid answer.Copy the code

This problem regardless of the programming language, using a hash table is easier, but if you use C, you can use your own hash table, which is more than enough to deal with this kind of problem.

Please refer to leetcode-cn.com/problems/su… I’m just going to show you the easiest way to construct a hash table.

The first is to select the hash function, here I use the djB2 algorithm, see www.cse.yorku.ca/~oz/hash.ht… The collision rate is fairly low, the distribution is balanced, and the implementation is simple, just two or three lines of code, remembering the key numbers (5381 and 33).

If you just want to have a good hash function, and cannot wait, djb2 is one of the best string hash functions i know. it has excellent distribution and speed on many different sets of keys and table sizes.

Language – code

unsigned long
hash(unsigned char *str)
{
    unsigned long hash = 5381;

    int c;

    while (c = *str++)

        hash = ((hash< < 5) +hash) + c; / *hash * 33 + c */

    return hash;

}Copy the code

With the string hash function, you can convert large strings into numbers, which in turn store information as subscripts (keys) of arrays. So what’s the size of the hash table? In general, the size should be larger than the number of data stores, such as a maximum of 100 data stores in the hash table must be larger than 100. For this question, the maximum number of words is about 300, and the average number of words is less than 50.

5 -> 110/173
10 -> 143/173
50 -> 170/173
100 -> 170/173
300 -> 172/173
400 -> 173/173Copy the code

Here’s my answer:

C code

// The maximum value of a string,hashTable size, estimation and the actual number of data#define MAXWORDCOUNT 1000


static int wordCount[MAXWORDCOUNT];

static int currWordCount[MAXWORDCOUNT];



// ref: http://www.cse.yorku.ca/~oz/hash.html


unsigned long DJBHash(const char* s, int len) {

    unsigned long hash= 5381; // Experience value,hashLow conflict probability, balanced distributionwhile (len--) {

        hash = (((hash< < 5) +hash) + *(s++)) % MAXWORDCOUNT; / *hash * 33 + c */


    }

    return hash;


}



int* findSubstring(char * s, char ** words, int wordsSize, int* returnSize){

    memset(wordCount, 0, sizeof(wordCount));


    *returnSize = 0;



    const int kSLen = strlen(s);

    if (kSLen == 0 || wordsSize == 0) returnNULL; const int kWordLen = strlen(words[0]); // Store the number of words in the hash table, key: word, value: number of wordsfor (int i = 0; i < wordsSize; ++i)


        ++wordCount[DJBHash(words[i], kWordLen)];



    int *result = malloc(sizeof(int) * kSLen);

    for (int i = 0; i < kWordLen; ++i) {


        for(int j = i; j + kWordLen * wordsSize <= kSLen; J += kWordLen) {// Count the words in the current windowfor(int k = (j == i ? 0 : wordsSize - 1); k < wordsSize; ++k) ++currWordCount[DJBHash(s + j + k * kWordLen, kWordLen)]; // Determine whether the two hash tables are equal, i.e. whether the words in the window exactly match the given dictionaryif (memcmp(wordCount, currWordCount, sizeof(wordCount)) == 0)

                result[(*returnSize)++] = j; --currWordCount[DJBHash(s + j, kWordLen)]; } // memset(currWordCount, 0, sizeof(currWordCount)); }return result;

}Copy the code


Click to follow, the first time to learn about Huawei cloud fresh technology ~