# KMP Algorithm

Hello Everyone Today I will explain one of the standard pattern matching algorithms named as KMP(Knuth-Morris-Pratt) Algorithm. It will reduce the time complexity for pattern matching.

So let us first see the Naive approach and understand the problem with the approach.

# Naive Approach

Here the given string is “ABBAA” and we need to find the pattern “AA” in the given string. We will start from index 0 till the last and compare if the substring in string and the pattern matches or not, if yes we will save the index. But each time comparing take O(k) where k is the length of the pattern and we are from 0 to n-k+1 so time complexity will be O(k*(n-k+1)). So can we optimize the solution?

# KMP Approach

So what is the basic idea of KMP, it tries to avoid the backtrack of iterator on pattern again by observing prefix and suffix. the prefix of a string is let us for “ABC” the prefixes are “A”, “AB”, “ABC” and suffixes are “C”, “BC”, “ABC”.So it is checking if the beginning part of the pattern appears again anywhere else in the pattern.

Here we can see the prefix AB and ABC are repeating. So the question is what it will do after checking that? In KMP for the pattern string, we generate the Π or LPS(Longest Prefix Suffix ) array to use these checks.

Let us first create the Π table for the pattern “ABABD”

Here we can see the value in of character A of index 3 is 1 so why, because the prefix A matches the character A and same the same for the index 4 character value is 2 as AB till index B matches so that why.

so this is how we can create the lps table, so now let us see the code for the lps table.

so now let us see for string “ABABABD” how the lps table of pattern “ABABD” saves the backtracking of iterator in string. Let us assume iterator of string is i and of pattern is j. Initially value of both iterator are 0.

string “ABABABD” is compared in with “ABABD” whose lps is {0,0,1,2,0} so in the step str[i] being compared with pattern[j] and as they both are equal so both i and j will increment. it will happen till step 4.

In step 5 character is not matching so the now j will shift to the position in pattern whose value is equal to value in lps[j-1] which is 2 in this case, but if j is equal to zero than we have to increment i by 1.

Now here instead of going from the starting we are using prefix pattern to avoid backtracking of pattern iterator. Now the process of comparing will continue as same.

so in the final step as pattern is found the function will return with the message that pattern is present in the string , if they were not same the comparision will end and function would return not found.

Let us see the code of kmp algorithm

So this is how Kmp algorithm is better then navie algothim.

The worst time complexity of algorithm is almost O(n).

# You did it!

It’s undoubtedly a long explanation but the important thing is you reached to the end! Hope you have understand it well.

To know more about this algothim you can visit. Wikipidia

Thanks for giving your time to read this article.