KMP Algorithm

Hello Everyone Today I will explain one of the standard pattern matching algorithms named as KMP(Knuth-Morris-Pratt) Algorithm. It will reduce the time complexity for pattern matching.

So let us first see the Naive approach and understand the problem with the approach.

Naive Approach

Here the given string is “ABBAA” and we need to find the pattern “AA” in the given string. We will start from index 0 till the last and compare if the substring in string and the pattern matches or not, if yes we will save the index. But each time comparing take O(k) where k is the length of the pattern and we are from 0 to n-k+1 so time complexity will be O(k*(n-k+1)). So can we optimize the solution?

KMP Approach

So what is the basic idea of KMP, it tries to avoid the backtrack of iterator on pattern again by observing prefix and suffix. the prefix of a string is let us for “ABC” the prefixes are “A”, “AB”, “ABC” and suffixes are “C”, “BC”, “ABC”.So it is checking if the beginning part of the pattern appears again anywhere else in the pattern.

Here we can see the prefix AB and ABC are repeating. So the question is what it will do after checking that? In KMP for the pattern string, we generate the Π or LPS(Longest Prefix Suffix ) array to use these checks.

Let us first create the Π table for the pattern “ABABD”

LSB table of pattern

Here we can see the value in of character A of index 3 is 1 so why, because the prefix A matches the character A and same the same for the index 4 character value is 2 as AB till index B matches so that why.

so this is how we can create the lps table, so now let us see the code for the lps table.

so now let us see for string “ABABABD” how the lps table of pattern “ABABD” saves the backtracking of iterator in string. Let us assume iterator of string is i and of pattern is j. Initially value of both iterator are 0.

string “ABABABD” is compared in with “ABABD” whose lps is {0,0,1,2,0} so in the step str[i] being compared with pattern[j] and as they both are equal so both i and j will increment. it will happen till step 4.

In step 5 character is not matching so the now j will shift to the position in pattern whose value is equal to value in lps[j-1] which is 2 in this case, but if j is equal to zero than we have to increment i by 1.

Now here instead of going from the starting we are using prefix pattern to avoid backtracking of pattern iterator. Now the process of comparing will continue as same.

so in the final step as pattern is found the function will return with the message that pattern is present in the string , if they were not same the comparision will end and function would return not found.

Let us see the code of kmp algorithm

So this is how Kmp algorithm is better then navie algothim.

The worst time complexity of algorithm is almost O(n).

You did it!

It’s undoubtedly a long explanation but the important thing is you reached to the end! Hope you have understand it well.

To know more about this algothim you can visit. Wikipidia

Thanks for giving your time to read this article.

--

--

--

Hey reader I am Mayank Mittal 3rd year IIT Dharwad guy, Enjoying my collage life while learning new things daily.

Love podcasts or audiobooks? Learn on the go with our new app.

Recommended from Medium

Top 10 GitHub Repos To Bookmark Right Now

Top 10 GitHub Repos Every Developer Should Know | Programming | Technology | GitHub 2021

Go Lang Interfaces -1

Deploy static website to AWS with HTTPS — Using S3, Route 53, CloudFront, Certificate Manager

Breaking Down MACH Architecture and Its Principles

Gopter: Property Based Testing in Golang

Azure DevOps

Best of the Week — April 26/May 2

CI/CD Pipeline With github,Jenkins & kubernetes

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
Mayank Mittal

Mayank Mittal

Hey reader I am Mayank Mittal 3rd year IIT Dharwad guy, Enjoying my collage life while learning new things daily.

More from Medium

Leetcode 142. Linked List Cycle II

Which OOPL to choose?

Using Tail Recursion in Binary Tree Pre Order Traversal

Cracking the Coding Interview Book Problem Series: Problem 7