How to implement Navie String Searching Algorithm in Python

In this post, we will study about finding a pattern in the text. There will be the main text a substring. The goal is to find how many times at what positions the substring occurs in the text. This technique of pattern-finding helps when there is a huge text and we have to find the occurrences of some keywords or especially given words. Here, we will talk about the most basic ‘Naive String Matching Algorithm in Python’ and will further improvise it through better and shorter codes.

Naive Algorithms as the word ‘naive’ itself suggest algorithms that are very basic and simple to implement. These algorithms perform the most simple and obvious techniques to perform work just like how a child would. These methods are good to start with for the newbies before proceeding towards more efficient and complicated algorithms. A naive string searching algorithm is also one of them. It is the simplest method among other string matching/pattern-finding algorithms.

The method starts by matching the string letter by letter. It checks for the first character in the main text and the first character in the substring. If it matches it moves ahead checking the next character of both the strings. If at any place the characters don’t match the loop breaks and it starts again from the next character of the main text string.

Python code for Naive String matching algorithm

Below is the code for the Naive String matching algorithm.

def naive(txt,wrd):
    lt=len(txt)#length of the string
    lw=len(wrd)/3length of the substring(pattern)
    for i in range(lt-lw+1):
        j=0
        while(j<lw):
            if txt[i+j]==wrd[j]:
                j+=1
            else:
                break
        else:
            print('found at position',i)

In the code above the function, ‘naive’ takes two arguments txt(the main string from which the pattern is to searched) and ward(the pattern to be searched).  A loop is taken from 0 to (length of string-length of substring+1) since at least the substring’s length should be left to be matched towards the end.  Each character is extracted from the string through the ‘for’ loop(txt[i]). Then there is an inner while loop that matches that character with the subsequent character of the substring unless the entire substring is matched. If it is not found the loop breaks and the next iteration as in the very next character is taken out for the process. As soon as the entire substring is found the while condition gets false and the else part is executed and the position is printed.  It is to be noted carefully that one else is inside the loop which gets executed only when the if the condition is false whereas the other else gets executed when the while loop condition becomes False.

Let’s try the code for multiple inputs-

  1. String- “”AABAACAADAABAABA”
    Substring- “AABA”

    naive("AABAACAADAABAABA","AABA")
    
    

    Output-

    found at position 0
    found at position 9
    found at position 12
  2. String- “1011101110”
    Substring- “111”

    naive("1011101110","111")

    Output-

Both of the programs generate the same output as the previous one.

Also read: Using binarytree module in Python for Binary Tree

Leave a Reply

Your email address will not be published. Required fields are marked *