Longest Proper Prefix Suffix Array in C++ naive approach

Hello friends, today we are going to learn about how to construct the longest proper prefix suffix array (LPS array) in C++.

Though for many people, this topic may look completely random but it has a huge implementation in the very famous KMP algorithm. There are two solution approaches towards the problem, first one is the Naive approach and the second one is the efficient approach. Here we are going to learn about Naive approach.

For a string “VTRDF” the prefix are – “”, “V”, “VT”, “VTR”, “VTRD”, “VTRDF”
but proper prefix are – “”, “V”, “VT”, “VTR”, “VTRD” (length of proper prefix is less than length of given string)
similarly the suffix are – “”, “F”, “DF”, “RDF”, “TRDF”, “VTRDF”.

Let us see an example:

We are given a string text as input and we need to construct an LPS array.
Input : “abacabad”

Output : LPS array is  0 0 1 0 1 2 3 0
LPS[i] stores the length of the longest proper prefix suffix of text[0…i]
for text = “pqprpqps”
LPS[0] = 0 (because length of proper prefix of any single character string is always 0)
LPS[1] = 0 (for “pq” there is no prefix which is also a suffix.)
LPS[2] = 1 (for “pqp” longest prefix which is also a suffix =”p”)
LPS[3] = 0 (for “pqpr” there is no prefix which is also a suffix)
LPS[4] = 1 (for “pqprp” longest prefix which is also a suffix =”p”)
LPS[5] = 2 (for “pqprpq” longest prefix which is also a suffix =”pq”)
LPS[6] = 3 (for “pqprpqp” longest prefix which is also a suffix =”pqp”)
LPS[7] = 0 (for “pqprpqps” there is no prefix which is also a suffix)

Let us now see the code implementation.

C++ code to construct the longest proper prefix suffix array (LPS array)

The naive algorithm has O(n^3) time complexity where n is the length of the input text.

#include <bits/stdc++.h> 
using namespace std; 

vector<int> LPS(string txt){
    vector<int>v(txt.length(),0);
    for(int i=0;i<txt.length();i++){
        int n = i+1;                  // n = length of string str[0...i]
        int len = n-1;                // maximum possible length of LPS[i]
        
        // for each value of len we check whether it is correct for LPS[i] 
        for( ; len>0; len--){
            int flag = 1;
            for(int i=0; i<len; i++){
                if(txt[i] != txt[n-len+i]){
                    flag = 0;
                    break;
                }
            }
            if(flag == 1){
                v[i] = len;
                break;
            }
        }
    }
    return v;
}
 
int main() 
{ 
    string text = "pqprpqps";
    vector<int>lps = LPS(text);
    cout << "LPS array is ";
    for(int i=0;i<text.length();i++){
        cout<<lps[i]<<" ";
    }
    
    return 0; 
}

Output –

LPS array is 0 0 1 0 1 2 3 0

 

Understanding the algorithm

Let us consider an example for better understanding

Let the input text be “pqprpqps

i=0 => we can straightforward see lps[0] = 0.

i=1 => string being considered is “pq“. For string “pq” maximum possible length of lps is 1. So we set len = 1. Lps[i=1] can take any value between 1 to len. Hence we check for each value of len starting from 1 to 0 by use of for loop.

len =1 => we check  whether “p” (prefix of 1 character) is lps by comparing it with “q” (suffix of 1 characters). Hence we run another loop. Here we find that there is a mismatch at first character itself so we can say that Lps[i=1] cannot be 1 hence we break the loop and go to i=2.

……….
……….
……….

i = 5 => string being considered is “pqprpq“. For string “pqprpq” maximum possible length of lps is 5. So we set len = 5. Lps[i=5] can take any value between 1 to len. Hence we check for each value of len starting from 5 to 0 by use of for loop.

len = 5 => we check  whether “pqprp” (prefix of 5 characters) is lps by comparing it with “qprpq” (suffix of 5 characters). Hence we run another loop. Here we find that there is a mismatch at first character itself so we can say that Lps[i=5] cannot be 5 hence we break the loop and do len–.

len = 4 => we check whether “pqpr” (prefix of 4 characters) is lps by comparing it with “prpq” (suffix of 4 characters). We run a loop and find that there is a mismatch at the second character thus Lps[i=5]  cannot be 4, we break the loop.

len = 3 => we check whether “pqp” (prefix of 3 characters) is lps by comparing it with “rpq” (suffix of 3 characters). We run a loop and find that there is a mismatch at the first character thus Lps[i=5] cannot be 3, we break the loop.

len = 2 => we check whether “pq” (prefix of 2 characters) is lps by comparing it with “pq” (suffix of 3 characters). We run a loop and find that there is no mismatch hence we put Lps[i=5] = 2. Now we move onto i=6.

i = 6 => Same process is repeated as explained for i=5.

i = 7 => Same process is repeated as explained for i=5.

Hope this tutorial helped you. Thank You!

One response to “Longest Proper Prefix Suffix Array in C++ naive approach”

  1. Abdelrahman says:

    How can i do this code with complexity n2

Leave a Reply

Your email address will not be published. Required fields are marked *