Tutorial by apsdehal


June 29, 2013

Recently, I came across the suffix arrays which are a complementary to suffix trees and easier to implement while doing the the LCS question on the spoj the naive algo with complexity O(m*n) is simply unable to pass. So we must apply suffix array to find Longest Common Substring subsequently. SO I googled Suffix Array with finding nohing easy to understand or elseworth simple to write so after much struggle, I have wrote done the code for suffix array in C++, thats easy to understand and implement. Here some theory regarding Suffix Arrays followed by the code

Given a string A = (a0 a1 a2 a3 ….an-1) of length n.
A suffix of string S starting at position i is Suffix(i) = (ai ai+1 ai+2 ….an-1).
There are n suffixes of string A ( starting at each index to the end ).

Let U be a set of suffixes of string A in sorted order. ie obtain all the suffixes of A and sort them to form U.

A suffix array S is an array such that S[i] gives the index of Suffix(i) in U.

for example if A = banana suffixes of A are:
banana = Suffix(0)
anana = Suffix(1)
nana = Suffix(2)
ana = Suffix(3)
na = Suffix(4)
a = Suffix(5)

after sorting U = {
0 a
1 ana
2 anana
3 banana
4 na
5 nana
}

thus suffix array S would be,
S = { 3, 2, 5, 1, 4, 0 }

S[1] = 2 because index of Suffix(1) = anana in U is 2
Feel free to comment below if you don’t understand this.

Follow me on Twitter and Github, that's where I'm most active these days. You can also email me at [email protected]. Thanks!