Edit distance

#Day 1

Edit distance is a very famous algorithm to get the fuzziness between two strings. Fuzziness means the difference between two strings or in simple words, the number of  replaces, add or delete operation to be used on a string to transform it into another string.

For example:

The number of operation required converting ‘cat’ into ‘rat’ is 1.

That is replace  ‘c’ with ‘r’.

This problem can be solved using dynamic programming. Approach to solving this problem is a bottom-up calculation. We will solve this problem with an example.

Example: convert ‘random’ to ‘radis’.

For solving the problem we make a two-dimensional array of size (n+1)*(m+1), where n is the size of final string and m is the size of the original string. Then we initialize the array’s first row from 0 to n. and the first column from 0 to m. (Image:1)

n = 6      [ length of ‘random’]

m = 5      [ length of ‘radis’]

edit_table_1
Image: 1

The table is initialized with these number according to the following rule. If we go to row 0 and start traversing from column 0 to column 6, it means that the number of edits required converting

‘ ‘ to ‘ ‘ is 0.

‘r’ to ‘ ‘ is 1

‘ra’ to ‘ ‘ is 2

‘ran’ to ‘ ‘ is 3 and so on.

And in the same way, if we traverse in column 0 from row 0 to row 5, it means the number of edits required converting

‘ ‘ to ‘r’ is 1

‘ ‘ to ‘ra’ is 2

‘ ‘ to ‘rad’ is 3 and so on.

Then we need to fill the rest of the table row by row. The value in the field is dependent on following parameters

If row index value of the final string is same as column index value of the original string, then the value is same as table[row_index-1][column_index-1].

Else it is (minimum of three values adjacent to the field)+1 as shown in Image:2

edit_table_2
Image: 2

The operation which we are performing is dependent on the field which has the minimum value.

  • REPLACE
    • condition: minimum is table[row_index-1][column_index-1] and original_string[column_index]!=final_string[row_index]
    • action: “replace” original_string[column_index] with final_string[row_index]
  • ADD
    • condition: minimum is table[row_index-1][column_index]
    • action: “add” final_string[row_index]
  • DELETE
    • condition: minimum is table[row_index][column_index-1] =>
    • action: “delete” original_string[column_index]
edit_table_1-1
Image: 3

Then we fill up the whole table row by row as-

edit_table_3
Image:4
edit_table_5
Image:5
edit_table_5
Image:6

Our Final result is table[last_row][last_column]

In this example, it is 3.

This solves the number of operation required to transform one string into another. Time complexity for this algorithm is O(n*m) where n and m are the sizes of original string and desired string respectively.

Another thing which can also be figured out with the same table generated above is the operations required to get the desired string.

If we need to trace the operations which are needed to achieve this, we trace back the table from table[lasr_row][last_column]. We trace from where did we get the value for this field on the basis of rules which we defined earlier. For this example, it is as in image 7

trace-back
Image:7

In backward order it is –

  • replace m with s
  • replace o with i
  • pass
  • delete n
  • pass
  • pass

On the original string following operation needs to be done to get the desired string-

delete n->replace o with i->replace m with s

Variation of the same problem:

Other variation of the same problem can vary if edit,remove and add operations have different weights(cost).

Uses of this algorithm

Search engine

  • did you mean: please

Source code

Source for this problem along with few other algorithms can be found here

source code link

Suggestions and Edits

All the suggestions and edits are welcome. Please feel free to email me at Email address

Credits

Implementation of this algorithm is inspired by Tushar Roy’s Youtube channel .

I will be back with new algorithm tomorrow, till then bye bye and happy coding.

Advertisements

One thought on “Edit distance”

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s