Edit distance is a very famous algorithm to get the fuzziness between two strings. Fuzziness means the difference between two strings or in simple words, the number of replaces, add or delete operation to be used on a string to transform it into another string.
The number of operation required converting ‘cat’ into ‘rat’ is 1.
That is replace ‘c’ with ‘r’.
This problem can be solved using dynamic programming. Approach to solving this problem is a bottom-up calculation. We will solve this problem with an example.
Example: convert ‘random’ to ‘radis’.
For solving the problem we make a two-dimensional array of size (n+1)*(m+1), where n is the size of final string and m is the size of the original string. Then we initialize the array’s first row from 0 to n. and the first column from 0 to m. (Image:1)
n = 6 [ length of ‘random’]
m = 5 [ length of ‘radis’]
The table is initialized with these number according to the following rule. If we go to row 0 and start traversing from column 0 to column 6, it means that the number of edits required converting
‘ ‘ to ‘ ‘ is 0.
‘r’ to ‘ ‘ is 1
‘ra’ to ‘ ‘ is 2
‘ran’ to ‘ ‘ is 3 and so on.
And in the same way, if we traverse in column 0 from row 0 to row 5, it means the number of edits required converting
‘ ‘ to ‘r’ is 1
‘ ‘ to ‘ra’ is 2
‘ ‘ to ‘rad’ is 3 and so on.
Then we need to fill the rest of the table row by row. The value in the field is dependent on following parameters
If row index value of the final string is same as column index value of the original string, then the value is same as table[row_index-1][column_index-1].
Else it is (minimum of three values adjacent to the field)+1 as shown in Image:2
The operation which we are performing is dependent on the field which has the minimum value.
- condition: minimum is table[row_index-1][column_index-1] and original_string[column_index]!=final_string[row_index]
- action: “replace” original_string[column_index] with final_string[row_index]
- condition: minimum is table[row_index-1][column_index]
- action: “add” final_string[row_index]
- condition: minimum is table[row_index][column_index-1] =>
- action: “delete” original_string[column_index]
Then we fill up the whole table row by row as-
Our Final result is table[last_row][last_column]
In this example, it is 3.
This solves the number of operation required to transform one string into another. Time complexity for this algorithm is O(n*m) where n and m are the sizes of original string and desired string respectively.
Another thing which can also be figured out with the same table generated above is the operations required to get the desired string.
If we need to trace the operations which are needed to achieve this, we trace back the table from table[lasr_row][last_column]. We trace from where did we get the value for this field on the basis of rules which we defined earlier. For this example, it is as in image 7
In backward order it is –
- replace m with s
- replace o with i
- delete n
On the original string following operation needs to be done to get the desired string-
delete n->replace o with i->replace m with s
Variation of the same problem:
Other variation of the same problem can vary if edit,remove and add operations have different weights(cost).
Uses of this algorithm
- did you mean: please
Source for this problem along with few other algorithms can be found here
Suggestions and Edits
All the suggestions and edits are welcome. Please feel free to email me at Email address
Implementation of this algorithm is inspired by Tushar Roy’s Youtube channel .
I will be back with new algorithm tomorrow, till then bye bye and happy coding.