[Golang] Wagner-Fischer algorithm for Edit Distance of Two Strings


Go implementation of Wagner-Fischer algorithm that computes the edit distance between two strings of characters. The edit distance here is Levenshtein distance, the minimum number of single-character edits (insertions, deletions or substitutions) required to change one string into the other.

Run Code on Go Playground

The following code is utility for converting a string to rune slice, and minimum value of three integers.

util.go | repository | view raw
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
package ld

func StringToRuneSlice(s string) []rune {
	var r []rune
	for _, runeValue := range s {
		r = append(r, runeValue)
	}
	return r
}

func minimum(a, b, c int) int {
	min := a
	if min > b {
		min = b
	}
	if min > c {
		min = c
	}
	return min
}

The Wagner-Fischer algorithm:

wf.go | repository | view raw
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
package ld

// The edit distances here is actually Levenshtein distance, minimum number of
// single-character edits (insertions, deletions or substitutions) required to
// change one string into the other.
func EditDistance(s, t string) int {
	ss := StringToRuneSlice(s)
	m := len(ss) + 1
	tt := StringToRuneSlice(t)
	n := len(tt) + 1

	d := make([][]int, m)
	for i := range d {
		d[i] = make([]int, n)
	}

	for i := 0; i < m; i++ {
		d[i][0] = i
	}

	for j := 0; j < n; j++ {
		d[0][j] = j
	}

	for j := 1; j < n; j++ {
		for i := 1; i < m; i++ {
			var substitutionCost int
			if ss[i-1] == tt[j-1] {
				substitutionCost = 0
			} else {
				substitutionCost = 1
			}
			d[i][j] = minimum(
				d[i-1][j]+1,
				d[i][j-1]+1,
				d[i-1][j-1]+substitutionCost,
			)
		}
	}

	return d[m-1][n-1]
}

Testing:

wf_test.go | repository | view raw
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
package ld

import (
	"testing"
)

func TestEditDistance(t *testing.T) {
	if EditDistance("soccer", "otter") != 3 {
		t.Error(`"soccer", "otter"`)
	}

	if EditDistance("你好他他是誰", "好我我是誰") != 3 {
		t.Error(`"你好他他是誰", "好我我是誰"`)
	}

	if EditDistance("kitten", "sitten") != 1 {
		t.Error(`"kitten", "sitten"`)
	}

	if EditDistance("sitten", "sittin") != 1 {
		t.Error(`"sitten", "sittin"`)
	}

	if EditDistance("sittin", "sitting") != 1 {
		t.Error(`"sittin", "sitting"`)
	}

	if EditDistance("sort", "some") != 2 {
		t.Error(`"sort", "some"`)
	}

	if EditDistance("sort", "same") != 3 {
		t.Error(`"sort", "same"`)
	}

	if EditDistance("sort", "soft") != 1 {
		t.Error(`"sort", "soft"`)
	}
}

For recursive implementation of Levenshtein distance, see [1].


Tested on:


References:

[1][Golang] Levenshtein Distance Recursive Implementation
[2][Golang] Initialize Two Dimensional Array/Slice
[3]How do I know when recursive calls have all been completed? : golang
[4]How is infinite recursion and stack allocation controlled in Golang? : golang