Edit distance

closed account (1vf9z8AR)
Question link-
https://www.spoj.com/problems/EDIST/

Question description-
Minimum no of operations to convert one string to another.
Operations being replace, remove, insert.

Algorithm used-
Levenshtein Distance

Problem-
I tried the possible edge cases of empty strings and code works correct but still, I am getting the wrong answer on spoj. What am I missing here?

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
  #include<iostream>
using namespace std;
int edit[3000][3000];
int min(int a,int b,int c)
{
	if(a<b && a<c)
		return a;
	else if(b<a && b<c)
		return b;
	else
		return c;
}
int editdist(string str,string sub)
{
	int row,col;
	row=1;
	col=1;
	while(row<=str.size())
	{
		while(col<=sub.size())
		{
			if(str[row-1]==sub[col-1])
				edit[row][col]=edit[row-1][col-1];
			else
				edit[row][col]=min(edit[row][col-1],edit[row-1][col-1],edit[row-1][col])+1;
			col++;
		}
		row++;
		col=1;
	}
}
int main()
{
	edit[0][0]=0;
	for(int i=1;i<2005;i++)
	{
		edit[0][i]=i;
		edit[i][0]=i;
	}
	int t;
	cin>>t;
	cin.ignore();
	while(t>0)
	{
		string str,sub;
		getline(cin,str);
		getline(cin,sub);
		editdist(str,sub);
		cout<<edit[str.size()][sub.size()]<<endl;
		t--;
	}
}
Last edited on
So have you tried to debug it?

Poor man's debugger is to put cout statements in various places.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
int min(int a,int b,int c)
{
	if(a<b && a<c)
		return a;
	else if(b<a && b<c)
		return b;
	else
		return c;
}
int editdist(string str,string sub)
{
	int row,col;
	row=1;
	col=1;
	while(row<=str.size())
	{
		while(col<=sub.size())
		{
			if(str[row-1]==sub[col-1]) {
				edit[row][col]=edit[row-1][col-1];
			} else {
				cout << "DEBUG:" << str[row-1] << "!=" << sub[col-1] << endl;
				// print anything else which interests you
				edit[row][col]=min(edit[row][col-1],edit[row-1][col-1],edit[row-1][col])+1;
			}
			col++;
		}
		row++;
		col=1;
	}
}

int main ( ) {
    // test your min function
    cout << min(1,2,3) << endl;
    cout << min(2,1,3) << endl;
    cout << min(3,2,1) << endl;

    // test your editdist function
    string str = "ABC", sub = "BCD";
    editdist(str,sub);
    cout<<edit[str.size()][sub.size()]<<endl;
}

What you're trying to find is where the program reality diverges from your expectation of what is supposed to happen. At that point, you've found a bug.

Or use a real debugger.
A real debugger means you can single step the code and examine variables interactively, instead of the "rinse and repeat" cycle of figuring out "where and what" for your next DEBUG statement.
Poor man's debugger is to put cout statements in various places.

This hits hard in the feels, still use the poor man's man one top of a real one..
Last edited on
closed account (1vf9z8AR)
@salem c

Yes, I double checked everything before asking here.
1
2
3
4
5
6
7
8
9
10
11
$ g++ -Wall -Wextra proj.cpp
proj.cpp: In function ‘int editdist(std::__cxx11::string, std::__cxx11::string)’:
proj.cpp:18:11: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
  while(row<=str.size())
           ^
proj.cpp:20:12: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
   while(col<=sub.size())
            ^
proj.cpp:35:1: warning: no return statement in function returning non-void [-Wreturn-type]
 }
 ^


https://en.wikipedia.org/wiki/Levenshtein_distance#Definition
Your code doesn't look like this either.

https://planetcalc.com/1721/
THECATSATONTHEMAT
THEQUICKBROWNFOXJUMPS
15


Your code.
THECATSATONTHEMAT
THEQUICKBROWNFOXJUMPS
17
closed account (1vf9z8AR)
@salem c
ideone didn't give me those errors. what compiler are you using?
The command line says it's g++.

If you want full control, then install a compiler on your local machine.
The online things are fine for quick hacks when you're away from your desk, but they're no substitute for a proper development environment.

Or use http://cpp.sh/
Which at least gives you a few toys to play with.

closed account (1vf9z8AR)
but according to videos on youtube
if character same copy diagonal value
else
min value of top,diagonal,left +1
The only guarantee of a YT video is the consumption of bandwidth and disk space.

SPOJ is telling you that your algorithm is wrong.
Wikipedia is telling you that your algorithm is wrong.
Planetcalc is telling you that your algorithm is wrong.

It may well work, and give useful answers.
But it ISN'T what SPOJ is looking for.

You could have implemented a nice functioning game of chess, and it would still be wrong as a test for https://www.spoj.com/problems/EDIST/

closed account (1vf9z8AR)
Even this is giving wrong answer.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
 #include<iostream>
using namespace std;
int edit[3000][3000];
int minimum(int a,int b,int c)
{
	if(a<b && a<c)
		return a;
	else if(b<a && b<c)
		return b;
	else
		return c;
}
void editdist(string str,string sub)
{
	int l1=str.size(),l2=sub.size(),cost;
	for(int i=1;i<=l1;i++)
    {
        for(int j=1;j<=l2;j++)
        {
            if(str[i-1]==sub[j-1])
                cost=0;
            else
                cost=1;
            edit[i][j]=minimum(edit[i-1][j]+1,edit[i][j-1]+1,edit[i-1][j-1]+cost);
        }
    }
}
int main()
{
	edit[0][0]=0;
	for(int i=1;i<3000;i++)
	{
		edit[0][i]=i;
		edit[i][0]=i;
	}
	int t;
	cin>>t;
	cin.ignore();
	while(t>0)
	{
		string str,sub;
		getline(cin,str);
		getline(cin,sub);
		editdist(str,sub);
		cout<<edit[str.size()][sub.size()]<<endl;
		t--;
	}
	return 0;
}

Try this:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
#include <algorithm>
#include <iostream>
#include <vector>

int levenshtein_distance(std::string const& s0, std::string const& s1)
{
    std::vector<int> v0(s1.size() + 1), v1(s1.size() + 1);

    for (std::size_t i = 0; i < v0.size(); ++i)
        v0[i] = i;

    for (std::size_t i = 0; i < s0.size(); ++i)
    {
        v1[0] = i + 1;

        for (std::size_t j = 0; j < s1.size(); ++j)
            v1[j + 1] = std::min({ v1[j]     + 1,
                                   v0[j + 1] + 1,
                                   v0[j]     + (s0[i] != s1[j]) });
        std::swap(v0, v1);
    }

    return v0.back();
}

int main() {
  { int i; std::cin >> i; }

  for (std::string a, b; std::cin >> a >> b; )
    std::cout << levenshtein_distance(a, b) << '\n';
}
Last edited on
closed account (1vf9z8AR)
@mbozzi
this is the two-row method.

I am trying to get the matrix method correct first then will try this.

This is my implementation of the two row method though. Its coming wrong too:(
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
 #include<iostream>
 #include<vector>
using namespace std;
int minimum(int a,int b)
{
	if(a<b)
        return a;
    return b;
}
int editdist(string str,string sub)
{
    int l1=str.size(),l2=sub.size();
    if(l1==0)
        return l2;
    else if(l2==0)
        return l1;
    vector<int>v1(l2+1),v2(l2+1);
    for(int i=0;i<=l2;i++)
        v1[i]=i;
    for(int i=1;i<=l1;i++)
    {
        v2[0]=i;
        int cost;
        for(int j=1;j<=l2;j++)
        {
            if(str[i-1]==sub[j-1])
                cost=0;
            else
                cost=1;
            v2[j]=minimum(v1[j-1]+cost,minimum(v2[j-1]+1,v1[j]+1));
        }
        for(int k=0;k<=l2;k++)
            v1[k]=v2[k];
    }
    return v1[l2];
}
int main()
{
	int t;
	cin>>t;
	cin.ignore();
	while(t>0)
	{
		string str,sub;
		getline(cin,str);
		getline(cin,sub);
		cout<<editdist(str,sub)<<endl;
		t--;
	}
	return 0;
}

Last edited on
closed account (1vf9z8AR)
The issue was using getline instead of cin
Topic archived. No new replies allowed.