<?xml version="1.0" encoding="utf-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	>
<channel>
	<title>Comments for codejanitor</title>
	<atom:link href="http://codejanitor.com/wp/comments/feed/" rel="self" type="application/rss+xml" />
	<link>http://codejanitor.com/wp</link>
	<description>open source code and solutions</description>
	<pubDate>Sun, 14 Mar 2010 16:54:28 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.7</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>Comment on Levenshtein Distance as a MySQL Stored Function by Jerome S.</title>
		<link>http://codejanitor.com/wp/2007/02/10/levenshtein-distance-as-a-mysql-stored-function/comment-page-1/#comment-2753</link>
		<dc:creator>Jerome S.</dc:creator>
		<pubDate>Thu, 25 Feb 2010 10:38:14 +0000</pubDate>
		<guid isPermaLink="false">http://codejanitor.com/wp/2007/02/10/levenshtein-distance-as-a-mysql-stored-function/#comment-2753</guid>
		<description>Thanks a lot for this algorithm implementation. 

After considering all the sources available on the web, I decided to use your implementation, and I have adapted it according to http://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance#The_algorithm so that I have a MySQL UDF for Damerau-Levenshtein.
Basically, you just have to add four lines between the last two lines of the main loop:
IF c &#62; c_temp THEN SET c = c_temp; END IF;
## Add here
SET cv0 = CONCAT(cv0, UNHEX(HEX(c))), j = j + 1;

Here are the lines to be added:
IF (i&#62;1 AND j&#62;1 AND s1_char = SUBSTRING(s2, j-1, 1) AND SUBSTRING(s1, i-1, 1) = SUBSTRING(s2, j, 1)) THEN
   SET c_temp = CONV(HEX(SUBSTRING(cv1, j+1, 1)), 16, 10);
END IF;
IF c &#62; c_temp THEN SET c = c_temp; END IF;

Please let me know if you think it is incorrect, but in my hands it seems to work.
Have fun!</description>
		<content:encoded><![CDATA[<p>Thanks a lot for this algorithm implementation. </p>
<p>After considering all the sources available on the web, I decided to use your implementation, and I have adapted it according to <a href="http://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance#The_algorithm" rel="nofollow">http://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance#The_algorithm</a> so that I have a MySQL UDF for Damerau-Levenshtein.<br />
Basically, you just have to add four lines between the last two lines of the main loop:<br />
IF c &gt; c_temp THEN SET c = c_temp; END IF;<br />
## Add here<br />
SET cv0 = CONCAT(cv0, UNHEX(HEX(c))), j = j + 1;</p>
<p>Here are the lines to be added:<br />
IF (i&gt;1 AND j&gt;1 AND s1_char = SUBSTRING(s2, j-1, 1) AND SUBSTRING(s1, i-1, 1) = SUBSTRING(s2, j, 1)) THEN<br />
   SET c_temp = CONV(HEX(SUBSTRING(cv1, j+1, 1)), 16, 10);<br />
END IF;<br />
IF c &gt; c_temp THEN SET c = c_temp; END IF;</p>
<p>Please let me know if you think it is incorrect, but in my hands it seems to work.<br />
Have fun!</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Levenshtein Distance as a MySQL Stored Function by Meer</title>
		<link>http://codejanitor.com/wp/2007/02/10/levenshtein-distance-as-a-mysql-stored-function/comment-page-1/#comment-2686</link>
		<dc:creator>Meer</dc:creator>
		<pubDate>Fri, 29 Jan 2010 18:35:29 +0000</pubDate>
		<guid isPermaLink="false">http://codejanitor.com/wp/2007/02/10/levenshtein-distance-as-a-mysql-stored-function/#comment-2686</guid>
		<description>Hmm, I give up. Is there any way I can post code without it getting messed up by this blog's HTML parser?</description>
		<content:encoded><![CDATA[<p>Hmm, I give up. Is there any way I can post code without it getting messed up by this blog&#8217;s HTML parser?</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Levenshtein Distance as a MySQL Stored Function by Meer</title>
		<link>http://codejanitor.com/wp/2007/02/10/levenshtein-distance-as-a-mysql-stored-function/comment-page-1/#comment-2685</link>
		<dc:creator>Meer</dc:creator>
		<pubDate>Fri, 29 Jan 2010 18:31:13 +0000</pubDate>
		<guid isPermaLink="false">http://codejanitor.com/wp/2007/02/10/levenshtein-distance-as-a-mysql-stored-function/#comment-2685</guid>
		<description>Hmm, again: the not-equal operator (&#60;&#62;) gets stripped when posting, so I changed it to !=

final code:



CREATE PROC EDIT_DISTANCE

@string1 nvarchar(3999),
@string2 nvarchar(3999),
@edit_distance int output

AS

BEGIN

IF @string1 = @string2
	BEGIN

	SET @edit_distance = 0
	RETURN

	END

DECLARE	@string1_len int,
		@string2_len int,
		@i int,
		@j int,
		@string1_char nchar,
		@eval int,
		@next_set varbinary(8000),
		@current_set varbinary(8000)

SELECT	@string1_len = LEN(@string1),
		@string2_len = LEN(@string2),
		@current_set = cast(0 as binary(2)),
		@j = 1,
		@i = 1,
		@edit_distance = 0

IF @string1_len = 0

	BEGIN

	SET @edit_distance = @string2_len
	RETURN

	END

IF @string2_len = 0

	BEGIN

	SET @edit_distance = @string1_len
	RETURN

	END

IF	Len(@string1) &#60; Len(@string2) AND CharIndex(@string1, @string2) != 0

	BEGIN

	SET @edit_distance = Len(@string2) - Len(@string1)
	RETURN

	END

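IF	Len(@string2) &#60; Len(@string1) AND CharIndex(@string2, @string1) != 0

	BEGIN

	SET @edit_distance = Len(@string1) - Len(@string2)
	RETURN

	END

WHILE @j !&#62; @string2_len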

	SELECT	@current_set = @current_set + CAST(@j AS binary(2)),
			@j = @j + 1

WHILE @i !&#62; @string1_len

	BEGIN

	SELECT	@string1_char = SUBSTRING(@string1, @i, 1),
			@edit_distance = @i,
			@next_set = CAST(@i AS binary(2)),
			@j = 1

	WHILE @j !&#62; @string2_len

		BEGIN

		SET	@edit_distance = @edit_distance + 1

		SET	@eval = CAST(SUBSTRING(@current_set, 2 * @j - 1, 2) AS int) +
			CASE
				WHEN @string1_char = SUBSTRING(@string2, @j, 1)
					THEN 0
				ELSE
					1
			END

		IF @edit_distance &#62; @eval
			SET @edit_distance = @eval

		SET @eval = CAST(SUBSTRING(@current_set, 2 * @j + 1, 2) AS int)+1

		IF @edit_distance &#62; @eval
			SET @edit_distance = @eval

		SELECT @next_set = @next_set + CAST(@edit_distance AS binary(2)), @j = @j + 1

		END

	SELECT @current_set = @next_set, @i = @i + 1

	END

END
GO</description>
		<content:encoded><![CDATA[<p>Hmm, again: the not-equal operator (&lt;&gt;) gets stripped when posting, so I changed it to !=</p>
<p>final code:</p>
<p>CREATE PROC EDIT_DISTANCE</p>
<p>@string1 nvarchar(3999),<br />
@string2 nvarchar(3999),<br />
@edit_distance int output</p>
<p>AS</p>
<p>BEGIN</p>
<p>IF @string1 = @string2<br />
	BEGIN</p>
<p>	SET @edit_distance = 0<br />
	RETURN</p>
<p>	END</p>
<p>DECLARE	@string1_len int,<br />
		@string2_len int,<br />
		@i int,<br />
		@j int,<br />
		@string1_char nchar,<br />
		@eval int,<br />
		@next_set varbinary(8000),<br />
		@current_set varbinary(8000)</p>
<p>SELECT	@string1_len = LEN(@string1),<br />
		@string2_len = LEN(@string2),<br />
		@current_set = cast(0 as binary(2)),<br />
		@j = 1,<br />
		@i = 1,<br />
		@edit_distance = 0</p>
<p>IF @string1_len = 0</p>
<p>	BEGIN</p>
<p>	SET @edit_distance = @string2_len<br />
	RETURN</p>
<p>	END</p>
<p>IF @string2_len = 0</p>
<p>	BEGIN</p>
<p>	SET @edit_distance = @string1_len<br />
	RETURN</p>
<p>	END</p>
<p>IF	Len(@string1) &lt; Len(@string2) AND CharIndex(@string1, @string2) != 0</p>
<p>	BEGIN</p>
<p>	SET @edit_distance = Len(@string2) - Len(@string1)<br />
	RETURN</p>
<p>	END</p>
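<p>IF	Len(@string2) &lt; Len(@string1) AND CharIndex(@string2, @string1) != 0</p>
<p>	BEGIN</p>
<p>	SET @edit_distance = Len(@string1) - Len(@string2)<br />
	RETURN</p>
<p>	END</p>
<p>WHILE @j !&gt; @string2_len</p>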
<p>	SELECT	@current_set = @current_set + CAST(@j AS binary(2)),<br />
			@j = @j + 1</p>
<p>WHILE @i !&gt; @string1_len</p>
<p>	BEGIN</p>
<p>	SELECT	@string1_char = SUBSTRING(@string1, @i, 1),<br />
			@edit_distance = @i,<br />
			@next_set = CAST(@i AS binary(2)),<br />
			@j = 1</p>
<p>	WHILE @j !&gt; @string2_len</p>
<p>		BEGIN</p>
<p>		SET	@edit_distance = @edit_distance + 1</p>
<p>		SET	@eval = CAST(SUBSTRING(@current_set, 2 * @j - 1, 2) AS int) +<br />
			CASE<br />
				WHEN @string1_char = SUBSTRING(@string2, @j, 1)<br />
					THEN 0<br />
				ELSE<br />
					1<br />
			END</p>
<p>		IF @edit_distance &gt; @eval<br />
			SET @edit_distance = @eval</p>
<p>		SET @eval = CAST(SUBSTRING(@current_set, 2 * @j + 1, 2) AS int)+1</p>
<p>		IF @edit_distance &gt; @eval<br />
			SET @edit_distance = @eval</p>
<p>		SELECT @next_set = @next_set + CAST(@edit_distance AS binary(2)), @j = @j + 1</p>
<p>		END</p>
<p>	SELECT @current_set = @next_set, @i = @i + 1</p>
<p>	END</p>
<p>END<br />
GO</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Levenshtein Distance as a MySQL Stored Function by Meer</title>
		<link>http://codejanitor.com/wp/2007/02/10/levenshtein-distance-as-a-mysql-stored-function/comment-page-1/#comment-2684</link>
		<dc:creator>Meer</dc:creator>
		<pubDate>Fri, 29 Jan 2010 18:24:10 +0000</pubDate>
		<guid isPermaLink="false">http://codejanitor.com/wp/2007/02/10/levenshtein-distance-as-a-mysql-stored-function/#comment-2684</guid>
		<description>Looks like the less-than-or-equal operators get mangled while posting, so I changed them to !&#62;



--here is the code


CREATE PROC EDIT_DISTANCE

@string1 nvarchar(3999),
@string2 nvarchar(3999),
@edit_distance int output

AS

BEGIN

IF @string1 = @string2
	BEGIN

	SET @edit_distance = 0
	RETURN

	END

DECLARE	@string1_len int,
		@string2_len int,
		@i int,
		@j int,
		@string1_char nchar,
		@eval int,
		@next_set varbinary(8000),
		@current_set varbinary(8000)

SELECT	@string1_len = LEN(@string1),
		@string2_len = LEN(@string2),
		@current_set = cast(0 as binary(2)),
		@j = 1,
		@i = 1,
		@edit_distance = 0

IF @string1_len = 0

	BEGIN

	SET @edit_distance = @string2_len
	RETURN

	END

IF @string2_len = 0

	BEGIN

	SET @edit_distance = @string1_len
	RETURN

	END

IF	Len(@string1) &#60; Len(@string2) AND CharIndex(@string1, @string2) &#60;&#62; 0

	BEGIN

	SET @edit_distance = Len(@string2) - Len(@string1)
	RETURN

	END

IF	Len(@string2) &#60; Len(@string1) AND CharIndex(@string2, @string1) &#60;&#62; 0

	BEGIN

	SET @edit_distance = Len(@string1) - Len(@string2)
	RETURN

	END


WHILE @j !&#62; @string2_len

	SELECT	@current_set = @current_set + CAST(@j AS binary(2)),
			@j = @j + 1

WHILE @i !&#62; @string1_len

	BEGIN

	SELECT	@string1_char = SUBSTRING(@string1, @i, 1),
			@edit_distance = @i,
			@next_set = CAST(@i AS binary(2)),
			@j = 1

	WHILE @j !&#62; @string2_len

		BEGIN

		SET	@edit_distance = @edit_distance + 1

		SET	@eval = CAST(SUBSTRING(@current_set, 2 * @j - 1, 2) AS int) +
			CASE
				WHEN @string1_char = SUBSTRING(@string2, @j, 1)
					THEN 0
				ELSE
					1
			END

		IF @edit_distance &#62; @eval
			SET @edit_distance = @eval

		SET @eval = CAST(SUBSTRING(@current_set, 2 * @j + 1, 2) AS int)+1

		IF @edit_distance &#62; @eval
			SET @edit_distance = @eval

		SELECT @next_set = @next_set + CAST(@edit_distance AS binary(2)), @j = @j + 1

		END

	SELECT @current_set = @next_set, @i = @i + 1

	END

END
GO</description>
		<content:encoded><![CDATA[<p>Looks like the less-than-or-equal operators get mangled while posting, so I changed them to !&gt;</p>
<p>--here is the code</p>
<p>CREATE PROC EDIT_DISTANCE</p>
<p>@string1 nvarchar(3999),<br />
@string2 nvarchar(3999),<br />
@edit_distance int output</p>
<p>AS</p>
<p>BEGIN</p>
<p>IF @string1 = @string2<br />
	BEGIN</p>
<p>	SET @edit_distance = 0<br />
	RETURN</p>
<p>	END</p>
<p>DECLARE	@string1_len int,<br />
		@string2_len int,<br />
		@i int,<br />
		@j int,<br />
		@string1_char nchar,<br />
		@eval int,<br />
		@next_set varbinary(8000),<br />
		@current_set varbinary(8000)</p>
<p>SELECT	@string1_len = LEN(@string1),<br />
		@string2_len = LEN(@string2),<br />
		@current_set = cast(0 as binary(2)),<br />
		@j = 1,<br />
		@i = 1,<br />
		@edit_distance = 0</p>
<p>IF @string1_len = 0</p>
<p>	BEGIN</p>
<p>	SET @edit_distance = @string2_len<br />
	RETURN</p>
<p>	END</p>
<p>IF @string2_len = 0</p>
<p>	BEGIN</p>
<p>	SET @edit_distance = @string1_len<br />
	RETURN</p>
<p>	END</p>
<p>IF	Len(@string1) &lt; Len(@string2) AND CharIndex(@string1, @string2) &lt;&gt; 0</p>
<p>	BEGIN</p>
<p>	SET @edit_distance = Len(@string2) - Len(@string1)<br />
	RETURN</p>
<p>	END</p>
<p>IF	Len(@string2) &lt; Len(@string1) AND CharIndex(@string2, @string1) &lt;&gt; 0</p>
<p>	BEGIN</p>
<p>	SET @edit_distance = Len(@string1) - Len(@string2)<br />
	RETURN</p>
<p>	END</p>
<p>WHILE @j !&gt; @string2_len</p>
<p>	SELECT	@current_set = @current_set + CAST(@j AS binary(2)),<br />
			@j = @j + 1</p>
<p>WHILE @i !&gt; @string1_len</p>
<p>	BEGIN</p>
<p>	SELECT	@string1_char = SUBSTRING(@string1, @i, 1),<br />
			@edit_distance = @i,<br />
			@next_set = CAST(@i AS binary(2)),<br />
			@j = 1</p>
<p>	WHILE @j !&gt; @string2_len</p>
<p>		BEGIN</p>
<p>		SET	@edit_distance = @edit_distance + 1</p>
<p>		SET	@eval = CAST(SUBSTRING(@current_set, 2 * @j - 1, 2) AS int) +<br />
			CASE<br />
				WHEN @string1_char = SUBSTRING(@string2, @j, 1)<br />
					THEN 0<br />
				ELSE<br />
					1<br />
			END</p>
<p>		IF @edit_distance &gt; @eval<br />
			SET @edit_distance = @eval</p>
<p>		SET @eval = CAST(SUBSTRING(@current_set, 2 * @j + 1, 2) AS int)+1</p>
<p>		IF @edit_distance &gt; @eval<br />
			SET @edit_distance = @eval</p>
<p>		SELECT @next_set = @next_set + CAST(@edit_distance AS binary(2)), @j = @j + 1</p>
<p>		END</p>
<p>	SELECT @current_set = @next_set, @i = @i + 1</p>
<p>	END</p>
<p>END<br />
GO</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Levenshtein Distance as a MySQL Stored Function by Meer</title>
		<link>http://codejanitor.com/wp/2007/02/10/levenshtein-distance-as-a-mysql-stored-function/comment-page-1/#comment-2683</link>
		<dc:creator>Meer</dc:creator>
		<pubDate>Fri, 29 Jan 2010 18:20:26 +0000</pubDate>
		<guid isPermaLink="false">http://codejanitor.com/wp/2007/02/10/levenshtein-distance-as-a-mysql-stored-function/#comment-2683</guid>
		<description>This is my version:

[code]
CREATE PROC EDIT_DISTANCE

@string1 nvarchar(3999),
@string2 nvarchar(3999),
@edit_distance int output

AS

BEGIN

IF @string1 = @string2
	BEGIN

	SET @edit_distance = 0
	RETURN

	END

DECLARE	@string1_len int,
		@string2_len int,
		@i int,
		@j int,
		@string1_char nchar,
		@eval int,
		@next_set varbinary(8000),
		@current_set varbinary(8000)

SELECT	@string1_len = LEN(@string1),
		@string2_len = LEN(@string2),
		@current_set = cast(0 as binary(2)),
		@j = 1,
		@i = 1,
		@edit_distance = 0

IF @string1_len = 0

	BEGIN

	SET @edit_distance = @string2_len
	RETURN

	END

IF @string2_len = 0

	BEGIN

	SET @edit_distance = @string1_len
	RETURN

	END

IF	Len(@string1) &#60; Len(@string2) AND CharIndex(@string1, @string2) &#60;&#62; 0

	BEGIN

	SET @edit_distance = Len(@string2) - Len(@string1)
	RETURN

	END

IF	Len(@string2) &#60; Len(@string1) AND CharIndex(@string2, @string1) &#60;&#62; 0

	BEGIN

	SET @edit_distance = Len(@string1) - Len(@string2)
	RETURN

	END


WHILE @j &#60;= @string2_len

	SELECT	@current_set = @current_set + CAST(@j AS binary(2)),
			@j = @j + 1

WHILE @i &#60;= @string1_len

	BEGIN

	SELECT	@string1_char = SUBSTRING(@string1, @i, 1),
			@edit_distance = @i,
			@next_set = CAST(@i AS binary(2)),
			@j = 1

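	WHILE @j &#60;= @string2_len

		BEGIN

		SET	@edit_distance = @edit_distance + 1

		SET	@eval = CAST(SUBSTRING(@current_set, 2 * @j - 1, 2) AS int) +
			CASE
				WHEN @string1_char = SUBSTRING(@string2, @j, 1)
					THEN 0
				ELSE
					1
			END

		IF @edit_distance &#62; @eval
			SET @edit_distance = @eval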

		SET @eval = CAST(SUBSTRING(@current_set, 2 * @j + 1, 2) AS int)+1

		IF @edit_distance &#62; @eval
			SET @edit_distance = @eval

		SELECT @next_set = @next_set + CAST(@edit_distance AS binary(2)), @j = @j + 1

		END

	SELECT @current_set = @next_set, @i = @i + 1

	END

END
GO
[/code]</description>
		<content:encoded><![CDATA[<p>This is my version:</p>
<p>[code]<br />
CREATE PROC EDIT_DISTANCE</p>
<p>@string1 nvarchar(3999),<br />
@string2 nvarchar(3999),<br />
@edit_distance int output</p>
<p>AS</p>
<p>BEGIN</p>
<p>IF @string1 = @string2<br />
	BEGIN</p>
<p>	SET @edit_distance = 0<br />
	RETURN</p>
<p>	END</p>
<p>DECLARE	@string1_len int,<br />
		@string2_len int,<br />
		@i int,<br />
		@j int,<br />
		@string1_char nchar,<br />
		@eval int,<br />
		@next_set varbinary(8000),<br />
		@current_set varbinary(8000)</p>
<p>SELECT	@string1_len = LEN(@string1),<br />
		@string2_len = LEN(@string2),<br />
		@current_set = cast(0 as binary(2)),<br />
		@j = 1,<br />
		@i = 1,<br />
		@edit_distance = 0</p>
<p>IF @string1_len = 0</p>
<p>	BEGIN</p>
<p>	SET @edit_distance = @string2_len<br />
	RETURN</p>
<p>	END</p>
<p>IF @string2_len = 0</p>
<p>	BEGIN</p>
<p>	SET @edit_distance = @string1_len<br />
	RETURN</p>
<p>	END</p>
<p>IF	Len(@string1) &lt; Len(@string2) AND CharIndex(@string1, @string2) &lt;&gt; 0</p>
<p>	BEGIN</p>
<p>	SET @edit_distance = Len(@string2) - Len(@string1)<br />
	RETURN</p>
<p>	END</p>
<p>IF	Len(@string2) &lt; Len(@string1) AND CharIndex(@string2, @string1) &lt;&gt; 0</p>
<p>	BEGIN</p>
<p>	SET @edit_distance = Len(@string1) - Len(@string2)<br />
	RETURN</p>
<p>	END</p>
<p>WHILE @j &lt;= @string2_len</p>
<p>	SELECT	@current_set = @current_set + CAST(@j AS binary(2)),<br />
			@j = @j + 1</p>
<p>WHILE @i &lt;= @string1_len</p>
<p>	BEGIN</p>
<p>	SELECT	@string1_char = SUBSTRING(@string1, @i, 1),<br />
			@edit_distance = @i,<br />
			@next_set = CAST(@i AS binary(2)),<br />
			@j = 1</p>
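<p>	WHILE @j &lt;= @string2_len</p>
<p>		BEGIN</p>
<p>		SET	@edit_distance = @edit_distance + 1</p>
<p>		SET	@eval = CAST(SUBSTRING(@current_set, 2 * @j - 1, 2) AS int) +<br />
			CASE<br />
				WHEN @string1_char = SUBSTRING(@string2, @j, 1)<br />
					THEN 0<br />
				ELSE<br />
					1<br />
			END</p>
<p>		IF @edit_distance &gt; @eval<br />
			SET @edit_distance = @eval</p>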
<p>		SET @eval = CAST(SUBSTRING(@current_set, 2 * @j + 1, 2) AS int)+1</p>
<p>		IF @edit_distance &gt; @eval<br />
			SET @edit_distance = @eval</p>
<p>		SELECT @next_set = @next_set + CAST(@edit_distance AS binary(2)), @j = @j + 1</p>
<p>		END</p>
<p>	SELECT @current_set = @next_set, @i = @i + 1</p>
<p>	END</p>
<p>END<br />
GO<br />
[/code]</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Levenshtein Distance as a MySQL Stored Function by Henry Days</title>
		<link>http://codejanitor.com/wp/2007/02/10/levenshtein-distance-as-a-mysql-stored-function/comment-page-1/#comment-2205</link>
		<dc:creator>Henry Days</dc:creator>
		<pubDate>Fri, 04 Sep 2009 01:02:57 +0000</pubDate>
		<guid isPermaLink="false">http://codejanitor.com/wp/2007/02/10/levenshtein-distance-as-a-mysql-stored-function/#comment-2205</guid>
		<description>Good contribution!
Thanks!</description>
		<content:encoded><![CDATA[<p>Good contribution!<br />
Thanks!</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Levenshtein Distance as a MySQL Stored Function by Bitte um Feedback zu Buchverwaltungsideen - php.de</title>
		<link>http://codejanitor.com/wp/2007/02/10/levenshtein-distance-as-a-mysql-stored-function/comment-page-1/#comment-2107</link>
		<dc:creator>Bitte um Feedback zu Buchverwaltungsideen - php.de</dc:creator>
		<pubDate>Wed, 29 Jul 2009 18:01:41 +0000</pubDate>
		<guid isPermaLink="false">http://codejanitor.com/wp/2007/02/10/levenshtein-distance-as-a-mysql-stored-function/#comment-2107</guid>
		<description>[...] implemented in MySQL, as far as I know. So this might be interesting for you: codejanitor Levenshtein Distance as a MySQL Stored Function  The normalized database could now look like this:  author id &#124; name  book id &#124; title  publisher  [...]</description>
		<content:encoded><![CDATA[<p>[...] implemented in MySQL, as far as I know. So this might be interesting for you: codejanitor Levenshtein Distance as a MySQL Stored Function  The normalized database could now look like this:  author id | name  book id | title  publisher  [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Levenshtein Distance as a MySQL Stored Function by Sean</title>
		<link>http://codejanitor.com/wp/2007/02/10/levenshtein-distance-as-a-mysql-stored-function/comment-page-1/#comment-2099</link>
		<dc:creator>Sean</dc:creator>
		<pubDate>Fri, 10 Jul 2009 05:54:05 +0000</pubDate>
		<guid isPermaLink="false">http://codejanitor.com/wp/2007/02/10/levenshtein-distance-as-a-mysql-stored-function/#comment-2099</guid>
		<description>@Aaron 'took 1 seconds'. That's not bad for 70,000 rows. You should avoid doing table scans if you're planning to do this sort of thing online. Your SQL is a bit odd: you seem to be allowing for an edit distance of 10 in a 5-character string. What I do in one of my projects is filter the table on string length first, to limit the number of times the levenshtein function is invoked. I use a version of Levenshtein that returns early once an edit distance limit is reached, so it doesn't check whole strings when they're already too distant from each other. Also, if the data is already LOWER()ed in the database, that would save you 70,000 LOWER() calls in your application each time you check a zip code.

I'd be interested to know if the hacks get you an appreciable speedup - the damlev UDFs (and some Java) versions are on my blog - linked above.</description>
		<content:encoded><![CDATA[<p>@Aaron &#8216;took 1 seconds&#8217;. That&#8217;s not bad for 70,000 rows. You should avoid doing table scans if you&#8217;re planning to do this sort of thing online. Your SQL is a bit odd: you seem to be allowing for an edit distance of 10 in a 5-character string. What I do in one of my projects is filter the table on string length first, to limit the number of times the levenshtein function is invoked. I use a version of Levenshtein that returns early once an edit distance limit is reached, so it doesn&#8217;t check whole strings when they&#8217;re already too distant from each other. Also, if the data is already LOWER()ed in the database, that would save you 70,000 LOWER() calls in your application each time you check a zip code.</p>
<p>I&#8217;d be interested to know if the hacks get you an appreciable speedup - the damlev UDFs (and some Java) versions are on my blog - linked above.</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Levenshtein Distance as a MySQL Stored Function by Aaron D. Campbell</title>
		<link>http://codejanitor.com/wp/2007/02/10/levenshtein-distance-as-a-mysql-stored-function/comment-page-1/#comment-2075</link>
		<dc:creator>Aaron D. Campbell</dc:creator>
		<pubDate>Sun, 28 Jun 2009 14:50:08 +0000</pubDate>
		<guid isPermaLink="false">http://codejanitor.com/wp/2007/02/10/levenshtein-distance-as-a-mysql-stored-function/#comment-2075</guid>
		<description>I have a zip code database for the US.  When I run a query like this, it takes almost 30 seconds:

SELECT  `zipcode` ,  `city` ,  `statecode` , (
	( 5 * levenshtein(
		'85326', LOWER( `zipcode` ) )
	)
) AS ld
FROM `zipcode`
HAVING `ld` &#60;50
ORDER BY `ld` , `city`
LIMIT 50


The same query after compiling and installing the UDF from http://joshdrew.com/ took 0.1 seconds.  I don't know if it's related to the all-number issue mentioned above or if it's because it's an 8.7M table with 70k+ rows, but the stored function is definitely too slow for me to use.</description>
		<content:encoded><![CDATA[<p>I have a zip code database for the US.  When I run a query like this, it takes almost 30 seconds:</p>
<p>SELECT  `zipcode` ,  `city` ,  `statecode` , (<br />
	( 5 * levenshtein(<br />
		'85326', LOWER( `zipcode` ) )<br />
	)<br />
) AS ld<br />
FROM `zipcode`<br />
HAVING `ld` &lt;50<br />
ORDER BY `ld` , `city`<br />
LIMIT 50</p>
<p>The same query after compiling and installing the UDF from <a href="http://joshdrew.com/" rel="nofollow">http://joshdrew.com/</a> took 0.1 seconds.  I don&#8217;t know if it&#8217;s related to the all-number issue mentioned above or if it&#8217;s because it&#8217;s an 8.7M table with 70k+ rows, but the stored function is definitely too slow for me to use.</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Levenshtein Distance as a MySQL Stored Function by Morgan</title>
		<link>http://codejanitor.com/wp/2007/02/10/levenshtein-distance-as-a-mysql-stored-function/comment-page-1/#comment-2043</link>
		<dc:creator>Morgan</dc:creator>
		<pubDate>Thu, 18 Jun 2009 06:41:41 +0000</pubDate>
		<guid isPermaLink="false">http://codejanitor.com/wp/2007/02/10/levenshtein-distance-as-a-mysql-stored-function/#comment-2043</guid>
		<description>Good work. I added this part to handle NULL values. Thanks

DECLARE cv0, cv1 VARBINARY(256); 
SET cv1 = 0x00, j = 1, i = 1, c = 0; 
IF s1 IS NULL THEN SET s1_len =0; 
ELSE SET s1_len = CHAR_LENGTH(s1); 
END IF; 
IF s2 IS NULL THEN SET s2_len =0; 
ELSE SET s2_len = CHAR_LENGTH(s2); 
END IF;</description>
		<content:encoded><![CDATA[<p>Good work. I added this part to handle NULL values. Thanks</p>
<p>DECLARE cv0, cv1 VARBINARY(256);<br />
SET cv1 = 0x00, j = 1, i = 1, c = 0;<br />
IF s1 IS NULL THEN SET s1_len =0;<br />
ELSE SET s1_len = CHAR_LENGTH(s1);<br />
END IF;<br />
IF s2 IS NULL THEN SET s2_len =0;<br />
ELSE SET s2_len = CHAR_LENGTH(s2);<br />
END IF;</p>
]]></content:encoded>
	</item>
</channel>
</rss>
