Levenshtein distance

The Levenshtein distance is a string metric for measuring the difference between two sequences. Informally, the Levenshtein distance between two words is the minimum number of single-character edits (insertion, deletion, substitution) required to change one word into the other.
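The informal definition can be written as a recurrence over prefixes. This is a sketch in notation that the original text does not use: a and b are the two words, with lengths m and n, and D(i, j) is the distance between the first i characters of a and the first j characters of b.

```latex
\begin{aligned}
D(i, 0) &= i, \qquad D(0, j) = j, \\
D(i, j) &=
\begin{cases}
D(i-1,\, j-1) & \text{if } a_i = b_j, \\
1 + \min\bigl( D(i-1,\, j),\; D(i,\, j-1),\; D(i-1,\, j-1) \bigr) & \text{otherwise,}
\end{cases}
\end{aligned}
```

The Levenshtein distance between a and b is D(m, n). The three terms inside the minimum correspond to a deletion, an insertion and a substitution. For example, "kitten" and "sitting" are at distance 3: substitute k with s, substitute e with i, and insert g at the end.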

This works with both ASCII and UTF-8 encoded strings. The implementation uses a two-dimensional array to store the distances between the prefixes of the two words being compared.


uses
  Classes, SysUtils, Math, LCLProc;

function LevenshteinDistance(const s1 : string; s2 : string) : integer;
function LevenshteinDistanceText(const s1, s2: string): Integer;

implementation

{------------------------------------------------------------------------------
  Name:    LevenshteinDistance
  Params:  s1, s2 - UTF8 encoded strings
  Returns: Minimum number of single-character edits.
  Compare 2 UTF8 encoded strings, case sensitive.
------------------------------------------------------------------------------}
function LevenshteinDistance(const s1 : string; s2 : string) : integer;
var
  length1, length2, i, j,
  value1, value2, value3 : integer;
  matrix : array of array of integer;
begin
  // Lengths are counted in UTF-8 characters, not bytes.
  length1 := UTF8Length( s1 );
  length2 := UTF8Length( s2 );
  SetLength( matrix, length1 + 1, length2 + 1 );

  // Distance between a prefix and the empty string is the prefix length.
  for i := 0 to length1 do matrix[i, 0] := i;
  for j := 0 to length2 do matrix[0, j] := j;

  for i := 1 to length1 do
    for j := 1 to length2 do
      begin
        if UTF8Copy( s1, i, 1 ) = UTF8Copy( s2, j, 1 )
          then matrix[i, j] := matrix[i-1, j-1]   // characters match, no edit
          else begin
            value1 := matrix[i-1, j] + 1;         // deletion
            value2 := matrix[i, j-1] + 1;         // insertion
            value3 := matrix[i-1, j-1] + 1;       // substitution
            matrix[i, j] := min( value1, min( value2, value3 ));
          end;
      end;

  result := matrix[length1, length2];
end;

{------------------------------------------------------------------------------
  Name:    LevenshteinDistanceText
  Params:  s1, s2 - UTF8 encoded strings
  Returns: Minimum number of single-character edits.
  Compare 2 UTF8 encoded strings, case insensitive.
------------------------------------------------------------------------------}
function LevenshteinDistanceText(const s1, s2: string): integer;
var
  s1lower, s2lower: string;
begin
  // Lowercase both inputs, then reuse the case-sensitive comparison.
  s1lower := UTF8LowerCase( s1 );
  s2lower := UTF8LowerCase( s2 );
  result := LevenshteinDistance( s1lower, s2lower );
end;

end.
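Each row of the matrix depends only on the row above it, so the same recurrence can be computed with two rows instead of the full table. The program below is a minimal sketch of that variant, not part of the unit above: LevenshteinTwoRow is a hypothetical name, and it compares bytes rather than UTF-8 characters, so it should agree with LevenshteinDistance only for single-byte (ASCII) input.

```pascal
program LevenshteinDemo;

{$mode objfpc}{$H+}

uses
  Math;

// Hypothetical helper (not in the original unit): same recurrence as
// LevenshteinDistance, but only the previous and current rows are kept.
// Works on bytes, so results match the UTF-8 version only for ASCII input.
function LevenshteinTwoRow(const s1, s2: string): Integer;
var
  prev, curr, tmp: array of Integer;
  i, j, cost: Integer;
begin
  SetLength(prev, Length(s2) + 1);
  SetLength(curr, Length(s2) + 1);

  // Row 0: distance from the empty prefix of s1 to each prefix of s2.
  for j := 0 to Length(s2) do
    prev[j] := j;

  for i := 1 to Length(s1) do
  begin
    curr[0] := i;
    for j := 1 to Length(s2) do
    begin
      if s1[i] = s2[j] then cost := 0 else cost := 1;
      curr[j] := Min(prev[j - 1] + cost,        // match or substitution
                 Min(prev[j] + 1,               // deletion
                     curr[j - 1] + 1));         // insertion
    end;
    // Swap the row references so the finished row becomes "prev".
    tmp := prev; prev := curr; curr := tmp;
  end;

  Result := prev[Length(s2)];
end;

begin
  WriteLn(LevenshteinTwoRow('kitten', 'sitting'));  // 3
  WriteLn(LevenshteinTwoRow('', 'abc'));            // 3
  WriteLn(LevenshteinTwoRow('flaw', 'lawn'));       // 2
end.
```

The two-row form uses memory proportional to the length of s2 rather than to the product of both lengths, at the cost of no longer being able to trace back which edits were made.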