isub.pl -- isub: a string similarity measure
library(isub) implements a similarity measure between strings, comparable in purpose to the Levenshtein distance. The measure is based on the length of the substrings that the two strings have in common.
- isub(+Text1:text, +Text2:text, -Similarity:float, +Options:list) is det
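- Similarity is a measure of the similarity/dissimilarity between Text1 and Text2. By default it lies in the range [-1,1]; with the option zero_to_one(true) it lies in [0,1]. For example: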
    ?- isub('E56.Language', 'languange', D, [normalize(true)]).
    D = 0.4226950354609929.                       % [-1,1] range

    ?- isub('E56.Language', 'languange', D, [normalize(true),zero_to_one(true)]).
    D = 0.7113475177304964.                       % [0,1] range

    ?- isub('E56.Language', 'languange', D, []).  % without normalization
    D = 0.19047619047619047.                      % [-1,1] range

    ?- isub(aa, aa, D, []).                       % does not work for short substrings
    D = -0.8.

    ?- isub(aa, aa, D, [substring_threshold(0)]). % works with short substrings
    D = 1.0.                                      % but may give unwanted values
                                                  % between e.g. 'store' and 'spore'.

    ?- isub(joe, hoe, D, [substring_threshold(0)]).
    D = 0.5315315315315314.

    ?- isub(joe, hoe, D, []).
    D = -1.0.
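The two normalized results in the examples are consistent with zero_to_one(true) being a linear rescaling of the [-1,1] score, i.e. (S + 1) / 2. This is an inference from the printed numbers, not something the documentation states. A minimal sketch for checking this on a given pair follows; rescaled_matches/2 is a hypothetical helper and not part of library(isub).

```prolog
:- use_module(library(isub)).

%!  rescaled_matches(+Text1, +Text2) is semidet.
%
%   Hypothetical helper (not part of library(isub)).  Succeeds if the
%   zero_to_one(true) score equals (Signed + 1) / 2, where Signed is
%   the score in the [-1,1] range.  The linear mapping is an
%   assumption inferred from the documented example values.
rescaled_matches(Text1, Text2) :-
    isub(Text1, Text2, Signed, [normalize(true)]),
    isub(Text1, Text2, Unit,   [normalize(true), zero_to_one(true)]),
    Expected is (Signed + 1) / 2,
    % Compare with a small tolerance because the scores are floats.
    abs(Unit - Expected) < 1.0e-9.
```

A query such as `?- rescaled_matches('E56.Language', 'languange').` can be used to confirm the relationship on the pair from the examples.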
This version of isub/4 replaces the old one while remaining backwards compatible. It adds several options for tuning the algorithm.
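As an illustration of how the options might be used in practice, the sketch below picks the candidate most similar to a query string. best_match/4 is a hypothetical helper, not part of library(isub). It assumes the normalize(true) and zero_to_one(true) options shown in the examples, so that scores fall in [0,1].

```prolog
:- use_module(library(isub)).
:- use_module(library(lists)).

%!  best_match(+Query, +Candidates, -Best, -Score) is semidet.
%
%   Hypothetical helper (not part of library(isub)).  Best is the
%   element of Candidates with the highest isub/4 similarity to
%   Query, and Score is that similarity.  Fails if Candidates is
%   empty.
best_match(Query, Candidates, Best, Score) :-
    % Pair every candidate with its score as Score-Candidate.
    findall(S-C,
            ( member(C, Candidates),
              isub(Query, C, S, [normalize(true), zero_to_one(true)])
            ),
            Scored),
    % max_member/2 uses the standard order of terms, which compares
    % the score first; ties are broken by the candidate itself.
    max_member(Score-Best, Scored).
```

For instance, `?- best_match('E56.Language', [languange, person, place], Best, Score).` binds Best to whichever of the three atoms scores highest. For very short strings, adding substring_threshold(0) to the option list may be needed, with the caveat noted in the examples.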
Undocumented predicates
The following predicates are exported, but not or incorrectly documented.