commit | dcd0fc209f368792aa8048c145b311ea5aa2aa93 | |
---|---|---|
author | Dimitri Papadopoulos <3234522+DimitriPapadopoulos@users.noreply.github.com> | Sun Mar 26 20:25:21 2023 +0200 |
committer | Dimitri Papadopoulos <3234522+DimitriPapadopoulos@users.noreply.github.com> | Sun Mar 26 20:25:21 2023 +0200 |
tree | 9e8b6002e1f34933b91772533de4c34b9c80a21b | |
parent | 2bf1d4bd7956e5b38947f2987787acd749f5ad1a | |
UTF-8 is implicit in Python 3
jellyfish is a library for approximate & phonetic matching of strings.
Source: https://github.com/jamesturk/jellyfish
Documentation: https://jamesturk.github.io/jellyfish/
Issues: https://github.com/jamesturk/jellyfish/issues
String comparison:
Phonetic encoding:
    >>> import jellyfish
    >>> jellyfish.levenshtein_distance('jellyfish', 'smellyfish')
    2
    >>> jellyfish.jaro_distance('jellyfish', 'smellyfish')
    0.89629629629629637
    >>> jellyfish.damerau_levenshtein_distance('jellyfish', 'jellyfihs')
    1
    >>> jellyfish.metaphone('Jellyfish')
    'JLFX'
    >>> jellyfish.soundex('Jellyfish')
    'J412'
    >>> jellyfish.nysiis('Jellyfish')
    'JALYF'
    >>> jellyfish.match_rating_codex('Jellyfish')
    'JLLFSH'
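
To illustrate what the Levenshtein result above measures, here is a minimal pure-Python sketch of the classic dynamic-programming edit distance. It is not jellyfish's own implementation; the function name `levenshtein` is a hypothetical one chosen for this example, and it only counts insertions, deletions, and substitutions.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character edits turning a into b.

    Illustrative sketch only; not the code used by jellyfish.
    """
    # previous[j] is the distance between the prefix of a processed so far
    # and the first j characters of b. Initially the prefix of a is empty.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        # Turning the first i characters of a into an empty string
        # takes i deletions.
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


print(levenshtein('jellyfish', 'smellyfish'))
```

For 'jellyfish' and 'smellyfish' the sketch finds two edits (insert 's', substitute 'j' with 'm'), matching the `levenshtein_distance` result shown in the session above. It keeps only two rows of the table at a time, so memory use grows with the length of the second string rather than with the product of both lengths.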