0.10.0
1 file changed
tree: b6fd99cc638220e181aa073f6752f2e30f02e6b3
  1. .github/
  2. docs/
  3. jellyfish/
  4. .coveragerc
  5. .gitignore
  6. .gitmodules
  7. .pre-commit-config.yaml
  8. build-wheels.sh
  9. LICENSE
  10. MANIFEST.in
  11. mkdocs.yml
  12. README.md
  13. run-cov.sh
  14. setup.py
  15. tox.ini
  16. upload-releases.sh
README.md

Overview

jellyfish is a library for approximate & phonetic matching of strings.

Source: https://github.com/jamesturk/jellyfish

Documentation: https://jamesturk.github.io/jellyfish/

Issues: https://github.com/jamesturk/jellyfish/issues

PyPI badge Test badge Coveralls

Included Algorithms

String comparison:

  • Levenshtein Distance
  • Damerau-Levenshtein Distance
  • Jaro Distance
  • Jaro-Winkler Distance
  • Match Rating Approach Comparison
  • Hamming Distance

Phonetic encoding:

  • American Soundex
  • Metaphone
  • NYSIIS (New York State Identification and Intelligence System)
  • Match Rating Codex

Example Usage

>>> import jellyfish
>>> jellyfish.levenshtein_distance('jellyfish', 'smellyfish')
2
>>> jellyfish.jaro_distance('jellyfish', 'smellyfish')
0.89629629629629637
>>> jellyfish.damerau_levenshtein_distance('jellyfish', 'jellyfihs')
1

>>> jellyfish.metaphone('Jellyfish')
'JLFX'
>>> jellyfish.soundex('Jellyfish')
'J412'
>>> jellyfish.nysiis('Jellyfish')
'JALYF'
>>> jellyfish.match_rating_codex('Jellyfish')
'JLLFSH'