Categories to calculate the edit distance between NSString objects.
    #import <MDCDamerauLevenshtein/MDCDamerauLevenshtein.h>

    [@"Central Park" mdc_levenshteinDistanceTo:@"Centarl Prak"]; // => 4
    [@"Central Park" mdc_damerauLevenshteinDistanceTo:@"Centarl Prak"]; // => 2
MDCDamerauLevenshtein includes two algorithms for calculating the edit distance between NSString objects:
- Levenshtein distance counts the insertions, deletions, and substitutions needed to convert one string into the other.
- Damerau-Levenshtein extends Levenshtein to include the transposition of two adjacent characters. Damerau states that some combination of these four operations accounts for 80% of all human spelling errors.
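To make the difference between the two algorithms concrete, here is a minimal sketch in plain C (which also compiles inside an Objective-C file). It is not the library's implementation. The function names `levenshtein` and `damerau_levenshtein` are made up for illustration, the code compares bytes rather than NSString characters, and the transposition rule shown is the "optimal string alignment" variant, which is an assumption about what the library does.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static size_t min3(size_t a, size_t b, size_t c) {
    size_t m = a < b ? a : b;
    return m < c ? m : c;
}

/* Shared dynamic-programming core (hypothetical helper, not library code).
 * d[i][j] holds the distance between the first i bytes of a and the
 * first j bytes of b. When allow_transposition is nonzero, swapping two
 * adjacent characters counts as one edit (optimal string alignment). */
static size_t edit_distance(const char *a, const char *b, int allow_transposition) {
    assert(a != NULL && b != NULL);
    size_t n = strlen(a), m = strlen(b);
    size_t cols = m + 1;
    size_t *d = malloc((n + 1) * cols * sizeof *d);
    assert(d != NULL);

    for (size_t i = 0; i <= n; i++) d[i * cols] = i; /* delete i characters */
    for (size_t j = 0; j <= m; j++) d[j] = j;        /* insert j characters */

    for (size_t i = 1; i <= n; i++) {
        for (size_t j = 1; j <= m; j++) {
            size_t cost = (a[i - 1] == b[j - 1]) ? 0 : 1;
            size_t best = min3(d[(i - 1) * cols + j] + 1,           /* deletion */
                               d[i * cols + (j - 1)] + 1,           /* insertion */
                               d[(i - 1) * cols + (j - 1)] + cost); /* substitution */
            if (allow_transposition && i > 1 && j > 1 &&
                a[i - 1] == b[j - 2] && a[i - 2] == b[j - 1]) {
                size_t swapped = d[(i - 2) * cols + (j - 2)] + 1;   /* transposition */
                if (swapped < best) best = swapped;
            }
            d[i * cols + j] = best;
        }
    }

    size_t result = d[n * cols + m];
    free(d);
    return result;
}

size_t levenshtein(const char *a, const char *b) {
    return edit_distance(a, b, 0);
}

size_t damerau_levenshtein(const char *a, const char *b) {
    return edit_distance(a, b, 1);
}
```

With these definitions, `levenshtein("Central Park", "Centarl Prak")` returns 4 and `damerau_levenshtein("Central Park", "Centarl Prak")` returns 2. Each swapped pair costs two substitutions under plain Levenshtein but only one transposition under Damerau-Levenshtein.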
Potential applications for this library:
- Don't just use -[NSString compare:options:] to filter search results; also display terms with small edit distances.
- ...and many more!
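As a rough illustration of the search-filtering idea, the sketch below keeps every term whose edit distance to the query is at or below a threshold. It is plain C, not code from this library. `fuzzy_filter` and `two_row_levenshtein` are hypothetical names, the comparison is byte-wise (ASCII only), and the threshold of 2 in the usage note is an arbitrary choice.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Levenshtein distance using two rolling rows instead of a full matrix
 * (hypothetical helper; compares bytes, so ASCII only). */
static size_t two_row_levenshtein(const char *a, const char *b) {
    size_t n = strlen(a), m = strlen(b);
    size_t *prev = malloc((m + 1) * sizeof *prev);
    size_t *curr = malloc((m + 1) * sizeof *curr);
    assert(prev != NULL && curr != NULL);

    for (size_t j = 0; j <= m; j++) prev[j] = j;
    for (size_t i = 1; i <= n; i++) {
        curr[0] = i;
        for (size_t j = 1; j <= m; j++) {
            size_t cost = (a[i - 1] == b[j - 1]) ? 0 : 1;
            size_t best = prev[j] + 1;                                /* deletion */
            if (curr[j - 1] + 1 < best) best = curr[j - 1] + 1;       /* insertion */
            if (prev[j - 1] + cost < best) best = prev[j - 1] + cost; /* substitution */
            curr[j] = best;
        }
        size_t *tmp = prev; prev = curr; curr = tmp; /* reuse the rows */
    }

    size_t result = prev[m];
    free(prev);
    free(curr);
    return result;
}

/* Stores pointers to every term within max_distance edits of query in
 * matches (which must have room for term_count entries) and returns
 * how many were stored. Input order is preserved. */
size_t fuzzy_filter(const char *query, const char *const *terms, size_t term_count,
                    size_t max_distance, const char **matches) {
    assert(query != NULL && terms != NULL && matches != NULL);
    size_t found = 0;
    for (size_t i = 0; i < term_count; i++) {
        if (two_row_levenshtein(query, terms[i]) <= max_distance) {
            matches[found++] = terms[i];
        }
    }
    return found;
}
```

For example, filtering the terms "Central Park", "Central Perk", and "Prospect Park" against the query "Central Park" with a maximum distance of 2 keeps the first two and drops the third.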
Benchmarking Against Other Implementations
The benchmarking app is included in this repository. It consists of two benchmarks:
- Normal: Finding the Levenshtein distance between "sitting" and "kitten"
- Large: Finding the Levenshtein distance between two paragraphs of text (409 and 728 characters, respectively)
| Library | Avg. Time (Normal) | Avg. Time (Large) |
| --- | --- | --- |
| MDCDamerauLevenshtein | 14,218 nanoseconds | 0.0792383 seconds |
| NSString+LevenshteinDistance | 17,812 nanoseconds (25% slower) | 0.0949104 seconds (20% slower) |
koyachi/NSString-LevenshteinDistance only computes Levenshtein distance, not Damerau-Levenshtein, so only Levenshtein benchmarks are included here. That project does not include unit tests, but it produced correct distances when benchmarked.
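The benchmarking app's source is not reproduced here, so the following is only a sketch, in plain C, of how per-call averages and "percent slower" figures like those above could be computed. `average_nanoseconds`, `percent_slower`, and `sample_workload` are hypothetical names, the timer is the standard `clock()` (CPU time), and the percentages are assumed to be relative to the MDCDamerauLevenshtein time.

```c
#include <assert.h>
#include <time.h>

/* Average CPU time per call of work(), in nanoseconds, over the given
 * number of iterations. */
double average_nanoseconds(void (*work)(void), long iterations) {
    assert(work != NULL && iterations > 0);
    clock_t start = clock();
    for (long i = 0; i < iterations; i++) {
        work();
    }
    clock_t end = clock();
    double seconds = (double)(end - start) / CLOCKS_PER_SEC;
    return seconds * 1e9 / (double)iterations;
}

/* How much slower `other` is than `baseline`, as a percentage of the
 * baseline time. */
double percent_slower(double baseline, double other) {
    assert(baseline > 0.0);
    return (other - baseline) / baseline * 100.0;
}

/* Hypothetical stand-in for the code under test; a real benchmark would
 * call the edit-distance routine being measured here. */
static volatile unsigned long sink;
void sample_workload(void) {
    for (unsigned long i = 0; i < 1000; i++) {
        sink += i;
    }
}
```

Under that assumption, `percent_slower(14218.0, 17812.0)` is about 25.3 and `percent_slower(0.0792383, 0.0949104)` is about 19.8, which round to the 25% and 20% shown in the table.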