Fuzzy search finds strings that are similar to the search query rather than identical to it.
The similarity of two strings is most often measured by the edit distance (Levenshtein distance): the minimum number of single-character substitutions, insertions, and deletions required to transform one string into the other. When the transposition of two adjacent characters is also counted as a single editing operation, the measure is called the Damerau-Levenshtein distance.
For example, with a fuzziness value of 2 edits, a fuzzy search in DOCX for the query "trees" can return the word "these". Here the character "r" is replaced by "h", and the adjacent characters "e" and "s" are transposed, so the Damerau-Levenshtein distance between the two words is 2.
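The distance in the "trees" / "these" example can be checked with a short Python sketch. It computes the optimal string alignment distance, a commonly used restricted form of the Damerau-Levenshtein distance. The function name osa_distance is chosen here for illustration, and nothing in this sketch is taken from the application's actual implementation.

```python
def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance (restricted Damerau-Levenshtein).

    Counts substitutions, insertions, deletions, and transpositions of
    two adjacent characters, each as one edit.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # d[i][j] is the distance between the first i chars of a and first j of b.
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters
    for j in range(cols):
        d[0][j] = j  # insert all j characters

    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Transposition of two adjacent characters.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


print(osa_distance("trees", "these"))  # 2
```

Without the transposition step the same pair would need three edits, which is why the choice of distance matters for a given fuzziness value.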
The following methods are most often used to implement fuzzy search:
To find fuzzy matches for words in a DOCX file with this application, specify the allowed number of mistakes (the fuzziness value), from 1 to 9 characters. You can also choose whether to return only the words with the fewest differences or all words within the given number of differences.
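The two search modes can be illustrated with a minimal Python sketch over a plain list of words. The names fuzzy_find, max_edits, and best_only are hypothetical and are not the application's or the GroupDocs.Search API. The sketch also assumes the words have already been extracted from the document, and it uses a brute-force scan rather than an index.

```python
def osa_distance(a: str, b: str) -> int:
    """Edit distance counting adjacent transpositions as one edit."""
    d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        d[i][0] = i
    for j in range(len(b) + 1):
        d[0][j] = j
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def fuzzy_find(query, words, max_edits, best_only=False):
    """Return (word, distance) pairs within max_edits of the query.

    With best_only=True, keep only the words at the smallest distance found.
    """
    if not 1 <= max_edits <= 9:
        raise ValueError("max_edits must be between 1 and 9")
    hits = [(w, osa_distance(query, w)) for w in words]
    hits = [(w, dist) for w, dist in hits if dist <= max_edits]
    if best_only and hits:
        best = min(dist for _, dist in hits)
        hits = [(w, dist) for w, dist in hits if dist == best]
    # Closest matches first, ties broken alphabetically.
    return sorted(hits, key=lambda hit: (hit[1], hit[0]))


words = ["these", "tree", "three", "document"]
print(fuzzy_find("trees", words, max_edits=2))
# [('tree', 1), ('these', 2), ('three', 2)]
print(fuzzy_find("trees", words, max_edits=2, best_only=True))
# [('tree', 1)]
```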
The GroupDocs.Search library has many other fuzzy matching options. For example, you can define the allowed number of differences as a linear function of word length, or set it individually for each word length.
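The idea of a length-dependent limit can be sketched in Python as follows. This is a conceptual illustration only: linear_limit, table_limit, and their parameters are hypothetical names, and the sketch does not reproduce the GroupDocs.Search API. Integer arithmetic is used for the linear rule to avoid floating-point rounding surprises.

```python
def linear_limit(word_length, chars_per_edit=4, base=0):
    """Allowed edits grow linearly with word length (rounded down).

    With the defaults, one edit is allowed for every 4 characters.
    """
    return base + word_length // chars_per_edit


def table_limit(word_length, table, default):
    """Allowed edits looked up per word length, with a fallback value."""
    return table.get(word_length, default)


# Hypothetical table: no mistakes in very short words, more in longer ones.
limits_by_length = {1: 0, 2: 0, 3: 0, 4: 1, 5: 1, 6: 2}

for word in ["cat", "trees", "document"]:
    n = len(word)
    print(word, linear_limit(n), table_limit(n, limits_by_length, default=3))
# cat 0 0
# trees 1 1
# document 2 3
```

A limit that scales with length keeps short words from matching almost anything while still tolerating several mistakes in long words.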
You can also perform fuzzy search in many other file formats. Please see the full list below.