elasticlunr vs fuse.js vs fuzzy-search vs fuzzysearch vs fuzzysort
JavaScript Fuzzy Search and Indexing Libraries

Fuzzy search and indexing libraries in JavaScript search and retrieve data with approximate matching, tolerating errors, typos, or variations in the search terms. They make search more forgiving and user-friendly, especially when the dataset is large or the input may not be perfectly accurate. Most of them compute a similarity between strings and return results that pass a defined threshold, which suits search boxes, autocomplete features, and data retrieval systems.

  • elasticlunr is a lightweight full-text search library that builds an inverted index for fast searching, supporting stemming, tokenization, and custom analyzers.
  • fuse.js is a lightweight fuzzy-search library for client-side searches, with configurable thresholds, scoring, and support for nested objects.
  • fuzzy-search is a simple library for fuzzy searches on arrays of strings or objects, with a small API and a few options such as case sensitivity and sorting.
  • fuzzysearch is a tiny, fast utility that checks whether the characters of a pattern appear, in order, within a string. It returns a boolean and carries none of the overhead of more complex algorithms.
  • fuzzysort is a high-performance fuzzy matching library that ranks results by their similarity to the search term, with a focus on sorting rather than just searching.
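The idea of scoring similarity against a threshold can be sketched in plain JavaScript. The example below is an illustration only and is not the algorithm of any library listed here; the names `levenshtein`, `similarity`, and `fuzzyFilter` and the 0.7 threshold are assumptions chosen for the sketch.

```javascript
// Classic edit distance (insertions, deletions, substitutions), single-row version.
function levenshtein(a, b) {
  const row = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let diagonal = row[0]; // value of the previous row at column j - 1
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j];
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      row[j] = Math.min(above + 1, row[j - 1] + 1, diagonal + cost);
      diagonal = above;
    }
  }
  return row[b.length];
}

// Normalize the distance into a 0..1 similarity (1 means identical).
function similarity(a, b) {
  const longest = Math.max(a.length, b.length);
  return longest === 0 ? 1 : 1 - levenshtein(a, b) / longest;
}

// Keep only candidates whose similarity reaches the threshold.
function fuzzyFilter(query, candidates, threshold) {
  return candidates.filter((candidate) => similarity(query, candidate) >= threshold);
}

console.log(fuzzyFilter('aple', ['apple', 'maple', 'banana'], 0.7)); // ['apple', 'maple']
```

Raising the threshold makes matching stricter; lowering it admits more distant candidates at the cost of noisier results.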

Stat Detail

Package | Downloads | Stars | Size | Issues | Publish | License
elasticlunr | 29,954 | 2,070 | - | 77 | 10 years ago | MIT
fuse.js | 0 | 20,222 | 312 kB | 2 | a month ago | Apache-2.0
fuzzy-search | 0 | 231 | - | 16 | 6 years ago | ISC
fuzzysearch | 0 | 2,745 | - | 5 | 11 years ago | MIT
fuzzysort | 0 | 4,289 | 45.6 kB | 14 | 2 years ago | MIT

Feature Comparison: elasticlunr vs fuse.js vs fuzzy-search vs fuzzysearch vs fuzzysort

Search Algorithm

  • elasticlunr:

    elasticlunr uses an inverted index for full-text search, allowing for fast retrieval of documents based on indexed terms. It supports stemming and tokenization, which helps in matching terms more effectively, even if they are not an exact match.

  • fuse.js:

    fuse.js employs a fuzzy matching algorithm that calculates the similarity between the search term and the target strings. It allows for configurable thresholds, weights, and scoring, making it highly flexible and accurate for searching within complex data structures.

  • fuzzy-search:

    fuzzy-search implements a simple fuzzy matching algorithm that compares the search term with the target strings or object properties. It is a straightforward approach with only a few options (such as case sensitivity and sorting of results), which keeps it efficient for quick searches.

  • fuzzysearch:

    fuzzysearch checks whether the characters of the search pattern appear in the target string in the same order, though not necessarily next to each other. It returns true or false rather than a score, so it suits quick lookups that need no ranking or further processing.

  • fuzzysort:

    fuzzysort utilizes a high-performance fuzzy matching algorithm that ranks results based on their similarity to the search term. It focuses on sorting the matches by their fuzziness score, providing a fast and accurate way to retrieve and display results.
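The in-order character check described for fuzzysearch can be sketched in a few lines. This is an illustration of the technique, not the package's source; `isSubsequence` is a hypothetical name.

```javascript
// Returns true when every character of `needle` appears in `haystack`
// in the same order, with any number of other characters in between.
function isSubsequence(needle, haystack) {
  let from = 0;
  for (const ch of needle) {
    const found = haystack.indexOf(ch, from);
    if (found === -1) return false;
    from = found + ch.length; // continue searching after the matched character
  }
  return true;
}

console.log(isSubsequence('hlo wr', 'hello world')); // true
console.log(isSubsequence('dlrow', 'hello world'));  // false (order matters)
```

A check like this is cheap because it scans the target once, but it gives no ranking: every match is equally "good".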

Data Structure Support

  • elasticlunr:

    elasticlunr supports indexing and searching through plain text, as well as structured data like objects and arrays. It allows for custom tokenization and analysis, making it versatile for different types of data.

  • fuse.js:

    fuse.js is designed to work with various data structures, including arrays of strings, objects, and nested data. It provides flexible options for specifying which fields to search, making it suitable for complex and hierarchical data.

  • fuzzy-search:

    fuzzy-search works with arrays of strings and objects, allowing for fuzzy matching on both simple and complex data. It supports searching through object properties, making it adaptable to different data formats.

  • fuzzysearch:

    fuzzysearch operates on a pair of strings: a pattern and the text to test it against. It does not support objects or collections, which keeps it simple and efficient; to search a list, call it once per item.

  • fuzzysort:

    fuzzysort is compatible with arrays of strings and objects, providing fuzzy matching and sorting capabilities. It can handle both simple and structured data, making it versatile for various applications.

Customization

  • elasticlunr:

    elasticlunr allows for customization of the indexing and searching process, including the ability to define custom tokenizers, analyzers, and scoring algorithms. This flexibility enables developers to tailor the search functionality to meet specific requirements.

  • fuse.js:

    fuse.js offers extensive customization options for the search algorithm, including the ability to adjust the fuzziness threshold, weight different fields, and supply custom sort and getter functions (`sortFn`, `getFn`). This makes it highly adaptable to different use cases and data types.

  • fuzzy-search:

    fuzzy-search provides basic customization: you specify which object properties to search and can toggle options such as case sensitivity and sorting of results. It is more limited than the other libraries, focusing on simplicity and ease of use.

  • fuzzysearch:

    fuzzysearch is minimalistic and has no configuration: it takes a pattern and a string and returns a boolean. Its simplicity is its strength, making it easy to use without any setup.

  • fuzzysort:

    fuzzysort allows for some customization of the matching and sorting process, including the ability to adjust the fuzziness threshold and define custom scoring functions. It is designed to be fast and efficient while still providing flexibility for developers.

Performance

  • elasticlunr:

    elasticlunr is optimized for fast search performance, especially after the initial indexing phase. The use of an inverted index allows for quick lookups, making it suitable for real-time search applications with moderate-sized datasets.

  • fuse.js:

    fuse.js is performant for client-side searches, but its speed can vary depending on the size of the dataset and the complexity of the search configuration. It is best used with optimized data and reasonable fuzziness settings to maintain good performance.

  • fuzzy-search:

    fuzzy-search is designed for efficiency, providing fast fuzzy search capabilities on arrays. Its performance is generally good, making it suitable for applications that require quick lookups without significant overhead.

  • fuzzysearch:

    fuzzysearch is extremely fast because it only checks whether the pattern's characters occur in order within a string. Its lightweight nature and simple algorithm make it a good fit for performance-critical code where speed is essential.

  • fuzzysort:

    fuzzysort is known for its high performance, particularly in ranking and sorting fuzzy matches. It is optimized for speed, making it a great choice for applications that need quick search results with minimal latency.
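Why an inverted index makes lookups fast after the initial indexing phase can be shown with a toy version. This is a minimal sketch, not elasticlunr's implementation: it skips stemming, scoring, and fields, and the names `buildInvertedIndex` and `lookup` are hypothetical.

```javascript
// Build a map from each lowercased token to the set of document ids containing it.
function buildInvertedIndex(docs) {
  const index = new Map();
  for (const doc of docs) {
    const tokens = doc.text.toLowerCase().split(/\W+/).filter(Boolean);
    for (const token of tokens) {
      if (!index.has(token)) index.set(token, new Set());
      index.get(token).add(doc.id);
    }
  }
  return index;
}

// A lookup is a single map access, independent of the number of documents.
function lookup(index, term) {
  return [...(index.get(term.toLowerCase()) || [])];
}

const toyIndex = buildInvertedIndex([
  { id: 1, text: 'Hello World' },
  { id: 2, text: 'Searching with elasticlunr is fun' },
  { id: 3, text: 'Hello again, search fans' },
]);

console.log(lookup(toyIndex, 'hello'));   // [1, 3]
console.log(lookup(toyIndex, 'missing')); // []
```

The cost moves to index construction, which is why this approach pays off when the same dataset is queried many times.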

Ease of Use: Code Examples

  • elasticlunr:

    Basic Usage of elasticlunr

    // Import the library
    const elasticlunr = require('elasticlunr');
    
    // Create an index
    const index = elasticlunr(function () {
      this.addField('title');
      this.addField('body');
      this.setRef('id');
    });
    
    // Add documents to the index
    index.addDoc({ id: 1, title: 'Hello World', body: 'This is a test document.' });
    index.addDoc({ id: 2, title: 'Elastic Search', body: 'Searching with elasticlunr is fun!' });
    
    // Search the index
    const results = index.search('elastic', { expand: true });
    console.log(results);
    
  • fuse.js:

    Basic Usage of fuse.js

    // Import the library
    const Fuse = require('fuse.js');
    
    // Sample data
    const data = [
      { id: 1, title: 'Apple', tags: ['fruit', 'food'] },
      { id: 2, title: 'Banana', tags: ['fruit', 'yellow'] },
      { id: 3, title: 'Carrot', tags: ['vegetable', 'orange'] },
    ];
    
    // Configure Fuse.js
    const options = {
      keys: ['title', 'tags'], // Fields to search
      threshold: 0.3, // Fuzziness threshold
    };
    
    // Create a Fuse instance
    const fuse = new Fuse(data, options);
    
    // Perform a search
    const result = fuse.search('app');
    console.log(result);
    
  • fuzzy-search:

    Basic Usage of fuzzy-search

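    // Import the library (the package exports a FuzzySearch class)
    const FuzzySearch = require('fuzzy-search');
    
    // Sample data
    const items = ['apple', 'banana', 'cherry', 'date'];
    
    // Create a searcher; no keys are needed for an array of strings
    const searcher = new FuzzySearch(items);
    
    // Perform a fuzzy search
    const results = searcher.search('appl');
    console.log(results); // Output: ['apple']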
    
  • fuzzysearch:

    Basic Usage of fuzzysearch

    // Import the library
    const fuzzysearch = require('fuzzysearch');
    
    // Sample strings
    const str = 'hello world';
    const pattern = 'hlo wr';
    
    // Perform a fuzzy search
    const isMatch = fuzzysearch(pattern, str);
    console.log(isMatch); // Output: true
    
  • fuzzysort:

    Basic Usage of fuzzysort

    // Import the library
    const fuzzysort = require('fuzzysort');
    
    // Sample data
    const items = ['apple', 'banana', 'cherry', 'date'];
    
    // Perform a fuzzy sort
    const results = fuzzysort.go('ap', items);
    console.log(results);
    

How to Choose: elasticlunr vs fuse.js vs fuzzy-search vs fuzzysearch vs fuzzysort

  • elasticlunr:

    Choose elasticlunr if you need a lightweight, client-side search solution that supports full-text indexing and searching with features like stemming and custom analyzers. It is ideal for applications that require fast search capabilities without the need for a server-side solution.

  • fuse.js:

    Select fuse.js if you require a highly configurable fuzzy search library that can handle complex data structures, provide detailed matching scores, and allow for customization of the search algorithm. It is suitable for applications where accuracy and flexibility in searching are paramount.

  • fuzzy-search:

    Opt for fuzzy-search if you want a simple and efficient solution for performing fuzzy searches on arrays with minimal setup. It is great for projects that need a straightforward implementation without the need for extensive configuration or features.

  • fuzzysearch:

    Use fuzzysearch when you need a lightweight and fast utility that tells you whether a pattern's characters appear in order within a string. It is a good fit when performance is critical and a yes/no answer without ranking is enough.

  • fuzzysort:

    Choose fuzzysort if you need a fast fuzzy search library that ranks results based on their similarity to the search term, with a focus on performance and accuracy. It is ideal for applications that require quick searches with sorted results.

README for elasticlunr

Elasticlunr.js

Elasticlunr.js is a lightweight full-text search engine written in JavaScript for in-browser and offline search. It is based on Lunr.js but is more flexible: it provides query-time boosting, field search, a more rational scoring/ranking methodology, and fast computation. Elasticlunr.js is a bit like Solr, but much smaller and not as bright, while still offering flexible configuration, query-time boosting, field search and other features.

Why Do You Need Lightweight Offline Search?

  1. In some systems you do not want to deploy a complex full-text search engine (such as Lucene, Elasticsearch, or Sphinx); you only want to serve static web pages with search functionality. In that case you can build the index in advance and load it on the client side (for example, in the browser).
  2. Offline search. Users often download documentation; you can build an index, ship it inside the documentation package, and provide offline search.
  3. On limited or restricted networks, such as a WAN or LAN, offline search is a better choice.
  4. On mobile devices such as an iPhone or Android phone, network traffic may be very expensive, so offline search is a good choice.
  5. If you want to provide search in a Node.js system without a complex setup, or you only need to support thousands of documents, Elasticlunr.js is what you want.

Key Features Compared with Lunr.js

  1. Query-Time Boosting: you do not need to set boosting weights while building the index, so you can try different boosting schemes at query time.
  2. More Rational Scoring Mechanism: Elasticlunr.js uses nearly the same scoring mechanism as Elasticsearch, which is also the mechanism used by Lucene.
  3. Field Search: you can choose which fields to index and which fields to search.
  4. Boolean Model: you can set which fields to search and the boolean model ("OR" or "AND") applied to the query tokens.
  5. Combined Boolean Model, TF/IDF Model and Vector Space Model, which make the result ranking more reliable.
  6. Fast: Elasticlunr.js removed TokenCorpus and Vector from lunr.js. With the combined model there is no need to compute document and query vectors to measure similarity, which improves search speed significantly.
  7. Small Index Size: Elasticlunr.js does not store TokenCorpus, because there is no need to compute query and document vectors, so the index file is small. Users can also choose whether to store the original JSON documents; not storing them reduces the index size significantly, which is especially helpful for offline search. The index is about half the size of a lunr.js index file.
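To give a feel for the TF/IDF part of the ranking mentioned above, here is a textbook-style sketch. The square-root term frequency and the smoothed inverse document frequency are assumptions made for this example and should not be read as Elasticlunr.js's exact formula; `idf` and `tfIdf` are hypothetical helper names.

```javascript
// Inverse document frequency: terms found in fewer documents weigh more.
// The "+ 1" terms are a common smoothing choice (an assumption of this sketch).
function idf(totalDocs, docsWithTerm) {
  return 1 + Math.log(totalDocs / (docsWithTerm + 1));
}

// Term frequency is dampened with a square root so repeated terms
// add weight with diminishing returns.
function tfIdf(termCountInDoc, totalDocs, docsWithTerm) {
  return Math.sqrt(termCountInDoc) * idf(totalDocs, docsWithTerm);
}

// A term that occurs 4 times in a document and appears in 1 of 2 documents.
console.log(tfIdf(4, 2, 1)); // 2

// A rare term outweighs a common one at the same term count.
console.log(tfIdf(1, 10, 0) > tfIdf(1, 10, 9)); // true
```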

Example

A very simple search index can be created using the following scripts:

var index = elasticlunr(function () {
    this.addField('title');
    this.addField('body');
    this.setRef('id');
});

Adding documents to the index is as simple as:

var doc1 = {
    "id": 1,
    "title": "Oracle released its latest database Oracle 12g",
    "body": "Yesterday Oracle has released its new database Oracle 12g, this would make more money for this company and lead to a nice profit report of annual year."
};

var doc2 = {
    "id": 2,
    "title": "Oracle released its profit report of 2015",
    "body": "As expected, Oracle released its profit report of 2015, during the good sales of database and hardware, Oracle's profit of 2015 reached 12.5 Billion."
}

index.addDoc(doc1);
index.addDoc(doc2);

Searching is just as simple:

index.search("Oracle database profit");

Also, you could do query-time boosting by passing in a configuration.

index.search("Oracle database profit", {
    fields: {
        title: {boost: 2},
        body: {boost: 1}
    }
});

This returns a list of matching documents with a score of how closely they match the search query:

[{
    "ref": 1,
    "score": 0.5376053707962494
},
{
    "ref": 2,
    "score": 0.5237481076838757
}]

If users do not want to store the original JSON documents, they can use the following setting:

var index = elasticlunr(function () {
    this.addField('title');
    this.addField('body');
    this.setRef('id');
    this.saveDocument(false);
});

Elasticlunr.js will then not store the JSON documents. This reduces the index size but brings some inconvenience, such as when updating or deleting a document by its id or reference. In practice, most users never update or delete documents in the index.

API documentation is available, as well as a full working example.

Description

Elasticlunr.js is based on Lunr.js but is more flexible. It provides Query-Time Boosting, Field Search, a more rational scoring/ranking methodology, flexible configuration and so on. It is a bit like Solr, but much smaller and not as bright.

Installation

Simply include the elasticlunr.js source file in the page that you want to use it. Elasticlunr.js is supported in all modern browsers.

Browsers that do not support ES5 will require a JavaScript shim for Elasticlunr.js to work. You can either use Augment.js, ES5-Shim or any library that patches old browsers to provide an ES5 compatible JavaScript environment.

Documentation

This part only covers the important aspects of elasticlunr.js; for the complete documentation, please go to the API documentation.

1. Build Index

When you first create an index instance, you need to specify which fields you want to index. If you do not specify any field, no field of your documents will be searchable. You can specify fields with:

var index = elasticlunr(function () {
    this.addField('title');
    this.addField('body');
    this.setRef('id');
});

You can also set the document reference with this.setRef('id'). If you do not set a document ref, elasticlunr.js uses 'id' by default.

The same index setup can also be written as follows:

var index = elasticlunr();
index.addField('title');
index.addField('body');
index.setRef('id');

You can also choose not to store the original JSON documents, which reduces the index size:

var index = elasticlunr();
index.addField('title');
index.addField('body');
index.setRef('id');
index.saveDocument(false);

2. Add document to index

Adding a document to the index is very simple: prepare your document in JSON format, then add it to the index.

var doc1 = {
    "id": 1,
    "title": "Oracle released its latest database Oracle 12g",
    "body": "Yesterday Oracle has released its new database Oracle 12g, this would make more money for this company and lead to a nice profit report of annual year."
};

var doc2 = {
    "id": 2,
    "title": "Oracle released its profit report of 2015",
    "body": "As expected, Oracle released its profit report of 2015, during the good sales of database and hardware, Oracle's profit of 2015 reached 12.5 Billion."
}

index.addDoc(doc1);
index.addDoc(doc2);

If your JSON document contains a field that is not configured in the index, that field will not be indexed, which means it is not searchable.

3. Remove document from index

Elasticlunr.js supports removing a document from the index: pass the JSON document to the elasticlunr.Index.prototype.removeDoc() function.

For example:

var doc = {
    "id": 1,
    "title": "Oracle released its latest database Oracle 12g",
    "body": "Yesterday Oracle has released its new database Oracle 12g, this would make more money for this company and lead to a nice profit report of annual year."
};

index.removeDoc(doc);

Removing a document removes every token of each of its fields from the corresponding per-field inverted index.

4. Update a document in index

Elasticlunr.js supports updating a document in the index: pass the JSON document to the elasticlunr.Index.prototype.updateDoc() function.

For example:

var doc = {
    "id": 1,
    "title": "Oracle released its latest database Oracle 12g",
    "body": "Yesterday Oracle has released its new database Oracle 12g, this would make more money for this company and lead to a nice profit report of annual year."
};

index.updateDoc(doc);

5. Query from Index

Elasticlunr.js provides flexible query configuration and supports query-time boosting and Boolean logic settings. You can pass a configuration that tells elasticlunr.js how to apply query-time boosting, which fields to search in, and which boolean logic to use. Alternatively, you can simply provide a query string, which also works well thanks to the default scoring mechanism.

5.1 Simple Query

Because elasticlunr.js has a solid default scoring mechanism, a simple search is enough for most requirements.

index.search("Oracle database profit");

The output is a results array. Each element is an object containing a ref field and a score field: ref is the document reference and score is the similarity measurement.

The results array is sorted by score in descending order.
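Because each result carries only a ref and a score, an application usually maps the refs back to its own documents. The sketch below does this with hard-coded results in the shape shown earlier; the `docsById` lookup table is a hypothetical structure maintained by the application, not part of elasticlunr.js, and refs are normalized to strings as a precaution since their type is an assumption here.

```javascript
// Hypothetical application-side lookup table keyed by document id (as a string).
const docsById = new Map([
  ['1', { id: 1, title: 'Oracle released its latest database Oracle 12g' }],
  ['2', { id: 2, title: 'Oracle released its profit report of 2015' }],
]);

// Results in the { ref, score } shape returned by a search.
const searchResults = [
  { ref: 1, score: 0.5376053707962494 },
  { ref: 2, score: 0.5237481076838757 },
];

// Resolve each ref to its document, keeping the score for display.
const hits = searchResults
  .map(({ ref, score }) => ({ doc: docsById.get(String(ref)), score }))
  .filter((hit) => hit.doc !== undefined); // skip refs we cannot resolve

for (const hit of hits) {
  console.log(hit.score.toFixed(3), hit.doc.title);
}
```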

5.2 Configuration Query

5.2.1 Query-Time Boosting

Choose which fields to search in by passing a JSON configuration, and set a boost for each search field. With this configuration, elasticlunr.js searches the query string only in the specified fields, using the given boosting weights.

The scoring mechanism used in elasticlunr.js is fairly complex; please go to details for more information.

index.search("Oracle database profit", {
    fields: {
        title: {boost: 2},
        body: {boost: 1}
    }
});

5.2.2 Boolean Model

Elasticlunr.js also supports a boolean logic setting. If no boolean logic is set, elasticlunr.js uses "OR" logic by default. With the default "OR" logic, elasticlunr.js can reach high recall.

index.search("Oracle database profit", {
    fields: {
        title: {boost: 2},
        body: {boost: 1}
    },
    bool: "OR"
});

The boolean model can be set at the global level, as in the setting above, or at the field level. If both the global and the field level contain a "bool" setting, the field-level setting overrides the global one.

index.search("Oracle database profit", {
    fields: {
        title: {boost: 2, bool: "AND"},
        body: {boost: 1}
    },
    bool: "OR"
});

The above setting searches the title field with the "AND" model and the other fields with the "OR" model. Currently, if you search in multiple fields, the results from each field are merged to give the query results. In the future elasticlunr will support a configuration that lets users choose how to combine the results from each field, such as "most_field" or "top_field".
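The difference between the two boolean models can be illustrated on sets of matching document ids, one set per query token. This is a conceptual sketch rather than elasticlunr.js's internal code; `combinePostings` is a hypothetical helper, and the sets below are worked out by hand from the titles of the two sample documents.

```javascript
// postings: one Set of document ids per query token.
// "AND" keeps documents matched by every token; "OR" keeps documents matched by any token.
function combinePostings(postings, bool) {
  if (postings.length === 0) return new Set();
  if (bool === 'AND') {
    return postings.reduce((acc, ids) => new Set([...acc].filter((id) => ids.has(id))));
  }
  return postings.reduce((acc, ids) => new Set([...acc, ...ids]));
}

// Title field of the sample documents, query "Oracle database profit":
const titlePostings = [
  new Set([1, 2]), // "oracle" appears in both titles
  new Set([1]),    // "database" appears only in the title of document 1
  new Set([2]),    // "profit" appears only in the title of document 2
];

console.log([...combinePostings(titlePostings, 'OR')]);  // [1, 2]
console.log([...combinePostings(titlePostings, 'AND')]); // []
```

"OR" favors recall, while "AND" favors precision and may return nothing when no single field contains every token.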

5.2.3 Token Expansion

Sometimes users want to expand a query token to increase recall. They can do so by setting expand to true in the configuration; the default is false. For example, if the query token is "micro" and both "microwave" and "microscope" are in the index, expanding the token "micro" means that both "microwave" and "microscope" are also searched in the index. Results from expanded tokens are penalized because they are not exactly the same as the query token.

index.search("micro", {
    fields: {
        title: {boost: 2, bool: "AND"},
        body: {boost: 1}
    },
    bool: "OR",
    expand: true
});

A field-level expand configuration overrides the global expand configuration.

index.search("micro", {
    fields: {
        title: {
            boost: 2,
            bool: "AND",
            expand: false
        },
        body: {boost: 1}
    },
    bool: "OR",
    expand: true
});
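Token expansion with a penalty can be pictured as a prefix lookup over the indexed vocabulary. The sketch below is illustrative only: `expandToken` is a hypothetical helper, and the length-ratio weight is an assumption made for this example, not the penalty formula elasticlunr.js actually applies.

```javascript
// Expand a query token to every indexed token that starts with it.
// Longer expansions get a lower weight (assumed penalty: length ratio).
function expandToken(token, vocabulary) {
  return vocabulary
    .filter((candidate) => candidate.startsWith(token))
    .map((candidate) => ({
      token: candidate,
      weight: token.length / candidate.length,
    }));
}

const vocabulary = ['micro', 'microwave', 'microscope', 'macro'];
console.log(expandToken('micro', vocabulary));
// micro keeps weight 1, microscope gets 0.5, and macro is not included
```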

6. Add customized stop words

Elasticlunr.js ships with default English stop words, such as:

  • a
  • about
  • an
  • all
  • also
  • and
  • any
  • but
  • the
  • ...

By default elasticlunr.js contains 120 stop words. Users can choose not to use these default stop words, or add customized stop words.

6.1 Remove default stop words

You can remove the default stop words with:

elasticlunr.clearStopWords();

6.2 Add customized stop words

Users can add a list of customized stop words.

var customized_stop_words = ['an', 'hello', 'xyzabc'];
elasticlunr.addStopWords(customized_stop_words);
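What a stop word list does to a token stream can be shown without the library. This is a conceptual sketch under stated assumptions: `makeStopWordFilter` is a hypothetical helper, and the default list is only the excerpt shown above, not all 120 words.

```javascript
// Excerpt of the default stop words listed above (not the full list).
const defaultStopWords = ['a', 'about', 'an', 'all', 'also', 'and', 'any', 'but', 'the'];

// Build a filter that drops every token found in the given stop word list.
function makeStopWordFilter(words) {
  const stopSet = new Set(words);
  return (tokens) => tokens.filter((token) => !stopSet.has(token));
}

const tokens = 'the database and the hello report'.split(' ');

const withDefaults = makeStopWordFilter(defaultStopWords);
const withCustom = makeStopWordFilter([...defaultStopWords, 'hello', 'xyzabc']);

console.log(withDefaults(tokens)); // ['database', 'hello', 'report']
console.log(withCustom(tokens));   // ['database', 'report']
```

Tokens removed this way never reach the index, so they cannot be matched by a query; that is worth remembering before adding domain terms to the list.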

7. Use elasticlunr in Node.js

Elasticlunr supports Node.js; you can use elasticlunr in Node.js as a node module.

Install elasticlunr by:

npm install elasticlunr

then in your node.js project or in node.js console:

var elasticlunr = require('elasticlunr');

var index = elasticlunr(function () {
    this.addField('title')
    this.addField('body')
});

var doc1 = {
    "id": 1,
    "title": "Oracle released its latest database Oracle 12g",
    "body": "Yesterday Oracle has released its new database Oracle 12g, this would make more money for this company and lead to a nice profit report of annual year."
};

var doc2 = {
    "id": 2,
    "title": "Oracle released its profit report of 2015",
    "body": "As expected, Oracle released its profit report of 2015, during the good sales of database and hardware, Oracle's profit of 2015 reached 12.5 Billion."
}

index.addDoc(doc1);
index.addDoc(doc2);

index.search("Oracle database profit");

Other Languages

The default supported language of elasticlunr.js is English. If you want to index documents in other languages, you need to combine elasticlunr.js with lunr-languages.

Other languages example in Browser

If you are using elasticlunr.js in the browser for other languages, download the corresponding language support from lunr-languages, then include the scripts as:

<script src="lunr.stemmer.support.js"></script>
<script src="lunr.de.js"></script>

Then use elasticlunr.js as normal:

var index = elasticlunr(function () {
    // use the language (de)
    this.use(elasticlunr.de);
    // then, the normal elasticlunr index initialization
    this.addField('title')
    this.addField('body')
});

Pay attention to the line that selects the language:

    this.use(elasticlunr.de);

If you are using another language, such as es (Spanish), download the corresponding lunr.es.js file and lunr.stemmer.support.js, and change the above line to:

    this.use(elasticlunr.es);

Other languages example in Node.js

If you are using elasticlunr.js in Node.js for other languages, download the corresponding language support from lunr-languages, put the language file (lunr.de.js in this example) and lunr.stemmer.support.js in your project, then use elasticlunr.js in your Node.js module as:

var elasticlunr = require('elasticlunr');
require('./lunr.stemmer.support.js')(elasticlunr);
require('./lunr.de.js')(elasticlunr);

var index = elasticlunr(function () {
    // use the language (de)
    this.use(elasticlunr.de);
    // then, the normal elasticlunr index initialization
    this.addField('title')
    this.addField('body')
});

For more details, please go to lunr-languages.

Contributing

See the CONTRIBUTING.mdown file.