string-similarity vs similarity
String Similarity Libraries Comparison
What Are String Similarity Libraries?

String similarity libraries are essential tools for developers who need to compare and analyze the similarity between strings. These libraries can be used in various applications, such as search engines, recommendation systems, and data deduplication. They provide algorithms to quantify how alike two strings are, which can be crucial for tasks like fuzzy searching, spell checking, and natural language processing. By leveraging these libraries, developers can enhance user experience through improved search functionalities and data handling.

[Charts: Package Weekly Downloads Trend; GitHub Stars Ranking]

Stat Detail

Package            Downloads   Stars   Size   Issues   Publish       License
string-similarity  1,694,925   2,526   -      23       4 years ago   ISC
similarity         107,751     77      -      0        5 years ago   ISC

Feature Comparison: string-similarity vs similarity

Algorithm Variety

  • string-similarity:

    The 'string-similarity' package implements a single measure, Dice's coefficient, as stated in its README. It does not ship Levenshtein or Jaro-Winkler implementations. Its two functions, compareTwoStrings and findBestMatch, cover pairwise scoring and picking the closest candidate from a list.

  • similarity:

    The 'similarity' package likewise exposes one comparison function that returns a score between 0 and 1. It appears to derive that score from Levenshtein edit distance rather than Jaccard or cosine similarity; confirm against its README before depending on a specific measure.
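
To make the difference between an edit-distance score and a bigram-overlap score concrete, here is a minimal sketch of a normalized Levenshtein similarity. It is an illustration only and is not taken from either package; the function names levenshteinDistance and levenshteinSimilarity are made up for this example.

```javascript
// Illustrative only: normalized Levenshtein similarity, not either package's code.
function levenshteinDistance(a, b) {
  // prev[j] is the edit distance between the processed prefix of `a`
  // and the first j characters of `b`.
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // deletion
        curr[j - 1] + 1,    // insertion
        prev[j - 1] + cost  // substitution (or match)
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

function levenshteinSimilarity(a, b) {
  const longest = Math.max(a.length, b.length);
  if (longest === 0) return 1; // two empty strings are identical
  return 1 - levenshteinDistance(a, b) / longest;
}

console.log(levenshteinSimilarity("food", "fud")); // 0.5
```

An edit-distance score penalizes every inserted, deleted, or substituted character, while a bigram-overlap score such as Dice's coefficient only counts shared adjacent pairs, so the two can rank the same candidates differently.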

Performance

  • string-similarity:

    Since version 3.0.0, compareTwoStrings runs in O(n) rather than O(n^2) time, so scoring a single pair is cheap. findBestMatch compares the main string against every target, so its cost grows linearly with the number of candidates.

  • similarity:

    'similarity' is a small, lightweight package, which keeps overhead low. No benchmark figures are given here, so if throughput matters, measure both packages on representative data.
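
Performance claims like these are easiest to check by timing both libraries on your own data. The following is a rough sketch of such a harness, assuming a Node.js runtime (it uses process.hrtime.bigint). The name timeComparisons and its parameters are hypothetical, and the stand-in comparator at the bottom would be replaced by the function under test.

```javascript
// Hypothetical harness: times `compare` over every pair in `pairs`.
// Assumes Node.js, because it relies on process.hrtime.bigint().
function timeComparisons(compare, pairs, rounds = 1000) {
  let checksum = 0; // accumulate results so the calls cannot be optimized away
  const start = process.hrtime.bigint();
  for (let r = 0; r < rounds; r++) {
    for (const [a, b] of pairs) {
      checksum += compare(a, b);
    }
  }
  const elapsedNs = process.hrtime.bigint() - start;
  return { milliseconds: Number(elapsedNs) / 1e6, checksum };
}

// Stand-in comparator so the sketch runs without installing anything.
const exactMatch = (a, b) => (a === b ? 1 : 0);

const timing = timeComparisons(exactMatch, [["healed", "sealed"], ["abc", "abc"]]);
console.log(timing.milliseconds >= 0); // true
```

Run the harness several times and with realistic string lengths; a single short run mostly measures warm-up.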

Ease of Use

  • string-similarity:

    'string-similarity' has a small API of two functions, compareTwoStrings(string1, string2) and findBestMatch(mainString, targetStrings), both documented with examples in its README. There are no options to configure; the comparison is always case-sensitive.

  • similarity:

    The 'similarity' package is straightforward to use, with a simple API that allows developers to quickly implement string comparison functionalities without extensive setup or configuration.

Dependencies

  • string-similarity:

    'string-similarity' has had no production dependencies since version 2.0.0, so it adds nothing to the dependency tree beyond itself.

  • similarity:

    This package has minimal dependencies, making it easy to integrate into projects without worrying about additional overhead or conflicts with other libraries.

Community and Support

  • string-similarity:

    With a larger user base, 'string-similarity' benefits from more community support, including tutorials, examples, and discussions that can assist developers in implementing the library effectively.

  • similarity:

    The 'similarity' package has a smaller community, which may result in less available support and fewer resources for troubleshooting or advanced use cases.

How to Choose: string-similarity vs similarity
  • string-similarity:

    Choose 'string-similarity' if a Dice's coefficient score suits your data and you want a built-in way to pick the best match from a list of candidates (findBestMatch). It has no production dependencies and ships a UMD build for direct use in browsers.

  • similarity:

    Choose 'similarity' if you only need a single, lightweight function that scores two strings. Verify the algorithm it uses against its README first; Jaccard or cosine similarity should not be assumed.

README for string-similarity

string-similarity

Finds degree of similarity between two strings, based on Dice's Coefficient, which is mostly better than Levenshtein distance.
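
For readers unfamiliar with Dice's coefficient, the sketch below shows one common way to compute it over character bigrams: twice the number of shared bigrams divided by the total number of bigrams in both strings. This is not the package's source code. The function name diceCoefficient is made up for illustration, and the whitespace stripping is an assumption based on the 3.0.0 release note.

```javascript
// Sketch of Dice's coefficient over character bigrams.
// Not the package source; the name and details are illustrative.
function diceCoefficient(first, second) {
  // Ignore whitespace (an assumption drawn from the 3.0.0 release note).
  const a = first.replace(/\s+/g, "");
  const b = second.replace(/\s+/g, "");

  if (a === b) return 1;                      // identical, including both empty
  if (a.length < 2 || b.length < 2) return 0; // no bigrams to compare

  // Count each bigram of the first string.
  const counts = new Map();
  for (let i = 0; i < a.length - 1; i++) {
    const bigram = a.substring(i, i + 2);
    counts.set(bigram, (counts.get(bigram) || 0) + 1);
  }

  // Consume matching bigrams while scanning the second string.
  let shared = 0;
  for (let i = 0; i < b.length - 1; i++) {
    const bigram = b.substring(i, i + 2);
    const remaining = counts.get(bigram) || 0;
    if (remaining > 0) {
      counts.set(bigram, remaining - 1);
      shared++;
    }
  }

  // 2 * shared / (bigrams in a + bigrams in b)
  return (2 * shared) / (a.length - 1 + (b.length - 1));
}

console.log(diceCoefficient("healed", "sealed")); // 0.8
```

"healed" and "sealed" each have five bigrams and share four of them (ea, al, le, ed), giving 2 * 4 / 10 = 0.8. Each string is scanned once, so the work is linear in the combined length.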

Usage

For Node.js

Install using:

npm install string-similarity --save

In your code:

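var stringSimilarity = require("string-similarity");

// Score a single pair of strings (a number from 0 to 1).
var similarity = stringSimilarity.compareTwoStrings("healed", "sealed");

// Rate every candidate against "healed" and report the best one.
var matches = stringSimilarity.findBestMatch("healed", [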
  "edward",
  "sealed",
  "theatre",
]);

For browser apps

Include <script src="//unpkg.com/string-similarity/umd/string-similarity.min.js"></script> to get the latest version.

Or <script src="//unpkg.com/string-similarity@4.0.1/umd/string-similarity.min.js"></script> to get a specific version (4.0.1 in this case).

This exposes a global variable called stringSimilarity which you can start using.

<script>
  stringSimilarity.compareTwoStrings('what!', 'who?');
</script>

(The package is exposed as UMD, so you can consume it as such)

API

The package contains two methods:

compareTwoStrings(string1, string2)

Returns a fraction between 0 and 1, which indicates the degree of similarity between the two strings. 0 indicates completely different strings, 1 indicates identical strings. The comparison is case-sensitive.

Arguments
  1. string1 (string): The first string
  2. string2 (string): The second string

Order does not make a difference.

Returns

(number): A fraction from 0 to 1, both inclusive. Higher number indicates more similarity.

Examples
stringSimilarity.compareTwoStrings("healed", "sealed");
// → 0.8

stringSimilarity.compareTwoStrings(
  "Olive-green table for sale, in extremely good condition.",
  "For sale: table in very good condition, olive green in colour."
);
// → 0.6060606060606061

stringSimilarity.compareTwoStrings(
  "Olive-green table for sale, in extremely good condition.",
  "For sale: green Subaru Impreza, 210,000 miles"
);
// → 0.2558139534883721

stringSimilarity.compareTwoStrings(
  "Olive-green table for sale, in extremely good condition.",
  "Wanted: mountain bike with at least 21 gears."
);
// → 0.1411764705882353
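
Because the comparison is case-sensitive, callers who want case-insensitive matching need to normalize the inputs themselves. One possible approach is sketched below. The helper ignoreCase is hypothetical and not part of the package, and exactMatch is only a stand-in so the example runs without installing anything; in practice you would pass stringSimilarity.compareTwoStrings instead.

```javascript
// Hypothetical helper: wraps any (a, b) => number comparator so that
// letter case is ignored. `compare` stands in for a function such as
// stringSimilarity.compareTwoStrings.
function ignoreCase(compare) {
  return (a, b) => compare(a.toLowerCase(), b.toLowerCase());
}

// Stand-in comparator so the sketch is self-contained.
const exactMatch = (a, b) => (a === b ? 1 : 0);

const compareIgnoringCase = ignoreCase(exactMatch);
console.log(exactMatch("Healed", "healed"));          // 0
console.log(compareIgnoringCase("Healed", "healed")); // 1
```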

findBestMatch(mainString, targetStrings)

Compares mainString against each string in targetStrings.

Arguments
  1. mainString (string): The string to match each target string against.
  2. targetStrings (Array): Each string in this array will be matched against the main string.

Returns

(Object): An object with three properties:
  • ratings: a similarity rating for each target string.
  • bestMatch: the target string that was most similar to the main string.
  • bestMatchIndex: the index of bestMatch in the targetStrings array.

Examples
stringSimilarity.findBestMatch('Olive-green table for sale, in extremely good condition.', [
  'For sale: green Subaru Impreza, 210,000 miles',
  'For sale: table in very good condition, olive green in colour.',
  'Wanted: mountain bike with at least 21 gears.'
]);
// →
{ ratings:
   [ { target: 'For sale: green Subaru Impreza, 210,000 miles',
       rating: 0.2558139534883721 },
     { target: 'For sale: table in very good condition, olive green in colour.',
       rating: 0.6060606060606061 },
     { target: 'Wanted: mountain bike with at least 21 gears.',
       rating: 0.1411764705882353 } ],
  bestMatch:
   { target: 'For sale: table in very good condition, olive green in colour.',
     rating: 0.6060606060606061 },
  bestMatchIndex: 1
}
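
The shape of that result can be reproduced with a short loop over any pairwise comparator. The sketch below is an assumption about how such a function could be written, not the package's implementation. The names findBestMatchSketch and samePositionRatio are hypothetical, and the stand-in comparator is deliberately simplistic.

```javascript
// Sketch of a best-match search built on a pairwise comparator.
// `compare` is injected; with the package it could be compareTwoStrings.
function findBestMatchSketch(mainString, targetStrings, compare) {
  if (!Array.isArray(targetStrings) || targetStrings.length === 0) {
    throw new TypeError("targetStrings must be a non-empty array");
  }
  // Rate every target against the main string.
  const ratings = targetStrings.map((target) => ({
    target,
    rating: compare(mainString, target),
  }));
  // Track the index of the highest rating (first one wins on ties).
  let bestMatchIndex = 0;
  for (let i = 1; i < ratings.length; i++) {
    if (ratings[i].rating > ratings[bestMatchIndex].rating) bestMatchIndex = i;
  }
  return { ratings, bestMatch: ratings[bestMatchIndex], bestMatchIndex };
}

// Stand-in comparator: fraction of positions holding the same character.
const samePositionRatio = (a, b) => {
  const longest = Math.max(a.length, b.length);
  if (longest === 0) return 1;
  let same = 0;
  for (let i = 0; i < Math.min(a.length, b.length); i++) {
    if (a[i] === b[i]) same++;
  }
  return same / longest;
};

const result = findBestMatchSketch(
  "healed",
  ["edward", "sealed", "theatre"],
  samePositionRatio
);
console.log(result.bestMatch.target, result.bestMatchIndex); // sealed 1
```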

Release Notes

2.0.0

  • Removed production dependencies
  • Updated to ES6 (this breaks backward-compatibility for pre-ES6 apps)

3.0.0

  • Performance improvement for compareTwoStrings(..): now O(n) instead of O(n^2)
  • The algorithm has been tweaked slightly to disregard spaces and word boundaries. This will change the rating values slightly but not enough to make a significant difference
  • Adding a bestMatchIndex to the results for findBestMatch(..) to point to the best match in the supplied targetStrings array

3.0.1

  • Refactoring: removed unused functions; used substring instead of substr
  • Updated dependencies

4.0.1

  • Distributing as a UMD build to be used in browsers.

4.0.2

  • Update dependencies to latest versions.

4.0.3

  • Make compatible with IE and ES5. Also, update deps. (see PR56)

4.0.4

  • Simplify some conditional statements. Also, update deps. (see PR50)
