MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble and HNSW
Updated Jun 4, 2024 - Python
Quickly search, compare, and analyze genomic and metagenomic data sets.
Locality Sensitive Hashing using MinHash in Python/Cython to detect near duplicate text documents
JS implementation of probabilistic data structures: Bloom Filter (and its derivatives), HyperLogLog, Count-Min Sketch, Top-K and MinHash
Weighted MinHash implementation on CUDA (multi-gpu).
Sketching Algorithms for Clojure (bloom filter, min-hash, hyper-loglog, count-min sketch)
C++ Implementations of sketch data structures with SIMD Parallelism, including Python bindings
Elasticsearch plugin for the b-bit MinHash algorithm
Locality Sensitive Hashing In R
Union, intersection, and set cardinality in loglog space
Quickly estimate the similarity between many sets
Easy-to-use Java similarity algorithms for text and numeric-series
Dynatrace hash library for Java
Detect and visualize text reuse
A method to mine beyond-pairwise relationships using Min-Hashing for large-scale pattern discovery
Python 2.7 code and learning notes for Spark 2.1.1
Locality Sensitive Hashing