Find Probability of Match Based on Similarity

Usage

euclidean_probability(distance, n_bands, band_width, r)

Arguments

distance

The Euclidean distance between the two vectors you want to compare.

n_bands

The number of LSH bands used in hashing.

band_width

The number of hashes in each band.

r

The "r" hyperparameter used to govern the sensitivity of the hash.

Value

A decimal number giving the probability that the two items will be returned as a candidate pair by the LSH algorithm.
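
The page does not state the formula behind this value, so the following is only a sketch of how such a probability is commonly derived. It assumes a Gaussian (2-stable) projection hash with bucket width `r`, combined with the usual banding scheme. The helper names `single_hash_collision` and `candidate_pair_probability` are hypothetical and not part of the package, and the package's own computation may differ.

```r
# Hypothetical helper (not from the package): probability that ONE hash
# collides for two points at the given Euclidean distance, assuming a
# Gaussian random projection quantised into buckets of width r.
# Written for scalar inputs only.
single_hash_collision <- function(distance, r) {
  if (distance == 0) {
    return(1) # identical points always share a bucket
  }
  ratio <- r / distance
  1 - 2 * pnorm(-ratio) -
    (2 / (sqrt(2 * pi) * ratio)) * (1 - exp(-ratio^2 / 2))
}

# Hypothetical helper: a pair becomes a candidate if ALL hashes in at least
# one band agree. With per-hash collision probability p, that is
# 1 - (1 - p^band_width)^n_bands.
candidate_pair_probability <- function(distance, n_bands, band_width, r) {
  p <- single_hash_collision(distance, r)
  1 - (1 - p^band_width)^n_bands
}

# Compare a close pair and a distant pair under the same settings.
near <- candidate_pair_probability(distance = 0.1, n_bands = 10, band_width = 4, r = 1)
far <- candidate_pair_probability(distance = 5, n_bands = 10, band_width = 4, r = 1)
print(c(near = near, far = far))
```

Under these assumptions the probability falls as `distance` grows and rises as `n_bands` grows, while a larger `band_width` makes the match criterion stricter. Evaluating the function over a range of distances is one way to see how a given combination of `n_bands`, `band_width`, and `r` trades recall against the number of candidate pairs.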