Member-only story

Milvus — Understanding Vector Similarity Metrics

Tony
6 min readJan 12, 2024

Similarity Metrics

A vector space is a collection of objects called vectors, which can be added together and multiplied by scalars (i.e., numbers). In machine learning and data science, we often work with data in high-dimensional spaces, where each dimension corresponds to a feature of the data. For example, in a dataset of houses, each house can be represented as a vector, with each dimension corresponding to features like size, location, number of rooms, etc.

One of the key operations when working with vectors is measuring how similar they are to each other. We achieve this by using a similarity metric or similarity score. The choice of similarity metric can significantly impact the performance of machine learning algorithms or any operation that needs to compare vectors.

There are several types of similarity metrics, each with its properties and use cases:

  • Euclidean distance: This is the “ordinary” straight-line distance between two vectors in Euclidean space. The smaller the Euclidean distance, the more similar two vectors are.
  • Cosine similarity: This measures the cosine of the…

--

--

Tony
Tony

No responses yet