Milvus — Search and Query Introduction

Tony
4 min read · Feb 18, 2024

Milvus offers powerful tools for efficiently searching and querying large collections of data, particularly collections that combine vector and scalar fields. This article delves into searching and querying data within Milvus collections, focusing on vector search, scalar field queries, and hybrid search.

Vector Search

Milvus supports vector search, a capability vital for applications dealing with image and voice recognition, recommendation systems, and similar tasks. Before diving into the search process, it’s important to note that searches are typically performed on indexed collections. Thus, indexing is a prerequisite for an effective search.
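To make the idea concrete, here is a minimal brute-force sketch of what a nearest-neighbour search under the L2 metric computes conceptually. It is plain Python, not Milvus code: the helper names and the sample vectors are hypothetical, and a real Milvus search relies on an index rather than scanning every vector.

```python
import math


def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def brute_force_search(query, vectors, limit=2):
    """Return the `limit` closest (id, distance) pairs, nearest first."""
    scored = [(vec_id, l2_distance(query, vec)) for vec_id, vec in vectors.items()]
    return sorted(scored, key=lambda pair: pair[1])[:limit]


# Hypothetical 4-dimensional song vectors keyed by song_id
songs = {
    1: [0.1, 0.2, 0.3, 0.4],
    2: [0.9, 0.8, 0.7, 0.6],
    3: [0.1, 0.2, 0.3, 0.5],
}

# Song 1 matches the query exactly, so it ranks first; song 3 is the next closest
nearest = brute_force_search([0.1, 0.2, 0.3, 0.4], songs, limit=2)
print(nearest)
```

Scanning every vector like this grows linearly with the collection size, which is why an index matters once the collection gets large.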

In my previous article, I showed how to create collections and different types of indexes. Let’s quickly recap the code:

import pprint

from pymilvus import (
    Collection,
    CollectionSchema,
    DataType,
    FieldSchema,
    connections,
    utility,
)

# Define connection (milvus_uri and token must be set beforehand)
connections.connect("default", uri=milvus_uri, token=token)

# Define field schemas
song_name = FieldSchema(
    name="song_name",
    dtype=DataType.VARCHAR,
    description="name of the song",
    max_length=200,
)
song_id = FieldSchema(
    name="song_id", dtype=DataType.INT64, description="id of the song", is_primary=True
)
play_count = FieldSchema(
    name="play_count", dtype=DataType.INT64, description="play count of the song"
)
song_vector = FieldSchema(
    name="song_vector",
    dtype=DataType.FLOAT_VECTOR,
    dim=4,
    description="vector of the song",
)

# Define collection schema
collection_schema = CollectionSchema(
    fields=[song_name, song_id, play_count, song_vector],
    description="collection schema of songs",
)

# Create collection and list the collections on the server
collection = Collection(name="Songs", schema=collection_schema, using="default")
pprint.pprint(utility.list_collections())

To create a vector index, you can run the following:

# Define and create the vector index
# Note: ANNOY is not available in every Milvus release; check which
# index types your version supports.
index_params = {
    "metric_type": "L2",
    "index_type": "ANNOY",
    "params": {"n_trees": 64},
}

collection.create_index(
    field_name="song_vector",
    index_params=index_params,
    index_name="annoy_index",
)

To create a scalar index:

collection.create_index(
    field_name="song_name",
    index_name="song_name_index",
)
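Since searching depends on the index being in place, it can help to check for it before loading the collection. The sketch below is one way to do that, assuming the collection object exposes pymilvus's `indexes` property and `load()` method. The helper name `prepare_for_search` is hypothetical and not part of pymilvus.

```python
def prepare_for_search(collection, vector_field="song_vector"):
    """Load the collection only if `vector_field` already has an index.

    `collection` is expected to behave like a pymilvus Collection:
    `collection.indexes` yields objects with a `field_name` attribute,
    and `collection.load()` loads the collection into memory.
    """
    indexed_fields = {index.field_name for index in collection.indexes}
    if vector_field not in indexed_fields:
        raise RuntimeError(
            f"no index found on '{vector_field}'; create one before searching"
        )
    collection.load()
    return indexed_fields
```

With the `collection` object from the recap, calling `prepare_for_search(collection)` would return the set of indexed field names, or raise an error if the vector index has not been created yet.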
