Partial Indexes in MongoDB: A Brief Overview
Cover image credits: Digital Ocean
Introduction
If you have used indexes in MongoDB, you already know how important they are and how large a performance gain they provide. They are part and parcel of almost every project that uses MongoDB, and they play a vital role in speeding up query execution.
The Problem
However, indexes don't come without cost. Indexes are generally held in memory, but that is not always possible. If your project has many indexes, or a few large ones, the available memory may not be enough to hold all of them.
In such cases, part of an index stays in memory while the rest has to be read from disk. Disk accesses can be far slower than memory accesses, so this can slow down your query execution.
A Solution
Partial indexes can be used to mitigate this problem to some extent.
A partial index is a space-efficient way to index only those documents that are queried most frequently. For example, if customers of an online book store mostly search for books with a rating of 3 stars or above, then indexing only the documents with a rating of 3 or above is enough to serve most searches with low latency. This reduces the size of the index.
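To get a feel for the saving, the sketch below counts how many entries a full index and a partial index would hold for a tiny in-memory sample. It is plain JavaScript rather than MongoDB code, and both the sample documents and the countIndexEntries helper are made up for illustration. Real index sizes also depend on key sizes and the storage engine.

```javascript
// Hypothetical sample of a `books` collection (illustrative data only).
const books = [
  { title: "A", genre: "Astronomy", rating: 4.5 },
  { title: "B", genre: "Astronomy", rating: 2.0 },
  { title: "C", genre: "History", rating: 3.0 },
  { title: "D", genre: "History", rating: 1.5 },
  { title: "E", genre: "Poetry", rating: 3.8 },
];

// Count the documents that would get an entry in an index.
// `filter` plays the role of partialFilterExpression; omit it for a full index.
function countIndexEntries(docs, filter = () => true) {
  return docs.filter(filter).length;
}

const fullEntries = countIndexEntries(books);
const partialEntries = countIndexEntries(books, (doc) => doc.rating >= 3);

console.log(`full index entries: ${fullEntries}`);
console.log(`partial index entries: ${partialEntries}`);
```

In this sample only the documents rated 3 or above would be indexed, so the partial index holds fewer entries than the full one.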
Usage
Here's how to create one, assuming that the collection is named books and that genre and rating are fields in its documents:
db.books.createIndex(
{ genre: 1 },
{ partialFilterExpression: { rating: { $gte: 3 } } }
)
The partialFilterExpression option can be used with any kind of index, such as a single-field index, a multikey index or even a compound index.
And here's how to make use of the index:
db.books.find({ genre: "Astronomy", rating: { $gt: 3.5 } })
Caveats
The query must contain a filter that matches the partial filter expression for any partial index to be used. In the example above, the partial index can be used because the query's filter on rating (greater than 3.5) is guaranteed to match only a subset of the documents indexed by the createIndex() call above (rating of 3 or more).
If the query could match even a few documents outside the indexed set, the partial index will not be used at all, not even for the documents it does contain. So your query must guarantee that every document it can match is covered by the partial index.
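One way to check whether a query actually uses the partial index is to inspect its plan with explain(). The sketch below is a minimal helper, assuming an unsharded deployment and a winning plan that nests its stages under inputStage, inputStages or queryPlan (the exact shape varies between MongoDB versions). usesIndexScan is a hypothetical name, not part of MongoDB, and the mongosh part only runs when a db object is available.

```javascript
// Returns true if any stage in an explain() winning plan is an index scan.
// Assumption: stages are nested under inputStage, inputStages or queryPlan.
function usesIndexScan(plan) {
  if (!plan || typeof plan !== "object") return false;
  if (plan.stage === "IXSCAN") return true;
  const children = [plan.inputStage, plan.queryPlan, ...(plan.inputStages || [])];
  return children.some((child) => usesIndexScan(child));
}

// In mongosh, `db` is defined, so the queries below run against `books`.
if (typeof db !== "undefined") {
  const queries = [
    // rating > 3.5 is a subset of rating >= 3: the partial index is eligible.
    { genre: "Astronomy", rating: { $gt: 3.5 } },
    // rating > 2 may match documents that were never indexed: not eligible.
    { genre: "Astronomy", rating: { $gt: 2 } },
    // No condition on rating at all: not eligible either.
    { genre: "Astronomy" },
  ];
  for (const query of queries) {
    const winningPlan = db.books.find(query).explain().queryPlanner.winningPlan;
    console.log(JSON.stringify(query), "index scan:", usesIndexScan(winningPlan));
  }
}
```

If the partial index on genre is the only index on the collection besides the default one on _id, the first query should report an index scan, while the other two should fall back to a collection scan.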
Conclusion
In essence, partial indexes are a good fit when a subset of your documents is queried frequently and you are not very concerned about P90 or P99 latencies, since the less common queries that fall outside the index will be slower. In return, you reduce the memory required to store the index.
Note
The intention of this post is only to make developers aware of partial indexes.
For more details, features and use cases of partial indexes, see the official MongoDB documentation on partial indexes.