dev-resources.site

When NOT to use Atlas Search

Published at
12/6/2024
Categories
mongodb
lucene
atlas
search
Author
erikhatcher

Design reviews are one-on-one meetings where MongoDB experts deliver advice on data modeling best practices and application design challenges. In this series, we are going to explore common real-life scenarios where design reviews helped developers achieve meaningful success with MongoDB. - How to Align Your Data Model With Your Application Needs When Migrating From RDBMS to MongoDB | by Néstor Daza

Note: I’m taking the opportunity to link liberally, sometimes loony-ily. I love the serendipity of following interesting links. I had fun researching and reminding myself of oldies but goodies. Here’s to at least some of the shiny paths followed being entertaining and educational to you too.

We’ve got to start with a couple of assumptions for this article to best fit:

  • You’ve got documents in MongoDB Atlas.
  • The documents need to be findable.

If your documents aren’t in Atlas, then Atlas Search doesn’t (yet) apply, and thus the rest of this write-up is moot.

If anything, I’m pragmatic and agile. Duct tape, twine, or a card catalog—use what works for the job. Whether you use MongoDB or not, the document model is a good way to think about data challenges and worth having handy when the time is right. Do consider Atlas for your future data needs, as it’s a platform that provides a lot of necessary and powerful capabilities. Just sayin’.

Findability is one such necessary database capability. If you can’t find your content, it may as well not exist.

Atlas Search enables powerful, scalable, and relevant search features. Its strength primarily stems from one little, old Java library. There’s a potent elixir in that .jar! And it has been The Solution to All The Challenges for the bulk of my career. One rather fun aspect of my life at MongoDB is tackling Design Reviews that involve some aspect of search. I excel at, and enjoy, solving concrete search problems. These reviews typically are with folks using Atlas Search and wanting to dig in deeper to get a bit more nuance to relevancy tuning, or folks using MongoDB $match and $regex and exploring if and how to leverage Atlas Search instead. Here’s a story about a recent Design Review with a customer already well versed in Atlas Search and using it effectively… to a point.

Atlas Search matching as expected

Here’s the use case presented to me by a customer during a design review session:

We have a service built on MongoDB Atlas that needs to rapidly match identity requests using only a few fields of exact (though case-insensitive) values, such as an ID, e-mail address, and phone number.

Case-insensitive matching over a few fields? Definitely a problem that Atlas Search can solve handily! Take a few compound.should clauses and call us in the morning.
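As a rough sketch of what those clauses could look like, the pipeline below is built as plain Python data. The field names (customerId, email, phone), the helper name, and the assumption that each field is mapped as a token type so that equals can match whole string values are illustrative, not details from the customer's system. The helper lowercases the query values itself rather than relying on index-side normalization.

```python
def build_identity_search(customer_id, email, phone, index_name="default"):
    """Build an aggregation pipeline that matches any of a few identity fields.

    Field names are hypothetical; values are lowercased so matching is
    case-insensitive as long as the indexed values are lowercased too.
    """
    candidates = {"customerId": customer_id, "email": email, "phone": phone}
    should = [
        {"equals": {"path": path, "value": value.lower()}}
        for path, value in candidates.items()
        if value is not None  # skip fields the request did not supply
    ]
    return [
        {
            "$search": {
                "index": index_name,
                # At least one of the optional clauses has to match.
                "compound": {"should": should, "minimumShouldMatch": 1},
            }
        },
        {"$limit": 5},
    ]


pipeline = build_identity_search(None, "Pat@Example.com", "+1-555-0100")
```

The resulting list could be handed to an aggregation call on the collection; documents matching more of the should clauses would score higher.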

And unsurprisingly, the customer reports that:

Atlas Search matching works as expected...

But not so fast

Literally, and unfortunately,

… However, the time to “eventual consistency” in order to match recently updated documents is too long for the required SLA.

And, to work around that,

A [third-party key-value] caching mechanism was implemented for a first pass lookup.

Both of these topics warrant a bit of a deeper dive, so that we can understand how best to help this customer.

  • Eventual consistency
  • Key value lookup using indexes

Eventual consistency

Yes, Atlas Search is awesome! It can slice, dice, and do all sorts of groovy things, yet it obediently stays within the laws of physics. Data, being what it is, is always being added to, updated in, and deleted from the database and its replica set. The Atlas Search process (mongot) follows the database change stream and updates the underlying Lucene index. By default, this process runs co-located with the database processes on the same hardware, though ideally it should run on its own hardware nearby.

Figure: Coupled architecture

Figure: Dedicated Search Nodes

Atlas Search is eventually consistent. These machinations involve shards, replicas, CPU, disk, memory, network, and a bit of time. Changes to the database will, eventually, be reflected in associated search indexes. But it isn’t instantaneous, and there are many variables that affect the lag between a database change and search requests finding documents by the modified criteria: rate of data changes, complexity of index mapping configuration, deployment architecture/capabilities, resource contentions, size of the index, query load, and maybe even solar flares.
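To make the lag concrete, here is a toy model in Python. It is not how mongot is implemented; the class, its method names, and the single email field are all made up for illustration. Writes land in the "database" immediately, while the "search index" only changes when queued change events are processed.

```python
from collections import deque


class EventuallyConsistentStore:
    """Toy model: writes hit the database at once; the search index catches up later."""

    def __init__(self):
        self.database = {}      # stands in for the collection: _id -> email
        self.search_index = {}  # stands in for the search index: email -> _id
        self.pending = deque()  # stands in for change events not yet processed

    def update_email(self, doc_id, email):
        old = self.database.get(doc_id)
        self.database[doc_id] = email  # visible to database reads right away
        self.pending.append((doc_id, old, email))

    def search_by_email(self, email):
        return self.search_index.get(email)

    def catch_up(self):
        """Apply every queued change event to the search index."""
        while self.pending:
            doc_id, old, new = self.pending.popleft()
            if old is not None and self.search_index.get(old) == doc_id:
                del self.search_index[old]
            self.search_index[new] = doc_id


store = EventuallyConsistentStore()
store.update_email(1, "old@example.com")
store.catch_up()
store.update_email(1, "new@example.com")
stale = store.search_by_email("new@example.com")  # index has not caught up yet
store.catch_up()
fresh = store.search_by_email("new@example.com")  # found once events are applied
```

The window between update_email and catch_up is the lag described above; in a real deployment its length depends on the variables just listed.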

Depending on the nature of the application, the eventually consistent lag time may be irrelevant or a critical consideration. An update to a book record in a library can get reindexed overnight without affecting operations. For this customer, however, it is unacceptable for an identity request to miss a record that was just updated because the search index has not yet caught up with the latest value in the database.

The trade-off is deliberate: a search index is eventually consistent so that it does not delay, or interfere with, database-level updates and transactions. A search index update has many variables involved and can change in complexity over time; a change to the index configuration could cause vastly more terms or documents to be indexed. An Atlas Search index is an index configuration and its corresponding Lucene index. “Index” is a great word, but a Lucene index is really a collection of special-purpose data structures, one for each field (and multi) defined. Each field “type” has its own optimized index data structure. Lexicographically ordered inverted indexes with posting lists, complete with term, document, and corpus-level statistics, power string mapped fields and queries. This is the heart of relevancy computations.
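For readers who have not met an inverted index before, the following is a deliberately tiny Python sketch of the idea, not Lucene's actual on-disk format. The whitespace tokenizer, the sample documents, and the function name are assumptions for illustration.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to a sorted posting list of the document ids containing it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():  # naive analysis: lowercase + whitespace
            postings[term].add(doc_id)
    # Sorting the terms mimics the lexicographic ordering of a term dictionary.
    return {term: sorted(ids) for term, ids in sorted(postings.items())}


docs = {1: "Atlas Search", 2: "search the atlas", 3: "B-Tree index"}
index = build_inverted_index(docs)

# Document frequency per term: one of the statistics relevancy scoring draws on.
doc_freq = {term: len(ids) for term, ids in index.items()}
```

Looking up a term is a dictionary hit followed by a walk of its posting list, which is why exact-term lookups are fast once the index is up to date.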

Key value lookup using indexes

Finding by _id (every document's unique key) is a given (so we throw that one in for free!). What about finding your data by other exact match types of criteria, such as all products in a specific category? Or all documents modified by a particular username? No doubt this type of findability is crucial too. MongoDB is really good at looking up documents by a value, provided the value is indexed.

This particular application needs exact, case-insensitive value lookup over a few fields. Let’s push the case-insensitivity issue to the application, and simply have it lowercase the field value any time it is being written or queried, so now it’s fully an exact match situation on the MongoDB side of things. Following indexing best practices such as the ESR rule, a few single-field B-Tree index definitions are all the customer needs to satisfy their performance SLA. These indexes don’t come for free, either, but are managed in the database process quickly and handled synchronously with every document update, so consistency is guaranteed.
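A minimal sketch of that application-side normalization, assuming hypothetical field names (customerId, email, phone) and plain string values. The index specifications are only built as data here; creating them on a real collection is left as a comment.

```python
IDENTITY_FIELDS = ("customerId", "email", "phone")  # hypothetical field names


def normalize_identity(doc):
    """Lowercase the identity fields before a write so stored values are canonical."""
    return {
        key: value.lower() if key in IDENTITY_FIELDS and isinstance(value, str) else value
        for key, value in doc.items()
    }


def build_lookup_filter(field, value):
    """Build an exact-match query filter using the same normalization as writes."""
    if field not in IDENTITY_FIELDS:
        raise ValueError(f"unsupported lookup field: {field}")
    return {field: value.lower()}


# One ascending single-field index per lookup field, in the (key, direction)
# form that PyMongo's Collection.create_index accepts, for example:
#   for spec in INDEX_SPECS:
#       collection.create_index(spec)
INDEX_SPECS = [[(field, 1)] for field in IDENTITY_FIELDS]

stored = normalize_identity({"email": "Pat@Example.COM", "name": "Pat"})
query = build_lookup_filter("email", "PAT@example.com")
```

Because writes and reads go through the same lowercasing, the database only ever sees exact-match comparisons, which ordinary indexes serve directly.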

And to be sure, key/value lookup in Lucene (via Atlas Search) is very fast. It’s the eventual consistency lag that drives the design recommendation here. If the use case had been querying across dozens of fields in any combination for exact values and eventual consistency was an acceptable trade-off, Atlas Search would be the better approach here. With a lot of fields to intersect, B-Tree index configuration would be arduous and resource-intensive, whereas an Atlas Search index configured for multiple-field intersection would be quite efficient and performant.
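For contrast, here is a sketch of the many-field case where a search index would fit. It assumes a single Atlas Search index in which every queried field is mapped so that equals applies (for example, strings as token type); the field names and helper name are invented for illustration.

```python
def build_intersection_search(criteria, index_name="default"):
    """Build a $search stage requiring every supplied field to match exactly.

    Any combination of fields can be passed; each becomes one filter clause,
    and filter clauses must all match without contributing to the score.
    """
    clauses = [
        {"equals": {"path": path, "value": value}}
        for path, value in sorted(criteria.items())
    ]
    return {"$search": {"index": index_name, "compound": {"filter": clauses}}}


stage = build_intersection_search(
    {"region": "emea", "status": "active", "verified": True}
)
```

The same helper serves two fields or twenty, whereas covering arbitrary combinations with B-Tree indexes would mean planning and maintaining many index definitions.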

Design recommendation: B-Tree pragmatism

For this case of a few fields of exact value matching, with no full-text fuzzy search needed, the clear winner is leveraging the in-process, consistent, and quick B-Tree index capabilities.

When queries are exact field matches and the eventual consistency time lag is a critical blocker, consider using classic MongoDB B-Tree indexes rather than Atlas Search. Atlas Search indexes are updated in a separate process, maybe even on separate hardware via a network hop, whereas B-Tree index updates happen within the scope of database update transactions and are immediately usable after an update completes. Note that _id is implicitly indexed in this fashion and can be used for domain values if appropriate. With B-Tree-based lookups, a front-end cache is not needed as this is already a fast key/value lookup from a RAM-based index.

Be sure to learn about data modeling and schema design for Atlas Search so that you’re ready for the problems for which it shines!
