Find missing indexes — Spanna Docs

If a MongoDB query feels slow, the problem usually is not “MongoDB is bad at queries.” The problem is that MongoDB is being forced to do the expensive version of the work.

Think of a missing index like asking someone to find one invoice in a warehouse with no shelf labels. They can do it. They just have to walk every aisle, open every box, and waste a lot of time getting there. In MongoDB terms, that usually means a COLLSCAN, a large number of examined documents, or a sort that spills extra work into memory.

This guide is about finding those weak spots quickly and building the smallest index that actually matches the workload.

What “missing index” usually looks like

A missing or misaligned index normally shows up as one of these symptoms:

a query that feels slow even though it returns very few rows
a filter that examines far more documents than it returns
a sort that becomes expensive as the collection grows
a query that is fast in development but slows down badly in production
a workload that is read-heavy but still scans large chunks of the collection

The important distinction is this: sometimes there is no index at all, and sometimes there is an index but it is the wrong one. Both problems look similar from the outside.

Start with the real query, not your guess

Before you build any index, identify the exact query pattern that is slow:

the filter
the sort
the projection
the actual limit or pagination pattern

This matters because indexes are built for query shapes, not for vague intentions.

For example, these are not the same problem:

db.orders.find({ status: "open" })
db.orders.find({ status: "open" }).sort({ createdAt: -1 })
db.orders.find({ status: "open", total: { $gt: 500 } }).sort({ createdAt: -1 })

They all touch the same collection, but the best index may be different for each.

`explain()` is the fastest truth source

When a query is slow, run it with explain("executionStats").

db.orders.find({ status: "open" }).sort({ createdAt: -1 }).explain("executionStats")

This gives you the two numbers that matter most:

totalDocsExamined
totalKeysExamined

MongoDB’s official docs make this especially useful:

totalDocsExamined tells you how many documents MongoDB actually had to inspect during execution
totalKeysExamined tells you how many index entries it scanned

If those numbers are much larger than nReturned, your query is doing too much work.

How to read the results without overthinking it

1. Look for `COLLSCAN`

If the winning plan contains COLLSCAN, MongoDB is scanning the collection instead of using an index. That is the clearest sign that an appropriate index is missing.

2. Look for `IXSCAN`

If you see IXSCAN, that is better, but it does not automatically mean the query is healthy. MongoDB may still be scanning a large portion of the index or fetching too many documents afterward.

3. Compare examined vs returned

Use this simple mental model:

nReturned is what you wanted
totalDocsExamined is what MongoDB had to touch to get it

If a query returns 20 rows but examines 50,000 documents, the index is wrong, incomplete, or missing.

4. Check whether a `FETCH` is doing expensive cleanup

A common pattern is:

IXSCAN -> FETCH

That means MongoDB used an index to narrow things down, then had to fetch documents to finish filtering or return fields not covered by the index. Sometimes that is acceptable. Sometimes it means the index is only halfway useful.

The three most common missing-index cases

1. Slow filter queries

Example:

db.users.find({ email: "sam@example.com" })

If this query scans the collection, you probably need:

db.users.createIndex({ email: 1 })

This is the easy case: one exact-match field, one straightforward index.

2. Slow filter + sort queries

Example:

db.orders.find({ status: "open" }).sort({ createdAt: -1 })

A single-field index on status might help the filter, but MongoDB may still need to do extra sort work. This is where compound indexes matter.

The official MongoDB guidance here is the ESR rule:

Equality first
Sort next
Range last

For the example above, the better index is usually:

db.orders.createIndex({ status: 1, createdAt: -1 })

That gives MongoDB a way to find matching rows and return them in the right order from the index.

3. Slow range queries

Example:

db.orders.find({
  status: "open",
  total: { $gt: 500 }
}).sort({ createdAt: -1 })

This is where field order becomes critical. Following ESR, a strong candidate is:

db.orders.createIndex({ status: 1, createdAt: -1, total: 1 })

The equality field goes first, the sort field next, and the range field last.

Why? Because MongoDB can use equality matches to narrow the search early, use the index for sort support, and then apply the range condition after that.

A practical decision framework

When you are deciding whether an index is missing, use this order:

What does the query filter on exactly?
Does it sort?
Does it use ranges like $gt, $lt, or date windows?
Does it only need a few fields back?

That sequence usually gets you to the right index much faster than staring at every existing index in the collection.

Don’t create the first index that “works”

This is where teams get into trouble. A query improves, everyone moves on, and six months later the collection has 14 overlapping indexes that all cost write performance and RAM.

Before adding an index, check for:

an existing compound index that already covers the prefix you need
a sort order mismatch
a projection that could make the query covered
a low-selectivity field that is not worth indexing by itself

For example, an index on:

{ status: 1, createdAt: -1 }

may make a separate { status: 1 } index redundant for many workloads.

Covered queries: the gold standard

MongoDB can sometimes answer a query directly from the index without reading the underlying documents. That is a covered query.

The official explain guidance is simple here:

if the query is covered, you can see IXSCAN without a descendant FETCH
totalDocsExamined can be 0

Example:

db.users.find(
  { email: "sam@example.com" },
  { _id: 0, email: 1, plan: 1 }
)

If you run that query often, this index can be very efficient:

db.users.createIndex({ email: 1, plan: 1 })

Now MongoDB may be able to match and return the result from the index alone.

Sort problems are often index problems in disguise

MongoDB’s official sort docs are clear: if MongoDB cannot use an index to satisfy a sort, it has to sort in memory.

That means a query can have an index and still be slow if:

the sort fields are not in the index
the sort fields are in the wrong order
the query does not satisfy the prefix rules needed to use the sort portion of the compound index

This is one of the easiest ways to build an index that looks reasonable on paper but still underperforms in production.

Arrays, multikey indexes, and why things get weird

If the query filters on an array field, MongoDB uses a multikey index. These are powerful, but they can change how an index behaves, especially in compound patterns.

Use extra caution when:

the query sorts after filtering on array fields
multiple candidate fields are arrays
the index looks correct but explain still shows heavy examined counts

When array fields are involved, you should trust explain() more than intuition.

Indexes should match workload, not schema

A common anti-pattern is indexing fields because they look important in the document:

email
status
type
createdAt

That is schema-driven indexing. It feels safe, but it often creates bloat.

The better approach is workload-driven indexing:

which queries actually run often?
which ones are slow?
which ones affect user-facing latency or critical jobs?

If a field is important but never queried, it probably does not need an index.

Use the smallest index that solves the problem

This is the discipline most teams skip.

A good index:

reduces examined documents sharply
supports the actual sort pattern
avoids unnecessary extra fields
does not duplicate another index you already have

A bad index:

“kind of helps”
is wider than necessary
overlaps heavily with existing indexes
gets added without checking write cost

Every index adds maintenance overhead on inserts, updates, and deletes. Faster reads are not free.

A reliable workflow for finding missing indexes

Use this every time:

capture the exact slow query
run explain("executionStats")
look for COLLSCAN, large examined counts, or expensive sort behavior
inspect existing indexes before creating anything new
design the smallest candidate index using ESR where relevant
run the query again with explain("executionStats")
compare nReturned, totalDocsExamined, and totalKeysExamined

If the examined counts fall sharply and the plan becomes cleaner, you found the right fix.

How Spanna helps

Spanna makes this workflow much less annoying:

you can inspect query behavior without jumping between shell tabs
you can compare explain output with the collection’s current indexes
you can reason about filter, sort, and index shape in one place

That matters because missing-index work is rarely about raw MongoDB syntax. It is usually about diagnosing the gap between what the query asks for and what the index layout actually supports.

Summary

Finding missing indexes is not about adding indexes everywhere. It is about identifying where MongoDB is doing unnecessary work, then giving it a faster path.

Start with the real query. Use explain("executionStats"). Watch COLLSCAN, totalDocsExamined, and totalKeysExamined. Follow ESR for compound indexes. Avoid duplicates. Keep the fix as small as possible.

Do that consistently, and you will stop treating indexing as guesswork and start treating it like performance engineering.

What “missing index” usually looks like

Start with the real query, not your guess

explain() is the fastest truth source

How to read the results without overthinking it

1. Look for COLLSCAN

2. Look for IXSCAN

3. Compare examined vs returned

4. Check whether a FETCH is doing expensive cleanup

The three most common missing-index cases

1. Slow filter queries

2. Slow filter + sort queries

3. Slow range queries

A practical decision framework

Don’t create the first index that “works”

Covered queries: the gold standard

Sort problems are often index problems in disguise

Arrays, multikey indexes, and why things get weird

Indexes should match workload, not schema

Use the smallest index that solves the problem

A reliable workflow for finding missing indexes

How Spanna helps

Summary

`explain()` is the fastest truth source

1. Look for `COLLSCAN`

2. Look for `IXSCAN`

4. Check whether a `FETCH` is doing expensive cleanup