Export collections safely

Learn how to export MongoDB collections safely with the right scope, format, and verification steps so you do not leak data or create unreliable handoff files.

Exporting a MongoDB collection sounds harmless until you export the wrong environment, dump far more data than intended, or hand someone a JSON file that quietly lost the types you cared about.

That is why safe export work is not just about getting data out. It is about controlling scope, preserving meaning, and making sure the file is fit for its next job.

First decide what the export is for

Not every export has the same goal.

You might be exporting to:

  • debug a production issue
  • seed a local environment
  • validate a migration
  • share sample data with another team
  • archive a filtered slice of a collection

The destination use case changes the correct export strategy.

For example:

  • a developer fixture export should be small and scrubbed
  • a migration validation export may need exact _id preservation
  • a backup-style export may need BSON tooling instead of JSON

If you skip this decision, it is easy to produce a file that is technically valid but operationally wrong.

The safest export rule: export less

The easiest way to export safely is to export less data.

That means:

  • fewer collections
  • fewer documents
  • fewer fields
  • less sensitive information

Most bad exports happen because someone defaults to “everything” when the actual need was “a small, representative subset.”

Start by confirming the source environment

Before running any export, confirm:

  1. cluster
  2. database
  3. collection
  4. whether the data is production, staging, or local

This sounds trivial, but it is the main control that prevents accidental data leakage.

If there is any chance you are on production, slow down and verify the connection string before you export.
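
One way to make that check mechanical is a small guard that runs before any export. The following is a minimal sketch, assuming that production hosts can be recognised by a "prod" substring in the connection string and that an ALLOW_PROD_EXPORT variable is the agreed override. Both conventions are hypothetical, so adapt them to your own naming scheme.

```shell
# classify_uri URI: guess the environment from a MongoDB connection string.
# ASSUMPTION: hostnames contain "prod" or "stag"; adjust to your own naming.
# "prod" is checked first so that ambiguous strings err on the side of caution.
classify_uri() {
  case "$1" in
    *prod*)                  echo "production" ;;
    *stag*)                  echo "staging" ;;
    *localhost*|*127.0.0.1*) echo "local" ;;
    *)                       echo "unknown" ;;
  esac
}

# guard_export URI: succeed only when the target is clearly non-production,
# or when the operator has opted in with ALLOW_PROD_EXPORT=yes (hypothetical).
guard_export() {
  env_name=$(classify_uri "$1")
  echo "Target environment: $env_name" >&2
  case "$env_name" in
    local|staging) return 0 ;;
    *) [ "${ALLOW_PROD_EXPORT:-no}" = "yes" ] ;;
  esac
}
```

Chaining the guard in front of the export command (guard_export "<connection-string>" && mongoexport ...) means an unknown or production target stops the export unless someone has deliberately opted in.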

Pick the right tool for the job

MongoDB’s tools are good, but they are not interchangeable.

Use mongoexport when:

  • you want JSON or CSV
  • you need a filtered or field-limited export
  • the file is for inspection, sharing, or selective transfer

Use mongodump when:

  • you want a backup-style export
  • you need higher-fidelity database preservation
  • you plan to restore into MongoDB with mongorestore

This distinction matters. MongoDB’s own tooling docs treat mongodump/mongorestore as the better fit for backup and restore style workflows, while mongoexport is better for JSON/CSV extraction and targeted data movement.
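
For comparison with the mongoexport commands in this guide, a backup-style round trip for a single collection might look like the sketch below. The function names are illustrative only, and the database, collection, and directory values are placeholders.

```shell
# dump_collection URI DB COLLECTION OUTDIR
# Writes BSON plus metadata under OUTDIR/DB/ (for example dump/app/orders.bson).
dump_collection() {
  mongodump --uri "$1" --db "$2" --collection "$3" --out "$4"
}

# restore_collection URI DB COLLECTION DUMPDIR
# Restores only the named namespace from a directory produced by mongodump.
restore_collection() {
  mongorestore --uri "$1" --nsInclude "$2.$3" "$4"
}
```

Because the dump is BSON rather than JSON, types are carried through without any Extended JSON conversion, which is the main reason to prefer this path when the data is going back into MongoDB.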

JSON format choice matters more than most people think

MongoDB uses Extended JSON in its tools.

The official mongoexport docs note that:

  • relaxed Extended JSON is the default
  • canonical Extended JSON is available with --jsonFormat=canonical

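That choice affects how clearly BSON types survive in the exported file. Canonical mode wraps values in explicit type markers, while relaxed mode writes most numbers as plain JSON numbers, so the distinction between 32-bit integers, 64-bit integers, and doubles can be lost.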

If you care about type fidelity for round-tripping back into MongoDB, canonical JSON is usually the safer choice:

mongoexport --uri "<connection-string>" \
  --db app \
  --collection orders \
  --jsonFormat=canonical \
  --out orders.json

If the export is mainly for reading, debugging, or lightweight sharing, relaxed JSON may be fine.
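
To see which format a file actually uses, it can help to look for the type wrappers that canonical mode emits. The sketch below is a heuristic rather than a parser: the sample lines in the comments are illustrative shapes, the function name is made up for this example, and edge cases such as dates outside the usual range can produce a $numberLong wrapper even in relaxed output.

```shell
# Illustrative shapes of the same document in each format:
#   relaxed:   {"qty":5,"createdAt":{"$date":"2024-01-01T00:00:00Z"}}
#   canonical: {"qty":{"$numberInt":"5"},"createdAt":{"$date":{"$numberLong":"1704067200000"}}}

# has_canonical_types FILE: succeed if FILE contains canonical numeric wrappers.
# Heuristic only; it does not parse the JSON.
has_canonical_types() {
  grep -q -e '"\$numberInt"' -e '"\$numberLong"' "$1"
}
```

A check like this is most useful right before a re-import, when a relaxed file handed over by mistake would silently change numeric types.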

Use filters aggressively

A safe export is usually a filtered export.

Example:

mongoexport --uri "<connection-string>" \
  --db app \
  --collection orders \
  --query '{"status":"failed"}' \
  --out failed-orders.json

This is safer than dumping the full collection and hoping the recipient only looks at the relevant documents.

Filters help you:

  • reduce file size
  • reduce sensitive data exposure
  • make verification easier
  • generate files that are actually useful
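
Filters can also bound an export by time and by size. The sketch below assumes the orders documents carry a createdAt date field, which is a hypothetical schema detail, and uses a cap of 1000 documents purely as an example. The function names are illustrative. The --query value is written as Extended JSON, so the date is wrapped in $date.

```shell
# failed_since_query ISO_DATE: print a filter for failed orders created on or
# after ISO_DATE. ASSUMPTION: documents have a "createdAt" date field.
failed_since_query() {
  printf '{"status":"failed","createdAt":{"$gte":{"$date":"%s"}}}' "$1"
}

# export_failed_since URI ISO_DATE: filtered, capped export.
# --limit keeps the file small even if the filter matches more than expected.
export_failed_since() {
  mongoexport --uri "$1" \
    --db app \
    --collection orders \
    --query "$(failed_since_query "$2")" \
    --limit 1000 \
    --out failed-orders.json
}
```

Building the filter in one place makes it easy to record exactly which query produced a given file.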

Export only the fields you need

If the consumer only needs a few fields, export a few fields.

That is one of the easiest ways to reduce risk.

Example:

mongoexport --uri "<connection-string>" \
  --db app \
  --collection users \
  --fields "_id,email,plan,createdAt" \
  --out users-summary.json

That is far safer than exporting password-reset metadata, internal flags, or nested blobs nobody actually needs.
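
Field selection is also what makes CSV output possible, since mongoexport needs an explicit field list for --type=csv. A minimal sketch, with illustrative function names, that exports the same summary as CSV and then checks that the header row matches the intended field list:

```shell
# export_users_csv URI: CSV export restricted to an explicit field list.
export_users_csv() {
  mongoexport --uri "$1" \
    --db app \
    --collection users \
    --type=csv \
    --fields "_id,email,plan,createdAt" \
    --out users-summary.csv
}

# csv_header_matches FILE EXPECTED: succeed if the first line of FILE is
# exactly the expected comma-separated field list.
csv_header_matches() {
  [ "$(head -n 1 "$1")" = "$2" ]
}
```

The header check is cheap, and it catches the case where a command was edited later and a field quietly crept back in.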

Watch sensitive data like a hawk

Safe export work is as much about omission as inclusion.

Before exporting, ask:

  • does this collection contain credentials, tokens, or secrets?
  • does it contain regulated personal data?
  • does the consumer really need production values?
  • should fields be omitted, masked, or transformed?

If the export is meant for debugging or local development, the safest file is often a sanitised one, not a faithful one.
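
Part of that review can be automated by scanning the exported file for field names that should never leave the database. The key list below is entirely hypothetical and the function name is illustrative; replace the list with the sensitive field names from your own schema. This is a text match on JSON keys, not a full parse, so treat a clean result as a sanity check rather than proof.

```shell
# ASSUMPTION: example key names only; use the ones from your own schema.
SENSITIVE_KEYS='password passwordHash resetToken apiKey ssn'

# scan_sensitive FILE: fail if any sensitive key appears as a JSON key in FILE.
scan_sensitive() {
  found=0
  for key in $SENSITIVE_KEYS; do
    if grep -q "\"$key\"[[:space:]]*:" "$1"; then
      echo "sensitive key present: $key" >&2
      found=1
    fi
  done
  [ "$found" -eq 0 ]
}
```

Running the scan as the last step before sharing (scan_sensitive users-handoff.json) turns the checklist above into a hard stop instead of a reminder.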

Be careful with array and nested document size

Collections with large nested objects or arrays can produce exports that look manageable by document count but are huge in practice.

That matters because:

  • exports become harder to inspect
  • transfers take longer
  • imports become heavier
  • accidental over-sharing becomes easier

A filtered export with field selection is often the difference between a usable file and a liability.
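
A quick way to spot oversized documents is to measure the longest line in the export. This sketch assumes the default output of one document per line (no --jsonArray and no --pretty), and the function name is illustrative.

```shell
# largest_doc_bytes FILE: print the length in bytes of the longest line,
# which corresponds to the largest exported document in line-per-document output.
largest_doc_bytes() {
  LC_ALL=C awk '{ n = length($0); if (n > max) max = n } END { print max + 0 }' "$1"
}
```

If the largest document is orders of magnitude bigger than a typical one, that is usually a sign that a nested array or blob should have been excluded with --fields.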

Default output shape matters

By default, mongoexport writes one JSON document per line, one line for each MongoDB document (newline-delimited JSON). If you need a single JSON array instead, use --jsonArray.

Example:

mongoexport --uri "<connection-string>" \
  --db app \
  --collection products \
  --jsonArray \
  --out products.json

That matters because some downstream tools expect newline-delimited JSON, and others expect a single array payload.

Pick the format based on the next consumer, not personal preference.
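
When a file arrives without a note about how it was produced, the first character is usually enough to tell the two shapes apart. A minimal sketch, with an illustrative function name, assuming the file was not written with --pretty:

```shell
# export_shape FILE: print "array" for --jsonArray output, "ndjson" for the
# default one-document-per-line output, and "unknown" otherwise.
export_shape() {
  first=$(head -n 1 "$1" | cut -c1)
  case "$first" in
    "[") echo "array" ;;
    "{") echo "ndjson" ;;
    *)   echo "unknown" ;;
  esac
}
```

Checking the shape before handing the file to an importer avoids the common failure where a line-oriented tool is given a single array, or the reverse.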

Read preference matters on busy systems

MongoDB’s mongoexport docs note that the default read preference is primary.

That means an export reads from the primary, and competes with application traffic there, unless you explicitly override it with --readPreference.

On sensitive or busy systems, that is worth thinking about. If your deployment and consistency requirements allow it, a secondary-friendly export strategy may be preferable for operational safety.
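
A sketch of such an override, using the --readPreference option with the secondaryPreferred mode. The function name is illustrative. Secondaries can lag behind the primary, so this only suits exports that tolerate slightly stale data.

```shell
# export_from_secondary URI: prefer a secondary for the export reads and fall
# back to the primary only if no secondary is available.
export_from_secondary() {
  mongoexport --uri "$1" \
    --db app \
    --collection orders \
    --readPreference secondaryPreferred \
    --out orders.json
}
```

For a migration validation export, where exactness matters more than load, leaving the default of primary in place is usually the better trade.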

How to verify the export

Do not assume a file is good because it exists.

After export:

  1. check the file size
  2. inspect the first few records
  3. confirm the expected field set
  4. confirm the expected document count when relevant
  5. verify the source environment and filter used

If the file is supposed to round-trip back into MongoDB, also verify that the JSON format preserves the BSON types you care about.
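
Several of these checks can be bundled into one helper. The sketch below assumes the default one-document-per-line output and takes the expected count as an argument. The function name is illustrative, and the mongosh command in the comment is one possible way to obtain that count, using this guide's example database and filter.

```shell
# The expected count could come from the source, for example:
#   mongosh "<connection-string>" --quiet \
#     --eval 'db.getSiblingDB("app").orders.countDocuments({status:"failed"})'

# verify_export FILE EXPECTED_COUNT: check that FILE is non-empty, show its
# first records for a visual check, and compare its line count to EXPECTED_COUNT.
verify_export() {
  file=$1
  expected=$2
  [ -s "$file" ] || { echo "missing or empty: $file" >&2; return 1; }
  actual=$(wc -l < "$file" | tr -d ' ')
  echo "documents: $actual (expected $expected)" >&2
  head -n 3 "$file" >&2
  [ "$actual" -eq "$expected" ]
}
```

A mismatch does not always mean the export is wrong, since the source may have changed between the count and the export, but it is a prompt to find out why before the file is shared.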

Common safe-export workflows

Export a filtered subset

mongoexport --uri "<connection-string>" \
  --db app \
  --collection events \
  --query '{"type":"login"}' \
  --out login-events.json

Export a small handoff file

mongoexport --uri "<connection-string>" \
  --db app \
  --collection users \
  --fields "_id,email,plan" \
  --out users-handoff.json

Export with stronger type preservation

mongoexport --uri "<connection-string>" \
  --db app \
  --collection invoices \
  --jsonFormat=canonical \
  --out invoices.json

What safe export does not mean

Safe export does not mean:

  • “the command succeeded”
  • “the file is big, so it must be complete”
  • “it worked last time”
  • “the collection looked harmless”

Safe export means:

  • the scope was intentional
  • the format was appropriate
  • the data exposure risk was considered
  • the output was checked before reuse or sharing

How Spanna helps

Spanna is useful here because export safety depends on visibility:

  • inspect the collection before export
  • validate counts and shape
  • reason about which fields should stay out
  • compare exported slices against the source when needed

That makes it easier to produce a file that is not just valid, but appropriate for its actual purpose.

Summary

Export collections safely by making three decisions up front: what the file is for, how much data it really needs, and how much fidelity the next step requires.

Use mongoexport for selective JSON or CSV extraction. Use mongodump when you really need backup-style fidelity. Filter aggressively, export only necessary fields, think hard about sensitive data, and verify the file before you treat it as trustworthy.
