For MongoDB Atlas
This page outlines how to export queries and collection/view metadata from MongoDB Atlas and deliver them to a Single Origin S3 bucket.
Export Queries
- Download mongod logs per these instructions
- Extract query logs:
zcat mongod.gz | grep -i "command" > query-logs.txt
- If query logs aren’t present, enable database profiling (see the note after this list)
- Using the timestamp on each log line, bucket the query logs by day, creating one file per day, e.g. query-logs-YYYY-MM-DD.txt (a splitting sketch follows this list)
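If the extracted file is empty, the slow-operation threshold may be filtering commands out of the mongod log. One way to lower it from mongosh is db.setProfilingLevel, which also sets the slowms threshold used for the diagnostic log; this is a sketch only (availability depends on your Atlas tier, and slowms 0 logs nearly every operation, so restore the previous value when the export is done):

# Lower the slow-op threshold so more commands appear in the mongod log.
mongosh "mongodb+srv://<cluster_address>/<db_name>" --quiet \
  --eval 'db.setProfilingLevel(0, { slowms: 0 })'

For the per-day bucketing step, a minimal sketch using POSIX awk, assuming MongoDB 4.4+ structured (JSON) log lines whose timestamp appears as "$date":"YYYY-MM-DDTHH:MM:SS...":

# Split query-logs.txt into one file per day based on each line's timestamp.
awk '{
  if (match($0, /"[$]date":"[0-9]{4}-[0-9]{2}-[0-9]{2}/)) {
    day = substr($0, RSTART + 9, 10)   # the YYYY-MM-DD portion of the match
    print > ("query-logs-" day ".txt")
  }
}' query-logs.txt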
Export Collection and Schema Metadata
For each collection, we need:
- Name
- Schema definition
- Number of documents
- Size in bytes
- A list of index definitions
Save this script as dump_collection_metadata.js:
const dbName = db.getName();
// Load all collection metadata (includes schema validators)
const collInfos = db.getCollectionInfos({ type: "collection" });
collInfos.forEach((collInfo) => {
  const name = collInfo.name;
  const coll = db.getCollection(name);
  // Collection stats
  const stats = coll.stats();
  // Index definitions
  const indexes = coll.getIndexes().map((idx) => ({
    name: idx.name,
    key: idx.key,
    unique: idx.unique || false,
    sparse: idx.sparse || false,
    partialFilterExpression: idx.partialFilterExpression || null
  }));
  // JSON Schema validation, if present
  let jsonSchema = null;
  if (
    collInfo.options &&
    collInfo.options.validator &&
    collInfo.options.validator.$jsonSchema
  ) {
    jsonSchema = collInfo.options.validator.$jsonSchema;
  }
  // Compose result (one JSON object per collection, printed on its own line)
  const result = {
    db_name: dbName,
    collection_name: name,
    document_count: stats.count,
    total_size_bytes: stats.size,
    indexes: indexes,
    json_schema: jsonSchema
  };
  print(JSON.stringify(result));
});
Run the script, dumping to a file named collection_metadata.json:
mongosh "mongodb+srv://<cluster_address>/<db_name>" --quiet --file dump_collection_metadata.js > collection_metadata.json
Export Views
Save script as dump_view_definitions.js:
const dbName = db.getName();
// Get all views in the database
const viewInfos = db.getCollectionInfos({ type: "view" });
viewInfos.forEach((view) => {
  const result = {
    db_name: dbName,
    view_name: view.name,
    source: view.options.viewOn,
    pipeline: view.options.pipeline,
    options: view.options.collation ? { collation: view.options.collation } : {}
  };
  print(JSON.stringify(result));
});
Run:
mongosh "mongodb+srv://<cluster_address>/<db_name>" --quiet --file dump_view_definitions.js > views.json
Deliver to Single Origin S3
Set up IAM role
Prerequisite: an AWS account and permission to create IAM roles
- Log in to the AWS Management Console
- Go to IAM > Roles and click Create role
- For the trusted entity type, choose Custom trust policy
- Allow Single Origin to assume this custom role [1]
- Attach a permissions policy that allows writing to the Single Origin bucket [2]
- Name and create the role, e.g. DataExportToSingleOrigin
- Share the role ARN with Single Origin, e.g. arn:aws:iam::<client-account-id>:role/DataExportToSingleOrigin
[1]
{
"Version": "2025-07-03",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::<single-origin-aws-account-id>:root"
},
"Action": "sts:AssumeRole"
}
]
}
[2]
{
"Version": "2025-07-03",
"Statement": [
{
"Effect": "Allow",
"Action": ["s3:PutObject", "s3:GetObject", "s3:ListBucket"],
"Resource": [
"arn:aws:s3:::schema-parquet-{client-name}",
"arn:aws:s3:::schema-parquet-{client-name}/*"
]
}
]
}
Copy Files
After assuming the DataExportToSingleOrigin role (see the credentials sketch below), upload the data files for a given day. Note that aws s3 cp takes a single source per invocation:
aws s3 cp query-logs-YYYY-MM-DD.txt s3://schema-parquet-{client-name}/YYYY/MM/DD/
aws s3 cp collection_metadata.json s3://schema-parquet-{client-name}/YYYY/MM/DD/
aws s3 cp views.json s3://schema-parquet-{client-name}/YYYY/MM/DD/
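One way to obtain temporary credentials for the role before running the copy is the standard aws sts assume-role call, sketched below; the session name is arbitrary, the account ID is a placeholder, and this only succeeds for a principal permitted by the role's trust policy:

# Assume the export role and export its temporary credentials into the shell.
CREDS=$(aws sts assume-role \
  --role-arn "arn:aws:iam::<client-account-id>:role/DataExportToSingleOrigin" \
  --role-session-name single-origin-export \
  --query 'Credentials.[AccessKeyId,SecretAccessKey,SessionToken]' \
  --output text)
export AWS_ACCESS_KEY_ID=$(echo "$CREDS" | cut -f1)
export AWS_SECRET_ACCESS_KEY=$(echo "$CREDS" | cut -f2)
export AWS_SESSION_TOKEN=$(echo "$CREDS" | cut -f3)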