Backup and recovery of sharded clusters
Overview of Backup and Recovery for Sharded Clusters
Backing up and restoring a MongoDB sharded cluster is more complex than for a standalone deployment because the data is distributed. A consistent backup has to account for the config servers, every shard, and the state of the balancer.
Backup Strategies for Sharded Clusters
Full Backup and Incremental Backup
Full backups are typically performed with the mongodump tool, which suits small clusters or infrequent backups. On MongoDB 4.2 and later, mongodump cannot provide a consistent backup of a sharded cluster while sharded transactions are in progress. For large production environments, combine full backups with oplog-based backups:
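# Example command for full backup (through mongos)
mongodump --host cluster.example.com --port 27017 --out /backup/full
# Oplog-based backup: --oplog captures the oplog entries written during the dump.
# It requires a replica set member rather than mongos, so this targets a shard directly.
mongodump --host shard1.example.com --port 27018 --oplog --out /backup/incr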
Filesystem Snapshot Backup
For environments using LVM or storage volume management, filesystem snapshots are a more efficient backup method:
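- Stop the balancer (from a mongos):
sh.stopBalancer()
- Lock a member of each shard (typically a secondary) and of the config server replica set:
db.fsyncLock()
- Create the storage snapshots
- Unlock every instance that was locked:
db.fsyncUnlock()
- Re-enable the balancer:
sh.startBalancer()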
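The ordering of these steps matters most when something fails partway through, because a shard left locked blocks writes. The following is a minimal sketch of one way to guarantee the unlock and balancer steps always run. The `ops` adapter and its function names are hypothetical. In practice each one would wrap the corresponding mongosh call or storage command.

```javascript
// Sketch: run the snapshot steps in order and always undo locks and balancer state.
// `ops` is a hypothetical adapter, not a MongoDB API.
function snapshotCluster(shards, ops) {
  const locked = [];
  ops.stopBalancer();
  try {
    for (const shard of shards) {
      ops.fsyncLock(shard);
      locked.push(shard);
    }
    for (const shard of shards) {
      ops.createSnapshot(shard);
    }
  } finally {
    // Unlock only the shards that were actually locked, in reverse order.
    for (const shard of locked.reverse()) {
      ops.fsyncUnlock(shard);
    }
    ops.startBalancer();
  }
}

// Usage with a recording stub standing in for real cluster calls.
const log = [];
const stub = {
  stopBalancer: () => log.push("stopBalancer"),
  fsyncLock: (s) => log.push(`lock ${s}`),
  createSnapshot: (s) => log.push(`snapshot ${s}`),
  fsyncUnlock: (s) => log.push(`unlock ${s}`),
  startBalancer: () => log.push("startBalancer"),
};
snapshotCluster(["shard1", "shard2"], stub);
console.log(log.join(" -> "));
```

The `finally` block is the design point here: if a snapshot throws, the error still propagates, but only after the locked shards are released and the balancer is restarted.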
Backing Up Config Servers
Config servers store cluster metadata and must remain consistent with data backups:
mongodump --host cfg1.example.com --port 27019 --out /backup/config
Recovery Methods for Sharded Clusters
Full Cluster Recovery
When rebuilding an entire cluster, the recovery process should follow this sequence:
- Restore config servers first:
mongorestore --host cfg1.example.com --port 27019 /backup/config
- Restore data for each shard:
mongorestore --host shard1.example.com --port 27018 /backup/shard1
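- Finally, restart the mongos routers against the restored config servers (mongos stores no persistent data of its own, so there is nothing to restore on it)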
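The sequence above can be captured as a small helper that emits the restore commands in the required order. This is only a sketch: the `{ host, port, dir }` shape is an assumption made for illustration, and the hosts and paths are the example values used in this article.

```javascript
// Sketch: build mongorestore invocations in restore order.
// The target shape ({ host, port, dir }) is assumed for illustration.
function buildRestorePlan(configServer, shards) {
  const toCommand = ({ host, port, dir }) =>
    `mongorestore --host ${host} --port ${port} ${dir}`;
  return [
    toCommand(configServer), // 1. config server metadata first
    ...shards.map(toCommand), // 2. then each shard's data
  ];
}

const plan = buildRestorePlan(
  { host: "cfg1.example.com", port: 27019, dir: "/backup/config" },
  [{ host: "shard1.example.com", port: 27018, dir: "/backup/shard1" }]
);
plan.forEach((cmd) => console.log(cmd));
```

Generating the list instead of typing commands by hand makes it harder to restore a shard before the metadata that describes it.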
Single Shard Recovery
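Recovering a single shard requires special handling. Note that removeShard starts draining the shard's chunks to the remaining shards, which requires the shard to be available:
// 1. Remove the shard from the cluster; rerun until the reported state is "completed"
db.adminCommand({ removeShard: "shard1" });
// 2. Restore the shard's data
// 3. Re-add the shard as <replicaSetName>/<host:port> (replica set name "shard1" assumed here)
sh.addShard("shard1/shard1.example.com:27018");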
Point-in-Time Recovery
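Replay the oplog to recover to a specific point in time with second-level precision. This requires a dump that contains oplog data (for example, one taken with --oplog). The --oplogLimit value has the form <seconds>:<ordinal>, and oplog entries at or after that timestamp are not applied: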
mongorestore --oplogReplay --oplogLimit "1654012800:1" /backup/full
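The first part of the `--oplogLimit` value is a Unix timestamp in seconds, which is awkward to compute by hand. A minimal sketch of a converter follows. The function name `toOplogLimit` is hypothetical, and the default ordinal of 1 simply mirrors the example above.

```javascript
// Sketch: turn a target recovery time into a `<seconds>:<ordinal>` string.
function toOplogLimit(date, ordinal = 1) {
  const seconds = Math.floor(date.getTime() / 1000);
  if (!Number.isFinite(seconds)) {
    throw new RangeError("invalid date");
  }
  return `${seconds}:${ordinal}`;
}

// 2022-05-31 16:00:00 UTC corresponds to the value used in the command above.
console.log(toOplogLimit(new Date("2022-05-31T16:00:00Z"))); // 1654012800:1
```

Writing the target time as an explicit UTC string avoids surprises from the local time zone of the machine running the restore.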
Backup and Recovery Validation
Data Consistency Checks
After recovery, execute validation scripts:
// Compare document counts
function compareCounts(db1, db2, collection) {
  const count1 = db1[collection].countDocuments({});
  const count2 = db2[collection].countDocuments({});
  return count1 === count2;
}
// Sample data comparison: sample from db1, then fetch the same _ids from db2
// (two independent $sample calls would return unrelated documents)
function sampleCompare(db1, db2, collection, sampleSize) {
  const docs1 = db1[collection].aggregate([{ $sample: { size: sampleSize } }]).toArray();
  const ids = docs1.map((doc) => doc._id);
  const docs2 = db2[collection].find({ _id: { $in: ids } }).toArray();
  // Comparison logic...
}
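One possible shape for the comparison step is sketched below: take the sampled source documents and look each one up in the restored copy by `_id`. The `lookupRestored` callback is hypothetical. In mongosh it could be something like `(id) => restoredDb[collection].findOne({ _id: id })`. The example uses plain in-memory objects so it runs under Node.js without a database, and deep equality on BSON types such as ObjectId may need extra care.

```javascript
const { isDeepStrictEqual } = require("util");

// Sketch: report sampled documents that are missing or different after restore.
// `lookupRestored` is a hypothetical callback returning the restored document or null/undefined.
function diffSample(sourceDocs, lookupRestored) {
  const problems = [];
  for (const doc of sourceDocs) {
    const restored = lookupRestored(doc._id);
    if (restored === null || restored === undefined) {
      problems.push({ _id: doc._id, issue: "missing" });
    } else if (!isDeepStrictEqual(doc, restored)) {
      problems.push({ _id: doc._id, issue: "different" });
    }
  }
  return problems;
}

// Usage with in-memory data standing in for the source and restored databases.
const source = [{ _id: 1, qty: 5 }, { _id: 2, qty: 7 }, { _id: 3, qty: 9 }];
const restoredById = new Map([
  [1, { _id: 1, qty: 5 }],
  [2, { _id: 2, qty: 8 }],
]);
const problems = diffSample(source, (id) => restoredById.get(id));
console.log(problems);
```

Returning a list of problems instead of a single boolean makes it easier to see whether a failure is one stray document or a systematic gap.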
Shard Key Distribution Validation
Ensure data distribution matches expectations after recovery:
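// Check how the collection's data is distributed across shards (shell helper, run through mongos)
db.getSiblingDB("database").collection.getShardDistribution();
// Validate chunk distribution
// (on MongoDB 5.0 and later, config.chunks is keyed by collection uuid rather than ns)
db.getSiblingDB("config").chunks.find({ ns: "database.collection" });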
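The raw chunk documents are easier to judge once they are tallied per shard. The sketch below assumes each chunk document carries a `shard` field, as documents in `config.chunks` do, and uses a hand-written array in place of a real query result. Chunk counts are only a rough signal of balance, since chunks can differ in size.

```javascript
// Sketch: tally chunks per shard from config.chunks-style documents.
function chunksPerShard(chunkDocs) {
  const counts = {};
  for (const { shard } of chunkDocs) {
    counts[shard] = (counts[shard] || 0) + 1;
  }
  return counts;
}

// Difference between the shard with the most chunks and the one with the fewest.
function chunkSpread(counts) {
  const values = Object.values(counts);
  return values.length === 0 ? 0 : Math.max(...values) - Math.min(...values);
}

// Hypothetical sample standing in for the result of the chunks query.
const sampleChunks = [
  { shard: "shard1" },
  { shard: "shard1" },
  { shard: "shard2" },
  { shard: "shard1" },
];
const counts = chunksPerShard(sampleChunks);
console.log(counts, chunkSpread(counts));
```

Comparing the tally from before the backup with the tally after the restore is a quick way to spot a shard that came back with fewer chunks than expected.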
Optimization Practices for Backup and Recovery
Parallel Backup Techniques
Large clusters can use parallel backup strategies:
# Parallel backup for multiple shards
for shard in shard1 shard2 shard3; do
  mongodump --host "$shard" --out "/backup/$shard" &
done
wait
Automated Incremental Backup
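Combine mongodump with cron to automate the oplog-based backups. Each run writes a full dump to a dated directory rather than a true incremental backup, and --oplog only works when the host is a replica set member rather than mongos: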
0 2 * * * mongodump --host cluster.example.com --oplog --out /backup/incr/$(date +\%Y\%m\%d)
Backup Compression and Encryption
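Use mongodump's --gzip option to compress the dump, then archive and encrypt it (openssl prompts for a passphrase):
mongodump --gzip --out /backup/compressed
tar -cf backup.tar -C /backup compressed
openssl enc -aes-256-cbc -pbkdf2 -salt -in backup.tar -out backup.tar.enc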
Handling Special Scenarios
Cluster Expansion During Backup
If new shards are added while a backup is in progress:
- Record the expansion timestamp
- Perform a separate backup for the new shard
- During recovery, restore existing shards first, then handle the new shard
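The restore ordering described above can be derived from the recorded expansion timestamps. The following is a rough sketch under stated assumptions: the `addedAt` field, the shard list shape, and the dates are all hypothetical, with times held as epoch milliseconds.

```javascript
// Sketch: shards that existed when the main backup started come first,
// followed by shards added afterwards in the order they joined.
// The `addedAt` field (epoch milliseconds) is an assumption for illustration.
function orderShardRestores(shards, backupStartedAt) {
  const existing = shards.filter((s) => s.addedAt <= backupStartedAt);
  const added = shards
    .filter((s) => s.addedAt > backupStartedAt)
    .sort((a, b) => a.addedAt - b.addedAt);
  return [...existing, ...added].map((s) => s.name);
}

// Hypothetical cluster: shard3 joined one hour after the backup began.
const backupStartedAt = Date.parse("2024-01-10T02:00:00Z");
const shards = [
  { name: "shard3", addedAt: Date.parse("2024-01-10T03:00:00Z") },
  { name: "shard1", addedAt: Date.parse("2023-01-01T00:00:00Z") },
  { name: "shard2", addedAt: Date.parse("2023-01-01T00:00:00Z") },
];
const restoreOrder = orderShardRestores(shards, backupStartedAt);
console.log(restoreOrder);
```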
Cross-Version Backup and Recovery
Considerations for backup and recovery across different MongoDB versions:
- Direct recovery is possible between clusters with the same major version
- Cross-major version recovery requires upgrading the backup tool first
- Use the --noIndexRestore option to avoid index compatibility issues
Monitoring and Alert Configuration
Backup Status Monitoring
Configure monitoring to check that backups are completing on schedule:
// Check the latest backup file (Node.js)
const fs = require("fs");
const lastBackup = fs.statSync("/backup/latest/backup.log");
if (Date.now() - lastBackup.mtimeMs > 86400000) { // older than 24 hours
  console.error("Backup overdue");
}
Recovery Performance Metrics
Track key metrics for recovery operations:
// Record recovery time
const start = Date.now();
// Execute recovery operation...
const duration = (Date.now() - start) / 1000; // seconds
// backupSize is assumed to hold the backup size in GB, set elsewhere
db.metrics.insertOne({
  event: "restore",
  duration: duration,
  sizeGB: backupSize
});