Oplog backup and recovery
Introduction to Oplog
MongoDB's Oplog (operation log) is a special capped collection used to record all operations that modify the database state. As a core component of replica sets, it enables data synchronization and failure recovery. The Oplog is designed with idempotency principles, where each record represents a minimal granularity of data change operations.
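The idempotency property can be illustrated with a minimal in-memory sketch (plain Node.js, no MongoDB required; all names here are hypothetical): because each entry records absolute values rather than relative changes, replaying the same entry twice leaves the data in the same state as applying it once.

```javascript
// Minimal in-memory model of applying oplog-style entries idempotently.
// Updates set whole field values (never e.g. "increment by 1"), so a
// replayed entry changes nothing the second time.
function applyEntry(store, entry) {
  switch (entry.op) {
    case 'i': // insert: keyed by _id, a re-applied insert just overwrites
      store.set(entry.o._id, { ...entry.o });
      break;
    case 'u': // update: merge the recorded absolute field values
      store.set(entry.o2._id, { ...store.get(entry.o2._id), ...entry.o });
      break;
    case 'd': // delete: deleting twice equals deleting once
      store.delete(entry.o._id);
      break;
  }
  return store;
}

const store = new Map();
applyEntry(store, { op: 'i', o: { _id: 1, name: 'Alice' } });
applyEntry(store, { op: 'u', o2: { _id: 1 }, o: { name: 'Bob' } });
applyEntry(store, { op: 'u', o2: { _id: 1 }, o: { name: 'Bob' } }); // replay: no change
```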
How Oplog Works
The Oplog is stored in the oplog.rs collection within the local database. Its document structure includes the following key fields:
{
  ts: Timestamp(1630000000, 1), // Operation timestamp
  h: NumberLong("123456789"),   // Unique operation identifier
  v: 2,                         // Version number
  op: "i",                      // Operation type (i=insert, u=update, d=delete)
  ns: "test.users",             // Namespace
  o: { _id: 1, name: "Alice" }, // Operation document
  o2: { _id: 1 }                // Condition document for update operations
}
Common operation types include:
- i: Insert document
- u: Update document
- d: Delete document
- c: Execute command
- n: No-op (heartbeat detection)
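Entries in the oplog are ordered by their ts field, a BSON Timestamp composed of a seconds value and a per-second increment that distinguishes operations occurring within the same second. A sketch of that ordering, with plain objects standing in for BSON Timestamps:

```javascript
// Compare two oplog timestamps: first by seconds (t), then by the
// per-second increment (i). Returns <0, 0, or >0 like a sort comparator.
function compareTs(a, b) {
  if (a.t !== b.t) return a.t - b.t;
  return a.i - b.i;
}

const entries = [
  { ts: { t: 1630000000, i: 2 } },
  { ts: { t: 1629999999, i: 5 } },
  { ts: { t: 1630000000, i: 1 } },
];
// Sorting by compareTs reproduces the original oplog order
entries.sort((x, y) => compareTs(x.ts, y.ts));
```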
Oplog Backup Strategies
Physical Backup Solution
Using mongodump for a hot dump of the oplog collection (strictly speaking mongodump is a logical tool; the LVM snapshot below provides the file-level physical copy):
mongodump --host rs0/localhost:27017,localhost:27018 \
--db local --collection oplog.rs \
--out /backup/oplog_$(date +%Y%m%d)
Combining with LVM snapshots for an atomic, file-level backup (note that mongodump reads from a running mongod, not from raw files, so copy the snapshot's data files, or start a temporary mongod on them):
lvcreate --size 1G --snapshot --name mongo-snap /dev/vg0/mongo
mkdir /backup/snapshot
mount /dev/vg0/mongo-snap /backup/snapshot
cp -a /backup/snapshot /backup/oplog_files_$(date +%Y%m%d)
umount /backup/snapshot
Logical Backup Solution
Creating an incremental backup script:
const { MongoClient } = require('mongodb');
const { serialize } = require('bson');
const fs = require('fs');

async function backupOplog(lastTimestamp) {
  // Connect via the replica set name so the read survives a failover
  const client = await MongoClient.connect(
    'mongodb://localhost:27017,localhost:27018/?replicaSet=rs0'
  );
  try {
    const oplog = client.db('local').collection('oplog.rs');
    // Incremental: only fetch entries newer than the last backed-up timestamp
    const query = lastTimestamp ? { ts: { $gt: lastTimestamp } } : {};
    const cursor = oplog.find(query).sort({ $natural: 1 });
    const stream = fs.createWriteStream(`./oplog_backup_${Date.now()}.bson`);
    let lastTs = lastTimestamp;
    for await (const doc of cursor) {
      stream.write(serialize(doc));
      lastTs = doc.ts; // newest timestamp written out so far
    }
    stream.end();
    return lastTs; // pass this into the next incremental run
  } finally {
    await client.close();
  }
}
Oplog Recovery Techniques
Full Recovery Process
- Stop all secondary nodes
- Execute a freeze operation on the primary node:
db.fsyncLock();
- Restore data using mongorestore:
mongorestore --host rs0/localhost:27017 \
--db local --collection oplog.rs \
/backup/oplog_backup.bson
- Unfreeze:
db.fsyncUnlock();
Point-in-Time Recovery
Implementing precise recovery by replaying oplog entries up to a target timestamp in the mongo shell:
const oplog = db.getSiblingDB('local').oplog.rs;
const targetTime = new Timestamp(1630000000, 0);
oplog.find({ ts: { $lte: targetTime }, op: { $in: ['i', 'u', 'd'] } })
  .sort({ $natural: 1 })
  .forEach(doc => {
    // ns is "<database>.<collection>", so split it before resolving the collection
    const dot = doc.ns.indexOf('.');
    const coll = db.getSiblingDB(doc.ns.slice(0, dot))
                   .getCollection(doc.ns.slice(dot + 1));
    switch (doc.op) {
      case 'i':
        coll.insertOne(doc.o);
        break;
      case 'u':
        // Note: newer servers may store updates in a $v:2 diff format
        // that cannot be passed to updateOne directly
        coll.updateOne(doc.o2, doc.o);
        break;
      case 'd':
        coll.deleteOne(doc.o);
        break;
    }
  });
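The cut-off behavior of point-in-time recovery can be simulated without a server. In this sketch, timestamps are simplified to plain numbers and entries are assumed to arrive in oplog ($natural) order; only entries at or before the target time are applied.

```javascript
// Replay oplog-style entries up to (and including) a target timestamp.
function replayTo(entries, targetTs) {
  const state = new Map();
  for (const e of entries) {
    if (e.ts > targetTs) break; // entries are in oplog order, so stop here
    if (e.op === 'i') state.set(e.o._id, { ...e.o });
    else if (e.op === 'u') state.set(e.o2._id, { ...state.get(e.o2._id), ...e.o });
    else if (e.op === 'd') state.delete(e.o._id);
  }
  return state;
}

const entries = [
  { ts: 100, op: 'i', o: { _id: 1, name: 'Alice' } },
  { ts: 200, op: 'u', o2: { _id: 1 }, o: { name: 'Bob' } },
  { ts: 300, op: 'd', o: { _id: 1 } }, // after the target: must not run
];
const state = replayTo(entries, 250);
```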
Production Environment Practices
Capacity Planning Recommendations
Oplog size calculation formula:
Required Oplog Size = (Write Rate × Longest Failure Recovery Time) × Safety Factor (1.5-2)
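As a worked example of the formula (the numbers are hypothetical): a cluster that writes 50 MB of oplog per hour and must tolerate a 24-hour outage, with a 2x safety factor, needs about 2400 MB of oplog.

```javascript
// Required oplog size = write rate x longest recovery window x safety factor
function requiredOplogMB(writeRateMBPerHour, recoveryHours, safetyFactor) {
  return writeRateMBPerHour * recoveryHours * safetyFactor;
}

const sizeMB = requiredOplogMB(50, 24, 2); // → 2400 MB
```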
Example configuration:
// Set Oplog size to 5 GB (the size argument is in MB)
db.adminCommand({
  replSetResizeOplog: 1,
  size: 5 * 1024
});
Monitoring and Alerts
Key monitoring metrics:
// Get Oplog status
const oplogStatus = db.getReplicationInfo();
printjson({
  "usedSize": oplogStatus.usedMB + " MB",
  "timeDiff": oplogStatus.timeDiffHours + " hours",
  "firstEvent": oplogStatus.tFirst,
  "lastEvent": oplogStatus.tLast
});
// Set alert thresholds
const threshold = 24; // hours
if (oplogStatus.timeDiffHours < threshold) {
  print("Warning: Oplog window is less than " + threshold + " hours");
}
Advanced Use Cases
Cross-Cluster Synchronization
Implementing real-time synchronization with Change Stream:
const { MongoClient } = require('mongodb');

async function syncToTarget() {
  const source = await MongoClient.connect(
    'mongodb://localhost:27017,localhost:27018/?replicaSet=rs0'
  );
  // Connect to the target cluster once, instead of once per change event
  const target = await MongoClient.connect('mongodb://target-cluster');
  const targetColl = target.db('prod').collection('users');

  const pipeline = [
    { $match: { operationType: { $in: ['insert', 'update', 'delete'] } } }
  ];
  const changeStream = source.db('test').collection('users').watch(pipeline);

  changeStream.on('change', (change) => {
    switch (change.operationType) {
      case 'insert':
        return targetColl.insertOne(change.fullDocument);
      case 'update':
        return targetColl.updateOne(
          { _id: change.documentKey._id },
          { $set: change.updateDescription.updatedFields }
        );
      case 'delete':
        return targetColl.deleteOne({ _id: change.documentKey._id });
    }
  });
}
Data Audit Implementation
Building an operation audit system:
// Create a capped audit collection
db.createCollection("auditLog", {
  capped: true,
  size: 100000000,
  max: 100000
});
// Oplog listener processor
const processOplog = async (oplogDoc) => {
  if (!['i', 'u', 'd'].includes(oplogDoc.op)) return;
  const auditEntry = {
    timestamp: new Date(),
    operation: oplogDoc.op,
    namespace: oplogDoc.ns,
    // Note: oplog entries do not record the acting user; capture it at the
    // application layer or enable MongoDB's own auditing if that is needed
    details: {
      // For updates the _id lives in o2; for inserts and deletes it is in o
      documentKey: (oplogDoc.o2 && oplogDoc.o2._id) || oplogDoc.o._id,
      changes: oplogDoc.op === 'u' ? oplogDoc.o : null
    }
  };
  await db.auditLog.insertOne(auditEntry);
};
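Like the oplog itself, the audit collection above is capped: once the limit is reached, the oldest entries are silently evicted. A simplified sketch of that ring-buffer behavior (real capped collections also cap by total bytes, which is omitted here):

```javascript
// A capped collection keeps at most `max` documents, discarding the
// oldest in insertion order once the limit is exceeded.
class CappedLog {
  constructor(max) {
    this.max = max;
    this.docs = [];
  }
  insert(doc) {
    this.docs.push(doc);
    if (this.docs.length > this.max) this.docs.shift(); // evict oldest
  }
}

const log = new CappedLog(3);
for (let n = 1; n <= 5; n++) log.insert({ n }); // only 3, 4, 5 survive
```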