Backup strategy (logical backup, physical backup)

Author: Chuan Chen | Category: MongoDB

MongoDB's backup strategy mainly consists of two approaches: logical backup and physical backup. Logical backup exports data content, while physical backup directly copies underlying data files. Each method has its own advantages and disadvantages, making them suitable for different scenarios.

Logical Backup

Logical backup refers to exporting data in a logical structure format using database-provided tools, typically stored as JSON, BSON, or CSV. MongoDB provides two main tools for logical backup: mongodump and mongoexport.

mongodump

mongodump is MongoDB's official backup tool. It exports database content as BSON and records each collection's index definitions in an accompanying metadata file, so that mongorestore can rebuild the indexes. Basic usage:

mongodump --host localhost --port 27017 --db myDatabase --out /backup/mongodb

This command backs up the myDatabase database to the /backup/mongodb directory. mongodump supports various parameters:

  • --collection: Specify a particular collection to back up
  • --query: Back up documents matching specific conditions
  • --gzip: Compress output files
  • --oplog: For replica sets, also captures oplog entries written while the dump runs (full-instance dumps only; it cannot be combined with --db or --collection)

Example: Back up order data within a specific time range

mongodump --db ecommerce --collection orders \
--query '{"createdAt": {"$gte": {"$date": "2023-01-01T00:00:00Z"}, "$lt": {"$date": "2023-02-01T00:00:00Z"}}}' \
--out /backup/january_orders
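
Recent versions of mongodump parse the --query value as Extended JSON v2, so dates are written as {"$date": ...} rather than with shell helpers. Below is a minimal Node.js sketch, not a definitive implementation, that builds such a filter and the matching argument list. The function name buildDateRangeQuery is an assumption made for illustration; the database, collection, and path are the ones from the example above.

```javascript
// Build a --query value for mongodump as Extended JSON v2.
// buildDateRangeQuery is a hypothetical helper, not part of any MongoDB tool.
function buildDateRangeQuery(field, start, end) {
  return JSON.stringify({
    [field]: {
      $gte: { $date: start.toISOString() },
      $lt: { $date: end.toISOString() },
    },
  });
}

const query = buildDateRangeQuery(
  'createdAt',
  new Date('2023-01-01T00:00:00Z'),
  new Date('2023-02-01T00:00:00Z')
);

// Passing arguments as an array avoids shell quoting problems, e.g.
// require('child_process').execFileSync('mongodump', args)
const args = [
  '--db', 'ecommerce',
  '--collection', 'orders',
  '--query', query,
  '--out', '/backup/january_orders',
];

console.log(args.join(' '));
```

Generating the filter with JSON.stringify keeps the quoting valid even when the field name or dates come from configuration.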

mongoexport

The mongoexport tool exports data as JSON or CSV format, making it suitable for interaction with other systems:

mongoexport --db myDatabase --collection users --out users.json

Key features:

  • Supports --fields parameter to select specific fields
  • Can specify --type=csv to export as CSV format
  • Output files are highly readable but don't preserve index information

Example: Export user email list as CSV

mongoexport --db myApp --collection users \
--fields=email,firstName,lastName \
--type=csv --out user_emails.csv

Pros and Cons of Logical Backup

Advantages:

  1. High portability - backup files can be migrated between different MongoDB versions
  2. Selective backup of specific collections or documents
  3. mongoexport output (JSON or CSV) is human-readable, and mongodump's BSON files can be inspected with bsondump
  4. Storage engine independent, suitable for all MongoDB deployments

Disadvantages:

  1. Slower backup and restore speeds, especially with large datasets
  2. Writes that continue during the backup may leave the dump inconsistent unless --oplog is used
  3. Dumping a single database with --db omits its users and roles unless --dumpDbUsersAndRoles is given (or the admin database is backed up separately)

Physical Backup

Physical backup involves directly copying MongoDB's data files, including the WiredTiger storage engine files and journal files. This method is typically faster than logical backup and more suitable for large databases.

Filesystem Snapshots

Filesystems and volume managers that support snapshots (LVM, ZFS, cloud block storage, and so on) can create point-in-time copies of the data files with minimal performance impact:

# Create snapshot on Linux LVM
lvcreate --size 10G --snapshot --name mongo_snapshot /dev/vg0/mongo_data

Key considerations:

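  1. When the data files and the journal are on the same volume, the snapshot is crash-consistent without locking; otherwise flush and block writes before the snapshot: db.fsyncLock()
  2. Unlock after the snapshot completes: db.fsyncUnlock()
  3. Ensure the volume group has enough free space for the snapshot's copy-on-write data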
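
The lock, snapshot, unlock sequence is easy to get wrong if the snapshot step fails and the database stays locked. The sketch below shows one way to guarantee the unlock. The three callbacks are hypothetical placeholders supplied by the caller; in practice they might wrap db.fsyncLock(), the lvcreate command, and db.fsyncUnlock().

```javascript
// Run a snapshot while writes are blocked, and always unlock afterwards.
// lock, snapshot, and unlock are caller-supplied functions (assumptions).
function snapshotWithLock({ lock, snapshot, unlock }) {
  lock();
  try {
    return snapshot();
  } finally {
    // Runs even when snapshot() throws, so the database is never left locked.
    unlock();
  }
}

// Dry run that only records the order of the steps.
const stepLog = [];
snapshotWithLock({
  lock: () => stepLog.push('fsyncLock'),
  snapshot: () => stepLog.push('lvcreate'),
  unlock: () => stepLog.push('fsyncUnlock'),
});
console.log(stepLog.join(' -> ')); // prints "fsyncLock -> lvcreate -> fsyncUnlock"
```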

Replica Set Member Backup

For replica set deployments, data files can be copied directly from secondary members:

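  1. Cleanly shut down the mongod process on a secondary: db.shutdownServer() (the member does not need to be removed from the replica set configuration)
  2. Copy the data directory
  3. Restart mongod; the member rejoins the replica set and catches up from the oplog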

This method doesn't affect primary node performance but requires careful operation to avoid impacting replica set availability.
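
Choosing which secondary to take offline can be automated from the output of rs.status(). The sketch below picks the healthy secondary with the most recent applied write. It is an illustration only: pickBackupSource is a made-up name, and the sample hosts and timestamps are invented. The fields used (members, stateStr, health, optimeDate, name) are part of the rs.status() output.

```javascript
// Pick the healthy secondary whose last applied operation is most recent.
// status has the shape returned by rs.status().
function pickBackupSource(status) {
  const secondaries = status.members.filter(
    (m) => m.stateStr === 'SECONDARY' && m.health === 1
  );
  if (secondaries.length === 0) {
    throw new Error('No healthy secondary available for backup');
  }
  return secondaries.reduce((best, m) =>
    m.optimeDate > best.optimeDate ? m : best
  ).name;
}

// Invented sample data for illustration.
const sampleStatus = {
  members: [
    { name: 'db1:27017', stateStr: 'PRIMARY', health: 1, optimeDate: new Date('2023-01-01T12:00:10Z') },
    { name: 'db2:27017', stateStr: 'SECONDARY', health: 1, optimeDate: new Date('2023-01-01T12:00:05Z') },
    { name: 'db3:27017', stateStr: 'SECONDARY', health: 1, optimeDate: new Date('2023-01-01T12:00:09Z') },
  ],
};
console.log(pickBackupSource(sampleStatus)); // prints "db3:27017"
```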

Cloud Service Backup

Cloud services like MongoDB Atlas provide automated physical backup functionality, typically implemented through storage volume snapshots:

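// Atlas API example: trigger an on-demand snapshot.
// Uses the global fetch available in Node.js 18+.
// GROUP_ID, CLUSTER_NAME, and ACCESS_TOKEN are placeholders you must define.
// Note: programmatic API keys use HTTP Digest authentication; a Bearer header
// only applies when you hold an OAuth access token.
async function triggerBackup() {
  const url =
    `https://cloud.mongodb.com/api/atlas/v1.0/groups/${GROUP_ID}` +
    `/clusters/${CLUSTER_NAME}/backup/snapshots`;

  const response = await fetch(url, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Authorization': `Bearer ${ACCESS_TOKEN}`
    },
    body: JSON.stringify({
      description: 'Monthly backup',
      retentionInDays: 30
    })
  });

  // Surface HTTP errors instead of silently returning an error payload
  if (!response.ok) {
    throw new Error(`Atlas API returned HTTP ${response.status}`);
  }
  return response.json();
}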

Pros and Cons of Physical Backup

Advantages:

  1. Faster backup and restore speeds, especially for large databases
  2. Maintains data consistency, suitable for critical business systems
  3. Includes all database metadata, including user permissions
  4. Minimal performance impact during backup

Disadvantages:

  1. Backup files are typically larger, consuming more storage space
  2. Dependent on MongoDB version and storage engine
  3. Usually requires database downtime or locking to ensure consistency
  4. Cross-platform restoration may present issues

Backup Strategy Selection

Consider the following factors when choosing a backup strategy:

Data Volume

  • Small databases (<100GB): Logical backup is usually sufficient
  • Large databases: Physical backup is more efficient

Recovery Time Objective (RTO)

  • Requires fast recovery: Prioritize physical backup
  • Can tolerate longer recovery times: Logical backup is feasible

Storage Limitations

  • Limited storage space: Logical backup (especially compressed) saves space
  • Ample storage space: Physical backup is more convenient

Typical Hybrid Strategies

  1. Daily physical snapshots + hourly incremental logical backups
  2. Primary database physical backup + secondary logical backup
  3. Production environment physical backup + development/test environment logical backup

Backup Verification and Recovery Testing

Regardless of the backup strategy, regularly verifying backup validity is crucial:

Logical Backup Verification

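# Recovery test: restore myDatabase into a scratch database named test_restore
mongorestore --nsInclude 'myDatabase.*' --nsFrom 'myDatabase.*' --nsTo 'test_restore.*' --drop /backup/mongodb

# Data validation
mongosh test_restore --quiet --eval "db.users.countDocuments({})"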

Physical Backup Verification

  1. Restore data files on a new instance
  2. Start mongod process
  3. Run consistency check:
     db.runCommand({validate: "orders", full: true})
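
When several collections are validated, the individual results can be folded into a single pass or fail verdict. The sketch below assumes the results have already been collected into an object keyed by collection name. summarizeValidation is a hypothetical helper and the sample results are invented; valid and errors are fields of the validate command's output.

```javascript
// Summarise validate results for several collections.
// results maps collection name -> document returned by the validate command.
function summarizeValidation(results) {
  const failed = Object.entries(results)
    .filter(([, r]) => r.valid !== true)
    .map(([name, r]) => ({ name, errors: r.errors || [] }));
  return { ok: failed.length === 0, failed };
}

// Invented sample results for illustration.
const sampleResults = {
  orders: { valid: true, errors: [], warnings: [] },
  users: { valid: false, errors: ['index inconsistency'], warnings: [] },
};

const summary = summarizeValidation(sampleResults);
console.log(summary.ok ? 'All collections valid' : `Invalid: ${summary.failed.map((f) => f.name).join(', ')}`);
```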

Automated Verification Script Example

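// Run this script with mongosh, which provides both require() and the Mongo class,
// for example: mongosh --nodb --file verify_restore.js (the file name is illustrative)
const { execSync } = require('child_process');

function testRestore() {
  try {
    // Restore the latest backup, dropping existing collections first
    execSync('mongorestore --drop /backup/latest', { stdio: 'inherit' });

    // Connect to the database for verification
    const conn = new Mongo('mongodb://localhost:27017');
    const db = conn.getDB('myApp');
    const userCount = db.users.countDocuments({});

    // An empty users collection is treated as a failed restore
    if (userCount === 0) {
      throw new Error('Restoration failed: User count is 0');
    }

    console.log(`Restoration verified successfully, found ${userCount} users`);
    return true;
  } catch (err) {
    console.error('Restoration test failed:', err);
    return false;
  }
}

testRestore();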

Backup Security and Storage

Both backup methods require consideration of secure storage:

  1. Encrypt backups:

    mongodump --db sensitiveData --archive --gzip | openssl enc -aes-256-cbc -salt -pbkdf2 -out backup.enc
    
  2. Offsite storage:

    • Copy backups to cloud storage (AWS S3, Azure Blob, etc.)
    • Use rsync to synchronize with remote servers
  3. Backup retention policy:

    • Keep daily backups for the last 7 days
    • Keep monthly backups for the last 12 months
    • Permanent backups for critical time points
  4. Access control:

    • Restrict access to backup files
    • Create dedicated database users for backup operations
    db.createUser({
      user: "backupAdmin",
      pwd: passwordPrompt(),  // prompt instead of storing the password in a script
      roles: [{role: "backup", db: "admin"}]
    })
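
The retention policy in item 3 can be expressed as a small selection function. The sketch below is one possible reading of that policy and makes several assumptions: "monthly backup" is taken to mean the earliest backup in each calendar month (UTC), 12 months is approximated as 365 days, and permanent backups carry a pinned flag. The function name and the record shape are invented for illustration.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// backups: [{ name, takenAt: Date, pinned?: boolean }]
// Returns the names of backups that the policy no longer needs.
function selectExpiredBackups(backups, now) {
  const keep = new Set();
  const firstOfMonth = new Map(); // 'YYYY-MM' -> earliest backup in that month

  for (const b of backups) {
    const ageDays = (now - b.takenAt) / DAY_MS;
    // Keep pinned backups forever and everything from the last 7 days.
    if (b.pinned || ageDays <= 7) keep.add(b.name);

    const monthKey = b.takenAt.toISOString().slice(0, 7);
    const current = firstOfMonth.get(monthKey);
    if (!current || b.takenAt < current.takenAt) firstOfMonth.set(monthKey, b);
  }

  // Keep one backup per month for about the last 12 months.
  for (const b of firstOfMonth.values()) {
    if ((now - b.takenAt) / DAY_MS <= 365) keep.add(b.name);
  }

  return backups.filter((b) => !keep.has(b.name)).map((b) => b.name);
}
```

The function only decides what to delete; the actual removal (local files, object storage, and so on) is left to the caller so the rule can be tested without touching real backups.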
    

Monitoring and Alerts

A robust backup system requires monitoring and alert mechanisms:

  1. Monitor backup job execution status:

    # Check last backup file time
    find /backup/mongodb -name "*.bson" -type f -mtime -1 | wc -l
    
  2. Set up backup failure alerts:

    // Monitoring script example (Node.js)
    // sendAlert is a placeholder for your own notification function
    const fs = require('fs');

    const lastBackupTime = fs.statSync('/backup/latest').mtime;
    const hoursSinceBackup = (Date.now() - lastBackupTime.getTime()) / (1000 * 60 * 60);

    if (hoursSinceBackup > 24) {
      sendAlert('MongoDB backup not executed for over 24 hours');
    }
    
  3. Capacity monitoring:

    • Monitor backup storage space usage
    • Implement automatic cleanup of old backups

Special Scenario Handling

Certain special scenarios require particular attention to backup strategy:

Sharded Cluster Backup

  1. Stop balancer: sh.stopBalancer()
  2. Back up config servers
  3. Back up each shard individually
  4. Record shard metadata
  5. Restart the balancer once the backup completes: sh.startBalancer()
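
The ordering of these steps matters, and the balancer should be re-enabled even if one of the backup steps fails. The sketch below wires the steps together and returns a small manifest that records where each shard's backup went. All four step functions are hypothetical callbacks supplied by the caller; the manifest format is also an assumption.

```javascript
// Orchestrate a sharded-cluster backup in a fixed order.
// steps.stopBalancer, steps.backupConfigServers, steps.backupShard and
// steps.startBalancer are caller-supplied placeholders.
function backupShardedCluster(shardNames, steps) {
  const manifest = { startedAt: new Date().toISOString(), config: null, shards: [] };

  steps.stopBalancer();
  try {
    manifest.config = steps.backupConfigServers();
    for (const name of shardNames) {
      // Record each shard's backup location as part of the metadata.
      manifest.shards.push({ name, location: steps.backupShard(name) });
    }
  } finally {
    // Re-enable the balancer whether or not the backup succeeded.
    steps.startBalancer();
  }
  return manifest;
}
```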

Point-in-Time Recovery

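Running mongodump with --oplog also captures the writes made while the dump runs; replaying them on restore brings the data to a single consistent point in time. To stop the replay at an earlier moment, add --oplogLimit <timestamp> to mongorestore: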

mongodump --oplog --out /backup/with_oplog
mongorestore --oplogReplay /backup/with_oplog

Incremental Backup Strategy

  1. After initial full backup, periodically back up oplog
  2. During recovery, first restore full backup then replay oplog
  3. Use timestamps to mark backup positions (the oplog lives in the local database):
    db.getSiblingDB("local").oplog.rs.find({ts: {$gt: Timestamp(1672531200, 1)}})
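
One common way to take the periodic oplog backup in step 1 is to dump only the entries newer than the last saved position. The sketch below builds the mongodump arguments for such a slice, writing the timestamp in Extended JSON v2 form. buildOplogDumpArgs and the output path are assumptions for illustration, and the approach should be verified against your MongoDB version before relying on it.

```javascript
// Build mongodump arguments that export oplog entries after a saved position.
// lastTs is { t: <seconds since epoch>, i: <ordinal> }, matching Timestamp(t, i).
function buildOplogDumpArgs(lastTs, outDir) {
  const query = JSON.stringify({
    ts: { $gt: { $timestamp: { t: lastTs.t, i: lastTs.i } } },
  });
  return ['--db', 'local', '--collection', 'oplog.rs', '--query', query, '--out', outDir];
}

const oplogArgs = buildOplogDumpArgs({ t: 1672531200, i: 1 }, '/backup/oplog_increment');
// Could be run with: require('child_process').execFileSync('mongodump', oplogArgs)
console.log(oplogArgs.join(' '));
```

After each run, the timestamp of the last exported entry becomes the starting position for the next increment.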
    

Performance Optimization Techniques

Performance optimization for large-scale database backups:

  1. Parallel backup of multiple collections (the default is 4, so raise the value to gain anything):

    mongodump --numParallelCollections 8 --out /backup/parallel
    
  2. Exclude collections you do not need (wildcards are not supported; match by prefix and specify --db):

    mongodump --db myDatabase --excludeCollectionsWithPrefix=system. --out /backup/essential
    
  3. Adjust batch size:

    mongorestore --batchSize=1000 /backup/data
    
  4. Use SSD for temporary backup files:

    mongodump --out /ssd/temp_backup
    rsync -a /ssd/temp_backup/ /hdd/permanent_backup/
    
  5. Network optimization:

    • Perform backups within the same data center
    • Use high-bandwidth network connections
    • Consider compressing data during transfer
