Data Migration and Version Control
Basic Concepts of Data Migration
Data migration in database management refers to the process of transferring data from one storage system or structure to another. In Mongoose, data migration typically involves schema changes, data format conversions, or database engine replacements. Common scenarios include modifying field types, renaming fields, adding indexes, or data cleansing.
// Example: Simple data migration script
const migrateUserData = async () => {
  const users = await User.find({});
  for (const user of users) {
    // Fix documents where age was stored as a string
    if (user.age && typeof user.age === 'string') {
      user.age = parseInt(user.age, 10);
      await user.save();
    }
  }
};
Importance of Version Control
Database schema changes require strict version control for the following reasons:
- Avoiding conflicts during team collaboration
- Tracking historical change records
- Supporting rollback operations
- Maintaining consistency across development, testing, and production environments
Common version control solutions:
- Manually writing migration scripts
- Using specialized migration tools (e.g., migrate-mongoose)
- Combining with Git for version management
Mongoose Migration Strategies
Incremental Migration
Each schema change corresponds to an independent migration file, executed in chronological order:
// migrations/20230501-add-email-verification.js
module.exports = {
  up: async (db) => {
    await db.collection('users').updateMany(
      {},
      { $set: { isEmailVerified: false } }
    );
  },
  down: async (db) => {
    await db.collection('users').updateMany(
      {},
      { $unset: { isEmailVerified: "" } }
    );
  }
};
Schema Version Tagging
Add a version field to documents for easier tracking:
const userSchema = new mongoose.Schema({
  // ...other fields
  schemaVersion: {
    type: Number,
    default: 1,
    required: true
  }
});
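One way to put the version field to work is a lazy, per-document upgrade map: each entry transforms a document from one version to the next, and documents are walked up to the current version step by step. The sketch below assumes this pattern; the field names used in the steps (name, fullName, isEmailVerified) are hypothetical examples, not part of the schema above.

```javascript
// Sketch: lazy per-document upgrades keyed by schemaVersion.
// Field names in the upgrade steps are hypothetical examples.
const CURRENT_VERSION = 3;

const upgrades = {
  // v1 -> v2: copy legacy `name` into `fullName`
  1: (doc) => ({ ...doc, fullName: doc.name, schemaVersion: 2 }),
  // v2 -> v3: add a default `isEmailVerified` flag
  2: (doc) => ({ ...doc, isEmailVerified: false, schemaVersion: 3 })
};

function upgradeToCurrent(doc) {
  // Documents written before versioning was introduced count as v1
  let current = { ...doc, schemaVersion: doc.schemaVersion || 1 };
  while (current.schemaVersion < CURRENT_VERSION) {
    const step = upgrades[current.schemaVersion];
    if (!step) {
      throw new Error(`No upgrade path from v${current.schemaVersion}`);
    }
    current = step(current);
  }
  return current;
}
```

Applying the upgrade on read (and saving the result) spreads migration cost over normal traffic instead of one large batch job.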
Migration Tool Practices
Typical usage of migrate-mongoose:
- Install the package:
  npm install migrate-mongoose
- Create migration configuration:
  // migrator.js
  const { Migrator } = require('migrate-mongoose');
  const migrator = new Migrator({
    dbConnectionUri: 'mongodb://localhost/mydb',
    migrationsPath: './migrations'
  });
- Execute migrations:
  // Create a new migration
  await migrator.create('add-user-profile');
  // Execute pending migrations
  await migrator.up('all');
  // Roll back the most recent migration
  await migrator.down('last');
Handling Complex Migration Scenarios
Large-Scale Data Migration
When dealing with millions of documents, consider:
- Batch processing:
  const batchSize = 1000;
  let skip = 0;
  let hasMore = true;
  while (hasMore) {
    const users = await User.find().skip(skip).limit(batchSize);
    if (users.length === 0) {
      hasMore = false;
      continue;
    }
    // Processing logic...
    skip += batchSize;
  }
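The skip/limit loop above gets slower as skip grows, because MongoDB must re-scan all skipped documents on every query. A query cursor walks the collection in a single pass instead. This is a sketch assuming the same User model; migrateInBatches and processBatch are hypothetical names for illustration.

```javascript
// Sketch: single-pass batching with a Mongoose query cursor.
// `processBatch` is whatever per-batch logic the migration needs.
async function migrateInBatches(Model, processBatch, batchSize = 1000) {
  let batch = [];
  let processed = 0;
  // .cursor() streams documents instead of loading them all at once
  for await (const doc of Model.find().cursor()) {
    batch.push(doc);
    if (batch.length >= batchSize) {
      await processBatch(batch);
      processed += batch.length;
      batch = [];
    }
  }
  // Flush the final, partially filled batch
  if (batch.length > 0) {
    await processBatch(batch);
    processed += batch.length;
  }
  return processed;
}
```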
- Using aggregation pipelines for direct server-side updates (requires MongoDB 4.2+ for $set and $merge):
  await User.aggregate([
    { $match: { status: 'inactive' } },
    { $set: { lastActive: new Date() } },
    { $merge: { into: 'users' } }
  ]);
Field Type Changes
Complete workflow for field type changes:
- Add a new field
- Write a migration script to transform data
- Verify data integrity
- Remove the old field
// Convert string-type price to number
await Product.updateMany(
  { price: { $type: 'string' } },
  [{
    $set: {
      price: {
        $convert: {
          input: "$price",
          to: "decimal",
          onError: 0
        }
      }
    }
  }]
);
Testing and Validation
Comprehensive migration testing should include:
- Unit tests to verify migration logic
- Pre-production environment testing
- Performance benchmarking
- Rollback testing
// Testing migration scripts with Jest
describe('User migration', () => {
  let testUser;

  beforeAll(async () => {
    testUser = await User.create({
      name: 'Test',
      age: '25' // Intentionally using a string
    });
  });

  it('should convert age to number', async () => {
    await migrateUserData();
    const updated = await User.findById(testUser._id);
    expect(typeof updated.age).toBe('number');
  });
});
Production Environment Best Practices
- Maintain a migration log collection:
const migrationLogSchema = new mongoose.Schema({
  name: String,
  batch: Number,
  runAt: {
    type: Date,
    default: Date.now
  },
  status: {
    type: String,
    enum: ['pending', 'success', 'failed']
  }
});
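A sketch of how a migration run might be recorded against that schema: the log entry starts as pending and is updated to success or failed once the migration resolves. The model is passed in as a parameter so the helper stays testable; runLoggedMigration and migrationFn are hypothetical names.

```javascript
// Sketch: run a migration and record its outcome using the
// migration-log schema above. Assumes MigrationLog is a model
// compiled from migrationLogSchema.
async function runLoggedMigration(MigrationLog, name, batch, migrationFn) {
  const entry = await MigrationLog.create({ name, batch, status: 'pending' });
  try {
    await migrationFn();
    entry.status = 'success';
  } catch (err) {
    entry.status = 'failed';
    throw err; // surface the failure after logging it
  } finally {
    await entry.save();
  }
  return entry;
}
```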
- Implement zero-downtime migration strategies:
- Dual-write mechanism
- Shadow mode
- Blue-green deployment
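The dual-write mechanism can be sketched as a small helper applied to every write during the transition window. The fields here are hypothetical: a legacy `username` field being replaced by `displayName`. Mirroring each value into the other field keeps old and new application versions consistent until the migration and code rollout both complete.

```javascript
// Sketch of a dual-write step for a zero-downtime field rename.
// Hypothetical fields: legacy `username` -> new `displayName`.
function toDualWrite(update) {
  const out = { ...update };
  if ('displayName' in out && !('username' in out)) {
    out.username = out.displayName;   // new writers also fill the old field
  } else if ('username' in out && !('displayName' in out)) {
    out.displayName = out.username;   // old writers also fill the new field
  }
  return out;
}

// Usage (illustrative):
// await User.updateOne({ _id }, { $set: toDualWrite({ displayName: 'Ann' }) });
```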
- Monitor key metrics:
  // Monitor migration performance. `monitor.record` stands in for your
  // metrics client; runMigration is assumed to report documents affected.
  const start = Date.now();
  const { docsAffected } = await runMigration();
  const duration = Date.now() - start;
  monitor.record('migration_duration', {
    name: 'user_profile_update',
    duration,
    docsAffected
  });
Solutions to Common Issues
Handling Migration Conflicts
When multiple developers create migrations simultaneously:
- Use timestamp-prefixed naming:
  202305011200-add-field.js
  202305011230-remove-field.js
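Generating that prefix programmatically keeps the naming consistent across developers. A minimal sketch (the function name is illustrative), producing the same YYYYMMDDHHmm UTC format shown above:

```javascript
// Sketch: build a sortable timestamp-prefixed migration filename
// (YYYYMMDDHHmm in UTC), so files order chronologically.
function migrationFilename(name, date = new Date()) {
  const pad = (n) => String(n).padStart(2, '0');
  const stamp =
    date.getUTCFullYear() +
    pad(date.getUTCMonth() + 1) +
    pad(date.getUTCDate()) +
    pad(date.getUTCHours()) +
    pad(date.getUTCMinutes());
  return `${stamp}-${name}.js`;
}
```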
- Use locking mechanisms to prevent concurrent execution:
  const lock = await Lock.findOne({ name: 'migration' });
  if (lock && lock.status === 'running') {
    throw new Error('Migration already in progress');
  }
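Note that a separate find-then-check can race: two processes may both read the lock before either marks it running. A single findOneAndUpdate claims the lock atomically instead. This sketch assumes the Lock model from the example above and a pre-created lock document; the function name is illustrative.

```javascript
// Sketch: atomic lock acquisition in one findOneAndUpdate call.
// Assumes a lock document { name: 'migration' } already exists.
async function acquireMigrationLock(Lock) {
  const lock = await Lock.findOneAndUpdate(
    { name: 'migration', status: { $ne: 'running' } },
    { $set: { status: 'running', startedAt: new Date() } },
    { new: true } // return the updated document
  );
  // No match means another process already holds the lock
  if (!lock) {
    throw new Error('Migration already in progress');
  }
  return lock;
}
```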
Ensuring Data Consistency
Verify data meets expectations post-migration:
// Example validation script
const verifyMigration = async () => {
  const invalidDocs = await User.find({
    $or: [
      { age: { $type: 'string' } },
      { email: { $exists: false } }
    ]
  });
  if (invalidDocs.length > 0) {
    throw new Error(`Found ${invalidDocs.length} invalid documents`);
  }
};
Automated Deployment Integration
Example of migration integration in CI/CD pipelines:
# .github/workflows/migrate.yml
name: Database Migration
on:
  push:
    branches: [ main ]
jobs:
  migrate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - run: npm install
      - run: npm run migrate:up
        env:
          DB_URI: ${{ secrets.PRODUCTION_DB_URI }}