Index Abuse and Optimization Recommendations
Indexes are a key mechanism for improving query performance in MongoDB, but improper use can lead to write performance degradation, wasted storage space, and other issues. Proper index design requires balancing query efficiency with resource consumption to avoid common pitfalls.
Common Manifestations of Index Abuse
Over-Indexing
Creating too many indexes on the same collection is a typical problem. Each index consumes storage space and incurs additional overhead during writes. For example, a user collection might have the following indexes:
// Example of redundant indexes in a user collection
db.users.createIndex({ username: 1 }) // Single-field index
db.users.createIndex({ username: 1, age: 1 }) // Compound index with username as its leading field
The first index is a prefix of the second, so the compound index can already serve a query like {username: 'john'}. Keeping both gives the query planner two candidate indexes to evaluate and forces every write to maintain both. The single-field index is worth keeping only if it carries an option the compound index lacks, such as a unique constraint.
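As a rough illustration of the prefix check described above, the following sketch flags key patterns that are strict prefixes of another pattern. The helper names isPrefixOf and findRedundantIndexes are hypothetical, and the check ignores index options (unique, sparse, partial), which can make a prefix index worth keeping.

```javascript
// Returns true when every key of `shorter` appears, in the same order and
// direction, at the start of `longer` (and `longer` has more keys).
function isPrefixOf(shorter, longer) {
  const a = Object.entries(shorter);
  const b = Object.entries(longer);
  if (a.length >= b.length) return false;
  return a.every(([field, dir], i) => b[i][0] === field && b[i][1] === dir);
}

// Lists key patterns that are strict prefixes of some other pattern.
// Hypothetical helper: it does not look at index options.
function findRedundantIndexes(keyPatterns) {
  return keyPatterns.filter((candidate) =>
    keyPatterns.some((other) => isPrefixOf(candidate, other))
  );
}

const patterns = [{ username: 1 }, { username: 1, age: 1 }, { email: 1 }];
const redundant = findRedundantIndexes(patterns);
console.log(redundant); // [ { username: 1 } ]
```

In the shell, the input could come from db.users.getIndexes().map(ix => ix.key); treat the result as a list of candidates to review, not as indexes that are safe to drop.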
Inefficient Compound Indexes
A poorly ordered compound index can be unusable for common queries:
// Inefficient composite index order
db.orders.createIndex({ status: 1, created_at: 1 })
// The following query cannot use the index efficiently: created_at is not a prefix of it
db.orders.find({ created_at: { $gt: ISODate('2023-01-01') } })
The field a query filters on must lead the index. If queries filter on created_at alone, put it first:
// Optimized index order
db.orders.createIndex({ created_at: 1, status: 1 })
Inadequate Index Coverage
When an index supports the filter but not the sort, MongoDB has to sort the results in memory:
// Product query example
db.products.find(
  { category: 'electronics', price: { $lt: 1000 } },
  { name: 1, price: 1 }
).sort({ rating: -1 })
// Existing index
db.products.createIndex({ category: 1, price: 1 })
In this case, add the sort field to the index, placed ahead of the range field price so the index can return results already ordered by rating:
db.products.createIndex({
  category: 1,
  rating: -1,
  price: 1
})
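MongoDB's documentation describes this ordering as the ESR guideline: equality fields first, then sort fields, then range fields. A minimal sketch of that ordering follows. The helper buildEsrKeyPattern is hypothetical, and it assumes ascending order for equality and range fields, where direction rarely matters.

```javascript
// Builds a compound index key pattern in Equality, Sort, Range order.
// `sort` maps field -> direction (1 or -1); the other two are field lists.
function buildEsrKeyPattern({ equality = [], sort = {}, range = [] }) {
  const keyPattern = {};
  for (const field of equality) keyPattern[field] = 1;
  for (const [field, direction] of Object.entries(sort)) {
    if (!(field in keyPattern)) keyPattern[field] = direction;
  }
  for (const field of range) {
    if (!(field in keyPattern)) keyPattern[field] = 1;
  }
  return keyPattern;
}

// The product query: equality on category, sort on rating, range on price.
const productIndex = buildEsrKeyPattern({
  equality: ["category"],
  sort: { rating: -1 },
  range: ["price"],
});
console.log(JSON.stringify(productIndex));
// {"category":1,"rating":-1,"price":1}
```

The guideline is a starting point. Confirm with explain() that the plan no longer contains a SORT stage.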
Index Optimization Practices
Index Selectivity Analysis
Use the $indexStats aggregation stage to see how often each index is used:
db.collection.aggregate([{ $indexStats: {} }])
Sample output (abridged), showing how many operations have used the index and since when:
{
  "name" : "category_1_price_1",
  "accesses" : {
    "ops" : NumberLong(2543),
    "since" : ISODate("2023-06-01T00:00:00Z")
  }
}
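One way to act on this output is to list indexes whose counter has stayed at or near zero. The sketch below assumes the $indexStats results have been collected into an array. The function name findRarelyUsedIndexes and the sample data are hypothetical, and the sample uses plain numbers where the shell returns NumberLong values.

```javascript
// Returns the names of indexes whose access counter is at or below `maxOps`.
// `stats` is an array shaped like the output of the $indexStats stage.
function findRarelyUsedIndexes(stats, maxOps = 0) {
  return stats
    .filter((s) => s.name !== "_id_") // the _id index cannot be dropped
    .filter((s) => Number(s.accesses.ops) <= maxOps)
    .map((s) => s.name);
}

// Hypothetical sample data, not taken from a real deployment.
const sampleStats = [
  { name: "_id_", accesses: { ops: 0 } },
  { name: "category_1_price_1", accesses: { ops: 2543 } },
  { name: "legacy_flag_1", accesses: { ops: 0 } },
];
console.log(findRarelyUsedIndexes(sampleStats)); // [ 'legacy_flag_1' ]
```

The counters are kept per node and start over when mongod restarts, so check the since value and every replica set member before dropping anything.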
Query Pattern Recognition
Use explain() to analyze query execution plans:
db.orders.find({
  user_id: ObjectId("507f1f77bcf86cd799439011"),
  status: "completed"
}).explain("executionStats")
Key metrics to focus on:
totalKeysExamined: Number of index entries scanned
totalDocsExamined: Number of documents fetched and examined
executionTimeMillis: Execution time in milliseconds
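These metrics are easiest to read as ratios against the number of documents returned (nReturned in the same executionStats document). The following sketch computes them. The function summarizeExecutionStats and the sample numbers are hypothetical, and what counts as a bad ratio is a judgment call.

```javascript
// Summarizes the executionStats section of an explain("executionStats") result.
function summarizeExecutionStats(executionStats) {
  const { nReturned, totalKeysExamined, totalDocsExamined, executionTimeMillis } =
    executionStats;
  const denominator = Math.max(nReturned, 1); // avoid dividing by zero
  return {
    keysPerResult: totalKeysExamined / denominator,
    docsPerResult: totalDocsExamined / denominator,
    // No index keys scanned but documents examined suggests a collection scan.
    likelyCollectionScan: totalKeysExamined === 0 && totalDocsExamined > 0,
    executionTimeMillis,
  };
}

// Hypothetical numbers for illustration only.
const summary = summarizeExecutionStats({
  nReturned: 20,
  totalKeysExamined: 400,
  totalDocsExamined: 400,
  executionTimeMillis: 12,
});
console.log(summary.keysPerResult); // 20
```

A ratio close to 1 means the index narrows the scan to roughly the documents returned. Much larger ratios suggest the index is not selective for this query.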
Index Merge Strategy
For $or queries, give each clause its own supporting index:
// Original query
db.articles.find({
  $or: [
    { tags: "mongodb" },
    { view_count: { $gt: 10000 } }
  ]
})
// Optimization approach
db.articles.createIndex({ tags: 1 })
db.articles.createIndex({ view_count: -1 })
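MongoDB can evaluate each $or clause with its own index and combine the results; explain() shows this as an OR stage over separate IXSCAN stages. This only works when every clause is supported by an index. Otherwise the query falls back to a collection scan.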
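The requirement that every $or branch has a supporting index can be checked mechanically. The sketch below is a simplification: it only tests whether some field in each clause is the leading field of an index, and the helper name uncoveredOrBranches is hypothetical.

```javascript
// Returns the $or clauses for which no index starts with one of the
// clause's fields. Simplified: ignores nested operators and index options.
function uncoveredOrBranches(orClauses, keyPatterns) {
  const leadingFields = new Set(keyPatterns.map((k) => Object.keys(k)[0]));
  return orClauses.filter(
    (clause) => !Object.keys(clause).some((field) => leadingFields.has(field))
  );
}

const clauses = [{ tags: "mongodb" }, { view_count: { $gt: 10000 } }];

// With only the tags index, the view_count branch has no supporting index.
const missing = uncoveredOrBranches(clauses, [{ tags: 1 }]);
console.log(missing.length); // 1

// With both indexes, every branch is covered.
const none = uncoveredOrBranches(clauses, [{ tags: 1 }, { view_count: -1 }]);
console.log(none.length); // 0
```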
Special Scenario Index Strategies
Time-Series Data
Time-series collections bucket measurements by metaField value and time, so secondary indexes should reference the declared metaField and timeField:
// Time-series collection creation
db.createCollection("sensor_data", {
  timeseries: {
    timeField: "timestamp",
    metaField: "sensor_id",
    granularity: "hours"
  }
})
// Secondary index on the metaField and timeField declared above
db.sensor_data.createIndex({
  sensor_id: 1,
  timestamp: -1
})
Full-Text Search Optimization
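Combine a text index with an equality filter by placing the filter field ahead of the text key, which limits the index entries the search has to scan:
// Create a compound text index with product_id as an equality prefix
db.reviews.createIndex({
  product_id: 1,
  comments: "text"
})
// Efficient query: the equality match on product_id is required to use this index
db.reviews.find({
  $text: { $search: "battery life" },
  product_id: 12345
})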
Multi-Key Index Pitfalls
Considerations for array field indexes:
// Example of potential index explosion
db.products.createIndex({ tags: 1 })
// Inserting documents with large arrays
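db.products.insertOne({
  name: "UltraBook",
  tags: ["laptop", "ultraportable", "windows", "i7", "16gb", "ssd" /* ... */]
})
MongoDB stores one index entry per array element, so a document with many tags adds many entries to the index. Large arrays inflate the index and make every write to those documents more expensive.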
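To get a feel for the growth, the entries a multikey index needs can be approximated from the array lengths. The sketch below is a rough upper bound that follows the one-entry-per-element description above. The function estimateMultikeyEntries is hypothetical, and it assumes a document with an empty or missing array still takes one entry.

```javascript
// Rough upper bound on the number of entries in a multikey index on
// `arrayField`: one per array element, and at least one per document.
function estimateMultikeyEntries(docs, arrayField) {
  return docs.reduce((total, doc) => {
    const value = doc[arrayField];
    if (Array.isArray(value)) return total + Math.max(value.length, 1);
    return total + 1; // scalar or missing field: a single entry
  }, 0);
}

// Hypothetical sample documents.
const sampleDocs = [
  { name: "UltraBook", tags: ["laptop", "ultraportable", "windows"] },
  { name: "Bare", tags: [] },
  { name: "Untagged" },
];
console.log(estimateMultikeyEntries(sampleDocs, "tags")); // 5
```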
Index Maintenance and Monitoring
Index Rebuilding Strategy
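collMod does not rebuild an index, but it can hide one from the query planner. Hiding an index before dropping or rebuilding it lets you check the impact on queries while the index is still maintained:
db.runCommand({
  collMod: "orders",
  index: {
    keyPattern: { created_at: 1 },
    hidden: true // Planner ignores the index; writes still update it
  }
})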
Performance Benchmarking
Use benchRun, a helper built into the legacy mongo shell, for simple load testing:
var ops = [
  {
    op: "find",
    ns: "test.orders",
    query: { status: "pending" }
  }
]
benchRun({ ops: ops, parallel: 10, seconds: 30 })
Storage Engine Tuning
Adjust WiredTiger cache size:
# mongod.conf configuration
storage:
  wiredTiger:
    engineConfig:
      cacheSizeGB: 8 # Example value; MongoDB's documentation advises against exceeding the default
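For reference when choosing a value, MongoDB's documented default for the WiredTiger internal cache is the larger of 50% of (RAM minus 1 GB) and 256 MB. A small sketch of that calculation follows; the function name defaultWiredTigerCacheGB is hypothetical.

```javascript
// Default WiredTiger internal cache size in GB for a host with
// `totalRamGB` of memory: max(50% of (RAM - 1 GB), 256 MB).
function defaultWiredTigerCacheGB(totalRamGB) {
  return Math.max(0.5 * (totalRamGB - 1), 0.25);
}

console.log(defaultWiredTigerCacheGB(16)); // 7.5
console.log(defaultWiredTigerCacheGB(1)); // 0.25
```

On a 16 GB host the default is already 7.5 GB, so setting cacheSizeGB to 8 there would raise it slightly above the default.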
Advanced Index Design Patterns
Partial Indexes
Create indexes only for documents that meet specific conditions:
// Index only for active users
db.users.createIndex(
  { last_login: -1 },
  { partialFilterExpression: { status: "active" } }
)
Hashed Shard Key Indexes
Hashed shard keys distribute writes evenly across shards, but the hashed index cannot support range queries:
// Hashed shard key index
sh.shardCollection("analytics.events", { _id: "hashed" })
// Range queries require additional indexes
db.events.createIndex({ created_at: -1 })
Time-to-Live (TTL) Indexes
Implement automatic data expiration:
// Session data expires after 24 hours
db.sessions.createIndex(
  { last_accessed: 1 },
  { expireAfterSeconds: 86400 }
)
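A document becomes eligible for deletion once the indexed date plus expireAfterSeconds has passed. The sketch below computes that moment; the function name earliestExpiry is hypothetical.

```javascript
// Earliest time at which a TTL index may remove a document, given the
// value of the indexed date field and the index's expireAfterSeconds.
function earliestExpiry(lastAccessed, expireAfterSeconds) {
  return new Date(lastAccessed.getTime() + expireAfterSeconds * 1000);
}

const expiry = earliestExpiry(new Date("2023-06-01T12:00:00Z"), 86400);
console.log(expiry.toISOString()); // 2023-06-02T12:00:00.000Z
```

Actual removal can lag behind this time, because the background task that deletes expired documents runs periodically (every 60 seconds by default). The indexed field must hold a date, or an array of dates, for the document to expire at all.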