The data model of MongoDB (documents, collections, databases)

Author: Chuan Chen | Category: MongoDB

As a representative NoSQL database, MongoDB adopts a flexible document data model, with core concepts including documents, collections, and databases. This design enables MongoDB to efficiently handle unstructured or semi-structured data while maintaining powerful query capabilities.

Document

A document is the most basic data unit in MongoDB, stored in BSON (Binary JSON) format. Each document consists of key-value pairs, similar to a JSON object but supporting richer data types.

// A typical MongoDB document example
{
  _id: ObjectId("507f1f77bcf86cd799439011"),
  name: "Zhang San",
  age: 28,
  address: {
    city: "Beijing",
    district: "Haidian District"
  },
  hobbies: ["programming", "swimming", "photography"],
  createdAt: new Date("2023-01-15")
}

Document characteristics:

  • The _id field uniquely identifies the document within its collection; if it is omitted on insert, an ObjectId is generated automatically
  • Supports nested documents (e.g., the address field)
  • Can store arrays (e.g., the hobbies field)
  • Field values can be of various data types (string, number, date, etc.)
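
An auto-generated _id is an ObjectId: 12 bytes, of which the first 4 hold a Unix timestamp in seconds, followed by a 5-byte random value and a 3-byte counter. As a rough illustration, the creation time can be recovered from the hex string in plain JavaScript. The helper name objectIdTimestamp is hypothetical; in the shell, the ObjectId's own getTimestamp() method does the same job.

```javascript
// Hypothetical helper: decode the creation time embedded in an ObjectId hex string.
function objectIdTimestamp(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected a 24-character hex string");
  }
  // The first 4 bytes (8 hex characters) are seconds since the Unix epoch.
  const seconds = parseInt(hex.slice(0, 8), 16);
  return new Date(seconds * 1000);
}

const created = objectIdTimestamp("507f1f77bcf86cd799439011");
console.log(created.toISOString()); // 2012-10-17T21:13:27.000Z
```

Because the timestamp comes first, ObjectIds sort roughly by creation time, which is why sorting on _id is often used as a cheap stand-in for insertion order.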

A single document is limited to 16 MB, which is enough for most business data. For larger files, MongoDB provides the GridFS specification, which splits a file into chunks stored as separate documents.
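
An application can run a rough pre-flight check before writing a document that might be large. The sketch below is an approximation only: it measures the UTF-8 length of the JSON form, which is not the same as the BSON size the server enforces. The function names and the 80% threshold are assumptions made for illustration.

```javascript
// Rough pre-flight check; JSON byte length only approximates the real BSON size.
const MAX_BSON_BYTES = 16 * 1024 * 1024; // 16 MB document limit

// Hypothetical helper: UTF-8 byte length of the document's JSON form.
function approxDocumentBytes(doc) {
  return new TextEncoder().encode(JSON.stringify(doc)).length;
}

// Hypothetical helper: flag documents at or above a fraction of the limit.
function isNearSizeLimit(doc, fraction = 0.8) {
  return approxDocumentBytes(doc) >= MAX_BSON_BYTES * fraction;
}

const small = { name: "Zhang San", hobbies: ["programming", "swimming"] };
const large = { blob: "x".repeat(15 * 1024 * 1024) };

console.log(isNearSizeLimit(small)); // false
console.log(isNearSizeLimit(large)); // true
```

A document flagged this way is usually a sign that part of it should move to its own collection or to GridFS.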

Collection

A collection is a container for a group of documents, similar to a table in relational databases but without a fixed schema.

// Example of different document structures in a user collection
[
  {
    _id: 1,
    username: "user1",
    email: "user1@example.com"
  },
  {
    _id: 2,
    username: "user2",
    profile: {
      firstName: "Li",
      lastName: "Si"
    },
    lastLogin: ISODate("2023-05-20T08:30:00Z")
  }
]

Key features of collections:

  • Dynamic schema: Documents in a collection can have different structures
  • Naming rules: A collection name cannot be empty, cannot contain the null character or the $ symbol, and cannot begin with the reserved system. prefix
  • Index support: Indexes can be created on collections to improve query performance
  • Capped collections: Fixed-size collections that overwrite their oldest documents once full, suitable for scenarios like logging
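
The naming rules can be checked on the client before a collection is created. This is a minimal sketch in plain JavaScript covering only the restrictions named here (empty names, the $ symbol, the null character, and the reserved system. prefix); the function name collectionNameProblems is hypothetical and the server remains the final authority.

```javascript
// Hypothetical helper: list the reasons a collection name would be rejected.
function collectionNameProblems(name) {
  const problems = [];
  if (typeof name !== "string" || name.length === 0) {
    problems.push("name must be a non-empty string");
    return problems;
  }
  if (name.includes("$")) problems.push("contains the $ symbol");
  if (name.includes("\0")) problems.push("contains a null character");
  if (name.startsWith("system.")) problems.push("uses the reserved system. prefix");
  return problems;
}

console.log(collectionNameProblems("users").length);        // 0
console.log(collectionNameProblems("price$list").length);   // 1
console.log(collectionNameProblems("system.users").length); // 1
```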

Collection operation examples (using MongoDB Shell):

// Create a collection explicitly (a collection is also created implicitly on first insert)
db.createCollection("users")

// View all collections
show collections

// Delete a collection
db.users.drop()

Database

A database is the top-level organizational structure in MongoDB, containing multiple collections. A single MongoDB instance can host multiple databases.

Key characteristics of databases:

  • Namespace isolation: Collections in different databases can have the same name
  • Independent access control: Different access permissions can be set for each database
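  • Storage limits: Current MongoDB versions have no built-in per-database storage quota (the old --quota option applied only to the removed MMAPv1 storage engine), so size limits have to be enforced operationally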
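
Namespace isolation follows from how MongoDB names things: a full namespace is the database name, a dot, then the collection name. The sketch below splits such a string at the first dot, since collection names may themselves contain dots while database names may not. The helper parseNamespace and the database name otherdb are hypothetical.

```javascript
// Hypothetical helper: split "database.collection" at the first dot.
function parseNamespace(ns) {
  const dot = ns.indexOf(".");
  if (dot <= 0 || dot === ns.length - 1) {
    throw new Error("expected a namespace of the form database.collection");
  }
  return { db: ns.slice(0, dot), collection: ns.slice(dot + 1) };
}

const first = parseNamespace("mydb.orders");
const second = parseNamespace("otherdb.orders");

console.log(first.collection === second.collection); // true: same collection name
console.log(first.db === second.db);                 // false: different databases
console.log(parseNamespace("mydb.orders.archive").collection); // orders.archive
```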

Common database commands:

// Switch to a database (it is only created once data is first written to it)
use mydb

// View the current database
db

// View all databases
show dbs

// Delete the current database
db.dropDatabase()

Data Model Design Patterns

Related data can be modeled in several ways in MongoDB, including:

  1. Embedded document pattern:
// Order with embedded order items
{
  _id: "ORD123",
  customer: "Zhang San",
  items: [
    { product: "phone", quantity: 1, price: 5999 },
    { product: "earphones", quantity: 2, price: 299 }
  ],
  total: 6597
}
  2. Reference pattern:
// User collection
{
  _id: "USER1",
  name: "Li Si"
}

// Order collection
{
  _id: "ORDER1",
  user: "USER1",  // Reference to user ID
  amount: 1000
}
  3. Hybrid pattern (embedded + reference):
// Blog post with comments
{
  _id: "POST1",
  title: "MongoDB Guide",
  content: "...",
  recentComments: [  // Embedded recent comments
    { user: "USER1", text: "Great article", date: ISODate(...) },
    { user: "USER2", text: "Very helpful", date: ISODate(...) }
  ],
  commentCount: 15  // Total comment count
}
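
With the reference pattern, the application (or a server-side $lookup stage) has to resolve the reference when it needs the related document. The sketch below shows that step in plain JavaScript using in-memory arrays as stand-ins for the two collections; the helper attachUsers and the second order with a dangling reference are hypothetical additions.

```javascript
// In-memory stand-ins for the user and order collections of the reference pattern.
const users = [{ _id: "USER1", name: "Li Si" }];
const orders = [
  { _id: "ORDER1", user: "USER1", amount: 1000 },
  { _id: "ORDER2", user: "USER9", amount: 250 } // hypothetical dangling reference
];

// Hypothetical helper: attach the referenced user document to each order.
function attachUsers(orderDocs, userDocs) {
  const byId = new Map(userDocs.map((u) => [u._id, u]));
  // References that do not resolve become null instead of throwing.
  return orderDocs.map((o) => ({ ...o, userDoc: byId.get(o.user) ?? null }));
}

const joined = attachUsers(orders, users);
console.log(joined[0].userDoc.name); // Li Si
console.log(joined[1].userDoc);      // null
```

The dangling reference illustrates the trade-off: references avoid duplicating data, but nothing in the document model enforces that the target still exists.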

Data Types

MongoDB supports a wide range of data types, including:

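  • Basic types: String, Boolean, 32-bit Integer, 64-bit Integer (Long), Double, Decimal128
  • Date types: Date, Timestamp (Timestamp is mainly for internal use; application data normally uses Date)
  • Object ID: ObjectId
  • Binary data: BinData
  • Special types: Null, Regular Expression, JavaScript code
  • Geospatial data: GeoJSON objects such as Point, LineString, and Polygon, stored as ordinary embedded documents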
// Document containing various data types
{
  _id: ObjectId("5f8d8a7b2f4a1e3d6c9b8a7d"),
  name: "Sample Document",
  count: 42,
  active: true,
  price: 19.99,
  tags: ["mongodb", "database", "nosql"],
  createdAt: new Date(),
  location: {
    type: "Point",
    coordinates: [116.404, 39.915]  // GeoJSON order: [longitude, latitude]
  },
  metadata: {
    version: "1.0",
    hash: BinData(0, "aGVsbG8gd29ybGQ=")
  }
}
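
A common mistake with the location field is swapping the coordinate order: GeoJSON expects longitude first, then latitude. The following sketch validates a Point in plain JavaScript. The function name isValidGeoJsonPoint is hypothetical, and it deliberately accepts only two-element coordinate arrays, which is a simplification.

```javascript
// Hypothetical helper: check that a value looks like a 2D GeoJSON Point.
function isValidGeoJsonPoint(value) {
  if (!value || value.type !== "Point" || !Array.isArray(value.coordinates)) {
    return false;
  }
  if (value.coordinates.length !== 2) return false;
  const [lng, lat] = value.coordinates; // longitude first, then latitude
  return (
    Number.isFinite(lng) && Number.isFinite(lat) &&
    lng >= -180 && lng <= 180 &&
    lat >= -90 && lat <= 90
  );
}

console.log(isValidGeoJsonPoint({ type: "Point", coordinates: [116.404, 39.915] })); // true
console.log(isValidGeoJsonPoint({ type: "Point", coordinates: [39.915, 116.404] })); // false
```

The second call fails because 116.404 is not a valid latitude, which is how a swapped pair often shows up in practice.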

Indexes and Data Model

Indexes significantly impact data model design. MongoDB supports various index types:

// Create a single-field index
db.users.createIndex({ username: 1 })

// Create a compound index
db.orders.createIndex({ customer: 1, date: -1 })

// Create a multikey index (for array fields)
db.products.createIndex({ tags: 1 })

// Create a text index
db.articles.createIndex({ content: "text" })

// Create a geospatial index
db.places.createIndex({ location: "2dsphere" })

Design indexes around actual query patterns, and avoid over-indexing: every additional index must be updated on each write, which degrades write performance.
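
One query-pattern rule worth keeping in mind is that a compound index serves queries on its leading fields (its prefixes). The sketch below is a deliberately simplified model of that rule for equality queries; it is not how the query planner actually decides, and the function name matchedIndexPrefix is hypothetical.

```javascript
// Simplified model: which leading fields of the index does the query filter on?
function matchedIndexPrefix(indexSpec, queryFields) {
  const wanted = new Set(queryFields);
  const prefix = [];
  for (const key of Object.keys(indexSpec)) {
    if (!wanted.has(key)) break; // the prefix ends at the first unused index field
    prefix.push(key);
  }
  return prefix;
}

const ordersIndex = { customer: 1, date: -1 };

console.log(JSON.stringify(matchedIndexPrefix(ordersIndex, ["customer"])));         // ["customer"]
console.log(JSON.stringify(matchedIndexPrefix(ordersIndex, ["date", "customer"]))); // ["customer","date"]
console.log(JSON.stringify(matchedIndexPrefix(ordersIndex, ["date"])));             // []
```

An empty result for a frequent query suggests that the field order of the index, or the index itself, does not match how the data is read.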

Data Model Best Practices

  1. Design document structure based on query patterns:
// Read-optimized design
{
  _id: "PROD001",
  name: "Smartphone",
  price: 2999,
  inventory: {
    warehouse1: 50,
    warehouse2: 30
  },
  dailySales: [
    { date: ISODate("2023-05-01"), count: 12 },
    { date: ISODate("2023-05-02"), count: 8 }
  ]
}
  2. Handle one-to-many relationships:
// Few child documents: embed
{
  orderId: "ORD1001",
  items: [
    { product: "A", qty: 2 },
    { product: "B", qty: 1 }
  ]
}

// Many child documents: reference
{
  blogPostId: "POST100",
  title: "...",
  commentCount: 142,
  recentComments: [...]
}
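  3. Consider document growth:
// Legacy MMAPv1 technique: pre-allocate space to avoid document movement.
// WiredTiger (the default storage engine since MongoDB 3.2) does not need this;
// with it, keep arrays bounded rather than padding documents.
{
  _id: "USER100",
  name: "Wang Wu",
  activityLog: new Array(100).fill(null)
}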
  4. Data model considerations in sharded clusters:
// Choose an appropriate shard key
sh.shardCollection("mydb.orders", { customerId: 1, orderDate: 1 })
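
For the one-to-many and document-growth points above, a common way to keep an embedded array bounded is to push with $each and $slice while incrementing a counter with $inc. The sketch below builds such an update document as plain data and simulates its effect in JavaScript so it can run without a server. The cap of 3 and both function names are assumptions for illustration.

```javascript
const MAX_RECENT = 3; // hypothetical cap on embedded comments

// Update document that could be passed to updateOne(); built here as plain data.
function buildAddCommentUpdate(comment) {
  return {
    $push: { recentComments: { $each: [comment], $slice: -MAX_RECENT } },
    $inc: { commentCount: 1 }
  };
}

// Plain-JavaScript simulation of what that update does to a post document.
function applyAddComment(post, comment) {
  const recent = [...post.recentComments, comment].slice(-MAX_RECENT);
  return { ...post, recentComments: recent, commentCount: post.commentCount + 1 };
}

let post = { blogPostId: "POST100", commentCount: 0, recentComments: [] };
for (const text of ["a", "b", "c", "d"]) {
  post = applyAddComment(post, { text });
}

console.log(post.commentCount);                                // 4
console.log(post.recentComments.map((c) => c.text).join(",")); // b,c,d
```

The document stays small no matter how many comments arrive, while the full comment history would live in a separate, referenced collection.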

Data Model Evolution

MongoDB's flexible schema lets the data model evolve incrementally:

// Add new fields
db.products.updateMany(
  {},
  { $set: { lastUpdated: new Date() } }
)

// Rename fields
db.users.updateMany(
  {},
  { $rename: { "oldField": "newField" } }
)

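// Migrate data formats: write a reshaped copy of each order to a new collection
db.orders.aggregate([
  { $project: {
    customerId: 1,
    items: 1,
    // Sum price * quantity per item; summing "$items.price" alone would ignore quantity
    total: { $sum: { $map: {
      input: "$items",
      as: "item",
      in: { $multiply: ["$$item.price", "$$item.quantity"] }
    }}}
  }},
  // $out replaces orders_new if it already exists
  { $out: "orders_new" }
])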
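
While a migration like the rename above is still in progress, application code may see documents in both shapes. One option is to normalize on read. This is a minimal sketch assuming the same oldField and newField names used above; the function name normalizeUser is hypothetical.

```javascript
// Hypothetical helper: return a document that always uses newField.
function normalizeUser(doc) {
  if (!("oldField" in doc)) return doc; // already in the new shape
  const { oldField, ...rest } = doc;
  // If both fields are present, the value already stored in newField wins.
  return { ...rest, newField: "newField" in rest ? rest.newField : oldField };
}

console.log(JSON.stringify(normalizeUser({ _id: 1, oldField: "x" }))); // {"_id":1,"newField":"x"}
console.log(JSON.stringify(normalizeUser({ _id: 2, newField: "y" }))); // {"_id":2,"newField":"y"}
```

Reading this way lets old and new documents coexist, so the bulk update can run in the background without a coordinated release.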

Performance Considerations

Data model design directly impacts performance:

  1. Working set size: Ensure active data fits in memory
  2. Document size: Avoid overly large documents (close to the 16MB limit)
  3. Index coverage: Design queries that can be fully covered by indexes
  4. Write patterns: Consider batch insert and update efficiency
// Batch insert optimization
db.products.insertMany([
  { name: "Product1", price: 10 },
  { name: "Product2", price: 20 },
  // ...more documents
])

// Batch update optimization
db.orders.updateMany(
  { status: "pending" },
  { $set: { status: "processed" } }
)
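
When the input is very large, it is usually split into batches before each one is handed to insertMany. The sketch below shows only the splitting step in plain JavaScript; the helper name chunk, the batch size of 1000, and the generated product list are assumptions for illustration.

```javascript
// Hypothetical helper: split an array into batches of at most `size` items.
function chunk(items, size) {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Hypothetical data set of 2500 products.
const products = Array.from({ length: 2500 }, (_, i) => ({
  name: `Product${i + 1}`,
  price: 10
}));

const batches = chunk(products, 1000);
console.log(batches.map((b) => b.length).join(",")); // 1000,1000,500
// Each batch would then be passed to db.products.insertMany(batch).
```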

Practical Application Example

E-commerce platform data model design:

// Product collection
{
  _id: "P1001",
  sku: "MBP-13-2023",
  name: "MacBook Pro 13-inch 2023",
  category: ["Electronics", "Laptops"],
  price: 12999,
  attributes: {
    cpu: "M2",
    ram: "16GB",
    storage: "512GB SSD"
  },
  inventory: 50,
  ratings: [
    { userId: "U1001", score: 5, comment: "Very satisfied" },
    { userId: "U1002", score: 4 }
  ],
  avgRating: 4.5
}

// Order collection
{
  _id: "O2001",
  customerId: "U1001",
  items: [
    { productId: "P1001", quantity: 1, price: 12999 },
    { productId: "P1002", quantity: 2, price: 199 }
  ],
  shipping: {
    address: "Chaoyang District, Beijing...",
    method: "express"
  },
  payment: {
    method: "credit_card",
    transactionId: "TXN123456"
  },
  status: "completed",
  createdAt: ISODate("2023-05-15T10:30:00Z"),
  updatedAt: ISODate("2023-05-18T14:15:00Z")
}
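
Denormalized fields such as avgRating are only useful if they stay consistent with the data they summarize. The sketch below recomputes the average rating and an order total from the embedded arrays shown above, in plain JavaScript. The function names are hypothetical, and the order total is not a field in the example document; it is derived here only to show the calculation.

```javascript
// Hypothetical helper: average of the embedded rating scores, or null if none.
function averageRating(ratings) {
  if (ratings.length === 0) return null;
  const sum = ratings.reduce((acc, r) => acc + r.score, 0);
  return sum / ratings.length;
}

// Hypothetical helper: sum of price * quantity over the embedded order items.
function orderTotal(order) {
  return order.items.reduce((acc, item) => acc + item.price * item.quantity, 0);
}

const product = {
  _id: "P1001",
  ratings: [
    { userId: "U1001", score: 5, comment: "Very satisfied" },
    { userId: "U1002", score: 4 }
  ]
};
const order = {
  _id: "O2001",
  items: [
    { productId: "P1001", quantity: 1, price: 12999 },
    { productId: "P1002", quantity: 2, price: 199 }
  ]
};

console.log(averageRating(product.ratings)); // 4.5
console.log(orderTotal(order));              // 13397
```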

Advanced Data Model Techniques

  1. Polymorphic pattern:
// Different types of event documents
[
  {
    _id: "EVT001",
    type: "login",
    userId: "U1001",
    timestamp: ISODate(...),
    ipAddress: "192.168.1.1"
  },
  {
    _id: "EVT002",
    type: "purchase",
    userId: "U1002",
    timestamp: ISODate(...),
    orderId: "O2001",
    amount: 12999
  }
]
  2. Bucket pattern (for time-series data):
// Sensor data bucket
{
  _id: "SENSOR001_202305",
  sensorId: "SENSOR001",
  month: "202305",
  readings: [
    { timestamp: ISODate("2023-05-01T00:00:00Z"), value: 23.5 },
    { timestamp: ISODate("2023-05-01T00:05:00Z"), value: 23.7 },
    // ...more readings
  ],
  stats: {
    avgValue: 24.1,
    maxValue: 28.5,
    minValue: 22.3
  }
}
  3. Computed pattern (pre-aggregated data):
// Daily sales summary
{
  _id: "SALES_20230515",
  date: ISODate("2023-05-15"),
  totalSales: 125000,
  categorySales: {
    electronics: 80000,
    clothing: 30000,
    groceries: 15000
  },
  paymentMethods: {
    credit: 70000,
    debit: 40000,
    cash: 15000
  }
}
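
The bucket and computed patterns both rely on the application deriving values at write time: the bucket key a reading belongs to, and the summary stored alongside the raw data. The sketch below shows both steps in plain JavaScript, assuming monthly buckets keyed the same way as the sensor example above; the function names bucketId and summarize are hypothetical.

```javascript
// Hypothetical helper: monthly bucket key such as "SENSOR001_202305" (UTC based).
function bucketId(sensorId, timestamp) {
  const year = timestamp.getUTCFullYear();
  const month = String(timestamp.getUTCMonth() + 1).padStart(2, "0");
  return `${sensorId}_${year}${month}`;
}

// Hypothetical helper: pre-aggregated stats for a non-empty list of readings.
function summarize(readings) {
  const values = readings.map((r) => r.value);
  const sum = values.reduce((acc, v) => acc + v, 0);
  return {
    avgValue: sum / values.length,
    maxValue: Math.max(...values),
    minValue: Math.min(...values)
  };
}

const readings = [
  { timestamp: new Date("2023-05-01T00:00:00Z"), value: 23.5 },
  { timestamp: new Date("2023-05-01T00:05:00Z"), value: 23.7 }
];

console.log(bucketId("SENSOR001", readings[0].timestamp)); // SENSOR001_202305
const stats = summarize(readings);
console.log(stats.maxValue, stats.minValue); // 23.7 23.5
```

Using UTC for the bucket key keeps readings from landing in different buckets depending on the time zone of the machine that writes them.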
