1. Foreword

MongoDB is a popular database that works without the constraints of a fixed table schema. Data is stored in a JSON-like format and can contain different kinds of data structures. To take full advantage of MongoDB, however, you need to understand and follow some basic database design principles. Before covering the design approach, we first need to understand how MongoDB stores data.

2. Comparing SQL concepts with MongoDB concepts

Data structures are critical to any piece of software. MongoDB borrows from SQL databases in its conceptual model, but the two are not identical.

2.1 How data is stored in MongoDB

Unlike a traditional relational database (RDBMS), MongoDB does not have the concepts of table, row, and column. It stores data in collections, documents, and fields. The following table maps the RDBMS concepts to their MongoDB counterparts:

SQL concept | MongoDB concept | Description
database | database | A database contains multiple collections (tables).
table | collection | A collection is equivalent to a table in SQL and can hold multiple documents (rows). The difference is that the structure (schema) of a collection is dynamic and does not require a strict table structure to be declared up front. More importantly, MongoDB does not perform any schema validation on written data by default.
row | document | A document is equivalent to a row in SQL. It consists of multiple fields (columns) and is represented in the BSON (JSON-like) format.
column | field | A field is equivalent to a column in SQL. Compared with ordinary columns, field types are more flexible; for example, a field can hold a nested document or an array.

In addition, MongoDB is type-sensitive and case-sensitive (a stored value keeps its type, and field names that differ only in case are different fields), and the fields within a document are ordered.
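As a rough illustration of the mapping above, a collection can be pictured as a list of documents, and each document as a set of named fields. The sketch below uses plain Python dictionaries as in-memory stand-ins (no MongoDB server or driver is involved), and the collection and field names are hypothetical.

```python
# In-memory stand-in for a "users" collection: a list of documents.
# Plain dicts are used only for illustration; all names are made up.
users = [
    {
        "_id": 1,
        "name": "Alice",
        "address": {"city": "Paris", "zip": "75001"},  # nested document
        "tags": ["admin", "editor"],  # array field
    },
    # A second document in the same collection with a different shape:
    # no up-front schema declaration forces both to share the same fields.
    {"_id": 2, "name": "Bob", "age": 30},
]

# Field names are case-sensitive: "Name" and "name" are different fields.
doc = {"name": "Carol", "Name": "CAROL"}

# Values keep their type: the number 30 and the string "30" are not equal.
same_value = users[1]["age"] == "30"

# Fields keep the order in which they were written.
field_order = list(users[1].keys())

print(len(doc))      # 2
print(same_value)    # False
print(field_order)   # ['_id', 'name', 'age']
```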

2.2 SQL Mapping

SQL concept | MongoDB concept | Description
primary key | _id | The primary key. By default, MongoDB uses the _id field to ensure document uniqueness.
foreign key | reference | A reference is only loosely a kind of foreign key: it does not enforce any foreign key constraint, and is just a special kind of associated query and conversion carried out automatically by the client driver.
view | view | MongoDB has supported views since version 3.4. Much like SQL views, they are dynamically queried objects layered on top of tables/collections, and can be either virtual or physical (materialized views).
index | index | An index, the same concept as an SQL index.
join | $lookup | $lookup is an aggregation operator that can be used to implement functionality similar to an SQL JOIN.
transaction | transaction | Transactions are supported starting with MongoDB version 4.0.
group by | aggregation | MongoDB provides a powerful aggregation framework, in which GROUP BY is one kind of aggregation operation.
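To make the join and group by rows more concrete, here is a hedged sketch of how the two stages are typically written as aggregation pipeline stages from Python, together with a pure-Python emulation of what they compute. The collection names (`posts`, `authors`), the field names, and the two `emulate_*` helpers are hypothetical; the emulation only approximates the stages on in-memory lists and is not how MongoDB executes them.

```python
from collections import Counter

# Pipeline stages as they would be passed to an aggregate() call.
# "authors", "author_id" and "author" are hypothetical names.
lookup_stage = {
    "$lookup": {
        "from": "authors",          # collection to join with
        "localField": "author_id",  # field in the input documents
        "foreignField": "_id",      # field in the "from" collection
        "as": "author",             # name of the output array field
    }
}
group_stage = {"$group": {"_id": "$author_id", "posts": {"$sum": 1}}}

# In-memory stand-ins for the two collections.
authors = [{"_id": 1, "name": "Alice"}, {"_id": 2, "name": "Bob"}]
posts = [
    {"_id": 10, "title": "Intro", "author_id": 1},
    {"_id": 11, "title": "Indexes", "author_id": 1},
    {"_id": 12, "title": "Views", "author_id": 2},
]

def emulate_lookup(local_docs, foreign_docs, local_field, foreign_field, as_field):
    """Approximate $lookup: attach the matching foreign documents as an array."""
    joined = []
    for doc in local_docs:
        matches = [f for f in foreign_docs
                   if f.get(foreign_field) == doc.get(local_field)]
        joined.append({**doc, as_field: matches})
    return joined

def emulate_group_count(docs, key_field):
    """Approximate the $group stage above: count documents per key.

    The result is sorted by key only to keep this example deterministic.
    """
    counts = Counter(doc[key_field] for doc in docs)
    return [{"_id": key, "posts": n} for key, n in sorted(counts.items())]

joined_posts = emulate_lookup(posts, authors, "author_id", "_id", "author")
posts_per_author = emulate_group_count(posts, "author_id")

print(joined_posts[0]["author"][0]["name"])  # Alice
print(posts_per_author)  # [{'_id': 1, 'posts': 2}, {'_id': 2, 'posts': 1}]
```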

3. BSON data types

MongoDB documents can be represented as JavaScript objects, and their format is based on JSON. However, JSON has drawbacks, such as lacking support for specific data types like dates, so MongoDB actually uses an extended form of JSON called BSON (Binary JSON).
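The date limitation is easy to reproduce with Python's standard `json` module, which is used here only as an illustration of plain JSON (it is not BSON and not part of MongoDB). The document and its field names are hypothetical.

```python
import json
from datetime import datetime, timezone

# A document with a date field; the names are made up for illustration.
post = {"title": "Hello", "created_at": datetime(2020, 1, 1, tzinfo=timezone.utc)}

# Plain JSON has no date type, so the standard encoder rejects the value.
try:
    json.dumps(post)
    json_accepts_dates = True
except TypeError:
    json_accepts_dates = False

# A common workaround is to fall back to a string, which loses the type:
# after decoding, the field is an ordinary str rather than a datetime.
encoded = json.dumps({**post, "created_at": post["created_at"].isoformat()})
decoded = json.loads(encoded)

print(json_accepts_dates)                     # False
print(type(decoded["created_at"]).__name__)   # str
```

BSON avoids this round trip through strings by defining a dedicated date type.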

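The data types supported by BSON include double, string, object (embedded document), array, binary data, ObjectId, boolean, date, null, regular expression, 32-bit and 64-bit integers, timestamp, and Decimal128.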

4. Database design tips and tricks

4.1 Normalized storage vs. de-normalized storage

Because MongoDB uses documents to store data, it’s important to understand the concepts of “normalized storage” and “de-normalized storage.”

Normalized storage: Normalization means storing data in multiple collections and designing associations between them. Because each piece of data is saved in only one place, it is easier to update. The disadvantage shows up when reading: to look up data that spans multiple collections, you have to perform multiple queries, which slows reads down. (For example, storing a web page's title, author, and content in separate collections.)
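A minimal sketch of the normalized layout, using in-memory lists of dictionaries as stand-ins for collections. The collection names, field names, and the `find_one` helper are hypothetical, and each `find_one` call stands for one query against the database.

```python
# Normalized layout: author and content live in their own collections,
# and the page document only stores references to them.
authors = [{"_id": 1, "name": "Alice"}]
contents = [{"_id": 100, "body": "Schema design basics"}]
pages = [{"_id": 10, "title": "MongoDB design", "author_id": 1, "content_id": 100}]

query_count = 0

def find_one(collection, doc_id):
    """Hypothetical helper: look up one document by _id, counting each query."""
    global query_count
    query_count += 1
    return next((d for d in collection if d["_id"] == doc_id), None)

def load_page(page_id):
    """Reading a full page takes three lookups, one per collection."""
    page = find_one(pages, page_id)
    author = find_one(authors, page["author_id"])
    content = find_one(contents, page["content_id"])
    return {"title": page["title"], "author": author["name"], "body": content["body"]}

full_page = load_page(10)
queries_for_one_read = query_count

# Updating the author's name touches exactly one document.
authors[0]["name"] = "Alice B."
author_after_update = load_page(10)["author"]

print(queries_for_one_read)   # 3
print(author_after_update)    # Alice B.
```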

De-normalized storage: This approach nests the data of several objects inside a single document. It performs better when reading, but is slower when writing, and it also takes up more space. (For example, storing a page's title, author, and content in the same collection, as a single document.)
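A minimal sketch of the de-normalized layout, again with an in-memory list of dictionaries standing in for a collection. The collection name, field names, and both helper functions are hypothetical.

```python
# De-normalized layout: title, author and content are nested in one document.
pages = [
    {"_id": 10, "title": "MongoDB design",
     "author": {"name": "Alice"}, "content": {"body": "Schema design basics"}},
    {"_id": 11, "title": "Indexing",
     "author": {"name": "Alice"}, "content": {"body": "Index basics"}},
]

def load_page(page_id):
    """Reading a full page is a single lookup: everything is in one document."""
    return next((p for p in pages if p["_id"] == page_id), None)

def rename_author(old_name, new_name):
    """The author's name is duplicated, so every copy has to be rewritten."""
    updated = 0
    for page in pages:
        if page["author"]["name"] == old_name:
            page["author"]["name"] = new_name
            updated += 1
    return updated

page = load_page(10)
documents_rewritten = rename_author("Alice", "Alice B.")

print(page["title"])         # MongoDB design
print(documents_rewritten)   # 2
```

The duplicated author data is what makes reads cheap here, and it is also why a single logical change has to rewrite more than one document.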

So before choosing between the two, evaluate how your application will use the database. If your data does not need to be updated frequently, immediate consistency of updates is not important, and read performance matters, then de-normalization may be a wise choice. (For example, a blog post is rarely modified once its author has saved it, but it is read frequently.)

If the documents in the database need to be updated constantly and you want good write performance, consider normalized storage. (For example, business systems that modify data frequently.)