1. NoSql introduction and Overview

1.1) Great opportunities in the Internet era, why noSQL

1.1.1 The good old days of standalone MySQL

In the ’90s, the traffic to a website was so small that it was easy to handle a single database. At that time, there were more static web pages than dynamic interactive types of sites

Under the above architecture, what is the bottleneck of data storage?

1. Total data volume cannot be stored on a single machine. 2. Data index (B+ Tree) 3. Page views (mixed read/write) an instance cannot withstand if 1 or 3 of the above are met, evolve……

1.1.2 Memcached +MySQL+ vertical split

Later, with the increase of traffic, almost all sites using MySQL architecture began to have performance problems in the database, web applications are no longer just focused on function, but also in pursuit of performance. Programmers began to use caching technology to relieve the pressure of the database, optimize the structure and index of the database. At the beginning, file caching is popular to relieve the database pressure, but when the traffic continues to increase, multiple Web machines can not share through the file cache, a large number of small file cache also brought a relatively high IO pressure. At this point,Memcached is naturally a very fashionable technology product.

Memcached, as an independent distributed cache server, provides a high-performance shared cache service for multiple Web servers. On the Memcached server, multiple Memcached services are extended based on the hash algorithm. Then came consistent hashing to address the massive cache invalidation caused by rehashing by adding or subtracting cache servers

1.1.3 Primary and Secondary read/write separation of Mysql

As database write pressure increases, Memcached only relieves database read pressure. Most websites start to use master-slave replication technology to achieve read/write separation in order to improve read/write performance and scalability of the read/write library. Mysql’s master-slave mode became standard on the site at this time.

1.1.4 Split table and library + Horizontal split +mysql cluster

On the basis of Memcached cache, MySQL master-slave replication, read/write separation, the write pressure of MySQL master library begins to bottleneck, and the data volume continues to surge. Because MyISAM uses table lock, there will be serious lock problems under high concurrency. A large number of high-concurrency MySQL applications are starting to use the InnoDB engine instead of MyISAM.

At the same time, it became popular to use separate tables and libraries to ease the write pressure and data growth expansion problems. At this point, dividing tables and libraries became a hot technology, a hot interview question and a hot technical question discussed in the industry. It was around this time that MySQL came out with a table partition that was still not stable, which gave hope to companies with modest technology. Although MySQL launched the MySQL Cluster Cluster, the performance can not well meet the requirements of the Internet, but provides a very large guarantee of high reliability.

1.1.5 MySQL scalability Bottleneck

MySQL database also often stores some large text fields, resulting in a very large database table, which results in very slow database recovery, and it is not easy to quickly restore the database. For example, 10 million 4KB of text is close to 40GB of text. If you could leave this data out of MySQL, MySQL would become very small. Relational databases are powerful, but they can’t handle all application scenarios well. MySQL’s poor scalability (complex technology is required to achieve), big data IO pressure, difficult to change the table structure, are the current use of MySQL developers are facing problems.

1.1.6 what does it look like today?

1.1.7 为什么用NoSQL

Today, we can easily access and capture data through third-party platforms (such as Google,Facebook, etc.). User profiles, social networks, geolocation, user-generated data and user logs have multiplied.If we want to mine these user data, the SQL database is not suitable for these applicationsThe development of NoSQL database can also handle these large data well.

1.2) what is it

NoSQL(NoSQL = Not Only SQL), meaning “more than SQL”, refers to non-relational database. With the rise of the Internet web2.0 website, the traditional relational database has been unable to cope with the web2.0 website, especially the super large scale and high concurrency SNS type web2.0 pure dynamic website, exposing a lot of problems that are difficult to overcome, while the non-relational database has been very rapid development because of its own characteristics. NoSQL database is created to solve the challenges of large data sets and multiple data types, especially the problems of big data applications, including the storage of super-large data.

1.3) can do

1.3.1 easy extension

There are many types of NoSQL databases, but one common feature is to remove the relational features of relational databases. There is no relationship between the data, which makes it easy to scale. It also brings scalability at the architectural level.

1.3.2 High Performance with large data volume

NoSQL databases have very high read/write performance, especially under large data volumes. This benefits from its irrelevance and simple database structure. Generally, MySQL uses Query Cache, which is invalidated every time a table is updated. It is a large-granularity Cache, and the Cache performance is not high in applications with frequent web2.0 interactions. NoSQL’s Cache is record level, which is a fine-grained Cache, so NoSQL’s performance is much higher at this level

1.3.3 Diverse and flexible data models

NoSQL does not need to create fields for the data to be stored in advance and can store custom data formats at any time. In relational databases, adding and deleting fields is a very troublesome thing. If you have a very large table, adding fields is a nightmare

1.3.4 Traditional RDBMS VS NOSQL

RDBMS vs NoSQL

RDBMS

  • Highly organized structured data
  • Structured Query Language (SQL)
  • Data and relationships are stored in separate tables.
  • Data manipulation language, data definition language
  • Strict consistency
  • Based on the transaction

NoSQL

  • Stands for more than just SQL
  • There is no declarative query language
  • There are no predefined patterns

– Key-value pair storage, column storage, document storage, graphic database

  • Final consistency, not ACID properties
  • Unstructured and unpredictable data
  • The CAP theorem
  • High performance, high availability and scalability

1.4) go under

  • Redis
  • Memcache
  • Mongdb

1.5) how to play

  • KV
  • Cache
  • Persistence

.