This article introduces the theory behind caching in large distributed systems, along with common cache components and their application scenarios.

1 Overview of Caching

(Figure: overview of caching)

2 Cache classification

There are four main types of caches.

(Figure: cache classification)

2.1 CDN Caching

Basic introduction

The basic principle of a Content Delivery Network (CDN) is to deploy cache servers widely, placing them in the regions or networks where a site's visitors are concentrated. When a user visits the site, global load balancing directs the request to the nearest cache server, which responds to the user directly.

Application scenarios

Mainly caches static resources such as images and videos.

Application diagrams

(Figure: without CDN cache)
(Figure: with CDN cache)

Advantages

(Figure: advantages of CDN caching)

2.2 Reverse Proxy Caching

Basic introduction

The reverse proxy sits in the application's server room and handles all requests to the web server. If the page a user requests is cached on the proxy server, the proxy sends the cached content directly to the user. If it is not, the proxy requests the page from the web server, caches it locally, and then sends it to the user. By reducing the number of requests that reach the web server, this lowers the web server's load.
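The decision flow above can be sketched as a simple read-through cache. The names here are illustrative, and a real proxy such as Nginx or Varnish also handles TTLs, cache keys, and invalidation:

```python
class ReverseProxyCache:
    """Minimal sketch of the reverse-proxy caching flow: serve from
    the local cache when possible, otherwise forward to the origin
    web server, cache the response, then serve it."""

    def __init__(self, fetch_from_origin):
        self._cache = {}                 # path -> cached response body
        self._fetch = fetch_from_origin  # callback to the origin server

    def handle(self, path):
        # Cache hit: respond without touching the origin server.
        if path in self._cache:
            return self._cache[path]
        # Cache miss: fetch from the origin, store locally, respond.
        body = self._fetch(path)
        self._cache[path] = body
        return body
```

With this sketch, repeated requests for the same path reach the origin server only once, which is exactly how the proxy reduces the web server's load.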

Application scenarios

Generally, only small static resources such as CSS, JS, and image files are cached.

Application diagram

(Figure: reverse proxy cache application diagram)

Open-source implementations

(Figure: open-source implementations)

2.3 Local Application Caching

Basic introduction

Local caching refers to a cache component inside the application itself. Its biggest advantage is that the application and the cache run in the same process, so cache requests are very fast and incur no network overhead. It is a good fit when a single application does not need cluster support, or when cluster nodes do not need to notify one another about cache state. Its disadvantage is that the cache is coupled to the application: multiple applications cannot share the cache directly, so each application or cluster node must maintain its own separate copy, which wastes memory.
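A minimal sketch of such an in-process cache is below. The names are illustrative; real local cache components such as Ehcache add eviction policies, statistics, and overflow to disk:

```python
import threading

class LocalCache:
    """Sketch of an in-process cache: lookups are plain dictionary
    reads with no network hop, but the data lives only inside this
    process, so other applications or cluster nodes cannot share it."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()  # the app's threads share this cache

    def get(self, key, default=None):
        with self._lock:
            return self._data.get(key, default)

    def put(self, key, value):
        with self._lock:
            self._data[key] = value
```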

Application scenarios

Caches commonly used data such as dictionary tables.

Cache media

(Figure: cache media)

Implementations

Direct programmatic implementation

(Figure: direct programmatic implementation)

Ehcache

Basic introduction

Ehcache is an open-source, standards-based cache that improves performance, offloads the database, and simplifies scalability. It is the most widely used Java-based cache because it is powerful, proven, full-featured, and integrates with other popular libraries and frameworks. Ehcache can scale from an in-process cache to mixed in-process/out-of-process deployments with terabyte-sized caches.

Application scenarios

(Figure: Ehcache application scenarios)

Architecture diagram

(Figure: Ehcache architecture diagram)

Main features

(Figure: Ehcache main features)

Cache data expiration policies

(Figure: Ehcache cache data expiration policies)

Stale data eviction mechanism

Lazy eviction: each time data is put into the cache, a timestamp is stored with it. On read, the entry's age is compared against the configured TTL to determine whether the data has expired.
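The lazy scheme above can be sketched as follows. This is illustrative Python, not Ehcache's actual implementation: the key point is that expiry is only checked on read, with no background sweeper thread.

```python
import time

class LazyTtlCache:
    """Sketch of lazy TTL eviction: every put records a timestamp,
    and an entry's age is compared against the TTL only when the
    entry is read back."""

    def __init__(self, ttl_seconds):
        self._ttl = ttl_seconds
        self._data = {}  # key -> (value, stored_at)

    def put(self, key, value):
        self._data[key] = (value, time.monotonic())

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, stored_at = entry
        # Lazy check: compare the entry's age with the TTL on read.
        if time.monotonic() - stored_at > self._ttl:
            del self._data[key]  # expired: drop it and report a miss
            return None
        return value
```

The trade-off of pure lazy eviction is that entries that are never read again still occupy memory until something else forces them out.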

Guava Cache

Basic introduction

Guava Cache is a caching tool in Guava, Google's open-source Java core libraries.

Features and functions

(Figure: Guava Cache features and functions)

Application scenarios

(Figure: Guava Cache application scenarios)

Data structure diagram

(Figure: Guava Cache data structure diagram)
(Figure: Guava Cache structure)

Cache update policy

(Figure: Guava Cache update policy)

Cache reclamation policy

(Figure: Guava Cache reclamation policy)

2.4 Distributed Cache

A distributed cache is a cache component or service that is separated from the application. Its biggest advantage is that, as an independent application, it is isolated from any local application, so multiple applications can share the cache directly.

Main application scenarios

(Figure: distributed cache application scenarios)

Main access modes

(Figure: distributed cache access modes)

Below are two common open-source implementations of distributed caching: Memcached and Redis.

Memcached

Basic introduction

Memcached is a high-performance distributed memory object caching system. By maintaining one giant hash table in memory, it can store data in many formats, including images, videos, files, and database query results. Simply put, data is loaded into memory and then read from memory, which greatly improves read speed.

Characteristics

(Figure: Memcached characteristics)

Basic architecture

(Figure: Memcached basic architecture)

Cache data expiration policy

LRU (least recently used) expiration policy. When storing an item in Memcached, you can specify when it should expire from the cache; the default is never. When Memcached runs out of allocated memory, it replaces expired entries first, then the least recently used entries.

Internal implementation of data eviction

Lazy eviction: each time data is put into the cache, a timestamp is stored with it. On read, the entry's age is compared against the configured TTL to determine whether the data has expired.

Distributed cluster implementation

Memcached servers provide no "distributed" functionality of their own; each server is a completely independent, isolated service. Memcached's distribution is implemented entirely by the client program.
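Client-side sharding can be sketched as a function that maps each key to one server. Real Memcached clients usually use consistent hashing so that adding or removing a server remaps only a fraction of the keys; the simple hash-modulo scheme below is illustrative only, and the server addresses are made up:

```python
import hashlib

def pick_server(key, servers):
    """Map a cache key to one of several independent Memcached
    servers. The servers never talk to each other; this routing
    decision lives entirely in the client."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return servers[int(digest, 16) % len(servers)]
```

Because every client instance computes the same mapping for the same key, all application nodes agree on which server holds which key without any server-side coordination.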

Data read/write flow chart

(Figure: Memcached distributed cluster implementation)

Redis

Basic introduction

Redis is a remote in-memory database (a non-relational database) with strong performance, replication features, and a unique data model. It stores mappings between keys and five different types of values, can persist in-memory key-value data to disk, can use replication to scale read performance, and can use client-side sharding to scale write performance. It has built-in replication, Lua scripting, LRU eviction, transactions, and multiple levels of on-disk persistence, and it provides high availability through Redis Sentinel and automatic partitioning (Redis Cluster).

Data model

(Figure: Redis data model)

Data eviction policies

(Figure: Redis data eviction policies)

Internal implementation of data eviction

(Figure: Redis data eviction internals)

Persistence modes

(Figure: Redis persistence)

Partial analysis of the underlying implementation

  • Start-up process (partial)

    (Figure: part of the start-up process)
  • Server-side persistence operations (partial)

    (Figure: part of the server-side persistence operations)
  • Underlying hash table implementation (progressive rehash)

    (Figure: dictionary initialization)
    (Figure: adding dictionary elements)
    (Figure: rehash execution process)
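The progressive rehash shown in the diagrams above can be sketched as follows. This Python sketch is illustrative and greatly simplified compared with Redis's C implementation (different hash function, simplified resize trigger, no handling of shrinking), but it captures the core idea: two hash tables coexist, and each operation migrates one bucket.

```python
class ProgressiveDict:
    """Sketch of Redis-style progressive rehash: the dict keeps two
    hash tables (ht0 and ht1); during a resize, each get/put migrates
    one bucket from ht0 to ht1 instead of rehashing the whole table
    at once, so no single operation pays the full rehash cost."""

    def __init__(self, size=4):
        self._ht0 = [[] for _ in range(size)]  # buckets of (key, value)
        self._ht1 = None                       # target table during rehash
        self._rehash_idx = -1                  # -1 means "not rehashing"

    def _bucket(self, table, key):
        return table[hash(key) % len(table)]

    def _step(self):
        # Migrate a single bucket per operation (the "progressive" part).
        if self._rehash_idx < 0:
            return
        for kv in self._ht0[self._rehash_idx]:
            self._bucket(self._ht1, kv[0]).append(kv)
        self._ht0[self._rehash_idx] = []
        self._rehash_idx += 1
        if self._rehash_idx == len(self._ht0):  # rehash finished
            self._ht0, self._ht1 = self._ht1, None
            self._rehash_idx = -1

    def _count(self):
        total = sum(len(b) for b in self._ht0)
        if self._ht1 is not None:
            total += sum(len(b) for b in self._ht1)
        return total

    def put(self, key, value):
        self._step()
        # Update in place if the key already exists in either table.
        for table in (self._ht0, self._ht1):
            if table is None:
                continue
            chain = self._bucket(table, key)
            for i, (k, _) in enumerate(chain):
                if k == key:
                    chain[i] = (key, value)
                    return
        # New keys go to ht1 while a rehash is in progress.
        target = self._ht1 if self._rehash_idx >= 0 else self._ht0
        self._bucket(target, key).append((key, value))
        if self._rehash_idx < 0 and self._count() > len(self._ht0):
            self._ht1 = [[] for _ in range(len(self._ht0) * 2)]
            self._rehash_idx = 0

    def get(self, key):
        self._step()
        # During a rehash a key may live in either table.
        for table in (self._ht0, self._ht1):
            if table is None:
                continue
            for k, v in self._bucket(table, key):
                if k == key:
                    return v
        return None
```

Spreading the migration across operations is what lets Redis resize a dictionary holding millions of keys without blocking its single-threaded event loop on one huge rehash.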

Cache design principles

(Figure: Redis cache design principles)

Redis compared to Memcached

Supported data structures. Redis: hashes, lists, sets, sorted sets. Memcached: plain key-value.
Persistence. Redis: supported. Memcached: not supported.
High availability. Redis: natively supports clustering with master-slave replication and read/write splitting; the Sentinel management tool adds master/slave monitoring and automatic failover, all transparent to the client with no application changes or manual intervention. Memcached: requires secondary development.
Maximum value size. Redis: 512 MB. Memcached: 1 MB.
Memory allocation. Redis: requests temporary space on demand, which can cause fragmentation. Memcached: pre-allocates memory pools, saving allocation time.
Virtual memory. Redis: has its own VM mechanism, so in theory it can hold more data than physical memory allows; when the data volume exceeds a threshold, swap flushes cold data to disk. Memcached: keeps all data in physical memory.
Network model. Redis: non-blocking I/O multiplexing; it also offers sorting and aggregation beyond plain KV storage, and these CPU-heavy operations can block the entire I/O scheduler. Memcached: non-blocking I/O multiplexing.
Horizontal scaling. Redis: no. Memcached: no.
Threading. Redis: single-threaded. Memcached: multi-threaded, with better CPU utilization than Redis.
Expiry policy. Redis: dedicated threads clear expired data. Memcached: lazy eviction only; each entry stores a timestamp on write, and its age is compared against the TTL on read to decide whether it has expired.
Single-node QPS. Redis: about 100,000. Memcached: about 600,000.
Source code readability. Redis: clean, concise code. Memcached: arguably over-engineered for scalability and multi-system compatibility, so the code is less clean.
Typical scenarios. Redis: complex data structures, persistence, high availability requirements, and large values. Memcached: pure KV workloads with very large data volumes and very high concurrency.

References

Learning Architecture from Scratch, by Li Yunhua (Alibaba)

Java Core Technology: 36 Lectures, by Yang Xiaofeng (Oracle)

Analyzing Redis Architecture Design, by God Forbid

Memcached official documentation

The Difference Between Redis Persistence Modes RDB and AOF, by Shen Jian (58.com)

Cache: Are You Really Using It Right?, by Shen Jian (58.com)

Redis or Memcached?, by Shen Jian (58.com)

About Caching, by the Meituan tech team

Redis Cache Design Principles, by Snow Feihong

Redis Cache Policy and Primary Key Invalidation Mechanism, by Bing Yue

Reproduced from:

https://juejin.cn/post/6844903636770750472
