• Data type support

    Data type     Redis   Memcached
    String        √       √
    Hash          √       ×
    List          √       ×
    Set           √       ×
    Sorted set    √       ×
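
    As a concrete illustration of the richer Redis data types listed above (Memcached only offers a flat string/blob key-value model), here is a minimal sketch assuming a local Redis server and the redis-py client; the key names are made up for the example.

    ```python
    import redis

    r = redis.Redis()
    r.set("page:title", "Hello")                           # string
    r.hset("user:1", mapping={"name": "Ada", "age": 36})   # hash
    r.rpush("queue:jobs", "job-1", "job-2")                # list
    r.sadd("tags:post:7", "redis", "cache")                # set
    r.zadd("leaderboard", {"alice": 120, "bob": 95})       # sorted set
    print(r.zrange("leaderboard", 0, -1, withscores=True))
    ```
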
  • Data persistence support

    Although Redis is a memory-based storage system, it supports persisting its in-memory data and provides two main persistence strategies: RDB snapshots and AOF logging. Memcached does not support data persistence at all.

    1) RDB snapshot

    Redis supports a persistence mechanism that stores a snapshot of the current data set in a data file, called an RDB snapshot. But how does a database that is constantly being written to generate a snapshot? Redis takes advantage of the copy-on-write behavior of the fork system call: to generate a snapshot, it forks a child process from the current process, and the child process iterates over all the data and writes it to an RDB file. The save directive of Redis can be used to configure when RDB snapshots are generated. For example, a snapshot can be generated after 10 minutes, or after 1000 writes, and multiple rules can be combined. These rules are defined in the Redis configuration file. You can also set them while Redis is running with the CONFIG SET command, without restarting Redis.
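
    A minimal sketch, assuming a local Redis server and the redis-py client, of setting the snapshot rules at runtime; the thresholds shown (900/300/60 seconds combined with key-change counts) are only illustrative.

    ```python
    import redis

    r = redis.Redis(host="localhost", port=6379)

    # Snapshot after 900 s if >= 1 key changed, after 300 s if >= 10 changed,
    # or after 60 s if >= 10000 changed; the rules apply together.
    r.config_set("save", "900 1 300 10 60 10000")
    print(r.config_get("save"))   # verify the active rules

    r.bgsave()                    # fork a child and write an RDB snapshot now
    ```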

    A Redis RDB file never ends up half-written, because the write is performed in a separate process. When a new RDB file is generated, the Redis child process first writes the data to a temporary file and then uses the atomic rename system call to rename the temporary file to the final RDB file. This way a usable RDB file is always available, even if a failure occurs mid-write. RDB files are also part of the internal Redis master/slave synchronization implementation. RDB does have a disadvantage: once the database goes down, the data saved in the RDB file is not up to date; everything written between the last RDB snapshot and the Redis shutdown is lost. For some businesses this is tolerable.
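
    A minimal sketch of the write-to-temp-file-then-rename pattern described above (illustrative only; Redis implements this in C inside the forked child process):

    ```python
    import os

    def save_snapshot(data: bytes, path: str = "dump.rdb") -> None:
        tmp_path = f"{path}.tmp-{os.getpid()}"
        with open(tmp_path, "wb") as f:
            f.write(data)          # write the full snapshot to a temporary file
            f.flush()
            os.fsync(f.fileno())   # make sure it is on disk before the swap
        # Atomic on POSIX: readers see either the old or the new file, never a torn one.
        os.rename(tmp_path, path)

    save_snapshot(b"example snapshot payload")
    ```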

    2) AOF logs

    The full name of the AOF log is Append Only File; it is an append-only log file. Unlike a database binlog, an AOF file is plain text, and its contents are simply standard Redis commands. Only commands that modify data are appended to the AOF file. Since every data-modifying command produces a log entry, the AOF file keeps growing, so Redis provides another feature called AOF rewrite. Its purpose is to regenerate the AOF file so that each record is written only once, whereas the old file may contain many operations on the same value. Similar to RDB, it forks a child process that iterates over the data and writes it to a new temporary AOF file. While the new file is being written, all incoming write operations are still logged to the old AOF file and are also recorded in an in-memory buffer. When the rewrite is complete, all the logs in the buffer are flushed to the temporary file at once, and then the atomic rename call replaces the old AOF file with the new one.
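
    A minimal sketch, again assuming redis-py, of enabling AOF and triggering a rewrite by hand; Redis can also start rewrites automatically based on the auto-aof-rewrite-percentage and auto-aof-rewrite-min-size settings.

    ```python
    import redis

    r = redis.Redis()
    r.config_set("appendonly", "yes")    # turn on AOF logging
    r.bgrewriteaof()                     # fork a child to rewrite/compact the AOF

    info = r.info("persistence")
    print(info.get("aof_rewrite_in_progress"), info.get("aof_last_bgrewrite_status"))
    ```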

    AOF writing is a file write operation whose purpose is to get the operation log onto disk, so it goes through the same write flow we described above. After Redis writes to the AOF, the appendfsync option controls when fsync is called to flush the data to disk. The appendfsync settings below are listed in order of increasing safety.

    appendfsync no: When appendfsync is set to no, Redis does not actively call fsync to flush the AOF contents to disk, so flushing depends entirely on the operating system. On most Linux systems the kernel writes the buffered data to disk roughly every 30 seconds.

    appendfsync everysec: When appendfsync is set to everysec (the default), Redis calls fsync once per second to write the buffered data to disk. But if an fsync call takes longer than one second, Redis adopts a policy of delaying the next fsync by another second; that is, the fsync is performed two seconds later, and that fsync is carried out no matter how long it takes. Because the file descriptor is blocked during fsync, the current write operation blocks as well. The conclusion is that in most cases Redis performs an fsync every second, and in the worst case one fsync every two seconds. This behaviour, known as group commit on most database systems, combines the data of multiple writes and writes the log to disk in a single operation.

    appendfsync always: When appendfsync is set to always, fsync is called once for every write operation, which is the safest for data. Of course, performance suffers because fsync is executed on every write.
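
    The policy can be switched at runtime; a minimal redis-py sketch:

    ```python
    import redis

    r = redis.Redis()
    # Trade-off: "always" = safest but slowest; "everysec" = at most ~2 s of data
    # lost in the worst case; "no" = fastest, flushing is left to the OS.
    r.config_set("appendfsync", "everysec")
    print(r.config_get("appendfsync"))
    ```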

    For general business requirements, it is recommended to use RDB for persistence because the overhead of RDB is much lower than that of AOF logs. AOF logs are recommended for applications that cannot afford to lose data.

  • Memory management mechanism

    In Redis, not all data is kept in memory at all times. This is one of the biggest differences from Memcached. When physical memory runs out, Redis can swap values that have not been used for a long time out to disk. Redis always keeps the information about all keys cached in memory; if it finds that memory usage exceeds a certain threshold, the swap operation is triggered. Redis decides which values to swap according to swappability = age * log(size_in_memory); the values of the selected keys are persisted to disk and removed from memory. This feature allows Redis to hold more data than the machine's own memory can fit. Of course, the machine's memory must still be able to hold all the keys, since keys themselves are never swapped out. When Redis swaps in-memory data to disk, the main thread that serves requests and the sub-thread that performs the swap share that memory, so if the data being swapped needs to be updated, Redis blocks the operation until the sub-thread completes the swap.

    When reading data from Redis, if the value of the requested key is not in memory, Redis must load the corresponding data from the swap file before returning it to the requester. This raises the question of the I/O thread pool. By default, Redis blocks until all the needed swap files are loaded. This policy is suitable for batch operations with a small number of clients, but if Redis is used in a large web application it is clearly not enough for high concurrency. So Redis lets us configure the size of the I/O thread pool so that read requests that need to load data from the swap file can be handled concurrently, reducing blocking time.
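
    A minimal sketch of the swap-selection heuristic described above; the keys, ages, and sizes are made up for illustration, and the real logic lives inside Redis's (legacy) virtual memory code.

    ```python
    import math

    def swappability(age_seconds: float, size_in_memory: int) -> float:
        # swappability = age * log(size_in_memory): old and large values score highest.
        return age_seconds * math.log(size_in_memory)

    candidates = {
        "session:42": (3600, 2048),        # (seconds since last access, bytes in memory)
        "hot:counter": (5, 64),
        "report:2019": (86400, 1_048_576),
    }

    # Sort keys so the most swappable ones come first.
    for key, (age, size) in sorted(candidates.items(),
                                   key=lambda kv: swappability(*kv[1]),
                                   reverse=True):
        print(key, round(swappability(age, size), 1))
    ```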

    By default, Memcached uses the Slab Allocation mechanism to manage memory. Its main idea is to divide the allocated memory into blocks of predetermined lengths and to store each key-value record in the chunk class that matches its size, which completely solves the memory fragmentation problem. Slab Allocation is only designed to store the external key-value data; all other memory requests inside Memcached are made with ordinary malloc/free, because their number and frequency do not affect overall system performance. The principle of Slab Allocation is quite simple, as the sketch below illustrates.
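
    A minimal sketch of the size-class idea behind Slab Allocation (illustrative only, not Memcached's actual implementation; real slabs use a configurable growth factor and pre-allocated pages):

    ```python
    # Memory is pre-divided into size classes; each item goes into the smallest
    # chunk class that fits it, so no per-item malloc/free fragmentation occurs.
    CHUNK_CLASSES = [64, 128, 256, 512, 1024]   # illustrative size classes in bytes

    def pick_chunk_class(item_size: int) -> int:
        for size in CHUNK_CLASSES:
            if item_size <= size:
                return size
        raise ValueError("item larger than the biggest chunk class")

    print(pick_chunk_class(90))    # stored in a 128-byte chunk; 38 bytes are wasted
    print(pick_chunk_class(700))   # stored in a 1024-byte chunk
    ```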

    For memory-based database systems such as Redis and Memcached, the efficiency of memory management is a key factor in system performance. The malloc/free functions of traditional C are the most common way to allocate and free memory, but this approach has major drawbacks. First, mismatched malloc and free calls easily cause memory leaks. Second, frequent calls produce large amounts of memory fragmentation that cannot be reclaimed and reused, lowering memory utilization. Finally, allocation may trigger system calls, whose overhead is much higher than that of ordinary function calls. Therefore, to improve memory management efficiency, efficient memory management solutions avoid direct malloc/free calls. Redis and Memcached both use their own memory management mechanisms, but as the two mechanisms described above show, their implementations are very different.

  • Cluster management mechanism

    Memcached does not support distribution by itself, so distributed storage with Memcached can only be implemented on the client side, using distributed algorithms such as consistent hashing. The following diagram shows Memcached's distributed storage implementation architecture. Before sending data to a Memcached cluster, the client uses a built-in distributed algorithm to calculate the destination node for the data and then sends the data directly to that node for storage. Likewise, when the client queries data, it calculates the node where the data lives and sends the query request directly to that node to retrieve it.
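
    A minimal sketch of such a client-side consistent hashing ring (illustrative; production clients such as libmemcached use the ketama algorithm with many virtual nodes per server):

    ```python
    import bisect
    import hashlib

    class ConsistentHashRing:
        def __init__(self, nodes, replicas=100):
            self.ring = []                     # sorted list of (hash, node)
            for node in nodes:
                for i in range(replicas):      # virtual nodes smooth the distribution
                    h = self._hash(f"{node}#{i}")
                    bisect.insort(self.ring, (h, node))

        @staticmethod
        def _hash(value: str) -> int:
            return int(hashlib.md5(value.encode()).hexdigest(), 16)

        def node_for(self, key: str) -> str:
            h = self._hash(key)
            idx = bisect.bisect(self.ring, (h, ""))   # first virtual node clockwise
            return self.ring[idx % len(self.ring)][1]

    ring = ConsistentHashRing(["mc1:11211", "mc2:11211", "mc3:11211"])
    print(ring.node_for("user:1001"))   # the same key always maps to the same server
    ```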

    Redis, on the other hand, already supports distributed storage. Redis Cluster is the version of Redis that implements distribution and tolerates single-node failures; it has no central node and scales linearly. The following figure shows the distributed storage architecture of Redis Cluster, in which nodes communicate with each other over a binary protocol and with clients over the ASCII protocol. In terms of data placement strategy, Redis Cluster divides the whole key space into 16384 hash slots. Each node can hold one or more hash slots, which means the maximum number of nodes a Redis Cluster can support is 16384. The slot for a key is computed with a simple algorithm: crc16(key) % 16384.
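
    A minimal sketch of that key-to-slot mapping; the CRC16 variant below is the CCITT/XModem one that the Redis Cluster specification uses, but a real client library ships its own tested implementation (and also handles hash tags in keys).

    ```python
    def crc16_xmodem(data: bytes) -> int:
        # CRC-16/XModem: polynomial 0x1021, initial value 0, no reflection.
        crc = 0
        for byte in data:
            crc ^= byte << 8
            for _ in range(8):
                crc = ((crc << 1) ^ 0x1021) if crc & 0x8000 else (crc << 1)
                crc &= 0xFFFF
        return crc

    def key_slot(key: str, slots: int = 16384) -> int:
        return crc16_xmodem(key.encode()) % slots

    print(key_slot("user:1001"))   # every client maps the key to the same slot
    ```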