J2Cache: Open Source China's two-level cache in practice

This article is compiled from Sweet Potato's talk at the 58th OSC Source Innovation Conference (Fuzhou stop).

Summary

J2Cache is a two-level cache framework developed by Open Source China and known for its high performance. How does it differ from using Ehcache or Redis alone, and how does it perform? Sweet Potato, founder of the Open Source China community, gives a detailed analysis.

Video of the talk:
t.cn/RSlMZEX

Open Source China today

Our site currently ranks around 800th globally, with more than 800,000 unique IPs per day, about 10 million page views, and more than 50 million dynamic requests.

Open Source China uses caching in several ways. These are probably not specific to us: any website applies caching in the following scenarios.

Object caching

Object caching means, for example, retrieving a user's details by user ID. That is easy enough to understand.

List caching

For example, the news we posted today appears on a list page, and the cached list stores only IDs. To render the page, we get the channel's list of IDs and then look up the detailed content of each news item by its ID. When a news story changes, only one object cache entry is affected: clear that entry and update it.
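The split between a list of IDs and per-object entries can be sketched as follows. This is a minimal illustration, not Open Source China's code: the class `NewsCacheSketch`, its fields, and the channel name `today` are hypothetical, and plain maps stand in for real cache regions.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical names throughout; plain maps stand in for real cache regions.
final class NewsCacheSketch {
    // Object cache: news ID -> news content.
    final Map<Long, String> objectCache = new HashMap<>();
    // List cache: channel name -> ordered list of news IDs only.
    final Map<String, List<Long>> listCache = new HashMap<>();

    // Build a list page by resolving each cached ID through the object cache.
    List<String> listPage(String channel) {
        List<String> page = new ArrayList<>();
        for (Long id : listCache.getOrDefault(channel, List.of())) {
            String news = objectCache.get(id);
            if (news != null) {
                page.add(news);
            }
        }
        return page;
    }

    // Editing a story replaces one object-cache entry; the ID list is untouched.
    void updateNews(long id, String newContent) {
        objectCache.remove(id);          // clear the stale entry
        objectCache.put(id, newContent); // store the updated one
    }

    public static void main(String[] args) {
        NewsCacheSketch cache = new NewsCacheSketch();
        cache.objectCache.put(1L, "First story");
        cache.objectCache.put(2L, "Second story");
        cache.listCache.put("today", List.of(1L, 2L));
        cache.updateNews(2L, "Second story (edited)");
        System.out.println(cache.listPage("today"));
    }
}
```

Because the list holds IDs rather than copies of the stories, an edit never has to hunt down every list that mentions the story.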

Page fragment cache

A page such as the home page holds a lot of content, so a piece of its rendered HTML can be kept in memory and output quickly.

Cached data is cleared in three ways: automatically on expiry, by the program, and manually.

Ehcache cache framework

Open Source China is developed in Java. Java has a well-known caching framework, Ehcache, which is memory-based and very fast. Because not all data fits in memory, it can spill some of the data to disk, which makes it a two-level cache in its own right. It also supports a cache structure with multiple regions: users can be one region and news or posts another, each with its own invalidation policy. Ehcache also provides a listener interface for cached data, so the application can be notified when something happens to a cached entry. Ehcache supports cluster deployment as well.

J2Cache

Open Source China was founded in 2011 and launched its website in 2008. The site ran this way for two or three years, but then the data grew so fast that problems appeared. The first is that a single node cannot handle high concurrency. The scariest problem is that we update the application frequently, and the Java process has to be restarted on every update. After a restart, the entire Ehcache is wiped. If traffic floods in right after a restart, the database will crash fairly quickly.

So why didn't we choose Ehcache's clustering solution? Because when data is stored on one node, it is also copied to the other nodes over the network, which causes a lot of network overhead. We did not rely on Redis alone either, because reading cached data from it over the network is slow compared with reading from local memory.

Our solution is to combine Ehcache and Redis so that each covers the other's weakness. We try to fetch data from the local machine first, and only then from Redis.

Ehcache + Redis = J2Cache.

This combination ensures high performance. Data is mostly served from Ehcache, and Redis effectively relieves the pressure that an application cold start puts on the database. There is not much data transfer between the application and Redis, because heavy transfer happens only during a cold start.

J2Cache data reading process

Every read goes to Ehcache first, because Ehcache lives in the application's own memory. If the data is there, it is returned directly. If not, Redis is read over the network. If Redis has the data, it is inserted into Ehcache and returned. If Redis does not have it either, the database is read, the result is inserted into both Ehcache and Redis, and the data is returned.
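The read path described above can be sketched as follows. This is a minimal illustration under stated assumptions, not J2Cache's actual implementation: in-memory maps stand in for Ehcache (level 1) and Redis (level 2), a function stands in for the database, and all names (`TwoLevelReadSketch`, `level1`, `level2`, `databaseReads`) are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Sketch only: maps stand in for Ehcache (level 1) and Redis (level 2).
final class TwoLevelReadSketch {
    final Map<String, String> level1 = new ConcurrentHashMap<>(); // local memory
    final Map<String, String> level2 = new ConcurrentHashMap<>(); // shared, reached over the network
    final Function<String, String> database;                      // slowest source of truth
    int databaseReads = 0; // demonstration counter, not thread-safe

    TwoLevelReadSketch(Function<String, String> database) {
        this.database = database;
    }

    String get(String key) {
        // 1. Local memory first.
        String value = level1.get(key);
        if (value != null) {
            return value;
        }
        // 2. On a local miss, ask the shared cache and backfill level 1.
        value = level2.get(key);
        if (value != null) {
            level1.put(key, value);
            return value;
        }
        // 3. On a miss in both levels, read the database and fill both caches.
        databaseReads++;
        value = database.apply(key);
        if (value != null) {
            level2.put(key, value);
            level1.put(key, value);
        }
        return value;
    }

    public static void main(String[] args) {
        TwoLevelReadSketch cache = new TwoLevelReadSketch(key -> "row for " + key);
        cache.get("user:1");   // falls through to the database
        cache.level1.clear();  // simulate an application restart
        cache.get("user:1");   // served from level 2, no database read
        System.out.println("database reads: " + cache.databaseReads);
    }
}
```

Clearing `level1` in `main` mimics a restart: the second read is answered by the shared level, which is the cold-start protection the talk describes.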

J2Cache data update process

Clearing data starts on the node where the change is made, which then broadcasts a clear command to the other nodes. When another node receives this command, it clears the data from its own Ehcache. Clearing on one node and broadcasting to the rest keeps cached data synchronized across the cluster.
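A minimal sketch of this clear-and-broadcast flow follows. It is an illustration, not J2Cache's code: the `Bus` class is a hypothetical in-process stand-in for whatever channel carries the clear command between nodes, maps stand in for Ehcache and Redis, and removing the key from the shared level on eviction is an assumption the text does not spell out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class EvictBroadcastSketch {
    // Hypothetical stand-in for the cluster channel that carries clear commands.
    static final class Bus {
        private final List<Node> nodes = new ArrayList<>();

        void join(Node node) {
            nodes.add(node);
        }

        // Deliver the clear command to every node except the sender.
        void broadcastEvict(Node sender, String key) {
            for (Node node : nodes) {
                if (node != sender) {
                    node.onEvict(key);
                }
            }
        }
    }

    static final class Node {
        final Map<String, String> level1 = new HashMap<>(); // this node's local cache
        private final Map<String, String> level2;           // cache shared by all nodes
        private final Bus bus;

        Node(Map<String, String> level2, Bus bus) {
            this.level2 = level2;
            this.bus = bus;
        }

        // Called on the node where the data changed.
        void evict(String key) {
            level1.remove(key);
            level2.remove(key);            // assumption: the shared copy is dropped too
            bus.broadcastEvict(this, key); // tell the other nodes
        }

        // Called when a clear command arrives from another node.
        void onEvict(String key) {
            level1.remove(key);
        }
    }

    public static void main(String[] args) {
        Map<String, String> shared = new HashMap<>();
        Bus bus = new Bus();
        Node a = new Node(shared, bus);
        Node b = new Node(shared, bus);
        bus.join(a);
        bus.join(b);
        shared.put("user:1", "old");
        a.level1.put("user:1", "old");
        b.level1.put("user:1", "old");
        a.evict("user:1");
        System.out.println("node b still holds the key: " + b.level1.containsKey("user:1"));
    }
}
```

Only the key travels over the channel, not the data itself, which is what keeps the network cost lower than replicating every cached value to every node.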

Serialization library selection

Because cached data is transferred over the network to Redis, we require all cached objects to be serializable. We ended up using FST because it is fast, the serialized output is small, and, most importantly, it is not intrusive to your project.
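To show what the serializable requirement means for a cached object, here is a small round trip from object to bytes and back. Note the swap: it uses the JDK's built-in serialization rather than FST, so that it runs without any external library. The `User` class and the helper names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Sketch using JDK serialization (not FST) to keep the example dependency-free.
final class SerializationSketch {
    // Hypothetical cached object; it must be Serializable to cross the network.
    static final class User implements Serializable {
        private static final long serialVersionUID = 1L;
        final long id;
        final String name;

        User(long id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    // Turn an object into the bytes that would be sent to the remote cache.
    static byte[] serialize(Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Rebuild the object from bytes read back from the remote cache.
    static Object deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] payload = serialize(new User(1L, "alice"));
        User copy = (User) deserialize(payload);
        System.out.println(copy.id + " " + copy.name + " (" + payload.length + " bytes)");
    }
}
```

The payload size printed by `main` is the kind of figure that matters when comparing serializers, since every cached value sent to Redis pays that cost on the network.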

That’s all I want to share today, thank you!
