This article comes from the Tencent Cloud Technology Salon. The theme of this salon was "Practice of personalized teaching technology in online education".

Speaker: Zhou Jinmin | Joined Tencent after graduating in 2011 and is currently a senior engineer in the online education backend center, with many years of Linux backend development experience. He is mainly responsible for the backend system architecture design and R&D management of two products, Tencent Classroom and Penguin Tutoring.

Today’s talk is divided into three parts. First, I will introduce our two products, Tencent Classroom and Penguin Tutoring. Second, I will talk about the classroom live broadcast system and how we put Tencent Cloud into practice. Third, I will cover the design of the room system for online education and the optimizations we have made to it over the years.

What is Tencent Classroom? (ke.qq.com)

Tencent Classroom is an online education platform launched by Tencent. It brings together a large number of high-quality institutions and lecturers and offers a wide range of high-quality courses, including exam certification, IT, English training, and more. It provides a one-stop online teaching and interactive learning experience, which is what distinguishes the platform. We have a huge amount of resources and traffic: more than 40,000 institutions have registered, offering more than 200,000 online courses across a wide variety of subjects. The cumulative number of class attendances has reached 30 million, with more than one million per week, and a single course now reaches 60,000 concurrent online users. Tencent Classroom also has a rich set of teaching tools that help institutions and teachers manage their teaching more effectively.

What is Penguin Tutoring? (fudao.qq.com)

Penguin Tutoring is an online learning platform for primary, middle, and high school students. It has a team of more than 200 teachers, covers 10 grades, and serves 2 million student users. The teachers at Penguin Tutoring are all full-time teachers from Tsinghua and Peking Universities, and were once top scorers in their subjects in the college entrance examination. Penguin Tutoring provides complete and detailed teaching tasks, from before class to after class, to help students improve across the board. It supports live 1-on-1 Q&A tutoring, and professional after-class teachers correct students’ homework. It also offers a variety of learning aids, such as online interactive answering, photo-based homework correction using image recognition, and red envelope grabbing, to increase students’ interest in learning. Both Tencent Classroom and Penguin Tutoring support multi-terminal learning, so students can study on a computer or a mobile phone.

Overall technical framework of Tencent Classroom

The following is the overall backend technical architecture of Tencent Classroom. The clients include PCQQ, the H5 client, the app, the standalone PC version, and the Mac client. The channel layer holds the corresponding channels. The access layer is a unified layer that converts the protocols used by each client into internal communication protocols. In the logic layer, the education backend is divided into three parts: the institution platform (data, order payment, and activity operations), the live broadcast system, and the room system. Below these are the core modules that provide basic functions, such as the data center, order system, basic data, and personal center. The storage layer uses several stores: MySQL, CKV, ES, and Redis.

The following is a case of Tencent online education combined with Tencent Cloud’s interactive live streaming technology. Specifically, it works like this: when a teacher broadcasts live, the stream goes into Tencent Cloud’s interactive live streaming system over Tencent Cloud’s private UDP-based protocol, so students can interact with teachers in real time.

Tencent Cloud’s interactive live streaming system forwards the audio and video streams to Tencent Cloud’s live streaming system as a bypass stream. The access module of the live streaming system receives the stream and does two things.

First, it forwards the stream from the interactive live streaming system to the global transcoding module. The global transcoding module supports many customizable parameters. For example, transcoding at various resolutions can be specified. It can add the Tencent Classroom logo as a watermark, supports video encryption to protect the video material and its intellectual property, and supports encapsulation in a variety of protocols.

Second, the global transcoding module provides the audio and video streams to the cloud stream-mixing function, which is also a capability of the live streaming system. How does this function work? It supports a variety of preset stream-mixing templates. Our education products have several layouts, for example PPT, picture-in-picture, and students raising their hands to go on the mic, so we preset several stream-mixing templates. When the teacher switches between PPT and picture-in-picture, or when a student raises a hand to go on the mic, the client sends a change signal to the education service, which calls the interface of the cloud stream-mixing service to change the mixing mode. Following the stream-mixing template, the cloud stream-mixing module pulls the specified audio and video streams from the live streaming access module and combines them into a single stream, which is handed to global transcoding to be transcoded again. After transcoding is complete, the stream is handed to the distribution module, which caches the audio and video files on the CDN module.

The CDN module of Tencent Cloud has more than 1,000 acceleration nodes in China and more than 200 elsewhere in the world. Tencent Classroom’s teachers and students are spread all over the world, so acceleration through these CDN nodes gives classroom live broadcasts stable, high-quality playback.

During live broadcasts, the H5 and PC web clients play the mixed stream, which reduces bandwidth usage on mobile phones and offers better compatibility and stability.

At the same time, Tencent Cloud’s VOD system records the stream in real time. Recording produces multiple file fragments. Afterwards, the fragments are pulled back locally and the video is re-aligned. The layout is adjusted, including the resolution, and frames are inserted to fill gaps in the stream. Where the video was interrupted, a static picture with the class logo is inserted to keep audio and video consistent. Finally, a customized playback file is regenerated, so that students have a consistent viewing experience when they watch the playback.
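
The gap-filling step described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not how the VOD system is actually implemented: the Shard type, the plan_timeline function, and the use of millisecond offsets on a shared class timeline are all hypothetical names and choices made for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Shard:
    """A recorded fragment covering [start_ms, end_ms) of the class timeline (hypothetical)."""
    start_ms: int
    end_ms: int


def plan_timeline(shards: List[Shard], class_end_ms: int) -> List[Tuple[str, int, int]]:
    """Return ordered ("video" | "filler", start_ms, end_ms) segments.

    Gaps between fragments are marked as "filler", standing in for the
    static picture with the class logo, so audio and video stay aligned.
    """
    timeline: List[Tuple[str, int, int]] = []
    cursor = 0  # end of the timeline covered so far
    for shard in sorted(shards, key=lambda s: s.start_ms):
        if shard.start_ms > cursor:
            # The stream was interrupted here: fill the gap.
            timeline.append(("filler", cursor, shard.start_ms))
        start = max(shard.start_ms, cursor)  # skip any overlap with earlier fragments
        if shard.end_ms > start:
            timeline.append(("video", start, shard.end_ms))
            cursor = shard.end_ms
    if cursor < class_end_ms:
        timeline.append(("filler", cursor, class_end_ms))
    return timeline
```

Given fragments covering 0 to 1000 ms and 1500 to 3000 ms of a 3500 ms class, the plan contains two video segments and two filler segments, one for the interruption and one for the missing tail.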

Finally, the recording method. On the student side, recording and playback can use either a single stream or multiple streams. On the PC client, home bandwidth is stable enough, so we use multi-stream recording and playback. Delivering multiple streams to the client makes a lot of customization possible. For example, during playback the PC client can dynamically switch between picture-in-picture and PPT, and adjust the resolution or layout.

The above is how our online education classroom uses Tencent Cloud’s audio and video products. We use a full set of solutions provided by Tencent Cloud, including interactive live streaming, live streaming, video on demand, CDN acceleration, and video encryption. With this full set of solutions, there is very little we need to build ourselves. Inside Tencent, the CTO and the company’s general office are also vigorously promoting a company-wide strategy of moving technology to the cloud. We are actively responding to that call and moving to the cloud, which greatly reduces the operation and maintenance costs of audio and video and lets us focus more energy on polishing the product, so that teachers and students have a better experience.

Room System Architecture

The architecture of the room system is divided into several modules. First, there is the access layer, where the inbound and outbound proxy services accept client requests and forward them to the heartbeat service and the member list for state storage. Then comes the logic layer, which includes many classroom interaction functions, such as students raising their hands, the chat area, red envelope interaction, and so on. Each interaction generates a message, which is delivered to the other teachers and students through the push proxy service.

We faced three major challenges while developing the room system. First, the heartbeat service and the member list handle very high concurrency, so how do we scale them out smoothly, keep them reliable and available, and handle disaster recovery? Second, how do we make the message push service general-purpose and easy to use while ensuring message reliability? Third, how do we prevent service overload caused by message storms when the number of concurrent messages is high? Here are our optimization practices for these three modules.

Heartbeat Service Optimization

Before optimization, the heartbeat service used MSGQ access (MSGQ is an in-memory message queue developed by Tencent) and a two-machine, single-process model. This scheme was simple to implement and met the need to launch quickly in the early stage of the project. However, as the number of users grew, it could no longer support the business, and it had several problems. First, it loses messages easily at peak times: messages may be lost when the system message volume surges, when the MSGQ buffer queue reaches its maximum, or when the MSGQ service is abnormal. Second, the logic is complex and not general-purpose; for example, timeout checks and multi-terminal login require custom development. Third, because the heartbeat service has to keep the current heartbeat state, the existing synchronization scheme cannot be scaled out. We made new optimizations to address these three problems.

Here is the optimized version. We split the heartbeat service into two layers, heart_proxy and heart_SVR, with routing handled by the L5 service. L5 is Tencent’s internal routing decision service.

The new heartbeat service addresses four issues: 1. service scalability; 2. service reliability; 3. general-purpose design; 4. kick detection that avoids false kicks.

  • Scalability: we introduced a consistent hashing algorithm. Each request is hashed on business bid + roomID + user QQ number and routed to the designated heart_SVR for storage.
  • Reliability: the L5 service detects the failure of a heartbeat storage node within one minute and removes that node’s IP address. heart_proxy then routes requests to the remaining healthy nodes, so heartbeats are maintained.
  • General-purpose design: we have two products now, Classroom and Tutoring, and there may be new products in the future. Every product needs a heartbeat function, so we built the heartbeat service as a general-purpose service from the start and introduced a business number, bid, so that multiple products can share the same heartbeat service.
  • Kick detection: an important question is how to avoid false kicks. If a student is suddenly kicked out in the middle of a class, that leads to complaints and a poor experience. Therefore, when a heartbeat storage node detects that a user has timed out, it sends a reverse check to heart_proxy. heart_proxy broadcasts the query to all heart_SVR storage nodes to confirm that the kick is safe.
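
The consistent hashing route described in the first item can be sketched as below. This is a minimal illustration, not the actual heart_proxy code: the ConsistentHashRing class, the virtual-node count, the SHA-256 hash, and the key format are all assumptions made for this example. Removing a node here stands in for L5 dropping a failed storage node.

```python
import bisect
import hashlib
from typing import Dict, List


class ConsistentHashRing:
    """Hash ring with virtual nodes (hypothetical helper, not Tencent's implementation)."""

    def __init__(self, nodes: List[str], replicas: int = 100) -> None:
        self._replicas = replicas          # virtual nodes per storage node
        self._keys: List[int] = []         # sorted ring positions
        self._owners: Dict[int, str] = {}  # ring position -> storage node
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(value: str) -> int:
        return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)

    def add_node(self, node: str) -> None:
        for i in range(self._replicas):
            point = self._hash(f"{node}#{i}")
            if point not in self._owners:
                bisect.insort(self._keys, point)
            self._owners[point] = node

    def remove_node(self, node: str) -> None:
        # Only keys that mapped to the removed node move to other nodes.
        self._keys = [k for k in self._keys if self._owners[k] != node]
        self._owners = {k: v for k, v in self._owners.items() if v != node}

    def route(self, bid: int, room_id: int, uin: int) -> str:
        """Pick the storage node for one user's heartbeat (bid + roomID + user QQ number)."""
        if not self._keys:
            raise RuntimeError("no heartbeat storage nodes available")
        point = self._hash(f"{bid}:{room_id}:{uin}")
        # First ring position clockwise from the key, wrapping around at the end.
        idx = bisect.bisect_right(self._keys, point) % len(self._keys)
        return self._owners[self._keys[idx]]
```

The point of the ring is that when one storage node is removed, only the users that were stored on it are re-routed; everyone else keeps hitting the same node, so their heartbeat state stays valid.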

Member list service optimization

The initial version of the member list used a proxy plus CKV storage. CKV is an in-house key-value database developed by Tencent. The member list of each room was serialized with PB and stored in CKV. When it needed to be read, it was read as a whole and then deserialized. This approach has several problems.

  • First, when a room has too many users, frequent access to the room generates a large amount of network IO. One user’s information is currently 40 bytes, so a room of 10,000 people has a member list of roughly 400 KB. At peak times, the resulting network IO makes the network card the bottleneck.
  • Second, CAS conflicts are serious. Frequent updates and deletions against the CKV service cause a large number of CAS conflicts, which hurts service performance.
  • Third, read and write performance is low. CKV get operations and deserialization of long lists take a long time. Overall throughput is below 200 QPS, and timeouts are fairly severe. Our user numbers have grown a lot and this architecture can no longer meet demand, so we built a new version.
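
The network IO figure in the first item is simple arithmetic, worked through below. The function names are hypothetical, serialization overhead is ignored, and the read rate passed in is an assumption for illustration only; the talk does not state the actual read rate.

```python
def member_list_bytes(users: int, bytes_per_user: int = 40) -> int:
    """Size of one serialized member list, ignoring serialization overhead."""
    return users * bytes_per_user


def read_bandwidth_mbit_per_s(users: int, reads_per_second: int,
                              bytes_per_user: int = 40) -> float:
    """Traffic generated if every read fetches the whole list from storage."""
    total_bytes = member_list_bytes(users, bytes_per_user) * reads_per_second
    return total_bytes * 8 / 1_000_000
```

For a room of 10,000 users the list is 400,000 bytes. If that whole list were read 200 times per second, the reads alone would amount to 640 Mbit/s for a single room, which shows why whole-list reads put pressure on the network card.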

What did we do with the new version? We investigated several storage options.

  • The first is Redis, which supports storing member lists and leaderboards. However, our member list has custom query requirements, such as querying by version number or by platform type, and needs to support paging. Redis data structures do not support this well.
  • The second is Grocery, a storage solution developed internally at Tencent. It uses multi-level hashing and fits member list storage and filtering very well. The problem is that its list length is limited: it supports at most 5,000 members and is currently used for QQ lists. The number of people in a single room of ours has already passed the ten-thousand level, so this scheme is not suitable for now.
  • The third is hierarchical ILST, which is used in microblogging scenarios and supports very long lists. It is mainly used for offline PaaS workloads and has high latency.
  • The fourth is PHXKV, a solution from WeChat. It is based on the Paxos protocol, with strong consistency and good performance. I am still evaluating this scheme.

In the end, we adopted an in-memory storage design with multi-master synchronization. It has the following characteristics:

  • Fully in-memory structure: a two-level hash_map built on C++ STL standard data structures, running in multi-master mode. It follows an eventual consistency model: a write returns as soon as it is applied, which gives good performance, and heartbeats repair any divergence. A single set reaches up to 30,000 QPS.
  • Scalability: data is partitioned by business bid and requests are forwarded by the proxy, which keeps the service scalable.
  • The in-memory list data is dumped to a virtual disk every 3 minutes to ensure fast recovery after a restart.
  • Consistency:
  • 1) Heartbeat repair ensures eventual consistency.
  • 2) An API with strong consistency is also provided, implemented by reading from multiple nodes.
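
A two-level in-memory map with filtered, paged queries and heartbeat repair might look like the sketch below. It is written in Python for illustration although the talk describes a C++ hash_map, and the key layout ((bid, room_id) at the first level, user number at the second), the Member fields, and all method names are assumptions rather than the real service’s interface. Multi-master synchronization and the periodic dump are left out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Member:
    uin: int                # user QQ number
    platform: str           # e.g. "pc", "ios" (illustrative values)
    version: int
    last_heartbeat: float


class MemberListStore:
    """In-memory member list: (bid, room_id) -> {uin -> Member}."""

    def __init__(self) -> None:
        self._rooms: Dict[Tuple[int, int], Dict[int, Member]] = {}

    def upsert(self, bid: int, room_id: int, member: Member) -> None:
        self._rooms.setdefault((bid, room_id), {})[member.uin] = member

    def remove(self, bid: int, room_id: int, uin: int) -> None:
        room = self._rooms.get((bid, room_id))
        if room is not None:
            room.pop(uin, None)
            if not room:
                del self._rooms[(bid, room_id)]

    def on_heartbeat(self, bid: int, room_id: int, member: Member) -> None:
        """Heartbeat repair: a heartbeat re-creates an entry this replica missed."""
        self.upsert(bid, room_id, member)

    def query(self, bid: int, room_id: int,
              predicate: Optional[Callable[[Member], bool]] = None,
              page: int = 0, page_size: int = 50) -> List[Member]:
        """Filtered, paged read, e.g. by platform type or version number."""
        room = self._rooms.get((bid, room_id), {})
        # Sorting per call keeps the sketch short; a real service would avoid it.
        matched = [m for m in sorted(room.values(), key=lambda m: m.uin)
                   if predicate is None or predicate(m)]
        start = page * page_size
        return matched[start:start + page_size]
```

Because each member is its own entry, adding or removing one user touches only that entry instead of rewriting a whole serialized list, and custom filters such as "all PC users, page 2" become a plain in-memory scan.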

Message push optimization

Before optimization, each logic service pulled the member list itself and implemented a separate push proxy for each channel. The drawbacks were highly redundant code, no unified interface, and strong coupling between modules. This scheme could not keep up with our need for rapid iterative development, so we reworked it.

The new message push scheme has three main parts: push_proxy as a unified access layer, Tencent Cloud’s CKafka as a message buffer, and Redis as asynchronous message storage.

push_proxy supports a variety of customized push modes: unicast, broadcast, and pushes targeted at a specific person, a specific role, or a specific client type.

We made several performance optimizations to the push service:

  • push_proxy uses a process-level cache, rather than a global one, to cache the member lists of large rooms (>2000 people) with a 2 s timeout. Member lists of small rooms are fetched in real time so that pushes stay real-time.
  • Message classification: messages are divided into important and non-important. When there are few messages, they are pushed directly. When there are too many, a message merging mechanism is used. This is a capability of msg_center, which accumulates messages in real time and switches to merged pushes once a threshold is exceeded.
  • In addition, the teacher client can mute users directly, to prevent malicious users from flooding the chat.
  • To keep chat messages from scrolling too fast and to improve the user experience, the room is divided into small groups and large groups.
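
The process-level cache from the first item can be sketched as follows. The 2000-member threshold and 2 s timeout come from the talk; the MemberListCache class, the fetch callback, and the injectable clock are hypothetical names introduced only for this example.

```python
import time
from typing import Callable, Dict, List, Tuple

LARGE_ROOM_THRESHOLD = 2000  # rooms larger than this are cached
CACHE_TTL_SECONDS = 2.0      # cached lists expire after 2 s


class MemberListCache:
    """Per-process cache of large-room member lists (illustrative only)."""

    def __init__(self, fetch: Callable[[int], List[int]],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch  # loads the member list from the member list service
        self._clock = clock
        self._cache: Dict[int, Tuple[float, List[int]]] = {}

    def get(self, room_id: int) -> List[int]:
        now = self._clock()
        cached = self._cache.get(room_id)
        if cached is not None and now - cached[0] < CACHE_TTL_SECONDS:
            return cached[1]
        members = self._fetch(room_id)
        if len(members) > LARGE_ROOM_THRESHOLD:
            self._cache[room_id] = (now, members)
        else:
            # Small rooms are always read fresh so pushes stay real-time.
            self._cache.pop(room_id, None)
        return members
```

With this shape, a burst of pushes to a 10,000-person room triggers at most one member list fetch per process every 2 s, while small rooms still see membership changes immediately.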

What do we do for disaster recovery and degradation?

  • Under normal circumstances, teachers can proactively mute the chat.
  • In addition, global flow control is supported, limiting the total number of messages pushed downstream per timestamp (in seconds). Whenever the QPS of push_proxy exceeds 3,000, it notifies edu_msg_center to reduce the chat message frequency, which ensures that important messages are still pushed normally.
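
A per-second counter of this kind can be sketched as below. Note the simplification: the talk describes push_proxy feeding back to edu_msg_center so that chat frequency is reduced upstream, whereas this sketch simply rejects non-important messages locally once the limit for the current second is reached. The PushFlowControl class and its parameters are hypothetical.

```python
import time
from typing import Callable


class PushFlowControl:
    """Counts pushes per one-second timestamp bucket (illustrative only)."""

    def __init__(self, limit_per_second: int = 3000,
                 clock: Callable[[], float] = time.time) -> None:
        self._limit = limit_per_second
        self._clock = clock
        self._window = -1  # timestamp (in seconds) of the current bucket
        self._count = 0    # messages pushed in the current bucket

    def allow(self, important: bool) -> bool:
        second = int(self._clock())
        if second != self._window:
            # A new second has started: reset the counter.
            self._window = second
            self._count = 0
        if not important and self._count >= self._limit:
            return False  # over the limit: shed chat traffic
        # Important messages always pass, and still count toward the total.
        self._count += 1
        return True
```

The design choice illustrated here is that degradation is selective: ordinary chat is throttled first so that important messages keep flowing during a message storm.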

Finally, message reliability practices:

  • Messages are pushed in real time and saved to Redis asynchronously, and the Kafka message queue is used to relieve the pressure of write fan-out.
  • What if the client loses messages? Each user’s push messages carry a strictly incrementing MSGID. The client maintains the largest MSGID it has received and a list of missing MSGIDs. Every 2 s it reports the missing MSGID list and the largest MSGID received, and the backend returns the lost messages. This solves the message loss problem.
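
The client-side bookkeeping for lost messages can be sketched as follows. It assumes that each user’s MSGIDs start at 1 and increase by exactly 1, which is what makes a gap detectable; the talk only says the IDs are strictly incrementing. The MsgTracker class and its method names are hypothetical, and the 2 s reporting timer is left to the caller.

```python
from typing import List, Set, Tuple


class MsgTracker:
    """Tracks the largest MSGID seen and the MSGIDs still missing (illustrative)."""

    def __init__(self) -> None:
        self.max_msgid = 0
        self.missing: Set[int] = set()

    def on_message(self, msgid: int) -> bool:
        """Record a pushed or re-delivered message; return False for duplicates."""
        if msgid > self.max_msgid:
            # Every ID skipped between the old maximum and this one is missing.
            self.missing.update(range(self.max_msgid + 1, msgid))
            self.max_msgid = msgid
            return True
        if msgid in self.missing:
            self.missing.discard(msgid)  # a lost message has been recovered
            return True
        return False

    def build_report(self) -> Tuple[int, List[int]]:
        """Payload for the periodic report: (largest MSGID, missing MSGIDs)."""
        return self.max_msgid, sorted(self.missing)
```

A client that receives MSGIDs 1, 2, and 5 would report a maximum of 5 with 3 and 4 missing; once the backend re-delivers 3, the next report lists only 4.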

For more details, see: Zhou Jinmin, "Technology practice of Tencent online education video interactive live broadcast room" (PDF).



Published by the Tencent Cloud+ Community with the author’s authorization. Original link: https://cloud.tencent.com/developer/article/1154667?fromSource=waitui
