We support many real-time interactive scenarios. In some of them, users need not only to interact in real time but also to record the interaction. So what are the characteristics of a good recording solution?

Before we answer that question, let’s talk about why customers use recording. Generally speaking, there are three main reasons for users to use the recording function:

1. Quality inspection. In education, for example, course quality is checked by reviewing recordings; in social live streaming or financial double-recording scenarios, the recorded video must be retained for compliance review.

2. Dispute handling. Scenarios such as education, medical care, and audio and video customer service need to keep records to deal with possible disputes. Here the core requirement of the recording scheme is content integrity: losing even a few seconds of video cannot be tolerated.

3. Playback. In education and live-streaming scenarios, for example, users want to watch a replay afterwards. This is the main reason recording is used in most real-time interactive scenarios.

So what is a good recording solution in this situation?

Recording schemes can be measured from five dimensions:

  • Recording effect: The real interactive scene must be reproduced, including audio and video, courseware, whiteboard, chat messages, and other elements. At the same time, recording should have no negative impact on the hosts' audio and video interaction.
  • Integration difficulty: As simple as possible, preferably with no development work.
  • Waiting time: The shorter, the better; ideally the recording can be played back as soon as recording ends.
  • File compatibility: Playable on any platform and in any browser.
  • Ease of file migration: Downloading and uploading files should be simple, so that recordings are easy to manage.

Two mainstream schemes are used to meet the recording requirements of these scenarios.

I. Audio and video, whiteboard, and other elements are recorded separately, then synchronized during playback

The main idea is to record audio and video, the whiteboard, courseware (PPT), and chat content separately, play each back separately after recording, and keep the playback positions aligned by timestamp. The advantage of this scheme is that the whiteboard, courseware, and chat content are all played back as data, so the original interaction is preserved. For example, the viewer can page through the PPT independently, which gives good flexibility. But the disadvantages are also obvious:

1. Integration is difficult. Audio and video recording, whiteboard recording, and chat recording each have to be developed. In particular, the elements must be aligned by timestamp during playback, so considerable development effort is needed to achieve good synchronization.

2. Playback compatibility is limited. The recording can only be played back in a dedicated player and does not work well with mainstream players.

3. Long waiting time. In order to solve playback compatibility problems, it is often necessary to convert the recording to a complete MP4 file offline, which takes a long time and incurs additional transcoding costs.
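The timestamp alignment that this scheme depends on can be sketched roughly as follows. This is a minimal illustration, not any vendor's player: the `TimedEvent` type, the event kinds, and the function names are all hypothetical, and it assumes every recorded element carries a millisecond offset from the start of the recording.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class TimedEvent:
    ts_ms: int    # offset from the start of the recording, in milliseconds
    kind: str     # e.g. "slide", "whiteboard", "chat" (illustrative labels)
    payload: str


def events_due(events, prev_ms, now_ms):
    """Return the events with prev_ms < ts_ms <= now_ms.

    `events` must already be sorted by ts_ms. The player would call this
    on every tick with the previous and current video positions.
    """
    timestamps = [e.ts_ms for e in events]
    lo = bisect_right(timestamps, prev_ms)
    hi = bisect_right(timestamps, now_ms)
    return events[lo:hi]


def current_slide(events, now_ms):
    """Return the latest "slide" payload at or before now_ms (used after a seek)."""
    slide = None
    for e in events:
        if e.ts_ms > now_ms:
            break
        if e.kind == "slide":
            slide = e.payload
    return slide


timeline = [
    TimedEvent(0, "slide", "page 1"),
    TimedEvent(1500, "whiteboard", "stroke A"),
    TimedEvent(4000, "chat", "hello"),
    TimedEvent(9000, "slide", "page 2"),
]

# Video position advanced from 2.0 s to 5.0 s: replay whatever became due.
for event in events_due(timeline, 2000, 5000):
    print(event.kind, event.payload)
```

Even this small sketch needs separate handling for normal playback and for seeking, which hints at why good synchronization takes real development effort.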

II. Local client screen recording

Whether it’s recording on a local client or streaming the screen to the cloud for recording through screen sharing, the essence is capturing the screen content on the user’s local client. The advantage of this scheme is that what you see is what you get, and the playback can be consistent with the real interactive scene. But its disadvantages are also quite obvious:

1. The local user's RTC experience is affected. Capturing the screen on the local device consumes a large amount of computing resources, and uploading the capture in real time also occupies the host's uplink bandwidth. Both degrade the local user's audio and video call and can even cause stuttering and blurry video, which is a fatal flaw for a real-time interactive scenario.

2. Integration is difficult. Developers have to build the recording on the client side, handle local file storage and upload, and often deal with complex audio mixing as well, so the integration threshold is very high.

Is there a better alternative to these two main ideas?

Agora offers a third new idea: page recording

Page recording records audio and video, the whiteboard, courseware, and chat messages simultaneously on the server by rendering the web page, reproducing the real interactive scene. It works as follows: the developer initiates a recording request through the RESTful API, passing the URL of the page to be recorded as a request parameter. The Agora recording service opens the web page, records the rendered page in real time, generates an MP4 file, and uploads it to the specified third-party cloud storage platform. Detailed documentation for page recording is available from Agora.
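To make the request flow above more concrete, here is a rough sketch of how a client might assemble such a request. The endpoint, the JSON field names, and the `build_start_request` helper are all placeholders invented for illustration; they are not Agora's documented API, so consult the official reference for the real parameters. The sketch only builds the request and does not send it.

```python
import json
from urllib.parse import urlparse

# Hypothetical endpoint: a placeholder, not Agora's documented URL.
RECORDING_ENDPOINT = "https://recording.example.com/v1/page-recordings"


def build_start_request(page_url, storage_bucket, width=1280, height=720):
    """Build (but do not send) a request asking a service to record a web page.

    All field names below are illustrative assumptions.
    """
    parsed = urlparse(page_url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"page_url must be an absolute http(s) URL: {page_url!r}")
    body = {
        "pageUrl": page_url,                      # the page the service should open
        "output": {"format": "mp4", "width": width, "height": height},
        "storage": {"bucket": storage_bucket},    # third-party cloud storage target
    }
    return {
        "method": "POST",
        "url": RECORDING_ENDPOINT,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(body),
    }


request = build_start_request("https://classroom.example.com/room/42", "my-recordings")
print(request["method"], request["url"])
print(request["body"])
```

The point of the sketch is that the integration surface is a single HTTP request carrying a page URL and a storage target, with no client-side capture code.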

Measured against the dimensions listed earlier, page recording compares with the two previous schemes as follows:

  • Integration: Recording is started by a request through the RESTful API, which is easy to use.
  • Recording effect: WYSIWYG. Audio and video, whiteboard, courseware, and chat messages are all recorded at the same time, with no additional bandwidth or performance overhead for participants. The recording process does not affect the RTC experience of any host or audience member.
  • Waiting time and compatibility: MP4 files are generated in real time and are compatible with all mainstream players.
  • File migration: Files are simple to download, which makes recordings easy to manage.

Page recording can also record any web page, so developers who build their own RTC functionality with WebRTC or other solutions can use it as well.