CVTE ENOW Big Front End

Company official website: CVTE (Guangzhou Shiyuan Electronics)

Team: the ENOW team of the CVTE Software Platform Center for Future Education


Preface

React Fiber uses requestIdleCallback (rIC) for runtime optimization and time slicing. Do you know how it schedules work? What is time-division multiplexing (TDM)? Where did this idea of scheduling come from? And are there lessons in it for us as developers?

We all know that doing heavy work on the browser’s main thread easily causes dropped frames and stuttering. The cause is that the browser re-renders the page when notified by the VSync signal, but if the JS task in the current event frame runs too long, no time is left in the frame and re-rendering is blocked. The human eye perceives roughly 60 frames per second, so FPS = 60 (about 16.7ms per frame) is generally considered the threshold for a good, smooth user experience, while FPS < 24 generally feels sluggish to the user, since about 24 frames per second is the lower bound at which the eye reads motion as continuous.

VSync signal

When one frame is drawn and the next frame is ready to be drawn, the display sends a vertical synchronization signal, or VSync. The monitor usually refreshes at a fixed rate, which is the frequency at which the VSync signal is generated.

Browser refresh rate (frames)

In the browser, a frame should perform the following tasks:

  1. Handle input events
  2. Run the event callbacks
  3. Begin the frame
  4. Run requestAnimationFrame (rAF) callbacks
  5. Perform layout and style calculation
  6. Paint
  7. Run requestIdleCallback (rIC) callbacks

Therefore, if a task runs for too long, it blocks the tasks of the next frame, which shows up as stuttering and dropped frames.
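As a rough illustration of the idea (a minimal sketch, not how any particular library does it), the following TypeScript splits a long job into chunks with requestIdleCallback so rendering gets a chance to run between chunks; `items` and `process` are hypothetical placeholders:

```typescript
// A minimal sketch of time slicing in the browser: split a long job into
// chunks so that each frame keeps its ~16.7ms budget. `items` and
// `process` are hypothetical placeholders, not any library's API.
function processInIdleTime<T>(items: T[], process: (item: T) => void): void {
  let i = 0;
  function work(deadline: IdleDeadline): void {
    // Keep working while the browser reports idle time left in this frame.
    while (i < items.length && deadline.timeRemaining() > 1) {
      process(items[i++]);
    }
    if (i < items.length) {
      // Not finished: yield so the browser can render, continue next idle period.
      requestIdleCallback(work);
    }
  }
  requestIdleCallback(work);
}
```

Because each chunk yields before the frame budget is exhausted, layout and paint are never starved for long. (In browsers that lack requestIdleCallback, a setTimeout fallback plays the same role.)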

Is there a similar phenomenon in the computer world?

Single core

Modern computers usually have multi-core/multi-CPU architectures. But suppose we want to squeeze as much performance as possible out of one core; consider the extreme scenario, which also mirrors the single-threaded execution of JS in a browser: a single core that has to handle multiple processes.

So how do you provide the illusion of many CPUs?

Time sharing

Running one process for a time slice and then switching to another provides the illusion of multiple virtual CPUs, a practice known as time sharing: a resource is used by one entity for a short period, then by another, and so on.

Separation of concerns

Actually, sharing a resource or switching between processes costs performance, but we can keep our focus on how the sharing is implemented and keep concerns separate. Just as with the CAP theorem in distributed systems: Consistency, Availability, and Partition tolerance cannot all be achieved at once, usually only two of the three. But because our concerns differ, we can break the problem apart and prioritize what we need to focus on.

Of course, the simplest implementation we can think of is something like polling: when the time slice is up, simply move on to the next job, with no further judgment. Such pure time sharing is really just naive code.

We need smarter strategies: scheduling algorithms that make decisions inside the operating system. First, we make the following assumptions about the processes running in the system:

  1. Each job runs for the same amount of time;
  2. All jobs arrive at the same time;
  3. Once started, each job runs until it is finished;
  4. All jobs use only the CPU (no I/O);
  5. The running time of each job is known.

And we introduce a performance metric, turnaround time: T(turnaround) = T(completion) - T(arrival).

First in, first out (FIFO)

If there are three jobs A, B, and C, each running for 10s, then from a linear, one-task-at-a-time perspective the average turnaround time is (10 + 20 + 30) / 3 = 20s.

Now let’s take it to the extreme: B and C run for 10s each, while A runs for 100s and happens to be first in the queue. The average turnaround time becomes (100 + 110 + 120) / 3 = 110s.

This problem is called the convoy effect: jobs that need a resource only briefly get queued behind heavyweight resource consumers.

Shortest Job First (SJF)

Could this be solved by prioritizing the shortest tasks first?

If you take the extreme example above, the average turnaround time becomes (10 + 20 + 120) / 3 = 50s.
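These averages are easy to verify with a toy calculation; a minimal sketch in TypeScript, assuming all jobs arrive at t = 0 and run to completion without preemption:

```typescript
// Toy turnaround-time calculation for non-preemptive scheduling,
// assuming all jobs arrive at t = 0 (so turnaround = completion time).
function averageTurnaround(runTimes: number[]): number {
  let clock = 0;
  let total = 0;
  for (const t of runTimes) {
    clock += t;     // this job finishes at the current clock
    total += clock; // turnaround = completion - arrival (arrival is 0)
  }
  return total / runTimes.length;
}

console.log(averageTurnaround([100, 10, 10])); // FIFO order: 110
console.log(averageTurnaround([10, 10, 100])); // SJF order: 50
```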

When all tasks arrive at the same time, shortest job first is the optimal algorithm. But in the real world we cannot be sure that the shortest task arrives first: if a long job is already running when shorter jobs arrive, they still have to wait behind it, and we are back to the convoy problem.

Wouldn’t it be better if we used a preemptive approach?

Shortest Time-to-Completion First (STCF)

When the first task is already running and the other two tasks arrive, we compute each task’s remaining completion time and schedule the task with the shortest time to completion first. Suppose A arrives at t = 0 needing 100s, and B and C arrive at t = 10 needing 10s each: A runs for 10s, is preempted so that B and C can finish, then resumes.

The average turnaround time is ((120 - 0) + (20 - 10) + (30 - 10)) / 3 = 50s.

We can see that the preemptive Shortest Time-to-Completion First (STCF) algorithm achieves a better average turnaround time.
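A toy simulation of the preemptive variant, under the same assumptions as above (A arrives at t = 0 needing 100s; B and C arrive at t = 10 needing 10s each); the `Job` shape is made up for illustration:

```typescript
// Toy STCF simulation: every time unit, run the arrived, unfinished job
// with the least remaining time, preempting whatever ran before.
interface Job { name: string; arrival: number; remaining: number; done?: number }

function averageTurnaroundSTCF(jobs: Job[]): number {
  let clock = 0;
  let finished = 0;
  while (finished < jobs.length) {
    const ready = jobs.filter(j => j.arrival <= clock && j.remaining > 0);
    if (ready.length === 0) { clock++; continue; } // idle until the next arrival
    const next = ready.reduce((a, b) => (a.remaining <= b.remaining ? a : b));
    next.remaining--; // run the chosen job for one time unit
    clock++;
    if (next.remaining === 0) { next.done = clock; finished++; }
  }
  return jobs.reduce((sum, j) => sum + (j.done! - j.arrival), 0) / jobs.length;
}

console.log(averageTurnaroundSTCF([
  { name: 'A', arrival: 0, remaining: 100 },
  { name: 'B', arrival: 10, remaining: 10 },
  { name: 'C', arrival: 10, remaining: 10 },
])); // 50
```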

Yes, shortest time to completion is a good strategy from the system’s point of view. For users, however, the focus should be on interactivity. Similarly, on the front end, do users really care when your application finishes all its work? They care more about the smoothness of the interaction. So we need a new metric, response time: T(response) = T(first run) - T(arrival).

Based on the new metric, we find that the shortest-time-to-completion-first algorithm is not friendly: the third task must wait until the first two tasks have finished running. This is bad for the user experience.

So how do we build a response-time sensitive scheduler?

Round-robin

Instead of running a task to completion, run it for one time slice and then switch to the next task in the run queue. The time-slice length in an operating system is a multiple of the timer-interrupt period.

Timer Interrupt

In essence, a timer interrupt is just a periodic signal, entirely hardware behavior, that triggers the CPU to execute an interrupt service routine; for convenience we also call this routine the timer interrupt.

Take an example: jobs A, B, and C each need 5s and arrive at the same time, with a 1s time slice. The average response time of RR is (0 + 1 + 2) / 3 = 1s, while the average response time of the SJF algorithm is (0 + 5 + 10) / 3 = 5s. RR performs much better on response time.

If the time slice is shortened to 0.5s, the average response time of RR drops to (0 + 0.5 + 1) / 3 = 0.5s!

Yes, in theory. In practice, however, switching between processes has a real cost, so we need to weigh the time-slice length against the context-switch cost it has to amortize.
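The response-time arithmetic above can be reproduced in one small function; a sketch assuming all jobs arrive at t = 0 and each needs more than one slice:

```typescript
// With round-robin and simultaneous arrivals, the i-th job first runs
// after i full time slices, so its response time is i * slice.
function rrAverageResponse(jobCount: number, slice: number): number {
  let total = 0;
  for (let i = 0; i < jobCount; i++) {
    total += i * slice; // first-run time of job i
  }
  return total / jobCount;
}

console.log(rrAverageResponse(3, 1));   // (0 + 1 + 2) / 3 = 1
console.log(rrAverageResponse(3, 0.5)); // (0 + 0.5 + 1) / 3 = 0.5
```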

With this overview of process scheduling in the operating system, let’s go back to React Fiber’s scheduling design at runtime.

React Fiber scheduling

React used to execute work linearly, traversing the VDOM recursively on the native execution stack. Pushing and popping work on the execution stack is essentially the run-to-completion, first-in-first-out approach described above.
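Conceptually (a simplified sketch, not React’s actual source), that traversal is plain recursion, and recursion cannot pause halfway; `VNode` and `reconcile` are made-up names:

```typescript
// A conceptual sketch of the Stack Reconciler: recursion over the VDOM.
// Once started it cannot be interrupted; the whole tree is processed in
// one synchronous pass on the native call stack.
interface VNode { type: string; children: VNode[] }

function reconcile(node: VNode): void {
  // ...diff this node against the previous tree (omitted)...
  for (const child of node.children) {
    reconcile(child); // the work stays on the call stack until it completes
  }
}
```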

The Stack Reconciler works in an uninterruptible way, while the new Fiber Reconciler schedules more intelligently. In Fiber, the core features can be summarized as follows:

  1. Large tasks can be interrupted and split apart;
  2. Time slices are used to cut work into pieces;
  3. Higher-priority tasks can preempt execution.

React Fiber’s runtime is essentially requestIdleCallback (rIC) plus priority preemption. We won’t go deeper into the details here.
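To make the idea concrete, here is a heavily simplified sketch of such a work loop, combining rIC-driven time slicing with priority preemption; `FiberUnit`, `workQueue`, and `scheduleWork` are made-up names, not React’s API:

```typescript
// A heavily simplified, Fiber-style work loop: process small units of
// work until the frame's idle budget is nearly gone, always picking the
// highest-priority unit first. An illustration, not React's code.
interface FiberUnit {
  priority: number;
  run: () => FiberUnit | null; // one small step; may return follow-up work
}

const workQueue: FiberUnit[] = [];

function workLoop(deadline: IdleDeadline): void {
  while (workQueue.length > 0 && deadline.timeRemaining() > 1) {
    // Priority preemption: the highest-priority unit always runs next.
    workQueue.sort((a, b) => b.priority - a.priority);
    const unit = workQueue.shift()!;
    const next = unit.run();
    if (next) workQueue.push(next);
  }
  if (workQueue.length > 0) {
    // Budget exhausted: yield to the browser and resume when it is idle.
    requestIdleCallback(workLoop);
  }
}

function scheduleWork(unit: FiberUnit): void {
  workQueue.push(unit);
  requestIdleCallback(workLoop); // for brevity, duplicate callbacks are not deduplicated
}
```

The real scheduler is far more involved, but the shape of the loop is the same: small units of work, a yield check, and a priority decision on every slice.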

The logic of slicing by the VSync signal is actually the same as that of timer interrupts: both are driven by periodic hardware signals. Task splitting and priority scheduling then happen within each time slice.

This is analogous to RR + STCF process scheduling in the operating system, except that “shortest completion time” is replaced by business-defined priority: within a single time slice, the highest-priority task preempts.

React Fiber also has other optimization strategies, such as a timeout mechanism, interruption, suspension and resumption, and Concurrent mode, which we will cover in detail another time.

In fact, whether with round-robin in the OS or time slicing in React Fiber, total task execution time is longer than under the simplest algorithm (ignoring I/O and other optimizations), because switching and scheduling decisions have a cost. However, thanks to the separation of concerns, we can focus on the property we most need to deliver.

Conclusion

The lesson worth drawing is that for large tasks, we should start from our own concerns and look for a relatively reasonable solution: take the task apart properly and execute the decomposed pieces in a smarter way. And always stand on the shoulders of giants when examining problems and searching for solutions.

References

  • Operating Systems: Three Easy Pieces (Introduction to Operating Systems), Part I: Virtualizing the CPU