This is day 9 of my participation in the November writing challenge. See details: The Last Writing Challenge of 2021.

The basic concept

Dedicated worker threads can be called background scripts. All aspects of a JavaScript thread, including lifecycle management, code paths, and input/output, are controlled by the scripts provided when the thread is initialized. The script can also request other scripts, but a thread always starts with a script.

Create a dedicated worker thread

The most common way to create a dedicated worker thread is to load a JavaScript file. The file path is passed to the Worker constructor, which loads the script asynchronously in the background and initializes the worker thread. The file path passed to the constructor can take several forms.

The following code demonstrates how to create an empty dedicated worker thread:

emptyWorker.js
// Empty JS worker thread file
main.js
console.log(location.href); // "https://example.com/"
const worker = new Worker(location.href + 'emptyWorker.js');
console.log(worker); // Worker {}

This example is very simple, but covers a few basic concepts.

  • The emptyWorker.js file is loaded from an absolute path. Depending on the structure of your application, absolute URLs are often unnecessary.
  • This file is loaded in the background, and the worker thread is initialized completely independent of main.js.
  • The worker thread itself exists in a separate JavaScript environment, so main.js must use the Worker object as a proxy to communicate with it. In the example above, that object is assigned to the worker variable.

  • Although the corresponding Worker thread may not yet exist, the Worker object is already available in the original environment.

The previous example can be modified to use a relative path. However, this requires emptyWorker.js to sit in the same directory as the page that runs main.js, because the path is resolved against the page URL:

const worker = new Worker('./emptyWorker.js');
console.log(worker); // Worker {}
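Relative worker paths are resolved against a base URL (for a page script, the URL of the page). The following is a minimal sketch of that resolution using the standard URL constructor. The helper name resolveWorkerUrl and the /app/ directory are assumptions made for illustration and do not appear in the example above.

```javascript
// Hypothetical helper: resolve a worker script path against a base URL,
// which is what happens to the string passed to new Worker().
function resolveWorkerUrl(scriptPath, baseUrl) {
  return new URL(scriptPath, baseUrl).href;
}

// Page served from the site root, as in the example above.
console.log(resolveWorkerUrl('./emptyWorker.js', 'https://example.com/'));
// https://example.com/emptyWorker.js

// Same relative path, but the page lives in a subdirectory (assumed).
console.log(resolveWorkerUrl('./emptyWorker.js', 'https://example.com/app/index.html'));
// https://example.com/app/emptyWorker.js
```

Logging the resolved URL before constructing the worker can be a quick way to track down a path that fails to load.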

Worker thread safety restrictions

Script files for worker threads can only be loaded from the same origin as the parent page. Loading a worker thread script file from another origin causes an error like this:

// Try to create a worker thread from https://example.com/worker.js
const sameOriginWorker = new Worker('./worker.js');
// Try to create a worker thread from https://untrusted.com/worker.js
const remoteOriginWorker = new Worker('https://untrusted.com/worker.js');
// Error: Uncaught DOMException: Failed to construct 'Worker':
// Script at https://untrusted.com/worker.js cannot be accessed
// from origin https://example.com

Note that although a worker thread cannot be created from a cross-origin script, this does not prevent it from executing scripts from other origins. Inside the worker thread, scripts from other origins can be loaded using importScripts().

Worker threads created from a loaded script are not subject to the document’s Content Security Policy, because they run in a context separate from the parent document. However, if the script loaded by a worker thread has a globally unique identifier (as it does when loaded from a Blob), it is subject to the parent document’s Content Security Policy.
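The same-origin rule described at the start of this section can be checked before calling the constructor. The sketch below compares the origin of the resolved script URL with the origin of the page. The helper name isSameOrigin is an assumption, and the check ignores special cases such as blob: and data: URLs.

```javascript
// Hypothetical helper: does the script URL share an origin with the page?
// Relative script paths are resolved against the page URL first.
function isSameOrigin(scriptUrl, pageUrl) {
  const script = new URL(scriptUrl, pageUrl);
  return script.origin === new URL(pageUrl).origin;
}

console.log(isSameOrigin('./worker.js', 'https://example.com/'));
// true
console.log(isSameOrigin('https://untrusted.com/worker.js', 'https://example.com/'));
// false
```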

Working with Worker objects

The Worker() constructor returns a Worker object that serves as the connection point for communicating with the newly created dedicated worker thread. It can be used to pass information between the worker thread and the parent context, and to capture events emitted by the dedicated worker thread. Take care to manage every Worker object created with Worker(): a worker thread is not garbage collected until it is terminated, and there is no programmatic way to recover a reference to a Worker object once it has been lost.

The Worker object supports the following event handler properties.

  • onerror: the handler assigned to this property is called when an event of type ErrorEvent occurs in the worker thread.
    • This event occurs when an error is thrown in the worker thread.
    • This event can also be handled with worker.addEventListener('error', handler).
  • onmessage: the handler assigned to this property is called when an event of type MessageEvent occurs in the worker thread.
    • This event occurs when the worker thread sends a message to the parent context.
    • This event can also be handled with worker.addEventListener('message', handler).
  • onmessageerror: the handler assigned to this property is called when an error event of type MessageEvent occurs in the worker thread.
    • This event occurs when a message is received that cannot be deserialized.
    • This event can also be handled with worker.addEventListener('messageerror', handler).

The Worker object also supports the following methods.

  • postMessage(): used to send information to the worker thread through asynchronous message events.
  • terminate(): terminates the worker thread immediately. The worker thread gets no chance to clean up, and its script stops abruptly.
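The handlers listed above can also be registered with addEventListener(). The following sketch attaches all three in one call. The helper name attachWorkerListeners is an assumption, and a plain EventTarget stands in for a real Worker so that the sketch can run outside a browser.

```javascript
// Hypothetical helper: attach message, error, and messageerror listeners.
// In a browser, `target` would be a Worker instance.
function attachWorkerListeners(target, { onMessage, onError, onMessageError }) {
  if (onMessage) target.addEventListener('message', (e) => onMessage(e.data));
  if (onError) target.addEventListener('error', onError);
  if (onMessageError) target.addEventListener('messageerror', onMessageError);
  return target;
}

// Stand-in for a Worker (assumption, for illustration only).
const fakeWorker = new EventTarget();
const received = [];
attachWorkerListeners(fakeWorker, { onMessage: (data) => received.push(data) });

// Simulate the worker thread sending 'foo' to the parent context.
fakeWorker.dispatchEvent(Object.assign(new Event('message'), { data: 'foo' }));
console.log(received.length, received[0]);
// 1 foo
```

Unlike the on* properties, addEventListener() allows several independent listeners for the same event.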

DedicatedWorkerGlobalScope

Inside a dedicated worker thread, the global scope is an instance of DedicatedWorkerGlobalScope. Because it inherits from WorkerGlobalScope, it contains all of that interface’s properties and methods. The worker thread can access the global scope through self.

globalScopeWorker.js

console.log('inside worker:', self);

main.js

const worker = new Worker('./globalScopeWorker.js');
console.log('created worker:', worker);
// created worker: Worker {}
// inside worker: DedicatedWorkerGlobalScope {}

As shown in this example, the console object in both the top-level script and the worker thread writes to the browser console, which is useful for debugging. Because worker threads have a non-negligible startup delay, the worker thread’s log is printed after the main thread’s log even though the Worker object already exists.

Notice that two separate JavaScript threads are both sending messages to a console object, which then serializes the messages and prints them in the browser console. The browser receives messages from two different JavaScript threads and outputs them in whatever order it sees fit. For this reason, take care when using logs to determine the order of operations in a multithreaded application.

DedicatedWorkerGlobalScope adds the following properties and methods on top of WorkerGlobalScope.

  • name: an optional string identifier that can be supplied to the Worker constructor.
  • postMessage(): the counterpart of worker.postMessage(), used to send a message from inside the worker thread to the parent context.
  • close(): the counterpart of worker.terminate(), used to terminate the worker thread from the inside. The worker thread gets no chance to clean up, and its script stops abruptly.
  • importScripts(): used to import any number of scripts into the worker thread.

Dedicated worker threads and implicit MessagePorts

The Worker object and DedicatedWorkerGlobalScope of a dedicated worker thread share some handlers and methods with the MessagePort interface: onmessage, onmessageerror, close(), and postMessage(). This is no accident, because a dedicated worker thread implicitly uses a MessagePort to communicate between the two contexts.

The Worker object in the parent context and DedicatedWorkerGlobalScope each effectively wrap a MessagePort and expose the corresponding handlers and methods in their own interfaces. In other words, messages are still sent through a MessagePort, just not directly.

There are also inconsistencies, such as the start() and close() conventions. Dedicated worker threads send queued messages automatically, so start() is unnecessary. In addition, close() in its MessagePort sense makes little sense for a dedicated worker thread, because closing the MessagePort would isolate the worker thread. Therefore, calling close() inside the worker thread (or terminate() outside it) not only closes the MessagePort but also terminates the thread.

Lifecycle of dedicated worker threads

Calling the Worker() constructor marks the beginning of a dedicated worker thread’s life. The call initiates the request for the worker script and returns a Worker object to the parent context. Although the Worker object is immediately available in the parent context, the associated worker thread may not have been created yet because of network latency in requesting the script and initialization delays.

In general, a dedicated worker thread can be informally described as being in one of three states: initializing, active, or terminated. These states are not visible to other contexts. Although the Worker object may exist in the parent context, there is no way to determine whether its worker thread is currently initializing, active, or terminated. In other words, a Worker object associated with an active dedicated worker thread cannot be distinguished from one associated with a terminated dedicated worker thread.
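Because the platform does not expose these states, an application that needs them has to record what the parent context can observe itself. The wrapper below is a rough sketch under stated assumptions: the class name TrackedWorker and the convention that the worker script posts the string 'ready' once it runs are both invented for illustration, and a self-termination through self.close() would go unnoticed.

```javascript
// Hypothetical wrapper that records an informal lifecycle state.
class TrackedWorker {
  constructor(worker) {
    this.worker = worker;
    this.state = 'initializing';
    // Assumed convention: the worker script posts 'ready' once it is running.
    worker.addEventListener('message', (e) => {
      if (e.data === 'ready' && this.state === 'initializing') {
        this.state = 'active';
      }
    });
  }

  terminate() {
    this.state = 'terminated';
    this.worker.terminate();
  }
}

// Stand-in for a Worker so the sketch runs outside a browser (assumption).
class FakeWorker extends EventTarget {
  terminate() {}
}

const fake = new FakeWorker();
const tracked = new TrackedWorker(fake);
console.log(tracked.state); // initializing

// Simulate the worker script announcing that it is running.
fake.dispatchEvent(Object.assign(new Event('message'), { data: 'ready' }));
console.log(tracked.state); // active

tracked.terminate();
console.log(tracked.state); // terminated
```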

At initialization, messages to be sent to the worker thread can be queued, although the worker thread script has not yet been executed. These messages wait for the worker thread’s state to become active before they are added to its message queue. The following code demonstrates this process.

initializingWorker.js
self.addEventListener('message', ({data}) => console.log(data));
main.js
const worker = new Worker('./initializingWorker.js');
// The worker may still be initializing,
// but the postMessage() data is handled correctly
worker.postMessage('foo');
worker.postMessage('bar');
worker.postMessage('baz');
// foo
// bar
// baz

Once created, a dedicated worker thread exists for the entire life of the page unless it terminates itself (self.close()) or is terminated externally (worker.terminate()). Even after the worker script has finished running, the worker’s environment continues to exist. As long as the worker thread exists, the Worker object associated with it is not garbage collected.

Both self-termination and external termination eventually perform the same worker thread termination routine. Consider the following example, where the worker thread performs self-termination in the middle of sending two messages:

closeWorker.js
self.postMessage('foo');
self.close();
self.postMessage('bar');
setTimeout(() => self.postMessage('baz'), 0);
main.js
const worker = new Worker('./closeWorker.js');
worker.onmessage = ({data}) => console.log(data);
// foo
// bar

Although close() was called, the worker thread clearly did not stop executing immediately. Here close() tells the worker to discard all tasks in its event loop and prevents further tasks from being added, which is why “baz” is not printed. The worker thread does not have to stop synchronously, so “bar”, which is handled in the parent context’s event loop, is still printed. Let’s look at an example of external termination.

terminateWorker.js
self.onmessage = ({data}) => console.log(data);
main.js
const worker = new Worker('./terminateWorker.js');
// Give the worker thread 1000 milliseconds to initialize
setTimeout(() => {
 worker.postMessage('foo');
 worker.terminate();
 worker.postMessage('bar');
 setTimeout(() => worker.postMessage('baz'), 0);
}, 1000);
// foo

Here, the parent context first sends a message containing “foo” to the worker thread, which can be processed before the external termination takes effect. Once terminate() is called, the worker thread’s message queue is cleared and locked, which is why only “foo” is printed.

Note that close() and terminate() are idempotent. Both methods simply mark the worker for teardown, so calling them more than once does no harm.

Throughout its life, a dedicated worker thread is associated with exactly one web page (the Web Workers specification calls this a document). Unless explicitly terminated, a dedicated worker thread exists as long as its associated document exists. If the browser leaves the page (through navigation, closing the tab, or closing the window), it marks the worker threads associated with the page as terminated, and their execution stops immediately.

Configuring Worker options

The Worker() constructor allows an optional configuration object as a second argument. The configuration object supports the following properties.

  • name: a string identifier that can be read inside the worker thread through self.name.
  • type: indicates how the loaded script is run. It can be “classic” or “module”. “classic” executes the script as a regular script, and “module” executes the script as a module.
  • credentials: when type is “module”, specifies how credentials are sent when fetching the worker thread’s module script. The value can be “omit”, “same-origin”, or “include”. These options are the same as the credentials options of fetch(). If the option is not specified, the default is “same-origin”, and it only has an effect when type is “module”.

Note that some modern browsers do not fully support module worker threads or may need to modify flags to do so.
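Since support varies, it can help to detect module worker support before relying on it. The sketch below uses a commonly seen heuristic rather than an official API: it passes an options object whose type property is a getter and checks whether the constructor reads it. The function name supportsModuleWorkers is an assumption, and the result should be treated as a hint only.

```javascript
// Heuristic: a runtime that understands the `type` option reads it
// from the options object while constructing the worker.
function supportsModuleWorkers() {
  let readType = false;
  const probe = {
    get type() {
      readType = true;
      return 'module';
    }
  };
  try {
    // An empty data: URL avoids a network request; the worker is
    // terminated right away.
    new Worker('data:,', probe).terminate();
  } catch (err) {
    // No Worker constructor, or construction failed: keep the flag as is.
  }
  return readType;
}

console.log(supportsModuleWorkers());
```

When the function returns false, falling back to a classic worker script is one possible strategy.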

Create worker threads from inline JavaScript

A worker thread needs to be created from a script file, but that doesn’t mean the script has to be loaded from a remote resource. Dedicated worker threads can also be created from inline scripts through Blob object URLs. This allows the worker thread to be initialized more quickly because there is no network latency. An example of creating a worker thread inline is shown below.

// Create a string of JavaScript code to execute
const workerScript = `
 self.onmessage = ({data}) => console.log(data);
`;
// Create a Blob instance from the script string
const workerScriptBlob = new Blob([workerScript]);
// Create an object URL from the Blob instance
const workerScriptBlobUrl = URL.createObjectURL(workerScriptBlob);
// Create a dedicated worker thread from the object URL
const worker = new Worker(workerScriptBlobUrl);
worker.postMessage('blob worker script');
// blob worker script

In this example, the Blob is created from the script string, then the object URL is created from the Blob, and finally the object URL is passed to the Worker() constructor. The constructor creates the dedicated worker thread just as it would from a file.
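Writing worker code inside a string is awkward, so a variation on the same idea is to write it as a function and serialize the function. The sketch below is one way to do that. The helper names toWorkerSource and createInlineWorker are assumptions, and because only the text of the function is copied, the function cannot use variables from the enclosing scope.

```javascript
// Hypothetical helper: wrap a function's source text in an immediately
// invoked function expression so it runs when the worker starts.
function toWorkerSource(fn) {
  return `(${fn.toString()})();`;
}

// Hypothetical helper: Blob -> object URL -> Worker (browser only).
function createInlineWorker(fn) {
  const blob = new Blob([toWorkerSource(fn)], { type: 'text/javascript' });
  return new Worker(URL.createObjectURL(blob));
}

// The body of the worker, written as ordinary code.
function workerBody() {
  self.onmessage = ({ data }) => console.log(data);
}

console.log(toWorkerSource(workerBody));

// Only attempt to start the worker where a Worker constructor exists.
if (typeof Worker !== 'undefined') {
  createInlineWorker(workerBody).postMessage('blob worker script');
}
```

The object URL can be released with URL.revokeObjectURL() once it is no longer needed.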

Execute scripts dynamically in worker threads

Worker scripts need not be monolithic: arbitrary scripts can be loaded and executed programmatically with the importScripts() method. This method is available on the worker thread’s global object. It loads the scripts and executes them synchronously in the order given. For example, the following loads and executes two scripts:

main.js
const worker = new Worker('./worker.js');
// importing scripts
// scriptA executes
// scriptB executes
// scripts imported
scriptA.js
console.log('scriptA executes');

scriptB.js
console.log('scriptB executes');
worker.js
console.log('importing scripts');
importScripts('./scriptA.js');
importScripts('./scriptB.js');
console.log('scripts imported');

The importScripts() method can accept any number of scripts as arguments. There is no limit to the order in which the browser downloads them, but they are executed in exactly the order in the argument list. Therefore, the following code has the same effect as before:

console.log('importing scripts');
importScripts('./scriptA.js', './scriptB.js');
console.log('scripts imported');

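Script loading is subject to the usual CORS restrictions, but scripts from any origin can be requested within the worker thread. The script import policy here is similar to dynamically loading scripts with a generated script tag. In this case, all imported scripts share the same scope, as the following code demonstrates: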

main.js
const worker = new Worker('./worker.js', {name: 'foo'});
// importing scripts in foo with bar
// scriptA executes in foo with bar
// scriptB executes in foo with bar
// scripts imported
scriptA.js
console.log(`scriptA executes in ${self.name} with ${globalToken}`);
scriptB.js
console.log(`scriptB executes in ${self.name} with ${globalToken}`);
worker.js
const globalToken = 'bar';
console.log(`importing scripts in ${self.name} with ${globalToken}`);
importScripts('./scriptA.js', './scriptB.js');
console.log('scripts imported'); 

Delegate tasks to child worker threads

Sometimes it may be necessary to create child worker threads within worker threads. When there are multiple CPU cores, parallel computation can be achieved by using multiple child worker threads. Think carefully before using multiple child threads to make sure that the investment in parallel computing really pays off, since running multiple child threads at the same time can be costly.

Creating a child worker thread is the same as creating a regular worker thread, except for path resolution. The script path of a child worker thread is resolved relative to the parent worker thread’s script rather than relative to the web page. Take a look at the following example (note the extra js directory):

main.js
const worker = new Worker('./js/worker.js');
// worker
// subworker
js/worker.js
console.log('worker');
const worker = new Worker('./subworker.js');
js/subworker.js
console.log('subworker');

The scripts for the top-level worker thread and for child worker threads must all be loaded from the same origin as the main page.

Handling worker thread errors

If a worker thread script throws an error, the worker thread sandbox prevents it from interrupting the parent thread’s execution. As shown in the following example, the try/catch block does not catch an error:

main.js
try {
 const worker = new Worker('./worker.js');
 console.log('no error');
} catch(e) {
 console.log('caught error');
}
// no error
worker.js
throw Error('foo');

However, the corresponding error event still bubbles into the global context of the Worker thread and can therefore be accessed by setting an error event listener on the Worker object. Here’s an example:

main.js
const worker = new Worker('./worker.js');
worker.onerror = console.log;
// ErrorEvent {message: "Uncaught Error: foo"}
worker.js
throw Error('foo');
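Since try/catch around the constructor cannot see errors thrown in the worker thread, one option is to turn the error event into a promise rejection. The sketch below sends a single message and settles on whichever of the message or error events arrives first. The helper name requestOnce and the ThrowingWorker stand-in (used so the sketch can run outside a browser) are assumptions made for illustration.

```javascript
// Hypothetical helper: post one message and settle on the first
// 'message' or 'error' event from the worker.
function requestOnce(worker, payload) {
  return new Promise((resolve, reject) => {
    const cleanup = () => {
      worker.removeEventListener('message', onMessage);
      worker.removeEventListener('error', onError);
    };
    const onMessage = (e) => { cleanup(); resolve(e.data); };
    const onError = (e) => { cleanup(); reject(new Error(e.message)); };
    worker.addEventListener('message', onMessage);
    worker.addEventListener('error', onError);
    worker.postMessage(payload);
  });
}

// Stand-in for a Worker whose script throws on every message.
class ThrowingWorker extends EventTarget {
  postMessage() {
    const event = Object.assign(new Event('error'), {
      message: 'Uncaught Error: foo'
    });
    queueMicrotask(() => this.dispatchEvent(event));
  }
}

requestOnce(new ThrowingWorker(), 'anything')
  .catch((err) => console.log('caught:', err.message));
// caught: Uncaught Error: foo
```

With a real worker, the same call would look like requestOnce(new Worker('./worker.js'), data), and the rejection could be handled with catch() or with try/catch around an await.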

Communicating with dedicated worker threads

Communication with worker threads is all done through asynchronous messages, but these messages can take many forms.

Using postMessage()

The simplest and most common form is to pass serialized messages using postMessage(). Let’s look at an example of calculating a factorial:

factorialWorker.js
function factorial(n) {
 let result = 1;
 while(n) { result *= n--; }
 return result;
}
self.onmessage = ({data}) => {
 self.postMessage(`${data}! = ${factorial(data)}`);
};
main.js
const factorialWorker = new Worker('./factorialWorker.js');
factorialWorker.onmessage = ({data}) => console.log(data);
factorialWorker.postMessage(5);
factorialWorker.postMessage(7);
factorialWorker.postMessage(10);
// 5! = 120
// 7! = 5040
// 10! = 3628800

For simple messages, passing data between the main thread and a worker thread with postMessage() is much like passing messages between two windows. The main difference is that there is no targetOrigin restriction: that restriction applies to Window.prototype.postMessage but does not exist for DedicatedWorkerGlobalScope.prototype.postMessage or Worker.prototype.postMessage. The reason for this convention is simple: the origin of a worker thread’s script is restricted to the origin of the main page, so there is nothing to filter.

Using MessageChannel

Whether in the main thread or in the worker thread, communicating through postMessage() involves calling a method on the global object and defining an ad hoc transport protocol. This process can be replaced by the Channel Messaging API, which explicitly establishes a communication channel between two contexts.

A MessageChannel instance has two ports, which represent the two communication endpoints. To let the parent page and the worker thread communicate through a MessageChannel, one port has to be passed to the worker thread, as shown below:

worker.js
// Store the global messagePort used by the listener
let messagePort = null;
function factorial(n) {
 let result = 1;
 while(n) { result *= n--; }
 return result;
}
// Add a message handler on the global object
self.onmessage = ({ports}) => {
 // Set the port only once
 if (!messagePort) {
  // Initialize the message port:
  // assign it to the variable and reset the listener
  messagePort = ports[0];
  self.onmessage = null;
  // From now on, messages travel through the port
  messagePort.onmessage = ({data}) => {
   messagePort.postMessage(`${data}! = ${factorial(data)}`);
  };
 }
};
main.js
const channel = new MessageChannel();
const factorialWorker = new Worker('./worker.js');
// Send the MessagePort object to the worker thread;
// the worker thread is responsible for initializing the channel
factorialWorker.postMessage(null, [channel.port1]);
// Receive the worker thread's responses on the other port
channel.port2.onmessage = ({data}) => console.log(data);
// Send data through the channel; the worker thread responds through it
channel.port2.postMessage(5);
// 5! = 120

Using BroadcastChannel

Same-origin scripts can send messages to and receive messages from each other through a BroadcastChannel. This channel type is simpler to set up and does not require passing ports around the way MessageChannel does. It can be used as follows:

main.js
const channel = new BroadcastChannel('worker_channel');
const worker = new Worker('./worker.js');
channel.onmessage = ({data}) => {
 console.log(`heard ${data} on page`);
}
setTimeout(() => channel.postMessage('foo'), 1000);
// heard foo in worker
// heard bar on page
worker.js
const channel = new BroadcastChannel('worker_channel'); 
channel.onmessage = ({data}) => {
 console.log(`heard ${data} in worker`);
 channel.postMessage('bar');
}

Here, the page waits one second before sending a message through the BroadcastChannel. Because this kind of channel has no notion of port ownership, a broadcast message is not processed if no entity is listening on the channel. In this case, without the setTimeout(), the message would be sent before the worker thread’s message handler is in place, because of the delay in initializing the worker thread.

Structured cloning algorithm

The structured clone algorithm can be used to share data between two independent contexts. The browser runs it behind the scenes when an object is passed through postMessage(): it traverses the object and creates a copy in the target context. Historically the algorithm could not be called directly, although newer runtimes also expose it as the global structuredClone() function. The following types are supported by the structured clone algorithm.

  • All primitive types except Symbol
  • Boolean object
  • String object
  • Date
  • RegExp
  • Blob
  • File
  • FileList
  • ArrayBuffer
  • ArrayBufferView
  • ImageData
  • Array
  • Object
  • Map
  • Set

There are several points to note about structured cloning algorithms.

  • After replication, changes to the object in the source context are not propagated to objects in the target context.
  • Structured cloning algorithms can recognize circular references contained in an object and do not iterate over the object indefinitely.
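  • Cloning a Function object or a DOM node throws a DataCloneError. Error objects also failed to clone in older browsers, although current browsers can clone the standard error types.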
  • The structured clone algorithm does not always create an identical copy:
    • Object property descriptors, getters, and setters are not cloned; default values are used where necessary.
    • The prototype chain is not cloned.
    • The RegExp.prototype.lastIndex property is not cloned.

The structured clone algorithm has a computational cost that grows with the complexity of the object. In practice, excessive copying should therefore be avoided as much as possible.
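Several of the points above can be checked without a worker in runtimes that expose the algorithm as the global structuredClone() function. The sketch below assumes such a runtime. The Point class and the variable names are invented for illustration.

```javascript
// Assumes a runtime that provides the global structuredClone() function.
class Point {
  constructor(x, y) {
    this.x = x;
    this.y = y;
  }
}

const source = { point: new Point(1, 2), tags: new Set(['a']) };
source.self = source; // circular reference

const copy = structuredClone(source);

// Changes made after cloning do not propagate to the copy.
source.point.x = 99;
console.log(copy.point.x); // 1

// The circular reference is recognized and preserved.
console.log(copy.self === copy); // true

// The prototype chain is not cloned: the copy is a plain object.
console.log(copy.point instanceof Point); // false

// Functions cannot be cloned.
let cloneError = null;
try {
  structuredClone({ fn() {} });
} catch (err) {
  cloneError = err;
}
console.log(cloneError.name); // DataCloneError
```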

Transferable objects

Transferable objects make it possible to transfer ownership from one context to another. This is particularly useful when copying large amounts of data between contexts would be impractical. Only the following types of objects are transferable:

  • ArrayBuffer
  • MessagePort
  • ImageBitmap
  • OffscreenCanvas
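The effect of a transfer is easiest to see on an ArrayBuffer: after the transfer, the original buffer is detached and reports a length of zero. The sketch below assumes a runtime whose structuredClone() function accepts the transfer option, which applies the same transfer semantics as the second argument of postMessage(); with a worker, the equivalent call would be worker.postMessage(buffer, [buffer]).

```javascript
// Assumes a runtime with structuredClone() and its `transfer` option.
const buffer = new ArrayBuffer(32);
console.log(buffer.byteLength); // 32

// Transfer ownership instead of copying the bytes.
const moved = structuredClone(buffer, { transfer: [buffer] });

console.log(buffer.byteLength); // 0 (the original is detached)
console.log(moved.byteLength);  // 32 (the new owner holds the memory)
```

Any later attempt to read or write through the detached buffer fails, so the sending context should treat the object as gone.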

SharedArrayBuffer

Note: Because of the Spectre and Meltdown vulnerabilities, all major browsers disabled SharedArrayBuffer in January 2018. Starting in 2019, some browsers began to gradually re-enable the feature.

Neither cloned nor transferred, a SharedArrayBuffer is an ArrayBuffer-like object that can be shared between different browser contexts. When a SharedArrayBuffer is passed to postMessage(), the browser passes only a reference to the original buffer. As a result, two different JavaScript contexts each hold their own reference to the same block of memory. Each context can modify this buffer just as it would a regular ArrayBuffer.
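The shared-memory behavior can be sketched in a single context by creating two typed array views over one SharedArrayBuffer. This is an illustration under stated assumptions: in real code the buffer would be passed to worker.postMessage() and the second view would be created inside the worker thread. The variable names are invented, and the Atomics calls, which are not covered above, are the standard way to make such updates safe when threads really run in parallel.

```javascript
// One shared block of memory, large enough for four 32-bit integers.
const shared = new SharedArrayBuffer(4 * Int32Array.BYTES_PER_ELEMENT);

// Two views stand in for the page and the worker thread (assumption).
const pageView = new Int32Array(shared);
const workerView = new Int32Array(shared);

// A write through one view is visible through the other.
pageView[0] = 1;
console.log(workerView[0]); // 1

// Atomics performs the read-modify-write as one indivisible step.
Atomics.add(workerView, 0, 5);
console.log(Atomics.load(pageView, 0)); // 6
```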

Thread pools

Because worker threads are expensive to start, in some cases it is worth keeping a fixed number of threads alive and assigning tasks to them as needed. A worker thread is marked as busy while it performs a calculation and only becomes ready for a new task once it notifies the thread pool that it is free. This group of active threads is called a “thread pool” or “worker thread pool”.

There is no definitive answer for how many threads a pool should have, but the number of cores available to the system, as returned by the navigator.hardwareConcurrency property, is a useful reference. Since the multithreading capabilities of each core cannot be known, it is best to use this number as an upper limit for the size of the thread pool.
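The upper-limit rule can be written as a small helper. The function name choosePoolSize and the fallback to a single core when navigator.hardwareConcurrency is unavailable are assumptions made for this sketch.

```javascript
// Hypothetical helper: cap the requested pool size at the reported core
// count, and never go below one worker.
function choosePoolSize(
  requested,
  cores = globalThis.navigator?.hardwareConcurrency ?? 1
) {
  return Math.max(1, Math.min(requested, cores));
}

console.log(choosePoolSize(8, 4)); // 4
console.log(choosePoolSize(2, 4)); // 2
console.log(choosePoolSize(0, 4)); // 1
```

Passing the core count explicitly, as in the calls above, keeps the helper easy to test. Omitting it uses the value reported by the runtime.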

One strategy for using a thread pool is to have every thread perform the same task, with a few parameters controlling what exactly is computed. With such a task-specific thread pool, a fixed number of worker threads can be allocated and supplied with parameters as needed. A worker thread receives these parameters, performs the time-consuming calculation, and returns the result to the thread pool. The thread pool can then assign further work to that worker thread. The following example builds a relatively simple thread pool, but it covers all the basic requirements of the idea above.

The first step is to define a TaskWorker class that extends the Worker class. The TaskWorker class is responsible for two things: keeping track of whether a thread is busy, and managing the information and events going into and out of the thread. In addition, each task passed to the worker thread is wrapped in a promise, which is then resolved or rejected as appropriate.
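The design described above can be sketched as follows. This is not the original listing, only a minimal illustration under stated assumptions: TaskWorker wraps a worker-like object instead of extending Worker so that the sketch can run with a stand-in, and the names WorkerPool, enqueue, assign, and createFakeWorker are invented here.

```javascript
// Tracks whether one worker is busy and settles the promise of the
// task it is currently running.
class TaskWorker {
  constructor(worker, notifyAvailable) {
    this.worker = worker;
    this.available = true;
    this.resolve = null;
    this.reject = null;
    this.notifyAvailable = notifyAvailable;
    worker.onmessage = ({ data }) => this.settle('resolve', data);
    worker.onerror = (e) => this.settle('reject', e);
  }

  // Finish the current task, mark the worker free, and tell the pool.
  settle(kind, value) {
    const callback = this[kind];
    this.resolve = this.reject = null;
    this.available = true;
    if (callback) callback(value);
    this.notifyAvailable();
  }

  // Mark the worker busy and send it the task parameters.
  dispatch({ resolve, reject, payload }) {
    this.available = false;
    this.resolve = resolve;
    this.reject = reject;
    this.worker.postMessage(payload);
  }
}

// Keeps a fixed set of workers and a queue of pending tasks.
class WorkerPool {
  constructor(size, createWorker) {
    this.queue = [];
    this.workers = Array.from({ length: size },
      () => new TaskWorker(createWorker(), () => this.assign()));
  }

  // Queue a task and return a promise for its result.
  enqueue(payload) {
    return new Promise((resolve, reject) => {
      this.queue.push({ resolve, reject, payload });
      this.assign();
    });
  }

  // Hand queued tasks to whichever workers are free.
  assign() {
    for (const taskWorker of this.workers) {
      if (this.queue.length === 0) return;
      if (taskWorker.available) taskWorker.dispatch(this.queue.shift());
    }
  }
}

// Stand-in worker that squares its input asynchronously (assumption).
const createFakeWorker = () => ({
  onmessage: null,
  postMessage(n) {
    setTimeout(() => this.onmessage({ data: n * n }), 0);
  }
});

const pool = new WorkerPool(2, createFakeWorker);
Promise.all([1, 2, 3, 4].map((n) => pool.enqueue(n)))
  .then((results) => console.log(results.join(', ')));
// 1, 4, 9, 16
```

In a browser, the factory would create real workers instead, for example () => new Worker('./task.js'), where task.js is a hypothetical script that posts one result for each message it receives.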

Hasty adoption of parallel computing is not necessarily the best approach. Tuning strategies for thread pools vary by computing task and system hardware.