Posts Tagged ‘web workers’

setInterval with 0ms delays within Web Workers

The 4ms minimum

Due to browser restrictions, you typically can’t have a setInterval call where the delay is set to 0, from MDN:

In modern browsers, setTimeout()/setInterval() calls are throttled to a minimum of once every 4 ms when successive calls are triggered due to callback nesting (where the nesting level is at least a certain depth), or after certain number of successive intervals.

I confirmed this by testing with the following code in Chrome and Firefox windows:

setInterval(() => { console.log(`now is ${Date.now()}`); }, 0);

In Firefox 72:

Firefox window setInterval with 0ms delay

In Chrome 79:

Chrome window setInterval with 0ms delay

The delay isn’t exact, but you can see that it typically comes out to around 4ms, as expected. However, things are a little different with web workers.

The delay with web workers

To see the timing behavior within a web worker, I used the following code for the worker:

const printNow = function() { console.log(`now is ${Date.now()}`); }; setInterval(printNow, 0); onmessage = function(_req) { };

… and created it via const worker = new Worker('worker.js');.

In Chrome, there’s no surprises, the behavior in the worker was similar to what it was in the window:

Chrome window setInterval with 0ms delay

Things get interesting in Firefox:

Firefox window setInterval with 0ms delay

Firefox starts grouping the log messages (the blue bubbles), as we get multiple calls to the function within the same millisecond. Firefox’s UI becomes unresponsive (which is weird and I didn’t expect as this is on an i7-4790K with 8 logical processors and there’s little interaction between the worker and the parent window), and there’s a very noticeable spike in CPU usage.

Takeaway

setInterval() needs a delay and you shouldn’t depend on the browser to set something reasonable. It would be nice if setInterval(.., 0) would tell the browser to execute as fast as reasonably possible, adjusting for UI responsiveness, power consumption, etc. but that’s clearly not happening here and as such it’s dangerous to have a call like this which may render the user’s browser unresponsive.

Prioritizing Web Worker Requests

Web workers handle incoming request messages via a function declared on the onmessage property of the worker. A, perhaps not so obvious, behavior here is that incoming requests are queued. If you’re doing something intensive within the worker (or the CPU is taxed b/c of other processes) the queuing behavior becomes more obvious, as you need to wait longer for a response from the worker due to the fact that previous requests need to be picked up and handled first. Here’s a simple worker that does some heavy lifting (at least for Chrome 79 on an i7-4790K):

const highLoadWork = function() { let x = 1000; for(let i=0; i<99999999; i++) { if(i % 2 === 0) { x += 1000; } else { x = Math.sqrt(x); } } return `hello, x=${x}`; }; onmessage = function(_req) { const requestNum = _req.data.requestNum; const workResult = highLoadWork(); postMessage( { "response": `responding to request ${requestNum}` } ); };

… and here’s what happens after making 12 requests to it in a loop:

I can’t say that this is a bad thing, this is generally sensible and what you’d expect to happen. That said, there are workloads where you may want to prioritize things differently. GraphPaper was one such case for me. Workers are handling things based on user interactions, the last request represents the current state of the world and is typically the only request that matters (any others from before can be thrown away). Unfortunately, this is no mechanism to interact with or re-prioritize messages in this underlying queue. However, you can offload the requests to a queue that the worker manages internally by itself. The onmessage() function simply puts the request data in a queue and we can use setInterval() to continuously call a function that pulls and processes requests from this queue. Here’s what the modified worker code looks like, where the latest request is prioritized (and previous ones are thrown away):

const highLoadWork = function() { let x = 1000; for(let i=0; i<99999999; i++) { if(i % 2 === 0) { x += 1000; } else { x = Math.sqrt(x); } } return `hello, x=${x}`; }; const requestQueue = []; const processRequestQueue = function() { if(requestQueue.length === 0) { return; } const lastRequest = requestQueue.pop(); requestQueue.length = 0; const requestNum = lastRequest.requestNum; const workResult = highLoadWork(); postMessage( { "response": `responding to request ${requestNum}` } ); }; setInterval(processRequestQueue, 4); onmessage = function(_req) { requestQueue.push(_req.data); };

… and here’s what happens after making 12 requests to it in a loop:

(sometimes, there are also cases where only request 12 was processed)

So this works pretty well, but there there are a few things to be aware of. The time it takes to post a message to the worker, is somewhere between 0ms – 1ms, plus the cost of copying any data that needs to be transferred to the worker. The setInterval() minimum is not really 0ms; the browser sets a reasonable minimum which you can probably expect to be between 4ms – 10ms, and this is in addition to the cost of posting the message to the worker (The code was updated to explicitly specify a 4ms delay, setInterval with a 0ms delay isn’t a good idea). What this means in practice is that there is additional latency before we begin processing a request, but compared to a scenario where we have to factor in waiting on all prior requests to finish processing (which is the point of doing this to begin with), I expect this method to win out in performance.

Finally, here’s a look at a GraphPaper stress test and how prioritizing the last request to the connector routing worker (which is responsible for generating the path between the 2 nodes) allows for a faster/less-laggy update:

No prioritization

Prioritize last request, eliminate prior

Encapsulating Web Workers

Constructing Web Workers

There’s generally 2 ways to construct a Web Worker…

Passing a URL to the Javascript file:

const myWorker = new Worker('worker.js');

Or, creating a URL with the Javascript code (as a string). This is done by creating a Blob from the string and passing the Blob to URL.createObjectURL:

const myWorker = new Worker( URL.createObjectURL(new Blob([...], {type: 'application/javascript'})) );

In Practice

With GraphPaper, I’ve used the former approach for the longest while, depending on the caller to construct and inject the worker into GraphPaper.Canvas:

const canvas = new GraphPaper.Canvas( document.getElementById('paper'), // div to use window, // parent window new Worker('../dist/connector-routing-worker.min.js') // required worker for connector routing );

This technically works but, in practice, there’s 2 issues here:

  • There’s usually a few hoops to go through for the caller to actually get the worker Javascript file in a location that is accessible by the web server. This could mean manually moving the file, additional configuration, additional tooling, etc.
  • GraphPaper.Canvas is responsible for dealing with whether a worker is used or not, which worker, how many workers are used, etc. These aren’t concerns that should bubble up to the caller. You could make a case that caller should have the flexibility to swap in a worker of their choice (a strategy pattern), that’s a fair point, but I’d argue that the strategy here is what the worker is executing not the worker itself and I haven’t figured out a good interface for what that looks like.

So, I worked to figure out how to construct the worker within GraphPaper.Canvas using URL.createObjectURL(), and this is where things got trickier. The GraphPaper codebase is ES6 and uses ES6 modules, I use rollup with babel to produce distribution files the primary ones being minified IIFE bundles (IIFE because browser support for ES6 modules is still very much lacking). One of these bundles is the code for the worker (dist/connector-routing-worker.js), which I’d need to:

  • Encapsulate it into a string that can be referenced within the source
  • Create a Blob from the string
  • Create a URL from the Blob using URL.createObjectURL()
  • Pass the URL to the Worker constructor, new Worker(url)

The latter steps are straightforward function calls, but the first is not clear cut.

Repackaging with Rollup

After producing the “distribution” code for the worker, what I needed was to encapsulate it into a string like this (the “worker-string-wrap”):

const workerStringWrap = ` const ConnectorRoutingWorkerJsString = \` ${workerCode} \`; export { ConnectorRoutingWorkerJsString }` ;

Writing that out to a file, I could then easily import it as just another ES6 module (and use the string to create a URL for the worker), then build and produce the distribution file for GraphPaper.

I first tried doing this with a nodejs script, but creating a rollup plugin proved a more elegant solution. Rollup plugins are aren’t too difficult to create but I did find the documentation a bit convoluted. Simply, rollup will execute certain functions (hooks) at appropriate points during the build process. The hook needed in this scenario is writeBundle, which can be used to get the code of the produced bundle and do something with it (in this case, write it out to a file).

// rollup-plugin-stringify-worker.js const fs = require('fs'); const stringifyWorkerPlugin = function (options) { return { name: 'stringifyWorkerPlugin', writeBundle(bundle) { console.log(`Creating stringified worker...`); // Note: options.srcBundleName and options.dest are expected args from the rollup config const workerCode = bundle[options.srcBundleName].code; const workerStringWrap = `const ConnectorRoutingWorkerJsString = \`${workerCode}\`; export { ConnectorRoutingWorkerJsString }`; fs.writeFile(options.dest, workerStringWrap, function(err) { // ... }); } }; }; export default stringifyWorkerPlugin;

The plugin is setup within a rollup config file:

import stringifyWorker from './build/rollup-plugin-stringify-worker'; // ... { input: 'src/Workers/ConnectorRoutingWorker.js', output: { format: 'iife', file: 'dist/workers/connector-routing-worker.min.js', name: 'ConnectorRoutingWorker', sourcemap: false, }, plugins: [ babel(babelConfig), stringifyWorker( { "srcBundleName": "connector-routing-worker.min.js", "dest": "src/Workers/ConnectorRoutingWorker.string.js" } ) ], }, // ...

Note that addtional config blocks for components that use ConnectorRoutingWorker.string.js (e.g. the GraphPaper distribution files), need to be placed after the block shown above.

The overall process looks like this:

Creating the Worker

The worker can now be created within the codebase as follows:

import {ConnectorRoutingWorkerJsString} from './Workers/ConnectorRoutingWorker.string'; // ... const workerUrl = URL.createObjectURL(new Blob([ ConnectorRoutingWorkerJsString ])); const connectorRoutingWorker = new Worker(workerUrl); // ...

The Future

Looking ahead, I don’t really see a good solution here. Better support for ES6 modules in the browser would be a step in the right direction, but what is really needed is a way to declare a web worker as a module and the ability to import and construct a Worker with that module.

Notes on web workers

Recently I’ve been working on parallelizing some HTML5 canvas image processing code with web workers and hit a few stumbling blocks, mostly due to some unexpected limits and misunderstandings about workers. Below are some notes I’ve jotted down while working.

  • There’s a limit on the number of web workers being executed, with Firefox I hit a limit of 20.
  • With Firefox, there’s no warning when you hit the limit, your worker simply doesn’t run. There’s no queuing with workers either, so you can’t fire off a worker and expect the browser to schedule it to run when you have less than the maximum number of workers executing.
  • I attempted to implement my own queuing mechanism, but this was a waste of time. You can’t precisely predict when the browser will delete the worker. I tried posting a message from the worker thread to the caller (via postMessage) at the end of the worker onmessage processing function, but this only accurately signals that the processing is done, not that the worker is terminated, and you therefore can’t predict when it’s possible to spawn off another worker.
    EDIT: Experimenting a bit more, I’ve discovered this to be incorrect. To actually terminate the worker, you can use the close method from within the worker, or the terminate method from the caller, so a queuing mechanism is a feasible solution to deal with browser limits on the number of executing web workers.
  • When posting a message from worker to caller (via postMessage), the handler function within the caller seems to be executed within the worker thread.
  • You can’t pass a function to a worker. You pass data between worker and caller by copying the data (serializing/de-serializing an object by structured cloning) or giving ownership of the object to the worker (transferable objects). This allows for a great deal of thread-safety, but in a language like Javascript where functions are first-class citizens, it’s hard not to see the elegance of being able to simply spawn off a worker from a function.
  • Given an upper limit on the number of executing web workers, a spawn-and-forget (or spawn-and-wait) model for concurrency is not practical. A thread pool pattern may be appropriate. In general, each worker should be thought of as a heavy-weight processing engine.
    EDIT:This is perhaps more true not because of browser limits on the number of workers, but the fact that workers were designed with the expectation of them being long-lived, with high performance and memory costs.

All my work was in Firefox, so the points above may not necessarily be true for other browsers.