Archive for April, 2013

Notes on web workers

Recently I’ve been working on parallelizing some HTML5 canvas image processing code with web workers and hit a few stumbling blocks, mostly due to some unexpected limits and misunderstandings about workers. Below are some notes I’ve jotted down while working.

  • There’s a limit on the number of web workers being executed, with Firefox I hit a limit of 20.
  • With Firefox, there’s no warning when you hit the limit, your worker simply doesn’t run. There’s no queuing with workers either, so you can’t fire off a worker and expect the browser to schedule it to run when you have less than the maximum number of workers executing.
  • I attempted to implement my own queuing mechanism, but this was a waste of time. You can’t precisely predict when the browser will delete the worker. I tried posting a message from the worker thread to the caller (via postMessage) at the end of the worker onmessage processing function, but this only accurately signals that the processing is done, not that the worker is terminated, and you therefore can’t predict when it’s possible to spawn off another worker.
    EDIT: Experimenting a bit more, I’ve discovered this to be incorrect. To actually terminate the worker, you can use the close method from within the worker, or the terminate method from the caller, so a queuing mechanism is a feasible solution to deal with browser limits on the number of executing web workers.
  • When posting a message from worker to caller (via postMessage), the handler function within the caller seems to be executed within the worker thread.
  • You can’t pass a function to a worker. You pass data between worker and caller by copying the data (serializing/de-serializing an object by structured cloning) or giving ownership of the object to the worker (transferable objects). This allows for a great deal of thread-safety, but in a language like Javascript where functions are first-class citizens, it’s hard not to see the elegance of being able to simply spawn off a worker from a function.
  • Given an upper limit on the number of executing web workers, a spawn-and-forget (or spawn-and-wait) model for concurrency is not practical. A thread pool pattern may be appropriate. In general, each worker should be thought of as a heavy-weight processing engine.
    EDIT:This is perhaps more true not because of browser limits on the number of workers, but the fact that workers were designed with the expectation of them being long-lived, with high performance and memory costs.

All my work was in Firefox, so the points above may not necessarily be true for other browsers.