Archive for the ‘Web Technologies’ Category

HTTPS for local development

How to local dev

Local developer environments can take many forms, but for non-trivial web applications I’m still fond of Vagrant to spin up a VM that is “close enough” to production. Typically, I will create a Vagrant box and map a hostname to the IP of the box via an entry in the host system’s hosts file (I use vagrant-hostmap for this, as DHCP is used to avoid conflicts, which means the IP address changes often).

Does HTTPS matter for local dev?

Probably not. I went down this particular rabbit hole as I was doing some experiments with server-sent events and noticed there’s a limitation on the number of open connections when not using HTTP/2.

So, I thought it would be a good idea to get HTTP/2 working in my local dev environment, which was using Apache and PHP. This is fairly straightforward to set up on Apache. However, the browser (Chrome) was not happy with it. Currently, browsers only support HTTP/2 over TLS (h2) and there’s no support for HTTP/2 Cleartext (h2c). There are some good reasons for this, but many of those reasons address concerns on the public web; for other use-cases (e.g. local dev) there’s additional complexity to support h2 with minimal benefit.

Scripting

The code in the sections below is Bash for a provisioning script (bootstrap.sh), intended to run when the Vagrant box is provisioned. The goal is that the environment of the box is reproducible (after destroying and re-creating it) and that there is no need to re-adjust configuration on the host machine (this comes into play when dealing with certificates).

The Vagrantfile looks something like this:

$startScript = <<START_SCRIPT
sudo service apache2 restart
START_SCRIPT

Vagrant.configure("2") do |config|
  config.vm.box = "ubuntu/mantic64"
  config.vm.provision :shell, path: "localdev/bootstrap.sh"
  config.vm.provision "shell", inline: $startScript, run: "always"
  config.vm.network "private_network", type: "dhcp"
end

… with the bootstrap script and related assets (e.g. Apache conf files) within the localdev folder.

Supporting h2: disable prefork

First, disable the prefork module, as it doesn’t play well with HTTP/2:

sudo a2dismod mpm_prefork

In prefork, mod_http2 will only process one request at a time per connection. But clients, such as browsers, will send many requests at the same time. If one of these takes a long time to process (or is a long-polling request), the other requests will stall.

Supporting h2: enable HTTP/2 modules and configuration

Enable modules and configuration for HTTP/2:

sudo a2enmod mpm_event
sudo a2enmod http2

sudo cp /vagrant/localdev/http2.conf /etc/apache2/conf-available/http2.conf
sudo a2enconf http2

The http2.conf file contains the following:

<IfModule http2_module>
    Protocols h2 h2c http/1.1
    H2Direct on
</IfModule>

Required modules and configuration are now enabled. For a given VirtualHost definition, you can add Protocols h2 h2c http/1.1.

e.g.:

<VirtualHost *:443>
    ServerAdmin webmaster@localhost
    DocumentRoot /var/www/html
    Protocols h2 h2c http/1.1
    ...

Supporting h2: minica

For TLS certificate generation, minica is awesome and simple to use. However, it is only distributed as source, so you’ll need to install git to check out the repo and golang to compile the source.

sudo apt-get -y install git
sudo apt-get -y install golang-go

sudo mkdir /certs
sudo mkdir /tools

cd /tools
git clone https://github.com/jsha/minica.git
cd /tools/minica
go env -w GO111MODULE=auto
go build -buildvcs=false

Here, a /tools folder is created, the minica repo is cloned within it, and the source is compiled.

A /certs folder is also created as a location for generated certificates.

Supporting h2: generate root CA certificate

Once compiled, run minica to generate a root CA certificate (minica.pem) and key (minica-key.pem), which can be installed on the host to avoid browsers flagging the certificate as invalid or untrusted. This shouldn’t be done in the Vagrant box provisioning script, as you’ll get new root certificates every time the box is provisioned. Instead, SSH into the box, run minica, and copy the files to a secure location, or run minica somewhere else (e.g. on the host machine or another machine).

./minica --domains "test.localdev.com"

minica requires a domain argument, but what’s actually needed at this point is just the minica.pem and minica-key.pem files, not the site certificate. Generation of the site certificate is something that should be done within the provisioning script.

Supporting h2: generate site certificate

Within the provisioning script:

SITE_DOMAIN="localdev.com"
WILDCARD_CERT="*.localdev.com"
CERT_FOLDER="_.$SITE_DOMAIN"

CA_CERT="/vagrant/localdev/secrets/minica.pem"
CA_KEY="/vagrant/localdev/secrets/minica-key.pem"

./minica --ca-cert "$CA_CERT" --ca-key "$CA_KEY" --domains "$WILDCARD_CERT"
cp -R "/tools/minica/$CERT_FOLDER" "/certs/$CERT_FOLDER"

There are 2 things to note here:

  • To avoid having to generate new certificates for different subdomains, a wildcard cert is generated (*.localdev.com)
  • An existing root CA certificate and key are expected

Supporting h2: use site certificate

Update the VirtualHost definition to reference the site certificate and key:

<VirtualHost *:443>
    ServerAdmin webmaster@localhost
    DocumentRoot /var/www/html
    Protocols h2 h2c http/1.1
    SSLEngine on
    SSLCertificateFile /certs/_.localdev.com/cert.pem
    SSLCertificateKeyFile /certs/_.localdev.com/key.pem
    ...

Trusting minica certs on the host

Finally, to avoid the browser showing a warning that the site’s certificate is invalid or insecure, add the root certificate as a trusted root CA authority on the host machine. Martin Widmann has some nice documentation on how to do this for different operating systems.

Verify h2 protocol in the browser

With the site up, Chrome and Firefox dev tools should now show that traffic is being served via the h2 protocol.

The Protocol column isn’t visible by default, so you may have to enable it (right click on visible header → select Protocol).

MutationObserver limitations

The observers

MutationObserver, along with its siblings ResizeObserver and IntersectionObserver, is a great tool for working with the DOM. I don’t think there’s necessarily broad usage of these observers in frontend development but, when building applications that need insight into lower-level DOM changes, they are powerful interfaces that allow you to avoid the typical/hacky solution of polling for changes.

Interface

All the observer classes share a common, fairly simple interface. There are just a few key aspects needed to use them:

  • The constructor takes a callback, which is called whenever the DOM state change corresponding to the class (mutation, resize, etc.) is observed.
  • The observe() method takes a target DOM element to observe

On an observed state change (e.g. mutation), the given callback is called with an appropriate record (e.g. MutationRecord).
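
As a concrete example, here's a minimal sketch of wiring up a MutationObserver (the target element and observer options here are arbitrary, just for illustration):

// Create an element to observe (any existing DOM element works)
const target = document.createElement('div');
document.body.appendChild(target);

// The callback receives an array of MutationRecord objects
const observer = new MutationObserver(function(mutationRecords) {
    mutationRecords.forEach(function(record) {
        // record.type will be "childList", "characterData", or "attributes"
        console.log(record.type, record.target);
    });
});

// Start observing; the options control which kinds of mutations are reported
observer.observe(target, { childList: true, characterData: true, subtree: true });

// This change will surface asynchronously via the callback above
target.textContent = 'hello';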

Missing context

The records surfaced on an observed state change contain information about the change but nothing around who or what triggered the change. Looking at a comparable scenario, this is typically the case for most DOM events as well, but it’s generally a non-issue because you can reasonably assume that the event was triggered by the user interacting with the browser. When it comes to changes detected by an observer, there’s a bit more ambiguity as to how the change came about and you can’t always assume the change was from the end-user.

Note: yes, you can programmatically force DOM events to be emitted as well (e.g. element.click()), but I think this is almost always an anti-pattern.

For example, let’s say you have an application where you allow users to enter content into a contenteditable <div> but you’d also programmatically surface and incorporate content coming from the server (from other internet users using the application). Ideally, you could use a MutationObserver on the <div> to detect changes and see if there’s new content that needs to be sent to the server, but you’d need to distinguish:

  • What changes are coming from the user interacting with the browser
  • What changes are being made programmatically (i.e. coming from the server / coming from other users)

Unfortunately, you can’t make this distinction with the information surfaced in a MutationRecord.

While DOM events don’t necessarily map to an actor model, I tend to think what’s conceptually missing here is knowing the originating actor of the events/message and, in any user-facing system, there’s going to be at least 2 actors:

  • The system
  • The end-user interacting with the system

Once you’re within a system dealing with mutating state, knowing the originating actor is incredibly valuable information.

Hacking around this

I’m prototyping a hacky, but reasonable, solution for ScratchGraph.
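
To give a sense of the kind of pattern involved (this is just an illustrative sketch of the general idea, not the actual ScratchGraph code; the flag name, helper, and the 50ms settle delay are made up), programmatic changes set a flag that’s only cleared after a short delay, and the observer callback treats mutations seen while the flag is set as system-originated:

let programmaticChangeInProgress = false;

// Called when content from the server (i.e. from other users) is incorporated
function applyServerContent(targetElem, html) {
    programmaticChangeInProgress = true;
    targetElem.insertAdjacentHTML('beforeend', html);

    // Purposely introduce latency: mutation records are delivered asynchronously,
    // so keep the flag set until they've had a chance to arrive
    setTimeout(function() {
        programmaticChangeInProgress = false;
    }, 50);
}

const observer = new MutationObserver(function(mutationRecords) {
    if(programmaticChangeInProgress) {
        return; // attribute these mutations to the system, not the end-user
    }

    // ... otherwise treat the mutations as user-originated
    // (e.g. queue the new content to be sent to the server)
});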

Overall, this seems to work but I hate patterns like this where I have to purposely introduce latency.

A look at 2D vs WebGL canvas performance

I did some quick benchmarking with canvas-image-transformer, looking at the performance between directly manipulating pixels on a 2D canvas versus using a fragment shader on a WebGL canvas. For testing, I used a grayscale transformation as it can be done with a simple weighted sum (R*0.2126 + G*0.7152 + B*0.0722) and there’s a high degree of parity between the fragment shader code and the code for pixel operations on a 2D canvas.

Converting to grayscale

Pixel operations on the 2D canvas are as follows:

for(var i=0; i<pixels.data.length; i+=4) {
    var grayPixel = parseInt(((0.2126*(pixels.data[i]/255.0)) + (0.7152*(pixels.data[i+1]/255.0)) + (0.0722*(pixels.data[i+2]/255.0))) * 255.0);

    pixels.data[i] = grayPixel;
    pixels.data[i + 1] = grayPixel;
    pixels.data[i + 2] = grayPixel;
}

The corresponding fragment shader for the WebGL canvas is as follows:

precision mediump float;

uniform sampler2D uSampler;
varying vec2 vTextureCoord;

void main(void) {
    vec4 src = texture2D(uSampler, vTextureCoord);
    float grayPx = src.r*0.2126 + src.g*0.7152 + src.b*0.0722;
    gl_FragColor = vec4(grayPx, grayPx, grayPx, 1);
}

Performance comparisons in Chrome

Here’s the setup for comparing the performance of the 2 methods:

  • Input was a 3864×3864 image of the Crab Nebula, rendered onto a 2D canvas (note that time to render onto the 2D canvas is not considered in the data points below)
  • Output is the 2D canvas that the input image was rendered on
  • CPU was an AMD Ryzen 7 5700X
  • GPU was a RTX 2060
  • OS is Windows 10 Build 19044
  • Browser is Chrome 108.0.5359.125
  • Hard refresh on page load to bypass any browser-level caching
  • Transformation via WebGL approach for 25 iterations
  • Transformation via 2D canvas approach for 25 iterations

Visually, this is what’s being done:

canvas-image-transformer grayscale conversion

I tried to eliminate as much background noise as possible from the results; that is, eliminating anything that may have an impact on CPU or GPU usage: closing other applications with significant usage, not having any other tabs open in the browser, and not having DevTools open while image processing was being done. That said, I was not rigorous about this, and the numbers presented are meant to show overall/high-level behavior and performance; they’re not necessarily representative of peak performance on the machine or browser.

It’s also worth noting that canvas-image-transformer doesn’t attempt to do any sort of caching (i.e. textures are re-created, shaders are re-compiled, etc. on each iteration), so, from the library’s side, we shouldn’t expect large variances in performance from one iteration to the next.

Graphing the data points for each approach, for each iteration, I got the following (note that what’s presented is just the data for 1 test run; I did test multiple times and consistently saw the same behavior but, for simplicity, I just graphed the values from 1 test run):

canvas-image-transformer performance data

So, the data points for the first iteration are interesting.

  • On the 2d canvas, the transformation initially takes 371.8ms
  • On the WebGL canvas (webgl2 context), the transformation initially takes 506.5ms

That’s a massive gap in performance between the 2 methods, with the 2d canvas method being significantly faster. I would have expected the WebGL approach to be faster here as, generally, graphics-related things would be faster with a lower-level GPU interface, but that’s clearly not the case here.

For subsequent iterations, we can see that performance improves and normalizes for both approaches, with significantly better performance using the WebGL approach; however, why don’t we see this sort of performance during the first iteration? Profiling the code, I noticed I was consistently seeing the majority of execution time spent on texImage2D() during the first iteration:

gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA8, gl.RGBA, gl.UNSIGNED_BYTE, srcCanvas);

Looking at the execution time of texImage2D() across iterations, we get the following:

canvas-image-transformer texImage2D execution time

We see 486.9ms spent in texImage2D() during the first iteration, but then execution time drops to only ~0.1ms in subsequent iterations. It’s clear that loading data into a texture is the most costly operation on the first iteration; however, it looks like there’s some sort of caching mechanism, likely in Chrome’s GPU component, that essentially eliminates this cost on subsequent iterations.

In an attempt to speed up the first iteration, I briefly looked into potential optimizations to the texImage2D() call but didn’t find much. There’s no mipmap creation or format conversion happening here, so we’re just bound by how quickly we can get the pixels into VRAM.

Normal refresh (after previous page load)

There’s a bit more nuance here that’s worth touching on. Looking at just the first iteration in Chrome, after normal/soft refreshes, we see some interesting behavior:

canvas-image-transformer performance data, first iteration only
  • For the 2D canvas, the first iteration transformation times look the same as when doing a hard refresh
  • For the WebGL canvas, even the first iteration shows the fast transformation times we previously saw only on subsequent iterations!

It looks like Chrome’s texture caching mechanism is in play and preserves cache entries across soft page refreshes.

What about Firefox and other browsers?

I would expect most Chromium-based browsers to have similar behavior to what’s in Chrome, and some quick testing in Edge confirms this.

Firefox is a different beast. Testing in Firefox 108.0.2, we see the following transformation times:

canvas-image-transformer performance data

Performance, overall, is much more consistent than in Chrome, but not always better.

  • For the 2d canvas method, performance is simply worse; on the first iteration we see transformations take 150+ milliseconds more than in Chrome, and on subsequent iterations the performance gap is even wider.
  • For the WebGL method, our first iteration performance is significantly better than Chrome, reduced by more than 175 milliseconds. However, on subsequent iterations we don’t see the drastic performance improvement we see in Chrome.

For the 2d canvas method, it’s hard to say why it performs so differently than Chrome. However, for the WebGL method, a bit of profiling led to some interesting insights. In Firefox, the execution time of texImage2D() is consistent across iterations, hovering around 40ms; this means it performs significantly better than Chrome’s worst case (the first iteration) and significantly worse than Chrome’s best case (subsequent iterations, where execution time is below 0.1ms), as shown below.

canvas-image-transformer performance data

The other significant aspect of Firefox’s performance is the Canvas drawImage() call, drawing from a WebGL canvas to a 2D canvas. At the tail end of the transformation process, canvas-image-transformer does the following:

const srcCtx = srcCanvas.getContext('2d');
srcCtx.drawImage(glCanvas, 0, 0, srcCanvas.width, srcCanvas.height);

Basically, it’s taking what’s on the WebGL canvas and writing it out to the input/source canvas, which is a 2D canvas. In Chrome this is a very fast operation, typically less than 2ms; in Firefox I see it typically going above 200ms.

canvas-image-transformer performance data

Firefox consistency

Finally, looking at transformation times across soft refreshes, we see Firefox performance is very consistent for both the 2D canvas and WebGL method:

canvas-image-transformer performance data

However, I did encounter a case where WebGL performance was more erratic. This was when I was testing with a lot of tabs open, and I suspect there was some contention for GPU resources.

Takeaways

There are perhaps a number of small insights here depending on use-case and audience, but there are 2 significant high-level takeaways for me:

  • GPUs are very fast at parallel processing but loading data to be processed and retrieving the processed data can be expensive operations
  • It’s worthwhile to measure things; I was fairly surprised by the different performance profiles between Firefox and Chrome

Pushing computation to the front: client-side compression

Client → Server Compression

Content from a web server being automatically gzipped (via apache, nginx, etc.) and transferred to the browser isn’t anything new, but there’s really nothing in the way of compression when going in the other direction (i.e. transferring content from the client to the server). This is not too surprising, as most client payloads are small bits of textual content and/or binary content that is already well compressed (e.g. JPEG images), where there’s little gain from compression and you’re likely to just waste CPU cycles doing it. That said, when your frontend client is a space for content creation, you’re potentially going to run into cases where you’re sending a lot of uncompressed data to the server.

Use-case: ScratchGraph Export

ScratchGraph has an export feature that essentially renders the page (minus UI components) as a string of HTML. This string is packaged along with some metadata and sent to the server, which sends it to a service running Puppeteer that renders the HTML string to either an image or a PDF. The overall process looks something like this:

ScratchGraph Export Flow

The HTML string being sent to the server is relatively large, a couple of MBs, due to:

  • The CSS styles (particularly due to external resources being pulled in and inlined as base64 URLs)
  • The user simply having lots of content

To be fair, it’s usually the former rather than the latter, and optimizing to avoid the inlining of resources (the intent of which was to try and do exports entirely in the browser) would have a greater impact in reducing the amount of data being transferred to the server. However, for the purposes of this blog post (and also because it leads to a more complex discussion on how the application architecture can/should evolve and what this feature looks like in the future), we’re going to sidestep that discussion and focus on what benefits data compression may offer.

Compression with pako

I was more than ready to implement a compression algorithm, but was happy to discover pako, which does zlib compression. Compressing (i.e. deflating) with pako is very simple: below, I encode the HTML string to UTF8 via TextEncoder.encode() (because I want UTF8; it isn’t a requirement of pako), which returns a Uint8Array, then use that as the input for pako.deflate(), which also returns a Uint8Array.

const staticHtmlUtf8Arr = (new TextEncoder()).encode(html);
const compressedStaticHtmlUtf8Arr = pako.deflate(staticHtmlUtf8Arr);

Here’s what that looks like in practice, exporting the diagram shown above:

ScratchGraph Export, with pako compression, results

That’s fairly significant, as the data size has been reduced by 1,237,266 bytes (42.77%)!

The final bit for the frontend is sending this to the server. I use a FormData object for the XHR call and, for the compressed data, I append it as a Blob:

formData.append(
    "compressedStaticHtml",
    new Blob([compressedStaticHtmlUtf8Arr], {type: 'application/zlib'}),
    "compressedStaticHtml"
);

Handling the compressed data server-side with PHP

PHP supports zlib compression/decompression via the zlib module. The only additional logic needed server-side is calling gzuncompress() to decompress the compressed data.

$staticHtml = gzuncompress(file_get_contents($compressedStaticHtmlFile->getFilePath()));

Note that $compressedStaticHtmlFile is an object representing a file pulled from the request (FormData will append a Blob in the same manner as a file, so server-side you’re dealing with the data as a file). The File.getFilePath() method here simply returns the path of the uploaded file.

Limitations

Compressing and decompressing data will cost CPU cycles and, for zlib and most algorithms, this will scale with the size of the data. So considerations around what the client-side system looks like and the size of the data need to be taken into account. In addition, compression within a browser’s main thread can lead to UI events, reflow, and repaint being blocked (i.e. the page becomes unresponsive). If the compression time is significant, performing it within a web worker instead would be a better path.
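
As a sketch of what that might look like (the worker filename, message shape, and pako script path here are made up; it assumes pako is made available to the worker via importScripts()):

// compress-worker.js
importScripts('pako.min.js');

self.onmessage = function(e) {
    const utf8Arr = (new TextEncoder()).encode(e.data.html);
    const compressed = pako.deflate(utf8Arr);

    // Transfer the underlying buffer back instead of copying it
    self.postMessage({ compressed: compressed }, [compressed.buffer]);
};

// main thread
const compressionWorker = new Worker('compress-worker.js');

compressionWorker.onmessage = function(e) {
    const compressedStaticHtmlUtf8Arr = e.data.compressed;
    // ... append to FormData and send to the server, as above
};

compressionWorker.postMessage({ html: html });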

Copy & pasting non-textual objects in the browser

Interacting with the clipboard

A proper way to deal with clipboard access has been a dream for web developers for years. There’s out-of-the-box browser support, via keyboard shortcuts and context menu commands, for <input>, <textarea>, or elements with the contenteditable attribute. However, for more complex interactions or for dealing with application-defined, non-textual objects (e.g. something composed of multiple DOM elements that is manipulable by the user, such as the ScratchGraph notes and connectors shown below), we need to start looking at the available Javascript APIs.

ScratchGraph entities and connectors

Clipboard API vs document.execCommand() + paste event

The bad news is that document.execCommand() (using the “cut” and “copy” commands) is still the de-facto way of writing to the clipboard, despite this method being deemed obsolete. The good news is that there does seem to be good progress, in terms of stability and browser implementation, on the Clipboard API.

For reading from the clipboard, the paste event seems to be the best way to go; however, there is the inherent limitation that this event will only be triggered by “paste actions” from the browser’s interface (e.g. the user hitting Ctrl + V). Again, there is good news in that the Clipboard API would allow more flexibility here, and there is good progress towards browser support.

Despite the good news around the Clipboard API, given the state of where things are now, in September 2020, using document.execCommand() and the paste event seems to be the way to go; the Clipboard API, as well as the corresponding permissions via the Permissions API, are still in the process of being implemented in browsers. However, most of what’s in this post will (hopefully) still be valid with the Clipboard API, with the clipboard interaction code being more robust and the more hacky bits being thrown away.

Copying to the clipboard with document.execCommand(“copy”)

Selecting plain text and copying it to the clipboard is straightforward with an <input> or <textarea>:

  • Call the select() method on the element
  • Call document.execCommand("copy");

We can make a generic method to copy arbitrary text to the clipboard by programmatically creating a <textarea>, setting its value to the text we want, performing the above operations to copy its contents to the clipboard, and finally removing the created <textarea>.

const Clipboard = {
    copyText: function(_text) {
        const textAreaElem = document.createElement('textarea');
        textAreaElem.textContent = _text;
        document.body.appendChild(textAreaElem);

        textAreaElem.select();
        document.execCommand("copy");

        document.body.removeChild(textAreaElem);
    }
};

Now, using this method, if we can serialize an object to some sort of textual format, we can put it on the clipboard. JSON is a good option, as it’s easy to work with in Javascript.

Object → JSON → Plain text → Clipboard

In ScratchGraph, entities like notes, connectors, etc. have a toJSON() method, which returns a JSON serialized representation of the entity, e.g.:

this.toJSON = function() {
    const serializedObj = {
        "id": self.getId(),
        "owner_id": self.ownerId,
        "sheet_id": self.sheetId,
        "position_x": self.getX(),
        "position_y": self.getY(),
        ...
    };

    return serializedObj;
}

When a user initiates a request to copy something, say by hitting Ctrl + C, we iterate over all the selected entities, get the serialized JSON for each, and build an array of JSON objects. Next we construct a JSON object containing that array, along with some metadata around context (application name, version, etc.). Finally, we JSON.stringify this object to get a plain text representation and use the Clipboard.copyText() method to write to the clipboard.

document.addEventListener('keydown', function(e) {
    // Copy selected entities on Ctrl + C
    if(e.ctrlKey && e.key === 'c') {
        const entitiesSelected = currentGroupTransformationContainer.getEntities();
        const entitiesJsonArr = [];
        entitiesSelected.forEach(function(e) {
            entitiesJsonArr.push(e.toJSON());
        });

        ...

        const strForClipboard = JSON.stringify({
            "application": "scratchgraph",
            "version": "1.0",
            "entities": entitiesJsonArr,
            ...
        });

        Clipboard.copyText(strForClipboard);
    }
});

Pasting from the clipboard

To read what’s on the clipboard, we can listen for and implement a handler on the paste event.

While you can listen for the paste event on any DOM element, I’ve found it tricky to isolate to specific elements because the element that has focus isn’t always obvious and I’ve run into situations where an offscreen <input> gets focus and the clipboard content simply gets pasted into that input. It’s more reliable to listen on the document and determine if and where to paste something based on the metadata embedded when we copied data to the clipboard.

Sketching out what we need to deal with, we get the following:

  • Listen for paste events on the document
  • Check if there’s plain-text content being pasted
  • Try to parse the plain-text content as JSON
  • Check whatever metadata is in the object to see if it’s something copied from our application and it’s something we can read/interpret.
  • If it’s something we can handle, suppress any default behavior and do what is needed to clone and create new objects.

This is simplified for clarity, but the code in ScratchGraph looks something like this:

document.addEventListener('paste', function(e) {
    if(typeof e.clipboardData === 'undefined' || typeof e.clipboardData.items === 'undefined') {
        return;
    }

    const items = e.clipboardData.items;
    for (let i=0; i<items.length; ++i) {
        if (items[i].kind === 'string' && items[i].type === "text/plain") {
            try {
                const clipboardJson = JSON.parse(e.clipboardData.getData('text/plain'));
                if(clipboardJson.application === "scratchgraph") {
                    e.preventDefault();
                    createFromClipboardJson(clipboardJson);
                }
            } catch(err) { }
        }
    }
});

The createFromClipboardJson() method handles the application-specific logic of reading the data and creating copies. In ScratchGraph, I’m dealing with entities, so I don’t actually deserialize; I just read the bits of data needed to make a clone and create something with a new ID (i.e. a new entity). However, YMMV, based on the type of objects you’re dealing with, how your application handles data, and/or how you deal with state.

Limitations and future work

As I mentioned, there are limitations here when it comes to reading or writing from/to the clipboard. The paste event will only be triggered by interactions supported by the browser, so creating something like a button to paste content isn’t possible. document.execCommand("copy") is now considered obsolete and the method presented to allow copying arbitrary bits of text is pretty hacky, though it is versatile in that you can bind the method to application-specific interactions (e.g. a button to copy content). A further limitation here is that data can only be copied as plain-text and we’re not actually encoding any type information; this manifests in some non-ideal behavior, where what’s put on the clipboard can be pasted into any application that accepts plain-text.

The Clipboard API looks to be a promising solution to these limitations. I’m hoping to revisit this in the near future to update the clipboard interaction logic to use the API and have an all-round cleaner and more robust solution.
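
For reference, here’s roughly what the clipboard interaction could look like with the Clipboard API (permissions and browser support permitting), reusing the serialization and createFromClipboardJson() logic above:

// Writing, e.g. in the Ctrl + C handler
navigator.clipboard.writeText(strForClipboard)
    .catch(function(err) {
        // write not permitted or clipboard unavailable
        console.error(err);
    });

// Reading, e.g. bound to an application-specific "paste" button
navigator.clipboard.readText()
    .then(function(text) {
        const clipboardJson = JSON.parse(text);
        if(clipboardJson.application === "scratchgraph") {
            createFromClipboardJson(clipboardJson);
        }
    })
    .catch(function(err) {
        // read not permitted, or the clipboard contents weren't our JSON
        console.error(err);
    });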

Pushing computation to the front: video snapshots

Video and the Canvas API

The Canvas API is surprisingly versatile. The image parameter of the CanvasRenderingContext2D.drawImage() method will accept images from a number of different sources, including an HTMLVideoElement. I touched on this a bit in a previous post about processing the data from video streams; however, HTMLVideoElement can also handle loading and rendering video files, with all modern browsers capable of tackling the non-trivial tasks of decoding and rendering H.264 MP4 or VP8/VP9 WebM content (and, of course, you get all the benefits of the client’s GPU hardware that the browser takes advantage of). This opens up the possibility of capturing frames from video files, which can be used for preview images, poster images, or substituting in an image when video playback isn’t possible (e.g. for a print layout, which is the issue I’ve run into with ScratchGraph).

Setting up the HTMLVideoElement

This is fairly standard, here we’ll load an H.264 MP4 with the filename “test.mp4”:

const video = document.createElement('video');
const videoSource = document.createElement('source');

videoSource.setAttribute('type', 'video/mp4');
videoSource.setAttribute('src', 'test.mp4');

video.appendChild(videoSource);

For reference, here’s the test video:

Next, we want to seek to a point in the video where we want to capture the frame and also bind to an event that’ll tell us when we’re able to read the frame data from the HTMLVideoElement. The seeked event works well. The other potentially viable option is the loadeddata event, but I ran into some issues here, which I’ll describe later.

video.addEventListener('seeked', function(e) {
    // capture the video frame at the point seeked to...
});

// seek to 2s
video.currentTime = 2;

Render the frame onto a canvas

The Canvas API makes this really easy and the process mirrors what’s described in the post on thumbnail generation:

/**
 *
 * @param {HTMLVideoElement} video
 * @param {Number} newWidth
 * @param {Number} newHeight
 * @param {Boolean} proportionalScale
 * @returns {Canvas}
 */
videoFrameToCanvas: function(video, newWidth, newHeight, proportionalScale) {
    if(proportionalScale) {
        if(video.videoWidth > video.videoHeight) {
            newHeight = newHeight * (video.videoHeight / video.videoWidth);
        } else if(video.videoHeight > video.videoWidth) {
            newWidth = newWidth * (video.videoWidth / video.videoHeight);
        } else {}
    }

    const canvas = document.createElement('canvas');
    canvas.width = newWidth;
    canvas.height = newHeight;

    const canvasCtx = canvas.getContext('2d');
    canvasCtx.drawImage(video, 0, 0, newWidth, newHeight);

    return canvas;
}

I added this method to the canvas-image-transformer library; referencing the method we can now flesh out the seeked event handler. For this test, we’ll also render out what’s on the canvas to an <img> element in the document to see what’s been captured.

video.addEventListener('seeked', function(e) {
    // capture the video frame at the point seeked to
    const frameOnCanvas = CanvasImageTransformer.videoFrameToCanvas(video, 500, 500, true);
    document.getElementById('testImage').src = frameOnCanvas.toDataURL();
});

frameOnCanvas is a canvas with the captured frame, and here’s what it looks like transformed & rendered into an <img> element:

canvas-image-transformer-test-video-frame-capture

Issues

  • Something not immediately obvious is that the seeked event is not fired if video.currentTime = 0 (i.e. when you want to seek to the first frame of a video). However, you can use a very small time value (e.g. video.currentTime = 0.000000001), which will typically seek to the first frame. That said, it is a hacky/non-elegant solution.
  • There are cross-browser issues with the loadeddata event. In Firefox, you will only get a frame capture if you don’t seek. If you do attempt to seek, you’ll get an empty frame and the canvas will have a transparent image. Conversely, in Chrome (and other Chromium-based browsers), you will only get a frame if you do seek. The standard states that the event should be fired when “the user agent can render the media data at the current playback position for the first time,” which seems to indicate an implementation flaw in both browsers.
  • The test video was taken on my phone and the frames themselves are upside-down; this is typical for smartphone videos, as it’s expected that playback will take into account metadata indicating orientation. In Firefox, this isn’t taken into account when using CanvasRenderingContext2D.drawImage() with an HTMLVideoElement, so you get an upside-down image on the canvas.

Alternatives & limitations

I couldn’t think of a ton of options for decoding H.264 or VP8/VP9. If you’re looking to create something yourself, a server-side service invoking FFmpeg seems like the best option. I played around with Puppeteer, but Puppeteer comes with Chromium, which lacks the audio and video support you get out-of-the-box with Chrome. That said, installing and using Chrome server-side with Puppeteer has potential.

There are also third-party services which can handle video decoding and transcoding, and those are solid server-side options.

As with thumbnail generation, here again we’re looking at workloads that have the potential to be moved to the frontend, where you have hardware better suited for graphics work and the possibility of reducing backend complexity. On the other hand, the same limitations come into play, as you have less control over the execution environment and no clear path for backfill or migration needs.

Pushing computation to the front: thumbnail generation

Frontend possibilities

As the APIs brought forward by HTML5 about a decade ago have matured and the devices running web browsers have continued to improve in computational power, looking at what’s possible on the frontend and the ability to bring backend computations to the frontend has been increasingly interesting to me. Such architectures would see each user’s browser as a worker for certain tasks and could simplify backend systems, as those tasks are pushed forward to the client. Using Canvas for image processing tasks is one area that’s interesting and that I’ve had success with.

For Mural, I did the following Medium-esque image preload effect, the basis of which is generating a tiny (16×16) thumbnail which is loaded with the page. That thumbnail is blurred via CSS filter, and transitions to the full-resolution image once it’s loaded. The thumbnail itself is generated entirely on the frontend when a card is created and saved alongside the card data.

In this post, I’ll run through generating and handling that 16×16 thumbnail. This is a fairly straightforward use of the Canvas API, but it does highlight how frontend clients can be utilized for operations typically relegated to server-side systems.

The image processing code presented is encapsulated in the canvas-image-transformer library.

<img> → <canvas>

A precursor for any sort of image processing is getting the image data into a <canvas>. The <img> element and corresponding HTMLImageElement interface don’t provide any sort of pixel-level read/write functionality, whereas the <canvas> element and corresponding HTMLCanvasElement interface do. This transformation is pretty straightforward:

The code is as follows (an interesting thing to note here is that this can all be done without injecting anything into the DOM or rendering anything onto the screen):

const img = new Image();
img.onload = function() {
    const canvas = document.createElement('canvas');
    canvas.width = img.width;
    canvas.height = img.height;

    const canvasCtx = canvas.getContext('2d');
    canvasCtx.drawImage(img, 0, 0, img.width, img.height);

    // the image has now been rendered onto canvas
}

img.src = "https://some-image-url";

Resizing an image

Resizing is trivial, as it can be handled directly via arguments to CanvasRenderingContext2D.drawImage(). Adding in a bit of math to do proportional scaling (i.e. preserve aspect ratio), we can wrap the transformation logic into the following method:

/**
 *
 * @param {HTMLImageElement} img
 * @param {Number} newWidth
 * @param {Number} newHeight
 * @param {Boolean} proportionalScale
 * @returns {Canvas}
 */
imageToCanvas: function(img, newWidth, newHeight, proportionalScale) {
    if(proportionalScale) {
        if(img.width > img.height) {
            newHeight = newHeight * (img.height / img.width);
        } else if(img.height > img.width) {
            newWidth = newWidth * (img.width / img.height);
        } else {}
    }

    var canvas = document.createElement('canvas');
    canvas.width = newWidth;
    canvas.height = newHeight;

    var canvasCtx = canvas.getContext('2d');
    canvasCtx.drawImage(img, 0, 0, newWidth, newHeight);

    return canvas;
}

Getting the transformed image from the canvas

My go-to method for getting the data off a canvas and into a more interoperable form is the HTMLCanvasElement.toDataURL() method, which allows easily getting the image as a PNG or JPEG. I do have mixed feelings about data-URIs; they’re great for the web, because so much of the web is textually based, but they’re also horribly bloated and inefficient. In any case, I think interoperability and ease-of-use usually win out (especially here, where we’re dealing with a 16×16 thumbnail and the data-uri is relatively lightweight) and getting a data-uri is generally the best solution.

Using CanvasRenderingContext2D.getImageData() to get the raw pixels from a canvas is also an option but, for a lot of use-cases, you’d likely need to compress and/or package the data in some way to make use of it.

Save the transformed image

With a data-uri, saving the image is pretty straightforward. Send it to the server via some HTTP method (POST, PUT, etc.) and save it. For a 16×16 PNG the data-uri textual representation is small enough that we can put it directly in a relational database and not worry about a conversion to binary.
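
Putting it together, a rough sketch of generating and sending the thumbnail might look like this (the /api/thumbnail endpoint and payload shape are made up for illustration, and I’m assuming imageToCanvas is exposed on the CanvasImageTransformer namespace, as videoFrameToCanvas is elsewhere):

// img is a loaded HTMLImageElement (see the <img> → <canvas> section above)
const thumbnailCanvas = CanvasImageTransformer.imageToCanvas(img, 16, 16, true);
const thumbnailDataUri = thumbnailCanvas.toDataURL('image/png');

// Send the data-uri to the server to be stored alongside the card data
fetch('/api/thumbnail', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ thumbnail: thumbnailDataUri })
});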

Alternatives & limitations

The status quo alternative is having this sort of image manipulation logic encapsulated within some backend component (method, microservice, etc.) and, to be fair, such systems work well. There are also some very concrete benefits:

  • You are aware of and have control over the environment in which the image processing is done, so you’re isolated from browser quirks or issues stemming from a user’s computing environment.
  • You have an easier path for any sort of backfill (e.g. how do you generate thumbnails for images previously uploaded?) or migration needs (e.g. how can you move to a different sized thumbnail?); with generation done on the frontend, you can’t just run through rows in a database and make a call to get what you need.

However, something worth looking at is that backend systems and server-side environments are typically not optimized for any sort of graphics workload, as processing is centered around CPU cores. In contrast, the majority of frontend environments have access to a GPU; even fairly cheap phones have some sort of GPU better suited for “embarrassingly parallel” graphics operations, and you get those performance benefits for free with the Canvas API in all modern browsers.

In Chrome, see the output of chrome://gpu:

chrome settings, canvas hardware acceleration

Scale, complexity and cost also come into play. Thinking of frontend clients as computational nodes can change the architecture of systems. The need for server-side resources (hardware, VMs, containers, etc.) is eliminated. Scaling concerns are also, to a large extent, eliminated or radically changed as operations are pushed forward to the client.

Future work

What’s presented here is just scratching the surface of what’s possible with Canvas. WebGL also presents a ton of possibilities, and abstraction layers like gpu.js are really interesting. Overall, it’s exciting to see the web frontend evolve beyond a mechanism for user input and into a layer in which substantive computation can be done.

Embedding content with oEmbed

History

Embedding external content has been a feature of the web since the introduction of iframes ages ago. However, embedding as a business strategy didn’t seem to be a thing until sometime around the late 2000s or early 2010s, as social networking became big business, blogging became really popular, and there was concern over walled gardens. In this environment, embedding became a component for growth, and it was no doubt successful for now-behemoths like Youtube and Twitter (another component was adding social networking to sites, à la Google+ or Dunder Mifflin Infinity). Almost any site dealing in content had an embed feature or an embedded “widget.” Even Grovo, where I worked, was on this train with the Grovo Widget, though I was not involved in its development. In some cases this made sense and provided utility; in many others it was just copying what seemed to be working for others in the ecosystem, without regard for product or overall business strategy.

It seems like around this time oEmbed was drafted and Embedly was founded.

oEmbed

My view on how embeds were done was based on what you see in most UIs: you get an embed code (which is a snippet of HTML, likely encapsulated within an iframe), you paste that into a page, and the browser does the rest.

I learned about oEmbed while working on Mural. oEmbed is an interesting protocol that allows consumers to request data representing what a resource should look like in an embedded context, given the URL of that resource. The overall flow looks something like this:

Figuring out what the oEmbed endpoint is involves looking at a <link> element from the resource’s HTML page. Unfortunately this does mean you have the overhead and complexity of downloading and parsing an HTML document to get the endpoint. An alternative is pulling from the list of providers in the oEmbed repo and looking at the oEmbed endpoints and allowed URL schemes for resources.
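
As a sketch of the consumer side, once the endpoint is known, the request is just a GET with the resource URL as a parameter, and the response contains what’s needed for the embed (YouTube’s oEmbed endpoint is used here as an example; the container element and VIDEO_ID placeholder are made up):

const resourceUrl = 'https://www.youtube.com/watch?v=VIDEO_ID';
const oembedEndpoint = 'https://www.youtube.com/oembed?format=json&url=' + encodeURIComponent(resourceUrl);

fetch(oembedEndpoint)
    .then(response => response.json())
    .then(oembedData => {
        // For "video" and "rich" types, the html field contains the embed markup
        document.getElementById('embed-container').innerHTML = oembedData.html;
    });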

In either case, the result for the end-user is that by simply providing the URL for a resource, the appropriate content for an embed can be provided. Here’s what this looks like in Squarespace:

Embedly

Unfortunately, the video above is a lie. While everything described around oEmbed would allow for a flow like that, Squarespace (and a surprising number of other popular sites, like Medium) outsource handling of oEmbed to Embedly.

Embedly seems to do a few other things, but primarily it seems to proxy oEmbed content. What’s the value-add? According to Embedly:

We take care of every step of the process: retrieving information about a URL, checking it against malware registries, extracting content, making additional API calls to providers that support them, parsing RSS feeds, and performing validation. We save you time so that you can focus on making your app great.

The only aspect there I find compelling for the price tag is “checking it against malware registries,” but there’s little info on what level of protection they’re actually providing there.

I dealt with Embedly directly when working on Mural and it was a frustrating experience. First, note that if you’re not registered as an Embedly provider, sites that proxy through Embedly will return incorrect oEmbed data. On Squarespace, I noticed the Embedly response would have a type of “link” and not provide any of the data to do an HTML embed (so what’s shown in the video above will not happen, and it will appear as if the resource is not embeddable). Registering as a provider involves filling out a form with some endpoint information and example URLs, easy enough, but I had to wait weeks with no responses or status updates. For a request put in on Jan. 31, 2018, the Embedly integration was not done until Apr. 9th, 2018. A terrible experience overall and surprising for a company that is (a) owned by Medium and (b) seems to be a critical dependency for so many sites.

The short of it is: if, as a provider, users can’t embed your content on a site, see if you need to register as an Embedly provider. If you do, good luck.

The Future

The excitement and prominence around blogging seems to have died down and this seems to have corresponded with the excitement around embedding content diminishing as well. That said, I think being able to embed content is still, and will continue to be, a powerful mechanism on the web. Even with its flaws oEmbed works nicely in this landscape. I can’t say the same for Embedly.

Serving mjs files with nginx

In my previous post on ES modules, I mentioned adopting the mjs file extension for module files. Unfortunately, there’s no entry for the mjs extension in nginx’s default mime.types file, meaning it’ll be served as application/octet-stream instead of application/javascript. The MIME type for mjs files can be set by explicitly including mime.types in a server, http, or location block and adding a types block with the MIME type and file extension:

server {
    include mime.types;

    types {
        application/javascript mjs;
    }

    ...
}

Writing and testing ES modules

The toolchain

ES modules are one of the more exciting additions to the Javascript language. In my experience, effectively being able to break off and modularize code has continually led to better code and development practices. For server-side Javascript, Node and its associated module system took hold, but there was nothing comparable for browsers. However, Node-based toolchains to produce frontend code also became a thing, doing rollup, transpilation, minification, etc. This wasn’t necessarily a bad thing, and came with some noted benefits such as better support for unit testing scaffolds and integration into CI pipelines, but it also set the stage for increasingly complex toolchains and an ecosystem whereby frontend components were Node-based server-side components first, transformed into frontend components after. The latter, in turn, led to a state where you were always working with a toolchain (it wasn’t just for CI or producing optimized distributables), as there was no path to directly load these components in a browser, and I’d argue it also led to a state where components were being composed with an increasing and ridiculous number of dependencies, as the burden of resolving and flattening dependencies fell to the toolchain.

ES modules aren’t any sort of silver bullet here, but they do show a future where some of this complexity can be rolled back, the toolchain only needs to be invoked for specific cases, and there is less impedance in working with frontend code. We’re not there yet, but I’m hopeful, and as such I’ve adopted ES modules in projects where I’ve been able to.

A look at GraphPaper

GraphPaper was the first project where I committed to ES modules for the codebase. At the time, support for ES modules was limited in both Node and browsers (typically incomplete and put behind a feature flag), so for both development work and producing distributables, I used rollup.js + Babel to produce an IIFE module. This worked well, though it’s a pain to do a build for every code change I want to see in the browser. I also remember this being pretty easy to set up initially, but the package.json became more convoluted with Babel 6, when everything was split into smaller packages (I understand the rationale, but the developer experience is horrible and pushes the burden of understanding the various Babel components onto consumers).

Structuring the module

Structuring was fairly simple. Everything that was meant to be accessible by consumers was exposed in a single file (GraphPaper.js) via export statements (i.e. it was just a file with a bunch of export * from … statements). This file also served as the input for rollup.js:

{
    input: 'src/GraphPaper.js',
    output: {
        format: 'iife',
        file: 'dist/graphpaper.min.js',
        name: 'GraphPaper',
        sourcemap: true
    },
    plugins: [
        babel(babelConfig),
    ],
}

In a modern browser, it would be possible to import the GraphPaper ES module directly, like so:

<script type="module">
    import * as GraphPaper from '../src/GraphPaper.js';
    ...
</script>

However, the way dependent web workers are built and encapsulated makes this impossible (explained here).

Another problem, but one that’s fixable, is that I wrote import and export statements without file extensions. For browsers, this leads to an issue, as the browser will just request what’s declared in the statement and not append any file extension (so a statement like import { LineSet } from './LineSet'; results in a request to the server for ./LineSet, not ./LineSet.js, as the file is named). Moving forward, using the .mjs extension and explicitly specifying the extension in import statements seems like a good idea; in addition to addressing the issue with browser requests, .mjs files will automatically be treated as ES modules when working in Node.
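
Concretely, that means naming the files with the .mjs extension and writing imports with the extension included, e.g.:

// The browser requests ./LineSet.mjs exactly as written, and Node treats the file as an ES module
import { LineSet } from './LineSet.mjs';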

Pushing aside the web worker issue, we can see a future where rollup.js isn’t necessary, or at least not necessary for producing browser-compatible (e.g. IIFE) modules. Its role can be limited to concatenation and orchestrating optimizations (e.g. minification) for distributables. Similarly, Babel’s role can be reduced or eliminated. As support for newer ES features (particularly those in ES6 and ES7) continues to improve across systems (browsers, Node, etc.), and users adopt these systems, transpilation won’t be as necessary. The exception is the case where developers want to use the very latest ES features, but I think we’re quickly approaching a point of diminishing returns here, especially relative to the cost of toolchain complexity.

Testing with jasmine-es6, moving to Ava

For testing, I found jasmine-es6 to be one of the simpler ways to test ES modules at the time. Ava existed, but I remember running into issues getting it working. I also remember toying with Jest at some point and running into issues there as well. In the end, jasmine-es6 worked well; I never had issues importing and writing tests for a module. Here’s a sample test from the codebase:

import { Point } from '../src/Point'
import { Line } from '../src/Line'
import { LineSet } from '../src/LineSet'

describe("LineSet constructor", function() {
    it("creates LineSet from Float64Array coordinates", function() {
        const typedArray = Float64Array.from([1,2,3,4,5,6,7,8]);
        const ls = new LineSet(typedArray);

        const lineSetArray = ls.toArray();

        expect(ls.count()).toBe(2);
        expect(lineSetArray[0].isEqual(new Line(new Point(1, 2), new Point(3, 4)))).toBe(true);
        expect(lineSetArray[1].isEqual(new Line(new Point(5, 6), new Point(7, 8)))).toBe(true);
    });
});

jasmine-es6 has worked and continues to work really well despite being deprecated. I’ll likely adopt Ava and reformat the tests at some point in the future. I’ve played around with it again recently and it was a much smoother experience; it’s also better supported, and I like its simpler test syntax more than the Jasmine syntax. I’m looking to do this when Node has stable support for ES modules, as that would mean not worrying about pulling in and configuring Babel for running tests (though Babel will likely still be around for rollup.js).

Takeaways

Overall, it’s been fairly smooth working with ES modules and it looks like things will only improve in the future. Equally exciting is the potential reduction in toolchain complexity that comes with better support for ES modules.

  • Support for ES modules continues to improve across libraries, browsers, and Node
  • It’s probably a good idea to use the .mjs file extension
  • Rollup.js is still needed for now to make browser-compatible (e.g. IIFE) modules, but will likely take on a more limited role in the future (concatenation & minification)
  • Better support for ES6 and ES7 features across the board will mean that Babel, and transpilation in general, won’t be as necessary