Aug 8 2010 · Web Technologies
Generating website thumbnails was a pretty hot topic a few years ago with the rise of services like websnapr, thumbalizr, and, perhaps the most well-known, snap shots. The popularity of doing this has faded somewhat, most likely due to the realizations that services like snap shots are not all that helpful and more of an annoyance than anything else.
I did my own thumbnail generator app a few years ago for a CentOS system. It was painful. I believe I was using gtkmozembed and just going through hell to resolve all the dependencies to get things running on a command line system.
With the WebKit stuff I recently put up, I decided to revisit the idea under .NET, and things were surprisingly easy. This is due to the fact that WinForms 2.0 has the Control.DrawToBitmap() method which makes capturing an image of the client area of a control relatively easy. I should mention, .NET 2.0 has the WebBrowser control, but I could never get the Control.DrawToBitmap() method to work; I always got a blank (white) image. The WebBrowser control is also undesirable because it’s based on IE6, so its certainly far from up-to-date or accurate rendering, even more-so than the outdated version of WebKit I’m using.
So let’s say we’re running a WebKit instance in a Panel (panel1), we can capture what’s displayed as follows,
Bitmap bmpOut = new Bitmap(panel1.Width, panel1.Height);
Here’s an example of a capture:
There is one critical limitation; only the visible portion of the web page is captured. However, if a component to scroll and capture different portions of the page were implemented, and the final image was pieced together from these portions, a full page would be captured.
Another limitation is there there is no way to tell when the page is finished loading. Figuring this out requires digging into the depths of the WebKit codebase, I’m certainly not up for it, and it’s not for the faint of heart.