Two years ago I inherited jsdom, a project of breathtaking scope. In essence, it aims to simulate a browser in JavaScript, by implementing JavaScript versions of a wide variety of web standards (much more than just the DOM). While maintaining jsdom for the last two years, and eventually bringing it to a 1.0 release, I learned an incredible amount about the web. I want to share with you what I’ve learned: the history of the standards and implementations of them that make up the web; the interaction between seemingly-disparate parts of the platform; and all about the strange APIs we’ve come to know and love. You should walk away from this talk with a new appreciation for how browsers work—and how, through the ongoing effort of a community of contributors and package maintainers, we’ve been able to recreate one in pure JavaScript.
2. WHAT IS JSDOM?
A JavaScript implementation of the DOM*,
for use with Node.js**
* Actually much more than just the DOM
** Actually it runs in more places than just Node.js
20. TMPVAR/JSDOM#950
“In the browser, you can do:
window.document.querySelectorAll([
'link[type="text/xml"]',
'link[type="application/rss+xml"]',
'link[type="application/atom+xml"]'
]);
This doesn't work in jsdom.”
It’s my belief that to truly, deeply understand something … you have to re-implement it in JavaScript.
That’s why I want to tell you about the most rewarding open-source project I’ve been involved in since joining the JavaScript community: a project called jsdom.
The tagline of jsdom is that it’s a JavaScript implementation of the DOM, for use in Node.js. (Click to show asterisks.) This isn’t entirely accurate, and we’ll talk more about the details soon, but the basic idea is there. We’re starting in an environment divorced from the web, with just the basics of JavaScript to guide us: objects, functions, arrays, etc. And we want to produce a simulacrum of the many APIs that make a web page work: things like window, and document, and querySelector, and XMLHttpRequest. We want to produce such an accurate emulation that we can then run code meant for a real browser against our implementation. In essence, we are reimplementing much of the browser, in JavaScript. The original reason for creating jsdom was for server-side rendering. You could run the JavaScript that composes a page out of data and templates and all that, against this pure-JavaScript environment, and get a bunch of HTML out at the end. You’d then send this down the wire to render in the browser, and then run that same JavaScript in the browser against the already-rendered DOM. But jsdom has grown far beyond that.
I got involved with jsdom a couple years ago, mainly using it for testing. I was writing a Windows 8 app in HTML and JS, and it turns out that there’s no way to run automated unit tests in Windows 8. Crazy, right? So I found this project called jsdom, that provided a virtual environment in which I could run my unit tests. I submitted a few pull requests, then a few more, and pretty soon I was hooked. I’ve been helping to maintain the project ever since, with involvement from some 124 other contributors as well.
Here’s an example of perhaps the simplest use of jsdom: using it as a web scraper. You can see how we load a URL into the jsdom environment, then also load our own script---jQuery. Once everything has loaded into the environment, we can manipulate the window object to do things like select elements using jQuery and count or manipulate them.
Note that all of this is taking place purely in Node.js---in JavaScript. There’s no instance of Chrome being booted up; no Selenium server in the background. The process of building up a virtual tree of nodes into a document, and into a fully-functional window object, is taking place entirely in terms of JavaScript objects provided by the jsdom library. Pretty cool, right?
Here’s an example that’s a bit more complicated. It shows how you can use jsdom for testing.
In this case, we’re testing some code that’s meant to run in the browser via browserify. First we create the JavaScript for the browserify bundle, just as a string of JS. Then we use jsdom to create an in-memory window from a basic index.html page. Then we manipulate this window: we create a <script> element, and insert the output from browserify into that script element. Finally we test to see if the script modified the contents of our window’s <body> in the expected manner.
I use jsdom for these kinds of tests in many of the libraries I write. In this way it can replace costly out-of-process tools like PhantomJS or Selenium.
Finally, here’s an example of using jsdom in real-time as part of a server pipeline. This is a quick-and-dirty HTTP server that will take any URL you give it, and use jsdom to flip all the images, in real time. It does this by using jsdom’s canvas functionality to rotate the image, then get the result as a data URL, which it then modifies the DOM to point to. And indeed…
It works!
… These examples give you some idea of some of the more common uses of having an in-JavaScript DOM: scraping, testing, and real-time manipulation. People have built more complicated tools on top of jsdom, for example:
Zombie.js, which takes the idea of an in-memory window one step further to give you an in-memory browser, where you can submit forms, click links, and more.
Or Facebook’s Jest testing framework, which uses jsdom to run your tests lightning-fast against a mocked browser environment.
The actual “DOM” standard is just a single document, a living standard hosted at dom.spec.whatwg.org. It defines only the basics, really: events, node trees, attributes, mutation observers, and documents.
You might think creating jsdom was just a matter of translating this spec into code.
It turns out that to create a useful document and window object, you need a lot more than just the DOM Standard and its node trees. You also need HTML, for the definition of all the elements that will appear in that DOM. You need the spec for parsing HTML, in order to construct the node tree from a string. You need the spec for serializing, in order to make innerHTML work. And of course you need things like XHR and URL parsing. What’s more, even though jsdom doesn’t do any actual layout calculations (yet?), you do need some CSS implemented. CSS selectors, of course, so that querySelector works, but also the CSS object model, so that when things animate or hide or show, you can reflect that in the .style property.
jsdom was coded before the modern DOM and HTML standards existed. Instead, we have this code structure that tries to build itself up in steps: first DOM level 1, then… well… (Slides) It turns out browsers never actually implemented a lot of this stuff. Much of it was crazy XML things; others were just very obscure; etc. The modern DOM and HTML standards codify what browsers actually implement, in a single spec, and they have a note about how a bunch of stuff is obsolete and should be removed from implementations that still have it. So jsdom is actually built in this same layered way, reflecting the historical way specs grew up, and needs a good clean-up effort.
That’s just a small taste of the unfortunate truth about the standards underlying the web platform, but we’ll stop there for today.
Anyway, the good news is we’re fixing that; in jsdom 2.0, we’re starting the process of squashing everything down into a single implementation, instead of the layer cake.
As part of this effort, we’re starting to run the web-platform-tests suite against jsdom. Web-platform-tests is a cross-vendor test suite meant to cover, essentially, all of the web platform: DOM, HTML, CSS, Shadow DOM, Service Worker … it’s all there.
It’s not perfect by any means. Coverage is pretty spotty … there are only two tests for all of <select>, for example. It’s generally understaffed, and it’s sometimes hard to get vendors to contribute their tests to the suite. But it’s still a really cool project, and I’m excited for jsdom to start participating in it---hopefully we can become one of the vendors contributing our tests back, so that they’re run in all the “real” web browsers out there. Right now we’re just running a tiny subset, but our policy is that new features should be implemented by pulling in the appropriate web-platform-test.
Finally, I want to give a shout out to the other projects that make jsdom possible. We depend on these guys for some of the trickiest and most important parts of jsdom. They’re all maintained by separate people, outside of the jsdom team, who have taken the time to produce a faithful implementation of the relevant part of the web platform. I’ve been consistently impressed by how professional the maintainers of these projects are, and the extent to which they’re willing to work with jsdom.
For example, we recently had a great collaboration with the author of parse5, our HTML parser, to get support for the <template> tag into jsdom. And when we wanted to make jsdom browserifyable, so that you could run it in a web worker, we discussed some of the tricks we needed with the author of cssstyle, and were able to get that up and running pretty quickly. It’s been really great.
I want to finish up by talking about the future of jsdom, but I’m going to do it in a bit of a roundabout way. I promise it’ll all connect up.
First I want to share with you this interesting issue that was opened on jsdom pretty recently. (read the slide)
I was pretty puzzled by this, as I’d never heard of querySelectorAll taking an array before. So I went to look at the spec:
The thing to notice is that querySelectorAll is specced as taking a “DOMString,” which to us mere mortals just means a JavaScript string.
But what’s important to realize about the spec is that when it says something takes a DOMString, it means that any argument you pass it gets converted to a string. So from this perspective, the behavior makes sense:
When you convert an array to a string, it does a join with commas. So that’s why you can pass an array to querySelectorAll, and it’ll still work.
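The conversion is easy to check in plain JavaScript, without any DOM at all. The snippet below is a small sketch using the array from the issue; the variable names are just for illustration.

```javascript
// The array from the issue, as it was passed to querySelectorAll.
const selectors = [
  'link[type="text/xml"]',
  'link[type="application/rss+xml"]',
  'link[type="application/atom+xml"]'
];

// Converting to a string uses JavaScript's ordinary string conversion,
// which for arrays gives the same result as join(",").
const asString = String(selectors);

console.log(asString === selectors.join(",")); // true
console.log(asString);
```

The result is a comma-separated selector list, which is exactly the kind of string querySelectorAll already accepts.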
It turns out that this is a pretty general problem, and it occurs because we’ve been mostly ignoring the way specs are written. Specs are written in this horrible language called “WebIDL,” with weird concepts like “interfaces” and “readonly attributes.” What browsers actually do, given this, is they write code generation tools to take WebIDL and turn it into C++ implementations of the DOM APIs, with all the type conversions and such baked in. This is the kind of step we’re missing from jsdom: something that takes the machine-readable WebIDL language written in the specs, and auto-generates the correct type conversions and so on.
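As a rough illustration of that missing step, here is a hand-rolled sketch of the idea: a small table of type conversions, and a wrapper that applies them before the real implementation runs. The names `conversions`, `wrapOperation`, and `querySelectorAllImpl` are hypothetical and are not part of jsdom or of any WebIDL tool, and real WebIDL conversions cover many more types and edge cases than this.

```javascript
// Conversions for two WebIDL types, heavily simplified for illustration.
const conversions = {
  DOMString: (value) => String(value),
  boolean: (value) => Boolean(value)
};

// Wrap an implementation so that every argument is converted according
// to its declared type before the implementation sees it.
function wrapOperation(argTypes, impl) {
  return function (...args) {
    if (args.length < argTypes.length) {
      throw new TypeError("Not enough arguments");
    }
    const converted = argTypes.map((type, i) => conversions[type](args[i]));
    return impl.apply(this, converted);
  };
}

// A stand-in implementation that only understands strings.
function querySelectorAllImpl(selectors) {
  return selectors.split(",");
}

const querySelectorAll = wrapOperation(["DOMString"], querySelectorAllImpl);

// An array now works, because it is converted to a string first.
console.log(querySelectorAll(["a", "b"]));
```

The point of the design is that the implementation never has to think about type conversion; a generator could emit the wrapper from the IDL signature alone.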
In fact, you can go further than this in some cases. You can generate the entire HTMLHRElement class from its IDL definition. This definition, translated into JavaScript speak, is just saying that it has a number of getters and setters which “reflect” the values of the corresponding HTML attribute. It really just means this:
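The slide’s code isn’t reproduced in these notes. As a stand-in, here is a minimal sketch of what reflection amounts to, covering only two of the interface’s attributes (`align` and `noShade`). `FakeElement` and `HTMLHRElementSketch` are hypothetical names invented for this example, not jsdom classes, and the attribute storage is deliberately simplistic.

```javascript
// Minimal stand-in for an element with attributes; NOT jsdom's Element.
class FakeElement {
  constructor() {
    this._attributes = new Map();
  }
  getAttribute(name) {
    return this._attributes.has(name) ? this._attributes.get(name) : null;
  }
  setAttribute(name, value) {
    this._attributes.set(name, String(value));
  }
  hasAttribute(name) {
    return this._attributes.has(name);
  }
  removeAttribute(name) {
    this._attributes.delete(name);
  }
}

class HTMLHRElementSketch extends FakeElement {
  // String attribute: reflects the "align" content attribute.
  get align() {
    const value = this.getAttribute("align");
    return value === null ? "" : value;
  }
  set align(value) {
    this.setAttribute("align", String(value));
  }

  // Boolean attribute: true exactly when "noshade" is present.
  get noShade() {
    return this.hasAttribute("noshade");
  }
  set noShade(value) {
    if (value) {
      this.setAttribute("noshade", "");
    } else {
      this.removeAttribute("noshade");
    }
  }
}

const hr = new HTMLHRElementSketch();
hr.align = "center";
hr.noShade = true;
console.log(hr.getAttribute("align"));   // "center"
console.log(hr.hasAttribute("noshade")); // true
```

Every getter and setter follows the same mechanical pattern, which is what makes this kind of class a good candidate for generation rather than hand-writing.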
What I’ve done here, is “generated” a JavaScript implementation of HTMLHRElement from its WebIDL definition. Just like how browser vendors generate C++ implementations from WebIDL! This way we’re guaranteed to get all the correct semantics, as specified in the appropriate standard, and not create any oversights. Well of course, as I said at the beginning, to truly understand something, like how browsers generate their implementations of the DOM from IDL, you have to reimplement it in JavaScript.
So that’s what I’ve been doing.
These tools are actually built in the service of a separate project of mine, HTML as Custom Elements, which I’d love to talk with you about. The basic idea is that they’re able to take as input WebIDL, with all its weird features, and convert it into straight JavaScript classes. In other words, they are an automated version of the translation that I did by hand in the last two slides. The first generates the class; the second is responsible for the details of reflecting WebIDL properties to HTML attributes; and the last is responsible for doing all the type conversions, some of which are actually pretty complicated. But the main takeaway here is to realize that the jsdom team and I are moving ever forward, becoming more and more a real browser, and learning more and more along the way.
So that’s it! I’d encourage you to come contribute to jsdom with me, and learn how the web platform works … by reimplementing it in JavaScript.