Streams are a fundamental programming primitive for representing the flow of data through your system. It's time we brought this powerful tool to the web. What if we could stream data from an HTTP request, through a web worker that transforms it, and then into a <video> tag? Over the last year, I've been working on the WHATWG streams specification, which builds upon the lessons learned in Node.js to provide a suitable abstraction for the needs of the extensible web.
I'll discuss briefly why streams are important, what they enable, and the role we envision them playing in the future of the web platform. Mostly, though, I want to help you understand streams, at a deep level. In the course of writing this specification, I've learned a lot about streams, and I want to share that knowledge with you. At the core, they are a very simple and beautiful abstraction. I think we've done a good job capturing that abstraction, and producing an API the web can be proud of. I'd love to tell you all about it.
sources of streaming data
AJAX
web sockets
files
web workers
IndexedDB
web rtc
webcam
microphone
geolocation.watchPosition
the human finger
setInterval
web audio sources
transformations of streaming data
CSV parser
the html parser
JSON to HTML via template
string encoding/decoding
audio/video codecs
encryption/decryption
gzip
web workers
the future
var ws = new WebSocket("ws://example.com/analytics");
var events = new EventStream(document, "click");
events
.pipeThrough(new EventThrottler())
.pipeThrough(new EventsAsJSON())
.pipeTo(ws.input);
Let's talk for a minute about a very special book. This is the Principia (in full, the Philosophiæ Naturalis Principia Mathematica), by Isaac Newton. It's one of the most important works in the history of science. Among the many brilliant concepts introduced in the Principia, from calculus to orbital mechanics, underlying it all was one central abstraction. Newton, more so than any before him, was able to see to the core of the universe and understand one of the fundamental ideas that allow us to describe it. This fundamental idea of Newton's was the idea of a force.
The concept of force is so powerful because it explains and unifies so much of the world around us. From gravity to electromagnetism to subatomic processes, everything we experience fits within the framework of forces acting on objects. Before Newton so clearly articulated the force concept, and formalized it with his rules of motion, the many phenomena of the natural world seemed separate: special cases, each with their own rules and laws. How do the planets orbit the sun? What is lightning? How do chemical reactions transpire? But once the underlying primitive of force was made clear, we were able to understand the world around us in a much more comprehensive manner. We think about the actors of the system, and the forces between them. And at this level of abstraction, science progressed rapidly, bringing us to the world we have today.
It's because of this kind of abstract thinking, on the level of fundamental primitives like forces, that we can understand and manipulate the universe at the level we do today. Once you know the core concepts of a system, you can build higher-level concepts on top of them, or bend them to your will. You can accomplish things that would have seemed superhuman beforehand—but are now obvious, or even easy.
Which brings us to streams. Because of course, in programming, just as in real life, it's the underlying primitives—the fundamental abstractions—that give us the real power. We can unify large portions of the programming landscape under streams, and in doing so accomplish things easily and naturally which before might not even have occurred to us. So let’s talk about how exactly streams do this for us…
Data comes out of a readable stream; you can read data from it. Whether it be any of the sources we mentioned before, or even just a bunch of flappy birds: the idea is to capture a streaming source of data as a concept.
Data goes in to a writable stream; you write data to it. Any of the sinks we mentioned can be encapsulated in the notion of a writable stream.
Transform streams are just both together: a writable stream that data goes in to, and a readable stream that data comes out of. In this case the transformation is very simple: it’s synchronous, so we can read immediately after writing, and it’s one-for-one, meaning each thing that goes in results in something coming out. In reality things are usually more complicated, e.g. a compressor will have much less data coming out than going in, and most transform streams take time to process their input before producing output.
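To make that concrete, here is a minimal sketch of a synchronous, one-for-one transform: a writable side and a readable side sharing a queue. The names (`SimpleTransform`, `input`, `output`) are illustrative, not the spec's API.

```javascript
// A toy transform stream: { input, output } around a shared queue.
class SimpleTransform {
  constructor(transformFn) {
    this.queue = []; // chunks already transformed, awaiting read
    // The writable side: each chunk written is transformed immediately.
    this.input = { write: (chunk) => this.queue.push(transformFn(chunk)) };
    // The readable side: because the transform is synchronous and
    // one-for-one, a read right after a write always finds data.
    this.output = { read: () => this.queue.shift() };
  }
}

const upperCaser = new SimpleTransform((s) => s.toUpperCase());
upperCaser.input.write("hello");
console.log(upperCaser.output.read()); // "HELLO"
```

A real compressor or codec would break both assumptions: writes would complete asynchronously, and one write might produce zero or many readable chunks.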
The most fundamental thing you can do with streams is pipe them to each other. In essence, this is the operation of reading from one and then writing to the other. But inside this seemingly-simple operation, we encapsulate a lot of complexity: matching the flow rates, propagating errors or other signals, and so on. But once we have that in hand…
Here we see a more complex pipe chain, where data flows in one form from the original readable stream, being transformed twice before ultimately ending up in a writable stream. pipeThrough is a two-line sugar method that builds on pipeTo and the { input, output } structure of transform streams. So what does this look like, in practice? Well…
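Here is a rough sketch of how pipeThrough can be that two-line sugar, given transform streams shaped as { input, output }. The `ReadableThing` and `makeTransform` names are hypothetical stand-ins, and the naive synchronous pipe ignores backpressure entirely.

```javascript
// A toy readable with a naive pipeTo, plus pipeThrough as sugar on top.
class ReadableThing {
  constructor(chunks) { this.chunks = chunks.slice(); }
  pipeTo(writable) {
    // Naive pipe: read everything and write it onward, in order.
    while (this.chunks.length) writable.write(this.chunks.shift());
    return writable;
  }
  pipeThrough(transform) {
    // The whole method: pipe into the transform's writable side,
    // then hand back its readable side for further chaining.
    this.pipeTo(transform.input);
    return transform.output;
  }
}

// A helper producing an { input, output } transform pair.
function makeTransform(fn) {
  const out = new ReadableThing([]);
  return {
    input: { write: (chunk) => out.chunks.push(fn(chunk)) },
    output: out,
  };
}

const result = [];
new ReadableThing([1, 2, 3])
  .pipeThrough(makeTransform((n) => n * 10))
  .pipeTo({ write: (chunk) => result.push(chunk) });
console.log(result); // [10, 20, 30]
```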
What I and others have been working on for the last few months is streams for the web. We’re putting together a spec, and a reference implementation, for streams that could go in your browser and solve all those problems I mentioned earlier. Most importantly, we’re trying very hard to draw upon the experience of Node. We get the benefit of a clean slate, so we can produce nicer APIs, but we want to make sure to incorporate all the important features of Node’s streams, and do better where possible. Isaac and others have been advising us on what can be fixed, simplified, or added, and we’ve been heavily drawing on their experience. Note that we’re doing our spec development in the open, on GitHub! Now if you go there, you’ll notice there are still a lot of open issues: this is very much a work in progress. But it’s shaping up really well, with a few implementers on board already.
So. That’s the high-level overview of what’s going on. But I want to spend what time we have left on some more detailed stuff. Because it turns out that, when you go to write a spec for something, you end up having to really understand it: to dig deep into areas that before you just glossed over and made assumptions about. I’ve found this process really fascinating, and I want to share some of what I’ve learned along the way with you.
Consider a readable stream, wrapping some underlying source. There are two types: push sources, and pull sources. A push source, like a TCP socket, will be constantly generating data, like an EventEmitter. A pull source, like a file, requires you to read from it: seek, read a specific length, etc. We want to unify both of these into a single abstraction, the readable stream. Readable streams can then present either a push or a pull interface: on(‘data’) for push, or read() for pull. The problem with on(‘data’) is that you lose data if you aren’t listening! That was Node’s original streams1 mistake. So we want a pull model; it is much more user-friendly.
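The core of that unification can be sketched in a few lines: a push source deposits chunks whether or not anyone is listening, and consumers pull at their own pace. The `ToyReadable` class and its method names are made up for illustration.

```javascript
// A toy readable that hides a push source behind a pull interface.
class ToyReadable {
  constructor() { this.buffer = []; }
  // A push source (e.g. a socket) calls this whenever data arrives,
  // whether or not anyone is currently reading: nothing is lost.
  push(chunk) { this.buffer.push(chunk); }
  // Consumers pull at their own pace.
  read() {
    if (this.buffer.length === 0) return undefined; // nothing available yet
    return this.buffer.shift();
  }
}

const r = new ToyReadable();
r.push("arrived before anyone listened");
// Later, a consumer reads — the early data is still there.
console.log(r.read()); // "arrived before anyone listened"
```

With an on(‘data’) push interface, that first chunk would have been fired at nobody and lost; with read(), it waits in the buffer.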
Now, let’s think about that losing-data problem, and how we’ve solved it. When a push source underlies our readable stream, we’re going to keep getting data. We need to keep it ready, and not throw it away, for when somebody calls read() on the stream. So every readable stream carries around a buffer with it, containing all the data that’s come in so far but hasn’t been read. We can even use that buffer in the pull source case. Instead of using it to store data that’s being pushed at us, we can pull data into the buffer ahead of time, so that it’s ready to be read quickly when someone calls read(). This is a nice performance improvement over waiting until we are read from to go out and do our expensive disk access, for example.
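The pull-source version of that buffering, read-ahead, can be sketched like this. `pullFn` stands in for an expensive operation like a disk read; the class name and read-ahead count are illustrative.

```javascript
// A toy stream that pulls data into its buffer before anyone asks,
// so read() can answer immediately instead of waiting on the source.
class ReadAheadStream {
  constructor(pullFn, readAheadCount) {
    this.buffer = [];
    this.pullFn = pullFn;
    // Eagerly pull a few chunks so early reads don't wait on the source.
    for (let i = 0; i < readAheadCount; i++) this.buffer.push(pullFn());
  }
  read() {
    const chunk = this.buffer.shift();
    this.buffer.push(this.pullFn()); // top the buffer back up for next time
    return chunk;
  }
}

let next = 0;
const s = new ReadAheadStream(() => next++, 3);
console.log(s.read()); // 0 — already buffered, no pull needed at read time
```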
Of course, this naturally leads you to a problem: what if your buffer is getting “too full”? That is, what if nobody is reading from it for a long period of time? Or maybe they’re reading from it pretty slowly. Like, what if you’re piping a fast filesystem stream to a slow server? Or a webcam stream to a peer on a slow mobile connection? The answer for this is called backpressure. It means, when your buffer is too full, you send some signal to the underlying push source, saying “stop sending me so much data.” Or, for a pull source, you just stop pulling so much data. It might not comply immediately, in which case you have to keep the data anyway: throwing away data is bad! But communicating this “pause” signal is crucial.
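A minimal sketch of that pause signal, assuming a source object with hypothetical pause() and resume() methods. Note that chunks arriving past the limit are still kept: we never drop data, we only ask the source to slow down.

```javascript
// A toy readable that signals backpressure to its underlying source.
class BackpressuredReadable {
  constructor(source, limit) {
    this.buffer = [];
    this.source = source; // expected to expose pause() and resume()
    this.limit = limit;
  }
  enqueue(chunk) {
    this.buffer.push(chunk); // keep it even if we're over the limit
    if (this.buffer.length >= this.limit) this.source.pause();
  }
  read() {
    const chunk = this.buffer.shift();
    if (this.buffer.length === 0) this.source.resume();
    return chunk;
  }
}

const log = [];
const fakeSource = { pause: () => log.push("pause"), resume: () => log.push("resume") };
const stream = new BackpressuredReadable(fakeSource, 2);
stream.enqueue("a");
stream.enqueue("b"); // hits the limit: the source is told to pause
stream.read();
stream.read();       // drained: the source is told to resume
console.log(log);    // ["pause", "resume"]
```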
There’s one more interesting piece in this whole puzzle. Which is, “how full is too full?” The way we usually think of this is in terms of something called a high-water mark. We let the stream’s buffer fill up until it reaches a certain point, at which point we send the pause signal to the underlying source. Then we wait for the buffer to be drained, e.g. for someone to read all of the data we have. Once it’s all drained, we send a “resume” signal, and the buffer starts filling up again. We’re still figuring out if this is exactly the best approach. Other approaches can be more complicated, involving e.g. low-water marks that let you resume before fully draining, or they can have no water marks at all, and just constantly send pause/resume signals. It’s a bit tricky.
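One of the variants mentioned above, high- plus low-water marks, can be sketched as a small controller: pause when the buffer size crosses the high mark, resume once it falls back to the low mark, rather than waiting for a full drain. The function name and the marks are illustrative.

```javascript
// A toy water-mark controller: given buffer-size updates, it decides
// when to emit "pause" and "resume" signals to the underlying source.
function makeWaterMarkController(highWaterMark, lowWaterMark) {
  let paused = false;
  return {
    onBufferSizeChange(size) {
      if (!paused && size >= highWaterMark) { paused = true; return "pause"; }
      if (paused && size <= lowWaterMark) { paused = false; return "resume"; }
      return null; // no signal needed
    },
  };
}

const c = makeWaterMarkController(4, 1);
console.log(c.onBufferSizeChange(4)); // "pause"  — hit the high-water mark
console.log(c.onBufferSizeChange(2)); // null     — still above the low-water mark
console.log(c.onBufferSizeChange(1)); // "resume" — drained enough to restart
```

Setting lowWaterMark to 0 recovers the simpler drain-completely behavior described above.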
Writable streams also have buffers, but of course for a different reason than readable streams. The problem we’re solving for writable streams is this: most sinks expect a single write at a time, and expect those writes in order. So if someone writes two chunks of data to us, we need to be able to wait until the first chunk finishes before sending in the second chunk. While we’re waiting, we store those chunks in the buffer. What’s useful about that is that we can use the fullness of this buffer to communicate backward in a pipe chain. That is: if the writable stream being piped to is full, then we should stop reading from the readable stream and wait for the writable stream to drain its buffer first. It all fits together!
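Here is a sketch of that one-write-at-a-time queuing. The sink interface is hypothetical: `sinkWrite(chunk, done)` calls done() when the chunk has been fully written, and `queuedCount` is what a pipe would consult for backpressure.

```javascript
// A toy writable that feeds a one-at-a-time sink, queuing extra chunks.
class ToyWritable {
  constructor(sinkWrite) {
    this.sinkWrite = sinkWrite;
    this.queue = [];
    this.writing = false;
  }
  write(chunk) {
    this.queue.push(chunk);
    this._pump();
  }
  _pump() {
    if (this.writing || this.queue.length === 0) return;
    this.writing = true;
    this.sinkWrite(this.queue.shift(), () => {
      this.writing = false;
      this._pump(); // next queued chunk, still in order
    });
  }
  get queuedCount() { return this.queue.length; } // exposed for backpressure
}

const written = [];
let finish;
const w = new ToyWritable((chunk, done) => { written.push(chunk); finish = done; });
w.write("a");               // goes straight to the sink
w.write("b");               // queued: the sink is still busy with "a"
console.log(w.queuedCount); // 1
finish();                   // "a" completes, so "b" is sent
console.log(written);       // ["a", "b"]
```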
OK. So. That is the most intricate part of understanding streams, and why they’re important. It’s all about backpressure, buffering, and the pipe chains. Take a deep breath. Now for some easier stuff.
One of the other interesting questions that came up was: can streams be entirely generic? Or do we have to have byte streams, or string streams, or “object mode” like Node.js has, or other such switches? Well, the answer is “yes, but be careful.” In particular, you need to be careful with your high-water marks. It’s easy to say that your high-water mark is a megabyte, but … is it really meaningful to say that it’s 16 objects? How big are these objects? What do they hold? If it’s 16 <img>s, that might be too high. If it’s 16 prices streaming from the server, then that might be too low. You need to think carefully about this. This is one of the reasons we’re not sure high-water marks are the best idea after all. One thing you don’t want to do is mix up strings and bytes in the same stream. Node streams have this kind of confusion built in, where you can set the encoding for a byte stream and suddenly it becomes a string stream, and it’s horrible. Think about Unicode characters getting cut off in the middle, and so on. Bad news. Lesson learned.
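One way to see the problem is that "how full" needs a measure that depends on what the chunks are: counting chunks makes sense for objects, while byte streams should sum byte lengths. A sketch, with illustrative strategy objects rather than any spec API:

```javascript
// Two toy "size" strategies for measuring how full a buffer is.
const countChunks = { size: (chunk) => 1 };              // 1 per chunk
const byteLength = { size: (chunk) => chunk.length };    // sum of lengths

function bufferSize(buffer, strategy) {
  return buffer.reduce((total, chunk) => total + strategy.size(chunk), 0);
}

const byteBuffer = ["abcd", "efgh"]; // stand-ins for binary chunks
console.log(bufferSize(byteBuffer, countChunks)); // 2 chunks...
console.log(bufferSize(byteBuffer, byteLength));  // ...but 8 bytes
```

Comparing a high-water mark against the first measure for an object stream, and the second for a byte stream, is one way to keep a single generic stream type while still making "too full" meaningful.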
If you think hard about all the systems we’re dealing with here, it turns out that, although in the general case they’re async, in reality much of the time the data is available synchronously. For example, when you read from a stream representing a file on disk, in general that will involve going out to the disk---an async operation. But much of the time, that operation could actually complete synchronously---your stream might be holding the data in its buffer already, or the OS might have cached that file into memory because it is accessed a lot, or any other such thing. Similarly, if you do a write, often you’re actually writing to an in-memory representation of the file, which the OS will flush at some later, scheduled time. So it’s very important that your basic reading and writing APIs present the ability to read and write data synchronously, when possible. It’s tempting to make everything simpler, and just be async always. But this introduces artificial delays into the system, which are especially bad if you introduce them at every step in the pipe chain, as they transform your best cases from a smooth flow of data into a kind of stutter-stop.
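A sketch of what such an API shape might look like: the read answers synchronously when data is already buffered, and only falls back to async work otherwise. `fetchAsync` stands in for a real async source (disk, network); all names here are illustrative.

```javascript
// A toy reader with a synchronous fast path for already-buffered data.
class MaybeSyncReader {
  constructor(fetchAsync) {
    this.buffer = [];
    this.fetchAsync = fetchAsync;
  }
  read(callback) {
    if (this.buffer.length > 0) {
      callback(this.buffer.shift()); // sync fast path: no artificial delay
      return;
    }
    this.fetchAsync((chunk) => callback(chunk)); // slow path only when needed
  }
}

const reader = new MaybeSyncReader((cb) => setTimeout(() => cb("from disk"), 0));
reader.buffer.push("cached");
let sawItSynchronously = false;
reader.read((chunk) => { sawItSynchronously = true; });
console.log(sawItSynchronously); // true — the callback ran before read() returned
```

An always-async design would defer even the "cached" case to a later turn of the event loop; done at every step of a pipe chain, those deferrals add up to the stutter-stop described above.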
What happens if your stream encounters an unrecoverable error reading or writing?Well, generally this means that the stream is no good and should be thrown away. All buffered data is thrown away; any further attempts to read or write fail; etc.But what about if it’s in a pipe chain?
This is where a relatively new concept comes in: the abort and cancel signals. These are not present in Node streams, but forms of them are present in many of the experimental user-space streams I mentioned. And we’ve refined them already in our work, so much so that I haven’t even updated the spec yet---I just have an open issue with my thoughts. The idea is that you can abort a writable stream, saying, “stop writing, throw everything away, this whole thing was a failure.” And you can cancel a readable stream, saying “for whatever reason, I don’t need you anymore; stop reading, clean up, and go home.” These are slightly different. Let’s see how these two signals play out in a pipe chain, in error situations:
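The two teardown signals can be sketched side by side. The method names (cancel, abort) follow the talk; the class shapes and the stop callbacks are made up for illustration.

```javascript
// Toy stream ends showing cancel (readable) vs. abort (writable).
class ToyReadableEnd {
  constructor(stopReading) { this.buffer = ["x"]; this.stopReading = stopReading; }
  cancel(reason) {
    // "I don't need you anymore": stop the source and clean up quietly.
    this.buffer = [];
    this.stopReading(reason);
  }
}
class ToyWritableEnd {
  constructor(stopWriting) { this.queue = ["y"]; this.stopWriting = stopWriting; }
  abort(reason) {
    // "This whole thing was a failure": drop everything queued, stop the sink.
    this.queue = [];
    this.stopWriting(reason);
  }
}

const signals = [];
const rEnd = new ToyReadableEnd((why) => signals.push("source stopped: " + why));
const wEnd = new ToyWritableEnd((why) => signals.push("sink stopped: " + why));
rEnd.cancel("no longer needed");
wEnd.abort("downstream error");
console.log(signals);
```

In a pipe chain, an error on one side would propagate as the matching signal to the other: a failed writable cancels the readable feeding it, and a failed readable aborts the writable it was feeding.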
The web is under attack. We’re too close to really see this, but consider how many startups these days build native apps before web apps---how much attention and user interest is captured in those walled gardens, those app stores. Our platform may have the most momentum right now, but it also has real problems---problems that could start our slow backslide into the same slow oblivion that’s greeted Java-on-the-desktop, Flash, and .NET. But I believe an open ecosystem can win, by leveraging its strengths. This includes obvious things---like shareable URLs; or auto-updating sites that don't require centralized approval; or excellent search engines much better than those of any app store. But it also includes our community, and how they contribute to our platform. This is why I’m a co-signer of the Extensible Web Manifesto. It’s a new approach to building our web platform, and it makes two points very dear to my heart. First, that we should focus on adding low-level primitives---the unifying concepts, like force, or streams, that I opened this talk with. Second, that we should use iteration by the community to inform our higher-level APIs. The streams work is an example of this process as well, given its heritage. It’s a perfect fit for our new extensible web. So this is why I am so excited about streams: it’s an important piece of the larger puzzle, of extending the web forward as best we can.