1 | What is a Stream?

A Stream is a flow of data of unknown size, with the data being accessible piece by piece (chunk by chunk) over time. A Stream can represent a source of data, a destination of data, or both. In a sense, a Stream is the opposite of a Buffer, which is a block of data of known size, with the data being accessible all at once.

Internally, a Stream uses a Buffer to temporarily store each chunk of data. When chunks of data are "pulled out" of the Stream through that Buffer, the Stream is a Readable Stream (a source of data that can be read). When chunks of data are "pushed in" to the Stream through that Buffer, the Stream is a Writable Stream (a destination for data that can be written).
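To make the two roles concrete, here is a minimal sketch that moves chunks from a Readable Stream to a Writable Stream by hand. The helper name copyChunks and the sample chunks are mine, chosen only for illustration, and the sketch ignores backpressure (the return value of write()) to stay short.

```javascript
const { Readable, Writable } = require('node:stream');

// Hypothetical helper (not a Node.js API): moves chunks from a source
// to a destination by hand and resolves with what the destination received.
function copyChunks(chunks) {
  return new Promise((resolve, reject) => {
    const received = [];

    // Readable Stream: a source that hands out its data one chunk at a time.
    const source = Readable.from(chunks);

    // Writable Stream: a destination that accepts data one chunk at a time.
    const destination = new Writable({
      write(chunk, encoding, callback) {
        received.push(chunk.toString());
        callback(); // tell the Stream that this chunk has been handled
      },
    });

    // Forward every chunk as soon as it is available.
    // Simplification: the return value of write() is ignored here.
    source.on('data', (chunk) => destination.write(chunk));
    source.on('end', () => destination.end());

    source.on('error', reject);
    destination.on('error', reject);
    destination.on('finish', () => resolve(received));
  });
}

copyChunks(['first ', 'second ', 'third']).then((received) => {
  console.log(received);
});
```

Notice that neither side ever needs the whole data set: the source emits one chunk, the destination handles it, and only then does the next chunk matter.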

All Streams are instances of the Stream class or of a Stream subclass.

Streams are also Emitters (instances of EventEmitter): they emit events, and we can register event handlers (listeners) to deal with the emitted events.
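A quick way to check both claims is to ask Node.js directly. The sketch below tests a few built-in Streams with instanceof, then registers handlers for the 'data' and 'end' events of a Readable Stream. The helper name recordEvents is an assumption of mine, used only to keep the example self-contained.

```javascript
const { Stream, Readable, PassThrough } = require('node:stream');
const { EventEmitter } = require('node:events');

const readable = Readable.from(['a', 'b']);
const passThrough = new PassThrough();

// Every Stream inherits from the Stream class, which itself inherits
// from EventEmitter.
console.log(readable instanceof Stream);
console.log(passThrough instanceof Stream);
console.log(readable instanceof EventEmitter);

// Hypothetical helper: records the events emitted by a Readable Stream.
function recordEvents(chunks) {
  return new Promise((resolve, reject) => {
    const events = [];
    const source = Readable.from(chunks);

    // Register handlers; the Stream calls them when it emits the event.
    source.on('data', (chunk) => events.push(`data: ${chunk}`));
    source.on('end', () => {
      events.push('end');
      resolve(events);
    });
    source.on('error', reject);
  });
}

recordEvents(['a', 'b']).then((events) => console.log(events));
```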

Note: almost all Node.js applications, no matter how simple, use Streams in some manner. So it is fundamental to master Streams if you want to master Node.js.

2 | Why use Streams?

When implementing an algorithm to process data, it is almost always simplest to read all the data into memory, do the processing, and then write the data out. However, this can be problematic when dealing with large files or with multiple files at the same time (concurrently).

An optimized way to deal with this is to use streaming algorithms: the data flows into our program, is processed and then flows out of our program, continuously, piece by piece. Our program processes data in small chunks so that the full set of data is never held in memory at once.
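Here is a minimal sketch of such a streaming algorithm: it computes the total size of the data while only ever looking at one chunk at a time. The function name countBytes and the sample chunks are assumptions of mine, not something from a library.

```javascript
const { Readable } = require('node:stream');

// Hypothetical helper: adds up the size of the data, one chunk at a time.
async function countBytes(readable) {
  let total = 0;
  // A Readable Stream can be consumed with "for await": each iteration
  // receives the next chunk as soon as it is available.
  for await (const chunk of readable) {
    total += chunk.length; // only the current chunk is needed here
  }
  return total;
}

// Sample source made of two Buffer chunks.
const source = Readable.from([Buffer.from('Hello, '), Buffer.from('Streams!')]);

countBytes(source).then((total) => {
  console.log(`${total} bytes`);
});
```

The same function should work unchanged with a file source such as fs.createReadStream('./big-file.log') (a hypothetical path), because the algorithm never asks for more than the current chunk.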

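Streams basically provide two major advantages over other data handling methods: memory efficiency (you do not need to load the full set of data into memory before processing it) and time efficiency (you can start processing data as soon as the first chunk arrives, instead of waiting for the whole set of data to be available).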

You should use streams in your code WHENEVER you can to use the full power of Node.js.

3 | Piping

Streams are not a concept unique to Node.js. They were introduced in the Unix operating system decades ago, where programs can interact with each other by passing streams through the pipe operator (|). One fundamental concept to remember is that Streams that represent sources of data should be piped to Streams representing destinations of data. More on this in the next chapters.
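In Node.js, the counterpart of the Unix pipe operator is the pipe() method of a Readable Stream. The sketch below pipes a source into a destination and resolves once the destination has finished. The helper name pipeChunks and the sample data are assumptions of mine.

```javascript
const { Readable, Writable } = require('node:stream');

// Hypothetical helper: pipes a source into a destination and resolves
// with everything the destination received.
function pipeChunks(chunks) {
  return new Promise((resolve, reject) => {
    const received = [];

    const source = Readable.from(chunks);
    const destination = new Writable({
      write(chunk, encoding, callback) {
        received.push(chunk.toString());
        callback();
      },
    });

    // pipe() does not forward errors from one Stream to the other,
    // so each Stream gets its own 'error' handler.
    source.on('error', reject);
    destination.on('error', reject);
    destination.on('finish', () => resolve(received.join('')));

    // Source of data -> destination of data.
    source.pipe(destination);
  });
}

pipeChunks(['Hello ', 'from ', 'a pipe']).then((text) => {
  console.log(text);
});
```

pipe() takes care of moving the chunks and of ending the destination when the source has no more data. For real programs, stream.pipeline() is usually a safer choice because it also handles errors and cleanup.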

4 | Types of Streams

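There are four fundamental types of Streams within Node.js: Readable Streams (sources of data), Writable Streams (destinations of data), Duplex Streams (both Readable and Writable) and Transform Streams (Duplex Streams that modify the data as it passes through them).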

We will explore Readable Streams, Writable Streams and Transform Streams in other articles. Duplex Streams are often built-in objects, so you will not have to create Duplex Streams from scratch very often.
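To give a first feel for how the types fit together, here is a small sketch that chains a Readable Stream, a Transform Stream and a Writable Stream. The names createUpperCase and shout are mine and purely illustrative. The sketch uses pipeline() from node:stream/promises, which requires a reasonably recent Node.js version.

```javascript
const { Readable, Writable, Duplex, Transform } = require('node:stream');
const { pipeline } = require('node:stream/promises');

// Transform Stream: a Duplex Stream whose output is computed from its input.
// Hypothetical example: upper-cases every chunk that passes through.
function createUpperCase() {
  return new Transform({
    transform(chunk, encoding, callback) {
      callback(null, chunk.toString().toUpperCase());
    },
  });
}

// Hypothetical helper: Readable -> Transform -> Writable.
async function shout(chunks) {
  const received = [];

  const source = Readable.from(chunks); // Readable: source of data
  const sink = new Writable({           // Writable: destination of data
    write(chunk, encoding, callback) {
      received.push(chunk.toString());
      callback();
    },
  });

  // pipeline() connects the Streams and resolves when the last one finishes.
  await pipeline(source, createUpperCase(), sink);
  return received.join('');
}

// A Transform Stream is a Duplex Stream, so it can be read from as well.
console.log(createUpperCase() instanceof Duplex);
console.log(createUpperCase() instanceof Readable);

shout(['winter ', 'is ', 'coming']).then((text) => console.log(text));
```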

5 | Warning

Streams are a difficult topic, and they are quite difficult to introduce in a linear way. In the next article, I present the theoretical side of things before providing code examples. Reading these chapters is not easy, as I will talk about specific events and methods without being able to provide examples right away. You will probably need to read them several times, and that is perfectly normal. I myself spent dozens of hours trying to figure all of this out. I hope that my articles will make you proficient faster.

The difficulty of Streams is that there are often several ways to do things. I have made a significant effort to list all the possible options each time, even if I only detail the ones I recommend. Whenever you feel lost, just come back to these lists of options or jump to the code examples to get a sense of how things work.

Winter is coming, but I have warm coats for you.

Author: Dimitri Alamkan
Initial publication date:
Last updated: