Node.js Socket Backpressure in Paused Mode

As with a lot of these articles, I was curious about something and could not find the answer online. In this case, the question was how a Node.js Socket in paused mode handles backpressure from a client that attempts to write a large amount of data.

I could find nothing in the documentation about what happens on slow reads. But after reading a statement in the Stream documentation about backpressure issues with writes, I became curious:

While calling write() on a stream that is not draining is allowed, Node.js will buffer all written chunks until maximum memory usage occurs, at which point it will abort unconditionally. Even before it aborts, high memory usage will cause poor garbage collector performance and high RSS. Since TCP sockets may never drain if the remote peer does not read the data, writing a socket that is not draining may lead to a remotely exploitable vulnerability.

The main issue is that if a recipient of TCP data is slow, a mismanaged sender can be forced into an out of memory situation. Not good.
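To make the sender side of that warning concrete, here is a minimal sketch of a writer that respects backpressure. It is an illustration only, not code from the Node.js documentation: `writeAll` and `makeSlowSink` are hypothetical names, and the slow `Writable` merely stands in for a socket whose peer reads slowly. The idea is to stop writing when `write()` returns false and continue once the `'drain'` event fires.

```javascript
const { Writable } = require('stream');
const { once } = require('events');

// Hypothetical helper: write chunks one by one, pausing whenever the
// writable reports that its internal buffer is full.
async function writeAll(writable, chunks) {
  let waits = 0;
  for (const chunk of chunks) {
    // write() returns false once buffered data reaches highWaterMark
    if (!writable.write(chunk)) {
      waits += 1;
      await once(writable, 'drain'); // continue only after the buffer empties
    }
  }
  return waits; // how many times we had to wait for 'drain'
}

// Hypothetical slow consumer standing in for a socket with a slow peer.
function makeSlowSink() {
  return new Writable({
    highWaterMark: 4, // tiny buffer so backpressure triggers immediately
    write(chunk, encoding, callback) {
      setImmediate(callback); // pretend each chunk takes a while to flush
    },
  });
}

const chunks = [Buffer.alloc(8), Buffer.alloc(8), Buffer.alloc(8)];
writeAll(makeSlowSink(), chunks).then(waits => {
  console.log('waited for drain', waits, 'times');
});
```

A sender written this way never holds more than roughly one chunk beyond the high water mark in memory, which is what keeps it out of the out-of-memory scenario described above.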

So what happens if you are the recipient and your reads are slower than the incoming data being sent? One would assume that the slow reads would create downstream backpressure on the sender (as outlined above in the warning).

But I was curious about what happens when the socket is read in paused mode instead of flowing mode. Will this create issues?

Paused Mode

First, let's take a look at Sockets in paused mode.

Readable streams operate in one of two modes: flowing and paused.

  • In flowing mode, data is read from the underlying system automatically and provided to an application as quickly as possible using events.

  • In paused mode, the stream.read() method must be called explicitly to read chunks of data from the stream.

By default, streams are in paused mode. Switching to flowing mode happens in 3 ways:

  1. Attaching a data event handler
  2. Calling stream.pipe
  3. Calling the stream.resume method
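The three switches listed above can be observed through the `readableFlowing` property. The following is a small sketch using plain `Readable` streams rather than sockets; `makeSource` is a hypothetical helper, and the no-op `Writable` exists only to give `pipe()` a destination. Note that `readableFlowing` reports `null` (not `false`) before any consumer is attached.

```javascript
const { Readable, Writable } = require('stream');

// Hypothetical helper: a readable that never produces data on its own.
function makeSource() {
  return new Readable({ read() {} });
}

const untouched = makeSource();
console.log(untouched.readableFlowing); // null: no consumer attached yet

// 1. Attaching a 'data' handler
const withDataHandler = makeSource();
withDataHandler.on('data', () => {});
console.log(withDataHandler.readableFlowing); // true

// 2. Calling pipe()
const piped = makeSource();
piped.pipe(new Writable({ write(chunk, encoding, callback) { callback(); } }));
console.log(piped.readableFlowing); // true

// 3. Calling resume()
const resumed = makeSource();
resumed.resume();
console.log(resumed.readableFlowing); // true

// pause() switches a flowing stream back
resumed.pause();
console.log(resumed.readableFlowing); // false
```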

Most consumers of streams are familiar with the first two methods, which means that most people are used to working with streams in flowing mode.

Paused mode is a bit different. It gives us direct control over how data is read from the stream, because we must call stream.read() ourselves to pull data out of the internal buffer. The readable event tells us when new data is available to read.

A pithy example:

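const net = require('net');

const server = new net.Server(socket => {
  socket.on('readable', () => {
    // read() returns null once the internal buffer is empty
    let newData;
    while ((newData = socket.read()) !== null) {
      // do something with newData
    }
  });
});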

Lastly, stream.read() accepts an optional size argument. If the internal Buffer holds fewer bytes than the requested size, the call returns null (unless the stream has ended, in which case the remaining data is returned).
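A short sketch of that behavior, using a hand-fed `Readable` instead of a socket so the buffer contents are fully controlled (the variable names are just for illustration):

```javascript
const { Readable } = require('stream');

// A readable we feed by hand; read() {} means it never fetches data itself.
const source = new Readable({ read() {} });

source.push(Buffer.from('abc'));     // 3 bytes buffered
const tooEarly = source.read(5);     // fewer than 5 bytes are available
console.log(tooEarly);               // null

source.push(Buffer.from('defgh'));   // 8 bytes buffered in total
const chunk = source.read(5);        // now the request can be satisfied
console.log(chunk.toString());       // abcde
console.log(source.readableLength);  // 3 bytes still waiting in the buffer
```

The failed read does not consume anything: the three bytes pushed first are still there and come back as part of the later, successful read.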

This means, instead of the fire hose that is the flowing mode, we can have controlled reading of the incoming Stream data.

In the context of a Socket, paused mode is interesting when we are attempting to implement a wire protocol and have specific packet formats we are looking to retrieve. For instance, it is common to receive data in two parts:

  1. an integer telling us how large the payload is
  2. a correspondingly sized payload

With socket.read(size) we can read the first 2 bytes, which tell us exactly how many bytes to read off the Buffer for the payload. Calls to socket.read(size) for the payload will return null until the full payload has been buffered.
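The two-part format described above can be sketched roughly as follows. This is an assumption-laden illustration, not a real protocol: the 2-byte header is assumed to be an unsigned big-endian integer, and `encodeFrame`, `drainFrames` and the `state` object are hypothetical names. A hand-fed `Readable` stands in for the socket.

```javascript
const { Readable } = require('stream');

// Hypothetical helper: encode a payload as [2-byte big-endian length][payload].
function encodeFrame(payload) {
  const header = Buffer.alloc(2);
  header.writeUInt16BE(payload.length, 0);
  return Buffer.concat([header, payload]);
}

// Hypothetical helper: pull out every complete frame that is currently buffered.
// state.pending remembers a length header whose payload has not fully arrived.
function drainFrames(stream, state) {
  const frames = [];
  for (;;) {
    if (state.pending === null) {
      const header = stream.read(2);
      if (header === null || header.length < 2) break; // header not buffered yet
      state.pending = header.readUInt16BE(0);
    }
    if (state.pending === 0) {
      frames.push(Buffer.alloc(0)); // empty payload: nothing to read
      state.pending = null;
      continue;
    }
    const payload = stream.read(state.pending);
    // null: payload incomplete; short read: stream ended mid-frame
    if (payload === null || payload.length < state.pending) break;
    frames.push(payload);
    state.pending = null;
  }
  return frames;
}

const stream = new Readable({ read() {} });
const state = { pending: null };

// One complete frame, plus the header and first 3 bytes of a second frame.
stream.push(encodeFrame(Buffer.from('hello')));
stream.push(encodeFrame(Buffer.from('world!')).subarray(0, 5));
const first = drainFrames(stream, state);
console.log(first.map(String)); // only the complete frame comes out

// The rest of the second payload arrives later.
stream.push(Buffer.from('ld!'));
const second = drainFrames(stream, state);
console.log(second.map(String));
```

On a real socket, the same helper would be called from the readable handler, for example `socket.on('readable', () => drainFrames(socket, state).forEach(handleFrame))`, where `handleFrame` is whatever the application does with a payload.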

Knowing how paused mode works, what happens if we have a slower read process than the write process? Does our socket buffer data and eventually become memory constrained? Or does it buffer to a point and then exert backpressure to the sender?

Slow Read Experiment

I set up an experiment where a client attempts to write a large amount of data to a server. In this case, it's a 200MB JSON file. It sends the data in one big chunk.

const net = require('net');  
const fs = require('fs');

function run() {  
  let socket = net.connect({ host: 'localhost', port: 9000 });
  let data = fs.readFileSync('./big.json'); // 200mb json file
  socket.end(data);
}
run();  

The client code creates a socket to our server and then writes the contents of a large file to it. I wanted to be sure nothing special was going on, so instead of creating a read stream for the file and piping it to the socket, I'm writing it as one large chunk directly from memory.

Server

The server is where the real experiment happens. The goal is to slowly read data from the paused stream and watch memory to see what happens.

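const net = require('net');

// resolve after `ms` milliseconds; used to simulate slow processing
const wait = ms => new Promise(resolve => setTimeout(resolve, ms));
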
let server = new net.Server(socket => {  
  // listen for new data in paused mode
  socket.on('readable', async () => {
    try {
      let processed = 0;
      let data;

      // read 100 bytes of data at a time
      // stop reading when we have nothing left!
      while ((data = socket.read(100))) {
        // do something with data...
        // or just pause for 1ms to keep things slow
        await wait(1);

        // periodically write how much data we've processed
        processed += 100;
        if (processed % 100000 === 0) console.log(processed / 1000 + 'kB'); // log every 100kB
      }
    } catch (ex) {
      console.error(ex);
    }
  });
});
server.listen(9000);  

The server listens for connections and attaches a readable event handler to each socket to listen for incoming data.

It will then slowly iterate over the data by reading 100 byte chunks. We make the read loop slow by adding a 1ms pause for each read of 100 bytes. This ensures that the server is reading slower than the client is sending data.

Lastly, to measure memory, we add a timer that displays the RSS memory usage every second.

setInterval(() => console.log('rss', process.memoryUsage().rss / 1024 / 1024), 1000);  
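
RSS alone does not say where memory is being held. As a possible refinement (not part of the original experiment), the sketch below also reports `heapUsed` and `external` from `process.memoryUsage()`; Node.js Buffers live outside the V8 heap and are counted under `external`. `memorySnapshotMiB` is a hypothetical helper name.

```javascript
// Hypothetical helper: report several process.memoryUsage() fields in MiB.
function memorySnapshotMiB() {
  const toMiB = bytes => Math.round((bytes / 1024 / 1024) * 100) / 100;
  const { rss, heapUsed, external } = process.memoryUsage();
  return {
    rss: toMiB(rss),           // total resident memory of the process
    heapUsed: toMiB(heapUsed), // JavaScript objects on the V8 heap
    external: toMiB(external), // memory outside the heap, including Buffers
  };
}

// unref() lets the process exit even though the timer is still scheduled.
const timer = setInterval(() => console.log(memorySnapshotMiB()), 1000);
timer.unref();

console.log(memorySnapshotMiB());
```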

Results

The results are not surprising: we only buffer a small amount of data to memory.

rss 19.89453125  
rss 20.67578125  
100kB  
rss 24.453125  
rss 25.49609375  
200kB  
rss 25.94140625  
300kB  
rss 26.015625  
rss 26.05078125  
400kB  
rss 26.7578125  
500kB  
rss 26.7578125  
600kB  
rss 26.7578125  
rss 26.76171875  
700kB  
rss 26.76171875  
800kB  
rss 26.76171875  
900kB  
rss 26.765625  
rss 26.765625  
1000kB  
rss 26.765625  
1100kB  
rss 26.765625  
1200kB  
rss 26.765625  
rss 26.765625  

As you can see, after almost 20 seconds the RSS has not increased drastically, even though we've only read 1200kB of data. The client is certainly capable of sending data faster than that.

Interestingly, if the client is prematurely terminated, the server will continue to read another ~1MB of data and then stop. This means that the read socket is buffering ~1MB of data and exerts backpressure on the sender.
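
One way to see how much of that data sits inside Node.js itself is to compare `readableLength` with `readableHighWaterMark`. The sketch below uses a hand-fed `Readable` as a stand-in for a socket nobody reads from; `bufferStats` is a hypothetical helper. A plausible reading, which the experiment above did not measure, is that the stream buffer only holds data up to around its high water mark and the rest of the ~1MB waits in the operating system's socket buffers.

```javascript
const { Readable } = require('stream');

// Hypothetical helper: how full is a readable's internal buffer?
function bufferStats(stream) {
  return {
    buffered: stream.readableLength,             // bytes waiting for read()
    highWaterMark: stream.readableHighWaterMark, // level at which push() starts returning false
  };
}

// Stand-in for a socket that nobody is reading from.
const stalled = new Readable({ highWaterMark: 1024, read() {} });

// push() returns false once the buffer is at or above the high water mark,
// which is the signal for the data source to stop producing.
const acceptedMore = stalled.push(Buffer.alloc(4096));
console.log(bufferStats(stalled), acceptedMore);
```

In the server from this experiment, `bufferStats(socket)` could be logged from the same setInterval that prints the RSS.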

This backpressure is driven by the TCP flow control mechanism, which keeps a fast sender from overwhelming a slower receiver. That part is outside the scope of this article, but feel free to read more about TCP flow control if you want a better understanding of how that works.

Conclusion

This little experiment demonstrates how slow reads by a socket can create backpressure in the network, but without causing issues on the slow recipient itself.
