You need alerts for your alerts: the case of the leaking file descriptors

dxchester@programming.dev · 2 years ago

You need alerts for your alerts: the case of the leaking file descriptors

Max-P@lemmy.max-p.me · 2 years ago

But why should we have to close the socket manually, and why only when a certain number of bytes has been received? Based on the documentation here it certainly seems like it should automatically destroy its file descriptor when the server is done sending data.

My suspicion is, the client here never reads all the data that’s being sent by the server, so the server is there waiting to finish sending the page and on the client nobody’s reading it, so the connection never closes because the request isn’t completely finished. The TLS callbacks are probably still working, just not making any progress on the actual data stream. So it sits there for a really long time, until something times out, if anything times out there.

That’d make sense if the check stops reading after the HTTP headers. It reads the chunks it needs and then just stops processing because it doesn’t need the rest of the data, and the TLS wrapper is probably preventing node from collecting the socket itself. The TLS stream happily processes TLS layer pings and keeps trying to push data out but nobody’s reading the Stream anymore. Node can’t release the underlying TCP connection because the TLS wrapper is still handling it, and it can’t collect the wrapper because the TCP connection is still piped into it.

It probably works with small payloads because it’s small enough to be read in one chunk, so even if the TLS wrapper isn’t fully read, the underlying TCP connection is, so the server gets to close the connection, the fd is closed, and node is able to collect the TCP connection object and then the wrapper is fully orphaned and can also be collected.

Perhyte@lemmy.world · 2 years ago

This was my immediate suspicion as well, as soon as I read that it would leak for a GET but not a HEAD.

ThermoToaster@exng.meme · 2 years ago

Magic word being watchdogalert

Corngood@lemmy.ml · 2 years ago

I feel like node’s async model makes it really easy to cause a bug like this, and really difficult to track it down.

It was left to the OS to catch the leak, because the program was written in such a way that it was able to run a gazillion of these tasks concurrently.

CodeMonkey@programming.dev · 2 years ago

I always feel a little paranoid when I explicitly close transactions, connections, and files (for quick running scripts, the OS will close the file when my process exits and for long running applications, the garbage collector will close it when the object leaves the scope). Then I read a blog post like this an remember that it is always better to explicitly free resources when I am done with them.