Mert Tosun
File Streaming in Node.js: Large Files, Memory, and Performance


Calling fs.readFile on a multi-gigabyte file invites the process to allocate everything at once. Streams move data in chunks: memory stays bounded, latency to first byte drops, and producers can slow down when consumers are full — that mechanism is backpressure.

This post covers the essentials of file streams in Node.js and common production patterns.

What Is a Stream?

In Node.js, a stream is a channel for data flowing in bounded chunks. The main types:

Type Role
Readable Source (e.g. file, HTTP response)
Writable Sink (e.g. disk, socket)
Duplex Both directions (e.g. TCP)
Transform Read, transform, write (e.g. gzip)

For files, the workhorses are fs.createReadStream and fs.createWriteStream.

Prefer createReadStream over readFile

import fs from 'node:fs';

// Bad: entire file in RAM
// const buf = fs.readFileSync('huge.bin');

const rs = fs.createReadStream('huge.bin', { highWaterMark: 64 * 1024 });
rs.on('data', (chunk) => {
  // chunk: Buffer or string (if encoding is set)
});
rs.on('end', () => console.log('done'));
rs.on('error', (err) => console.error(err));

highWaterMark influences internal buffering; defaults are fine for most cases.

pipe and pipeline

The legacy pattern readable.pipe(writable) still works; the modern recommendation is pipeline from node:stream/promises (or the callback form, stream.pipeline) because it propagates errors across the whole chain and tears streams down cleanly:

import fs from 'node:fs';
import { pipeline } from 'node:stream/promises';
import zlib from 'node:zlib';

await pipeline(
  fs.createReadStream('input.log'),
  zlib.createGzip(),
  fs.createWriteStream('input.log.gz')
);

pipeline forwards errors and tries to tear down streams safely.

Backpressure

Backpressure arises when the Readable produces faster than the Writable can consume: the Writable's internal buffer fills, and Node.js temporarily pauses the Readable so memory stays bounded. pipe and pipeline handle this automatically; if you call write() yourself, respect its boolean return value and wait for the drain event before writing more.
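A minimal sketch of the manual pattern, using an async iterator over the Readable; demo.txt and demo-copy.txt are placeholder filenames for this example:

```javascript
import fs from 'node:fs';
import { once } from 'node:events';

// Copy a file chunk by chunk, honoring backpressure by hand.
async function copy(src, dest) {
  const rs = fs.createReadStream(src);
  const ws = fs.createWriteStream(dest);
  for await (const chunk of rs) {
    if (!ws.write(chunk)) {
      // Writable buffer is full: wait for 'drain' before writing more.
      await once(ws, 'drain');
    }
  }
  ws.end();
  await once(ws, 'finish');
}

fs.writeFileSync('demo.txt', 'hello backpressure');
await copy('demo.txt', 'demo-copy.txt');
console.log(fs.readFileSync('demo-copy.txt', 'utf8'));
```

In practice pipeline does exactly this loop (plus error teardown) for you; the manual form matters only when you interleave writes with other logic.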

With HTTP

When sending a large file to a client:

import fs from 'node:fs';
import http from 'node:http';

http.createServer((req, res) => {
  const stream = fs.createReadStream('report.pdf');
  res.setHeader('Content-Type', 'application/pdf');
  // Attach the error handler before piping so early failures are caught.
  stream.on('error', () => res.destroy());
  stream.pipe(res);
}).listen(3000);

In production, prefer pipeline(fs.createReadStream(...), res) so errors and cleanup live in one place.
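That can look like the following sketch, where report.pdf is again a placeholder file:

```javascript
import fs from 'node:fs';
import http from 'node:http';
import { pipeline } from 'node:stream/promises';

// pipeline owns both error propagation and stream teardown:
// if the file read fails mid-transfer, the response is destroyed for us.
const server = http.createServer(async (req, res) => {
  res.setHeader('Content-Type', 'application/pdf');
  try {
    await pipeline(fs.createReadStream('report.pdf'), res);
  } catch (err) {
    // Both streams are already destroyed by pipeline; just record the failure.
    console.error('stream failed:', err.message);
  }
});
server.listen(3000);
```

A nice side effect: client disconnects surface here as an error on res, so the file read stops instead of streaming into a dead socket.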

Compression Pipelines

For log archival or static assets:

await pipeline(
  fs.createReadStream('data.json'),
  zlib.createGzip(),
  fs.createWriteStream('data.json.gz')
);

For line-by-line parsing, combine readline.createInterface with streams or add a stream.Transform stage.
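The readline approach can be sketched like this; app.log is a placeholder path created inline so the example is self-contained:

```javascript
import fs from 'node:fs';
import readline from 'node:readline';

fs.writeFileSync('app.log', 'a\nb\nc\n');

// readline consumes the Readable incrementally, so memory stays
// bounded even for very large log files.
const rl = readline.createInterface({
  input: fs.createReadStream('app.log'),
  crlfDelay: Infinity, // treat \r\n as a single line break
});

let lines = 0;
for await (const line of rl) {
  lines++; // each `line` arrives without its trailing newline
}
console.log(lines); // 3
```

To parse gzipped logs the same way, insert zlib.createGunzip() between the file stream and readline's input.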

Error Handling

  • Use pipeline or stream.finished; relying only on pipe without error handlers can leave orphaned streams.
  • On HTTP errors, close the response with res.destroy(err) when appropriate.
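When you consume a stream by hand rather than through pipeline, stream.finished gives you one await point that resolves on completion and rejects on error. A minimal sketch, with in.txt as a placeholder file:

```javascript
import fs from 'node:fs';
import { finished } from 'node:stream/promises';

fs.writeFileSync('in.txt', 'payload');

const rs = fs.createReadStream('in.txt');
rs.resume(); // drain the stream; a real consumer would handle chunks here

// Resolves once the stream has fully ended and closed; rejects on error.
await finished(rs);
console.log('stream fully consumed');
```

The callback form, stream.finished(rs, cb), works the same way on older code paths.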

When to Stream

  • File size is unknown or larger than comfortable RAM
  • You need early consumption (process bytes as they arrive)
  • You chain transformations (encrypt, compress, parse)

When to Buffer

  • Small config files
  • Tests where you need the whole payload atomically

Summary

Streams are central to Node.js I/O efficiency. For file + HTTP + compression pipelines, use pipeline to centralize errors; prefer streaming over readFile for large copies.

Happy coding.