# File Streaming in Node.js: Large Files, Memory, and Performance
Calling `fs.readFile` on a multi-gigabyte file asks the process to hold the entire file in memory at once. Streams move data in chunks: memory stays bounded, time to first byte drops, and producers slow down when consumers are full — that mechanism is backpressure.
This post covers the essentials of file streams in Node.js and common production patterns.
## What Is a Stream?
In Node.js, a stream is a channel for data flowing in bounded chunks. The main types:
| Type | Role |
|---|---|
| Readable | Source (e.g. file, HTTP response) |
| Writable | Sink (e.g. disk, socket) |
| Duplex | Both directions (e.g. TCP) |
| Transform | Read, transform, write (e.g. gzip) |
For files, the workhorses are `fs.createReadStream` and `fs.createWriteStream`.
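As a quick illustration of the Transform type, here is a minimal sketch — the `upper` stream and its sample input are ours, not part of any library API:

```javascript
import { Transform } from 'node:stream';

// A Transform reads on one side, rewrites the data, and emits it on the other.
const upper = new Transform({
  transform(chunk, encoding, callback) {
    // First argument is an error (or null), second is the transformed chunk.
    callback(null, chunk.toString().toUpperCase());
  },
});

upper.on('data', (chunk) => console.log(chunk.toString())); // HELLO
upper.write('hello');
upper.end();
```

The same shape — implement `transform(chunk, encoding, callback)` — is how gzip, encryption, and parsing stages plug into a pipeline.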
## Prefer `createReadStream` over `readFile`
```javascript
import fs from 'node:fs';

// Bad: entire file in RAM
// const buf = fs.readFileSync('huge.bin');

const rs = fs.createReadStream('huge.bin', { highWaterMark: 64 * 1024 });
rs.on('data', (chunk) => {
  // chunk: Buffer, or string if an encoding was set
});
rs.on('end', () => console.log('done'));
rs.on('error', (err) => console.error(err));
```
`highWaterMark` sets the size of the internal buffer (64 KiB by default for fs read streams); the defaults are fine for most cases.
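Readable streams are also async iterable, which often reads better than the event API and still honors backpressure. A sketch, writing a small stand-in file first so it is self-contained:

```javascript
import fs from 'node:fs';

// Stand-in for huge.bin so the example runs anywhere.
fs.writeFileSync('sample.bin', Buffer.alloc(256 * 1024));

let bytes = 0;
// for await...of pulls one chunk at a time; the stream pauses between pulls.
for await (const chunk of fs.createReadStream('sample.bin', { highWaterMark: 64 * 1024 })) {
  bytes += chunk.length; // chunk is a Buffer unless an encoding was set
}
console.log(`read ${bytes} bytes`); // read 262144 bytes
```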
## `pipe` and `pipeline`

The legacy pattern `readable.pipe(writable)` still works; the modern recommendation is `pipeline` from `node:stream/promises` (or the callback form `stream.pipeline`) for error propagation and clean shutdown:
```javascript
import fs from 'node:fs';
import { pipeline } from 'node:stream/promises';
import zlib from 'node:zlib';

await pipeline(
  fs.createReadStream('input.log'),
  zlib.createGzip(),
  fs.createWriteStream('input.log.gz')
);
```
`pipeline` forwards errors from every stage and tears the streams down safely.
## Backpressure

If the Readable produces data faster than the Writable can consume it, the Writable's internal buffer fills. Node.js then pauses the Readable until the buffer drains — that is what prevents memory blowups. `pipe` and `pipeline` handle this for you; if you call `write()` manually, respect its return value and wait for the `drain` event.
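When writing by hand, the contract looks like this — a sketch, with `out.bin` as an illustrative path:

```javascript
import fs from 'node:fs';

const ws = fs.createWriteStream('out.bin');
let i = 0;

// write() returns false once the internal buffer crosses highWaterMark;
// stop writing and resume only after 'drain' fires.
function writeChunks() {
  let ok = true;
  while (i < 100 && ok) {
    ok = ws.write(Buffer.alloc(1024));
    i++;
  }
  if (i < 100) ws.once('drain', writeChunks); // buffer full: wait
  else ws.end();
}

writeChunks();
ws.on('finish', () => console.log('wrote 100 KiB without unbounded buffering'));
```

Ignoring the `false` return value still works, but every extra chunk is buffered in memory — which defeats the point of streaming.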
## With HTTP
When sending a large file to a client:
```javascript
import fs from 'node:fs';
import http from 'node:http';

http.createServer((req, res) => {
  const stream = fs.createReadStream('report.pdf');
  res.setHeader('Content-Type', 'application/pdf');
  stream.pipe(res);
  stream.on('error', () => res.destroy());
}).listen(3000);
```
In production, prefer `pipeline(fs.createReadStream(...), res)` so errors and cleanup live in one place.
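A sketch of that shape — the port and file name are illustrative:

```javascript
import fs from 'node:fs';
import http from 'node:http';
import { pipeline } from 'node:stream/promises';

const server = http.createServer(async (req, res) => {
  res.setHeader('Content-Type', 'application/pdf');
  try {
    // One await covers read errors, client disconnects, and cleanup.
    await pipeline(fs.createReadStream('report.pdf'), res);
  } catch (err) {
    // pipeline has already destroyed both streams; just record the failure.
    console.error('send failed:', err.message);
  }
});

server.listen(3000);
```

If the client disconnects mid-transfer, `pipeline` rejects and destroys the read stream, so no file handle is leaked.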
## Compression Pipelines
For log archival or static assets:
```javascript
await pipeline(
  fs.createReadStream('data.json'),
  zlib.createGzip(),
  fs.createWriteStream('data.json.gz')
);
```
For line-by-line parsing, combine `readline.createInterface` with streams or add a `stream.Transform` stage.
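For instance, a self-contained sketch that counts lines — the sample file and its contents are ours:

```javascript
import fs from 'node:fs';
import readline from 'node:readline';

// Write a small sample so the snippet runs anywhere.
fs.writeFileSync('app.log', 'one\ntwo\nthree\n');

const rl = readline.createInterface({
  input: fs.createReadStream('app.log'),
  crlfDelay: Infinity, // treat \r\n as a single line break
});

let lines = 0;
for await (const line of rl) {
  lines++; // each iteration yields one line, terminator stripped
}
console.log(`${lines} lines`); // 3 lines
```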
## Error Handling

- Use `pipeline` or `stream.finished`; relying on `pipe` alone without `error` handlers can leave orphaned streams.
- On HTTP errors, close the response with `res.destroy(err)` when appropriate.
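`finished` from `node:stream/promises` resolves when a single stream ends and rejects on error; a minimal sketch with an illustrative input file:

```javascript
import fs from 'node:fs';
import { finished } from 'node:stream/promises';

fs.writeFileSync('in.txt', 'hello');
const rs = fs.createReadStream('in.txt');
rs.resume(); // consume and discard so the stream can reach 'end'

try {
  await finished(rs); // resolves on end/close, rejects on 'error'
  console.log('stream fully consumed');
} catch (err) {
  console.error('stream failed:', err);
}
```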
## When to Stream
- File size is unknown or larger than comfortable RAM
- You need early consumption (process bytes as they arrive)
- You chain transformations (encrypt, compress, parse)
## When to Buffer
- Small config files
- Tests where you need the whole payload atomically
## Summary
Streams are central to Node.js I/O efficiency. For file + HTTP + compression pipelines, use `pipeline` to centralize error handling, and prefer streaming over `readFile` for large files.
Happy coding.