Scanning Node.js Readable Streams with pompelmi
Most file scanning happens after a file is already on disk. But many Node.js
pipelines never write to disk at all: S3 getObject returns a
Readable stream, HTTP responses are streams, and custom transform pipelines
can pipe data from one service to another without a single writeFileSync.
Scanning these in-flight bytes with pompelmi v1.3 required either buffering the
whole stream into a Buffer (memory overhead) or writing a temp file
yourself and cleaning it up (error-prone boilerplate). pompelmi v1.4.0 ships
scanStream() to handle both cases cleanly.
The solution: scanStream(stream, [options])
scanStream() accepts any Node.js Readable and returns a Promise that
resolves to one of the same three typed verdict Symbols as scan() and scanBuffer():
const { scanStream, Verdict } = require('pompelmi');
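// Useful for S3 getObject, HTTP downloads, or any piped source.
// AWS SDK v2 style: `s3`, `Bucket`, and `Key` are assumed to be defined elsewhere.
const stream = s3.getObject({ Bucket, Key }).createReadStream();
// Run inside an async function (CommonJS has no top-level await).
const result = await scanStream(stream);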
if (result === Verdict.Malicious) throw new Error('Malware detected.');
if (result === Verdict.ScanError) console.warn('Scan incomplete.');
One validation error surfaces immediately as a rejected Promise:
stream must be a Readable: raised when the argument is not a Node.js Readable instance.
If the stream itself emits an 'error' event during scanning,
that error is propagated as-is.
TCP mode vs local mode
scanStream() behaves differently depending on whether you provide
a host or port option.
TCP mode — no disk I/O
When host or port is set, the stream is piped
directly to a running clamd daemon using the ClamAV
INSTREAM protocol. Each 'data' chunk is sent to
clamd prefixed with a 4-byte big-endian length header, and the stream is
terminated with a zero-length chunk (four zero bytes). No data is written to
disk at any point: the bytes travel from your Readable straight to clamd over TCP.
// clamd sidecar in Docker Compose or Kubernetes
const result = await scanStream(stream, {
host: '127.0.0.1',
port: 3310,
timeout: 30_000,
});
This is ideal for serverless functions, read-only containers, and pipelines that process S3 objects or HTTP downloads without touching the filesystem.
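The INSTREAM framing described above can be sketched in a few lines. This is a minimal illustration of the wire format as documented for clamd, not pompelmi's internal code. The names INSTREAM_COMMAND, END_OF_STREAM, frameChunk, and buildInstreamPayload are hypothetical and exist only for this example.

```javascript
// Sketch of clamd INSTREAM framing. All names here are hypothetical helpers,
// not part of pompelmi's public API.

// Command that opens an INSTREAM session (null-terminated "z" form).
const INSTREAM_COMMAND = Buffer.from('zINSTREAM\0');

// A zero-length chunk (four zero bytes) tells clamd the stream is finished.
const END_OF_STREAM = Buffer.alloc(4);

// Prefix one chunk with its length as a 4-byte big-endian unsigned integer.
function frameChunk(chunk) {
  const header = Buffer.alloc(4);
  header.writeUInt32BE(chunk.length, 0);
  return Buffer.concat([header, chunk]);
}

// Build the complete byte sequence for a list of chunks.
function buildInstreamPayload(chunks) {
  return Buffer.concat([INSTREAM_COMMAND, ...chunks.map(frameChunk), END_OF_STREAM]);
}
```

A real client would likely write each framed chunk to the socket as it arrives instead of concatenating everything first; the single payload here just makes the byte layout easy to inspect.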
Local mode — temp file, auto-cleaned
Without host or port, pompelmi pipes the stream to
a randomly-named temp file under os.tmpdir(), calls
clamscan on it, and deletes the file in a finally
block — cleanup happens whether the scan succeeds, returns an error verdict,
or throws.
// Local clamscan: temp file created and deleted automatically
const result = await scanStream(stream);
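The temp-file pattern behind local mode can be approximated as follows. This is a sketch under stated assumptions, not pompelmi's actual implementation: withTempFile is a hypothetical helper, the file naming scheme is invented, and scanFile stands in for whatever invokes clamscan on the path.

```javascript
const crypto = require('node:crypto');
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');
const { pipeline } = require('node:stream/promises');

// Hypothetical helper: spool a Readable to a randomly named temp file,
// run scanFile(tmpPath) on it, and always remove the file afterwards.
async function withTempFile(stream, scanFile) {
  const name = `scan-${crypto.randomBytes(8).toString('hex')}.tmp`;
  const tmpPath = path.join(os.tmpdir(), name);
  try {
    // pipeline() rejects if either the source or the file stream errors.
    await pipeline(stream, fs.createWriteStream(tmpPath));
    return await scanFile(tmpPath);
  } finally {
    // Runs on success and on failure; force: true ignores a missing file.
    await fs.promises.rm(tmpPath, { force: true });
  }
}
```

Putting the removal in finally is what guarantees cleanup on every exit path, including a source stream that errors halfway through the write.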
Scanning an S3 object stream
In Node.js, the AWS SDK v3 returns the object as a Readable in the Body
field of the GetObjectCommand response. Pass it directly to scanStream()
with host and port set: no buffering, no temp file.
const { S3Client, GetObjectCommand } = require('@aws-sdk/client-s3');
const { scanStream, Verdict } = require('pompelmi');
const s3 = new S3Client({ region: 'us-east-1' });
async function scanS3Object(bucket, key) {
const { Body } = await s3.send(new GetObjectCommand({ Bucket: bucket, Key: key }));
const result = await scanStream(Body, {
host: '127.0.0.1',
port: 3310,
});
if (result === Verdict.Malicious) {
throw new Error(`Malware detected in s3://${bucket}/${key}`);
}
return result; // Verdict.Clean or Verdict.ScanError
}
Scanning an HTTP download stream
Node's https.get passes the response, itself a Readable, to its callback.
Scan it before writing to disk or forwarding downstream.
const https = require('https');
const { scanStream, Verdict } = require('pompelmi');
function scanHttpUrl(url) {
return new Promise((resolve, reject) => {
https.get(url, async (response) => {
try {
const result = await scanStream(response, {
host: '127.0.0.1',
port: 3310,
});
resolve(result);
} catch (err) {
reject(err);
}
}).on('error', reject);
});
}
// Run inside an async function (CommonJS has no top-level await).
const result = await scanHttpUrl('https://example.com/upload.pdf');
if (result === Verdict.Malicious) throw new Error('Malware detected in download.');
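One possible hardening of the download example is to check the HTTP status before scanning, so that an error page is not scanned in place of the file. The sketch below assumes a hypothetical helper, scanDownload, that takes the scanner as a parameter (scanFn, standing in for a call to scanStream) so the example runs without a clamd instance.

```javascript
const http = require('node:http');
const https = require('node:https');

// Hypothetical helper: fetch a URL and hand the response stream to scanFn
// only when the status code is 2xx. scanFn stands in for scanStream.
function scanDownload(url, scanFn) {
  const client = url.startsWith('https:') ? https : http;
  return new Promise((resolve, reject) => {
    client.get(url, (response) => {
      const { statusCode } = response;
      if (statusCode < 200 || statusCode >= 300) {
        response.resume(); // discard the body so the socket is released
        reject(new Error(`Unexpected HTTP status ${statusCode}`));
        return;
      }
      // Promise.resolve().then(...) also captures synchronous throws from scanFn.
      Promise.resolve().then(() => scanFn(response)).then(resolve, reject);
    }).on('error', reject);
  });
}
```

With pompelmi, scanFn could be something like (stream) => scanStream(stream, { host: '127.0.0.1', port: 3310 }). Redirects (3xx) are rejected here, not followed, which is a deliberate simplification.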
Full error handling
const { scanStream, Verdict } = require('pompelmi');
async function safeScanStream(stream) {
try {
const result = await scanStream(stream, {
host: process.env.CLAMD_HOST,
port: Number(process.env.CLAMD_PORT) || 3310,
});
if (result === Verdict.ScanError) {
console.warn('Scan incomplete — rejecting as precaution.');
return null;
}
return result; // Verdict.Clean or Verdict.Malicious
} catch (err) {
// Not a Readable, stream errored, clamd unreachable, etc.
console.error('Scan threw:', err.message);
return null;
}
}