Recursively Scanning a Directory for Malware with pompelmi
Most antivirus integrations in Node.js are request-scoped: a file arrives, you scan it, you accept or reject it. But several real-world scenarios need something different — you want to scan an entire directory tree, not a single file:
- Batch processing: a worker queue processes uploaded archives that were already unpacked to disk. You need to scan all extracted files before moving them to permanent storage.
- Scheduled background scans: you run a nightly cron job that rescans an /uploads directory to catch threats that were missed because signatures were out of date at upload time.
- Quarantine workflows: after scanning, you want to move infected files to a separate directory rather than deleting them, so they can be reviewed later.
pompelmi v1.5.0 ships scanDirectory() to cover all of these patterns
with a single call.
The API
scanDirectory(dirPath, [options]) recursively walks dirPath
using Node's built-in fs.readdirSync(dirPath, { recursive: true }) (Node 18.17+),
scans every file concurrently with Promise.all, and resolves to an object
with three arrays:
- clean: absolute paths of files with no threats found
- malicious: absolute paths of files with a matched signature
- errors: absolute paths of files that could not be scanned
Per-file scan failures are collected into errors rather than aborting
the whole scan. The function only throws for top-level validation errors (wrong argument
type, directory not found).
scanDirectory(
  dirPath: string,
  options?: { host?: string; port?: number; timeout?: number }
): Promise<{ clean: string[], malicious: string[], errors: string[] }>
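The behaviour described above can be approximated in a few lines. What follows is a minimal sketch, not pompelmi's actual source: `scanTree` and the `scanFile` callback are hypothetical names, and `scanFile` is assumed to resolve to `true` for a clean file and `false` for a matched signature. It relies on the `recursive` option of `fs.readdirSync`, so it needs Node 18.17 or later.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical re-creation of the walk-and-collect pattern; not pompelmi's code.
// scanFile(absPath) is assumed to resolve to true (clean) or false (malicious).
async function scanTree(dirPath, scanFile) {
  // Top-level validation: these are the only cases that reject the Promise.
  if (typeof dirPath !== 'string') {
    throw new Error('dirPath must be a string');
  }
  if (!fs.existsSync(dirPath) || !fs.statSync(dirPath).isDirectory()) {
    throw new Error(`Directory not found: ${dirPath}`);
  }

  const root = path.resolve(dirPath);
  // With { recursive: true }, readdirSync returns paths relative to root,
  // including subdirectories, so keep regular files only.
  const files = fs
    .readdirSync(root, { recursive: true })
    .map((rel) => path.join(root, rel))
    .filter((abs) => fs.statSync(abs).isFile());

  const results = { clean: [], malicious: [], errors: [] };

  // Start every scan at once; a per-file failure is recorded, not rethrown.
  await Promise.all(
    files.map(async (abs) => {
      try {
        const isClean = await scanFile(abs);
        (isClean ? results.clean : results.malicious).push(abs);
      } catch {
        results.errors.push(abs);
      }
    })
  );

  return results;
}
```

Because each per-file promise catches its own error, `Promise.all` never rejects here, which is what lets one unreadable file coexist with a complete result for the rest of the tree.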
Basic usage
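const { scanDirectory } = require('pompelmi');
// require() implies CommonJS, where top-level await is not allowed
async function main() {
  const results = await scanDirectory('/uploads');
  console.log('Clean:', results.clean);
  console.log('Malicious:', results.malicious);
  console.log('Errors:', results.errors);
}
main().catch(console.error);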
The same host / port / timeout options
accepted by scan() are forwarded to every per-file scan. To use a
remote clamd sidecar instead of the local clamscan
binary:
const results = await scanDirectory('/uploads', {
  host: '127.0.0.1',
  port: 3310,
  timeout: 30_000,
});
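When the same code runs against a local binary in development and a clamd sidecar in production, the options object can be derived from the environment. A small sketch follows; `buildScanOptions` and the `CLAMD_HOST`, `CLAMD_PORT`, and `CLAMD_TIMEOUT_MS` variable names are hypothetical, and it assumes that omitting `host` leaves pompelmi on the local clamscan path, as the paragraph above implies.

```javascript
// Hypothetical helper; the environment variable names are assumptions.
function buildScanOptions(env = process.env) {
  if (!env.CLAMD_HOST) {
    // No host configured: pass no overrides and keep the default behaviour.
    return {};
  }
  return {
    host: env.CLAMD_HOST,
    port: Number(env.CLAMD_PORT ?? 3310),
    timeout: Number(env.CLAMD_TIMEOUT_MS ?? 30_000),
  };
}
```

A call would then look like `scanDirectory('/uploads', buildScanOptions())`.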
Quarantine workflow
Instead of immediately deleting malicious files, move them to a quarantine directory so they can be reviewed or submitted to a threat-intelligence feed before final disposal.
const fs = require('fs');
const path = require('path');
const { scanDirectory } = require('pompelmi');
const UPLOADS_DIR = '/var/app/uploads';
const QUARANTINE_DIR = '/var/app/quarantine';
fs.mkdirSync(QUARANTINE_DIR, { recursive: true });
async function scanAndQuarantine() {
  const { malicious, errors } = await scanDirectory(UPLOADS_DIR);
  for (const filePath of malicious) {
    const dest = path.join(QUARANTINE_DIR, path.basename(filePath));
    fs.renameSync(filePath, dest);
    console.log(`Quarantined: ${filePath} → ${dest}`);
  }
  if (errors.length > 0) {
    console.warn('Could not scan:', errors);
  }
}
scanAndQuarantine().catch(console.error);
fs.renameSync only works within the same filesystem. If
QUARANTINE_DIR is on a different mount point, use
fs.copyFileSync followed by fs.unlinkSync instead.
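Two details in this workflow are easy to get wrong. Building the destination from `path.basename` alone maps files with the same name in different subdirectories to the same quarantine path, so a later move silently overwrites an earlier one. A cross-device rename fails with the error code `EXDEV`. The sketch below handles both; `moveToQuarantine` is a hypothetical helper, not part of pompelmi, and it assumes every malicious path lies under the scanned directory.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper, not part of pompelmi.
// Moves filePath into quarantineDir, preserving its path relative to uploadsDir
// so that same-named files from different subdirectories do not collide.
function moveToQuarantine(filePath, uploadsDir, quarantineDir) {
  const rel = path.relative(uploadsDir, filePath);
  const escapes = rel === '..' || rel.startsWith('..' + path.sep);
  if (rel === '' || escapes || path.isAbsolute(rel)) {
    throw new Error(`${filePath} is not inside ${uploadsDir}`);
  }

  const dest = path.join(quarantineDir, rel);
  fs.mkdirSync(path.dirname(dest), { recursive: true });

  try {
    fs.renameSync(filePath, dest);
  } catch (err) {
    if (err.code !== 'EXDEV') throw err;
    // Different mount point: copy first, then remove the original.
    fs.copyFileSync(filePath, dest);
    fs.unlinkSync(filePath);
  }
  return dest;
}
```

Inside the loop over `malicious`, a call such as `moveToQuarantine(filePath, UPLOADS_DIR, QUARANTINE_DIR)` would take the place of the `path.join` and `fs.renameSync` pair.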
Scheduled background scans
Combine scanDirectory() with a scheduler like
node-cron to run a nightly rescan of your upload directory.
This catches threats that slipped through when signatures were out of date.
const cron = require('node-cron');
const { scanDirectory } = require('pompelmi');
// Run every night at 02:00
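cron.schedule('0 2 * * *', async () => {
  console.log('Starting nightly scan…');
  try {
    const { clean, malicious, errors } = await scanDirectory('/var/app/uploads');
    console.log(`Scan complete: ${clean.length} clean, ${malicious.length} malicious, ${errors.length} errors`);
    if (malicious.length > 0) {
      // alert your team, move to quarantine, emit a metric, etc.
      console.error('Malicious files detected:', malicious);
    }
  } catch (err) {
    // scanDirectory() rejects if the directory is missing; log the failure
    // rather than leaving an unhandled rejection in the cron callback.
    console.error('Nightly scan failed:', err);
  }
});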
Error handling
scanDirectory() rejects its returned Promise for two top-level
validation errors and collects everything else into errors:
const { scanDirectory } = require('pompelmi');
// Top-level errors — these throw (reject the Promise)
await scanDirectory(42); // Error: dirPath must be a string
await scanDirectory('/nonexistent'); // Error: Directory not found: /nonexistent
// Per-file errors — collected into results.errors, never thrown
const { errors } = await scanDirectory('/uploads');
// errors contains paths of files clamscan could not open, encrypted archives, etc.
A non-empty errors array does not mean the scan failed —
it means some files produced Verdict.ScanError (ClamAV exit code 2)
or threw while being scanned. Treat those files as untrusted and inspect or
quarantine them separately.
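Since files in errors were never cleared, one way to act on this is to hold them back together with the malicious ones when deciding what may leave the upload directory. A minimal sketch, where `partitionResults` is a hypothetical helper that operates on the object shape returned by `scanDirectory()`:

```javascript
// Hypothetical helper, not part of pompelmi.
// Splits a scanDirectory() result into files that may proceed and files that
// must be held back. Unscannable files are held back along with malicious ones.
function partitionResults({ clean = [], malicious = [], errors = [] }) {
  return {
    trusted: [...clean],
    // A Set removes duplicates in case a path appears in both lists.
    untrusted: [...new Set([...malicious, ...errors])],
    needsReview: errors.length > 0,
  };
}
```

Only the `trusted` paths would then be moved to permanent storage, while `needsReview` signals that someone should look at why some files could not be scanned.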