Scanning files before uploading to AWS S3 in Node.js
Many Node.js applications accept file uploads from users and store them in AWS S3. A common question is: where do you scan for malware — before or after uploading to S3?
The answer is before. Scan on your server while the file is
still in a temporary location on disk. Only upload to S3 if the scan returns
Verdict.Clean. This guide shows you exactly how to set that up.
Why scan locally, not in S3?
You might wonder about scanning files after they land in S3, using a Lambda trigger or a third-party service. That approach has real problems:
- The file is accessible before it's scanned. If another Lambda, worker, or user downloads the file between upload and scan, they get an unscanned file. You need a quarantine bucket and complex access controls to avoid this window.
- Malicious files still touch S3. Even briefly storing malware in S3 may violate your organization's security policy or compliance requirements (PCI DSS, HIPAA, SOC 2).
- More moving parts. Lambda triggers, EventBridge rules, IAM roles — this is significant infrastructure to maintain. Scanning on the server before upload is simpler.
Scanning on the server with pompelmi means malware never reaches S3 at all.
Install
npm install pompelmi @aws-sdk/client-s3 multer express
The scan-then-upload flow
- Multer receives the multipart upload and writes it to a local temp file.
- scan(tmpPath) calls ClamAV on the local file.
- If Verdict.Malicious or Verdict.ScanError: delete the temp file and reject the request. S3 is never touched.
- If Verdict.Clean: read the temp file and upload it to S3 using the AWS SDK.
- Delete the temp file after the S3 upload completes.
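The verdict handling in the steps above boils down to a small decision table, which can be isolated as a pure helper. This is an illustrative sketch, not part of pompelmi: plain strings stand in for the Verdict constants, and the key point is that anything other than a clean verdict fails closed.

```javascript
// Map a scan verdict to an HTTP decision: only a clean verdict
// allows the S3 upload to proceed. (Illustrative helper; the string
// values stand in for pompelmi's Verdict constants.)
function decideUpload(verdict) {
  if (verdict === 'clean') return { allow: true };
  if (verdict === 'malicious') {
    return { allow: false, status: 400, error: 'Malware detected. Upload rejected.' };
  }
  // Scan errors fail closed: an unscanned file is never uploaded.
  return { allow: false, status: 422, error: 'Scan incomplete. Upload rejected as precaution.' };
}
```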
Complete Express example
const express = require('express');
const multer = require('multer');
const { scan, Verdict } = require('pompelmi');
const { S3Client, PutObjectCommand } = require('@aws-sdk/client-s3');
const crypto = require('crypto');
const path = require('path');
const fs = require('fs');
const os = require('os');
const app = express();
const upload = multer({
dest: os.tmpdir(),
limits: { fileSize: 50 * 1024 * 1024 } // 50 MB limit
});
const s3 = new S3Client({ region: process.env.AWS_REGION || 'us-east-1' });
const BUCKET = process.env.S3_BUCKET; // Set via environment variable
app.post('/upload', upload.single('file'), async (req, res) => {
if (!req.file) {
return res.status(400).json({ error: 'No file provided.' });
}
const tmpPath = req.file.path;
let tmpDeleted = false;
try {
// Step 1 — Scan locally before touching S3
const verdict = await scan(tmpPath);
if (verdict === Verdict.Malicious) {
return res.status(400).json({ error: 'Malware detected. Upload rejected.' });
}
if (verdict === Verdict.ScanError) {
return res.status(422).json({ error: 'Scan incomplete. Upload rejected as precaution.' });
}
// Step 2 — File is clean, upload to S3
const ext = path.extname(req.file.originalname).toLowerCase();
const s3Key = 'uploads/' + crypto.randomBytes(16).toString('hex') + ext;
const fileStream = fs.createReadStream(tmpPath);
await s3.send(new PutObjectCommand({
Bucket: BUCKET,
Key: s3Key,
Body: fileStream,
ContentType: req.file.mimetype,
// Tag the object to confirm it has been scanned
Tagging: 'scan-status=clean&scanned-by=pompelmi'
}));
// Step 3 — Clean up temp file
fs.unlinkSync(tmpPath);
tmpDeleted = true;
res.json({
status: 'ok',
key: s3Key,
url: `https://${BUCKET}.s3.amazonaws.com/${s3Key}` // note: objects are private by default; use a pre-signed URL to grant access
});
} catch (err) {
res.status(500).json({ error: err.message });
} finally {
if (!tmpDeleted && fs.existsSync(tmpPath)) {
fs.unlinkSync(tmpPath);
}
}
});
app.listen(3000, () => console.log('Listening on :3000'));
Provide AWS credentials via the standard environment variables (AWS_ACCESS_KEY_ID,
AWS_SECRET_ACCESS_KEY) or, better, attach an IAM role to your EC2
instance or ECS task so credentials are provided automatically.
S3 object tagging for audit trails
Adding S3 object tags when uploading gives you an audit trail of which files were scanned, when, and by what tool. You can then write S3 bucket policies that deny access to untagged objects — preventing any file that bypassed the scan from being accessed.
// Tagging added to PutObjectCommand
Tagging: 'scan-status=clean&scanned-by=pompelmi&scan-engine=clamav'
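S3 requires the Tagging string to be URL-encoded, which is easy to get wrong with manual string concatenation once tag values contain spaces or special characters. A small helper (an illustrative addition, not part of pompelmi or the AWS SDK) builds the string safely:

```javascript
// Build a safely URL-encoded S3 Tagging string from a plain object.
// URLSearchParams handles the key=value&key=value encoding for us.
function buildTagging(tags) {
  return new URLSearchParams(tags).toString();
}

// Usage with PutObjectCommand:
//   Tagging: buildTagging({ 'scan-status': 'clean', 'scanned-by': 'pompelmi' })
```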
An S3 bucket policy to deny access to unscanned files:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Deny",
"Principal": "*",
"Action": "s3:GetObject",
"Resource": "arn:aws:s3:::your-bucket/uploads/*",
"Condition": {
"StringNotEquals": {
"s3:ExistingObjectTag/scan-status": "clean"
}
}
}
]
}
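If you manage bucket policies from code rather than the console, the statement above can be generated per bucket and applied with PutBucketPolicyCommand from @aws-sdk/client-s3. A minimal sketch, assuming the same uploads/ prefix and scan-status tag as above (buildScanStatusPolicy is a hypothetical helper name):

```javascript
// Build the deny-unscanned-objects policy for a given bucket.
// Apply it with:
//   s3.send(new PutBucketPolicyCommand({ Bucket: bucket, Policy: JSON.stringify(policy) }))
function buildScanStatusPolicy(bucket) {
  return {
    Version: '2012-10-17',
    Statement: [{
      Effect: 'Deny',
      Principal: '*',
      Action: 's3:GetObject',
      Resource: `arn:aws:s3:::${bucket}/uploads/*`,
      Condition: {
        StringNotEquals: { 's3:ExistingObjectTag/scan-status': 'clean' }
      }
    }]
  };
}
```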
Optional: quarantine bucket for rejected files
In some regulated environments you need to preserve malicious files for
forensic analysis rather than deleting them outright. Instead of calling
fs.unlinkSync() on rejection, upload the file to a separate,
private quarantine bucket.
async function quarantineFile(tmpPath, originalName, verdict) {
const key = 'quarantine/' + Date.now() + '-' + originalName;
await s3.send(new PutObjectCommand({
Bucket: process.env.QUARANTINE_BUCKET,
Key: key,
Body: fs.createReadStream(tmpPath),
// Mark the object as dangerous (this path only runs for malicious verdicts)
Tagging: 'scan-status=malicious&scanned-by=pompelmi',
// Block all public access at the object level (bucket policy should also block)
ACL: 'private'
}));
fs.unlinkSync(tmpPath); // Remove from local disk after uploading to quarantine
return key;
}
// In the route handler:
if (verdict === Verdict.Malicious) {
const qKey = await quarantineFile(tmpPath, req.file.originalname, verdict);
tmpDeleted = true;
console.warn('Malicious file quarantined at:', qKey);
return res.status(400).json({ error: 'Malware detected.' });
}
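Quarantined files usually shouldn't live forever. One option, assuming your retention policy allows it (this is not something pompelmi requires), is an S3 lifecycle rule that expires quarantine/ objects after a retention window; the rule can be built in code and applied with PutBucketLifecycleConfigurationCommand from @aws-sdk/client-s3:

```javascript
// Lifecycle rule expiring quarantined objects after `days` days.
// Apply it with:
//   s3.send(new PutBucketLifecycleConfigurationCommand({
//     Bucket: process.env.QUARANTINE_BUCKET,
//     LifecycleConfiguration: { Rules: [quarantineExpiryRule(90)] }
//   }))
function quarantineExpiryRule(days) {
  return {
    ID: 'expire-quarantine',
    Status: 'Enabled',
    Filter: { Prefix: 'quarantine/' },
    Expiration: { Days: days }
  };
}
```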
For containerized deployments, see Running pompelmi with ClamAV in Docker Compose for how to run ClamAV as a sidecar next to your Node.js service.