# Scanning files before uploading to Azure Blob Storage
Azure Blob Storage is Microsoft's object storage service and the standard choice for file storage on Azure. This guide shows how to integrate pompelmi into a Node.js upload endpoint that stores files in Azure Blob Storage — scanning each file locally before it is uploaded, so malware never reaches the cloud.
## Install

```bash
npm install pompelmi @azure/storage-blob multer express
```
## Authentication

The @azure/storage-blob SDK supports several authentication strategies. The two most common for server-side Node.js are:

- **Connection string**: simplest for local development. Copy the connection string from the Azure portal under **Storage account → Access keys** and set it as `AZURE_STORAGE_CONNECTION_STRING`.
- **Managed Identity**: recommended for production, since there are no secrets to rotate. See the Managed Identity section below.
```js
const { BlobServiceClient } = require('@azure/storage-blob');

// Connection string (dev / simple deployments)
const blobServiceClient = BlobServiceClient.fromConnectionString(
  process.env.AZURE_STORAGE_CONNECTION_STRING
);
```
## Complete Express example

```js
const express = require('express');
const multer = require('multer');
const { scan, Verdict } = require('pompelmi');
const { BlobServiceClient } = require('@azure/storage-blob');
const crypto = require('crypto');
const path = require('path');
const fs = require('fs');
const os = require('os');

const app = express();

const upload = multer({
  dest: os.tmpdir(),
  limits: { fileSize: 50 * 1024 * 1024 }, // 50 MB
});

const blobServiceClient = BlobServiceClient.fromConnectionString(
  process.env.AZURE_STORAGE_CONNECTION_STRING
);
const containerClient = blobServiceClient.getContainerClient(
  process.env.AZURE_CONTAINER_NAME
);

app.post('/upload', upload.single('file'), async (req, res) => {
  if (!req.file) {
    return res.status(400).json({ error: 'No file provided.' });
  }

  const tmpPath = req.file.path;
  let tmpDeleted = false;

  try {
    // Step 1 — scan locally before touching Azure
    const verdict = await scan(tmpPath);
    if (verdict === Verdict.Malicious) {
      return res.status(400).json({ error: 'Malware detected. Upload rejected.' });
    }
    if (verdict === Verdict.ScanError) {
      return res.status(422).json({ error: 'Scan incomplete. Upload rejected.' });
    }

    // Step 2 — file is clean, upload to Azure Blob Storage
    const ext = path.extname(req.file.originalname).toLowerCase();
    const blobName = 'uploads/' + crypto.randomBytes(16).toString('hex') + ext;
    const blockBlob = containerClient.getBlockBlobClient(blobName);

    await blockBlob.uploadFile(tmpPath, {
      blobHTTPHeaders: { blobContentType: req.file.mimetype },
      metadata: {
        scanStatus: 'clean',
        scannedBy: 'pompelmi',
        scanEngine: 'clamav',
        scannedAt: new Date().toISOString(),
        originalName: encodeURIComponent(req.file.originalname),
      },
    });

    // Step 3 — clean up
    fs.unlinkSync(tmpPath);
    tmpDeleted = true;

    return res.json({
      status: 'ok',
      blobName,
      url: blockBlob.url,
    });
  } catch (err) {
    // Log the details server-side; don't leak internals to the client.
    console.error(err);
    return res.status(500).json({ error: 'Upload failed.' });
  } finally {
    if (!tmpDeleted && fs.existsSync(tmpPath)) {
      fs.unlinkSync(tmpPath);
    }
  }
});

app.listen(3000, () => console.log('Listening on :3000'));
```
Note that blob metadata keys must be valid C# identifiers, so use camelCase names (`scanStatus`, not `scan-status`).
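If metadata keys come from anywhere dynamic (form fields, config), it can help to normalize them before calling the SDK. A minimal sketch; the `toMetadataKey` helper is hypothetical, not part of any SDK:

```javascript
// Hypothetical helper: turn an arbitrary key ('scan-status',
// 'scanned at') into a camelCase identifier Azure will accept.
function toMetadataKey(key) {
  return key
    .replace(/[-_\s]+(.)/g, (_, c) => c.toUpperCase()) // 'scan-status' → 'scanStatus'
    .replace(/[^A-Za-z0-9_]/g, '');                    // drop any leftover symbols
}

console.log(toMetadataKey('scan-status')); // → scanStatus
```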
## Blob metadata and tags

Azure offers two ways to attach key-value data to a blob:

- **Metadata**: attached at upload time or via `setMetadata()`. Returned in the HTTP response headers. Keys are case-insensitive and must be valid identifiers.
- **Blob index tags**: queryable across an entire container. Useful for finding all blobs with `scanStatus = 'clean'`.
```js
// Set blob index tags for a queryable audit trail
await blockBlob.setTags({
  scanStatus: 'clean',
  scannedBy: 'pompelmi',
});
```
Query blobs by tag using the Blob Service's tag filtering:

```js
// Find all blobs in the account that were scanned and found clean
const tagFilter = `"scanStatus" = 'clean'`;
for await (const blob of blobServiceClient.findBlobsByTags(tagFilter)) {
  console.log(blob.name, blob.tags);
}
```
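Filters on several tags are combined with `AND` in the filter expression. A small sketch of a filter builder; the `buildTagFilter` helper is hypothetical, not part of the SDK:

```javascript
// Hypothetical helper: build a findBlobsByTags filter expression
// from a plain object, joining conditions with AND.
function buildTagFilter(conditions) {
  return Object.entries(conditions)
    .map(([key, value]) => `"${key}" = '${value}'`)
    .join(' AND ');
}

console.log(buildTagFilter({ scanStatus: 'clean', scannedBy: 'pompelmi' }));
// → "scanStatus" = 'clean' AND "scannedBy" = 'pompelmi'
```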
## Using Managed Identity in production

Avoid storing connection strings in environment variables in production. Use Managed Identity with `DefaultAzureCredential`:

```bash
npm install @azure/identity
```

```js
const { BlobServiceClient } = require('@azure/storage-blob');
const { DefaultAzureCredential } = require('@azure/identity');

const blobServiceClient = new BlobServiceClient(
  `https://${process.env.AZURE_STORAGE_ACCOUNT_NAME}.blob.core.windows.net`,
  new DefaultAzureCredential()
);
```

Assign the **Storage Blob Data Contributor** role to your app's managed identity in the Azure portal under the storage account's **Access control (IAM)** blade. No credentials are stored anywhere.
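If you prefer the CLI over the portal, the role assignment looks roughly like this (all angle-bracketed values are placeholders for your own subscription, resource group, account, and identity):

```bash
# Placeholders: <principal-id> is the managed identity's object ID;
# the --scope path identifies your storage account.
az role assignment create \
  --assignee <principal-id> \
  --role "Storage Blob Data Contributor" \
  --scope "/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.Storage/storageAccounts/<account>"
```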
## Next steps

- Using AWS S3 instead? See Scanning files before uploading to AWS S3.
- Running pompelmi on Kubernetes (AKS)? See Setting up pompelmi with ClamAV on Kubernetes.
- Want the full upload security checklist? Read the Node.js file upload security checklist.