Scanning image uploads (JPEG, PNG, WebP, GIF) in Node.js

Profile pictures, product photos, and document scans make up the bulk of user-generated file uploads. Images look safe because they are just pixels, but image upload endpoints are exposed to several real attack vectors:

  • Malware embedded in image metadata. EXIF and IPTC metadata fields inside JPEG and PNG files are sometimes used to carry shellcode or malicious scripts that exploit vulnerable image-processing libraries (ImageMagick, libvips, Pillow).
  • Polyglot files. A file can simultaneously be a valid JPEG and a valid PHP script, ZIP archive, or HTML document. A server that executes or serves the file incorrectly may trigger the second identity.
  • SVG XSS. SVG is XML that supports embedded <script> tags. Serving an uploaded SVG with Content-Type: image/svg+xml from your own origin can execute JavaScript in the browser of anyone who opens it directly.
  • Oversized or malformed images. A small file can declare enormous dimensions and expand to gigabytes of pixel data when decoded (a decompression bomb), and malformed headers can crash or exploit image libraries.
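
One way to guard against decompression bombs is to read the dimensions a file declares before handing it to a decoder. The sketch below does this for PNG only, assuming the standard layout (an 8-byte signature followed directly by the IHDR chunk). The helper names and the 50-megapixel limit are illustrative assumptions, not values taken from any library.

```javascript
// Hypothetical helper: read the width/height a PNG declares in its IHDR chunk.
// Layout: 8-byte signature, 4-byte chunk length, 4-byte type "IHDR",
// then a 4-byte big-endian width and a 4-byte big-endian height.
const PNG_SIGNATURE = Buffer.from([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A]);

function pngDeclaredSize(buf) {
  if (buf.length < 24) return null;
  if (!buf.subarray(0, 8).equals(PNG_SIGNATURE)) return null;
  if (buf.toString('ascii', 12, 16) !== 'IHDR') return null;
  return { width: buf.readUInt32BE(16), height: buf.readUInt32BE(20) };
}

// Assumed limit: 50 megapixels. Tune this to what your pipeline can afford.
const MAX_PIXELS = 50_000_000;

function isPlausiblePng(buf, maxPixels = MAX_PIXELS) {
  const size = pngDeclaredSize(buf);
  if (!size || size.width === 0 || size.height === 0) return false;
  return size.width * size.height <= maxPixels;
}
```

Only the first 24 bytes of the file are needed, so this can run before any decoding. Decoders often enforce their own ceiling as well (sharp, for instance, exposes a limitInputPixels option), so treat this as a cheap first pass rather than a replacement.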

Magic byte validation

Never trust the file extension or the Content-Type header — both are set by the client. Read the first few bytes of the file and compare them against the known signatures for your allowed types:

const fs = require('fs');

// Each type lists one or more byte sequences that must all match.
const IMAGE_MAGIC = {
  jpeg: [{ offset: 0, bytes: [0xFF, 0xD8, 0xFF] }],
  png:  [{ offset: 0, bytes: [0x89, 0x50, 0x4E, 0x47] }],
  gif:  [{ offset: 0, bytes: [0x47, 0x49, 0x46, 0x38] }],   // "GIF8"
  webp: [{ offset: 0, bytes: [0x52, 0x49, 0x46, 0x46] },    // "RIFF" at offset 0
         { offset: 8, bytes: [0x57, 0x45, 0x42, 0x50] }],   // "WEBP" at offset 8
};

// Read the first `length` bytes of the file; short files are zero-padded.
function readHeader(filePath, length) {
  const fd = fs.openSync(filePath, 'r');
  try {
    const buf = Buffer.alloc(length);
    fs.readSync(fd, buf, 0, length, 0);
    return buf;
  } finally {
    fs.closeSync(fd); // always release the descriptor, even if the read throws
  }
}

function detectImageType(filePath) {
  const header = readHeader(filePath, 12); // 12 bytes covers every signature above
  for (const [type, parts] of Object.entries(IMAGE_MAGIC)) {
    const matches = parts.every(({ offset, bytes }) =>
      header.subarray(offset, offset + bytes.length).equals(Buffer.from(bytes)));
    if (matches) return type;
  }
  return null;
}

// Usage
const type = detectImageType(tmpPath);
if (!type) {
  return res.status(400).json({ error: 'File is not a recognised image type.' });
}

SVG and XSS

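SVG files are XML and can contain <script> tags, event handlers, and javascript: URIs. If you serve an SVG with the image/svg+xml content type and a user opens it directly (or it is embedded via <object> or <iframe>), any embedded script runs in the user's browser with your origin's privileges. Scripts do not run when the SVG is loaded through an <img> tag, but you cannot control how an attacker links to the file.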

The safest policy is to reject SVG uploads entirely unless you have a specific need for them. If you must accept SVGs, sanitise them with a library such as DOMPurify (run server-side via jsdom) before storing:

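// Reject SVGs outright. The extension is client-supplied, so this is only an
// early, user-friendly rejection; the magic-byte allowlist is the real gate.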
const ext = req.file.originalname.split('.').pop()?.toLowerCase();
if (ext === 'svg') {
  return res.status(400).json({
    error: 'SVG uploads are not accepted. Please upload a JPEG, PNG, or WebP.',
  });
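}

Even if you sanitise SVGs, serve them with Content-Disposition: attachment or from a separate sandboxed domain so they cannot access cookies on your main domain. A different registrable domain (for example, a dedicated user-content domain) isolates better than a subdomain such as static.example.com, because cookies set with Domain=example.com are sent to every subdomain.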
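
The extension check above depends on a client-supplied filename, so a renamed SVG passes it. A content-based check is a useful complement. The sketch below is a heuristic, not an XML parser: it inspects only the start of the file, and the function name and 1024-byte window are assumptions made for illustration.

```javascript
// Heuristic sketch: decide whether a buffer looks like SVG/XML markup.
// It inspects only the first `sniffLength` bytes and assumes UTF-8 text.
function looksLikeSvg(buf, sniffLength = 1024) {
  let head = buf.subarray(0, sniffLength).toString('utf8');
  // Drop a UTF-8 byte order mark and leading whitespace, then normalise case.
  head = head.replace(/^\uFEFF/, '').trimStart().toLowerCase();
  return (
    head.startsWith('<?xml') ||
    head.startsWith('<!doctype svg') ||
    /<svg[\s>\/]/.test(head)
  );
}
```

This will not catch SVG encoded as UTF-16 or markup preceded by a long comment, which is one more reason to rely on an allowlist of magic bytes rather than a denylist of SVG patterns.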

Polyglot file detection

A polyglot file is valid in two formats simultaneously. For example, a file can start with JPEG magic bytes but end with valid PHP or HTML. If your server passes the file to an interpreter based on extension rather than content, the attack triggers.
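
Appended payloads, the simplest polyglot construction, can be spotted by checking whether anything follows the image's end marker. The sketch below does this for PNG by walking the chunk list until IEND. It assumes the caller has already verified the PNG signature, it does not validate CRCs, and the function name is hypothetical.

```javascript
// Sketch: report whether a PNG buffer has bytes after its IEND chunk.
// Each chunk is: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
// Returns true/false, or null when the chunk structure cannot be followed.
function pngHasTrailingData(buf) {
  let pos = 8; // skip the 8-byte PNG signature
  while (pos + 12 <= buf.length) {
    const length = buf.readUInt32BE(pos);
    const type = buf.toString('ascii', pos + 4, pos + 8);
    const end = pos + 12 + length; // offset just past this chunk's CRC
    if (end > buf.length) return null; // truncated chunk
    if (type === 'IEND') return end < buf.length;
    pos = end;
  }
  return null; // no IEND chunk found
}
```

A result of true or null is a reasonable signal to reject the upload. This only catches data placed after the image; payloads hidden inside metadata chunks need the mitigations listed next.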

Practical mitigations:

  • Re-encode images server-side. Processing the image through Sharp or Jimp and saving the output destroys any appended data. sharp(tmpPath).jpeg().toFile(destPath) produces a clean JPEG.
  • Serve uploads from a separate origin. Files served from cdn.example.com run in a different origin from app.example.com and do not receive its host-only cookies.
  • Set Content-Type explicitly. Always set the response Content-Type from your trusted list, not from what was uploaded.
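
The last mitigation can be sketched as a lookup from the server-detected type to a fixed Content-Type, with anything outside the list refused. The helper name below is hypothetical, and the nosniff header is an addition beyond what the list above mentions.

```javascript
// Map from server-detected type to the Content-Type we are willing to send.
const CONTENT_TYPES = new Map([
  ['jpeg', 'image/jpeg'],
  ['png', 'image/png'],
  ['gif', 'image/gif'],
  ['webp', 'image/webp'],
]);

// Hypothetical helper: build response headers for a stored upload.
// `detectedType` must come from server-side detection, never from the client.
function imageResponseHeaders(detectedType) {
  const contentType = CONTENT_TYPES.get(detectedType);
  if (!contentType) return null; // not on the trusted list
  return {
    'Content-Type': contentType,
    // Ask browsers not to second-guess the declared type.
    'X-Content-Type-Options': 'nosniff',
  };
}
```

In an Express handler the returned object can be passed to res.set() before streaming the file; a null result should be treated as a refusal to serve.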

What ClamAV catches in images

ClamAV scans the full file content, including metadata regions. It detects:

  • Known malware payloads embedded in EXIF/IPTC fields
  • PHP/JSP/HTML webshells appended after the image data
  • Exploit code targeting specific image library CVEs
  • Malicious scripts disguised with image extensions

ClamAV is signature-based, so it does not detect novel polyglot constructions that have no existing signature. The re-encoding step above is the more reliable defense against polyglots.

Complete image upload endpoint

const express = require('express');
const multer  = require('multer');
const { scan, Verdict } = require('pompelmi');
const fs = require('fs');
const os = require('os');

const app    = express();
const upload = multer({
  dest:   os.tmpdir(),
  limits: { fileSize: 10 * 1024 * 1024 },  // 10 MB
});

function detectImageType(filePath) {
  const fd  = fs.openSync(filePath, 'r');
  const buf = Buffer.alloc(12);
  fs.readSync(fd, buf, 0, 12, 0);
  fs.closeSync(fd);

  if (buf[0] === 0xFF && buf[1] === 0xD8 && buf[2] === 0xFF) return 'jpeg';
  if (buf[0] === 0x89 && buf[1] === 0x50 && buf[2] === 0x4E && buf[3] === 0x47) return 'png';
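  if (buf[0] === 0x47 && buf[1] === 0x49 && buf[2] === 0x46 && buf[3] === 0x38) return 'gif';
  // WebP is a RIFF container: require "RIFF" at offset 0 as well as "WEBP" at offset 8
  if (buf.toString('ascii', 0, 4) === 'RIFF' && buf.toString('ascii', 8, 12) === 'WEBP') return 'webp';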
  return null;
}

app.post('/upload/image', upload.single('file'), async (req, res) => {
  if (!req.file) return res.status(400).json({ error: 'No file provided.' });

  const tmpPath = req.file.path;

  try {
    // Layer 1: reject files named .svg early (the name is client-supplied, so Layer 2 is the real gate)
    const ext = req.file.originalname.split('.').pop()?.toLowerCase();
    if (ext === 'svg') {
      return res.status(400).json({ error: 'SVG uploads are not accepted.' });
    }

    // Layer 2 — validate by magic bytes (ignore client Content-Type)
    const imageType = detectImageType(tmpPath);
    if (!imageType) {
      return res.status(400).json({ error: 'File is not a recognised image type.' });
    }

    // Layer 3 — ClamAV scan
    const verdict = await scan(tmpPath);

    if (verdict === Verdict.Malicious) {
      return res.status(400).json({ error: 'Malware detected. Upload rejected.' });
    }
    if (verdict === Verdict.ScanError) {
      return res.status(422).json({ error: 'Scan incomplete. Upload rejected.' });
    }

    // File is clean. Copy or re-encode it to permanent storage here,
    // before the finally block below deletes the temp file.
    return res.json({ status: 'ok', type: imageType });

  } finally {
    if (fs.existsSync(tmpPath)) fs.unlinkSync(tmpPath);
  }
});
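
The endpoint above leaves the persistence step open. One common approach, sketched here under the assumption that files are stored on local disk, is to generate the stored filename on the server and derive the extension from the detected type, so that nothing from req.file.originalname reaches the filesystem. The helper name and the upload directory are hypothetical.

```javascript
const crypto = require('crypto');
const path = require('path');

// Extension chosen from the server-detected type, not from the upload's name.
const EXTENSIONS = new Map([
  ['jpeg', '.jpg'],
  ['png', '.png'],
  ['gif', '.gif'],
  ['webp', '.webp'],
]);

// Hypothetical helper: pick a destination path for a validated upload.
function storagePathFor(imageType, uploadDir) {
  const ext = EXTENSIONS.get(imageType);
  if (!ext) throw new Error(`Unsupported image type: ${imageType}`);
  // A random UUID avoids collisions and path tricks in user-supplied names.
  return path.join(uploadDir, crypto.randomUUID() + ext);
}
```

Inside the handler, a call such as fs.copyFileSync(tmpPath, storagePathFor(imageType, '/var/uploads')) placed before the success response would persist the file while the finally block still cleans up the temp copy.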

Next steps