Setting up pompelmi with ClamAV on Kubernetes

In a Kubernetes environment you have two options for running ClamAV alongside your Node.js application:

  1. Include ClamAV in the app container image. Simple to set up but adds several hundred megabytes to your image and couples the ClamAV upgrade cycle to your application deployments.
  2. Run ClamAV (clamd) as a separate Deployment. Larger initial setup but the recommended approach for production: ClamAV can be scaled, updated, and monitored independently, and a single clamd instance can serve multiple application pods.

This guide uses option 2. pompelmi's TCP mode lets it talk to a remote clamd instance by passing { host, port } to scan(), so no code changes beyond the options object are required.

Architecture

The setup has three Kubernetes objects:

  • clamd Deployment — runs the ClamAV daemon in a container. Uses the official clamav/clamav Docker Hub image.
  • clamd Service — a ClusterIP service that exposes clamd on port 3310 inside the cluster.
  • Node.js app Deployment — your application, configured via an environment variable to reach clamd by its service name.

pompelmi streams the file's bytes from the Node.js container to clamd over TCP (ClamAV's INSTREAM protocol). No shared volume is required: the file itself stays in the Node.js pod's local /tmp, and only the stream crosses the network.
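Under the hood, INSTREAM opens the session with a zINSTREAM command, then sends the file as length-prefixed chunks and ends with a zero-length chunk. The framing can be sketched as follows (illustrative only; pompelmi implements this internally, and frameChunk/frameStream are not part of its API):

```javascript
// Sketch of ClamAV's INSTREAM chunk framing (illustrative; pompelmi does
// this for you). Each chunk is a 4-byte big-endian length followed by the
// data; a zero-length chunk marks end-of-stream.
function frameChunk(data) {
  const header = Buffer.alloc(4);
  header.writeUInt32BE(data.length, 0);      // network byte order
  return Buffer.concat([header, data]);
}

function frameStream(buffers) {
  const parts = [Buffer.from('zINSTREAM\0')]; // null-terminated command
  for (const buf of buffers) parts.push(frameChunk(buf));
  parts.push(frameChunk(Buffer.alloc(0)));    // zero-length terminator
  return Buffer.concat(parts);
}
```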

clamd Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: clamav
  labels:
    app: clamav
spec:
  replicas: 1
  selector:
    matchLabels:
      app: clamav
  template:
    metadata:
      labels:
        app: clamav
    spec:
      containers:
        - name: clamav
          image: clamav/clamav:stable
          ports:
            - containerPort: 3310
          resources:
            requests:
              memory: "512Mi"
              cpu:    "250m"
            limits:
              memory: "2Gi"
              cpu:    "1000m"
          readinessProbe:
            tcpSocket:
              port: 3310
            initialDelaySeconds: 60   # freshclam needs time on first boot
            periodSeconds: 10
          livenessProbe:
            tcpSocket:
              port: 3310
            initialDelaySeconds: 90
            periodSeconds: 20
          volumeMounts:
            - name: clamav-db
              mountPath: /var/lib/clamav
      volumes:
        - name: clamav-db
          emptyDir: {}   # replace with a PersistentVolumeClaim for faster restarts

The official clamav/clamav image runs freshclam on startup to download the virus database. This takes 1–3 minutes on first run. The initialDelaySeconds: 60 in the readiness probe prevents Kubernetes from routing traffic before the database is ready.
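On Kubernetes 1.18+, a startupProbe expresses the wait-for-freshclam condition more directly than a large initialDelaySeconds: the pod gets a generous startup window but is marked ready as soon as the port answers. A sketch to add alongside the probes above (the failureThreshold value is an assumption; tune it to your network):

```yaml
          startupProbe:
            tcpSocket:
              port: 3310
            periodSeconds: 10
            failureThreshold: 30   # up to 5 minutes for the initial database download
```

Once the startupProbe succeeds, the readiness and liveness probes take over, so their initialDelaySeconds can then be reduced.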

clamd Service

apiVersion: v1
kind: Service
metadata:
  name: clamav
spec:
  selector:
    app: clamav
  ports:
    - port:       3310
      targetPort: 3310
  type: ClusterIP

With this Service in place, any pod in the same namespace can reach clamd at clamav:3310. Pods in other namespaces use clamav.<namespace>.svc.cluster.local:3310.

Node.js app Deployment

Pass the clamd host and port to your Node.js container via environment variables so you can change them without rebuilding the image:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: upload-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: upload-api
  template:
    metadata:
      labels:
        app: upload-api
    spec:
      containers:
        - name: app
          image: your-registry/upload-api:latest
          ports:
            - containerPort: 3000
          env:
            - name: CLAMD_HOST
              value: "clamav"        # matches the Service name
            - name: CLAMD_PORT
              value: "3310"
            - name: CLAMD_TIMEOUT
              value: "15000"

Calling scan() in TCP mode

Read the environment variables and pass them as options to every scan() call:

const { scan, Verdict } = require('pompelmi');

const CLAMD_OPTIONS = {
  host:    process.env.CLAMD_HOST    || '127.0.0.1',
  port:    Number(process.env.CLAMD_PORT)    || 3310,
  timeout: Number(process.env.CLAMD_TIMEOUT) || 15_000,
};

async function scanFile(filePath) {
  return scan(filePath, CLAMD_OPTIONS);
}

Use scanFile wherever you previously called scan directly. The Verdict API is identical regardless of which backend is used.

const fs = require('fs');

// Assumes an Express app with a multer instance (`upload`) writing uploads to disk.
app.post('/upload', upload.single('file'), async (req, res) => {
  const tmpPath = req.file.path;
  try {
    const verdict = await scanFile(tmpPath);

    if (verdict === Verdict.Malicious) {
      fs.unlinkSync(tmpPath);
      return res.status(400).json({ error: 'Malware detected.' });
    }
    if (verdict === Verdict.ScanError) {
      fs.unlinkSync(tmpPath);
      return res.status(422).json({ error: 'Scan incomplete. Rejected.' });
    }

    // File is clean — proceed
    return res.json({ status: 'ok' });

  } catch (err) {
    if (fs.existsSync(tmpPath)) fs.unlinkSync(tmpPath);
    // ECONNREFUSED means clamd pod is not yet ready
    return res.status(503).json({ error: 'Scanner unavailable: ' + err.message });
  }
});

During rolling deployments, clamd pods may momentarily be unavailable. Return HTTP 503 for ECONNREFUSED errors so your load balancer or API gateway can retry on another pod.
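If you prefer retrying in-process rather than relying on the gateway, a small wrapper can reattempt connection-refused failures after a short pause. A sketch; the withRetry helper and its defaults are illustrative, not part of pompelmi:

```javascript
// Illustrative retry wrapper: reattempts only connection failures, with a
// fixed pause between attempts. Other errors are rethrown immediately.
const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function withRetry(fn, { attempts = 3, waitMs = 500 } = {}) {
  let lastErr;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastErr = err;
      if (err.code !== 'ECONNREFUSED') throw err; // only retry connection refusals
      if (i < attempts - 1) await delay(waitMs);
    }
  }
  throw lastErr;
}

// Usage: const verdict = await withRetry(() => scanFile(tmpPath));
```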

Keeping the virus database current

The clamav/clamav image runs freshclam on startup. To avoid re-downloading the full database on every restart, mount a PersistentVolumeClaim at /var/lib/clamav (replacing the emptyDir above) and add a CronJob that periodically restarts the clamd Deployment so freshclam fetches the latest signatures:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: clamav-db-update
spec:
  schedule: "0 3 * * *"   # Daily at 03:00
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: updater
              image: bitnami/kubectl:latest
              command:
                - kubectl
                - rollout
                - restart
                - deployment/clamav

A more robust production setup runs freshclam as a sidecar container that periodically updates the shared PVC, with clamd reloading the database when its periodic SelfCheck detects a change. The CronJob restart approach is simpler and adequate for most workloads.
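Note that running kubectl rollout restart from inside a pod requires RBAC: the CronJob's pod must use a ServiceAccount allowed to get and patch the clamav Deployment (the default ServiceAccount cannot). A minimal sketch with illustrative names; reference it via serviceAccountName in the CronJob's pod spec:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: clamav-restarter
rules:
  - apiGroups: ["apps"]
    resources: ["deployments"]
    verbs: ["get", "patch"]   # rollout restart patches the pod template annotation
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: clamav-restarter
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: clamav-restarter
subjects:
  - kind: ServiceAccount
    name: clamav-restarter
```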

Next steps