Requirements

scanS3() requires @aws-sdk/client-s3. pompelmi does not bundle the SDK — it is an optional peer dependency, so you only pay for it when you actually use the S3 integration.

npm install @aws-sdk/client-s3

If the SDK is not installed when scanS3() is called, pompelmi throws immediately with the message:

Install AWS SDK: npm install @aws-sdk/client-s3
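If you prefer to fail fast at startup rather than at the first scanS3() call, a small guard can verify the optional peer dependency is resolvable. The helper below is our own sketch, not a pompelmi API; it just mirrors the error message pompelmi throws.

```javascript
// Hypothetical startup guard (not part of pompelmi): fail fast if an
// optional peer dependency is missing.
function assertPeerDependency(name) {
  try {
    require.resolve(name);
  } catch {
    throw new Error(`Install AWS SDK: npm install ${name}`);
  }
}
```

Call assertPeerDependency('@aws-sdk/client-s3') once during application startup, before any scan traffic arrives.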

You also need a running clamd instance reachable from your application. See Docker & Remote Scanning for setup instructions.

Basic usage

scanS3() downloads the S3 object as a stream and pipes it directly to scanStream(). No disk I/O occurs in clamd mode. All ScanOptions (socket, host, port, timeout, retries, retryDelay) pass through to the underlying scanStream() call.

const { scanS3, Verdict } = require('pompelmi');

const result = await scanS3(
  { bucket: 'my-uploads', key: 'incoming/report.pdf', region: 'us-east-1' },
  { host: '127.0.0.1', port: 3310 }
);

switch (result) {
  case Verdict.Clean:
    console.log('File is safe');
    break;
  case Verdict.Malicious:
    console.error('Malware detected!');
    // quarantine or delete the object
    break;
  case Verdict.ScanError:
    console.error('Scan inconclusive — treat as untrusted');
    break;
}
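The switch above can be centralized in a small policy helper. This sketch is our own (not a pompelmi API) and assumes only the three verdict values shown; anything other than Clean is treated as unsafe, so ScanError falls through to a default deny.

```javascript
// Hypothetical policy helper (not a pompelmi API): map a verdict to an
// action string. ScanError and anything unexpected are rejected by default.
function actionFor(verdict, Verdict) {
  if (verdict === Verdict.Clean) return 'accept';
  if (verdict === Verdict.Malicious) return 'quarantine';
  return 'reject'; // ScanError or unknown: default deny
}
```

Usage: const action = actionFor(result, Verdict); — keeping the decision in one place makes it harder to accidentally treat an inconclusive scan as clean.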

IAM permissions

The IAM principal running your application (Lambda execution role, EC2 instance profile, ECS task role, etc.) needs s3:GetObject on the objects you intend to scan. A minimal policy looks like this:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "PompelmiScanRead",
      "Effect": "Allow",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::my-uploads/*"
    }
  ]
}

If your Lambda also needs to delete infected objects (see the Lambda example below), add s3:DeleteObject to the same statement.
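With the delete permission added, the statement becomes (same bucket as above):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "PompelmiScanReadDelete",
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:DeleteObject"],
      "Resource": "arn:aws:s3:::my-uploads/*"
    }
  ]
}
```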

Passing credentials explicitly

When running without an ambient IAM role (e.g. local development), pass credentials directly in the params object.

const { scanS3, Verdict } = require('pompelmi');

const result = await scanS3(
  {
    bucket: 'my-uploads',
    key: 'test/sample.docx',
    region: 'eu-west-1',
    credentials: {
      accessKeyId:     process.env.AWS_ACCESS_KEY_ID,
      secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY,
    },
  },
  { host: '127.0.0.1', port: 3310 }
);

When credentials is omitted, pompelmi uses the AWS SDK default credential chain: environment variables (AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY), the shared credentials file (~/.aws/credentials), EC2/ECS/Lambda instance metadata, and so on. On AWS infrastructure you typically do not need to pass credentials at all.
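For reference, the shared credentials file mentioned above has this shape (placeholder values, not real keys):

```ini
[default]
aws_access_key_id     = AKIA...
aws_secret_access_key = ...
```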

Lambda example

The following Lambda function is triggered by an S3 ObjectCreated event. It scans each new object and deletes it if malware is detected.

const { scanS3, Verdict } = require('pompelmi');
const { S3Client, DeleteObjectCommand } = require('@aws-sdk/client-s3');

const s3 = new S3Client({});

exports.handler = async (event) => {
  const results = [];

  for (const record of event.Records) {
    const bucket = record.s3.bucket.name;
    const key    = decodeURIComponent(record.s3.object.key.replace(/\+/g, ' '));

    const verdict = await scanS3(
      { bucket, key },
      {
        host:    process.env.CLAMAV_HOST,
        port:    Number(process.env.CLAMAV_PORT) || 3310,
        timeout: 30_000,
      }
    );

    if (verdict === Verdict.Malicious) {
      console.error(`INFECTED: s3://${bucket}/${key}`);
      await s3.send(new DeleteObjectCommand({ Bucket: bucket, Key: key }));
      results.push({ key, status: 'deleted' });
    } else {
      results.push({ key, status: verdict.description });
    }
  }

  return { statusCode: 200, body: JSON.stringify(results) };
};

clamd must be reachable from the Lambda function. The recommended architecture is a clamd container running as an ECS sidecar or ECS Fargate task in the same VPC as the Lambda, with a security group rule allowing port 3310 inbound from the Lambda's security group.

Scan before upload (pre-upload pattern)

Scan a local file before uploading it to S3. Only upload if the file is clean. This prevents infected files from ever landing in the bucket.

const { scan, Verdict } = require('pompelmi');
const { S3Client, PutObjectCommand } = require('@aws-sdk/client-s3');
const fs = require('fs');

const s3 = new S3Client({ region: 'us-east-1' });

async function safeUpload(localPath, bucket, key) {
  // Scan locally first
  const verdict = await scan(localPath, {
    host: process.env.CLAMAV_HOST,
    port: Number(process.env.CLAMAV_PORT) || 3310,
  });

  if (verdict !== Verdict.Clean) {
    throw new Error(`Upload rejected: scan result = ${verdict.description}`);
  }

  // Only reached when clean
  await s3.send(new PutObjectCommand({
    Bucket: bucket,
    Key:    key,
    Body:   fs.createReadStream(localPath),
  }));

  console.log(`Uploaded clean file to s3://${bucket}/${key}`);
}

await safeUpload('/tmp/user-upload.zip', 'my-uploads', 'processed/user-upload.zip');

Retry on transient errors

Use retries and retryDelay to handle transient connection errors without extra wrapper code. Useful in Lambda where the clamd container may occasionally restart.

const result = await scanS3(
  { bucket: 'my-uploads', key: 'incoming/archive.zip', region: 'us-east-1' },
  {
    host:        process.env.CLAMAV_HOST,
    port:        3310,
    retries:     3,
    retryDelay:  500,   // ms between attempts
    timeout:     20_000,
  }
);

Retries apply to connection errors (ECONNREFUSED, timeout, etc.), not to positive malware detections. A Verdict.Malicious result is never retried.
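If you also want to re-attempt on Verdict.ScanError (for example while clamd reloads its signature database), a thin wrapper can layer that behavior on top of the built-in connection retries. The helper below is our own sketch, not a pompelmi API.

```javascript
// Hypothetical wrapper (not part of pompelmi): re-run the scan when it
// returns the given "error" verdict. Built-in retries only cover
// connection errors, so this handles inconclusive results separately.
async function rescanOnError(scanFn, errorVerdict, { attempts = 3, delayMs = 500 } = {}) {
  let verdict;
  for (let i = 0; i < attempts; i++) {
    verdict = await scanFn();
    if (verdict !== errorVerdict) return verdict;
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
  return verdict; // still the error verdict after all attempts
}
```

Usage: const result = await rescanOnError(() => scanS3(params, options), Verdict.ScanError); — a result that is still ScanError after all attempts should be treated as untrusted, as in the basic example.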