SHA256 Scan Cache

The scan cache avoids rescanning files whose SHA256 hash has already been seen. On a cache hit, the stored verdict is returned immediately without a clamd call; the only remaining work is hashing the input. The cache uses an in-memory LRU map with a configurable TTL and an optional JSON file backend for persistence across restarts.

All hashing uses Node.js built-in crypto.createHash('sha256') — no external dependencies.
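
As a rough illustration of the hashing step, the sketch below computes a SHA256 hex digest for a buffer and, via a stream, for a file. The helper names sha256OfBuffer and sha256OfFile are hypothetical and not part of pompelmi's API; only crypto.createHash and fs.createReadStream are Node.js built-ins.

```javascript
const crypto = require('node:crypto');
const fs = require('node:fs');

// Hypothetical helper: hash an in-memory buffer and return the hex digest.
function sha256OfBuffer(buffer) {
  return crypto.createHash('sha256').update(buffer).digest('hex');
}

// Hypothetical helper: hash a file by streaming it, so large files
// are never held in memory all at once.
function sha256OfFile(filePath) {
  return new Promise((resolve, reject) => {
    const hash = crypto.createHash('sha256');
    fs.createReadStream(filePath)
      .on('error', reject)
      .on('data', (chunk) => hash.update(chunk))
      .on('end', () => resolve(hash.digest('hex')));
  });
}
```

Two inputs with identical bytes always produce the same digest, which is why the digest can serve as a cache key regardless of file name or path.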

createCache(options?)

const { createCache } = require('pompelmi');

const cache = createCache({
  ttl:     3600000, // 1 hour
  maxSize: 1000,
  storage: 'memory'
});

// Wrap any scan with caching (run inside an async function).
// `buffer` is a Buffer holding the file contents.
const fileVerdict = await cache.scan('/uploads/report.pdf', { host: 'localhost', port: 3310 });
const bufferVerdict = await cache.scanBuffer(buffer, { host: 'localhost', port: 3310 });

Options

Option    Type               Default                   Description
ttl       number             3600000                   Time-to-live per entry in milliseconds (default: 1 hour).
maxSize   number             1000                      Maximum number of entries. The least recently used (LRU) entry is evicted when the limit is reached.
storage   'memory' | 'file'  'memory'                  Storage backend. Use 'file' to persist the cache across restarts.
filePath  string             './.pompelmi-cache.json'  Path to the JSON cache file. Only used when storage is 'file'.
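
The interaction of ttl and maxSize can be pictured with a minimal LRU map. This is a sketch under assumptions, not pompelmi's actual implementation: the class name LruTtlMap and the injectable now clock are hypothetical, and the code relies only on the fact that a JavaScript Map iterates in insertion order.

```javascript
// Hypothetical sketch of an LRU map with per-entry TTL (not the library's real code).
class LruTtlMap {
  constructor({ ttl = 3600000, maxSize = 1000, now = Date.now } = {}) {
    this.ttl = ttl;
    this.maxSize = maxSize;
    this.now = now;            // injectable clock, handy for testing
    this.entries = new Map();  // insertion order: first key = least recently used
  }

  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      // Expired: drop the entry and report a miss.
      this.entries.delete(key);
      return undefined;
    }
    // Re-insert to mark the key as most recently used.
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  set(key, value) {
    this.entries.delete(key); // refresh the position if the key already exists
    if (this.entries.size >= this.maxSize) {
      // Evict the least recently used entry (first key in iteration order).
      const lruKey = this.entries.keys().next().value;
      this.entries.delete(lruKey);
    }
    this.entries.set(key, { value, expiresAt: this.now() + this.ttl });
  }

  get size() {
    return this.entries.size;
  }
}
```

In this sketch a read counts as a "use", so frequently requested hashes stay in the map while rarely requested ones are evicted first.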

Methods

Method                              Returns           Description
cache.scan(filePath, options?)      Promise<Verdict>  Computes SHA256 of the file at filePath, checks the cache, and calls the underlying scan() on a miss.
cache.scanBuffer(buffer, options?)  Promise<Verdict>  Computes SHA256 of the buffer, checks the cache, and calls the underlying scanBuffer() on a miss.
cache.stats()                       CacheStats        Returns hit/miss/size statistics. See stats().
cache.clear()                       void              Empties the cache and resets all statistics.
cache.delete(sha256)                void              Removes a single entry by its SHA256 hex string.
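
The hit/miss flow described for cache.scanBuffer() can be sketched roughly as follows. The function withCache, the store map, and the injected scanFn are hypothetical names used for illustration; the real library may structure this differently, and TTL handling and eviction are left out to keep the example short.

```javascript
const crypto = require('node:crypto');

// Hypothetical wrapper: returns a cached verdict when the content hash is
// already known, otherwise delegates to scanFn (a stand-in for the
// underlying scanBuffer()) and stores the result.
function withCache(scanFn, store = new Map()) {
  const stats = { hits: 0, misses: 0 };

  async function scanBuffer(buffer, options) {
    const key = crypto.createHash('sha256').update(buffer).digest('hex');
    if (store.has(key)) {
      stats.hits += 1;
      return store.get(key); // hit: the scanner is not called
    }
    stats.misses += 1;
    const verdict = await scanFn(buffer, options);
    store.set(key, verdict);
    return verdict;
  }

  return { scanBuffer, stats, store };
}
```

Because the key is derived from the content alone, the same bytes uploaded under a different file name still produce a hit.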

stats()

Returns an object with the following properties:

const s = cache.stats();
// {
//   hits:    42,    // cache hits since last clear()
//   misses:  8,     // cache misses since last clear()
//   size:    50,    // current number of entries
//   hitRate: 0.84   // hits / (hits + misses), 0 if no requests yet
// }
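
The hitRate formula in the comment above, including the zero-request case, amounts to the following. The helper name computeStats is hypothetical; it only mirrors the documented shape of the stats() result.

```javascript
// Hypothetical helper mirroring the documented shape of cache.stats().
function computeStats(hits, misses, size) {
  const total = hits + misses;
  return {
    hits,
    misses,
    size,
    // Guard against division by zero before any request has been made.
    hitRate: total === 0 ? 0 : hits / total,
  };
}
```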

File storage

When storage: 'file' is set, the cache is loaded from disk on startup, and expired entries are discarded during loading. Every mutation schedules a write back to disk; writes are debounced by 500 ms to avoid excessive I/O during burst scanning.

const cache = createCache({
  storage:  'file',
  filePath: './.pompelmi-cache.json',
  ttl:      86400000  // 24 hours
});

// Cache persists across process restarts.
// The JSON file is human-readable and can be deleted to clear the cache.
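
The 500 ms debounce can be pictured as a timer that is restarted on every mutation, so a burst of changes produces a single write. The sketch below shows the general technique and is not taken from pompelmi's internals; createDebouncedWriter, schedule, and flush are hypothetical names.

```javascript
// Hypothetical sketch: coalesce bursts of mutations into one write.
function createDebouncedWriter(writeFn, delayMs = 500) {
  let timer = null;

  // Call after every mutation; restarts the countdown each time.
  function schedule() {
    if (timer !== null) clearTimeout(timer);
    timer = setTimeout(() => {
      timer = null;
      writeFn();
    }, delayMs);
  }

  // Write immediately if a write is pending (useful on shutdown).
  function flush() {
    if (timer === null) return;
    clearTimeout(timer);
    timer = null;
    writeFn();
  }

  return { schedule, flush };
}
```

A design consequence worth keeping in mind: with any debounce, mutations made in the last few hundred milliseconds before a crash may not reach the disk unless something like flush() runs on shutdown.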

File writes are atomic: the library writes to a .tmp file first, then renames it, so the cache file is never left in a partially-written state.
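
The write-then-rename pattern, together with discarding expired entries on load, can be sketched as follows. The function names and the on-disk layout (a JSON object mapping each hash to an entry with verdict and expiresAt fields) are assumptions made for illustration; the actual file format is not specified here.

```javascript
const fs = require('node:fs');

// Hypothetical sketch: write the complete JSON to a temporary file, then
// rename it over the target. Readers see either the old file or the new
// one, never a half-written file.
function writeCacheAtomically(filePath, entries) {
  const tmpPath = `${filePath}.tmp`;
  fs.writeFileSync(tmpPath, JSON.stringify(entries, null, 2));
  fs.renameSync(tmpPath, filePath);
}

// Hypothetical loader: keep only entries whose expiry time is still in the future.
// Assumed entry shape: { verdict, expiresAt } with expiresAt in epoch milliseconds.
function loadCache(filePath, now = Date.now()) {
  if (!fs.existsSync(filePath)) return {};
  const entries = JSON.parse(fs.readFileSync(filePath, 'utf8'));
  return Object.fromEntries(
    Object.entries(entries).filter(([, entry]) => entry.expiresAt > now)
  );
}
```

The temporary file lives next to the target so that both paths are on the same filesystem, which is what allows the rename to replace the target in one step.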

Performance guide

The cache is most effective when the same files are uploaded repeatedly (e.g. profile pictures, document templates) or when rescanning is triggered on every request.

  • Set ttl to the maximum age you trust a verdict — 1 hour is safe for most deployments.
  • Set maxSize to roughly the number of unique files you serve per hour.
  • Use storage: 'file' when the same file set is used across restarts (e.g. static asset servers).
  • Call cache.clear() after a virus database update so that previously cached verdicts, clean ones in particular, are re-evaluated. This also resets the statistics.
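
One way to act on the last bullet is to clear the cache whenever the signature database version changes. How that version is obtained depends on your setup: in this sketch getDbVersion is a hypothetical function you would supply, createDbUpdateWatcher is a hypothetical helper name, and cache is any object exposing the documented clear() method.

```javascript
// Hypothetical helper: clears the cache when the reported database version changes.
function createDbUpdateWatcher(cache, getDbVersion) {
  let lastVersion = null;

  // Call this periodically (for example from setInterval) or from an update hook.
  // Resolves to true when a change was detected and the cache was cleared.
  return async function check() {
    const version = await getDbVersion();
    const changed = lastVersion !== null && version !== lastVersion;
    if (changed) cache.clear(); // verdicts from the old database are discarded
    lastVersion = version;
    return changed;
  };
}
```

The first call only records the current version, so a freshly started process does not throw away a cache it has just loaded from disk.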

TypeScript

import { createCache } from 'pompelmi';
import type { CacheOptions, CacheStats, ScanCache } from 'pompelmi';

const cache: ScanCache = createCache({ ttl: 3600000, maxSize: 500 });
const verdict = await cache.scanBuffer(buffer);
const s: CacheStats = cache.stats();