Performance
Telemetry
street’s TelemetryTracker collects heap usage, request latency, and error rates. All history is stored in a bounded ring buffer — it never grows unboundedly.
Setup
import { TelemetryTracker, telemetryMiddleware } from './telemetry/tracker.js';
const telemetry = new TelemetryTracker(60_000); // Collect a sample every 60 seconds
container.register(TelemetryTracker, telemetry);
app.use(telemetryMiddleware(telemetry)); // Record every request
What is tracked
| Metric | Description |
|---|---|
| heapUsedMb | V8 heap used (MB) |
| rss | Resident set size (MB), total process memory |
| latencyP50 | Median request latency (ms) |
| latencyP99 | 99th percentile latency (ms) |
| requestCount | Total requests since start |
| errorCount | Total error responses since start |
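This guide does not say how TelemetryTracker derives latencyP50 and latencyP99. As an illustration only, a nearest-rank percentile over a window of recorded latencies might look like the sketch below. The `percentile` helper is hypothetical and is not part of street.

```typescript
// Hypothetical helper: nearest-rank percentile over recorded latencies (ms).
// This is an assumption about one possible method, not street's actual code.
function percentile(latenciesMs: number[], p: number): number {
  if (latenciesMs.length === 0) return 0;
  // Sort a copy so the caller's array is left untouched.
  const sorted = [...latenciesMs].sort((a, b) => a - b);
  // Nearest rank: the smallest value with at least p% of samples at or below it.
  const rank = Math.ceil((p / 100) * sorted.length);
  const index = Math.min(sorted.length - 1, Math.max(0, rank - 1));
  return sorted[index] ?? 0;
}

const sampleLatencies = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100];
const p50 = percentile(sampleLatencies, 50); // 5
const p99 = percentile(sampleLatencies, 99); // 100
```

The single slow request dominates the 99th percentile while leaving the median untouched, which is why both values are worth tracking.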
Accessing metrics
// Snapshot: current values
const snap = telemetry.snapshot();
console.log(`Heap: ${snap.heapUsedMb.toFixed(1)} MB`);
console.log(`P99 latency: ${snap.latencyP99.toFixed(1)} ms`);
// History: last N samples
const history = telemetry.getHistory(60); // Last 60 minutes (at 1/min)
// Health summary
const health = telemetry.health();
// {
// status: 'ok',
// uptime: 3600,
// pid: 12345,
// heap: { usedMb: '45.2', rssMb: '120.5' },
// requests: { total: 15000, errors: 12 },
// latency: { p50Ms: '2.3', p99Ms: '45.0' },
// timestamp: '2024-01-15T10:23:45.123Z'
// }
The /api/health endpoint
The built-in HealthController exposes:
# Real-time health + DB check
curl http://localhost:3000/api/health
# Telemetry history (last 60 samples)
curl 'http://localhost:3000/api/metrics?count=60'
Health response format:
{
"status": "ok",
"uptime": 3600.5,
"pid": 12345,
"heap": { "usedMb": "45.2", "rssMb": "120.5" },
"requests": { "total": 15000, "errors": 12 },
"latency": { "p50Ms": "2.3", "p99Ms": "45.0" },
"timestamp": "2024-01-15T10:23:45.123Z",
"checks": {
"database": { "status": "ok", "latencyMs": 2 }
},
"pool": { "size": 5, "idle": 3 }
}
Status is "degraded" if any check fails or heap exceeds 900 MB.
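The rule above can be expressed as a small pure function. This is a sketch under assumptions: the `overallStatus` name and the `CheckResult` shape are hypothetical, and any check status other than "ok" is treated as a failure because the guide only shows the "ok" value.

```typescript
// Hypothetical shape; field names mirror the "checks" object in the response above.
type CheckResult = { status: string; latencyMs?: number };

const HEAP_DEGRADED_MB = 900; // threshold stated in the text

function overallStatus(
  checks: Record<string, CheckResult>,
  heapUsedMb: number,
): 'ok' | 'degraded' {
  // Assumption: anything other than "ok" counts as a failed check.
  const anyCheckFailed = Object.values(checks).some((c) => c.status !== 'ok');
  const heapTooHigh = heapUsedMb > HEAP_DEGRADED_MB; // "exceeds", so strictly greater
  return anyCheckFailed || heapTooHigh ? 'degraded' : 'ok';
}

const healthy = overallStatus({ database: { status: 'ok', latencyMs: 2 } }, 45.2);
const heapHeavy = overallStatus({ database: { status: 'ok', latencyMs: 2 } }, 901);
```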
Ring buffer sizing
Samples are collected every collectIntervalMs (default 60,000 ms). The history holds at most MAX_SAMPLES (1,440) entries — exactly 24 hours at 1 sample/minute. When the buffer is full, the oldest sample is dropped:
if (this.samples.length >= MAX_SAMPLES) {
this.samples.shift(); // O(n) — acceptable at 1/min frequency
}
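With higher-frequency sampling (e.g., every 10 seconds), the same 1,440 samples cover only 4 hours. Memory stays bounded by MAX_SAMPLES either way; adjust MAX_SAMPLES or collectIntervalMs to trade history length against a sensible memory budget.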
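The shift() call shown above is O(n), which the source accepts at one sample per minute. If sampling became frequent enough for that to matter, a fixed-capacity circular buffer overwrites the oldest slot in O(1). The `RingBuffer` class below is a hypothetical sketch, not the structure street ships.

```typescript
// Hypothetical O(1) alternative to samples.shift(): a fixed-capacity circular buffer.
class RingBuffer<T> {
  private readonly slots: (T | undefined)[];
  private readonly capacity: number;
  private start = 0; // index of the oldest element
  private count = 0; // number of stored elements

  constructor(capacity: number) {
    if (!Number.isInteger(capacity) || capacity <= 0) {
      throw new RangeError('capacity must be a positive integer');
    }
    this.capacity = capacity;
    this.slots = new Array<T | undefined>(capacity).fill(undefined);
  }

  push(item: T): void {
    // When full, this index equals `start`, so the oldest item is overwritten.
    const writeIndex = (this.start + this.count) % this.capacity;
    this.slots[writeIndex] = item;
    if (this.count < this.capacity) {
      this.count++;
    } else {
      this.start = (this.start + 1) % this.capacity;
    }
  }

  /** Returns up to `n` of the most recent items, oldest first. */
  last(n: number = this.count): T[] {
    const take = Math.max(0, Math.min(n, this.count));
    const out: T[] = [];
    for (let i = this.count - take; i < this.count; i++) {
      out.push(this.slots[(this.start + i) % this.capacity] as T);
    }
    return out;
  }

  get size(): number {
    return this.count;
  }
}

const ring = new RingBuffer<number>(3);
for (const v of [1, 2, 3, 4, 5]) ring.push(v);
const recent = ring.last(); // [3, 4, 5]
```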
Cleanup
telemetry.destroy(); // Clears the collection timer
Caching
The LruCache provides a bounded, TTL-aware in-memory cache using a doubly-linked list for O(1) eviction.
Setup
import { LruCache } from './cache/lru.js';
// Cache up to 1000 entries, each valid for 5 minutes
const cache = new LruCache<string, UserPublic>({
maxEntries: 1000,
ttlMs: 300_000,
});
Usage
// Set
cache.set(userId, userObject);
// Get (returns undefined if missing or expired)
const user = cache.get(userId);
if (!user) {
// Cache miss — fetch from DB
const fetched = await this.repo.findById(userId);
if (fetched) cache.set(userId, toPublicUser(fetched));
}
// Check existence
cache.has(userId);
// Invalidate
cache.delete(userId);
// Clear everything
cache.clear();
// Size
console.log(cache.size);
Cache-aside pattern
@Injectable()
export class UserService {
private readonly cache = new LruCache<string, UserPublic>({
maxEntries: 5000,
ttlMs: 60_000, // 1 minute TTL
});
async findById(id: string): Promise<UserPublic | null> {
const cached = this.cache.get(id);
if (cached) return cached;
const user = await this.repo.findById(id);
if (user) {
const pub = toPublicUser(user);
this.cache.set(id, pub);
return pub;
}
return null;
}
async update(id: string, dto: UpdateUserDto): Promise<UserPublic> {
const updated = await this.repo.update(id, dto);
// Invalidate cache on write
this.cache.delete(id);
return toPublicUser(updated!);
}
}
LRU eviction behavior
When cache.set() is called and size > maxEntries, the least recently used entry is evicted. “Recently used” is updated on every get() call.
Example with maxEntries: 3:
set('a', 1) → [a]
set('b', 2) → [a, b]
set('c', 3) → [a, b, c]
get('a') → [b, c, a] ← a becomes most recently used (MRU, listed last)
set('d', 4) → [c, a, d] ← b evicted (LRU)
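The trace above can be reproduced with a few lines of code. This sketch relies on Map insertion order instead of the doubly-linked list that LruCache is described as using, and it omits TTL handling. `TinyLru` is a hypothetical name used for illustration.

```typescript
// Hypothetical minimal LRU built on Map insertion order (no TTL, no linked list).
class TinyLru<K, V> {
  private readonly entries = new Map<K, V>();
  private readonly maxEntries: number;

  constructor(maxEntries: number) {
    this.maxEntries = maxEntries;
  }

  get(key: K): V | undefined {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key) as V;
    // Re-insert so the key moves to the end of iteration order (most recently used).
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: K, value: V): void {
    this.entries.delete(key); // refresh position if the key already exists
    this.entries.set(key, value);
    if (this.entries.size > this.maxEntries) {
      // The first key in iteration order is the least recently used.
      const lruKey = this.entries.keys().next().value as K;
      this.entries.delete(lruKey);
    }
  }

  /** Keys ordered from least to most recently used. */
  keys(): K[] {
    return Array.from(this.entries.keys());
  }
}

const lru = new TinyLru<string, number>(3);
lru.set('a', 1);
lru.set('b', 2);
lru.set('c', 3);
lru.get('a');    // a becomes MRU
lru.set('d', 4); // b is evicted
const order = lru.keys(); // ['c', 'a', 'd']
```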
TTL sweep
A periodic sweep runs every ttlMs / 2 milliseconds and removes expired entries. This prevents stale entries from occupying slots until they happen to be accessed.
cache.destroy(); // Stops the sweep timer, clears all entries
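The sweep itself is not shown in this guide. A minimal sketch of what one pass could do, assuming each entry stores an absolute expiry timestamp, follows. The `TtlEntry` shape and `sweepExpired` function are hypothetical.

```typescript
// Hypothetical entry shape: the real LruCache internals are not shown in this guide.
type TtlEntry<V> = { value: V; expiresAt: number };

// Removes every entry whose expiry time has passed and returns how many were removed.
function sweepExpired<K, V>(entries: Map<K, TtlEntry<V>>, now: number): number {
  let removed = 0;
  entries.forEach((entry, key) => {
    if (entry.expiresAt <= now) {
      entries.delete(key); // deleting the current key during Map iteration is safe
      removed++;
    }
  });
  return removed;
}

const ttlStore = new Map<string, TtlEntry<number>>();
ttlStore.set('a', { value: 1, expiresAt: 100 });
ttlStore.set('b', { value: 2, expiresAt: 300 });
const removedCount = sweepExpired(ttlStore, 200); // only 'a' has expired
```

In a running cache, a timer would call such a function with Date.now() every ttlMs / 2 milliseconds, and destroy() would clear that timer.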
Clustering
street can run in cluster mode, spawning one worker per CPU core. The ClusterCoordinator manages the workers from the primary process.
How it works
Primary process (ClusterCoordinator)
├── Worker 1 (HTTP server on port 3000)
├── Worker 2 (HTTP server on port 3000) ← OS load balances
├── Worker 3 (HTTP server on port 3000)
└── Worker 4 (HTTP server on port 3000)
The OS distributes incoming connections across workers using SO_REUSEPORT. Each worker is an independent Node.js process — a crash in one worker does not affect others.
Enabling cluster mode
Set NODE_ENV=production. The main entry point detects this and starts the cluster coordinator:
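// src/main.ts
import cluster from 'node:cluster';

if (cluster.isPrimary && process.env['NODE_ENV'] === 'production') {
  // Primary process in production: supervise workers instead of serving HTTP.
  const coordinator = new ClusterCoordinator({
    workers: parseInt(process.env['WORKERS'] ?? '0', 10) || undefined,
    heartbeatIntervalMs: 10_000,
    heartbeatTimeoutMs: 30_000,
  });
  coordinator.start();
} else {
  // Workers (and non-production runs) start the application.
  // An if/else is used because `return` is not valid at module top level.
  bootstrap();
}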
Worker count
By default, one worker per CPU core (from os.cpus().length). Override:
WORKERS=4 NODE_ENV=production node dist/src/main.js
For I/O-bound APIs (database queries, network calls), 2× CPU count can improve throughput. For CPU-bound work, match CPU count exactly.
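The fallback from WORKERS to the CPU count can be isolated in a pure function, which makes it easy to test. This is a sketch: `resolveWorkerCount` is a hypothetical helper that roughly follows the parsing in src/main.ts above, and it additionally treats negative values as unset, which is an assumption.

```typescript
// Hypothetical helper: decide how many workers to spawn.
// `workersEnv` stands in for process.env['WORKERS'], `cpuCount` for os.cpus().length.
function resolveWorkerCount(workersEnv: string | undefined, cpuCount: number): number {
  const parsed = parseInt(workersEnv ?? '0', 10);
  // parseInt yields NaN for non-numeric input; NaN, 0, and negatives fall back to the CPU count.
  return parsed > 0 ? parsed : cpuCount;
}

const cpuCores = 8;
const defaultWorkers = resolveWorkerCount(undefined, cpuCores);          // 8
const ioBoundWorkers = resolveWorkerCount(String(2 * cpuCores), cpuCores); // 16
```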
IPC heartbeat
Each worker sends a heartbeat IPC message to the primary every 5 seconds:
// Worker side
const heartbeatTimer = workerHeartbeat(5_000);
The primary tracks lastHeartbeat per worker. If a worker goes silent for heartbeatTimeoutMs (30 seconds), it is killed and a new worker is spawned:
// Primary side — automatic
if (now - state.lastHeartbeat > this.opts.heartbeatTimeoutMs) {
state.worker.kill('SIGTERM');
// cluster 'exit' event fires → _spawnWorker() called automatically
}
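The timeout comparison can be pulled out into a pure function to see exactly which workers would be restarted. The `WorkerState` shape and `findStaleWorkers` name below are hypothetical; only the comparison itself is taken from the snippet above.

```typescript
// Hypothetical state record; the real coordinator also holds the worker handle.
type WorkerState = { id: number; lastHeartbeat: number };

// Returns the ids of workers whose last heartbeat is older than the timeout.
function findStaleWorkers(
  states: WorkerState[],
  now: number,
  heartbeatTimeoutMs: number,
): number[] {
  return states
    .filter((s) => now - s.lastHeartbeat > heartbeatTimeoutMs)
    .map((s) => s.id);
}

const workerStates: WorkerState[] = [
  { id: 1, lastHeartbeat: 95_000 }, // 5 s ago: healthy
  { id: 2, lastHeartbeat: 60_000 }, // 40 s ago: past the 30 s timeout
];
const staleIds = findStaleWorkers(workerStates, 100_000, 30_000); // [2]
```

With a 5-second send interval and a 30-second timeout, a worker has to miss roughly six consecutive heartbeats before it is replaced.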
Auto-restart
When a worker exits (crash, OOM, SIGKILL), the cluster 'exit' event fires and the coordinator spawns a replacement after a 500 ms delay:
cluster.on('exit', (worker, code, signal) => {
setTimeout(() => this._spawnWorker(), 500).unref();
});
The delay prevents tight restart loops if the worker crashes immediately on startup (e.g., due to a bug).
State considerations in cluster mode
Each worker has its own in-memory state. The LruCache, rate limiter, and telemetry tracker are per-worker:
- Rate limiting: each worker enforces its own limit. With N workers, the effective limit is N × maxRequests. For accurate global rate limiting, use a shared store (Redis).
- Cache: each worker has its own cache. Cache invalidation must go through the database (cache-aside pattern with short TTLs).
- Telemetry: each worker reports independently. The /api/health endpoint reflects the responding worker’s metrics.
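The rate-limiting arithmetic is easy to get wrong when sizing limits, so here it is worked through. Both helper names are hypothetical. Dividing a global target across workers is only an approximation, because connections are not guaranteed to be spread evenly.

```typescript
// With per-worker limiters, a client can reach every worker's limit independently.
function effectiveGlobalLimit(workers: number, maxRequestsPerWorker: number): number {
  return workers * maxRequestsPerWorker;
}

// Rough per-worker setting for a desired global limit (assumes an even spread of requests).
function perWorkerLimitFor(targetGlobalLimit: number, workers: number): number {
  return Math.max(1, Math.floor(targetGlobalLimit / workers));
}

const worstCase = effectiveGlobalLimit(4, 100); // 400 requests per window, not 100
const perWorker = perWorkerLimitFor(100, 4);    // 25
```

A client whose requests all land on one worker would be capped at 25 in this example, which is why the text recommends a shared store when the global limit must be exact.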
Memory Safety
Memory safety is the foundation of street’s design. Every component enforces an upper bound on its memory usage.
Why it matters
Node.js runs application code on a single-threaded event loop, so a memory leak in one part of the application degrades the entire server. In a multi-threaded Java application, a leak in one thread does not immediately stall the others; in Node.js, growing heap pressure raises event-loop latency for every request.
street’s philosophy: if it can grow, it must be bounded.
Bounded components
| Component | Bound | Mechanism |
|---|---|---|
| HTTP body | 1 MB default | maxBodyBytes option, stream abort |
| File uploads | N/A (disk) | Streaming to disk, never to heap |
| PostgreSQL results | 256 rows buffered | StreetPostgresWireStream.MAX_BUFFERED |
| Connection pool | maxConnections | Acquire queue with timeout |
| Acquire wait queue | 100 callers | Throws when full |
| Rate limiter keys | 100,000 IPs | Oldest evicted when full |
| Rate limiter timestamps per IP | 1,000 | Oldest evicted |
| LRU cache entries | maxEntries | LRU eviction |
| Telemetry history | 1,440 samples | Ring buffer |
| WebSocket connections | maxConnections | Closed with 1013 when full |
| Webhook queue | 10,000 items | Oldest dropped |
| XSS recursion depth | 32 levels | Hard limit, returns null |
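Most rows in the table follow one of two policies when the bound is hit: evict the oldest item, or reject the newcomer. As an illustration of the second policy (the "Throws when full" row), here is a minimal sketch. `BoundedWaitQueue` is a hypothetical class, not street's pool implementation.

```typescript
// Hypothetical bounded FIFO that rejects new entries instead of growing.
class BoundedWaitQueue<T> {
  private readonly waiters: T[] = [];
  private readonly maxWaiters: number;

  constructor(maxWaiters: number) {
    this.maxWaiters = maxWaiters;
  }

  enqueue(waiter: T): void {
    if (this.waiters.length >= this.maxWaiters) {
      // Failing fast keeps memory bounded and surfaces overload to the caller.
      throw new Error(`wait queue full (${this.maxWaiters} waiters)`);
    }
    this.waiters.push(waiter);
  }

  dequeue(): T | undefined {
    return this.waiters.shift();
  }

  get length(): number {
    return this.waiters.length;
  }
}

const waitQueue = new BoundedWaitQueue<string>(2);
waitQueue.enqueue('caller-1');
waitQueue.enqueue('caller-2');
// A third enqueue would throw until dequeue() frees a slot.
```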
Backpressure
Backpressure is how a slow consumer slows down a fast producer. street applies it at three levels:
HTTP → Application:
The body parser reads in chunks. If the total exceeds maxBodyBytes, the socket is destroyed and a 413 error is returned. This prevents large request bodies from exhausting heap.
Database → Application:
StreetPostgresWireStream pauses the TCP socket when its internal queue reaches capacity. The PostgreSQL server stops sending rows. No row is held in memory longer than needed.
Application → HTTP:
Streaming response handlers check res.write() return value. On false, they pause the database stream. On drain, they resume. This prevents a slow HTTP client from causing rows to queue in memory.
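The Application → HTTP case can be sketched with two small structural interfaces, so the example runs without a real server. `Sink`, `RowSource`, and `forwardRow` are hypothetical names; a Node.js response object and a pausable row stream should fit these shapes, but that wiring is an assumption.

```typescript
// Minimal structural interfaces so the sketch stays self-contained.
interface Sink {
  write(chunk: string): boolean; // false means "buffer full, wait for drain"
  once(event: 'drain', listener: () => void): unknown;
}

interface RowSource {
  pause(): void;
  resume(): void;
}

// Hypothetical glue, called once for each row the database stream emits.
function forwardRow(row: string, source: RowSource, sink: Sink): void {
  const canContinue = sink.write(row);
  if (!canContinue) {
    // The client is slower than the database: stop reading rows
    // until the response buffer has drained.
    source.pause();
    sink.once('drain', () => source.resume());
  }
}
```

Because the source is paused as soon as write() reports a full buffer, at most one buffer's worth of rows waits in memory for a slow client.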
Resource cleanup checklist
Every long-lived resource must be closed during graceful shutdown:
process.once('SIGTERM', async () => {
clearInterval(myInterval); // Timers
clearTimeout(myTimeout); // Timeouts
rateLimiter.destroy(); // Clears sweep interval + data
cache.destroy(); // Clears sweep interval + data
telemetry.destroy(); // Clears collection interval
await wsServer.close(); // Terminates WS connections
await app.close(); // Stops accepting new HTTP connections
await pool.close(); // Closes all PG connections
process.exit(0);
});
Detecting memory leaks
# Watch heap usage over time
node --inspect dist/src/main.js
# Then open chrome://inspect in Chrome → Memory tab
# Heap snapshot comparison
node -e "
const v8 = require('v8');
v8.writeHeapSnapshot('./heap-before.heapsnapshot');
// ... run workload ...
v8.writeHeapSnapshot('./heap-after.heapsnapshot');
"
Signs of a leak:
- heapUsedMb growing monotonically in telemetry history
- rss growing unboundedly after each request batch
- P99 latency increasing as heap pressure triggers GC pauses
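The first sign can be checked mechanically against telemetry history. The sketch below is one possible heuristic, not a feature of street: `heapGrowsMonotonically` and the 50 MB default threshold are assumptions, and the `HeapSample` type only models the heapUsedMb field listed earlier.

```typescript
// Hypothetical sample shape: only the field this check needs.
type HeapSample = { heapUsedMb: number };

// True when heapUsedMb never decreases across the window and grows by at least
// `minGrowthMb` overall. A healthy heap normally dips after garbage collection.
function heapGrowsMonotonically(history: HeapSample[], minGrowthMb = 50): boolean {
  if (history.length < 2) return false;
  const heaps = history.map((s) => s.heapUsedMb);
  let previous = Number.NEGATIVE_INFINITY;
  for (const heap of heaps) {
    if (heap < previous) return false; // a drop means memory was reclaimed
    previous = heap;
  }
  const first = heaps[0] ?? 0;
  return previous - first >= minGrowthMb;
}

const suspicious = heapGrowsMonotonically([
  { heapUsedMb: 40 },
  { heapUsedMb: 60 },
  { heapUsedMb: 95 },
]); // true: never drops, grows by 55 MB
```

A result of true is a prompt to take heap snapshots, not proof of a leak; a long enough window is needed before the signal means much.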
Common causes in street applications:
- Intervals not unref’d or cleared
- Event listeners not removed on socket close
- Unbounded arrays appended to (use ring buffers)
- Closures capturing large objects (use WeakRef or explicit nulling)