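- Use sync.Pool for relay buffers instead of stack-allocated arrays.
A [16379]byte on the goroutine stack forces the runtime to grow that
stack to 32KB (stack sizes are powers of two, and the array plus frame
overhead no longer fits in 16KB). Pooled buffers live on the heap and
keep goroutine stacks small.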
- Apply the same fix to the doppelganger write buffer ([16384]byte in
conn.start).
- Replace idle goroutines with context.AfterFunc in proxy.ServeConn
and relay.Relay. These goroutines existed only to wait on ctx.Done()
and close connections. AfterFunc achieves the same without allocating
a goroutine until the context is actually cancelled.
Net effect: at 3000 concurrent connections on a 1-vCPU/961MB VPS,
the unmodified binary drops 246 connections and throughput falls to
10 MB/s. With these changes there are zero failures, throughput is
63 MB/s, and RSS is 31% lower.

Closes #412