When the worker pool rejected a connection (ErrPoolOverload), the
accepted net.Conn was never closed, leaking a file descriptor and
TCP socket per rejected connection. Under sustained traffic spikes this
compounds the problem: leaked descriptors reduce the capacity for new
dials (including to the fronting domain), accelerating the failure
cascade described in #378.
fix: apply idle timeout to domain fronting relay connections
Domain fronting relay (for non-Telegram traffic) had no idle timeout,
causing worker pool exhaustion under traffic spikes.
The ProxyOpts.IdleTimeout field existed but was never wired into the
proxy. Now domain fronting connections are wrapped with per-read/write
deadlines reset to the configured idle timeout (default 1m), so stale
or slowloris-style connections are reaped promptly.
Fixes #378
- Use sync.Pool for relay buffers instead of stack-allocated arrays.
A [16379]byte on the goroutine stack forces Go to grow that stack to
32KB (the next power of two). Pooled buffers keep goroutine stacks small.
- Same fix for doppelganger write buffer ([16384]byte in conn.start).
- Replace idle goroutines with context.AfterFunc in proxy.ServeConn
and relay.Relay. These goroutines existed only to wait on ctx.Done()
and close connections. AfterFunc achieves the same without allocating
a goroutine until the context is actually cancelled.
Net effect: at 3000 concurrent connections on a 1-vCPU/961MB VPS,
the unmodified binary drops 246 connections and falls to 10 MB/s.
With these changes: zero failures, 63 MB/s, 31% lower RSS.
Closes #412
Move cert noise calibration into doppelganger scout
Instead of a separate cert_probe.go that duplicates the scout's TLS
connection logic, measure the cert chain size directly from the same
HTTPS connections the scout already makes.
Changes:
- Extend ScoutConnResult with payloadLen field
- Add Write interception to ScoutConn for handshake boundary detection
- Scout.learn() now computes cert size (sum of ApplicationData between
CCS and first client Write) alongside inter-record durations
- Ganger aggregates cert sizes across raids and exposes NoiseParams()
via atomic pointer for lock-free reads from proxy goroutines
- Proxy reads NoiseParams from Ganger on each handshake instead of
probing at startup
- Remove cert_probe.go, disk cache, and related config options
(noise-cache-path, noise-cache-ttl, noise-probe-count)
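The cert size measurement described in this list could look roughly like the following. This is a sketch under assumptions: `certNoiseSize` is a hypothetical helper, the input is taken to be the bytes the server sent before the first client Write, and counting record payloads without their 5-byte headers is a guess about what the scout measures.

```go
package main

import "encoding/binary"

// TLS record content types (RFC 5246, section 6.2.1).
const (
	recChangeCipherSpec = 20
	recApplicationData  = 23
)

// certNoiseSize walks the TLS records in stream and sums the payload
// lengths of ApplicationData records that follow the first
// ChangeCipherSpec. A truncated trailing record is ignored.
func certNoiseSize(stream []byte) int {
	total, seenCCS := 0, false
	for len(stream) >= 5 {
		typ := stream[0]
		n := int(binary.BigEndian.Uint16(stream[3:5])) // record length field
		if len(stream) < 5+n {
			break
		}
		switch {
		case typ == recChangeCipherSpec:
			seenCCS = true
		case typ == recApplicationData && seenCCS:
			total += n
		}
		stream = stream[5+n:]
	}
	return total
}
```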
Falls back to the legacy 2500-4700 byte range until the first scout raid
completes (typically within 1-2 seconds of startup).
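The lock-free read path with a legacy fallback can be sketched with atomic.Pointer (Go 1.19 or later). The `ganger` type, the `NoiseParams` fields, and `publish` below are hypothetical shapes for illustration; only the 2500-4700 fallback comes from the message.

```go
package main

import "sync/atomic"

// NoiseParams is a hypothetical shape for the calibrated noise range.
type NoiseParams struct{ Min, Max int }

// legacyNoise is the static range used before any measurement exists.
var legacyNoise = NoiseParams{Min: 2500, Max: 4700}

// ganger is a stand-in for the aggregator: one writer publishes new
// parameters, any number of proxy goroutines read them without a lock.
type ganger struct {
	noise atomic.Pointer[NoiseParams]
}

// NoiseParams returns the latest published value, or the legacy range
// while nothing has been published yet.
func (g *ganger) NoiseParams() NoiseParams {
	if p := g.noise.Load(); p != nil {
		return *p
	}
	return legacyNoise
}

// publish stores a copy of p; readers see either the old or the new
// value, never a partially written one.
func (g *ganger) publish(p NoiseParams) { g.noise.Store(&p) }
```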
Add dynamic cert noise calibration for FakeTLS handshake
The hardcoded noise range (2500-4700 bytes) in the FakeTLS ServerHello
does not match the real certificate chain sizes of many popular fronting
domains (e.g., dl.google.com ≈ 6480 bytes, microsoft.com ≈ 13004 bytes).
This makes the proxy detectable by DPI systems that compare the
ApplicationData size with the real cert chain size for the SNI domain.
On startup, probe the fronting domain's actual TLS handshake size and
use the measured value ± jitter instead of the static range. Falls back
to the legacy 2500-4700 range if the probe fails.
Also adds optional caching of probe results between restarts
(noise-cache-path, noise-cache-ttl) and a configurable probe count
(noise-probe-count) under [defense.doppelganger].
Closes #408
Even if it makes sense to have huge buffers, we add artificial delays
now. In that case we can achieve the same results with a smaller buffer.
If not, then we won't send a packet bigger than this value.