- Use sync.Pool for relay buffers instead of stack-allocated arrays.
A [16379]byte on the goroutine stack forces Go to grow it to 32KB
(next power of two). Pooled buffers keep goroutine stacks small.
- Same fix for doppelganger write buffer ([16384]byte in conn.start).
- Replace idle goroutines with context.AfterFunc in proxy.ServeConn
and relay.Relay. These goroutines existed only to wait on ctx.Done()
and close connections. AfterFunc achieves the same without allocating
a goroutine until the context is actually cancelled.
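A minimal sketch of the two techniques above, not the actual relay code.
The names bufPool, pump, relayOneWay, demoRelay and demoCancel are
hypothetical, and net.Pipe stands in for real sockets. The copy loop is
written by hand rather than with io.CopyBuffer so that the pooled buffer
is always the one in use.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net"
	"sync"
)

// bufSize mirrors the 16379-byte relay buffer mentioned above.
const bufSize = 16379

// bufPool hands out heap-allocated buffers, so the buffer never lives on
// a goroutine stack. Storing *[]byte avoids an extra allocation on Put.
var bufPool = sync.Pool{
	New: func() any {
		b := make([]byte, bufSize)
		return &b
	},
}

// pump copies src to dst through a pooled buffer until EOF or error.
func pump(dst io.Writer, src io.Reader) (int64, error) {
	bp := bufPool.Get().(*[]byte)
	defer bufPool.Put(bp)
	buf := *bp
	var total int64
	for {
		n, rerr := src.Read(buf)
		if n > 0 {
			w, werr := dst.Write(buf[:n])
			total += int64(w)
			if werr != nil {
				return total, werr
			}
		}
		if rerr == io.EOF {
			return total, nil
		}
		if rerr != nil {
			return total, rerr
		}
	}
}

// relayOneWay copies src to dst. No goroutine is parked on ctx.Done():
// AfterFunc only starts one if ctx is actually cancelled.
func relayOneWay(ctx context.Context, dst, src net.Conn) (int64, error) {
	stop := context.AfterFunc(ctx, func() {
		src.Close()
		dst.Close()
	})
	defer stop() // unregister the callback on normal completion
	return pump(dst, src)
}

// demoRelay pushes "hello" through the relay and returns what arrived.
func demoRelay() string {
	clientA, relayIn := net.Pipe()
	relayOut, clientB := net.Pipe()
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	go func() {
		relayOneWay(ctx, relayOut, relayIn)
		relayOut.Close() // signal EOF to the reader
	}()
	go func() {
		clientA.Write([]byte("hello"))
		clientA.Close()
	}()
	data, _ := io.ReadAll(clientB)
	return string(data)
}

// demoCancel reports whether cancelling ctx unblocks an idle relay.
func demoCancel() bool {
	_, relayIn := net.Pipe()
	relayOut, _ := net.Pipe()
	ctx, cancel := context.WithCancel(context.Background())
	done := make(chan error, 1)
	go func() {
		_, err := relayOneWay(ctx, relayOut, relayIn)
		done <- err
	}()
	cancel()
	return <-done != nil
}

func main() {
	fmt.Println(demoRelay(), demoCancel())
}
```

context.AfterFunc requires Go 1.21 or newer.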
Net effect: at 3000 concurrent connections on a 1-vCPU/961MB VPS,
the unmodified binary drops 246 connections and falls to 10 MB/s.
With these changes: zero failures, 63 MB/s, 31% lower RSS.
Closes #412
Even if huge buffers made sense, we add artificial delays now, so we
can achieve the same results with a smaller buffer. If not, then at
least we won't send a packet bigger than this value.
This commit fixes a situation where the relay can be reset before all
the goroutines it waits on have finished. For example, we terminate
processing on some event such as a socket error. The error happens and
the context is cancelled. After that the main relay goroutine starts to
wait. Meanwhile a second goroutine reaches its deferred function and
marks the wg as done, which means the main goroutine can continue.
So it is really possible that we start resetting before the transmit
goroutine has actually exited.
The correct solution is to always defer wg.Done() first on entering a
function. Deferred calls run in reverse order, so wg.Done() then runs
last and no reordering is needed.
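A minimal sketch of that defer ordering, assuming a hypothetical worker
with one cleanup step (the names worker, cleaned and runOnce are
illustrative, not taken from the relay code). Because wg.Done() is
deferred first, it runs after the cleanup, so wg.Wait() cannot return
while cleanup is still pending.

```go
package main

import (
	"fmt"
	"sync"
)

// worker defers wg.Done() first. Defers run last-in, first-out, so the
// cleanup below completes before the WaitGroup is released.
func worker(wg *sync.WaitGroup, cleaned *bool) {
	defer wg.Done() // deferred first, executed last

	// Stand-in for closing sockets or returning buffers.
	defer func() { *cleaned = true }()

	// The transmit loop would run here.
}

// runOnce starts one worker and reports whether its cleanup had
// finished by the time wg.Wait() returned.
func runOnce() bool {
	var wg sync.WaitGroup
	cleaned := false
	wg.Add(1)
	go worker(&wg, &cleaned)
	wg.Wait()
	// Safe to read: the write to cleaned happens before wg.Done().
	return cleaned
}

func main() {
	fmt.Println(runOnce())
}
```

With the two defer lines swapped, wg.Wait() could return before the
cleanup ran, which is the race described above.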