Date: 2026-03-27
Platform: darwin/arm64, Apple M4, 10 cores
Test: Stress benchmarks (concurrent connections)

=== Concurrent Relays ===

100 connections × 10 MB each (1 GB total):
  stack_16KB: 71,826 MB/s | peak  5.6 MB |  1 GC /   137 us
  pool_16KB:  68,413 MB/s | peak  4.5 MB |  1 GC /   149 us
  pool_4KB:   66,985 MB/s | peak  4.3 MB |  1 GC /   108 us

500 connections × 10 MB each (5 GB total):
  stack_16KB: 68,208 MB/s | peak  6.0 MB | 10 GC / 1,171 us
  pool_16KB:  63,587 MB/s | peak  6.4 MB |  8 GC /   918 us
  pool_4KB:   69,775 MB/s | peak  5.6 MB |  8 GC / 1,011 us

1000 connections × 10 MB each (10 GB total):
  stack_16KB: 68,265 MB/s | peak  7.5 MB | 14 GC / 1,618 us
  pool_16KB:  71,258 MB/s | peak  9.7 MB |  9 GC / 1,138 us
  pool_4KB:   55,186 MB/s | peak  6.3 MB | 14 GC / 1,570 us

2000 connections × 1 MB each (2 GB total, many short connections):
  stack_16KB: 45,666 MB/s | peak 16.0 MB | 16 GC / 1,898 us
  pool_16KB:  53,451 MB/s | peak  9.0 MB | 16 GC / 1,723 us
  pool_4KB:   53,367 MB/s | peak  8.5 MB | 17 GC / 1,970 us

500 connections × 50 MB each (25 GB total, large files):
  stack_16KB: 70,020 MB/s | peak  7.3 MB |  7 GC /   868 us
  pool_16KB:  71,983 MB/s | peak  7.0 MB |  5 GC /   653 us
  pool_4KB:   67,908 MB/s | peak  6.2 MB |  6 GC /   769 us

=== Pool Contention (sync.Pool.Get/Put under parallel load) ===

   100 workers: 1.25 ns/op
   500 workers: 1.30 ns/op
  1000 workers: 1.29 ns/op
  2000 workers: 1.32 ns/op

(No contention visible: ns/op stays flat, 1.25 to 1.32, from 100 to 2000 workers)

=== GC Pressure (500 conns × 10 MB) ===

  stack_16KB: 63,325 MB/s | 12 GC / 1,286 us | stack 2.5 MB / heap 3.3 MB
  pool_16KB:  68,286 MB/s |  8 GC /   933 us | stack 2.5 MB / heap 4.4 MB
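
For reference, a measurement like the Pool Contention one could be reproduced with a sketch along these lines. This is not the harness that produced the numbers above; the source does not show it. The names bufPool, bufSize and nsPerOp are hypothetical, the 16 KB buffer size is taken from the pool_16KB label, and dividing wall-clock time by the total number of Get/Put pairs across all workers is an assumption about how ns/op was computed.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// bufSize is assumed from the "pool_16KB" label; the real harness may differ.
const bufSize = 16 * 1024

// bufPool stores *[]byte rather than []byte so that Put does not allocate
// when the slice header is converted to an interface value.
var bufPool = sync.Pool{
	New: func() any {
		b := make([]byte, bufSize)
		return &b
	},
}

// nsPerOp starts `workers` goroutines, each performing `iters` Get/Put pairs,
// and returns wall-clock nanoseconds divided by the total number of pairs.
// (Hypothetical helper: an aggregate figure, not per-goroutine latency.)
func nsPerOp(workers, iters int) float64 {
	var wg sync.WaitGroup
	start := time.Now()
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < iters; i++ {
				b := bufPool.Get().(*[]byte)
				(*b)[0] = byte(i) // touch the buffer so the pair does real work
				bufPool.Put(b)
			}
		}()
	}
	wg.Wait()
	elapsed := time.Since(start)
	return float64(elapsed.Nanoseconds()) / float64(workers*iters)
}

func main() {
	// Worker counts match the rows reported in the Pool Contention section.
	for _, workers := range []int{100, 500, 1000, 2000} {
		fmt.Printf("%4d workers: %.2f ns/op\n", workers, nsPerOp(workers, 100000))
	}
}
```

Absolute figures from this sketch will vary by machine and Go version, and include goroutine startup cost; what is comparable to the results above is whether ns/op stays roughly constant as the worker count grows.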