Improve TCP keepalive and idle timeout for mobile clients
TCP keepalive was configured (SetKeepAlivePeriod) but never actually
enabled (SO_KEEPALIVE) on accepted client connections. Go 1.26's
SetKeepAlivePeriod only sets TCP_KEEPIDLE — it does not call
setsockopt(SO_KEEPALIVE, 1). Without SO_KEEPALIVE the kernel never
sends probe packets, so dead connections from sleeping mobile clients
linger until the idle timeout fires.
Replace SetKeepAlive + SetKeepAlivePeriod with net.KeepAliveConfig
(available since Go 1.23) for explicit per-socket control:
- Idle: 30s (time before first probe)
- Interval: 10s (between probes)
- Count: 3 (failed probes to declare dead)
This detects dead connections in ~60s instead of relying on system
defaults (tcp_keepalive_intvl=75s, probes=9 → up to 11 minutes).
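A minimal sketch of the settings above, assuming accepted connections are
*net.TCPConn values. The helper names (clientKeepAlive, worstCaseDetection,
configureAccepted) are illustrative and are not taken from the project. The
detection estimate is simply Idle + Count * Interval.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// clientKeepAlive returns the probe settings listed in this commit message.
// Enable: true is what turns SO_KEEPALIVE on for the socket.
func clientKeepAlive() net.KeepAliveConfig {
	return net.KeepAliveConfig{
		Enable:   true,
		Idle:     30 * time.Second, // quiet time before the first probe
		Interval: 10 * time.Second, // gap between probes
		Count:    3,                // unanswered probes before the peer is declared dead
	}
}

// worstCaseDetection estimates how long a dead peer can go unnoticed.
func worstCaseDetection(cfg net.KeepAliveConfig) time.Duration {
	return cfg.Idle + time.Duration(cfg.Count)*cfg.Interval
}

// configureAccepted applies the settings to an accepted connection.
// Non-TCP connections are left untouched (an assumption of this sketch).
func configureAccepted(conn net.Conn) error {
	tcp, ok := conn.(*net.TCPConn)
	if !ok {
		return nil
	}
	return tcp.SetKeepAliveConfig(clientKeepAlive())
}

func main() {
	fmt.Println(worstCaseDetection(clientKeepAlive())) // 1m0s
}
```

The same config can also be set once on net.ListenConfig.KeepAliveConfig so
that every accepted connection inherits it.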
Increase the default idle timeout from 1 minute to 5 minutes.
MTProto clients send ping_delay_disconnect every ~60s, which resets
the idle timer. The previous 1-minute default created a race: if a
ping arrived even 1–2 seconds late the relay was killed. A 5-minute
window also survives typical mobile sleep periods (phone idle 2–5 min)
where the NAT mapping is still alive and the connection can resume
without reconnection.
Cherry-picked from 9seconds/mtg@5f81ae3 with fork-specific fixes:
- keep "time" import in run_proxy.go (used by throttle feature)
- fix network/sockopts_test.go import path to dolonet/mtg-multi
Upstream replaced our reassembleTLSHandshake with a cleaner
fragmentedHandshakeReader + parseClientHello in utils.go.
Adapted ReadClientHelloMulti to use the new API.
Add per-user connection throttling with fair-share algorithm
When total connections exceed a configurable limit, a background
goroutine (every 5s by default) computes per-user caps using a
fair-share algorithm: small users keep their connections, remaining
budget is split equally among heavy users. New connections from
over-cap users are rejected; existing connections are not killed.
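One possible reading of the fair-share rule, as a sketch only: users are
visited from smallest to largest, anyone at or below the current equal share
keeps their connections, and the first user above the share fixes the cap for
everyone remaining. The function name fairShareCaps and the map-based
signature are assumptions, not the project's API.

```go
package main

import (
	"fmt"
	"sort"
)

// fairShareCaps returns a cap for every user that should be throttled.
// Users missing from the result are not capped.
func fairShareCaps(counts map[string]int, limit int) map[string]int {
	caps := make(map[string]int)
	total := 0
	users := make([]string, 0, len(counts))
	for user, n := range counts {
		total += n
		users = append(users, user)
	}
	if total <= limit {
		return caps // under the limit: nobody is throttled
	}
	// Smallest users first; ties are broken by name for determinism.
	sort.Slice(users, func(i, j int) bool {
		ci, cj := counts[users[i]], counts[users[j]]
		if ci != cj {
			return ci < cj
		}
		return users[i] < users[j]
	})
	budget := limit
	for i, user := range users {
		share := budget / (len(users) - i) // equal split of what is left
		if counts[user] <= share {
			budget -= counts[user] // small user keeps everything
			continue
		}
		// Every remaining user is at least as large, so all get the same cap.
		for _, heavy := range users[i:] {
			caps[heavy] = share
		}
		break
	}
	return caps
}

func main() {
	counts := map[string]int{"a": 10, "b": 20, "c": 3000, "d": 4000}
	fmt.Println(fairShareCaps(counts, 5000)) // map[c:2485 d:2485]
}
```

In this sketch a new connection would be rejected when the user's current
count is already at or above the cap, and existing connections are left alone.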
Config:
[throttle]
max-connections = 5000
check-interval = "5s"
Stats API response now includes throttle state with active caps.
fix: move [secrets] after global keys in config examples
In TOML, all keys after a [section] header belong to that table.
The examples had api-bind-to after [secrets], causing it to be
parsed as secrets.api-bind-to and triggering "incorrect secret
format" errors.
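The ordering rule can be illustrated with a small fragment. The user name,
the secret placeholder, and the bind address are made-up values for
illustration only.

```toml
# Broken: api-bind-to follows the [secrets] header, so it is parsed as
# secrets.api-bind-to and validated as if it were a secret.
#
# [secrets]
# alice = "<secret>"
# api-bind-to = "127.0.0.1:9090"

# Fixed: global keys first, tables afterwards.
api-bind-to = "127.0.0.1:9090"

[secrets]
alice = "<secret>"
```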
Fixes #6
DPI bypass tools like ByeDPI fragment a single TLS record into multiple
records to evade censorship. This broke ReadClientHello because it
assumed the entire ClientHello arrives in one TLS record.
Add reassembleTLSHandshake that reads continuation records and
reconstructs a single TLS record before parsing and HMAC verification.
Per RFC 5246 Section 6.2.1, handshake messages may be fragmented
across multiple records — this is valid TLS behavior.
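A simplified sketch of the reassembly idea, not the project's
reassembleTLSHandshake: it reads handshake records until the length declared
in the 4-byte handshake header is satisfied and returns the message without
record headers. The names readHandshakeMessage, reassemble, and
splitIntoRecords are hypothetical, and HMAC verification is out of scope here.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
	"io"
)

const (
	recordTypeHandshake = 0x16    // TLS ContentType "handshake"
	recordHeaderLen     = 5       // type (1) + version (2) + length (2)
	handshakeHeaderLen  = 4       // message type (1) + uint24 body length (3)
	maxFragmentLen      = 1 << 14 // largest plaintext fragment in one record
)

// readHandshakeMessage collects one complete handshake message from r,
// following it across as many records as it was fragmented into.
func readHandshakeMessage(r io.Reader) ([]byte, error) {
	var msg []byte
	want := handshakeHeaderLen // replaced by the real length once it is known
	for len(msg) < want {
		var hdr [recordHeaderLen]byte
		if _, err := io.ReadFull(r, hdr[:]); err != nil {
			return nil, fmt.Errorf("read record header: %w", err)
		}
		if hdr[0] != recordTypeHandshake {
			return nil, errors.New("unexpected record type")
		}
		n := int(binary.BigEndian.Uint16(hdr[3:5]))
		if n == 0 || n > maxFragmentLen {
			return nil, errors.New("invalid fragment length")
		}
		frag := make([]byte, n)
		if _, err := io.ReadFull(r, frag); err != nil {
			return nil, fmt.Errorf("read fragment: %w", err)
		}
		msg = append(msg, frag...)
		if len(msg) >= handshakeHeaderLen {
			bodyLen := int(msg[1])<<16 | int(msg[2])<<8 | int(msg[3])
			want = handshakeHeaderLen + bodyLen
		}
	}
	if len(msg) > want {
		return nil, errors.New("data after handshake message")
	}
	return msg, nil
}

// reassemble is a convenience wrapper for in-memory input.
func reassemble(wire []byte) ([]byte, error) {
	return readHandshakeMessage(bytes.NewReader(wire))
}

// splitIntoRecords wraps consecutive chunks of msg in record headers, the way
// a fragmenting tool might. The sizes must add up to len(msg).
func splitIntoRecords(msg []byte, sizes ...int) []byte {
	var wire bytes.Buffer
	for _, size := range sizes {
		wire.Write([]byte{recordTypeHandshake, 0x03, 0x01, byte(size >> 8), byte(size)})
		wire.Write(msg[:size])
		msg = msg[size:]
	}
	return wire.Bytes()
}

func main() {
	hello := append([]byte{0x01, 0x00, 0x00, 0x06}, "abcdef"...)
	got, err := reassemble(splitIntoRecords(hello, 3, 2, 5))
	fmt.Println(bytes.Equal(got, hello), err) // true <nil>
}
```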
fix: change module path to github.com/dolonet/mtg-multi
go install github.com/dolonet/mtg-multi@latest was broken because go.mod
declared module path as github.com/9seconds/mtg/v2 while the repo lives
at github.com/dolonet/mtg-multi. Users ended up installing the upstream
binary, which lacks multi-secret support.
Fixes https://github.com/9seconds/mtg/issues/376#issuecomment-4162877568
For a couple of releases we have used collected IPs as the prioritized
source for connecting to Telegram. In practice they work much worse than
expected, and connecting to the core IPs consistently gives better results.
This PR therefore flips the priorities, so users can keep auto-update
enabled as a source of secondary addresses rather than primary ones.
Address review: use slices.Clone, simplify concurrent test
- Replace manual make+copy with slices.Clone in Snapshot()
- Remove redundant _ = len(data); Snapshot() call alone is
sufficient to exercise the lock under -race
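The mutex-plus-Snapshot() pattern can be sketched as follows. The collected
and result types are hypothetical stand-ins; the real ScoutConnCollected
fields are not shown in these commit messages.

```go
package main

import (
	"fmt"
	"slices"
	"sync"
)

// result is an illustrative record of one observed chunk.
type result struct {
	size  int
	write bool
}

// collected guards its slice with a mutex so writers and readers can run
// concurrently.
type collected struct {
	mu   sync.Mutex
	data []result
}

// Add appends a new record.
func (c *collected) Add(size int) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.data = append(c.data, result{size: size})
}

// MarkWrite flags the most recent record, if any.
func (c *collected) MarkWrite() {
	c.mu.Lock()
	defer c.mu.Unlock()
	if n := len(c.data); n > 0 {
		c.data[n-1].write = true
	}
}

// Snapshot returns a copy, so callers can iterate without holding the lock.
func (c *collected) Snapshot() []result {
	c.mu.Lock()
	defer c.mu.Unlock()
	return slices.Clone(c.data)
}

func main() {
	var c collected
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			c.Add(n)
			c.MarkWrite()
			c.Snapshot() // the call alone exercises the lock under -race
		}(i)
	}
	wg.Wait()
	fmt.Println(len(c.Snapshot())) // 8
}
```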
fix: tighten ScoutConnCollected encapsulation and add concurrency test
- Move error check before Snapshot() to avoid unnecessary allocation
- Update existing tests to use Snapshot() instead of direct field access
- Add TestConcurrentAddSnapshot to explicitly exercise the mutex
1. Add sync.Mutex to ScoutConnCollected to eliminate data race between
Add()/MarkWrite() in readLoop and learn() iterating results.
Introduce Snapshot() for safe read access.
2. Increase bloom filter test size from 500 to 100000 to prevent
false negatives from random eviction in the stable bloom filter.
3. Use Require().NoError() in TestHTTPSRequest to prevent nil-pointer
panic on resp.Body.Close() when the request fails.
Fixes #425
Cover shared idle tracker behavior:
- tracker lifecycle (new, idle after timeout, touch resets)
- read/write with data touches tracker
- read retries on timeout when tracker is not idle
- read closes on timeout when tracker is idle
- shared tracker prevents false timeout across directions
fix: use shared idle tracker for relay connections
connIdleTimeout previously set per-direction deadlines independently.
During media downloads the client→telegram direction can be idle at the
application level while telegram→client is actively streaming data.
After IdleTimeout (default 1 min) the idle direction's ReadDeadline
fires, tearing down the entire relay and breaking media transfers.
Replace the per-direction timeout with a shared atomic timestamp that
both pump goroutines update on any successful Read or Write. When a
ReadDeadline fires on the idle direction, we check the shared tracker:
if the other direction was recently active, we retry instead of closing.
The connection is only torn down when both directions are idle for the
full timeout period.
This matches the documented IdleTimeout contract: "if we have any
message which will pass to either direction, a timer is reset."
Overhead: one atomic.Int64 (8 bytes) per connection pair, one
atomic Store (~1 ns) per Read/Write with data, zero extra goroutines.
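A minimal sketch of the shared-tracker idea, assuming both directions wrap
their net.Conn in a type that holds the same tracker. The names idleTracker
and trackedConn are illustrative and may differ from the actual code.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"sync/atomic"
	"time"
)

// idleTracker is shared by both pump directions of one relay.
type idleTracker struct {
	last    atomic.Int64 // UnixNano of the last Read or Write that moved data
	timeout time.Duration
}

func newIdleTracker(timeout time.Duration) *idleTracker {
	t := &idleTracker{timeout: timeout}
	t.touch()
	return t
}

func (t *idleTracker) touch() { t.last.Store(time.Now().UnixNano()) }

func (t *idleTracker) isIdle() bool {
	return time.Now().UnixNano()-t.last.Load() >= int64(t.timeout)
}

// trackedConn retries timed-out reads while the shared tracker is still fresh.
type trackedConn struct {
	net.Conn
	tracker *idleTracker
}

func (c trackedConn) Read(p []byte) (int, error) {
	for {
		if err := c.SetReadDeadline(time.Now().Add(c.tracker.timeout)); err != nil {
			return 0, err
		}
		n, err := c.Conn.Read(p)
		if n > 0 {
			c.tracker.touch()
		}
		var netErr net.Error
		if n == 0 && errors.As(err, &netErr) && netErr.Timeout() && !c.tracker.isIdle() {
			continue // the other direction was active recently: keep waiting
		}
		return n, err
	}
}

func (c trackedConn) Write(p []byte) (int, error) {
	n, err := c.Conn.Write(p)
	if n > 0 {
		c.tracker.touch()
	}
	return n, err
}

func main() {
	a, b := net.Pipe()
	defer a.Close()
	defer b.Close()
	tracker := newIdleTracker(100 * time.Millisecond)
	conn := trackedConn{Conn: a, tracker: tracker}
	// Simulate the opposite direction streaming data for a while.
	go func() {
		for i := 0; i < 5; i++ {
			time.Sleep(50 * time.Millisecond)
			tracker.touch()
		}
	}()
	start := time.Now()
	_, err := conn.Read(make([]byte, 1))
	fmt.Println("read ended:", err, "after", time.Since(start).Round(10*time.Millisecond))
}
```

The read in main only gives up once the simulated activity has stopped for a
full timeout, which is the behaviour the fix describes.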
Fixes #423
Fix CI: benchmark lint exclusions and data race in doppel test
Benchmarks added in the per-connection-overhead work triggered errcheck
warnings (37 unchecked Close/Fprintf in tool/bench code) and a data race
on package-level `sink` variable written by concurrent goroutines.
- Exclude benchmarks/ and *_bench_test.go from errcheck/ineffassign
- Replace concurrent `sink = buf[X]` with runtime.KeepAlive(&buf)
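A rough sketch of the second change, with hypothetical helper names (fill,
runWorkers): each goroutine keeps its own buffer alive via runtime.KeepAlive
instead of writing to a shared package-level variable, so there is no shared
write for the race detector to flag.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// fill stands in for the work being benchmarked.
func fill(seed byte) [64]byte {
	var buf [64]byte
	for i := range buf {
		buf[i] = seed + byte(i)
	}
	return buf
}

// runWorkers runs fill concurrently without any shared sink variable.
func runWorkers(workers int) {
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(seed byte) {
			defer wg.Done()
			buf := fill(seed)
			// Racy variant: sink = buf[0] (package-level write from every goroutine).
			runtime.KeepAlive(&buf) // marks buf as used; goroutine-local, so no race
		}(byte(w))
	}
	wg.Wait()
}

func main() {
	runWorkers(8)
	fmt.Println("done")
}
```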