If you've ever been on a video call that stuttered, froze, or dropped entirely, there's a good chance the culprit was a TURN server. TURN (Traversal Using Relays around NAT) is the invisible backbone of every WebRTC-based video platform. It's the relay that routes media when peer-to-peer connections fail — which happens roughly 15-20% of the time in enterprise networks behind restrictive firewalls.
Most video platforms — including several unicorn-valued competitors — use coturn, an open-source C implementation that hasn't seen a major architectural update in years. Others use commercial TURN-as-a-service providers, paying per-gigabyte rates that make scaling to millions of minutes ruinously expensive. A few roll their own in Go or C++, but still end up with the same fundamental limitations: high memory overhead per connection, poor multi-core utilization, and no native integration with the rest of their media pipeline.
At V100, we decided none of these options were acceptable. So we built RustTURN — a proprietary TURN/STUN/ICE media relay server written from the ground up in Rust. It's now the core of every video session on our platform, handling media relay for enterprise conferencing, AI-powered meeting recordings, and white-label video products used by our customers.
The Problem With Off-the-Shelf TURN Servers
To understand why we built RustTURN, you need to understand what a TURN server actually does. When two peers on a video call can't establish a direct connection (because of NATs, firewalls, or corporate proxies), the TURN server acts as a relay. Every audio and video packet from both sides flows through it. For a 1-on-1 call, that's manageable. For a 200-person conference call with screen sharing, AI transcription feeds, and recording pipelines, the relay becomes the most performance-critical component in the entire stack.
Coturn, the de facto open-source TURN server, was designed in an era when "scale" meant a few hundred concurrent connections. Its event-driven relay threads do not spread load evenly across a large number of cores, it allocates heap memory for every TURN allocation, and it has no concept of media awareness: it treats video packets the same as audio packets and data channel messages. It works, but at scale, the cracks become canyons.
The specific problems we hit with coturn at scale:
- Memory bloat. Each TURN allocation consumed ~64KB of heap memory. At 10,000 concurrent relayed connections, that's 640MB of overhead before a single media packet is forwarded. At 100,000 connections, you're looking at 6.4GB of pure overhead.
- Poor multi-core utilization. Coturn's event loop doesn't effectively distribute work across cores. On a 96-vCPU instance, we measured utilization below 15%. We were paying for compute we couldn't use.
- No media awareness. A TURN server that doesn't understand media can't prioritize audio over video during congestion, can't do selective forwarding, and can't integrate with recording or AI pipelines without a separate SFU layer.
- C memory safety issues. We found and patched 3 buffer overflow vulnerabilities in our coturn deployment in under a year. For a HIPAA-compliant platform handling healthcare video, this was unacceptable.
- Cost. Commercial TURN-as-a-service providers charge $0.04-0.08 per GB. At our scale, media relay alone would cost more than the rest of our infrastructure combined.
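The memory-bloat figures in the list above follow from simple multiplication. The sketch below reproduces them, assuming decimal units (64KB = 64,000 bytes) as the figures in the text imply; the function and constant names are illustrative only.

```rust
/// Per-allocation heap overhead quoted above, in decimal units (64 KB = 64_000 bytes).
/// Assumption: the article's figures use decimal, not binary, prefixes.
const BYTES_PER_ALLOCATION: u64 = 64_000;

/// Total overhead in bytes for `connections` concurrent TURN allocations.
fn allocation_overhead_bytes(connections: u64) -> u64 {
    connections * BYTES_PER_ALLOCATION
}

/// Convert bytes to decimal megabytes for display.
fn to_megabytes(bytes: u64) -> f64 {
    bytes as f64 / 1_000_000.0
}
```

Calling `to_megabytes(allocation_overhead_bytes(10_000))` gives the 640MB figure, and the same call with 100,000 connections gives 6,400MB, or 6.4GB.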
Why Rust
The decision to write a media server in Rust wasn't trendy contrarianism — it was the only choice that satisfied all our constraints simultaneously. We needed memory safety without garbage collection pauses. We needed predictable sub-millisecond latency for real-time media forwarding. We needed zero-cost abstractions that let us write high-level relay logic without sacrificing the packet-per-second throughput of hand-tuned C. And we needed fearless concurrency — the ability to saturate all 96 vCPUs on a Graviton4 metal instance without data races.
Go was the other serious contender. It's fast, concurrent, and has a great networking story. But Go's garbage collector introduces unpredictable pauses that show up as audio glitches in real-time media. When you're relaying 48kHz Opus audio packets that need to arrive within a 20ms window, a 5ms GC pause is audible. Rust's ownership model gives us deterministic memory management with zero GC pauses.
C++ was ruled out for a simpler reason: we've spent too many engineering hours debugging use-after-free and buffer overflow bugs in other parts of our stack. For a security-critical component that handles PHI (Protected Health Information) under HIPAA, memory safety isn't optional. Rust gives us both performance and safety without compromise.
RustTURN Architecture
RustTURN is built on a multi-threaded, async I/O architecture using Tokio. Each incoming TURN allocation is assigned to a lightweight task rather than a heavyweight OS thread. On our production instances (c8g.metal-48xl with 192 vCPUs), RustTURN handles over 500,000 concurrent relay connections with under 2GB of resident memory, or roughly 4KB per connection. At coturn's ~64KB per allocation, the same connection count would need about 32GB, so this is at least a 16x reduction in memory.
The key architectural decisions that make this possible:
Arena-based allocation for relay buffers. Instead of heap-allocating a buffer for every packet, RustTURN uses per-thread arena allocators. Each worker thread maintains a pool of pre-allocated 1500-byte buffers (matching the typical MTU). Packets are received into an arena buffer, forwarded, and the buffer is recycled — zero heap allocations in the hot path. This eliminates allocator contention that plagues multi-threaded C servers.
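To make the recycling idea concrete, here is a minimal sketch of a per-thread buffer pool. It is a simple free-list, not RustTURN's actual allocator, and `BufferPool` and its methods are hypothetical names. The point it illustrates is that all allocation happens once at startup, after which receiving and forwarding a packet only pops and pushes a pre-allocated buffer.

```rust
/// Buffer size matching a typical Ethernet MTU, as described above.
const BUF_SIZE: usize = 1500;

/// A pool of fixed-size packet buffers owned by a single worker thread.
/// Hypothetical type for illustration only.
struct BufferPool {
    free: Vec<Box<[u8; BUF_SIZE]>>,
}

impl BufferPool {
    /// Pre-allocate `capacity` buffers up front; this is the only place
    /// where the pool allocates.
    fn with_capacity(capacity: usize) -> Self {
        let free = (0..capacity).map(|_| Box::new([0u8; BUF_SIZE])).collect();
        BufferPool { free }
    }

    /// Take a buffer from the pool, or `None` if the pool is exhausted.
    fn acquire(&mut self) -> Option<Box<[u8; BUF_SIZE]>> {
        self.free.pop()
    }

    /// Return a buffer to the pool so a later packet can reuse it.
    fn release(&mut self, buf: Box<[u8; BUF_SIZE]>) {
        self.free.push(buf);
    }

    /// Number of buffers currently available.
    fn available(&self) -> usize {
        self.free.len()
    }
}
```

Because each worker thread would own its own pool, no lock is needed around `acquire` and `release`. What to do when the pool runs dry (drop the packet, grow the pool, apply backpressure) is a policy decision this sketch leaves open.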
io_uring on Linux, kqueue on macOS. RustTURN uses the most efficient I/O primitives available on each platform. On our production Linux instances, io_uring lets us batch hundreds of UDP send/receive operations into a single syscall, dramatically reducing kernel transitions. This alone accounts for a 40% throughput improvement over epoll-based designs.
Media-aware relay. Unlike dumb TURN relays, RustTURN parses RTP headers to understand what it's forwarding. This lets us implement quality-of-service policies at the relay level: audio packets get priority over video, key frames get priority over delta frames, and screen share streams can be rate-limited independently. During network congestion, your audio stays crystal clear even if video quality temporarily drops.
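As a rough illustration of what "parsing RTP headers" involves, the sketch below reads the 12-byte fixed RTP header defined in RFC 3550 and maps the payload type to a priority class. The payload type numbers (111 for Opus audio, 96 for video) are assumptions: dynamic payload types are negotiated per session, so a real relay would need that mapping from signaling. The `Priority` enum and `classify` function are hypothetical, and the sketch covers only the audio-versus-video split.

```rust
/// Relay priority classes; a lower value is forwarded first. Hypothetical names.
#[derive(Debug, PartialEq, Eq, PartialOrd, Ord, Clone, Copy)]
enum Priority {
    Audio = 0,
    Video = 1,
    Other = 2,
}

/// Fields from the 12-byte fixed RTP header (RFC 3550, section 5.1).
#[derive(Debug, PartialEq, Eq)]
struct RtpHeader {
    marker: bool,
    payload_type: u8,
    sequence_number: u16,
    timestamp: u32,
    ssrc: u32,
}

/// Parse the fixed header; returns `None` for short packets or a version other than 2.
fn parse_rtp_header(packet: &[u8]) -> Option<RtpHeader> {
    if packet.len() < 12 || (packet[0] >> 6) != 2 {
        return None;
    }
    Some(RtpHeader {
        marker: (packet[1] & 0x80) != 0,
        payload_type: packet[1] & 0x7f,
        sequence_number: u16::from_be_bytes([packet[2], packet[3]]),
        timestamp: u32::from_be_bytes([packet[4], packet[5], packet[6], packet[7]]),
        ssrc: u32::from_be_bytes([packet[8], packet[9], packet[10], packet[11]]),
    })
}

/// Assumed payload type mapping (111 = audio, 96 = video); in practice this
/// comes from the session's negotiated parameters.
fn classify(header: &RtpHeader) -> Priority {
    match header.payload_type {
        111 => Priority::Audio,
        96 => Priority::Video,
        _ => Priority::Other,
    }
}
```

Since `Priority` derives `Ord`, a scheduler could place outgoing packets in a priority queue keyed on it, so that audio drains first when the link is congested.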
Integrated recording taps. Because RustTURN understands media streams, it can fork packets directly to our recording pipeline without a separate SFU layer. When a meeting has recording enabled, the relay simply copies media packets to a recording buffer that's asynchronously flushed to S3. Zero additional latency for participants, no extra hop for the media.
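One way to picture a tap that adds no latency for participants is a bounded queue between the relay path and a background writer, where the relay never waits on the recorder. The sketch below uses the standard library's `sync_channel` for this. `RecordingTap` is a hypothetical type, the S3 flush is out of scope, and copying with `to_vec` is a simplification; a production relay would presumably hand over pooled buffers instead.

```rust
use std::sync::mpsc::{sync_channel, Receiver, SyncSender, TrySendError};

/// Hypothetical recording tap: a bounded queue between the relay hot path
/// and a background writer that flushes to storage.
struct RecordingTap {
    tx: SyncSender<Vec<u8>>,
    /// Packets that could not be queued because the recorder fell behind or exited.
    dropped: u64,
}

impl RecordingTap {
    /// Create a tap plus the receiving end for the background writer.
    fn new(queue_depth: usize) -> (Self, Receiver<Vec<u8>>) {
        let (tx, rx) = sync_channel(queue_depth);
        (RecordingTap { tx, dropped: 0 }, rx)
    }

    /// Copy the packet toward the recorder without ever blocking the relay.
    /// If the queue is full or the writer is gone, count the drop and move on.
    fn tap(&mut self, packet: &[u8]) {
        match self.tx.try_send(packet.to_vec()) {
            Ok(()) => {}
            Err(TrySendError::Full(_)) | Err(TrySendError::Disconnected(_)) => {
                self.dropped += 1;
            }
        }
    }
}
```

The design choice here is that `try_send` fails fast, so a slow recorder costs recorded packets rather than call latency. Whether dropping is acceptable for a recording is a product decision the sketch does not settle.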
The Cost Advantage
The financial impact of owning our own media infrastructure is massive. Commercial TURN-as-a-service providers charge between $0.04 and $0.08 per gigabyte of relayed traffic. A typical 1-hour video conference with 10 participants generates roughly 15-20GB of relayed media (assuming not all connections need relay — typically 15-20% do). At scale, this adds up to millions of dollars per month.
With RustTURN running on our own infrastructure, our media relay cost is effectively the EC2 instance cost divided by the total traffic handled. On a c8g.metal-48xl instance ($1.80-2.30/hr spot pricing), handling 500,000+ concurrent connections, our effective per-GB cost is under $0.001 — roughly 40-80x cheaper than commercial alternatives. This cost advantage is what allows V100 to offer video at $0.002-0.004 per minute while maintaining healthy margins.
More importantly, this cost structure is what makes white-label economics work for our customers. When you're reselling video at $0.05/minute and your infrastructure cost is $0.002/minute, you have a 96% gross margin on media delivery. That's the kind of margin that lets SaaS companies build sustainable businesses on top of our platform.
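The margin and cost-ratio figures above can be checked with two one-line formulas. The sketch below uses only the prices quoted in the text; the function names are illustrative.

```rust
/// Gross margin as a fraction of price: (price - cost) / price.
fn gross_margin(price_per_minute: f64, cost_per_minute: f64) -> f64 {
    (price_per_minute - cost_per_minute) / price_per_minute
}

/// How many times cheaper `ours_per_gb` is than `theirs_per_gb`.
fn cost_ratio(theirs_per_gb: f64, ours_per_gb: f64) -> f64 {
    theirs_per_gb / ours_per_gb
}
```

With a resale price of $0.05 and a cost of $0.002 per minute, `gross_margin` comes to about 0.96. With commercial rates of $0.04 and $0.08 per GB against $0.001 per GB, `cost_ratio` comes to about 40 and 80 respectively.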
HIPAA and Security
For healthcare customers, the security story is equally important. Rust's memory safety guarantees eliminate entire categories of vulnerabilities — buffer overflows, use-after-free, data races — that have historically plagued C-based media servers. Every media packet relayed through RustTURN is encrypted with DTLS-SRTP (AES-256-GCM), and our TURN authentication uses long-term credentials with HMAC-SHA256.
We maintain a BAA (Business Associate Agreement) that covers all media relay infrastructure, and RustTURN's architecture ensures that no unencrypted PHI ever touches disk. Relay buffers are zeroed on deallocation, media streams are never logged, and our audit trail captures connection metadata without capturing content.
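Zeroing a buffer on deallocation can be expressed in Rust with a `Drop` implementation. The sketch below is one possible approach, not RustTURN's code: `RelayBuffer` is a hypothetical type, and it uses volatile writes plus a compiler fence because an optimizer may otherwise remove stores to memory that is about to be freed.

```rust
use std::sync::atomic::{compiler_fence, Ordering};

/// Hypothetical relay buffer that wipes its contents before the memory is released.
struct RelayBuffer {
    data: Vec<u8>,
}

impl RelayBuffer {
    /// Allocate a zero-initialized buffer of `len` bytes.
    fn new(len: usize) -> Self {
        RelayBuffer { data: vec![0u8; len] }
    }

    /// Overwrite every byte with zero. Volatile writes keep the compiler
    /// from eliding the stores as dead.
    fn wipe(&mut self) {
        for byte in self.data.iter_mut() {
            // Safety: `byte` is a valid, aligned, exclusive reference to a u8.
            unsafe { std::ptr::write_volatile(byte, 0) };
        }
        // Prevent the compiler from reordering later operations before the wipe.
        compiler_fence(Ordering::SeqCst);
    }
}

impl Drop for RelayBuffer {
    /// Wipe the contents whenever the buffer goes out of scope.
    fn drop(&mut self) {
        self.wipe();
    }
}
```

Tying the wipe to `Drop` means it runs on every exit path, including early returns and panics that unwind, so the guarantee does not depend on callers remembering to clear the buffer.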
What's Next
RustTURN is now in its third major version and handles all media relay for V100's platform — conferencing, AI studio recordings, calendar-based video meetings, and marketplace-deployed applications. We're currently working on post-quantum key exchange for DTLS handshakes (using Kyber/ML-KEM) and selective forwarding unit (SFU) capabilities that will let RustTURN handle simulcast and SVC layer selection natively, further reducing our architecture's complexity.
Building your own media server is not for the faint of heart. It took our team over a year of focused work to reach production stability. But for a platform that needs to handle millions of concurrent minutes at enterprise-grade reliability, owning the media layer is the single highest-leverage infrastructure investment we've made. Every improvement to RustTURN — every microsecond of latency reduction, every megabyte of memory saved — flows directly to our customers as lower costs and better call quality.
That's the V100 philosophy: build the hard things from scratch, in Rust, and own the economics from silicon to session.
Build on RustTURN
V100's entire media infrastructure — including RustTURN — is available through a single API. Start a free trial and have your first video call running in under 5 minutes.