The Fastest Video API.
Measured.
Every number on this page comes from production Prometheus metrics on c5.xlarge EC2 instances. No marketing fluff. No synthetic benchmarks. Real latency, measured at p99.
The time from TCP connection to the first byte of your handler executing. Every millisecond here multiplies across every API call your application makes.
Time from "join call" to peer connection established. This is the latency your users feel when they click "Join Meeting" and wait for video to appear.
Time to process a 60-second video through the pipeline: decode, transcode, assemble, and export. V100's Rust orchestration layer eliminates the overhead that makes other platforms slow.
Every API call hits rate limiting and authentication before reaching business logic. This overhead is invisible to most developers but adds up across millions of requests.
Rate limiting and auth checks run on every single API call. At 10,000 requests per second, the difference between 0.1 ms and 10 ms of per-request overhead is the difference between 1 second and 100 seconds of cumulative processing time accrued every second.
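The arithmetic above is easy to verify yourself. A minimal sketch (the 10,000 requests/second rate and the 0.1 ms vs. 10 ms overheads are the figures from the text):

```python
# Cumulative gateway overhead accrued per wall-clock second,
# given a request rate and a per-request overhead in milliseconds.
def cumulative_overhead_s(requests_per_second: float, overhead_ms: float) -> float:
    return requests_per_second * (overhead_ms / 1000.0)

rps = 10_000
print(cumulative_overhead_s(rps, 0.1))   # ~1 second of overhead per second
print(cumulative_overhead_s(rps, 10.0))  # ~100 seconds of overhead per second
```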
The performance advantage nobody talks about. When competitors need 3-5 vendor round-trips to do what V100 does in a single API call, the latency difference is massive.
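The round-trip effect compounds sequentially. A hypothetical sketch: the 80 ms per-vendor-round-trip figure below is an assumption for illustration, not a measured number, but the multiplier is the point:

```python
# Hypothetical: total wall-clock latency of a workflow that needs N
# sequential vendor round-trips, at an assumed per-trip latency.
def workflow_latency_ms(round_trips: int, per_trip_ms: float = 80.0) -> float:
    return round_trips * per_trip_ms

print(workflow_latency_ms(1))  # one API call
print(workflow_latency_ms(5))  # five vendor round-trips: 5x the latency
```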
Infrastructure decisions compound. The language, runtime, and architecture choices made at the foundation level determine every performance ceiling above them.
| | V100 | Mux | Shotstack | Cloudinary | Descript |
|---|---|---|---|---|---|
| Language | Rust | Go | Node.js | Java/Scala | Python |
| Gateway overhead | <5ms | ~20ms | ~50ms | ~100ms | N/A |
| GC pauses | None | ~1ms | ~10ms | ~50ms | ~100ms |
| Binary size | ~11MB | N/A | N/A | N/A | N/A |
| Cold start | 0ms | N/A | ~500ms | ~2s | N/A |
| Architecture | 16 Rust services | Microservices | Monolith | Monolith | Monolith |
We publish our methodology because we want you to verify these numbers, not just trust them.
Don't take our word for it. Get an API key, point your load tester at our endpoints, and measure the latency yourself.
```shell
# Measure gateway latency
curl -w "connect: %{time_connect}s\nttfb: %{time_starttransfer}s\ntotal: %{time_total}s\n" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  https://api.v100.ai/v1/health
```

Typical response:

```
connect: 0.002s
ttfb: 0.004s
total: 0.005s
```
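If you'd rather collect a real percentile than a single curl sample, a stdlib-only sketch like the following works. The endpoint URL and Authorization header mirror the curl example above; the sample count and the nearest-rank percentile method are our choices, so adjust both to match your own load tester:

```python
import time
import urllib.request

def sample_latencies(url: str, token: str, n: int = 100) -> list[float]:
    """Issue n sequential GETs and return per-request latencies in ms."""
    latencies = []
    for _ in range(n):
        req = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
        start = time.perf_counter()
        with urllib.request.urlopen(req) as resp:
            resp.read()
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies

def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile over the sorted samples."""
    ordered = sorted(samples)
    k = max(0, min(len(ordered) - 1, round(p / 100.0 * len(ordered)) - 1))
    return ordered[k]

# Usage (requires a real API key; issues live requests):
# lats = sample_latencies("https://api.v100.ai/v1/health", "YOUR_API_KEY")
# print(f"p50: {percentile(lats, 50):.2f} ms  p99: {percentile(lats, 99):.2f} ms")
```

Sequential sampling keeps the numbers honest for a first pass; for throughput under concurrency, point a dedicated tool such as wrk or k6 at the same endpoint.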