What You Will Build
By the end of this tutorial, you will have a React component that creates peer-to-peer video calls using WebRTC. The finished component handles the full lifecycle: creating a meeting, exchanging signaling messages over WebSocket, negotiating ICE candidates, and rendering both local and remote video feeds. No third-party UI library required — just React hooks and the browser's native RTCPeerConnection API.
```
React App                     V100 API                    Remote Peer
    |                            |                            |
    |--- POST /api/meetings ---->|                            |
    |<--- { meetingId, token } --|                            |
    |                            |                            |
    |--- WSS connect ----------->|                            |
    |                            |<--- WSS connect -----------|
    |                            |                            |
    |--- SDP offer ------------->|--- SDP offer ------------->|
    |<--- SDP answer ------------|<--- SDP answer ------------|
    |                            |                            |
    |--- ICE candidates -------->|--- ICE candidates -------->|
    |<--- ICE candidates --------|<--- ICE candidates --------|
    |                            |                            |
    |=========== P2P VIDEO STREAM (WebRTC) ==================|
```
Prerequisites
- Node.js 18+ and npm (or yarn/pnpm)
- React 18+ — works with Create React App, Vite, Next.js, or any React setup
- A V100 API key — free tier gives you 100 API calls per month, no credit card required
This tutorial assumes you have an existing React project. If you are starting from scratch:
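A Vite scaffold is one quick way to get a blank React project (any React setup works):

```shell
npm create vite@latest my-video-app -- --template react
cd my-video-app
npm install
```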
Step 1 — Get Your API Key
Sign up at app.v100.ai and grab your API key from the dashboard. The free tier includes 100 API calls per month, which is enough for development and testing. No credit card required.
Store the key in your environment variables. Never commit API keys to source control.
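For example, with Create React App the key can live in a `.env.local` file. The variable name below is illustrative, following the `REACT_APP_` convention used in this tutorial:

```bash
# .env.local — keep this file out of source control (add it to .gitignore)
REACT_APP_V100_API_KEY=your_api_key_here
```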
Security note: In production, create meetings from your backend server, not the browser. The API key should never be exposed in client-side code. This tutorial uses REACT_APP_ for simplicity during development. See the server-side guide for production patterns.
Step 2 — Create the Video Component
This is the core of the integration. The VideoCall component handles everything: creating the meeting, connecting to the signaling server, setting up the peer connection, and rendering video. Drop this into your project and it works.
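A condensed sketch of such a component follows. The endpoint paths, query parameters, and signaling message shapes (`peer-joined`, `offer`, `answer`, `ice`) are assumptions reconstructed from the walkthrough below — treat it as a starting point and check the V100 API reference for exact names:

```jsx
import { useRef, useState } from 'react';

const API_BASE = 'https://api.v100.ai';

export default function VideoCall() {
  const localVideo = useRef(null);
  const remoteVideo = useRef(null);
  const pcRef = useRef(null);
  const wsRef = useRef(null);
  const [meetingId, setMeetingId] = useState(null);

  async function startCall() {
    const auth = { Authorization: `Bearer ${process.env.REACT_APP_V100_API_KEY}` };

    // 1. Create a meeting (dev-only: in production, do this on your backend).
    const meeting = await fetch(`${API_BASE}/api/meetings`, {
      method: 'POST',
      headers: auth,
    }).then((r) => r.json());
    setMeetingId(meeting.meetingId);

    // 2. Fetch STUN/TURN credentials for NAT traversal.
    const { iceServers } = await fetch(`${API_BASE}/api/webrtc/ice-servers`, {
      headers: auth,
    }).then((r) => r.json());

    // 3. Capture camera and microphone, show the local preview.
    const stream = await navigator.mediaDevices.getUserMedia({ video: true, audio: true });
    localVideo.current.srcObject = stream;

    // 4. Create the peer connection and add local tracks.
    const pc = new RTCPeerConnection({ iceServers });
    stream.getTracks().forEach((track) => pc.addTrack(track, stream));
    pc.ontrack = (e) => { remoteVideo.current.srcObject = e.streams[0]; };
    pcRef.current = pc;

    // 5. Connect to the signaling WebSocket; trickle ICE candidates as they arrive.
    const ws = new WebSocket(
      `wss://api.v100.ai/ws/signaling?meetingId=${meeting.meetingId}&token=${meeting.token}`
    );
    wsRef.current = ws;
    pc.onicecandidate = (e) => {
      if (e.candidate) ws.send(JSON.stringify({ type: 'ice', candidate: e.candidate }));
    };

    // 6. Standard WebRTC negotiation, driven by signaling events.
    ws.onmessage = async ({ data }) => {
      const msg = JSON.parse(data);
      if (msg.type === 'peer-joined') {
        const offer = await pc.createOffer();
        await pc.setLocalDescription(offer);
        ws.send(JSON.stringify({ type: 'offer', sdp: offer }));
      } else if (msg.type === 'offer') {
        await pc.setRemoteDescription(msg.sdp);
        const answer = await pc.createAnswer();
        await pc.setLocalDescription(answer);
        ws.send(JSON.stringify({ type: 'answer', sdp: answer }));
      } else if (msg.type === 'answer') {
        await pc.setRemoteDescription(msg.sdp);
      } else if (msg.type === 'ice') {
        await pc.addIceCandidate(msg.candidate);
      }
    };
  }

  return (
    <div>
      <button onClick={startCall}>Start Call</button>
      {meetingId && <p>Meeting ID: {meetingId}</p>}
      <video ref={localVideo} autoPlay playsInline muted />
      <video ref={remoteVideo} autoPlay playsInline />
    </div>
  );
}
```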
That is the entire video calling component. Let us walk through what happens when a user clicks Start Call:
- Create a meeting — a `POST` to `/api/meetings` returns a `meetingId` and a short-lived `token` for WebSocket auth.
- Fetch ICE servers — `/api/webrtc/ice-servers` returns STUN and TURN server credentials. V100 runs RustTURN, a Rust-based TURN server, for NAT traversal.
- Capture media — `getUserMedia` grabs the camera and microphone. The stream is assigned to the local `<video>` element.
- Create peer connection — the `RTCPeerConnection` is configured with ICE servers from V100. Local tracks are added.
- Connect to signaling — a WebSocket connection to `wss://api.v100.ai/ws/signaling` handles SDP offer/answer exchange and ICE candidate trickle.
- Peer joins — when a second participant connects to the same meeting, the signaling server sends a `peer-joined` event. The first peer creates an SDP offer, and the standard WebRTC negotiation completes.
Share the meeting link. To let someone else join, share the meetingId and have them connect to the same signaling WebSocket with that ID. In production, you would generate a join URL like https://yourapp.com/call/{meetingId} and pass it via your own UI.
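A tiny helper for building that join link (the URL pattern is the illustrative one above, not a V100 API):

```javascript
// Build a join URL like https://yourapp.com/call/{meetingId}.
// The route shape is your app's choice; this just follows the example above.
function joinUrl(baseUrl, meetingId) {
  return `${baseUrl.replace(/\/$/, '')}/call/${encodeURIComponent(meetingId)}`;
}
```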
Step 3 — Add Real-Time Transcription
V100 provides server-side transcription in 40+ languages. You enable it with a single API call when creating the meeting, and captions arrive over the same WebSocket connection you are already using for signaling.
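A sketch of enabling transcription at meeting-creation time. The exact field names (`transcription`, `enabled`, `language`) are assumptions — confirm them against the meeting-creation docs:

```javascript
// Hypothetical meeting config enabling server-side transcription.
function buildMeetingConfig({ language = 'en' } = {}) {
  return {
    transcription: { enabled: true, language },
  };
}

// Usage sketch (from your backend, per the security note above):
// fetch('https://api.v100.ai/api/meetings', {
//   method: 'POST',
//   headers: { Authorization: `Bearer ${apiKey}`, 'Content-Type': 'application/json' },
//   body: JSON.stringify(buildMeetingConfig({ language: 'en' })),
// });
```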
Then add a caption handler to your existing WebSocket onmessage callback and a state variable to display the text:
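For example, a caption branch for the existing `onmessage` callback — the `{ type: 'caption', text }` event shape is an assumption about the signaling protocol:

```javascript
// Pull caption text out of a raw signaling message, or return null
// if the message is something else (offer, answer, ice, ...).
function extractCaption(rawMessage) {
  const msg = JSON.parse(rawMessage);
  return msg.type === 'caption' ? msg.text : null;
}

// In the component:
//   const [caption, setCaption] = useState('');
//   ws.onmessage = ({ data }) => {
//     const text = extractCaption(data);
//     if (text !== null) setCaption(text);
//     // ...existing signaling handling...
//   };
```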
That is it. Live captions appear as an overlay on the video call. The transcription runs server-side on V100's infrastructure — no client-side speech recognition, no additional dependencies, and no extra cost on the free tier.
Step 4 — Add Recording
Recording stores the meeting video to S3. You start and stop recording with API calls, and retrieve the recording URL when the meeting ends.
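A sketch of those start/stop calls. The `/recording/start` and `/recording/stop` paths are assumptions modeled on the `POST /api/meetings/{id}/summary` endpoint mentioned later, and the response shape is likewise assumed:

```javascript
// Hypothetical recording endpoints — verify the paths in the API reference.
function recordingEndpoint(meetingId, action) {
  return `https://api.v100.ai/api/meetings/${encodeURIComponent(meetingId)}/recording/${action}`;
}

// action is 'start' or 'stop'; 'stop' is assumed to return the signed S3 URL.
async function toggleRecording(meetingId, action, apiKey) {
  const res = await fetch(recordingEndpoint(meetingId, action), {
    method: 'POST',
    headers: { Authorization: `Bearer ${apiKey}` },
  });
  return res.json();
}
```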
Add record/stop buttons to your UI next to the mute and camera controls. The recording is processed server-side and delivered as an MP4 via a signed S3 URL. You can also set up a webhook to receive a notification when processing completes — see the webhook docs.
Full transcript included. If transcription was enabled during the meeting, the recording response also includes a transcriptUrl with the complete text transcript in JSON and SRT formats. Use it for search, compliance, or AI-generated meeting summaries.
Going Further
The component above gives you a working video call in under 80 lines. Here is what you can add next, each with a single API call or config option:
- Virtual backgrounds — set `virtualBackground: { type: 'blur' }` or `{ type: 'image', url: '...' }` in the meeting config. Processed client-side using TensorFlow.js body segmentation.
- Noise suppression — enabled by default. V100 applies AI-based noise suppression on audio tracks. To disable: `audio: { noiseSuppression: false }`.
- AI meeting summaries — after the meeting ends, hit `POST /api/meetings/{id}/summary` to generate an AI summary from the transcript. Returns key points, action items, and decisions.
- Screen sharing — capture a screen stream with `navigator.mediaDevices.getDisplayMedia()` and swap it onto the peer connection's video sender with `replaceTrack()`. No API change needed.
- Multi-participant — for more than 2 participants, V100 automatically switches to an SFU topology. The API is the same; the server handles routing.
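The screen-sharing bullet can be sketched as a small helper. Note that it locates the sender whose track is video rather than assuming `getSenders()[0]`, since sender order (audio vs. video) is not guaranteed:

```javascript
// Replace the outgoing camera track with a screen-share track.
// Browser-only: requires getDisplayMedia and an active RTCPeerConnection.
async function shareScreen(pc) {
  const screen = await navigator.mediaDevices.getDisplayMedia({ video: true });
  const screenTrack = screen.getVideoTracks()[0];
  const sender = pc.getSenders().find((s) => s.track && s.track.kind === 'video');
  await sender.replaceTrack(screenTrack);
  // When the user stops sharing from the browser UI, switch back to the
  // camera here, e.g. sender.replaceTrack(cameraTrack).
  screenTrack.onended = () => {};
  return screenTrack;
}
```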
V100 vs Building from Scratch
You could build all of this yourself. WebRTC is an open standard. TURN servers are open-source. Transcription models are available on Hugging Face. Here is what that actually looks like:
| | Build from Scratch | V100 API |
|---|---|---|
| Time to first video call | 3–6 months | 5 minutes |
| TURN server setup | Deploy coturn, configure TLS, monitor uptime | Included (RustTURN) |
| Signaling server | Build WebSocket relay, handle reconnects, scale | Managed (wss://api.v100.ai) |
| Recording | FFmpeg pipeline, S3 storage, transcoding | One API call |
| Transcription | Whisper deployment, GPU infra, 40+ language support | One config flag |
| NAT traversal reliability | Your problem | 99.9% connection rate |
| Post-quantum encryption | Implement ML-KEM + ML-DSA yourself | Default on every call |
| Infrastructure cost | $500–$2,000+/mo minimum | Free tier, then usage-based |
| Ongoing maintenance | WebRTC spec changes, browser updates, security patches | Managed by V100 |
Building a production-grade video conferencing system from scratch is a 3–6 month project for a team of 2–3 engineers. Maintaining it is a permanent headcount. V100 gives you the same capabilities with a single React component and a few API calls.
Pricing
V100 offers a free tier with 100 API calls per month — enough for development, testing, and small projects. No credit card required. For production workloads:
- Free — 100 API calls/month, 720p video, transcription included. Perfect for prototyping.
- Pro — Usage-based pricing, 1080p video, recording, virtual backgrounds, priority TURN servers, AI summaries. Starts at $0.004 per participant-minute.
- Enterprise — Volume discounts, dedicated TURN infrastructure, SLA, custom integrations.
See the full pricing page for details.
Start Building for Free
Get your API key and make your first video call in under 5 minutes. No credit card. No sales call. Just code.
Get Your Free API Key