~150 lines of code
5 API calls total
<2s transcription start
Free: 100 recordings/mo

What You Will Build

By the end of this tutorial, you will have a fully functional Loom alternative embedded in your own application. The user clicks a button, records their screen with an optional camera overlay, stops recording, and gets back a shareable link with an auto-generated transcript and AI summary. No video infrastructure to manage, no transcription pipeline to deploy, no CDN to configure. V100 handles all of it through a single API.

The async video recording API is the foundation of any screen recording product, whether you are building an internal tool for engineering teams, a customer support widget, a sales outreach platform, or a standalone Loom alternative. Here is everything your Loom clone will include:

Screen + camera capture (getDisplayMedia)
Client-side recording (MediaRecorder)
Server-side processing with S3 storage
Auto-transcription with word-level timestamps
AI summary generation
Shareable link with embed player
Virtual backgrounds
Noise suppression
Multi-platform publishing
IndexedDB offline buffer

Architecture

Browser                   V100 API                  Storage
    |                          |                          |
    |-- getDisplayMedia() ---->|                          |
    |-- getUserMedia() ------->|                          |
    |-- MediaRecorder -------->|                          |
    |                          |                          |
    |-- POST /upload --------->|--- encode + store ------->|
    |<-- { videoId } ----------|                          |
    |                          |                          |
    |-- POST /transcribe ----->|--- whisper pipeline ----->|
    |<-- { transcript } -------|                          |
    |                          |                          |
    |-- POST /summarize ------>|--- LLM summary --------->|
    |<-- { summary, share } ---|                          |
    |                          |                          |
    |=========== SHAREABLE LINK (CDN-backed) ===========|
            

Step 1 — Record Screen + Camera

The first step in building a Loom clone is capturing the user's screen alongside their camera feed. The browser's getDisplayMedia API handles screen capture, and getUserMedia grabs the webcam. You combine both into a single MediaRecorder instance that writes chunks to an array (or IndexedDB for longer recordings) until the user clicks stop.

This is the client-side recording pattern that every async video messaging tool uses. The video never leaves the browser until the user explicitly uploads it, which means you get instant preview playback and no server costs during recording.

recorder.js — screen + camera capture
    const API_BASE = 'https://api.v100.ai';
    const API_KEY = 'v100_sk_your_api_key_here';

    let mediaRecorder;
    let recordedChunks = [];

    async function startRecording() {
      // 1. Capture the screen (tab, window, or entire display)
      const screenStream = await navigator.mediaDevices.getDisplayMedia({
        video: { width: 1920, height: 1080, frameRate: 30 },
        audio: true, // capture system audio (tab audio)
      });

      // 2. Capture the webcam for the facecam bubble
      const cameraStream = await navigator.mediaDevices.getUserMedia({
        video: { width: 320, height: 320, facingMode: 'user' },
        audio: { echoCancellation: true, noiseSuppression: true },
      });

      // 3. Combine screen video + microphone audio into one stream
      //    (the camera video is only previewed below; system audio from
      //    step 1 is not added to this stream)
      const combined = new MediaStream([
        ...screenStream.getVideoTracks(),
        ...cameraStream.getAudioTracks(),
      ]);

      // 4. Record with MediaRecorder
      recordedChunks = [];
      mediaRecorder = new MediaRecorder(combined, {
        mimeType: 'video/webm;codecs=vp9,opus',
        videoBitsPerSecond: 2_500_000,
      });

      mediaRecorder.ondataavailable = (e) => {
        if (e.data.size > 0) recordedChunks.push(e.data);
      };

      mediaRecorder.start(1000); // chunk every 1 second

      // Show the camera stream in the facecam preview element
      document.getElementById('facecam').srcObject = cameraStream;
    }

    function stopRecording() {
      return new Promise((resolve) => {
        mediaRecorder.onstop = () => {
          const blob = new Blob(recordedChunks, { type: 'video/webm' });
          resolve(blob);
        };
        mediaRecorder.stop();
      });
    }

When the user clicks Stop, the stopRecording function returns a Blob containing the full video. The facecam stream renders into a small circular <video> element overlaid on the screen preview — the same picture-in-picture pattern Loom uses. V100 composites the camera overlay server-side during processing so the final video is a single clean MP4.
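The recorder above hard-codes video/webm;codecs=vp9,opus, and not every browser's MediaRecorder accepts that type (Safari, for example, has historically recorded MP4 rather than WebM). The sketch below shows one way to fall back through a list of candidates using the standard MediaRecorder.isTypeSupported check. The names pickMimeType, createRecorder, and the candidate list are illustrative assumptions, not part of V100.

```javascript
// Candidate container/codec combinations, most preferred first.
// The list itself is an assumption; tune it for your audience.
const MIME_CANDIDATES = [
  'video/webm;codecs=vp9,opus',
  'video/webm;codecs=vp8,opus',
  'video/webm',
  'video/mp4',
];

// Returns the first candidate the predicate accepts, or '' if none match.
// An empty string means "let the browser choose its default format".
function pickMimeType(candidates, isSupported) {
  for (const type of candidates) {
    if (isSupported(type)) return type;
  }
  return '';
}

// Browser-only: builds a MediaRecorder with the best supported type.
function createRecorder(stream) {
  const mimeType = pickMimeType(MIME_CANDIDATES, (t) => MediaRecorder.isTypeSupported(t));
  const options = { videoBitsPerSecond: 2_500_000 };
  if (mimeType) options.mimeType = mimeType;
  return new MediaRecorder(stream, options);
}
```

If you adopt a fallback like this, read mediaRecorder.mimeType when building the Blob and the upload file name so the file is labeled with the format that was actually recorded.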

Offline buffer. For recordings longer than 5 minutes, write chunks to IndexedDB instead of holding them in memory. The V100 SDK includes a ChunkStore utility that handles this automatically — it buffers to IndexedDB during recording and reassembles the blob on stop. This prevents browser tab crashes on long recordings.
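The ChunkStore interface is not shown in this tutorial, so the following is not its implementation. It is a minimal sketch of the same idea using the browser's standard IndexedDB API directly: write each MediaRecorder chunk to an object store as it arrives, then read the chunks back in order on stop. The database name, store name, and function names are all hypothetical.

```javascript
// Hypothetical names throughout; this is not the V100 SDK's ChunkStore.
const DB_NAME = 'recording-buffer';
const STORE = 'chunks';

// Open (or create) the database with one auto-incrementing object store.
function openChunkDb() {
  return new Promise((resolve, reject) => {
    const req = indexedDB.open(DB_NAME, 1);
    req.onupgradeneeded = () => {
      req.result.createObjectStore(STORE, { autoIncrement: true });
    };
    req.onsuccess = () => resolve(req.result);
    req.onerror = () => reject(req.error);
  });
}

// Persist one MediaRecorder chunk; keys increase in insertion order.
function appendChunk(db, blob) {
  return new Promise((resolve, reject) => {
    const tx = db.transaction(STORE, 'readwrite');
    tx.objectStore(STORE).add(blob);
    tx.oncomplete = () => resolve();
    tx.onerror = () => reject(tx.error);
  });
}

// Read every chunk back in key order and rebuild the recording.
function assembleRecording(db, type = 'video/webm') {
  return new Promise((resolve, reject) => {
    const req = db.transaction(STORE, 'readonly').objectStore(STORE).getAll();
    req.onsuccess = () => resolve(new Blob(req.result, { type }));
    req.onerror = () => reject(req.error);
  });
}
```

In the recorder, the ondataavailable handler would call appendChunk(db, e.data) instead of pushing to recordedChunks, and the onstop handler would resolve with assembleRecording(db). Clearing the store after a successful upload keeps old recordings from accumulating.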

Step 2 — Upload to V100

Once you have the recorded blob, upload it to V100's server-side recording pipeline. V100 handles encoding, transcoding to MP4, S3 storage, and CDN distribution. A single POST to /api/recordings/upload with the video file returns a videoId you will use for all subsequent operations.

upload.js — send the recording to V100
    async function uploadRecording(blob) {
      const form = new FormData();
      form.append('file', blob, 'recording.webm');
      form.append('title', 'Screen Recording');
      form.append('visibility', 'unlisted'); // 'public', 'unlisted', or 'private'

      const res = await fetch(`${API_BASE}/api/recordings/upload`, {
        method: 'POST',
        headers: {
          'Authorization': `Bearer ${API_KEY}`,
        },
        body: form,
      }).then(r => r.json());

      // res = {
      //   videoId: "rec_abc123",
      //   status: "processing",
      //   duration: 124.5,
      //   size: 15_400_000,
      //   storageUrl: "https://cdn.v100.ai/recordings/rec_abc123.mp4"
      // }
      return res;
    }

The upload endpoint accepts WebM, MP4, and MOV files up to 2 GB. V100 transcodes everything to H.264 MP4 for universal playback. Processing takes 10 to 30 seconds for a typical 5-minute recording. You can poll the status or set up a webhook to get notified when processing completes.
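The polling option could look roughly like the sketch below. The tutorial does not show a status endpoint, so GET /api/recordings/{videoId} returning { status } is an assumption, as is treating any status other than "processing" as finished. The generic pollUntil helper is plain JavaScript and does not depend on V100.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Calls fetchStatus() repeatedly until isDone(result) is true or time runs out.
async function pollUntil(fetchStatus, isDone, { intervalMs = 2000, timeoutMs = 120000 } = {}) {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    const result = await fetchStatus();
    if (isDone(result)) return result;
    if (Date.now() + intervalMs > deadline) {
      throw new Error('Timed out waiting for processing to finish');
    }
    await sleep(intervalMs);
  }
}

// ASSUMPTION: a status endpoint at GET /api/recordings/{videoId} that returns
// { status }. Only "processing" appears in the upload response above, so this
// treats any other status as finished.
function waitForProcessing(videoId, apiBase, apiKey) {
  const fetchStatus = () =>
    fetch(`${apiBase}/api/recordings/${videoId}`, {
      headers: { Authorization: `Bearer ${apiKey}` },
    }).then((r) => {
      if (!r.ok) throw new Error(`Status check failed: ${r.status}`);
      return r.json();
    });
  return pollUntil(fetchStatus, (rec) => rec.status !== 'processing');
}
```

With processing times of 10 to 30 seconds, a 2-second interval keeps the request count low while still feeling responsive. A webhook avoids the polling loop entirely.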

Server-side uploads in production. In production, generate a signed upload URL from your backend using POST /api/recordings/upload-url. The client uploads directly to the signed URL without exposing your API key in the browser. This is the same pattern S3 presigned URLs use, and it is strongly recommended for any public-facing application.
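A rough sketch of that two-part flow follows. Only the endpoint path comes from this tutorial. The request body, the uploadUrl response field, and the use of PUT for the signed upload (as with S3 presigned URLs) are assumptions to check against the actual API reference.

```javascript
// --- Backend (Node 18+): keeps the API key off the client ---
// ASSUMPTION: the endpoint returns JSON containing `uploadUrl`; the tutorial
// names the endpoint but does not show its request or response shape.
async function createUploadUrl(apiBase, apiKey, fetchImpl = fetch) {
  const res = await fetchImpl(`${apiBase}/api/recordings/upload-url`, {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${apiKey}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ contentType: 'video/webm' }), // hypothetical field
  });
  if (!res.ok) throw new Error(`upload-url request failed: ${res.status}`);
  return res.json();
}

// --- Browser: upload straight to the signed URL, no API key involved ---
// ASSUMPTION: the signed URL accepts a PUT with the raw file body, as S3
// presigned URLs do.
async function uploadToSignedUrl(uploadUrl, blob, fetchImpl = fetch) {
  const res = await fetchImpl(uploadUrl, {
    method: 'PUT',
    headers: { 'Content-Type': blob.type || 'video/webm' },
    body: blob,
  });
  if (!res.ok) throw new Error(`Signed upload failed: ${res.status}`);
}
```

Your backend would expose its own authenticated route that calls createUploadUrl and returns the result to the browser, so the V100 key never appears in client code.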

Step 3 — Auto-Transcribe

Transcription is what separates a real Loom alternative from a simple screen recorder. V100's transcription API produces word-level timestamps, speaker identification, and paragraph segmentation. You get back structured JSON that powers searchable video, auto-generated captions, and the AI summary in the next step.

transcribe.js — request transcription
    async function transcribeRecording(videoId) {
      const res = await fetch(`${API_BASE}/api/recordings/${videoId}/transcribe`, {
        method: 'POST',
        headers: {
          'Authorization': `Bearer ${API_KEY}`,
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          language: 'en',        // or 'auto' for language detection
          wordTimestamps: true,  // word-level timing for highlights
          speakerLabels: true,   // identify different speakers
          paragraphs: true,      // auto-segment into paragraphs
        }),
      }).then(r => r.json());

      // res = {
      //   transcriptId: "tx_def456",
      //   status: "completed",
      //   language: "en",
      //   confidence: 0.97,
      //   duration: 124.5,
      //   text: "Hey team, I wanted to walk you through...",
      //   words: [
      //     { word: "Hey", start: 0.24, end: 0.48, confidence: 0.99 },
      //     { word: "team", start: 0.52, end: 0.81, confidence: 0.98 },
      //     ...
      //   ],
      //   paragraphs: [
      //     { start: 0.24, end: 15.7, text: "Hey team, I wanted to..." },
      //     ...
      //   ],
      //   srtUrl: "https://cdn.v100.ai/transcripts/tx_def456.srt"
      // }
      return res;
    }

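The word-level timestamps are what make this powerful. You can build a clickable transcript where clicking any word jumps the video to that exact moment, the same UX that makes Loom's viewer so useful. The srtUrl gives you a ready-made subtitle file you can load as a selectable caption track in players that accept SRT; for the browser's native <track> element, convert it to WebVTT first. Burned-in captions are a separate option, covered under Auto-Captions below.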
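One way to build the clickable transcript is sketched below. It assumes the words array has the shape shown in the response above (sorted by start time) and that each word is rendered as a span with a data-index attribute. The function names and the "active" CSS class are illustrative.

```javascript
// Binary search over the `words` array from the transcription response.
// Returns the index of the last word whose start time is <= t, or -1 if
// playback has not reached the first word yet.
function findActiveWordIndex(words, t) {
  let lo = 0;
  let hi = words.length - 1;
  let found = -1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    if (words[mid].start <= t) {
      found = mid;
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }
  return found;
}

// Browser-only: wires a transcript container to a <video> element.
// Assumes each word is rendered as <span data-index="i"> inside `container`.
function bindTranscript(video, container, words) {
  // Clicking a word seeks the video to that word's start time.
  container.addEventListener('click', (e) => {
    const span = e.target.closest('[data-index]');
    if (span) video.currentTime = words[Number(span.dataset.index)].start;
  });
  // During playback, highlight the word currently being spoken.
  video.addEventListener('timeupdate', () => {
    const active = findActiveWordIndex(words, video.currentTime);
    container.querySelectorAll('.active').forEach((el) => el.classList.remove('active'));
    const span = container.querySelector(`[data-index="${active}"]`);
    if (span) span.classList.add('active');
  });
}
```

The binary search keeps the lookup cheap even for long recordings, since timeupdate fires several times per second.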

Transcription runs on V100's server-side infrastructure. A 5-minute video typically transcribes in under 10 seconds. Supported languages include English, Spanish, French, German, Japanese, Korean, Portuguese, Chinese, and 30+ more. Set language: 'auto' to let the API detect the language automatically.

Step 4 — Generate Summary

Once the transcript exists, you can generate an AI summary with a single API call. The summary includes key points, action items, and a one-paragraph overview — exactly what recipients need when they do not have time to watch the full video. This is the feature that turns a screen recorder into an async communication tool.

summarize.js — AI-generated summary
    async function generateSummary(videoId) {
      const res = await fetch(`${API_BASE}/api/recordings/${videoId}/summarize`, {
        method: 'POST',
        headers: {
          'Authorization': `Bearer ${API_KEY}`,
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          style: 'professional',   // 'professional', 'casual', or 'detailed'
          includeActionItems: true,
          includeChapters: true,   // auto-generate chapter markers
        }),
      }).then(r => r.json());

      // res = {
      //   summary: "Eric walks through the new dashboard redesign...",
      //   keyPoints: [
      //     "Dashboard load time reduced from 3.2s to 0.8s",
      //     "New chart component replaces legacy D3 implementation",
      //     "Rollout planned for next sprint"
      //   ],
      //   actionItems: [
      //     { assignee: "Sarah", task: "Review PR #847 by Friday" },
      //     { assignee: "Team", task: "Test on staging environment" }
      //   ],
      //   chapters: [
      //     { start: 0, title: "Introduction" },
      //     { start: 28.4, title: "Dashboard Performance" },
      //     { start: 67.1, title: "New Chart Component" },
      //     { start: 98.3, title: "Rollout Plan" }
      //   ]
      // }
      return res;
    }

The chapter markers are generated from topic shifts in the transcript. Display them as a clickable timeline in your video player so viewers can jump to the section they care about. Action items include auto-detected assignees when speaker labels are available from the transcription step.
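A clickable chapter timeline could be built along these lines. The sketch assumes the chapters array has the shape shown in the response above and is sorted by start time. The helper names are illustrative.

```javascript
// Formats seconds as m:ss for chapter labels (e.g. 67.1 -> "1:07").
function formatTimestamp(seconds) {
  const total = Math.floor(seconds);
  const m = Math.floor(total / 60);
  const s = String(total % 60).padStart(2, '0');
  return `${m}:${s}`;
}

// Returns the chapter that contains time t: the last chapter starting at or
// before t, or null if t precedes the first chapter.
function chapterAt(chapters, t) {
  let current = null;
  for (const chapter of chapters) {
    if (chapter.start <= t) current = chapter;
    else break;
  }
  return current;
}

// Browser-only: renders one button per chapter that seeks the video on click.
function renderChapters(video, listEl, chapters) {
  for (const chapter of chapters) {
    const button = document.createElement('button');
    button.textContent = `${formatTimestamp(chapter.start)} ${chapter.title}`;
    button.addEventListener('click', () => {
      video.currentTime = chapter.start;
    });
    listEl.appendChild(button);
  }
}
```

Calling chapterAt(chapters, video.currentTime) from a timeupdate listener lets you highlight the chapter that is currently playing.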

Chain it automatically. You can request transcription and summary in a single upload call by passing transcribe: true and summarize: true in the upload body. V100 will run the full pipeline and fire a webhook when everything is ready. No polling required.
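As a sketch, the chained request might extend the Step 2 form with two extra fields. The transcribe and summarize names come from the paragraph above. Sending them as the strings "true" in multipart form fields is an assumption, since the tutorial does not show how the upload body encodes booleans.

```javascript
// Builds the multipart body for an upload that also requests transcription
// and a summary. Field names follow the text above; the "true" string
// encoding is an assumption.
function buildChainedUploadForm(blob, title = 'Screen Recording') {
  const form = new FormData();
  form.append('file', blob, 'recording.webm');
  form.append('title', title);
  form.append('transcribe', 'true');
  form.append('summarize', 'true');
  return form;
}
```

Pass the returned form as the body of the same POST used in Step 2.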

Step 5 — Share

Every recording gets a shareable link that works instantly — no sign-up required for the viewer. The link loads a hosted player with the video, transcript, summary, and chapter navigation. You can also embed the player in your own app with an iframe or use the raw URLs to build a completely custom viewer.

share.js — get shareable link and publish
    async function getShareLink(videoId) {
      const res = await fetch(`${API_BASE}/api/recordings/${videoId}/share`, {
        method: 'POST',
        headers: {
          'Authorization': `Bearer ${API_KEY}`,
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          visibility: 'unlisted',  // link-only access
          allowDownload: true,
          expiresIn: '30d',        // auto-expire after 30 days
          password: null,          // optional password protection
          notifyOnView: true,      // webhook when someone watches
        }),
      }).then(r => r.json());

      // res = {
      //   shareUrl: "https://watch.v100.ai/s/rec_abc123",
      //   embedHtml: '<iframe src="https://watch.v100.ai/embed/rec_abc123"...>',
      //   thumbnailUrl: "https://cdn.v100.ai/thumbs/rec_abc123.jpg",
      //   mp4Url: "https://cdn.v100.ai/recordings/rec_abc123.mp4",
      //   gifPreview: "https://cdn.v100.ai/previews/rec_abc123.gif"
      // }
      return res;
    }

    // Publish to multiple platforms at once
    async function publishRecording(videoId, platforms) {
      const res = await fetch(`${API_BASE}/api/recordings/${videoId}/publish`, {
        method: 'POST',
        headers: {
          'Authorization': `Bearer ${API_KEY}`,
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          platforms: platforms,  // ['slack', 'notion', 'youtube']
          message: 'New recording: Dashboard Redesign Walkthrough',
        }),
      }).then(r => r.json());

      return res;
    }

The shareUrl is a hosted page with a video player, full transcript, AI summary, and chapter navigation. Viewers do not need an account. The embedHtml gives you a responsive iframe you can drop into Notion, Confluence, or any web page. The gifPreview is a 5-second animated thumbnail — useful for Slack or email previews where video links need a visual hook.

Multi-platform publishing pushes the video to connected integrations in a single call. Connect Slack, Notion, YouTube, or custom webhook destinations through the V100 dashboard, then pass the platform names to the publish endpoint. Each platform gets the video in its native format — Slack gets a rich unfurl with the GIF preview, Notion gets an embedded block, YouTube gets a properly formatted upload.
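Putting the five steps together, a single orchestration function might look like the sketch below. It takes the step functions as an api object (an illustrative design choice so the flow is easy to stub in tests) rather than calling them as globals. Whether transcription can be requested while the upload is still in the "processing" state is not stated in this tutorial, so you may need to wait for processing to finish between the first two calls.

```javascript
// Runs the five steps in order. `api` bundles the functions defined in the
// steps above: uploadRecording, transcribeRecording, generateSummary,
// getShareLink, and publishRecording.
async function processRecording(blob, api, platforms = []) {
  const { videoId } = await api.uploadRecording(blob);
  const transcript = await api.transcribeRecording(videoId);
  const summary = await api.generateSummary(videoId);
  const share = await api.getShareLink(videoId);
  // Publishing is optional; skip the call when no platforms are requested.
  if (platforms.length > 0) {
    await api.publishRecording(videoId, platforms);
  }
  return { videoId, transcript, summary, share };
}
```

A stop-button handler could then call processRecording(await stopRecording(), { uploadRecording, transcribeRecording, generateSummary, getShareLink, publishRecording }, ['slack']) and display the returned share.shareUrl.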

Going Further

The five steps above give you a complete Loom clone. Here is how to polish it into a production-grade async video messaging tool:

Trim Silence

V100's silence removal API detects and trims dead air from recordings automatically. Add trimSilence: true to the upload options and the API strips pauses longer than 2 seconds. For a typical 5-minute recording, this usually shaves off 30 to 60 seconds of awkward silence without any manual editing.

Auto-Captions

The transcription from Step 3 powers burned-in captions. Pass captions: { burn: true, style: 'modern' } to the upload options and V100 renders word-highlighted captions directly into the MP4. You get the same animated-word caption style that performs well on social media, with no post-processing on your end.

Virtual Backgrounds

Replace or blur the user's background in the camera feed. V100 processes this client-side using body segmentation, so no server round trip is added. Pass virtualBackground: { type: 'blur', intensity: 0.7 } or { type: 'image', url: 'https://...' } when initializing the camera stream. The composited result is what gets recorded, with no post-processing needed.

Noise Suppression

AI-based noise suppression is enabled by default on all audio tracks. It removes keyboard clicks, fan noise, background chatter, and other common distractions. To configure sensitivity or disable it entirely, pass audio: { noiseSuppression: false } in the camera stream options.

Viewer Analytics

Track who watched, how far they got, and whether they clicked any chapters or action items. The notifyOnView flag from the share step fires a webhook on each view, and the GET /api/recordings/{id}/analytics endpoint returns aggregate view counts, average watch time, and drop-off points.

Loom vs Building Your Own

Loom costs $12.50 per user per month on the Business plan. That adds up fast for teams. Building your own async video tool gives you full control over the UX, data ownership, and no per-seat licensing. Here is what each approach actually looks like:

| | Loom (SaaS) | DIY from Scratch | V100 API |
|---|---|---|---|
| Time to first recording | Instant (download app) | 2–4 months | 1 hour |
| Custom branding | Enterprise plan only | Full control | Full control |
| Transcription | Included (English-focused) | Deploy Whisper, manage GPUs | One API call, 40+ languages |
| AI summaries | Business plan ($12.50/user/mo) | Build LLM pipeline yourself | One API call |
| Data ownership | Loom hosts everything | Your infrastructure | Your S3 bucket option |
| Embed in your product | Limited embed options | Full integration | Full integration |
| Video storage | Loom servers | S3 + CDN + transcoding ($$$) | Managed (CDN included) |
| Silence removal | Manual trim only | FFmpeg + ML pipeline | One config flag |
| Per-seat cost | $12.50/user/mo (Business) | Engineering time + infra | Usage-based, no per-seat |
| Maintenance | Managed by Loom | Permanent headcount | Managed by V100 |

The sweet spot for most teams is building on V100. You get full control over the user experience and branding — it is your product, not a Loom embed — without the 2 to 4 months of infrastructure work. The video messaging API handles the hard parts (transcoding, transcription, CDN, AI summaries) and you focus on the product layer that differentiates your tool.

For teams already paying for Loom Business, the math is straightforward. A 50-person team on Loom costs $625 per month. V100's usage-based pricing typically comes in at 60 to 80% less for equivalent usage, and you own the UX completely.
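The arithmetic behind those figures can be checked with a few lines. The per-seat price and the 60 to 80% range come from the text above. The function names are illustrative, and the resulting dollar range is only as reliable as that quoted savings range.

```javascript
const LOOM_BUSINESS_PER_SEAT = 12.5; // USD per user per month, from the text above

// Monthly Loom Business cost for a team of the given size.
function loomMonthlyCost(seats) {
  return seats * LOOM_BUSINESS_PER_SEAT;
}

// Applies a percentage saving (60 means "60% less") to a baseline cost.
function costAfterSavings(baseline, savingsPercent) {
  return (baseline * (100 - savingsPercent)) / 100;
}

const baseline = loomMonthlyCost(50);         // 625
const high = costAfterSavings(baseline, 60);  // 250
const low = costAfterSavings(baseline, 80);   // 125
```

For the 50-person example, the quoted savings range implies roughly $125 to $250 per month instead of $625.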

Pricing

V100 offers a free tier with 100 recordings per month, enough for development, testing, and small teams. No credit card required. For production workloads, see the full pricing page for details.

Start Building Your Loom Clone

Get your API key, record your first screen capture, and generate a shareable link in under an hour. No credit card. No sales call. Just code.
