How do I record my screen and webcam at the same time?

Use a tool that supports picture-in-picture (PiP) recording. OBS Studio (free, desktop) lets you create a scene with your screen as the main source and webcam as an overlay. Loom ($12.50/month) provides a browser extension with one-click screen+webcam recording. For developers, the browser's getDisplayMedia and getUserMedia APIs let you capture screen and webcam simultaneously and combine them on a canvas. V100's recording API handles the capture, compositing, transcription, and hosting as a single API integration.

What is the best free screen and webcam recorder?

OBS Studio is the best free option for desktop recording. It supports screen+webcam with customizable PiP layouts, records locally with no watermarks, and outputs high-quality video. The downside is that OBS has a steep learning curve and no built-in sharing or transcription. For browser-based free recording, you can use the Web APIs (getDisplayMedia + getUserMedia) with roughly 80 to 100 lines of JavaScript, but this requires building the recording UI yourself.

Can I get automatic transcription while recording my screen?

Yes. V100's recording API captures screen and webcam, then automatically transcribes the audio when recording completes. The transcription includes word-level timestamps and speaker diarization. You can use the transcript for searchable video content, auto-generated captions, meeting notes, and edit-by-transcript. Loom also offers basic transcription, but without word-level timestamps or API access.

How do I build screen recording into my own app?

Use the browser's getDisplayMedia API for screen capture and getUserMedia for webcam capture. Combine both streams on an HTML canvas using requestAnimationFrame, then record the canvas output using MediaRecorder. For production-quality recording with transcription, hosting, and sharing, use V100's recording API, which handles capture, compositing, encoding, storage, transcription, and shareable link generation.

What PiP layouts work best for screen and webcam recordings?

For tutorials and demos, a small circular webcam overlay in the bottom-left or bottom-right corner (15-20% of frame) works best because it keeps the face visible without blocking content. For presentations, a side-by-side layout (70% slides, 30% speaker) maintains speaker presence. For sales demos, a larger webcam overlay (25-30%) increases personal connection. For code walkthroughs, a bottom-left circle keeps the webcam away from the code area.

How to Record Screen and Webcam at the Same Time

Loom popularized screen + webcam recording as a format for async communication. A face in the corner of the screen creates a personal connection that a plain screen recording lacks. According to Loom's internal data, viewers are about 2x more likely to watch a screen recording to completion when they can see the presenter's face. This is why so many tutorials, product demos, sales walkthroughs, and course lectures use the picture-in-picture (PiP) format: screen content fills the frame, and a circular or rectangular webcam overlay sits in one corner.

The challenge is that recording screen and webcam simultaneously requires capturing two separate media streams (one from the display, one from the camera), compositing them into a single frame, encoding the combined output, and ideally transcribing the audio and generating a shareable link. Different tools handle different parts of this pipeline, and the right choice depends on whether you are a solo user, a team, or a developer building recording into a product.

This guide covers four methods, from the simplest free tool to full API integration, with working code samples and an honest comparison of features, limitations, and pricing.

Why PiP Recording Works: The Data

The case for screen + webcam recording rests on more than preference. Vendor data and research on video lectures suggest that adding a face to a screen recording improves engagement and comprehension.

PiP recording impact

2x higher completion rate

Loom's internal data shows that videos with a webcam overlay have approximately 2x the completion rate of screen-only recordings. The face creates social pressure (someone is talking to you) that keeps viewers watching.

Higher trust in sales contexts

Sales teams using video prospecting (sending screen+webcam recordings instead of emails) report 3x higher response rates. The face creates a personal connection that text cannot replicate. Vidyard, Loom, and Sendspark all recommend PiP format for sales outreach.

Better retention in educational content

Research on video lectures shows that students retain information better when they can see the instructor's face alongside the content. The face provides nonverbal cues (emphasis, confusion, excitement) that enhance understanding of the screen content.

Async communication replaces meetings

Teams using PiP screen recordings for status updates, code reviews, and design feedback report 30-50% fewer meetings. A 3-minute recording replaces a 15-minute meeting because the presenter can be more concise, and viewers can watch at 1.5-2x speed.

Method 1: OBS Studio (Free, Desktop)

OBS Studio is a free, open-source desktop application for video recording and live streaming. It is the most powerful free option for screen + webcam recording, offering complete control over layouts, encoding settings, and output formats. OBS is available on Windows, macOS, and Linux.

To record screen and webcam together in OBS, you create a Scene with two Sources: a Display Capture (or Window Capture) for the screen and a Video Capture Device for the webcam. You position and resize the webcam overlay to create the PiP layout. OBS records the combined output as a single video file.

OBS excels at customization. You can create any layout: circular webcam in the corner, side-by-side, picture-in-picture with custom borders, or full-screen webcam with screen as background. You control the encoding codec (H.264, H.265, AV1), bitrate, resolution, and frame rate. For creators who need maximum quality and do not mind a learning curve, OBS is the best free option.

The downside is complexity. OBS has a steep learning curve and is designed for power users. Setting up a clean PiP layout for the first time takes 15-30 minutes of configuration. There is no built-in sharing (you get a local video file), no automatic transcription, no AI features, and no way to generate a shareable link. After recording, you need to manually upload the file, host it, and share the URL. For teams and developers, OBS offers only local remote control (through its built-in WebSocket interface), not a hosted API for embedding recording in a product.

Method 2: Loom ($12.50/month)

Loom is the category-defining tool for async video communication. Its browser extension provides one-click screen + webcam recording with automatic cloud hosting and shareable links. Click the extension, choose "Screen + Camera", hit record, and when you stop, Loom instantly generates a shareable link. The entire process takes seconds.

Loom's strengths are speed and simplicity. There is zero configuration. The PiP layout is a circular webcam overlay in the bottom-left corner, and it works immediately. Recordings are automatically hosted on Loom's cloud with a shareable link, viewer analytics (who watched, how far they got), and basic transcription.

The limitations are flexibility and pricing. Loom's PiP layout is fixed (you cannot choose the corner, size, or shape of the webcam overlay in most plans). The free tier limits recordings to 5 minutes. The Business plan at $12.50/user/month removes the limit but adds up quickly for teams. There is no API for developers building recording into their own products. And the transcription is basic: no word-level timestamps, no speaker diarization, no edit-by-transcript.

Method 3: Browser-Native JavaScript (Free)

Modern browsers provide two APIs that enable screen + webcam recording without any external tools. getDisplayMedia() captures the screen, and getUserMedia() captures the webcam. By combining both streams on an HTML Canvas and recording the canvas output with MediaRecorder, you can build a complete PiP recorder in roughly 80 to 100 lines of JavaScript.

pip-recorder.js

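// Screen + Webcam PiP Recorder using Browser APIs
async function startPiPRecording() {
  // 1. Capture screen (the browser shows its own source picker)
  const screenStream = await navigator.mediaDevices.getDisplayMedia({
    video: { width: 1920, height: 1080 },
    audio: true                  // System audio (if supported)
  });

  // 2. Capture webcam
  const webcamStream = await navigator.mediaDevices.getUserMedia({
    video: { width: 320, height: 320, facingMode: 'user' },
    audio: true                  // Microphone
  });

  // 3. Create canvas for compositing
  const canvas = document.createElement('canvas');
  canvas.width = 1920;
  canvas.height = 1080;
  const ctx = canvas.getContext('2d');

  // Create muted video elements for the streams
  // (muting prevents echo and avoids autoplay blocking)
  const screenVideo = document.createElement('video');
  screenVideo.srcObject = screenStream;
  screenVideo.muted = true;
  const webcamVideo = document.createElement('video');
  webcamVideo.srcObject = webcamStream;
  webcamVideo.muted = true;
  await Promise.all([screenVideo.play(), webcamVideo.play()]);

  // 4. Composite loop: draw screen + webcam PiP
  // Note: browsers pause requestAnimationFrame while the tab is hidden,
  // so keep the recorder tab visible while recording.
  let frameId;
  function draw() {
    // Full-screen: screen capture
    ctx.drawImage(screenVideo, 0, 0, 1920, 1080);

    // PiP overlay: circular webcam in bottom-left
    const pipSize = 200;
    const pipX = 40;
    const pipY = 1080 - pipSize - 40;

    // Center-crop the camera frame to a square so the face is not stretched
    const side = Math.min(webcamVideo.videoWidth, webcamVideo.videoHeight);
    const sx = (webcamVideo.videoWidth - side) / 2;
    const sy = (webcamVideo.videoHeight - side) / 2;

    ctx.save();
    ctx.beginPath();
    ctx.arc(pipX + pipSize/2, pipY + pipSize/2, pipSize/2, 0, Math.PI * 2);
    ctx.clip();
    ctx.drawImage(webcamVideo, sx, sy, side, side, pipX, pipY, pipSize, pipSize);
    ctx.restore();

    // Border around PiP
    ctx.strokeStyle = '#6366f1';
    ctx.lineWidth = 3;
    ctx.beginPath();
    ctx.arc(pipX + pipSize/2, pipY + pipSize/2, pipSize/2, 0, Math.PI * 2);
    ctx.stroke();

    frameId = requestAnimationFrame(draw);
  }
  draw();

  // 5. Record the canvas output
  const canvasStream = canvas.captureStream(30);  // 30fps

  // Mix microphone and system audio (when present) into a single track
  const audioCtx = new AudioContext();
  const mixedAudio = audioCtx.createMediaStreamDestination();
  [screenStream, webcamStream]
    .filter((s) => s.getAudioTracks().length > 0)
    .forEach((s) => audioCtx.createMediaStreamSource(s).connect(mixedAudio));
  canvasStream.addTrack(mixedAudio.stream.getAudioTracks()[0]);

  // Prefer VP9 in WebM; otherwise let the browser pick its default format
  const preferred = 'video/webm;codecs=vp9';
  const options = MediaRecorder.isTypeSupported(preferred) ? { mimeType: preferred } : {};
  const recorder = new MediaRecorder(canvasStream, options);

  const chunks = [];
  recorder.ondataavailable = (e) => chunks.push(e.data);
  recorder.onstop = () => {
    // Stop drawing and release the screen, camera, and microphone
    cancelAnimationFrame(frameId);
    [screenStream, webcamStream].forEach((s) => s.getTracks().forEach((t) => t.stop()));
    audioCtx.close();
    const blob = new Blob(chunks, { type: recorder.mimeType });
    const url = URL.createObjectURL(blob);
    console.log('Recording ready:', url);
  };

  // Stop recording if the user ends sharing from the browser's own controls
  screenStream.getVideoTracks()[0].addEventListener('ended', () => {
    if (recorder.state !== 'inactive') recorder.stop();
  });
  recorder.start();

  return { recorder, screenStream, webcamStream };
}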

This approach gives you complete control over the recording experience. You choose the PiP position, size, shape, and border style. You can add your logo, a recording timer, or click indicators. The recording happens entirely in the browser with no external dependencies.
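As one example of the customization mentioned above, a recording timer can be drawn onto the same canvas on every frame. The following is a minimal sketch, not part of the recorder listing: the helper names formatElapsed and drawTimer, the startedAt timestamp, and the pixel offsets are all illustrative assumptions.

```javascript
// Format elapsed milliseconds as MM:SS for an on-screen recording timer.
function formatElapsed(ms) {
  const totalSeconds = Math.max(0, Math.floor(ms / 1000));
  const minutes = String(Math.floor(totalSeconds / 60)).padStart(2, '0');
  const seconds = String(totalSeconds % 60).padStart(2, '0');
  return `${minutes}:${seconds}`;
}

// Draw a red dot and the elapsed time in the top-right corner of the frame.
// ctx is the 2D context used by the compositing loop; startedAt is a
// performance.now() timestamp taken when recording began (hypothetical).
function drawTimer(ctx, canvasWidth, startedAt, now = performance.now()) {
  const label = formatElapsed(now - startedAt);
  ctx.save();
  // Semi-transparent backing box so the timer stays readable over any content
  ctx.fillStyle = 'rgba(0, 0, 0, 0.6)';
  ctx.fillRect(canvasWidth - 170, 20, 150, 44);
  // Red "recording" dot
  ctx.fillStyle = '#ef4444';
  ctx.beginPath();
  ctx.arc(canvasWidth - 148, 42, 8, 0, Math.PI * 2);
  ctx.fill();
  // Elapsed time, right-aligned inside the box
  ctx.font = 'bold 28px sans-serif';
  ctx.textAlign = 'right';
  ctx.textBaseline = 'top';
  ctx.fillStyle = '#ffffff';
  ctx.fillText(label, canvasWidth - 32, 28);
  ctx.restore();
  return label;
}
```

Calling drawTimer(ctx, 1920, startedAt) at the end of the draw() loop, after the PiP border, would place the timer on top of both the screen content and the webcam overlay.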

The limitations of browser-native recording are significant for production use. The output format varies by browser: Chrome and Firefox record WebM by default, while Safari records MP4, so delivering one consistent MP4 file typically requires server-side transcoding. Video quality depends on the user's hardware and browser. There is no built-in hosting, sharing, or transcription. You need server infrastructure to store recordings, generate shareable links, and process the video. This is where V100's API fills the gap.
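On the output-format limitation: it can help to probe which formats the current browser can record before creating the MediaRecorder. The sketch below relies on the standard MediaRecorder.isTypeSupported() check; the candidate list and the helper names pickMimeType and extensionFor are illustrative assumptions.

```javascript
// Candidate container/codec strings, in order of preference (an assumption;
// reorder to taste).
const CANDIDATES = [
  'video/mp4;codecs=avc1.42E01E,mp4a.40.2',
  'video/webm;codecs=vp9,opus',
  'video/webm;codecs=vp8,opus',
  'video/webm',
];

// Return the first candidate the browser can record, or '' if none match.
// isSupported is injectable so the function can be exercised outside a browser.
function pickMimeType(
  candidates = CANDIDATES,
  isSupported = (type) =>
    typeof MediaRecorder !== 'undefined' && MediaRecorder.isTypeSupported(type)
) {
  return candidates.find((type) => isSupported(type)) ?? '';
}

// Choose a file extension that matches the recorded container.
function extensionFor(mimeType) {
  return mimeType.startsWith('video/mp4') ? 'mp4' : 'webm';
}
```

In the recorder, the result could be used as `const mimeType = pickMimeType(); new MediaRecorder(stream, mimeType ? { mimeType } : {})`, falling back to the browser's default format when nothing in the list is supported.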

Method 4: V100 Screen Recording API

V100 provides a recording API that handles the entire pipeline: capture screen and webcam streams from the browser, upload to V100's infrastructure, composite the PiP layout server-side, transcode to MP4, auto-transcribe the audio, generate a shareable link, and optionally produce an AI summary of the recording's content.

The browser-side integration captures the raw streams and sends them to V100. V100 handles compositing, encoding, storage, transcription, and sharing. This means your recording quality is consistent regardless of the user's hardware, the output is always MP4 (universally playable), and every recording gets automatic transcription with word-level timestamps.

v100-recorder.js

import { V100 } from 'v100-sdk';
const v100 = new V100('YOUR_API_KEY');

// Start a PiP recording session
const session = await v100.recording.start({
  screen: true,                    // Capture screen via getDisplayMedia
  webcam: true,                    // Capture webcam via getUserMedia
  audio: true,                     // Capture microphone
  layout: {
    type: 'pip',                   // Picture-in-picture
    pip_position: 'bottom-left',  // Corner placement
    pip_size: 200,                // Webcam overlay diameter (px)
    pip_shape: 'circle',           // circle or rectangle
    pip_border: '#6366f1'          // Indigo border color
  },
  transcription: true,              // Auto-transcribe when done
  ai_summary: true,                // Generate AI summary
  resolution: '1080p'
});

// User records... then stop:
const result = await session.stop();

// result.video_url — hosted MP4 video
// result.share_link — shareable link with viewer analytics
// result.transcript — word-level timestamped transcript
// result.summary — AI-generated summary of the recording
// result.duration — recording length

console.log(`Share: ${result.share_link}`);
console.log(`Summary: ${result.summary}`);

PiP Layout Guide: Which Layout for Which Use Case

Tutorials and code walkthroughs: Bottom-left circle, 200px

A small circular webcam overlay in the bottom-left corner keeps the presenter's face visible without blocking code or content. The bottom-left position avoids the code area (typically top and center) and the scrollbar (right side).

Product demos and sales: Bottom-right circle, 250px

A slightly larger webcam overlay increases the personal connection. Bottom-right works well for demo walkthroughs because the mouse cursor usually moves through the center and left of the content.

Presentations: Side-by-side, 70/30 split

Slides on the left (70% of frame), speaker on the right (30%). This layout gives the speaker far more visual presence than a corner overlay, which is better for keynotes, lectures, and any content where the speaker's facial expressions and body language add value.

Async status updates: Full webcam with screen thumbnail

For quick 1-2 minute updates, full webcam with a small screen thumbnail in the corner inverts the typical PiP. The focus is on the person, with screen content as supporting context. This works well for stand-up summaries and Slack video messages.
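For a canvas-based recorder, the layouts in this guide reduce to a few rectangle calculations. The following is a minimal sketch under stated assumptions: the function names pipRect and sideBySide, the option names, and the 40px default margin are illustrative, not taken from any library.

```javascript
// Compute the bounding box of a square PiP overlay in one corner of the frame.
// corner is one of 'top-left', 'top-right', 'bottom-left', 'bottom-right'.
function pipRect(frameW, frameH, { corner = 'bottom-left', size = 200, margin = 40 } = {}) {
  const x = corner.endsWith('left') ? margin : frameW - size - margin;
  const y = corner.startsWith('top') ? margin : frameH - size - margin;
  return { x, y, width: size, height: size };
}

// Split the frame into a content region (left) and a speaker region (right).
// contentShare is the fraction of the width given to the slides or screen.
function sideBySide(frameW, frameH, contentShare = 0.7) {
  const contentW = Math.round(frameW * contentShare);
  return {
    content: { x: 0, y: 0, width: contentW, height: frameH },
    speaker: { x: contentW, y: 0, width: frameW - contentW, height: frameH },
  };
}

// Tutorial layout: 200px circle, bottom-left of a 1080p frame
console.log(pipRect(1920, 1080));
// Sales demo layout: 250px circle, bottom-right
console.log(pipRect(1920, 1080, { corner: 'bottom-right', size: 250 }));
// Presentation layout: 70/30 split
console.log(sideBySide(1920, 1080));
```

The returned rectangles can be passed straight to drawImage() as the destination x, y, width, and height; for a circular overlay, the circle's center is the rectangle's center and its radius is half the size.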

Auto-Transcription During Recording

The most valuable feature of API-based screen recording is automatic transcription. Every recording gets a word-level timestamped transcript immediately after recording completes. This transcript powers several features that standalone recorders like OBS cannot provide.

Searchable recordings: Search your entire library of screen recordings by spoken content. Find the recording where you explained the new API endpoint by searching for "API endpoint." The search returns the recording with a timestamp link that jumps directly to the relevant moment.

AI summary: V100 generates a bullet-point summary of the recording's content. A 10-minute product demo becomes a 5-bullet summary: "Demonstrated new dashboard, showed analytics filtering, explained export feature, discussed pricing change, answered objection about integration timeline." Viewers can read the summary before deciding whether to watch the full recording.

Captions: The transcript is automatically converted to captions, making the recording accessible and watchable on mute. This is particularly important for async recordings shared in Slack or email, where recipients may watch without audio.
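To make the search and caption features concrete, here is a minimal sketch of how a word-level transcript could drive both. It assumes the transcript is an array of { word, start, end } objects with times in seconds; that shape, and the helper names vttTime, wordsToVtt, and findPhrase, are assumptions for illustration and not V100's documented response format.

```javascript
// Format seconds as a WebVTT timestamp (HH:MM:SS.mmm).
function vttTime(seconds) {
  const ms = Math.round(seconds * 1000);
  const h = String(Math.floor(ms / 3600000)).padStart(2, '0');
  const m = String(Math.floor((ms % 3600000) / 60000)).padStart(2, '0');
  const s = String(Math.floor((ms % 60000) / 1000)).padStart(2, '0');
  const frac = String(ms % 1000).padStart(3, '0');
  return `${h}:${m}:${s}.${frac}`;
}

// Group words into caption cues of at most maxWordsPerCue words each.
function wordsToVtt(words, maxWordsPerCue = 7) {
  const cues = [];
  for (let i = 0; i < words.length; i += maxWordsPerCue) {
    const group = words.slice(i, i + maxWordsPerCue);
    const start = vttTime(group[0].start);
    const end = vttTime(group[group.length - 1].end);
    cues.push(`${start} --> ${end}\n${group.map((w) => w.word).join(' ')}`);
  }
  return ['WEBVTT', ...cues].join('\n\n') + '\n';
}

// Lowercase and strip punctuation so "endpoint." matches "endpoint".
const normalize = (text) => text.toLowerCase().replace(/[^\p{L}\p{N}]/gu, '');

// Return the start time (seconds) of the first occurrence of phrase, or null.
function findPhrase(words, phrase) {
  const target = phrase.split(/\s+/).map(normalize).filter(Boolean);
  if (target.length === 0) return null;
  for (let i = 0; i + target.length <= words.length; i++) {
    if (target.every((t, j) => normalize(words[i + j].word) === t)) {
      return words[i].start;
    }
  }
  return null;
}
```

The string returned by wordsToVtt can be served as a .vtt file and attached to a video element with a track element, and the value returned by findPhrase can be assigned to the video's currentTime to jump to the matching moment.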

Comparison: OBS vs. Loom vs. Browser API vs. V100

Feature	OBS	Loom	Browser API	V100
Screen + webcam PiP	Yes (manual setup)	Yes (one-click)	Yes (code required)	Yes (API config)
Custom PiP layouts	Full control	Limited	Full control	4 presets + custom
Auto transcription	No	Basic	No	Word-level + diarization
AI summary	No	Yes (paid)	No	Yes
Shareable link	No (local file)	Yes	No (local blob)	Yes + analytics
Output format	MP4, MKV, FLV	MP4 (cloud)	WebM (MP4 in Safari)	MP4 (cloud)
API for developers	No	No	Yes (raw APIs)	Yes (managed)
Viewer analytics	No	Yes	No	Yes
Price	Free	$12.50/user/mo	Free (DIY)	Pay per minute

When to Use Each Method

Use OBS when:

You want maximum quality and control, you are comfortable with setup complexity, you record locally and share manually, or you are also live streaming. OBS is the power-user choice.

Use Loom when:

You want one-click simplicity, you are an individual or small team, you do not need API access, and the $12.50/user/month price fits your budget. Loom is the fastest path from recording to shared link.

Use Browser APIs when:

You are building a recording feature into your own application and want full control over the UI and experience. Be prepared to handle storage, hosting, and transcription yourself.

Use V100 when:

You are building screen recording into your product and need managed infrastructure for compositing, transcription, hosting, sharing, and analytics. V100 handles the backend so you build the UI.

Build screen recording into your product

V100's recording API handles capture, compositing, transcription, hosting, and sharing. Free tier includes 100 API calls per month. Start building today.
