Virtual backgrounds are table stakes for any video platform in 2026. Zoom has them. Teams has them. Google Meet has them. Your users expect to blur their messy apartment, replace their background with a corporate logo, or hide the fact that they are taking a board meeting from a coffee shop. This is not a differentiator. It is a baseline requirement.
What is a differentiator is whether developers can control virtual backgrounds programmatically. On Zoom, the user clicks a button in the UI. On Teams, the user selects a background from a gallery. On both platforms, the developer building on top of the SDK has limited or no control over what backgrounds are available, when they are applied, or whether they can be enforced. The background is a user feature, not a developer feature.
V100 takes a fundamentally different approach. Virtual backgrounds are an API. Developers set them via session configuration, swap them mid-call via API calls, enforce company-approved backgrounds for all participants, disable them for compliance scenarios, and control them per-participant. The user-facing experience is still a button in the meeting UI. But the developer has full programmatic control over what happens behind that button. This post is a technical deep-dive into how the implementation works.
How It Works: MediaPipe + Canvas + captureStream
V100's virtual background pipeline has three stages: segmentation, compositing, and stream replacement. The entire pipeline runs in the browser on the participant's device. No video frames are sent to a server for background processing. This is critical for both latency and privacy — the unprocessed camera feed never leaves the user's device.
Stage 1: Segmentation (MediaPipe Selfie Segmentation)
The segmentation stage uses MediaPipe Selfie Segmentation, a lightweight machine learning model that runs in the browser via WebAssembly. The model takes each video frame as input and produces a segmentation mask: a grayscale image where white pixels represent the person and black pixels represent the background. The model runs at the camera's frame rate (typically 30fps) and adds approximately 3-8 milliseconds of processing time per frame on modern hardware.
MediaPipe's segmentation model is trained specifically for selfie-style video: a single person facing the camera at arm's length. It handles common challenges well: varying skin tones, glasses, hats, headphones, and partially visible hands. It struggles with unusual poses (person turned sideways), transparent objects (glass on a desk), and very similar foreground/background colors (wearing a white shirt against a white wall). For these edge cases, V100 applies Gaussian blur to the mask edges to create a smooth transition rather than a hard cutoff.
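The edge-softening idea can be illustrated with a small, self-contained sketch: a separable box blur over a grayscale mask array (a stand-in for the Gaussian blur V100 applies; the function and parameter names here are illustrative, and the real pipeline operates on canvas pixels rather than raw arrays):

```javascript
// Illustrative edge softening for a segmentation mask.
// `mask` is a flat Uint8ClampedArray of grayscale values
// (0 = background, 255 = person), laid out as w x h pixels.
// A box blur is run horizontally, then vertically, which
// approximates a Gaussian blur and turns hard mask edges
// into a smooth foreground/background transition.
function softenMaskEdges(mask, w, h, radius) {
  const pass = (src, stride, lineLen, lineCount, lineStride) => {
    const out = new Uint8ClampedArray(src.length);
    for (let line = 0; line < lineCount; line++) {
      for (let i = 0; i < lineLen; i++) {
        let sum = 0;
        let count = 0;
        for (let k = -radius; k <= radius; k++) {
          const j = i + k;
          if (j >= 0 && j < lineLen) {
            sum += src[line * lineStride + j * stride];
            count++;
          }
        }
        // Uint8ClampedArray rounds and clamps the average for us.
        out[line * lineStride + i * stride] = sum / count;
      }
    }
    return out;
  };
  const horizontal = pass(mask, 1, w, h, w); // blur along each row
  return pass(horizontal, w, h, w, 1);       // then along each column
}
```

A hard 0/255 boundary in the input comes out as a graded ramp, which is what hides segmentation errors at hairlines and shoulders.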
Stage 2: Compositing (Canvas API)
The compositing stage uses the HTML5 Canvas API to combine the person (foreground) with the selected background. For each frame, the pipeline draws the camera frame to the canvas, clips it to the segmentation mask using the destination-in composite operation, and then draws the selected background behind the remaining person pixels using destination-over.
The globalCompositeOperation approach is the most performant method for canvas-based compositing because it uses the browser's hardware-accelerated 2D renderer. Alternative approaches — pixel-by-pixel manipulation via getImageData/putImageData — are 10-50x slower because they forgo GPU acceleration and force a CPU round-trip for every pixel.
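A minimal sketch of the frame-then-mask-then-background compositing order (names are illustrative; `ctx` can be a real CanvasRenderingContext2D or any object with the same drawing interface, and the mask is assumed to encode the person in its alpha channel):

```javascript
// Composite one frame from the camera image, segmentation mask,
// and background. Only globalCompositeOperation is used, so the
// browser's accelerated 2D renderer does all of the pixel work.
function compositeFrame(ctx, cameraFrame, mask, background, w, h) {
  ctx.clearRect(0, 0, w, h);
  // 1. Draw the raw camera frame.
  ctx.globalCompositeOperation = 'source-over';
  ctx.drawImage(cameraFrame, 0, 0, w, h);
  // 2. Keep only the pixels covered by the mask (the person).
  ctx.globalCompositeOperation = 'destination-in';
  ctx.drawImage(mask, 0, 0, w, h);
  // 3. Fill everything behind the person with the background.
  ctx.globalCompositeOperation = 'destination-over';
  ctx.drawImage(background, 0, 0, w, h);
  // Reset to the default for any later drawing.
  ctx.globalCompositeOperation = 'source-over';
}
```

Each step is a single hardware-accelerated draw call, which is why the whole compositing stage stays under a millisecond per frame.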
Stage 3: Stream Replacement (captureStream)
The final stage replaces the original camera track in the WebRTC connection with the composited canvas output. The canvas produces a MediaStream via canvas.captureStream(30) (30fps). The video track from this stream replaces the original camera track using RTCRtpSender.replaceTrack(). This is a seamless, glitch-free replacement — the remote participant sees the background change without any interruption in the video feed.
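A sketch of this replacement step against the standard WebRTC API (the function name is illustrative; `pc` is an RTCPeerConnection and `canvas` is the compositing canvas):

```javascript
// Swap the outgoing camera track for the composited canvas output.
async function swapToCanvasTrack(pc, canvas) {
  // captureStream(30) yields a MediaStream mirroring the canvas at 30fps.
  const canvasTrack = canvas.captureStream(30).getVideoTracks()[0];
  // Find the sender currently carrying the camera's video track.
  const sender = pc
    .getSenders()
    .find((s) => s.track && s.track.kind === 'video');
  // replaceTrack swaps the media without SDP renegotiation or a glitch.
  await sender.replaceTrack(canvasTrack);
  return canvasTrack;
}
```

Because replaceTrack avoids renegotiation, the remote side keeps decoding the same RTP stream and simply sees different frames.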
The canvas handles both portrait and landscape orientations automatically. The compositing logic detects the video resolution and adjusts the background image scaling to fill the frame without stretching or letterboxing. When the user rotates their device or resizes their window, the canvas dimensions update and the background scales accordingly.
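The fill-without-stretching scaling reduces to a small piece of arithmetic. A sketch (names illustrative):

```javascript
// Compute the draw rectangle that scales a background image to cover
// the video frame completely while preserving aspect ratio: overflow
// is cropped, never stretched or letterboxed.
function coverFit(frameW, frameH, imageW, imageH) {
  const scale = Math.max(frameW / imageW, frameH / imageH);
  const w = imageW * scale;
  const h = imageH * scale;
  // Center the image so any cropping is symmetric.
  return { x: (frameW - w) / 2, y: (frameH - h) / 2, w, h };
}
```

Passing the resulting rectangle to drawImage fills the canvas for both portrait and landscape frames; when the frame dimensions change on rotation or resize, recomputing the rectangle is all that is needed.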
The 7 Presets + Custom Upload
V100 ships with 7 built-in virtual background presets that cover the most common use cases, plus a none value that disables the feature entirely. Each preset is selectable via the meeting UI or settable via the API.
| Preset | API Value | Description | Use Case |
|---|---|---|---|
| Slight Blur | blur-light | Gaussian blur (radius 10px) on background | Subtle background de-emphasis |
| Heavy Blur | blur-heavy | Gaussian blur (radius 30px) on background | Complete background concealment |
| Solid Black | solid-black | Pure black (#000000) background | Professional, minimal distraction |
| Solid White | solid-white | Pure white (#FFFFFF) background | Clean, studio-style look |
| Office | office | Professional office environment image | Work-from-home professionalism |
| Green Screen | green-screen | Solid green (#00FF00) background | Post-processing, chroma key workflows |
| Custom Image | custom | User-uploaded image (URL or base64) | Branding, custom environments |
| None | none | Disable virtual background | Original camera feed |
The green screen preset deserves special mention. It enables post-production workflows where the meeting recording is later processed with professional chroma key software to composite participants into any environment. This is used by broadcast media companies, content creators, and event production teams who need studio-quality compositing that goes beyond what real-time browser-based processing can achieve.
API Integration: Set and Swap Backgrounds via Code
The key difference between V100 and every other video platform is that virtual backgrounds are controllable via the API. This enables use cases that are impossible when virtual backgrounds are a user-only feature.
```javascript
// Set virtual background in session config (before joining)
const session = await v100.createSession({
  roomId: 'board-meeting-q1',
  virtualBackground: {
    enabled: true,
    preset: 'blur-heavy', // or 'office', 'custom', etc.
  },
});

// Switch background mid-call
await session.setVirtualBackground({
  preset: 'custom',
  imageUrl: 'https://company.com/branding/bg.jpg',
});

// Enforce company background for all participants
await v100.rooms.update('board-meeting-q1', {
  virtualBackground: {
    enforced: true,
    preset: 'custom',
    imageUrl: 'https://company.com/branding/bg.jpg',
    allowOverride: false, // participants cannot change it
  },
});

// Disable virtual backgrounds (compliance mode)
await v100.rooms.update('deposition-room', {
  virtualBackground: {
    enabled: false, // no backgrounds allowed
  },
});

// Per-participant control
await session.participants.setBackground('participant-id', {
  preset: 'blur-light',
});
```
The enforcement API is particularly important for enterprise customers. A company can mandate that all employees use the corporate-branded background on client-facing calls. A legal firm can disable virtual backgrounds entirely for depositions, ensuring that the video recording shows the actual environment. A healthcare provider can enforce a neutral background for patient consultations to maintain professionalism. These are policy decisions that require programmatic control, not user preferences.
Screen Share Integration: Automatic Pause and Resume
A common edge case that most virtual background implementations handle poorly is the transition between camera and screen share. When a participant starts sharing their screen, the virtual background should stop processing — applying a segmentation mask to a screen capture makes no sense and wastes CPU. When screen sharing stops, the virtual background should resume seamlessly.
V100 handles this automatically. When a participant starts screen sharing, the virtual background pipeline pauses: the MediaPipe model stops receiving frames, the canvas compositing loop stops, and the screen share track replaces the composited track in the WebRTC connection. When screen sharing ends, the pipeline resumes within one frame (33ms at 30fps). The participant sees their virtual background return instantly. Remote participants see a seamless transition with no black frames, no flickering, and no delay.
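In SDK terms, the behavior amounts to gating the processing loop on screen-share state. A sketch of that wiring (the event names and pipeline interface here are illustrative, not the documented V100 API):

```javascript
// Gate the virtual background pipeline on screen-share state.
// `session` is any event emitter exposing on(event, handler);
// `pipeline` exposes pause() and resume().
function wireScreenShareGuard(session, pipeline) {
  session.on('screenShareStarted', () => pipeline.pause());
  session.on('screenShareStopped', () => pipeline.resume());
}

// Minimal pipeline stub: while paused, frames skip segmentation
// and compositing entirely, freeing the CPU for the screen capture.
function makePipeline() {
  let paused = false;
  return {
    pause: () => { paused = true; },
    resume: () => { paused = false; },
    shouldProcessFrame: () => !paused,
  };
}
```

The frame loop checks shouldProcessFrame() before running segmentation, so resuming takes effect on the very next frame.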
This integration extends to V100's picture-in-picture mode. When a participant is screen sharing with a camera PiP overlay, the virtual background is applied only to the PiP overlay (the camera feed) and not to the screen share content. The segmentation model processes the small PiP frame, which is computationally cheaper than processing a full-resolution camera feed, resulting in even lower CPU usage during screen share.
Graceful Fallback: When MediaPipe Is Unavailable
MediaPipe Selfie Segmentation loads from a CDN. In environments where CDN access is blocked (corporate firewalls, air-gapped networks, regions with CDN restrictions), the model cannot load and segmentation is unavailable. V100 handles this gracefully with a two-tier fallback strategy.
Tier 1: Full-frame blur. If the segmentation model cannot load but the user has selected a virtual background, V100 applies a full-frame Gaussian blur to the entire video feed. This does not separate the person from the background (the person is also blurred), but it provides privacy by obscuring the environment. This is better than showing the raw camera feed when the user explicitly requested a background.
Tier 2: Raw camera feed with notification. If even canvas processing is unavailable (extremely old browsers, canvas disabled by policy), V100 falls back to the raw camera feed and displays a notification to the user explaining that virtual backgrounds are not available in their environment. The video call proceeds normally — virtual backgrounds are never a hard requirement for joining a meeting.
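The tier selection is a straightforward decision cascade. Sketched as a pure function (names and flags are illustrative):

```javascript
// Decide how to render local video given what the environment supports.
// Returns 'segmented' (full pipeline), 'full-blur' (tier 1), or
// 'raw' (tier 2, shown alongside a user-facing notification).
function chooseRenderMode({ backgroundRequested, modelLoaded, canvasAvailable }) {
  if (!backgroundRequested) return 'raw';           // user wants no background
  if (modelLoaded && canvasAvailable) return 'segmented';
  if (canvasAvailable) return 'full-blur';          // model blocked: blur all
  return 'raw';                                     // nothing available
}
```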
For customers who need virtual backgrounds in air-gapped environments, V100 supports self-hosted MediaPipe model files. The model weights can be served from the customer's own infrastructure, eliminating the CDN dependency entirely. This is configured via the session configuration by providing a custom modelUrl parameter.
Technical Architecture Diagram
```
Camera (getUserMedia)
        |
        v  30fps video frames
+--------------------+
| MediaPipe Selfie   |  ~3-8ms per frame (WASM)
| Segmentation       |  Output: grayscale mask
+--------+-----------+
         |
         v  mask + original frame
+--------------------+
| Canvas Compositor  |  <1ms per frame (GPU-accelerated)
|                    |
| 1. Draw camera     |  (raw frame onto canvas)
|    frame           |
| 2. destination-in  |  (clip to person via mask)
| 3. destination-    |  (draw background behind person:
|    over            |   preset image, blur, or color)
+--------+-----------+
         |
         v  composited frame
+--------------------+
| captureStream(30)  |  Canvas -> MediaStream
+--------+-----------+
         |
         v  video track
+--------------------+
| replaceTrack()     |  Swap into WebRTC connection
| RTCRtpSender       |  Seamless, no renegotiation
+--------------------+
         |
         v
Remote participants see composited video

Screen Share Active?
--------------------
YES -> Pause segmentation + canvas loop
       Replace track with screen capture
NO  -> Resume pipeline within 1 frame (33ms)
```
Performance: CPU Impact and Optimization
Virtual background processing adds CPU load on the participant's device. The MediaPipe segmentation model is the primary cost at 3-8ms per frame. The canvas compositing adds less than 1ms per frame when GPU-accelerated. At 30fps, the total CPU overhead is approximately 120-270ms of processing per second, or 12-27% of a single CPU core. On modern laptops and phones with multiple cores, this is well within budget. On older devices, V100 automatically reduces the segmentation frequency to 15fps to halve the CPU load while maintaining acceptable visual quality.
The blur presets are cheaper to composite than image-replacement backgrounds because there is no separate background image to decode and draw. Instead, the camera frame is drawn twice: once through the canvas context's blur filter (the background) and once sharp, clipped by the segmentation mask (the foreground). The segmentation cost is unchanged, but the compositing step drops to under 2ms per frame.
Per-frame processing cost by preset type
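The blurred-background path can be sketched with the same context-style interface as before (names are illustrative; the mask is assumed to encode the person in its alpha channel, and the sharp person layer is built on an offscreen canvas):

```javascript
// Blur-preset compositing: draw the whole frame blurred, then overlay
// the sharp person layer built on an offscreen canvas.
function compositeBlurFrame(mainCtx, personCtx, frame, mask, w, h, radius) {
  // Sharp person layer: frame clipped to the mask, on the offscreen canvas.
  personCtx.clearRect(0, 0, w, h);
  personCtx.globalCompositeOperation = 'source-over';
  personCtx.drawImage(frame, 0, 0, w, h);
  personCtx.globalCompositeOperation = 'destination-in';
  personCtx.drawImage(mask, 0, 0, w, h);
  // Blurred background: the full camera frame through a blur filter.
  mainCtx.filter = `blur(${radius}px)`;
  mainCtx.drawImage(frame, 0, 0, w, h);
  mainCtx.filter = 'none';
  // Sharp person composited on top of the blurred frame.
  mainCtx.drawImage(personCtx.canvas, 0, 0, w, h);
}
```

Both draws are hardware-accelerated, which is what keeps the blur presets' compositing cost low.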
Use Cases: Why API-Controlled Backgrounds Matter
Work from home: The most common use case. Employees hide their home environment on professional calls. V100's API allows companies to pre-configure the default background for their organization, so employees do not need to select it manually on every call.
HIPAA compliance: In telehealth sessions, the patient's environment may contain sensitive information (medication bottles, medical equipment, other people). Enforcing a blur or solid background protects patient privacy and prevents incidental disclosure of PHI. V100's enforcement API ensures the background is applied automatically, removing the burden from the patient.
Corporate branding: Enterprise customers use custom background images with their company logo, product imagery, or event branding. For webinars and customer-facing calls, a consistent branded background reinforces professionalism. V100's room-level enforcement ensures every participant presents a unified brand image.
Legal depositions: In video depositions, attorneys may want to disable virtual backgrounds entirely to ensure the recording shows the actual environment of the deponent. V100's disable API prevents any participant from activating a background, providing an authentic record. This is discussed further in our video deposition recording guide.
Content creation: The green screen preset enables professional post-production workflows. Creators record with a digital green screen, then use professional compositing software to place themselves in any environment with higher quality than real-time browser processing can achieve. This bridges V100's real-time video capabilities with traditional production workflows.
Comparison: V100 vs. Zoom vs. Teams vs. Daily
| Feature | V100 | Zoom | Teams | Daily |
|---|---|---|---|---|
| Virtual backgrounds | Yes | Yes | Yes | Yes |
| Set via API | Yes | No | No | Limited |
| Swap mid-call via API | Yes | No | No | No |
| Enforce per-room | Yes | No | Admin only | No |
| Disable for compliance | Yes (API) | Admin setting | Admin setting | No |
| Per-participant control | Yes | No | No | No |
| Custom image upload | Yes (URL + base64) | Yes (file upload) | Yes (file upload) | Limited |
| Screen share auto-pause | Yes (automatic) | Yes | Yes | Varies |
| Green screen preset | Yes | Yes | No | No |
The pattern is clear. Zoom and Teams treat virtual backgrounds as a user feature controlled through their UI. Daily offers limited SDK control. V100 treats virtual backgrounds as a developer feature controlled through the API. For developers building custom video experiences — telehealth platforms, corporate meeting tools, event production systems, legal deposition software — API-level control is not a nice-to-have. It is the difference between building a polished product and hacking around platform limitations.
Implementation: Switching Backgrounds Mid-Call
```javascript
// Example: branded background for sales calls, blur for internal
const session = await v100.joinRoom('sales-demo-42');

// Start with company branding
await session.setVirtualBackground({
  preset: 'custom',
  imageUrl: 'https://cdn.acme.com/bg/sales-backdrop.jpg',
});

// Client leaves, switch to casual internal standup
session.on('participantLeft', async (participant) => {
  if (participant.role === 'external') {
    const externals = session.participants
      .filter((p) => p.role === 'external');
    if (externals.length === 0) {
      // All external participants left -- relax background
      await session.setVirtualBackground({
        preset: 'blur-light',
      });
    }
  }
});

// Dynamic per-participant backgrounds
session.on('participantJoined', async (participant) => {
  if (participant.role === 'presenter') {
    await session.participants.setBackground(
      participant.id,
      { preset: 'custom', imageUrl: participant.brandedBg }
    );
  }
});
```
This kind of dynamic, event-driven background control is impossible on platforms where virtual backgrounds are a user-facing button. V100's API makes the background a programmable element of the video experience, just like the video layout, the recording settings, or the participant permissions. It is one more dimension of control that developers need to build polished, professional video products.
Build with API-controlled virtual backgrounds
7 presets, custom upload, per-participant control, enforcement policies, and automatic screen share integration. Virtual backgrounds as a developer feature, not just a user button.