Try to zoom into a video tile on Zoom. You cannot. Try it on Teams. You cannot. Google Meet, Webex, Daily, LiveKit — none of them. Every video conferencing platform in the world renders participants as fixed-size tiles in a grid layout, and that is the only view you get. The tile size is determined by the number of participants, the window size, and the layout algorithm. You have no control over it.
This is bizarre when you think about it. Every other application that displays visual content lets you zoom in. Google Maps: pinch to zoom. Photos: pinch to zoom. PDFs: pinch to zoom. Browsers: pinch to zoom. But video calls — the application where seeing fine detail on a person's face has the highest stakes — give you zero zoom capability. You are stuck with whatever tile size the layout algorithm gives you. If you are on a phone with 8 participants, each face is 40 pixels wide and you cannot do a thing about it.
V100 fixes this. Every video tile in V100 supports real zoom: pinch to zoom on mobile, scroll to zoom on desktop, drag to pan when zoomed in. Each tile has independent zoom state. The zoom is canvas-based, not CSS-based, so it is sharp at every magnification. And it works alongside V100's AI auto-zoom without conflict.
Why Canvas, Not CSS: The Technical Foundation
The naive approach to zoom is CSS transform: scale(2) on the video element. This is fast and easy to implement. It is also terrible. CSS scale operates on the composited pixel output of the element, not the source data. When you scale a 640x360 video element to 2x, the browser scales 640x360 pixels to fill 1280x720 pixels. The result is blurry, pixelated, and aliased — especially on faces, where the human visual system is extraordinarily sensitive to artifacts.
V100 takes a fundamentally different approach. Each participant's video is rendered to an HTML5 <canvas> element. The zoom is implemented as a viewport transformation on the canvas 2D context. When you zoom to 2x, the canvas does not scale pixels. It re-samples from the full-resolution video source, drawing only the visible region at the canvas's native resolution. The result is sharp at any zoom level because every pixel in the output is drawn from the full-resolution input.
The difference is analogous to the difference between stretching a JPEG in an image editor versus cropping it. Stretching produces blur. Cropping preserves sharpness. V100's zoom is a crop, not a stretch.
CSS scale vs. canvas viewport
The canvas approach does require more CPU than CSS scale because the canvas must redraw the visible region on every frame. V100 mitigates this by only drawing when the video frame updates (using requestVideoFrameCallback where available, falling back to requestAnimationFrame), and by skipping redraws when the zoom state has not changed and the video frame is identical. On modern hardware, the per-frame cost of the canvas draw is under 1ms — negligible compared to the 16.7ms frame budget at 60fps.
Zoom Mechanics: Every Input Method
Mobile: Pinch to Zoom
On touch devices, V100 captures two-finger pinch gestures on each video tile. The implementation tracks touchstart, touchmove, and touchend events, calculating the distance between two touch points to determine zoom delta. The zoom anchors to the midpoint between the two fingers, so the region between your fingers stays centered as you zoom in or out — identical to how Google Maps and Apple Photos handle pinch zoom.
A critical implementation detail: the video tile's CSS must include touch-action: none. Without this, the browser's default pinch-zoom behavior intercepts the gesture and zooms the entire page instead of the individual tile. This is the single most common implementation mistake when building custom pinch-zoom interactions, and it is the reason most web developers give up on custom pinch handling.
When zoomed in beyond 1x, a single-finger drag pans the viewport within the zoomed frame. The pan is momentum-based: a quick flick continues scrolling with deceleration, matching the native iOS/Android scroll physics that users expect. At 1x zoom, single-finger drag does nothing (to avoid conflicting with page scrolling).
Desktop: Scroll Wheel Zoom
On desktop, the scroll wheel (or trackpad scroll gesture) zooms the video tile. The zoom anchors to the cursor position: if you hover over the speaker's left eye and scroll up, the zoom centers on their left eye. This "zoom follows cursor" behavior is intuitive because it matches how zoom works in every image editor, map application, and CAD tool. The zoom increment is 0.15x per scroll tick, providing smooth granularity.
When zoomed in, click-and-drag pans the viewport. The cursor changes to a grab cursor to indicate pannability. The pan is clamped to the bounds of the video frame, so you cannot pan past the edge of the video content.
Keyboard Shortcuts
V100 supports full keyboard control for accessibility and power users. + and - zoom in and out on the focused tile. Arrow keys pan when zoomed in. 0 resets to 1x. These shortcuts are active when a video tile has focus (click to focus, Tab to cycle), ensuring they do not conflict with other application shortcuts.
Double-Tap / Double-Click Reset
Double-tap on mobile or double-click on desktop instantly resets the tile to 1x zoom with a smooth 200ms animation. This provides a quick escape hatch when you want to return to the default view without fiddling with pinch gestures. If you double-tap while at 1x, it zooms to 2x centered on the tap position — a quick way to zoom in on a specific area.
Per-Tile Independent Zoom State
Every video tile in V100 maintains its own zoom state: current zoom level, pan offset (x, y), and animation state. Zooming into one participant has zero effect on any other tile. In a 6-person call, you can zoom to 3x on the presenter, zoom to 2x on your manager, and leave everyone else at 1x. Each tile remembers its state for the duration of the session.
This independence is essential for usability. A global zoom that affects all tiles would be useless — you want to see one person up close, not scale the entire grid. Per-tile state also enables V100's AI auto-zoom to operate on the active speaker's tile without disturbing your manual zoom on other tiles.
Zoom state per tile
- • zoom: float from 1.0 to 5.0 (default 1.0)
- • panX / panY: float offset in normalized coordinates (0.0 = center)
- • isAnimating: bool true during easeOutCubic transitions
- • source: "manual" | "auto" distinguishes user zoom from AI auto-zoom
- • lastInteraction: timestamp for manual override priority
Smooth Animation: easeOutCubic at 60fps
Every zoom transition in V100 is animated with an easeOutCubic easing function. This produces a fast initial movement that decelerates smoothly, giving zoom interactions a responsive, high-quality feel. The animation runs at the display's native refresh rate (typically 60fps, 120fps on ProMotion displays) using requestAnimationFrame.
The easeOutCubic function is 1 - (1 - t)^3, where t progresses from 0 to 1 over the animation duration. For zoom-level changes, the duration is 250ms. For auto-zoom speaker transitions, it is 300ms (slightly longer for a calmer transition). For double-tap reset, it is 200ms (faster to feel snappy). These durations were tuned through extensive user testing to balance responsiveness with smoothness.
The animation interpolates both zoom level and pan position simultaneously. When auto-zoom transitions from one speaker to another, the previous tile zooms out from 3x to 1x while the new tile zooms in from 1x to 3x. Both animations run in parallel, creating a smooth visual handoff that directs the viewer's attention naturally.
Floating Zoom Controls and Level Indicator
When a tile is zoomed beyond 1x, V100 displays a floating zoom level indicator in the corner of the tile. This shows the current zoom level (e.g., "2.3x") and provides + / - buttons for click-based zoom control. The indicator fades in when the tile is zoomed and fades out 2 seconds after the last interaction, keeping the UI clean. Hovering over a zoomed tile brings the indicator back.
The floating controls serve three purposes: they communicate that zoom is active (many users do not realize a tile is zoomed without a visual indicator), they provide a mouse-friendly zoom alternative for users who prefer clicking to scrolling, and they include a reset button that returns to 1x instantly. On touch devices, the controls are larger (44px minimum touch target per WCAG guidelines) and positioned to avoid being obscured by fingers during pinch gestures.
Enable Zoom via API
Zoom is enabled by default for all video tiles. Developers can customize the zoom range, behavior, and controls through the API.
// Configure zoom behavior for the room
const room = await v100.createRoom({
name: 'product-demo',
zoom: {
enabled: true,
minZoom: 1.0, // Minimum zoom level
maxZoom: 5.0, // Maximum zoom level
scrollIncrement: 0.15, // Zoom per scroll tick
doubleTapZoom: 2.0, // Zoom level on double-tap
showControls: true, // Floating zoom indicator
keyboardShortcuts: true, // +/- and arrow keys
animationMs: 250, // easeOutCubic duration
}
});
// Programmatically zoom a specific tile
room.setZoom(participantId, {
level: 2.5,
centerX: 0.5, // Center horizontally
centerY: 0.35, // Upper third (face region)
animate: true,
});
// Reset all tiles to 1x
room.resetAllZoom();
// Listen for zoom changes
room.on('zoom:change', (event) => {
console.log(`${event.participantId}: ${event.zoom}x`);
});
Use Cases: When Zoom Matters
Screen Share Detail
When someone shares their screen during a call, the content is rendered at whatever resolution the tile allows. In a 4-person call, the screen share tile might be 640 pixels wide. Code in a text editor, spreadsheet cells, small UI elements — these become illegible. With V100's zoom, you can pinch into the screen share tile to read the code, examine the spreadsheet, or see the fine detail in a design mockup. This alone makes V100 more useful than any competing platform for technical reviews, design feedback sessions, and pair programming calls.
Facial Expressions in Group Calls
In an 8-person meeting, each tile is tiny. You cannot see if someone is frowning, confused, or about to speak. V100's zoom lets you enlarge any participant at any time. Combined with AI auto-zoom, the active speaker is automatically enlarged while you can manually zoom into other participants to read their reactions.
Telehealth Visual Assessment
A dermatologist needs to see a skin lesion. An ophthalmologist needs to see the patient's eye. A physical therapist needs to see the patient's range of motion at specific joints. Zoom into the video tile to examine the relevant area at up to 5x magnification. Drag to pan across the patient's body. This capability transforms telehealth from "I can sort of see you" to "I can examine you."
Presentations and Slides
Presentation slides often have small text, fine charts, and detailed diagrams that are illegible in a standard video tile. V100's zoom lets any participant zoom into the presentation to read the fine print, examine the chart axes, or study the diagram details — without asking the presenter to "make it bigger."
Why Nobody Else Has This
The reason other video platforms do not offer per-tile zoom comes down to their rendering architecture. Zoom, Teams, Meet, and most WebRTC-based platforms render video using native <video> elements. The video element is a black box: the browser decodes the video stream and composites the output to the element's bounds. You cannot draw a sub-region of the video at a different resolution. You cannot apply a viewport transformation to the decoded frame. The only option is CSS scale, which produces blurry results.
V100's canvas-based rendering pipeline was designed for zoom from day one. Every video tile is a canvas that draws from the video source. This adds engineering complexity — you must manage frame scheduling, handle canvas context state, and deal with the overhead of an additional draw call per frame — but it unlocks capabilities that are impossible with raw video elements. Zoom is the most visible of those capabilities, but the same pipeline also enables V100's real-time effects, virtual backgrounds, and frame-level annotations.
Building this after the fact is not trivial. Migrating from <video> elements to a canvas pipeline means rearchitecting the entire rendering layer — video capture, compositing, layout, interaction handling, and accessibility. For platforms with millions of users, this is a multi-quarter engineering project with significant regression risk. V100 built the canvas pipeline first and never accumulated the technical debt.
Feature Comparison
| Capability | V100 | Zoom | Teams | Meet |
|---|---|---|---|---|
| Pinch to zoom (mobile) | 1x-5x | No | No | No |
| Scroll to zoom (desktop) | Cursor-anchored | No | No | No |
| Drag to pan | With momentum | No | No | No |
| Per-tile independent zoom | Yes | No | No | No |
| Keyboard shortcuts | +/-/arrows/0 | No | No | No |
| Double-tap reset | Yes | No | No | No |
| Rendering method | Canvas | Video element | Video element | Video element |
| Zoom quality | Sharp (source resample) | N/A | N/A | N/A |
The comparison is stark because there is nothing to compare. No other video platform has any of these features. V100 is competing against a complete absence of zoom functionality in the video conferencing space. The question is not whether V100's zoom is better than the competition's. The question is why no one else has built it.
Performance: Zoom at 60fps
Canvas-based rendering has a cost. Each zoomed tile requires an additional drawImage() call per frame, which involves reading from the video source and writing to the canvas. On modern hardware, this costs 0.3-0.8ms per tile per frame. In an 8-person call with all tiles at 1x, the overhead is under 6ms total — well within the 16.7ms frame budget at 60fps.
V100 applies several optimizations to minimize the rendering cost. Tiles at 1x zoom skip the viewport transformation and draw the full frame directly (cheaper than calculating a subregion). Tiles whose video source has not produced a new frame since the last draw skip the redraw entirely (checked via requestVideoFrameCallback). Animation frames are driven by requestAnimationFrame to align with the display's vsync, avoiding unnecessary draws between refreshes.
On lower-powered devices (budget Android phones, older iPads), V100 automatically reduces the canvas resolution for non-focused tiles and caps the maximum zoom at 3x to maintain frame rate. This adaptive quality strategy ensures zoom remains smooth even on hardware that cannot sustain full-resolution rendering for all tiles simultaneously.
Try zoom in a live call
Start a free V100 room, join from your phone and your laptop, and pinch-zoom into your own video tile. It works. Nobody else's does.