Spatial Audio for the Spatial Computing Era.
First-Order Ambisonics encoding, binaural HRTF rendering, and real-time head tracking for Apple Vision Pro, AirPods Pro, and any stereo headphones. Sub-millisecond pipeline latency.
Full Soundfield in 4 Channels
First-Order Ambisonics (FOA) represents the full-sphere 3D soundfield using four B-format channels, the degree-0 and degree-1 spherical harmonics. V100 encodes, rotates, and decodes in real time.
W — Omnidirectional
Pressure signal. Captures sound equally from all directions. The "mono" reference channel.
X — Front–Back
Figure-8 dipole aligned to the front axis. Positive lobe faces forward.
Y — Left–Right
Figure-8 dipole on the lateral axis. Positive lobe points left.
Z — Up–Down
Figure-8 dipole on the vertical axis. Captures height information.
SN3D Normalization
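Schmidt semi-normalization (SN3D) scales the spherical harmonics so that, for a single point source, no higher-order channel peaks above the W channel, which keeps headroom predictable at any Ambisonics order. N3D (full normalization) instead gives every component equal power in a diffuse field and exceeds SN3D by a factor of √(2l+1) at degree l. Furse-Malham (maxN-based) weights are defined only up to third order.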
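To make the relationship between the two normalizations concrete, here is a minimal sketch of the SN3D-to-N3D conversion. The function names are illustrative and not part of V100's API. The sketch assumes one first-order frame in ACN order (W, Y, Z, X).

```python
import math

def sn3d_to_n3d_gain(degree):
    """Gain that converts an SN3D channel of the given degree to N3D."""
    return math.sqrt(2 * degree + 1)

def foa_sn3d_to_n3d(w, y, z, x):
    """Convert one first-order frame (ACN order W, Y, Z, X) from SN3D to N3D."""
    g1 = sn3d_to_n3d_gain(1)  # sqrt(3) for all three degree-1 channels
    # W is degree 0, so its gain is 1 and it passes through unchanged.
    return (w, y * g1, z * g1, x * g1)
```

At first order the only difference is a factor of √3 on Y, Z, and X, which is why SN3D material played back as N3D (or the reverse) sounds either too wide or too narrow rather than obviously broken.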
ACN Channel Ordering
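Ambisonic Channel Number (ACN) is the standard ordering used by MPEG-H, YouTube 360, and Google spatial audio. Each channel is assigned a unique integer index based on its spherical harmonic degree and order. For first order, ACN sequences the channels as W, Y, Z, X (indices 0 to 3), not the traditional W, X, Y, Z listing.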
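The ACN index follows directly from the spherical harmonic degree l and order m as l·(l + 1) + m. A short sketch (the helper name is illustrative, not a V100 function):

```python
def acn_index(degree, order):
    """Ambisonic Channel Number for spherical harmonic (degree, order)."""
    if degree < 0 or abs(order) > degree:
        raise ValueError("order must satisfy -degree <= order <= degree")
    return degree * (degree + 1) + order

# First-order channels in ACN sequence: index 0..3
FOA_ACN_ORDER = ["W", "Y", "Z", "X"]

for name, (l, m) in zip(FOA_ACN_ORDER, [(0, 0), (1, -1), (1, 0), (1, 1)]):
    print(name, acn_index(l, m))
```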
Place Audio Objects in 3D Space
Position mono or stereo sources anywhere around the listener using azimuth, elevation, and distance. V100 encodes each object into the Ambisonics soundfield in real time.
Azimuth
Horizontal angle around the listener. 0° is front, ±180° is directly behind.
Elevation
Vertical angle. +90° is directly overhead, -90° is directly below.
Distance
Radial distance from the listener. Level falls off with the inverse-square law (intensity ∝ 1/r², so amplitude ∝ 1/r), with additional air-absorption modeling.
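The encoding step can be sketched as follows. This is a minimal illustration under stated assumptions, not V100's implementation: positive azimuth is taken to point left (matching the Y channel's positive lobe), amplitude falls off as 1/distance with a reference distance of 1, and air absorption is omitted. This document does not state the azimuth sign convention, so negate `azimuth_deg` if your positions are positive to the right.

```python
import math

def encode_foa(sample, azimuth_deg, elevation_deg, distance=1.0):
    """Encode one mono sample into an SN3D first-order frame (ACN order W, Y, Z, X)."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    # Assumed distance model: 1/r amplitude gain, clamped so that sources
    # closer than the reference distance are not boosted.
    s = sample / max(distance, 1.0)
    w = s                                # omnidirectional
    y = s * math.sin(az) * math.cos(el)  # left-right
    z = s * math.sin(el)                 # up-down
    x = s * math.cos(az) * math.cos(el)  # front-back
    return (w, y, z, x)
```

A source straight ahead lands entirely in W and X, one directly overhead in W and Z, and doubling the distance halves every channel.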
Real-Time Soundfield Rotation
Quaternion-based head tracking from Vision Pro, AirPods Pro, or Meta Quest feeds directly into V100's rotation matrix. The soundfield counter-rotates against the listener's head movement in real time, so sources stay anchored in the world.
Yaw
Rotation around the vertical axis. Turning your head left or right. Maps to azimuth shift in the soundfield.
range: ±180°
effect: horizontal pan
Pitch
Rotation around the lateral axis. Tilting your head up or down. Maps to elevation shift in the soundfield.
range: ±90°
effect: vertical pan
Roll
Rotation around the front axis. Tilting your head ear-to-shoulder. Adjusts left-right balance and height cues.
range: ±180°
effect: tilt compensation
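For the yaw case, head-tracking compensation reduces to a 2D rotation of the X and Y channels. W and Z are untouched. The sketch below assumes positive yaw means turning the head to the left and uses ACN channel order. It illustrates the principle only and is not V100's rotation code. Pitch and roll work the same way but also mix in the Z channel.

```python
import math

def counter_rotate_yaw(w, y, z, x, head_yaw_deg):
    """Rotate an FOA frame (ACN order W, Y, Z, X) opposite to the head yaw."""
    psi = math.radians(head_yaw_deg)
    c, s = math.cos(psi), math.sin(psi)
    # A source at azimuth phi must appear at (phi - psi) after the head turns.
    x_rot = x * c + y * s
    y_rot = -x * s + y * c
    return (w, y_rot, z, x_rot)
```

With a source straight ahead (X = 1, Y = 0) and the head turned 90° to the left, the result has X = 0 and Y = -1, so the source is now heard on the right, as expected.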
// WebSocket message from Vision Pro
{
  "timestamp": 1711547823.456,
  "quaternion": {
    "w": 0.9239,  // scalar component
    "x": 0.0000,  // pitch axis
    "y": 0.3827,  // yaw axis (45° turn)
    "z": 0.0000   // roll axis
  },
  "device": "apple_vision_pro",
  "session_id": "sa_sess_7f3a..."
}
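To check a tracking message by hand, the quaternion can be converted to a rotation matrix and the yaw angle read back from it. The sketch below assumes the axis labels shown in the message (x = pitch, y = yaw, z = roll) and a yaw-pitch-roll composition order. Whether positive yaw means left or right depends on the device's coordinate frame, which is not specified here. The comments in the sample above are for illustration, and a real payload would need to be plain JSON.

```python
import json
import math

def quaternion_to_matrix(w, x, y, z):
    """Standard 3x3 rotation matrix from a quaternion (normalized first)."""
    n = math.sqrt(w * w + x * x + y * y + z * z)
    if n == 0.0:
        raise ValueError("zero-length quaternion")
    w, x, y, z = w / n, x / n, y / n, z / n
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]

def yaw_degrees(message_text):
    """Yaw about the y axis, in degrees, from a tracking message."""
    q = json.loads(message_text)["quaternion"]
    r = quaternion_to_matrix(q["w"], q["x"], q["y"], q["z"])
    return math.degrees(math.atan2(r[0][2], r[2][2]))

# Same quaternion as the sample message, without the inline comments.
message = ('{"timestamp": 1711547823.456, '
           '"quaternion": {"w": 0.9239, "x": 0.0, "y": 0.3827, "z": 0.0}, '
           '"device": "apple_vision_pro"}')
yaw = yaw_degrees(message)
```

For the sample quaternion this recovers a yaw of about 45°, matching the comment in the message.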
HRTF Convolution for True 3D Perception
Head-Related Transfer Functions model how sound diffracts around your head, pinnae, and torso. V100 convolves the Ambisonics soundfield with HRTF filters to produce binaural stereo that works with any headphones.
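One common way to do this, shown here as a simplified sketch rather than V100's renderer, is to decode the soundfield to a set of virtual loudspeakers and convolve each feed with the left-ear and right-ear impulse responses (HRIRs) for that direction. The two-speaker layout and the `TOY_SPEAKERS` impulse responses below are placeholders invented for illustration, not measured HRTF data. The decode is a basic horizontal cardioid pattern that ignores Z.

```python
import math

def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sample lists."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out

def virtual_speaker_feed(w, y, x, azimuth_deg):
    """Cardioid virtual microphone aimed at azimuth_deg (SN3D, horizontal only)."""
    az = math.radians(azimuth_deg)
    return [0.5 * (wi + xi * math.cos(az) + yi * math.sin(az))
            for wi, yi, xi in zip(w, y, x)]

def render_binaural(w, y, x, speakers):
    """speakers: list of (azimuth_deg, hrir_left, hrir_right), equal-length HRIRs."""
    n = len(w) + len(speakers[0][1]) - 1
    left, right = [0.0] * n, [0.0] * n
    for azimuth_deg, hrir_left, hrir_right in speakers:
        feed = virtual_speaker_feed(w, y, x, azimuth_deg)
        for k, v in enumerate(convolve(feed, hrir_left)):
            left[k] += v
        for k, v in enumerate(convolve(feed, hrir_right)):
            right[k] += v
    return left, right

# Placeholder HRIRs: the far ear gets a delayed, quieter copy.
TOY_SPEAKERS = [
    (90.0, [1.0, 0.0], [0.0, 0.5]),   # virtual speaker on the left
    (-90.0, [0.0, 0.5], [1.0, 0.0]),  # virtual speaker on the right
]

# An impulse arriving from hard left: W = 1, Y = 1, X = 0.
left, right = render_binaural([1.0], [1.0], [0.0], TOY_SPEAKERS)
```

A production renderer would use more virtual speakers (or filter the Ambisonics channels directly), measured HRIRs, and FFT-based convolution.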
From Vision Pro to Any Headphones
Full spatial audio with head tracking on supported devices. Binaural rendering on everything else. No listener left behind.
Apple Vision Pro
AirPods Pro
Meta Quest 3
Sony WH-1000XM5
Any Stereo Headphones
| Feature | Vision Pro | AirPods Pro | Meta Quest | Stereo |
|---|---|---|---|---|
| Binaural HRTF | | | | |
| Head Tracking | 6DOF | 3DOF | 6DOF | None |
| Personalized HRTF | | | | |
| Latency | <1 ms | <5 ms | <2 ms | <1 ms |
MPEG-H & Dolby Atmos Ready
V100 generates standards-compliant metadata for object-based audio delivery. Export to MPEG-H 3D Audio, Dolby Atmos ADM, or our native spatial format.
MPEG-H 3D Audio
Object-based audio with scene metadata. Supports up to 128 audio objects with position, gain, and spread parameters. Used in broadcast (ATSC 3.0, DVB).
- Scene description metadata
- Object position + interactivity
- HOA + channel bed support
Dolby Atmos
Export Audio Definition Model (ADM) metadata for Dolby Atmos workflows. Object positions, bed assignments, and binaural render metadata in standard ADM XML.
- ADM BWF export
- 7.1.4 bed + objects
- Renderer-agnostic positioning
V100 Spatial
Our native format optimized for low-latency streaming. Compact binary metadata interleaved with audio frames for sub-millisecond scene updates.
- Binary + JSON hybrid
- Frame-level position updates
- WebSocket real-time streaming
{
  "format": "v100_spatial_v1",
  "sample_rate": 48000,
  "ambisonics_order": 1,
  "normalization": "SN3D",
  "channel_order": "ACN",
  "objects": [
    {
      "id": "commentator",
      "azimuth": 0.0,
      "elevation": 0.0,
      "distance": 2.0,
      "gain": 1.0
    },
    {
      "id": "crowd_left",
      "azimuth": -90.0,
      "elevation": 5.0,
      "distance": 15.0,
      "gain": 0.8
    }
  ],
  "export": ["mpeg_h", "dolby_atmos_adm"]
}
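Before sending a scene like this, it can help to check object positions against the ranges given earlier (azimuth within ±180°, elevation within ±90°). The validator below is a client-side sketch. The function name is hypothetical, and the positive-distance check is an assumption rather than a documented V100 rule.

```python
def validate_scene_objects(scene):
    """Return a list of human-readable problems found in scene['objects']."""
    problems = []
    for obj in scene.get("objects", []):
        label = obj.get("id", "<no id>")
        if not -180.0 <= obj.get("azimuth", 0.0) <= 180.0:
            problems.append(label + ": azimuth outside ±180°")
        if not -90.0 <= obj.get("elevation", 0.0) <= 90.0:
            problems.append(label + ": elevation outside ±90°")
        if obj.get("distance", 1.0) <= 0.0:
            problems.append(label + ": distance must be positive (assumed rule)")
    return problems

scene = {
    "format": "v100_spatial_v1",
    "objects": [
        {"id": "commentator", "azimuth": 0.0, "elevation": 0.0,
         "distance": 2.0, "gain": 1.0},
        {"id": "crowd_left", "azimuth": -90.0, "elevation": 5.0,
         "distance": 15.0, "gain": 0.8},
    ],
}
problems = validate_scene_objects(scene)  # empty list for the scene above
```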
7 Endpoints. Full Control.
Create spatial scenes, position objects, stream head tracking, and render binaural output. All through a single REST + WebSocket API.
| Method | Endpoint | Description |
|---|---|---|
| POST | /v1/spatial/scenes | Create a new spatial audio scene with Ambisonics config |
| POST | /v1/spatial/scenes/{id}/objects | Add an audio object with position (azimuth, elevation, distance) |
| PATCH | /v1/spatial/objects/{id}/position | Update object position in real time (azimuth, elevation, distance, gain) |
| WS | /v1/spatial/scenes/{id}/tracking | WebSocket for head-tracking quaternion stream (device → V100) |
| POST | /v1/spatial/scenes/{id}/render | Render binaural output (returns stereo PCM or streams via WebSocket) |
| POST | /v1/spatial/scenes/{id}/export | Export metadata (MPEG-H, Dolby Atmos ADM, or V100 native) |
| GET | /v1/spatial/scenes/{id} | Retrieve scene state, all objects, and current head-tracking status |
# Create a spatial audio scene
curl -X POST https://api.v100.ai/v1/spatial/scenes \
  -H "Authorization: Bearer $V100_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Live Sports Mix",
    "ambisonics_order": 1,
    "normalization": "SN3D",
    "sample_rate": 48000,
    "head_tracking": true
  }'
# Response
{
  "scene_id": "sa_scene_7f3a...",
  "ws_url": "wss://api.v100.ai/v1/spatial/scenes/sa_scene_7f3a.../tracking",
  "status": "active"
}
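The same calls can be prepared from Python with only the standard library. This sketch builds the requests without sending them. The endpoint paths and the scene fields come from the table and curl example above, while the helper names, the `"sa_scene_example"` ID, and the add-object body (assumed to mirror the object fields in the native format) are illustrative assumptions.

```python
import json
import urllib.request

API_BASE = "https://api.v100.ai/v1/spatial"

def build_request(method, path, api_key, body=None):
    """Prepare an authenticated JSON request (nothing is sent here)."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    return urllib.request.Request(
        API_BASE + path,
        data=data,
        method=method,
        headers={
            "Authorization": "Bearer " + api_key,
            "Content-Type": "application/json",
        },
    )

def create_scene_request(api_key, name):
    """POST /v1/spatial/scenes with the fields from the curl example."""
    return build_request("POST", "/scenes", api_key, {
        "name": name,
        "ambisonics_order": 1,
        "normalization": "SN3D",
        "sample_rate": 48000,
        "head_tracking": True,
    })

def add_object_request(api_key, scene_id, obj):
    """POST /v1/spatial/scenes/{id}/objects; body schema is an assumption."""
    return build_request("POST", "/scenes/" + scene_id + "/objects", api_key, obj)

scene_req = create_scene_request("test-key", "Live Sports Mix")
object_req = add_object_request("test-key", "sa_scene_example", {
    "id": "commentator", "azimuth": 0.0, "elevation": 0.0,
    "distance": 2.0, "gain": 1.0,
})
# To send: urllib.request.urlopen(scene_req), then json.load() on the response.
```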
Build Immersive Audio Experiences
From live sports to virtual events, V100 spatial audio turns flat stereo into a fully immersive 3D soundfield. Start with our free tier.