SILENCE REMOVAL

Remove Dead Air.
Automatically.

A typical meeting recording is about 40% silence and filler. V100's silence removal API detects dead air through waveform analysis and transcript alignment, then cuts it out while preserving natural pacing and conversation flow.

Typical Results

40%: average dead air in recordings
0.3-5s: configurable threshold
2.4x: average watch-through improvement

How It Works

1. Waveform Energy Analysis

The audio waveform is segmented into 50ms frames. Each frame's RMS energy is computed and compared against a noise floor baseline derived from the first 2 seconds of audio (or a user-specified reference). Frames below the silence threshold (default: -40dB relative to peak) are marked as candidate silence regions.
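
The framing step described above can be sketched in a few lines of Python. This is a minimal illustration, not V100's implementation: the function names are hypothetical, the input is assumed to be mono samples as floats in [-1.0, 1.0], and the noise floor baseline is left out so that only the peak-relative threshold is shown.

```python
import math

def find_silent_frames(samples, sample_rate, frame_ms=50, threshold_db=-40.0):
    """Return one flag per frame: True when the frame's RMS is below the threshold.

    Assumption: threshold_db is measured relative to the peak absolute sample.
    """
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        # Entirely digital silence: every frame counts as silent.
        return [True] * math.ceil(len(samples) / frame_len)
    # Convert the dB threshold to a linear amplitude relative to the peak.
    threshold = peak * 10 ** (threshold_db / 20)
    flags = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        rms = math.sqrt(sum(s * s for s in frame) / len(frame))
        flags.append(rms < threshold)
    return flags

def frames_to_regions(flags, frame_ms=50):
    """Merge runs of consecutive silent frames into (start_s, end_s) regions."""
    regions = []
    run_start = None
    for i, silent in enumerate(flags):
        if silent and run_start is None:
            run_start = i
        elif not silent and run_start is not None:
            regions.append((run_start * frame_ms / 1000, i * frame_ms / 1000))
            run_start = None
    if run_start is not None:
        # The recording ends inside a silent run.
        regions.append((run_start * frame_ms / 1000, len(flags) * frame_ms / 1000))
    return regions
```

The regions returned here are only candidates. Whether a region is actually removed depends on the duration threshold and the transcript check in the following steps.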

2. Transcript Alignment

Simultaneously, the audio is transcribed with word-level timestamps. The transcript reveals gaps between words that represent natural pauses, filler words ("um", "uh", "like", "you know", "so", "basically"), and extended hesitations. These are cross-referenced with the waveform silence regions to distinguish between intentional dramatic pauses and unintentional dead air.
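
As a rough sketch of the cross-referencing idea, the Python below finds filler phrases and inter-word gaps in a word-level transcript, then keeps only the parts of low-energy regions that fall inside a transcript gap. The transcript shape (a list of dicts with `word`, `start`, and `end` in seconds) and all function names are assumptions made for illustration. The filler matching is deliberately naive: it would also flag "like" or "so" when they carry meaning, which a production system would need to handle.

```python
FILLERS = ["um", "uh", "like", "you know", "so", "basically"]

def find_fillers(words, fillers=FILLERS):
    """Return (start_s, end_s, phrase) spans for filler words and phrases."""
    # Try longer phrases first so "you know" matches before any single word.
    phrases = sorted((f.split() for f in fillers), key=len, reverse=True)
    tokens = [w["word"].lower().strip(".,!?") for w in words]
    spans = []
    i = 0
    while i < len(words):
        for phrase in phrases:
            n = len(phrase)
            if tokens[i:i + n] == phrase:
                spans.append((words[i]["start"], words[i + n - 1]["end"], " ".join(phrase)))
                i += n
                break
        else:
            i += 1
    return spans

def find_word_gaps(words, min_gap_s=0.3):
    """Return (start_s, end_s) gaps of at least min_gap_s between consecutive words."""
    gaps = []
    for prev, nxt in zip(words, words[1:]):
        if nxt["start"] - prev["end"] >= min_gap_s:
            gaps.append((prev["end"], nxt["start"]))
    return gaps

def confirm_silence(energy_regions, gaps):
    """Keep only the part of each low-energy region that lies inside a transcript gap."""
    confirmed = []
    for r_start, r_end in energy_regions:
        for g_start, g_end in gaps:
            start, end = max(r_start, g_start), min(r_end, g_end)
            if end > start:
                confirmed.append((start, end))
    return confirmed
```

Intersecting the two signals means quiet speech is never cut just because its energy is low, and a gap in the transcript is never cut if the audio there is not actually quiet (for example, music or laughter).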

3. Intelligent Cutting

Silence regions exceeding your configured threshold (0.3s to 5s) are removed with 80ms crossfade transitions to prevent audio clicks. A configurable "keep padding" (default: 150ms) is preserved on each side of remaining speech to maintain natural breathing rhythm. The video track is cut in sync with zero frame drift.
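
To make the threshold and padding rules concrete, here is a small Python sketch that turns a list of silence regions into the segments to keep, plus a linear crossfade over the samples at a join. Both functions are hypothetical illustrations. The actual fade curve and the way V100 handles overlapping regions are not documented here, so the linear fade and the merge logic are assumptions.

```python
def plan_keep_segments(duration_s, silence_regions, threshold_s=0.8, keep_padding_ms=150):
    """Return the (start_s, end_s) segments that survive after cutting long silences."""
    pad = keep_padding_ms / 1000
    cuts = []
    for start, end in sorted(silence_regions):
        if end - start <= threshold_s:
            continue  # Short pause: leave it untouched.
        # Leave `pad` seconds of silence next to the speech on each side.
        cut_start, cut_end = start + pad, end - pad
        if cut_end > cut_start:
            cuts.append((cut_start, cut_end))
    keep = []
    cursor = 0.0
    for cut_start, cut_end in cuts:
        if cut_start > cursor:
            keep.append((cursor, cut_start))
        cursor = max(cursor, cut_end)
    if cursor < duration_s:
        keep.append((cursor, duration_s))
    return keep

def crossfade(tail, head):
    """Linearly blend two equal-length sample lists: fade `tail` out while `head` fades in."""
    n = len(tail)
    return [tail[i] * (1 - i / n) + head[i] * (i / n) for i in range(n)]
```

At an 80ms crossfade and a 48 kHz sample rate, `tail` and `head` would each hold 3,840 samples per channel. The same keep segments would be applied to the video track so that audio and picture stay in sync.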

One Call. No FFmpeg.

Remove silence with a single API request. Configure thresholds, filler word detection, and padding.

terminal
curl -X POST https://api.v100.ai/v1/editor/edit \
  -H "Authorization: Bearer $V100_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "source": "s3://your-bucket/meeting-recording.mp4",
    "instructions": "Remove all silence longer than 0.8 seconds and remove filler words",
    "silence_options": {
      "threshold_seconds": 0.8,
      "remove_fillers": true,
      "filler_words": ["um", "uh", "like", "you know", "basically"],
      "keep_padding_ms": 150,
      "crossfade_ms": 80
    },
    "output": {
      "format": "mp4",
      "resolution": "source"
    },
    "webhook": "https://your-app.com/api/webhooks/v100"
  }'

# Response (immediate):
# {
#   "job_id": "job_sil_7f3a9b2c",
#   "status": "processing",
#   "estimated_seconds": 180
# }

Webhook payload on completion
{
  "job_id": "job_sil_7f3a9b2c",
  "status": "completed",
  "output_url": "https://cdn.v100.ai/out/7f3a9b2c.mp4",
  "stats": {
    "original_duration_seconds": 3612,
    "output_duration_seconds": 2247,
    "removed_seconds": 1365,
    "silence_segments_removed": 284,
    "filler_words_removed": 67,
    "reduction_percent": 37.8
  }
}
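
A webhook receiver might sanity-check the stats before trusting them. The Python sketch below parses a payload shaped like the one above and recomputes the reduction from the two durations. The function name is hypothetical, and the field names are taken from the example payload rather than from a published schema.

```python
import json

def summarize_job(payload_json):
    """Parse a completion payload and recompute the reduction from its durations."""
    payload = json.loads(payload_json)
    if payload.get("status") != "completed":
        return None  # Still processing or failed: nothing to summarize.
    stats = payload["stats"]
    original = stats["original_duration_seconds"]
    removed = original - stats["output_duration_seconds"]
    return {
        "job_id": payload["job_id"],
        "output_url": payload["output_url"],
        "removed_seconds": removed,
        "reduction_percent": round(100 * removed / original, 1),
    }
```

For the example payload, 3612 - 2247 gives 1365 seconds removed, and 1365 / 3612 rounds to 37.8%, matching the reported `removed_seconds` and `reduction_percent`.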

Use Cases

Podcasts

Interview recordings typically contain 15-25% dead air from thinking pauses, connection delays, and filler words. Removing these makes episodes tighter and more listenable without manual editing.

Meeting Recordings

A 60-minute meeting recording often contains 20-25 minutes of silence from screen sharing transitions, people joining and leaving, and "can you hear me?" troubleshooting. Cut it to a focused 35-40 minute recap.

Online Courses

Lecture recordings are full of pauses for writing, thinking, and slide transitions. Many students watch at 1.5-2x speed anyway; removing silence first makes normal-speed playback feel natural and saves bandwidth.

Cut the Dead Air

Free tier includes 60 minutes of processing per month. No credit card required.

