Send a plain-English command. Get back edited video. V100's API transcribes, cuts, captions, removes silence, and exports to common video and audio formats (MP4, WebM, MOV, GIF, MP3, WAV, FLAC), with no timeline, no UI, and no FFmpeg flags. Just one POST request.
curl -X POST https://api.v100.ai/v1/editor/edit \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "source": "https://storage.example.com/meeting.mp4",
    "instructions": "Remove all ums and ahs, cut to the best 60 seconds, add captions in Spanish",
    "output": { "format": "mp4", "resolution": "1080p" }
  }'
V100 parses natural language instructions, maps them to an optimized FFmpeg pipeline, and returns edited video. No preset names to memorize. No per-operation parameter schemas to learn.
"Remove the ums"
Detects filler words (um, uh, like, you know) via transcript alignment and cuts them with crossfade transitions. Typically removes 8-15% of runtime.
"Cut to 60 seconds"
Scores every sentence by information density and engagement signals, keeps the top segments, and assembles a coherent clip at your target duration.
"Add captions in Spanish"
Transcribes in the source language, translates to any of 20 supported languages, and burns captions or exports SRT/VTT sidecar files.
"Remove all silence"
Waveform analysis combined with transcript alignment detects pauses longer than your threshold (default 0.8s). Cuts dead air, keeps natural pacing.
"Make it vertical for TikTok"
Reframes 16:9 video to 9:16 with speaker tracking. Keeps faces centered using landmark detection. Adds safe zones for platform UI overlays.
"Extract the part about pricing"
Semantic search over the transcript finds the relevant segments, extracts them with context padding, and assembles a topical clip.
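The "Cut to 60 seconds" behavior described above can be pictured as a budgeted selection over scored transcript segments. The following is a minimal sketch and not V100's actual algorithm: the segment shape (`start`, `end`, `score`), the function name `selectSegments`, and the greedy strategy are all assumptions made for illustration.

```javascript
// Hypothetical segment shape: { start, end, score }, times in seconds.
// Greedy sketch: take the highest-scoring segments that still fit the
// target duration, then restore chronological order so the clip reads coherently.
function selectSegments(segments, targetSeconds) {
  const byScore = [...segments].sort((a, b) => b.score - a.score);
  const chosen = [];
  let total = 0;
  for (const seg of byScore) {
    const length = seg.end - seg.start;
    if (total + length <= targetSeconds) {
      chosen.push(seg);
      total += length;
    }
  }
  return chosen.sort((a, b) => a.start - b.start);
}

// Illustrative data only; the scores are made up.
const segments = [
  { start: 0, end: 20, score: 0.4 },
  { start: 20, end: 50, score: 0.9 },
  { start: 50, end: 80, score: 0.7 },
  { start: 80, end: 95, score: 0.8 },
];
const clip = selectSegments(segments, 60);
console.log(clip.map((s) => `${s.start}-${s.end}`).join(", "));
```

A greedy pass is easy to reason about but can leave part of the budget unused; a production system would likely also weigh continuity between neighboring segments.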
No separate services for transcription, captioning, and editing. V100 handles the full pipeline.
Word-level timestamps with 97%+ accuracy across 20 languages. Speaker diarization identifies up to 12 speakers. Returns JSON, SRT, VTT, or plain text.
POST /v1/transcribe
{ "source": "s3://bucket/video.mp4",
"language": "auto",
"diarize": true,
"format": "json" }
Dual-mode detection: waveform energy analysis for hard silence, plus transcript alignment for filler word gaps. Configurable threshold from 0.3s to 5s.
POST /v1/editor/edit
{ "source": "https://example.com/podcast.mp4",
"instructions": "Remove silence longer than 0.5 seconds",
"output": { "format": "mp4" } }
Generate captions in 20 languages with burned-in rendering or sidecar SRT/VTT export. Customize font, size, position, background color, and animation style.
POST /v1/captions
{ "source": "s3://bucket/video.mp4",
"languages": ["en", "es", "ja"],
"style": "burned_in",
"position": "bottom_center" }
Submit up to 10,000 videos per batch request. Parallel processing across our GPU cluster with webhook callbacks on completion. Process an entire content library overnight.
POST /v1/batch
{ "jobs": [
{ "source": "s3://b/vid1.mp4", "instructions": "..." },
{ "source": "s3://b/vid2.mp4", "instructions": "..." }
],
"webhook": "https://your-app.com/done" }
Export as MP4, WebM, MOV, GIF, or audio-only (MP3, WAV, FLAC). Set resolution, bitrate, codec (H.264, H.265, VP9, AV1), and frame rate per output.
POST /v1/editor/edit
{ "source": "...",
"instructions": "...",
"output": { "format": "webm", "codec": "vp9",
"resolution": "720p", "fps": 30 } }
All editing jobs run asynchronously. Get a job ID immediately, poll for status, or receive a webhook POST when processing completes. Includes progress percentage for long jobs.
GET /v1/jobs/job_abc123
// Response:
{ "status": "completed",
"progress": 100,
"output_url": "https://cdn.v100.ai/..." }
A complete Node.js integration that submits an edit job for a hosted source file, waits for it to finish, and prints the URL of the result.
// npm install v100-sdk
// Run as an ES module: the script uses import and top-level await.
import { V100 } from 'v100-sdk';

const v100 = new V100('YOUR_API_KEY');

// Submit the edit job; a job ID comes back immediately.
const job = await v100.editor.edit({
  source: 'https://storage.example.com/raw-podcast.mp4',
  instructions: 'Remove filler words and silence over 1 second. Add English captions. Cut to the best 3 minutes.',
  output: { format: 'mp4', resolution: '1080p' }
});

// Poll until the job completes, then print the URL of the edited video.
const result = await v100.jobs.wait(job.id);
console.log(result.output_url); // https://cdn.v100.ai/out/abc123.mp4
Supported input formats are MP4, MOV, WebM, MKV, AVI, FLV, and WMV, plus audio formats (MP3, WAV, M4A, FLAC). Files can be up to 10 GB. Provide a URL, S3 path, or upload directly via multipart form.
Transcription runs at roughly 10x real-time (a 60-minute video transcribes in ~6 minutes). Editing operations add 1-3 minutes depending on complexity. Batch jobs parallelize across our GPU cluster.
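As a rough planning aid, those figures can be turned into a back-of-the-envelope estimate. The helper below is a hypothetical sketch that only restates the numbers above (transcription at about one tenth of the video length, plus 1 to 3 minutes of editing); actual times will vary with job complexity and load.

```javascript
// Estimate total processing time in minutes for a single video.
function estimateMinutes(videoMinutes) {
  const transcription = videoMinutes / 10; // roughly 10x real-time
  return { min: transcription + 1, max: transcription + 3 };
}

console.log(estimateMinutes(60)); // { min: 7, max: 9 }
```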
Multiple instructions can be combined in a single request, for example: "Remove ums, add Spanish captions, cut to 90 seconds, export as vertical 9:16." The API plans an optimal pipeline and executes all steps in one pass.
Supported languages are English, Spanish, French, German, Portuguese, Italian, Dutch, Polish, Russian, Ukrainian, Japanese, Korean, Chinese (Simplified and Traditional), Arabic, Hindi, Turkish, Vietnamese, Thai, and Indonesian.
A free tier is available. It includes 60 minutes of processing per month, all features, and full API access. No credit card is required to start.
Free tier. No credit card required. First API call in under 2 minutes.