AUTO-CAPTIONS

Auto-Caption Any Video via API

Submit one captioning job to get word-level captions in any of 20 languages, with accuracy between 94.5% and 98.2% depending on the language. Burn them into the video, export as SRT/VTT, or both. No manual transcription. No timing adjustments.

94.5% to 98.2%
Accuracy, by language
20
Languages
10x
Faster than real-time

20 Languages. One Endpoint.

Pass a language code or set "language": "auto" for automatic detection. All languages support word-level timestamps.

Code  Language               Accuracy
EN    English                98.2%
ES    Spanish                97.5%
FR    French                 97.1%
DE    German                 96.8%
PT    Portuguese             97.0%
IT    Italian                96.5%
NL    Dutch                  96.3%
PL    Polish                 95.8%
RU    Russian                96.1%
UK    Ukrainian              95.4%
JA    Japanese               95.9%
KO    Korean                 95.7%
ZH    Chinese (Simplified)   96.0%
ZT    Chinese (Traditional)  95.6%
AR    Arabic                 94.8%
HI    Hindi                  95.2%
TR    Turkish                95.5%
VI    Vietnamese             94.9%
TH    Thai                   94.5%
ID    Indonesian             95.3%
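
A minimal sketch of checking a language code on the client before a job is submitted. The helper name resolveLanguage is hypothetical and not part of the SDK. It assumes the API accepts the lowercase form of the codes listed above (as the 'en', 'es', 'ja' values in the sample request suggest) plus "auto".

```javascript
// Hypothetical helper, not part of the v100 SDK: validates a language code
// against the 20 codes listed above before a request is sent.
const SUPPORTED_LANGUAGES = new Set([
  'en', 'es', 'fr', 'de', 'pt', 'it', 'nl', 'pl', 'ru', 'uk',
  'ja', 'ko', 'zh', 'zt', 'ar', 'hi', 'tr', 'vi', 'th', 'id'
]);

function resolveLanguage(code) {
  // Treat a missing code as a request for automatic detection.
  if (code === undefined || code === null || code === '') return 'auto';
  const normalized = String(code).trim().toLowerCase();
  if (normalized === 'auto' || SUPPORTED_LANGUAGES.has(normalized)) {
    return normalized;
  }
  throw new RangeError(`Unsupported language code: ${code}`);
}
```

Failing fast on an unknown code avoids submitting a job that the service would presumably reject anyway.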

Three Output Modes

Burned-In

Captions rendered directly into the video frames. Customizable font (family, size, weight), position (top/center/bottom), background (box, shadow, or transparent), and animation (fade, pop, typewriter).

"style": "burned_in",
"caption_options": {
  "font_size": 42,
  "position": "bottom_center",
  "background": "semi_transparent",
  "animation": "pop"
}

SRT File

SubRip format with sequential numbering and millisecond timestamps. Compatible with YouTube, Vimeo, Facebook, LinkedIn, and most major video players. One file per language.

1
00:00:01,240 --> 00:00:04,890
Welcome to today's deep dive
into our API architecture.

2
00:00:05,100 --> 00:00:08,340
We'll cover three main topics.
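
To make the SRT layout above concrete, here is a small sketch that builds the same structure from a list of cues. The cue shape ({ startMs, endMs, text }) and both function names are assumptions made for this example. They do not describe the API's response format.

```javascript
// Format a millisecond offset as an SRT timestamp: HH:MM:SS,mmm
function formatSrtTime(ms) {
  const total = Math.max(0, Math.round(ms));
  const hours = Math.floor(total / 3600000);
  const minutes = Math.floor((total % 3600000) / 60000);
  const seconds = Math.floor((total % 60000) / 1000);
  const millis = total % 1000;
  const pad = (n, width) => String(n).padStart(width, '0');
  return `${pad(hours, 2)}:${pad(minutes, 2)}:${pad(seconds, 2)},${pad(millis, 3)}`;
}

// Build an SRT document from cues shaped like { startMs, endMs, text }.
// The cue shape is an assumption for illustration only.
function toSrt(cues) {
  return cues
    .map((cue, i) =>
      `${i + 1}\n${formatSrtTime(cue.startMs)} --> ${formatSrtTime(cue.endMs)}\n${cue.text}`)
    .join('\n\n') + '\n';
}
```

Each cue gets a 1-based index, a timestamp line with a comma before the milliseconds, and its text. Cues are separated by a blank line.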

WebVTT File

W3C standard format with support for styling cues, positioning, and speaker identification. Native in all modern browsers via the HTML5 <track> element.

WEBVTT

00:00:01.240 --> 00:00:04.890
<v Speaker 1>Welcome to today's
deep dive into our API.

00:00:05.100 --> 00:00:08.340
<v Speaker 1>We'll cover three
main topics.
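
The two sidecar formats differ mainly in the file header and the millisecond separator. If only an SRT file is on hand, a conversion along these lines is one way to get a basic WebVTT file. The function name srtToVtt is hypothetical. The sketch keeps the SRT cue numbers as WebVTT cue identifiers and does not add speaker tags or styling.

```javascript
// Convert SRT text to basic WebVTT: add the WEBVTT header and switch the
// millisecond separator from a comma to a period on timing lines only.
function srtToVtt(srt) {
  const lines = srt.replace(/\r\n/g, '\n').trim().split('\n');
  const converted = lines.map((line) =>
    line.includes(' --> ')
      ? line.replace(/(\d{2}:\d{2}:\d{2}),(\d{3})/g, '$1.$2')
      : line
  );
  return `WEBVTT\n\n${converted.join('\n')}\n`;
}
```

Restricting the replacement to lines that contain the arrow leaves commas inside caption text untouched.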

Generate Captions in Two API Calls

Submit the job, then retrieve the result. That is the entire integration.

captions.js
import { V100 } from 'v100-sdk';

const v100 = new V100(process.env.V100_API_KEY);

// Generate captions in 3 languages simultaneously
const job = await v100.captions.create({
  source: 'https://storage.example.com/webinar.mp4',
  languages: ['en', 'es', 'ja'],
  style: 'burned_in',
  caption_options: {
    font_size: 38,
    position: 'bottom_center',
    background: 'semi_transparent',
    max_chars_per_line: 42
  },
  sidecar_formats: ['srt', 'vtt']
});

// Wait for the job to finish, then read the per-language outputs
const result = await v100.jobs.wait(job.id);
// result.outputs:
// - result.outputs.en.video_url  (burned-in MP4)
// - result.outputs.en.srt_url    (SRT file)
// - result.outputs.en.vtt_url    (VTT file)
// - result.outputs.es.video_url  ...
// - result.outputs.ja.video_url  ...
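
Once the job completes, it is often convenient to flatten the per-language outputs into a single list, for example to queue downloads. The sketch below assumes result.outputs has the shape shown in the comments above (one object per language with video_url, srt_url, and vtt_url keys). The helper name listOutputs is hypothetical.

```javascript
// Flatten an outputs object shaped like
// { en: { video_url, srt_url, vtt_url }, es: { ... } }
// into a list of { language, kind, url } records. Missing URLs are skipped.
function listOutputs(outputs) {
  const kinds = { video_url: 'video', srt_url: 'srt', vtt_url: 'vtt' };
  const records = [];
  for (const [language, files] of Object.entries(outputs ?? {})) {
    for (const [key, kind] of Object.entries(kinds)) {
      if (files && typeof files[key] === 'string') {
        records.push({ language, kind, url: files[key] });
      }
    }
  }
  return records;
}
```

With the three-language request above and both sidecar formats enabled, this would yield up to nine records: one video, one SRT, and one VTT per language.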

Caption Your First Video Free

Free tier: 60 minutes/month. Pay-as-you-go after that. No contracts.

Get API Key — Free Tier
