AUTO-CAPTIONS

Auto-Caption Any Video via API

One job request generates word-level captions in 20 languages, with per-language accuracy ranging from 94.5% (Thai) to 98.2% (English). Burn them into the video, export as SRT/VTT, or both. No manual transcription. No timing adjustments.

Up to 98.2% accuracy
20 languages
10x faster than real-time

20 Languages. One Endpoint.

Pass a language code or set "language": "auto" for automatic detection. All languages support word-level timestamps.

Code  Language               Accuracy
EN    English                98.2%
ES    Spanish                97.5%
FR    French                 97.1%
DE    German                 96.8%
PT    Portuguese             97.0%
IT    Italian                96.5%
NL    Dutch                  96.3%
PL    Polish                 95.8%
RU    Russian                96.1%
UK    Ukrainian              95.4%
JA    Japanese               95.9%
KO    Korean                 95.7%
ZH    Chinese (Simplified)   96.0%
ZT    Chinese (Traditional)  95.6%
AR    Arabic                 94.8%
HI    Hindi                  95.2%
TR    Turkish                95.5%
VI    Vietnamese             94.9%
TH    Thai                   94.5%
ID    Indonesian             95.3%
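
Before submitting a job, a client may want to check a language code against the supported list and fall back to automatic detection. The sketch below is one way to do that. The helper name `resolveLanguage` and the client-side check itself are illustrative assumptions, not part of the SDK, and the lowercase form of the codes is inferred from the `'en'`, `'es'`, `'ja'` values used in the SDK example on this page.

```javascript
// Codes taken from the language list on this page, lowercased (assumption:
// the API accepts the lowercase form, as in the SDK example).
const SUPPORTED_LANGUAGES = new Set([
  'en', 'es', 'fr', 'de', 'pt', 'it', 'nl', 'pl', 'ru', 'uk',
  'ja', 'ko', 'zh', 'zt', 'ar', 'hi', 'tr', 'vi', 'th', 'id'
]);

// Hypothetical helper: normalizes a user-supplied code, returns "auto" when
// nothing is given, and throws on codes that are not in the list.
function resolveLanguage(code) {
  if (code === undefined || code === null || code === '') return 'auto';
  const normalized = String(code).trim().toLowerCase();
  if (normalized === 'auto' || SUPPORTED_LANGUAGES.has(normalized)) {
    return normalized;
  }
  throw new Error(`Unsupported language code: ${code}`);
}
```

Rejecting an unknown code locally avoids spending a job submission on a request that cannot succeed.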

Three Output Modes

Burned-In

Captions rendered directly into the video frames. Customizable font (family, size, weight), position (top/center/bottom), background (box, shadow, or transparent), and animation (fade, pop, typewriter).

"style": "burned_in",
"caption_options": {
  "font_size": 42,
  "position": "bottom_center",
  "background": "semi_transparent",
  "animation": "pop"
}

SRT File

SubRip format with sequential numbering and millisecond timestamps. Compatible with YouTube, Vimeo, Facebook, LinkedIn, and every major video player. One file per language.

1
00:00:01,240 --> 00:00:04,890
Welcome to today's deep dive
into our API architecture.

2
00:00:05,100 --> 00:00:08,340
We'll cover three main topics.
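
To make the SRT layout above concrete, here is a minimal sketch that renders cues into SubRip text. The cue shape (`startMs`, `endMs`, `text`) and the function names are assumptions chosen for illustration. They are not the API's response format.

```javascript
// Format a millisecond offset as an SRT timestamp: HH:MM:SS,mmm
function srtTimestamp(ms) {
  const pad = (n, width) => String(n).padStart(width, '0');
  const h = Math.floor(ms / 3600000);
  const m = Math.floor((ms % 3600000) / 60000);
  const s = Math.floor((ms % 60000) / 1000);
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)},${pad(ms % 1000, 3)}`;
}

// Render cues as SRT: a 1-based index, a timing line, the text,
// and a blank line between consecutive cues.
function toSrt(cues) {
  return cues
    .map((cue, i) =>
      `${i + 1}\n${srtTimestamp(cue.startMs)} --> ${srtTimestamp(cue.endMs)}\n${cue.text}`)
    .join('\n\n') + '\n';
}
```

Calling `srtTimestamp(1240)` yields `00:00:01,240`, matching the first timing line in the sample.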

WebVTT File

W3C standard format with support for styling cues, positioning, and speaker identification. Native in all modern browsers via the HTML5 <track> element.

WEBVTT

00:00:01.240 --> 00:00:04.890
<v Speaker 1>Welcome to today's
deep dive into our API.

00:00:05.100 --> 00:00:08.340
<v Speaker 1>We'll cover three
main topics.
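
The same cues can be written as WebVTT. The format differs from SRT in three visible ways: a `WEBVTT` header line, a period instead of a comma before the milliseconds, and optional `<v Name>` voice tags for speakers. The sketch below assumes a hypothetical cue shape (`startMs`, `endMs`, `text`, optional `speaker`) and is not the API's response format.

```javascript
// Format a millisecond offset as a WebVTT timestamp: HH:MM:SS.mmm
function vttTimestamp(ms) {
  const pad = (n, width) => String(n).padStart(width, '0');
  const h = Math.floor(ms / 3600000);
  const m = Math.floor((ms % 3600000) / 60000);
  const s = Math.floor((ms % 60000) / 1000);
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)}.${pad(ms % 1000, 3)}`;
}

// Render cues as WebVTT. A cue with a speaker gets a <v ...> voice tag
// in front of its text; cues without one are emitted as plain text.
function toVtt(cues) {
  const blocks = cues.map((cue) => {
    const voice = cue.speaker ? `<v ${cue.speaker}>` : '';
    return `${vttTimestamp(cue.startMs)} --> ${vttTimestamp(cue.endMs)}\n${voice}${cue.text}`;
  });
  return ['WEBVTT', ...blocks].join('\n\n') + '\n';
}
```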

Generate Captions in Two API Calls

Submit the job, then retrieve the result. That is the entire integration.

captions.js

import { V100 } from 'v100-sdk';

const v100 = new V100(process.env.V100_API_KEY);

// Generate captions in 3 languages simultaneously
const job = await v100.captions.create({
  source: 'https://storage.example.com/webinar.mp4',
  languages: ['en', 'es', 'ja'],
  style: 'burned_in',
  caption_options: {
    font_size: 38,
    position: 'bottom_center',
    background: 'semi_transparent',
    max_chars_per_line: 42
  },
  sidecar_formats: ['srt', 'vtt']
});

const result = await v100.jobs.wait(job.id);
// result.outputs:
// - result.outputs.en.video_url  (burned-in MP4)
// - result.outputs.en.srt_url    (SRT file)
// - result.outputs.en.vtt_url    (VTT file)
// - result.outputs.es.video_url  ...
// - result.outputs.ja.video_url  ...
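
Once the job completes, the per-language outputs can be flattened into a list for downloading or logging. This sketch assumes that `result.outputs` has the shape suggested by the comments in captions.js, with one object per language whose keys end in `_url`. The helper name `listOutputUrls` is hypothetical.

```javascript
// Flatten { en: { video_url, srt_url, ... }, es: { ... } } into rows of
// { language, kind, url }. Keys that do not end in "_url" are skipped.
function listOutputUrls(outputs) {
  const rows = [];
  for (const [language, files] of Object.entries(outputs)) {
    for (const [key, url] of Object.entries(files)) {
      if (!key.endsWith('_url')) continue;
      rows.push({ language, kind: key.slice(0, -'_url'.length), url });
    }
  }
  return rows;
}
```

With the job above, `listOutputUrls(result.outputs)` would be expected to contain one row per language and file type, assuming the response matches that shape.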

Caption Your First Video Free

Related

Video Editing API

Auto-Caption API Developer Guide

Batch Processing
