Video

Your agent can't run media tooling or render video itself; this skill edits existing clips and generates new ones from a text prompt.

A complete video toolkit: edit existing footage (transcode, clip, concatenate, overlay, burn in subtitles, draw text) and generate short videos with audio from a text prompt in standard, fast, and high-quality tiers. One call produces MP4, WebM, GIF, and other formats.

Audio & Video · video · transcode · concat · overlay · subtitles · text · watermark · logo · convert · audio · mp4 · webm · gif · mp3

Tools (4)

Edit

Transcode, clip, concatenate, overlay, burn subtitles, or add text labels to video or audio.

Usage-based · 1 credit per second of output

Example prompts

  • Convert this MP4 to WebM for use in a web player
  • Extract the audio track from this video as an MP3 at 192k bitrate
  • Clip seconds 10 to 40 from this video and export as MP4
  • Concatenate these three clips into one MP4
  • Burn these subtitles into the video and export as MP4
  • Overlay this logo PNG in the top-right of the video from seconds 2 to 8
  • Add a 'LIVE' label in 64px red text with a black background box in the top-left of this video

Parameters

text · object · optional

Burn a single text label into the video, optionally with a background box and time window.

concat · array · optional

Additional clip URLs to concatenate after input_url. Output duration = sum of all clips. Up to 20 extra clips.

overlay · object · optional

Overlay another video or image on top of the primary input.

end_time · number · optional

Clip end time in seconds. Single-input mode only; ignored if concat is set.

input_url · string · required

Pre-signed GET URL of the input video or audio file. Obtain via POST /uploads/presign on faro-api.

video_crf · integer · optional

Video quality (CRF). 0=lossless, 18=visually lossless, 23=default H.264, 28=lower quality. Only applies to H.264 (mp4/mkv/avi) output.

expires_in · integer · optional · default: 3600

Download URL TTL in seconds (default 1h, max 24h).

resolution · string · optional

Output resolution, e.g. '1280x720'. Use '-2' for the dimension you want auto-scaled; for example, '1280:-2' fixes the width at 1280 and keeps the aspect ratio.

start_time · number · optional

Clip start time in seconds.

audio_bitrate · string · optional

Audio bitrate, e.g. '128k', '192k', '320k'.

output_format · string · required

Target container/format. Use mp3/aac/wav/flac/ogg to extract audio only.

subtitles_url · string · optional

Pre-signed GET URL of an SRT, VTT, or ASS subtitle file to burn into the video.

output_filename · string · optional

Filename for the output file. Defaults to `output.<ext>`.

API Usage

curl -X POST "https://skill.askfaro.com/skills/video/run" \
  -H "Authorization: Bearer faro_<your_key>" \
  -H "Content-Type: application/json" \
  -d '{
  "intent": {
    "prompt": "Convert this MP4 to WebM for use in a web player"
  }
}'

CLI Usage

askfaro describe video/transcode

Install with `pip install askfaro-cli`, then run `askfaro auth login`.

Generate

Generate a short video with audio from a text prompt at a balanced, standard quality, up to 4K. If the response comes back with a continuation_token, call again with only that token until the video is ready.

Usage-based · 125 credits per second at 720p, 150 at 1080p, 375 at 4K

Example prompts

  • Generate an 8-second cinematic clip of a fox running through a snowy forest at dawn
  • Create a 1080p time-lapse of clouds rolling over a mountain range, 8 seconds

Parameters

model · string · optional · default: "veo-3.1-fast-generate-preview"

Standard, balanced-quality video model.

prompt · string · optional

Text prompt describing the video to generate. Required on the first call; ignored when continuation_token is set.

resolution · string · optional · default: "720p"

Output resolution. 1080p and 4K force duration to 8 seconds.

aspect_ratio · string · optional · default: "16:9"

duration_seconds · integer · optional · default: 8

Clip length in seconds. Must be 8 when resolution is 1080p or 4k.

continuation_token · string · optional

Token from a prior pending response. When set, all other params are ignored and the server resumes polling; call again with only this token until the video is ready.
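The continuation flow described above can be sketched as a small loop. This is a minimal sketch, not an official client: `call_tool` is a hypothetical callable that sends one set of parameters to the Generate tool and returns the parsed JSON response as a dict. How those parameters are wrapped in the HTTP request body is not specified on this page, so the transport is left to the caller. The only response field taken from this page is `continuation_token`; the wait interval and the call cap are assumptions.

```python
import time
from typing import Any, Callable, Dict

Params = Dict[str, Any]


def generate_until_ready(
    call_tool: Callable[[Params], Params],
    prompt: str,
    wait_seconds: float = 5.0,  # assumed pause between calls; not documented
    max_calls: int = 120,       # assumed safety cap; not documented
    **options: Any,
) -> Params:
    """Call the tool once, then re-call with only the continuation token."""
    # First call: the prompt plus any optional params (resolution, aspect_ratio, ...).
    response = call_tool({"prompt": prompt, **options})
    calls = 1
    # A pending response carries a continuation_token; a finished one does not.
    while response.get("continuation_token"):
        if calls >= max_calls:
            raise TimeoutError(f"video not ready after {calls} calls")
        time.sleep(wait_seconds)
        # Send only the token: all other params are ignored once it is set.
        response = call_tool({"continuation_token": response["continuation_token"]})
        calls += 1
    return response
```

Passing the transport in as a callable keeps the loop independent of any particular HTTP library and makes it easy to exercise with a stub.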

API Usage

curl -X POST "https://skill.askfaro.com/skills/video/run" \
  -H "Authorization: Bearer faro_<your_key>" \
  -H "Content-Type: application/json" \
  -d '{
  "intent": {
    "prompt": "Generate an 8-second cinematic clip of a fox running through a snowy forest at dawn"
  }
}'

CLI Usage

askfaro describe video/generate

Install with `pip install askfaro-cli`, then run `askfaro auth login`.

Generate (Fast)

Generate a short video with audio from a text prompt fast and at the lowest cost, up to 1080p. If the response comes back with a continuation_token, call again with only that token until the video is ready.

Usage-based · 62.5 credits per second at 720p, 100 at 1080p

Example prompts

  • Generate a low-cost 6-second 720p clip of waves crashing on a beach
  • Make a quick 9:16 vertical clip of clouds rolling over a mountain range

Parameters

model · string · optional · default: "veo-3.1-lite-generate-preview"

Most affordable video model.

prompt · string · optional

Text prompt describing the video to generate. Required on the first call; ignored when continuation_token is set.

resolution · string · optional · default: "720p"

Output resolution. 1080p forces duration to 8 seconds.

aspect_ratio · string · optional · default: "16:9"

duration_seconds · integer · optional · default: 8

Clip length in seconds. Must be 8 when resolution is 1080p.

continuation_token · string · optional

Token from a prior pending response. When set, all other params are ignored and the server resumes polling; call again with only this token until the video is ready.

API Usage

curl -X POST "https://skill.askfaro.com/skills/video/run" \
  -H "Authorization: Bearer faro_<your_key>" \
  -H "Content-Type: application/json" \
  -d '{
  "intent": {
    "prompt": "Generate a low-cost 6-second 720p clip of waves crashing on a beach"
  }
}'

CLI Usage

askfaro describe video/generate_fast

Install with `pip install askfaro-cli`, then run `askfaro auth login`.

Generate (HQ)

Generate a short video with audio from a text prompt at the highest quality, up to 4K. If the response comes back with a continuation_token, call again with only that token until the video is ready.

Usage-based · 500 credits per second at 720p or 1080p, 750 at 4K

Example prompts

  • Generate a high-quality 8-second 4K clip of a fox running through a snowy forest at dawn
  • Create a cinematic 1080p product teaser, 8 seconds, at the highest quality

Parameters

model · string · optional · default: "veo-3.1-generate-preview"

Highest-quality video model.

prompt · string · optional

Text prompt describing the video to generate. Required on the first call; ignored when continuation_token is set.

resolution · string · optional · default: "720p"

Output resolution. 1080p and 4k force duration_seconds to 8. The lite model used by Generate (Fast) does not support 4k.

aspect_ratio · string · optional · default: "16:9"

duration_seconds · integer · optional · default: 8

Clip length in seconds. Must be 8 when resolution is 1080p or 4k.

continuation_token · string · optional

Token from a prior pending response. When set, all other params are ignored and the server resumes polling; call again with only this token until the video is ready.

API Usage

curl -X POST "https://skill.askfaro.com/skills/video/run" \
  -H "Authorization: Bearer faro_<your_key>" \
  -H "Content-Type: application/json" \
  -d '{
  "intent": {
    "prompt": "Generate a high-quality 8-second 4K clip of a fox running through a snowy forest at dawn"
  }
}'

CLI Usage

askfaro describe video/generate_hq

Install with `pip install askfaro-cli`, then run `askfaro auth login`.
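Taken together, the per-second rates quoted for the three generation tools can be compared with a small calculator. This is a rough sketch under stated assumptions: the rates are copied from the pricing lines above, the tool keys reuse the names from the CLI examples, and the cost is assumed to be simply rate × clip length, with 1080p and 4k clips always counted as 8 seconds. Actual billing and rounding may differ.

```python
# Credits per second of generated video, copied from the pricing lines above.
RATES = {
    "generate": {"720p": 125.0, "1080p": 150.0, "4k": 375.0},
    "generate_fast": {"720p": 62.5, "1080p": 100.0},
    "generate_hq": {"720p": 500.0, "1080p": 500.0, "4k": 750.0},
}
FORCED_DURATION = 8  # 1080p and 4k clips are documented as 8 seconds long


def generation_cost(tool: str, resolution: str = "720p", duration_seconds: int = 8) -> float:
    """Estimate the credit cost of one clip (assumes cost = rate * seconds)."""
    rates = RATES[tool]
    if resolution not in rates:
        raise ValueError(f"{tool} does not offer {resolution}")
    if resolution in ("1080p", "4k"):
        # The requested duration is overridden at these resolutions.
        duration_seconds = FORCED_DURATION
    return rates[resolution] * duration_seconds


print(generation_cost("generate_fast", "720p", 6))  # 375.0
print(generation_cost("generate", "1080p", 6))      # 1200.0 (billed as 8 seconds)
print(generation_cost("generate_hq", "4k"))         # 6000.0
```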

README

Video Editor

Transcode, clip, concatenate, overlay, and burn subtitles into video or audio, all in a single call.

Workflow

  1. Call POST /uploads/presign on faro-api to get a put_url + get_url pair for each input you need (primary, extra clips, overlay, subtitle file).
  2. PUT your files to the put_urls.
  3. Call this tool with input_url = get_url plus any extras you want.
  4. Download the result from the returned download_url (default 1h TTL, max 24h).
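The four steps above can be sketched in order as follows. This is a minimal outline rather than an official client: `presign`, `put_file`, and `run_edit` are hypothetical callables supplied by the caller, since the faro-api base URL and the request body for direct tool parameters are not spelled out here. Only the field names `put_url`, `get_url`, `input_url`, `output_format`, and `download_url` come from this page.

```python
from typing import Any, Callable, Dict

JsonDict = Dict[str, Any]


def edit_video(
    presign: Callable[[], JsonDict],
    put_file: Callable[[str, bytes], None],
    run_edit: Callable[[JsonDict], JsonDict],
    source: bytes,
    output_format: str,
    **options: Any,
) -> str:
    """Run the four workflow steps for a single input and return download_url."""
    # Step 1: ask faro-api for a put_url / get_url pair.
    urls = presign()
    # Step 2: upload the source bytes to the pre-signed PUT URL.
    put_file(urls["put_url"], source)
    # Step 3: call the tool with input_url set to the matching get_url.
    params: JsonDict = {
        "input_url": urls["get_url"],
        "output_format": output_format,
        **options,
    }
    result = run_edit(params)
    # Step 4: hand back the link; the caller downloads it before it expires.
    return result["download_url"]
```

Extra clips, an overlay, or a subtitle file would each need their own presign and upload before their `get_url` is passed in through `options`.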

Supported formats

| Video output | Audio output |
| --- | --- |
| mp4, webm, gif, mkv, avi | mp3, aac, wav, flac, ogg |

Input format is auto-detected. Subtitle files: SRT, VTT, or ASS.

Options

Core (single-input transcode)

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| input_url | string | required | Pre-signed GET URL of the source file. |
| output_format | string | required | Target format. Audio formats strip video automatically. |
| output_filename | string | output.<ext> | Filename in the download URL. |
| expires_in | integer | 3600 | Download URL TTL in seconds (60–86400). |
| video_crf | integer | - | Video quality (CRF 0–51). 18 = visually lossless, 23 = default. Lower is better. |
| audio_bitrate | string | - | Audio bitrate, e.g. 128k, 192k, 320k. |
| resolution | string | - | Output resolution, e.g. 1280x720. Use -2 to auto-scale, e.g. 1280:-2. |
| start_time | number | - | Clip start in seconds. Single-input mode only. |
| end_time | number | - | Clip end in seconds. Single-input mode only. |

Multi-clip and effects (optional)

| Name | Type | Description |
| --- | --- | --- |
| concat | array of URLs | Additional clip URLs to concatenate after input_url. Output = primary + concat clips in order. Up to 20 extras. |
| overlay | object | Overlay another video/image on top of the primary input. Fields: url, x, y, optional start_time, end_time. |
| subtitles_url | string | Pre-signed GET URL of an SRT/VTT/ASS subtitle file to burn into the video. |
| text | object | Burn a single text label into the video, optionally with a background box and time window. See fields below. |
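Several of the limits listed in the option tables can be checked on the client before a call is made. The helper below is a hypothetical sketch, not part of any Faro SDK: `build_edit_params` is an invented name, and it only covers the documented format list, the 60–86400 range for `expires_in`, the 0–51 range for `video_crf`, and the 20-clip cap on `concat`. The server may apply further checks.

```python
from typing import Any, Dict, List, Optional

VIDEO_FORMATS = {"mp4", "webm", "gif", "mkv", "avi"}
AUDIO_FORMATS = {"mp3", "aac", "wav", "flac", "ogg"}


def build_edit_params(
    input_url: str,
    output_format: str,
    concat: Optional[List[str]] = None,
    expires_in: int = 3600,
    video_crf: Optional[int] = None,
) -> Dict[str, Any]:
    """Validate a few documented limits and return a parameter dict."""
    if output_format not in VIDEO_FORMATS | AUDIO_FORMATS:
        raise ValueError(f"unsupported output_format: {output_format}")
    if not 60 <= expires_in <= 86400:
        raise ValueError("expires_in must be between 60 and 86400 seconds")
    if concat is not None and len(concat) > 20:
        raise ValueError("concat accepts at most 20 extra clips")
    if video_crf is not None and not 0 <= video_crf <= 51:
        raise ValueError("video_crf must be between 0 and 51")

    params: Dict[str, Any] = {
        "input_url": input_url,
        "output_format": output_format,
        "expires_in": expires_in,
    }
    # Only include optional fields the caller actually set.
    if concat:
        params["concat"] = list(concat)
    if video_crf is not None:
        params["video_crf"] = video_crf
    return params
```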

Overlay object

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| url | string | required | Pre-signed GET URL of the overlay video or image. |
| x | integer | 0 | Horizontal position of overlay top-left (px). |
| y | integer | 0 | Vertical position of overlay top-left (px). |
| start_time | number | - | Time in seconds when overlay appears. Omit for full duration. |
| end_time | number | - | Time in seconds when overlay disappears. |

Text object

Draws a single text label, optionally inside a coloured box, optionally only between start_time and end_time.

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| text | string | required | Text to burn into the video. |
| x | integer | 10 | Horizontal position (px). |
| y | integer | 10 | Vertical position (px). |
| font_size | integer | 32 | Font size in pixels (6–400). |
| color | string | white | Text color (CSS name or hex). |
| background_color | string | - | If set, a coloured box is drawn behind the text. |
| background_opacity | number | 0.5 | Box opacity 0.0–1.0. |
| padding | integer | 8 | Padding (px) around the text inside the box. |
| start_time | number | - | Time in seconds when text appears. Omit for full duration. |
| end_time | number | - | Time in seconds when text disappears. |
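As an illustration of the text object, the 'LIVE' label from the example prompts (64px red text on a black box in the top-left corner) might be assembled like this. `text_label` is a hypothetical helper, not part of any Faro SDK. It fills in the documented defaults explicitly, which assumes the service accepts them when sent, and it checks only the two ranges given in the field list.

```python
from typing import Any, Dict

# Defaults copied from the field list above.
TEXT_DEFAULTS: Dict[str, Any] = {
    "x": 10,
    "y": 10,
    "font_size": 32,
    "color": "white",
    "background_opacity": 0.5,
    "padding": 8,
}


def text_label(text: str, **fields: Any) -> Dict[str, Any]:
    """Merge caller fields over the defaults and check the documented ranges."""
    if not text:
        raise ValueError("text is required")
    label: Dict[str, Any] = {"text": text, **TEXT_DEFAULTS, **fields}
    if not 6 <= label["font_size"] <= 400:
        raise ValueError("font_size must be between 6 and 400")
    if not 0.0 <= label["background_opacity"] <= 1.0:
        raise ValueError("background_opacity must be between 0.0 and 1.0")
    return label


# Top-left placement comes from the default x=10, y=10.
live = text_label("LIVE", font_size=64, color="red", background_color="black")
```

The resulting dict would be passed as the `text` parameter of the Edit tool; adding `start_time` and `end_time` would limit the label to a time window.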

Output

{
  "download_url": "https://…/output/…/output.mp4",
  "key": "output/…",
  "size_bytes": 4821903,
  "mime": "video/mp4",
  "input_bytes": 52428800,
  "duration_seconds": 62.4,
  "expires_at": "2026-05-15T13:00:00Z"
}
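A caller will usually want to know how long the download link stays valid and how much the file shrank. The sketch below reads those from a response shaped like the sample above; `summarize_output` is a hypothetical helper, and it assumes `expires_at` is always an ISO 8601 UTC timestamp ending in `Z`, as in the sample.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def summarize_output(result: Dict[str, Any], now: Optional[datetime] = None) -> Dict[str, float]:
    """Return seconds until the link expires and the output/input size ratio."""
    # Older Python versions reject a trailing "Z", so swap it for an explicit offset.
    expires_at = datetime.fromisoformat(result["expires_at"].replace("Z", "+00:00"))
    if now is None:
        now = datetime.now(timezone.utc)
    return {
        "seconds_left": max(0.0, (expires_at - now).total_seconds()),
        "size_ratio": result["size_bytes"] / result["input_bytes"],
    }
```

For the sample response, one hour before `expires_at` this reports 3600 seconds left and a size ratio of roughly 0.092.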

Pricing

1 credit per second of output duration. A 60-second video costs 60 credits. Concatenating five 30-second clips into one 150-second video costs 150 credits. Adding an overlay or subtitles doesn't change the price; only output duration counts.
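The arithmetic can be written down as a one-line estimate. This is only a sketch: `edit_cost` is a hypothetical helper, and how fractional seconds are rounded for billing is not stated, so the result is left unrounded.

```python
from typing import Sequence

CREDITS_PER_SECOND = 1  # Edit tool rate quoted above


def edit_cost(clip_seconds: Sequence[float]) -> float:
    """Estimate credits for an edit: output duration is the sum of all clips."""
    return CREDITS_PER_SECOND * sum(clip_seconds)


print(edit_cost([60]))      # 60
print(edit_cost([30] * 5))  # 150
```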

Max input size: 500 MB per file. Max encode time: 10 minutes.

GIF output

GIF output is automatically downsampled to 10 fps for manageable file sizes. For a custom resolution, set the resolution field.