Your agent can't run media tooling or render video itself; this skill edits existing clips and generates new ones from a prompt.
A complete video toolkit: edit existing footage (transcode, clip, concatenate, overlay, burn in subtitles, draw text) and generate short videos with audio from a text prompt in fast, standard, and high-quality tiers. One call, MP4/WebM/GIF and more out.
Transcode, clip, concatenate, overlay, burn subtitles, or add text labels to video or audio.
Burn a single text label into the video, optionally with a background box and time window.
Additional clip URLs to concatenate after input_url. Output duration = sum of all clips. Up to 20 extra clips.
Overlay another video or image on top of the primary input.
Clip end time in seconds. Single-input mode only; ignored if concat is set.
Pre-signed GET URL of the input video or audio file. Obtain via POST /uploads/presign on faro-api.
Video quality (CRF). 0=lossless, 18=visually lossless, 23=default H.264, 28=lower quality. Only applies to H.264 (mp4/mkv/avi) output.
Download URL TTL in seconds (default 3600 = 1h, max 86400 = 24h).
Output resolution, e.g. '1280x720'. To keep the aspect ratio, set one dimension to '-2' and it is scaled automatically, e.g. '1280:-2'.
Clip start time in seconds.
Audio bitrate, e.g. '128k', '192k', '320k'.
Target container/format. Use mp3/aac/wav/flac/ogg to extract audio only.
Pre-signed GET URL of an SRT or VTT subtitle file to burn into the video.
Filename for the output file. Defaults to `output.<ext>`.
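To illustrate the two `resolution` forms described above, here is a minimal sketch that resolves a resolution string to concrete output dimensions. It assumes `-2` behaves as it does in FFmpeg's scale filter (aspect ratio preserved, result rounded to an even number); the service's actual rounding is not documented here, and `resolve_resolution` is a hypothetical helper, not part of the API.

```python
import re


def _even(value: float) -> int:
    """Round to the nearest even integer, never below 2."""
    return max(2, round(value / 2) * 2)


def resolve_resolution(spec: str, src_w: int, src_h: int) -> tuple[int, int]:
    """Resolve '1280x720' or '1280:-2' against the source dimensions.

    Assumption: -2 means "derive this side from the aspect ratio and keep
    it even", as in FFmpeg's scale filter.
    """
    match = re.fullmatch(r"(-?\d+)[x:](-?\d+)", spec)
    if not match:
        raise ValueError(f"unrecognised resolution: {spec!r}")
    width, height = int(match.group(1)), int(match.group(2))
    if width == -2 and height == -2:
        raise ValueError("only one dimension can be auto-scaled")
    if width == -2:
        width = _even(src_w * height / src_h)
    elif height == -2:
        height = _even(src_h * width / src_w)
    if width <= 0 or height <= 0:
        raise ValueError(f"invalid dimensions in {spec!r}")
    return width, height
```

For a 1920x1080 source, `resolve_resolution("1280:-2", 1920, 1080)` gives `(1280, 720)` under these assumptions.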
curl -X POST "https://skill.askfaro.com/skills/video/run" \
  -H "Authorization: Bearer faro_<your_key>" \
  -H "Content-Type: application/json" \
  -d '{
    "intent": {
      "prompt": "Convert this MP4 to WebM for use in a web player"
    }
  }'
askfaro describe video/transcode
Install with `pip install askfaro-cli`, then run `askfaro auth login`.
Generate a short video with audio from a text prompt at a balanced, standard quality, up to 4K. If the response comes back with a continuation_token, call again with only that token until the video is ready.
Standard, balanced-quality video model.
Text prompt describing the video to generate. Required on the first call; ignored when continuation_token is set.
Output resolution. 1080p and 4K force duration to 8 seconds.
Clip length in seconds. Must be 8 when resolution is 1080p or 4k.
Token from a prior pending response. When set, all other params are ignored and the server resumes polling; call again with only this token until the video is ready.
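The continuation flow described above can be sketched as a small polling loop. This is only an illustration: `call` is a hypothetical function you supply that sends one request and returns the parsed response, and the sketch assumes `continuation_token` appears as a top-level key in both the request parameters and the response, which this page does not confirm. The retry count and delay are arbitrary.

```python
import time
from typing import Any, Callable


def generate_until_ready(
    call: Callable[[dict[str, Any]], dict[str, Any]],
    first_request: dict[str, Any],
    max_calls: int = 60,
    delay_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, Any]:
    """Call repeatedly until the response no longer carries a token."""
    request = first_request
    for _ in range(max_calls):
        response = call(request)
        token = response.get("continuation_token")
        if not token:
            return response  # no token: treat the response as final
        # Pending: the next call carries only the token, since all other
        # params are ignored once it is set.
        request = {"continuation_token": token}
        sleep(delay_seconds)
    raise TimeoutError(f"video not ready after {max_calls} calls")
```

Passing `sleep` as a parameter keeps the loop easy to exercise without waiting in real time.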
curl -X POST "https://skill.askfaro.com/skills/video/run" \
  -H "Authorization: Bearer faro_<your_key>" \
  -H "Content-Type: application/json" \
  -d '{
    "intent": {
      "prompt": "Generate an 8-second cinematic clip of a fox running through a snowy forest at dawn"
    }
  }'
askfaro describe video/generate
Generate a short video with audio from a text prompt fast and at the lowest cost, up to 1080p. If the response comes back with a continuation_token, call again with only that token until the video is ready.
Most affordable video model.
Text prompt describing the video to generate. Required on the first call; ignored when continuation_token is set.
Output resolution. 1080p forces duration to 8 seconds.
Clip length in seconds. Must be 8 when resolution is 1080p or 4k.
Token from a prior pending response. When set, all other params are ignored and the server resumes polling; call again with only this token until the video is ready.
curl -X POST "https://skill.askfaro.com/skills/video/run" \
  -H "Authorization: Bearer faro_<your_key>" \
  -H "Content-Type: application/json" \
  -d '{
    "intent": {
      "prompt": "Generate a low-cost 6-second 720p clip of waves crashing on a beach"
    }
  }'
askfaro describe video/generate_fast
Generate a short video with audio from a text prompt at the highest quality, up to 4K. If the response comes back with a continuation_token, call again with only that token until the video is ready.
Highest-quality video model.
Text prompt describing the video to generate. Required on the first call; ignored when continuation_token is set.
Output resolution. 1080p/4k force duration_seconds=8. Lite does not support 4k.
Clip length in seconds. Must be 8 when resolution is 1080p or 4k.
Token from a prior pending response. When set, all other params are ignored and the server resumes polling; call again with only this token until the video is ready.
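The resolution and duration rules for the three generation tiers can be checked client-side before a request is sent. The sketch below is an assumption-laden illustration: it takes the tier names from the `askfaro describe` commands on this page, assumes the accepted resolution values are `720p`, `1080p`, and `4k`, and rejects mismatched durations outright even though the server may simply force them to 8. `check_generate_params` is a hypothetical helper.

```python
# Ordering of the resolution values assumed for this sketch.
RESOLUTION_RANK = {"720p": 0, "1080p": 1, "4k": 2}

# Highest resolution per tier, as stated in the tier descriptions.
MAX_RESOLUTION = {
    "generate": "4k",
    "generate_fast": "1080p",
    "generate_hq": "4k",
}


def check_generate_params(tier: str, resolution: str, duration_seconds: int) -> None:
    """Raise ValueError if the combination breaks a documented rule."""
    res = resolution.lower()
    if tier not in MAX_RESOLUTION:
        raise ValueError(f"unknown tier: {tier!r}")
    if res not in RESOLUTION_RANK:
        raise ValueError(f"unknown resolution: {resolution!r}")
    if RESOLUTION_RANK[res] > RESOLUTION_RANK[MAX_RESOLUTION[tier]]:
        raise ValueError(f"{tier} supports at most {MAX_RESOLUTION[tier]}")
    # 1080p and 4k clips must be exactly 8 seconds long.
    if res in ("1080p", "4k") and duration_seconds != 8:
        raise ValueError(f"{res} requires duration_seconds=8")
```

For example, `check_generate_params("generate_fast", "4k", 8)` raises, while `check_generate_params("generate_hq", "4k", 8)` passes silently.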
curl -X POST "https://skill.askfaro.com/skills/video/run" \
  -H "Authorization: Bearer faro_<your_key>" \
  -H "Content-Type: application/json" \
  -d '{
    "intent": {
      "prompt": "Generate a high-quality 8-second 4K clip of a fox running through a snowy forest at dawn"
    }
  }'
askfaro describe video/generate_hq
Transcode, clip, concatenate, overlay, and burn subtitles into video or audio, all in a single call.
1. POST /uploads/presign on faro-api to get a put_url + get_url pair for each input you need (primary, extra clips, overlay, subtitle file).
2. Upload each file to its put_url.
3. Call the skill with input_url = get_url plus any extras you want.
4. Download the result from download_url (default 1h TTL, max 24h).

| Video output | Audio output |
|---|---|
| mp4, webm, gif, mkv, avi | mp3, aac, wav, flac, ogg |
Input format is auto-detected. Subtitle files: SRT, VTT, or ASS.
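The presign-and-upload part of the flow above might be orchestrated roughly as follows. The request body and authentication for `POST /uploads/presign` are not shown on this page, so the sketch takes two hypothetical callables instead of making HTTP calls itself: `presign()` is assumed to return a dict with `put_url` and `get_url`, and `upload(put_url, data)` is assumed to PUT the bytes.

```python
from typing import Callable


def stage_inputs(
    files: dict[str, bytes],
    presign: Callable[[], dict[str, str]],
    upload: Callable[[str, bytes], None],
) -> dict[str, str]:
    """Upload each input and return its get_url, keyed by role.

    `files` maps a role of your choosing (e.g. "primary", "overlay",
    "subtitles") to the file contents.
    """
    get_urls: dict[str, str] = {}
    for role, data in files.items():
        pair = presign()                  # one put_url/get_url pair per input
        upload(pair["put_url"], data)     # send the bytes to the put_url
        get_urls[role] = pair["get_url"]  # later passed as input_url etc.
    return get_urls
```

The returned URLs are what you would pass as `input_url`, `subtitles_url`, the overlay `url`, or entries in `concat`.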
| Name | Type | Default | Description |
|---|---|---|---|
| input_url | string | required | Pre-signed GET URL of the source file. |
| output_format | string | required | Target format. Audio formats strip video automatically. |
| output_filename | string | output.<ext> | Filename in the download URL. |
| expires_in | integer | 3600 | Download URL TTL in seconds (60–86400). |
| video_crf | integer | - | Video quality (CRF 0–51). 18 = visually lossless, 23 = default. Lower is better. |
| audio_bitrate | string | - | Audio bitrate, e.g. 128k, 192k, 320k. |
| resolution | string | - | Output resolution, e.g. 1280x720. Use -2 to auto-scale, e.g. 1280:-2. |
| start_time | number | - | Clip start in seconds. Single-input mode only. |
| end_time | number | - | Clip end in seconds. Single-input mode only. |

| Name | Type | Description |
|---|---|---|
| concat | array of URLs | Additional clip URLs to concatenate after input_url. Output = primary + concat clips in order. Up to 20 extras. |
| overlay | object | Overlay another video/image on top of the primary input. Fields: url, x, y, optional start_time, end_time. |
| subtitles_url | string | Pre-signed GET URL of an SRT/VTT/ASS subtitle file to burn into the video. |
| text | object | Burn a single text label into the video, optionally with a background box and time window. See fields below. |

Fields of the overlay object:

| Field | Type | Default | Description |
|---|---|---|---|
| url | string | required | Pre-signed GET URL of the overlay video or image. |
| x | integer | 0 | Horizontal position of overlay top-left (px). |
| y | integer | 0 | Vertical position of overlay top-left (px). |
| start_time | number | - | Time in seconds when overlay appears. Omit for full duration. |
| end_time | number | - | Time in seconds when overlay disappears. |
Draws a single text label, optionally inside a coloured box, optionally only between start_time and end_time.
| Field | Type | Default | Description |
|---|---|---|---|
| text | string | required | Text to burn into the video. |
| x | integer | 10 | Horizontal position (px). |
| y | integer | 10 | Vertical position (px). |
| font_size | integer | 32 | Font size in pixels (6–400). |
| color | string | white | Text color (CSS name or hex). |
| background_color | string | - | If set, a coloured box is drawn behind the text. |
| background_opacity | number | 0.5 | Box opacity 0.0–1.0. |
| padding | integer | 8 | Padding (px) around the text inside the box. |
| start_time | number | - | Time in seconds when text appears. Omit for full duration. |
| end_time | number | - | Time in seconds when text disappears. |
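As a rough illustration of how the documented ranges fit together, the sketch below assembles an editing parameter dict and checks the limits listed in the tables above. `build_edit_params` is a hypothetical helper, and how these parameters are wrapped in the request body is not specified here (the curl example only shows an `intent` prompt), so the function returns a plain dict and nothing more.

```python
VIDEO_FORMATS = {"mp4", "webm", "gif", "mkv", "avi"}
AUDIO_FORMATS = {"mp3", "aac", "wav", "flac", "ogg"}


def build_edit_params(input_url: str, output_format: str, **options) -> dict:
    """Collect parameters and reject values outside the documented ranges."""
    if output_format not in VIDEO_FORMATS | AUDIO_FORMATS:
        raise ValueError(f"unsupported output_format: {output_format!r}")
    params = {"input_url": input_url, "output_format": output_format}
    params.update({k: v for k, v in options.items() if v is not None})

    if not 60 <= params.get("expires_in", 3600) <= 86400:
        raise ValueError("expires_in must be 60-86400 seconds")
    if "video_crf" in params and not 0 <= params["video_crf"] <= 51:
        raise ValueError("video_crf must be 0-51")
    if len(params.get("concat", [])) > 20:
        raise ValueError("concat accepts up to 20 extra clips")

    text = params.get("text")
    if text is not None:
        if not text.get("text"):
            raise ValueError("text.text is required")
        if not 6 <= text.get("font_size", 32) <= 400:
            raise ValueError("text.font_size must be 6-400")
        if not 0.0 <= text.get("background_opacity", 0.5) <= 1.0:
            raise ValueError("text.background_opacity must be 0.0-1.0")

    if params.get("concat"):
        # start_time/end_time are single-input only, so drop them here
        # rather than send values that would be ignored.
        params.pop("start_time", None)
        params.pop("end_time", None)
    return params
```

Dropping the clip times when `concat` is present is a design choice of this sketch; you could equally raise an error to surface the conflict to the caller.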
{
  "download_url": "https://…/output/…/output.mp4",
  "key": "output/…",
  "size_bytes": 4821903,
  "mime": "video/mp4",
  "input_bytes": 52428800,
  "duration_seconds": 62.4,
  "expires_at": "2026-05-15T13:00:00Z"
}
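A caller will typically want to know how long the `download_url` stays valid. The following sketch reads the response fields shown above; `summarize_result` is a hypothetical helper, and it assumes `expires_at` is always an ISO 8601 UTC timestamp ending in `Z`, as in the sample.

```python
from datetime import datetime, timezone
from typing import Optional


def summarize_result(result: dict, now: Optional[datetime] = None) -> dict:
    """Report time left on the download URL and output/input size ratio."""
    now = now or datetime.now(timezone.utc)
    # fromisoformat() in older Python versions rejects a trailing "Z",
    # so rewrite it as an explicit UTC offset first.
    expires_at = datetime.fromisoformat(result["expires_at"].replace("Z", "+00:00"))
    input_bytes = result.get("input_bytes")
    return {
        "seconds_until_expiry": (expires_at - now).total_seconds(),
        "expired": now >= expires_at,
        "size_ratio": result["size_bytes"] / input_bytes if input_bytes else None,
    }
```

Checking `expired` before downloading avoids a failed fetch once the TTL has elapsed.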
1 credit per second of output duration. A 60-second video costs 60 credits. Concatenating five 30-second clips into one 150-second video costs 150 credits. Adding an overlay or subtitles doesn't change the price; only output duration counts.
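The pricing rule above reduces to a one-line calculation. In this sketch, `estimate_credits` is a hypothetical helper, and rounding fractional seconds up is an assumption: the page does not say how a duration such as 62.4 seconds is billed.

```python
import math
from typing import Iterable


def estimate_credits(clip_durations_seconds: Iterable[float]) -> int:
    """Estimate cost at 1 credit per second of total output duration.

    Assumption: fractional seconds are rounded up to the next credit.
    """
    total_seconds = sum(clip_durations_seconds)
    return math.ceil(total_seconds)


print(estimate_credits([60]))      # 60, the single 60-second video
print(estimate_credits([30] * 5))  # 150, five 30-second clips concatenated
```

Overlays, subtitles, and text labels are deliberately absent from the inputs, since only output duration affects the price.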
Max input size: 500 MB per file. Max encode time: 10 minutes.
GIF output is automatically downsampled to 10 fps for manageable file sizes. For a custom resolution, set the resolution field.