Content Moderation

Your agent has no reliable way to judge if content is unsafe; this scores text and images by category before you act on them.

Images

Analyzes text or images for harmful content, including hate speech, explicit material, violence, and spam. Returns an overall verdict together with per-category flags and calibrated confidence scores, so you can apply your own thresholds rather than trusting a single yes-or-no flag.

Use when

You want to check user or generated content for unsafe material before showing, storing, or acting on it.

Not for

Blocking or enforcement, reading text out of an image, virus or malware scanning, AI or deepfake detection, copyright or personal-data detection, or any legal reporting pipeline.

What you can do

Each is a sub-skill of Content Moderation; the router picks the right one for your request.

Check text

Checks a piece of text for unsafe content and returns a verdict with per-category scores.

0.1 credits / check

Check an image

Checks an image for unsafe content and returns a verdict with per-category scores.

0.1 credits / check

Check text and image together

Checks an image with its accompanying text in one call and returns a combined verdict with per-category scores.

0.1 credits / check

Score for triage

Returns the full per-category score map ranked for thresholding or queue triage.

0.1 credits / check

What you get back

information

Returns an overall flagged verdict, a per-category flag map, a per-category score from 0 to 1, and which modality triggered each category. It reports a verdict and never blocks; the caller decides what to do with it. Not a downloadable file.

When it checks with you first

It is unclear whether the content to check is text, an image, or both.

Run it

Skills run through one gateway with your Faro token. Hand it an intent in plain language; Faro routes to the right sub-skill, runs it, and bills per call.

curl -X POST "https://skill.askfaro.com/skills/content-moderation/run" \
  -H "Authorization: Bearer $FARO_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"intent":{"prompt":"Is this user comment safe to publish?"}}'

Example requests

›Is this user comment safe to publish?
›Is this profile photo appropriate for a family-friendly platform?
›Check this social post: it has a caption and an attached photo
›Score this batch of comments so I can rank them by risk level

Create a free account