# Content Moderation

> Skill `content-moderation` on Faro. 4 sub-skills.

Analyzes text or images for harmful content, including hate speech, explicit material, violence, and spam. Returns an overall verdict together with per-category flags and calibrated confidence scores, so you can apply your own thresholds rather than trusting a single yes-or-no flag.

**Category:** Images  
**Tags:** moderation, safety, nsfw, content, classification  
**Use when:** You want to check user or generated content for unsafe material before showing, storing, or acting on it.  
**Not for:** Blocking or enforcement, reading text out of an image, virus or malware scanning, AI or deepfake detection, copyright or personal-data detection, or any legal reporting pipeline.  
**Returns:** information: an overall flagged verdict, a per-category flag map, a per-category score from 0 to 1, and which modality triggered each category. The skill only reports a verdict and never blocks content; the caller decides what to do with it. The result is not a downloadable file.
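
Because the skill reports scores and leaves the decision to the caller, a thresholding step usually lives on your side. The following is a minimal sketch, assuming the per-category scores arrive as a mapping from category name to a number between 0 and 1. The category keys (`hate`, `explicit`, `violence`, `spam`), the threshold values, and the helper name `categories_over_threshold` are illustrative assumptions, not documented field names.

```python
from typing import Dict, List

# Hypothetical per-category thresholds; tune these for your own platform.
THRESHOLDS: Dict[str, float] = {
    "hate": 0.5,
    "explicit": 0.3,
    "violence": 0.6,
    "spam": 0.8,
}
# Fallback for any category that has no explicit threshold above.
DEFAULT_THRESHOLD = 0.5


def categories_over_threshold(scores: Dict[str, float]) -> List[str]:
    """Return the categories whose score meets or exceeds its threshold, sorted by name."""
    return sorted(
        name
        for name, score in scores.items()
        if score >= THRESHOLDS.get(name, DEFAULT_THRESHOLD)
    )


# Assumed shape of the per-category score map (keys are illustrative).
example_scores = {"hate": 0.12, "explicit": 0.41, "violence": 0.05, "spam": 0.79}
print(categories_over_threshold(example_scores))  # prints ['explicit']
```

Keeping thresholds in one table makes it straightforward to be stricter on some categories than others without touching the decision logic.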

## How to run
Skills run through one gateway with your Faro token. Hand it an `intent` in plain language; Faro routes to the right sub-skill, runs it, and bills per call. Raw tools are internal plumbing and are not directly callable.

```http
POST https://skill.askfaro.com/skills/content-moderation/run
Authorization: Bearer faro_<your_key>
Content-Type: application/json

{"intent":{"prompt":"Is this user comment safe to publish?"}}
```
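
The same request can be sketched in Python with only the standard library. The URL, headers, and body mirror the request shown above; the function names `build_run_request` and `run_moderation`, the 30-second timeout, and the assumption that the response body is JSON are illustrative and not taken from the Faro documentation.

```python
import json
import urllib.request

RUN_URL = "https://skill.askfaro.com/skills/content-moderation/run"


def build_run_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build the POST request without sending it."""
    body = json.dumps({"intent": {"prompt": prompt}}).encode("utf-8")
    return urllib.request.Request(
        RUN_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def run_moderation(prompt: str, api_key: str) -> dict:
    """Send the request and decode the response, assuming it is JSON."""
    request = build_run_request(prompt, api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

A call such as `run_moderation("Is this user comment safe to publish?", "faro_<your_key>")` would then perform one billed run. Splitting request construction from sending keeps the payload easy to inspect before any call is made.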

Or from the CLI:

```bash
pip install askfaro-cli && askfaro auth login
askfaro run content-moderation "Is this user comment safe to publish?"
```

Full run reference: https://askfaro.com/llms/run.md — Agent recipe: https://askfaro.com/llms/skill.md

## Example requests

- Is this user comment safe to publish?
- Is this profile photo appropriate for a family-friendly platform?
- Check this social post: it has a caption and an attached photo
- Score this batch of comments so I can rank them by risk level

## Sub-skills

### Check text

Checks a piece of text for unsafe content and returns a verdict with per-category scores.

**Cost:** 0.1 credits / check

**Use when:** You want to screen a message, comment, caption, or generated text before acting on it.

**Details:** https://askfaro.com/llms/skills/content-moderation/moderate_text.md

---

### Check an image

Checks an image for unsafe content and returns a verdict with per-category scores.

**Cost:** 0.1 credits / check

**Use when:** You want to screen a picture for explicit, violent, or self-harm content before showing or storing it.

**Details:** https://askfaro.com/llms/skills/content-moderation/moderate_image.md

---

### Check text and image together

Checks an image with its accompanying text in one call and returns a combined verdict with per-category scores.

**Cost:** 0.1 credits / check

**Use when:** You want to screen a post that has both a caption and an image as a single unit.

**Details:** https://askfaro.com/llms/skills/content-moderation/moderate_multimodal.md

---

### Score for triage

Returns the full per-category score map ranked for thresholding or queue triage.

**Cost:** 0.1 credits / check

**Use when:** You want to set your own per-category thresholds or rank a queue by confidence instead of trusting a single flag.

**Details:** https://askfaro.com/llms/skills/content-moderation/triage.md
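
As an illustration of queue triage on the caller's side, the sketch below orders items by their highest category score. It assumes you have already collected one score map per item, keyed by your own item IDs; the input shape, the category keys, and the helper name `rank_by_risk` are hypothetical and not part of the documented response.

```python
from typing import Dict, List, Tuple


def rank_by_risk(results: Dict[str, Dict[str, float]]) -> List[Tuple[str, str, float]]:
    """Return (item_id, top_category, top_score) tuples, highest score first."""
    ranked = []
    for item_id, scores in results.items():
        if not scores:
            continue  # Skip items that came back with no scores.
        top_category = max(scores, key=scores.get)
        ranked.append((item_id, top_category, scores[top_category]))
    return sorted(ranked, key=lambda row: row[2], reverse=True)


# Assumed input: item ID -> per-category scores between 0 and 1.
queue = {
    "c1": {"hate": 0.10, "spam": 0.85},
    "c2": {"hate": 0.92, "spam": 0.05},
    "c3": {"hate": 0.02, "spam": 0.03},
}
for item_id, category, score in rank_by_risk(queue):
    print(item_id, category, score)
```

Ranking by the single highest score is one simple choice; a weighted sum across categories would be an alternative if some categories matter more to your platform.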

---
On the web: https://askfaro.com/search/content-moderation