Firecrawl

Active

by Faro

2 tools

Turn any web page into clean, LLM-ready Markdown. Firecrawl fetches a URL, renders the JavaScript, strips the chrome, and hands back the content an agent actually needs — no proxies, no headless-browser plumbing, no anti-bot fights to manage yourself. Use it whenever an agent needs the readable content of a URL (research, RAG, summarization), or to quickly discover what URLs exist on a site before deciding what to scrape.

Web Data · web-scraping · markdown · content-extraction · rag · crawler

Tools (2)

Scrape

Usage-based · 6.25 credits per page scraped (≈ $0.00625).

Fetch a single URL and return its content as clean Markdown (or HTML, raw text, screenshot, or links). Handles JavaScript rendering, anti-bot, and proxy rotation automatically — the agent just supplies a URL. Ideal for reading articles, docs pages, PDFs, or any page an agent needs to reason about. Supports per-call options like `formats`, `onlyMainContent`, `waitFor` (for dynamic pages), `includeTags` / `excludeTags`, and `actions` (click, scroll, type) before extraction.
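
As a worked example of the listed Scrape price, the arithmetic below estimates the cost of a batch of pages. It is a minimal sketch: the 6.25 credits-per-page figure comes from this listing, the $0.001-per-credit rate is inferred from "6.25 credits ≈ $0.00625", and the function name `estimate_scrape_cost` is hypothetical.

```python
from decimal import Decimal

# Figures from the listing: 6.25 credits per page scraped (about $0.00625),
# which implies roughly $0.001 per credit. The per-credit rate is an inference.
CREDITS_PER_PAGE = Decimal("6.25")
USD_PER_CREDIT = Decimal("0.001")


def estimate_scrape_cost(pages: int) -> tuple[Decimal, Decimal]:
    """Return (credits, usd) for scraping `pages` pages at the listed rate."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    credits = CREDITS_PER_PAGE * pages
    return credits, credits * USD_PER_CREDIT


if __name__ == "__main__":
    credits, usd = estimate_scrape_cost(200)
    print(f"200 pages: {credits} credits, about ${usd}")
```

Heavier requests (for example, enhanced proxies or multi-page PDFs) may be billed differently, so treat this as a lower-bound estimate.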

Example prompts

  • Scrape https://example.com/article and return the markdown
  • Get the cleaned main content of this docs page as markdown
  • Fetch this URL and give me both markdown and a full-page screenshot
  • Read the PDF at this URL and convert it to markdown
  • Get the list of links on a webpage

Parameters

`url` (string, required)

The URL to scrape

`proxy` (string, optional)

Specifies the type of proxy to use. If you do not specify a proxy, Firecrawl defaults to basic.

  • **basic**: Proxies for scraping sites with no or only basic anti-bot solutions. Fast and usually works.
  • **enhanced**: Enhanced proxies for scraping sites with advanced anti-bot solutions. Slower, but more reliable on certain sites. Costs up to 5 credits per request.
  • **auto**: Firecrawl automatically retries with enhanced proxies if the basic proxy fails. If the retry with enhanced succeeds, 5 credits are billed for the scrape. If the first attempt with basic succeeds, only the regular cost is billed.

`maxAge` (integer, optional, default: 0)

Returns a cached version of the page if it is younger than this age in milliseconds. If a cached version of the page is older than this value, the page will be scraped. If you do not need extremely fresh data, enabling this can speed up your scrapes by 500%. Defaults to 0, which disables caching.
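
Because `maxAge` is expressed in milliseconds, it is easy to be off by a factor of 1,000. The sketch below converts a human-readable duration into the value the parameter expects; the helper name `max_age_ms` is hypothetical, and the payload shape simply mirrors the `arguments` object used in the API examples in this listing.

```python
from datetime import timedelta


def max_age_ms(age: timedelta) -> int:
    """Convert a duration into whole milliseconds for the maxAge parameter."""
    if age < timedelta(0):
        raise ValueError("age must be non-negative")
    # Dividing one timedelta by another with // yields an exact integer.
    return age // timedelta(milliseconds=1)


# Accept cached copies up to one hour old.
payload = {
    "arguments": {
        "url": "https://example.com/article",
        "maxAge": max_age_ms(timedelta(hours=1)),
    }
}
```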

`mobile` (boolean, optional, default: false)

Set to true if you want to emulate scraping from a mobile device. Useful for testing responsive pages and taking mobile screenshots.

`actions` (array, optional)

Actions to perform on the page before grabbing the content
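
This listing does not spell out the shape of an action object. The sketch below assumes the object shapes used by the upstream Firecrawl API (`type` plus `selector`, `milliseconds`, or `direction`); those field names, the `#load-more` selector, and the helper `build_scrape_with_actions` are assumptions to check against the tool's actual schema before use.

```python
def build_scrape_with_actions(url: str) -> dict:
    """Build a scrape payload that clicks, waits, and scrolls before extraction."""
    # ASSUMPTION: action field names follow the upstream Firecrawl API.
    actions = [
        {"type": "click", "selector": "#load-more"},  # hypothetical selector
        {"type": "wait", "milliseconds": 1500},       # let new content render
        {"type": "scroll", "direction": "down"},
    ]
    return {
        "arguments": {
            "url": url,
            "formats": ["markdown"],
            "actions": actions,
        }
    }
```

Actions run in the order given, so the wait sits between the click and the scroll to give newly loaded content time to appear.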

`formats` (array, optional, default: ["markdown"])

Formats to include in the output.

`headers` (object, optional)

Headers to send with the request. Can be used to send cookies, user-agent, etc.

`timeout` (integer, optional, default: 30000)

Timeout in milliseconds for the request

`waitFor` (integer, optional, default: 0)

Specify a delay in milliseconds before fetching the content, allowing the page sufficient time to load.

`blockAds` (boolean, optional, default: true)

Enables ad-blocking and cookie popup blocking.

`location` (object, optional)

Location settings for the request. When specified, this will use an appropriate proxy if available and emulate the corresponding language and timezone settings. Defaults to 'US' if not specified.

`parsePDF` (boolean, optional, default: true)

Controls how PDF files are processed during scraping. When true, the PDF content is extracted and converted to markdown format, with billing based on the number of pages (1 credit per page). When false, the PDF file is returned in base64 encoding with a flat rate of 1 credit total.

`excludeTags` (array, optional)

Tags to exclude from the output.

`includeTags` (array, optional)

Tags to include in the output.

`jsonOptions` (object, optional)

JSON options object

`storeInCache` (boolean, optional, default: true)

If true, the page will be stored in the Firecrawl index and cache. Setting this to false is useful if your scraping activity may have data protection concerns. Using some parameters associated with sensitive scraping (actions, headers) will force this parameter to be false.

`onlyMainContent` (boolean, optional, default: true)

Only return the main content of the page excluding headers, navs, footers, etc.

`removeBase64Images` (boolean, optional)

Removes all base 64 images from the output, which may be overwhelmingly long. The image's alt text remains in the output, but the URL is replaced with a placeholder.

`skipTlsVerification` (boolean, optional, default: false)

Skip TLS certificate verification when making requests

`changeTrackingOptions` (object, optional)

Options for change tracking (Beta). Only applicable when 'changeTracking' is included in formats. The 'markdown' format must also be specified when using change tracking.
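
The rule that 'changeTracking' requires 'markdown' can be checked on the client before a call is made. This is a minimal sketch of such a check; the function name `validate_formats` is hypothetical, and the server may enforce additional rules not covered here.

```python
def validate_formats(formats: list[str]) -> list[str]:
    """Reject format lists that request change tracking without markdown."""
    if "changeTracking" in formats and "markdown" not in formats:
        raise ValueError(
            "'changeTracking' requires 'markdown' to be listed in formats as well"
        )
    return formats


# Valid: both formats are requested together.
arguments = {
    "url": "https://example.com/pricing",
    "formats": validate_formats(["markdown", "changeTracking"]),
}
```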

API Usage

curl -X POST "https://api.askfaro.com/invoke/firecrawl/scrapeAndExtractFromUrl" \
  -H "Authorization: Bearer <your_api_key>" \
  -H "Content-Type: application/json" \
  -d '{
  "arguments": {
    "url": "<url>"
  }
}'
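
For callers who prefer Python, the same request can be sketched with the standard library. The endpoint, headers, and `arguments` wrapper are taken from the curl example above; the function names `build_scrape_request` and `scrape` are hypothetical, and the response is assumed to be a JSON body.

```python
import json
import urllib.request

SCRAPE_ENDPOINT = "https://api.askfaro.com/invoke/firecrawl/scrapeAndExtractFromUrl"


def build_scrape_request(api_key: str, url: str, **options) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    body = json.dumps({"arguments": {"url": url, **options}}).encode("utf-8")
    return urllib.request.Request(
        SCRAPE_ENDPOINT,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def scrape(api_key: str, url: str, **options) -> dict:
    """Send the request and parse the response (assumed to be JSON)."""
    request = build_scrape_request(api_key, url, **options)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)
```

Per-call options pass through as keyword arguments, for example `scrape(key, "https://example.com/docs", formats=["markdown"], onlyMainContent=True, waitFor=2000)`.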

CLI Usage

faro invoke firecrawl/scrapeAndExtractFromUrl --params '{"url":"<url>"}'

Install the CLI with `pip install askfaro-cli`, then run `faro auth login`.

Map

6.25 credits per call (≈ $0.00625).

Given a starting URL, return up to N URLs from the same site without scraping their content — a fast, cheap way to discover the shape of a site before deciding what to scrape. Use it as a precursor to `scrapeAndExtractFromUrl` for crawling-style workflows, or to find all docs/blog/product URLs on a domain. Supports `search` (filter URLs by keyword), `limit`, and `includeSubdomains`.

Example prompts

  • List up to 50 URLs under https://docs.example.com
  • Map this site and filter for URLs containing "pricing"
  • Find all blog post URLs on example.com

Parameters

`url` (string, required)

The base URL to start crawling from

`limit` (integer, optional, default: 5000)

Maximum number of links to return

`search` (string, optional)

Search query to use for mapping. During the Alpha phase, the 'smart' part of the search functionality is limited to 1000 search results. However, if map finds more results, there is no limit applied.

`timeout` (integer, optional)

Timeout in milliseconds. There is no timeout by default.

`sitemapOnly` (boolean, optional, default: false)

Only return links found in the website sitemap

`ignoreSitemap` (boolean, optional, default: true)

Ignore the website sitemap when crawling.

`includeSubdomains` (boolean, optional, default: true)

Include subdomains of the website

API Usage

curl -X POST "https://api.askfaro.com/invoke/firecrawl/mapUrls" \
  -H "Authorization: Bearer <your_api_key>" \
  -H "Content-Type: application/json" \
  -d '{
  "arguments": {
    "url": "<url>"
  }
}'
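
The curl call above sends only `url`. The sketch below shows how the optional `search`, `limit`, and `includeSubdomains` parameters described earlier might be added to the same `arguments` object; the helper name `build_map_payload` is hypothetical, and the defaults mirror the parameter table.

```python
import json
from typing import Optional


def build_map_payload(
    url: str,
    search: Optional[str] = None,
    limit: int = 5000,
    include_subdomains: bool = True,
) -> dict:
    """Build the JSON body for a mapUrls call."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    arguments = {
        "url": url,
        "limit": limit,
        "includeSubdomains": include_subdomains,
    }
    if search is not None:
        # Only send a search filter when the caller supplied one.
        arguments["search"] = search
    return {"arguments": arguments}


if __name__ == "__main__":
    print(json.dumps(build_map_payload("https://docs.example.com", "pricing", 50)))
```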

CLI Usage

faro invoke firecrawl/mapUrls --params '{"url":"<url>"}'

Install the CLI with `pip install askfaro-cli`, then run `faro auth login`.

README

Firecrawl on Faro

Firecrawl turns any web page into clean, LLM-ready Markdown. Give it a URL, get back the content an agent can actually reason about — no headless-browser plumbing, no anti-bot fights, no proxy management.

What's in this listing

Two sync tools, billed per call:

| Tool | What it does | Pricing |
| --- | --- | --- |
| scrapeAndExtractFromUrl | Fetch one URL → Markdown, HTML, raw text, screenshot, or links. Handles JS rendering, anti-bot, proxies. | 6.25 credits per page scraped (≈ $0.00625). Heavier formats (PDF, JS-render, screenshot) consume more pages internally. |
| mapUrls | Discover up to N URLs on a site without scraping their content. | 6.25 credits per call (≈ $0.00625). |

When to use what

  • Reading a single page (article, docs page, PDF, product page) → scrapeAndExtractFromUrl
  • Discovering what's on a site before deciding what to scrape → mapUrls, then loop into scrapeAndExtractFromUrl
  • Bulk crawling an entire site, structured extraction across many pages, deep research, search-then-scrape → not in this listing yet (async jobs, coming soon)
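
The map-then-scrape pattern from the second bullet can be sketched as a small loop. Everything about transport is left to a caller-supplied `invoke(tool, arguments)` function, which is a hypothetical stand-in for the HTTP or CLI call. The sketch also assumes that mapUrls returns its URLs as a list of strings under a `links` key; this listing does not document that shape, so verify it against a real response.

```python
from typing import Callable

# Hypothetical signature: invoke(tool_name, arguments) -> parsed JSON response.
Invoke = Callable[[str, dict], dict]


def map_then_scrape(
    invoke: Invoke, site_url: str, keyword: str, max_pages: int = 10
) -> dict:
    """Discover URLs matching `keyword`, then scrape each one as Markdown."""
    mapped = invoke(
        "firecrawl/mapUrls",
        {"url": site_url, "search": keyword, "limit": max_pages},
    )
    # ASSUMPTION: discovered URLs arrive as a list of strings under "links".
    urls = [u for u in mapped.get("links", []) if isinstance(u, str)][:max_pages]

    pages = {}
    for url in urls:
        # One URL per call: scraping is looped because batch mode is not exposed.
        pages[url] = invoke(
            "firecrawl/scrapeAndExtractFromUrl",
            {"url": url, "formats": ["markdown"], "onlyMainContent": True},
        )
    return pages
```

Capping `max_pages` keeps the cost predictable: one map call plus at most `max_pages` scrape calls.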

Tips for agents

  • For docs and articles, set formats: ["markdown"] and onlyMainContent: true — strips navigation and footers.
  • For dynamic SPAs, use waitFor (ms) to let the page settle before extraction.
  • For pages behind a click (e.g. "Load more"), use the actions array to click/scroll/type before scraping.
  • Costs are returned per call in data.metadata.creditsUsed so the agent can budget precisely.
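
To illustrate the budgeting tip, the sketch below reads `data.metadata.creditsUsed` from each response and tracks a running total. The field path comes from the tip above; the class `CreditBudget`, its method names, and the fallback to 0 when the field is missing are assumptions made for illustration.

```python
def credits_used(response: dict) -> float:
    """Read data.metadata.creditsUsed, treating a missing field as 0."""
    metadata = response.get("data", {}).get("metadata", {})
    return float(metadata.get("creditsUsed", 0.0))


class CreditBudget:
    """Track credits spent across calls against a fixed limit."""

    def __init__(self, limit: float) -> None:
        self.limit = limit
        self.spent = 0.0

    @property
    def remaining(self) -> float:
        return max(self.limit - self.spent, 0.0)

    def record(self, response: dict) -> float:
        """Add the credits reported by a response and return what is left."""
        self.spent += credits_used(response)
        return self.remaining

    def can_afford(self, estimated_credits: float) -> bool:
        """Check whether another call of the given estimated cost fits."""
        return self.spent + estimated_credits <= self.limit
```

An agent could call `can_afford(6.25)` before each scrape and stop looping once it returns False.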

Limits

  • Single URL per call on scrapeAndExtractFromUrl. For many URLs, loop — batch scraping is async and not exposed yet.
  • mapUrls caps at the upstream's per-call limit (typically 5,000 URLs).
  • Scrape timeouts default to 30 s (timeout: 30000 ms); pages that need longer should set timeout explicitly.