Scrape = Crawl + Extract

FetchFox scraping has two phases:

Crawl to find relevant URLs.
Extract to turn page content into structured items.

Crawl for URLs

Use /api/crawl with a URL pattern. The pattern may contain two wildcards:

* matches any characters except /
** matches any characters including /
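The wildcard rules above can be approximated locally with a regular expression, which is handy for checking a pattern before spending visits on a crawl. This is a minimal sketch of the rules as stated here, not FetchFox's own matcher: the helper names pattern_to_regex and matches are hypothetical, the example.com URLs are placeholders, and the service may apply patterns differently than this literal reading.

```python
import re


def pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a crawl-style pattern into a regex (local approximation only)."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            parts.append(".*")       # ** may cross "/" boundaries
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")    # * stops at the next "/"
            i += 1
        else:
            parts.append(re.escape(pattern[i]))  # everything else is literal
            i += 1
    return re.compile("".join(parts))


def matches(pattern: str, url: str) -> bool:
    """Return True when the whole URL fits the pattern."""
    return pattern_to_regex(pattern).fullmatch(url) is not None


# Hypothetical URLs, used only to illustrate the two wildcards.
print(matches("https://example.com/docs/*", "https://example.com/docs/intro"))      # True
print(matches("https://example.com/docs/*", "https://example.com/docs/api/auth"))   # False
print(matches("https://example.com/docs/**", "https://example.com/docs/api/auth"))  # True
```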

Example:

curl -X POST https://api.fetchfox.ai/api/crawl \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $FETCHFOX_API_KEY" \
-d '{
    "pattern":"https://pokemondb.net/pokedex/*",
    "maxVisits": 50
}'

Typical response:

{
  "jobId": "5ooygvit1y",
  "results": {
    "hits": [
      "https://pokemondb.net/pokedex/all",
      "https://pokemondb.net/pokedex/archaludon",
      "https://pokemondb.net/pokedex/charizard",
      "https://pokemondb.net/pokedex/corviknight",
      "https://pokemondb.net/pokedex/dipplin",
      "https://pokemondb.net/pokedex/dragapult",
      "https://pokemondb.net/pokedex/dragonite",
      "https://pokemondb.net/pokedex/eevee",
      "https://pokemondb.net/pokedex/game/legends-arceus",
      "https://pokemondb.net/pokedex/game/scarlet-violet",
      "...more results..."
    ]
  },
  "metrics": { "...cost and usage metrics..." }
}

The URLs are returned in results.hits.
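Because the hits are plain URL strings, a crawl response can be turned into the body of an /api/extract request with a few lines of code. The sketch below assumes the response shape shown above. The helper build_extract_payload and its limit parameter are hypothetical and not part of the API.

```python
from typing import Optional


def build_extract_payload(crawl_response: dict, template: dict,
                          limit: Optional[int] = None) -> dict:
    """Build an /api/extract request body from a crawl response."""
    hits = crawl_response.get("results", {}).get("hits", [])
    # Keep only real URLs; this also drops placeholder strings in sample data.
    urls = [u for u in hits if isinstance(u, str) and u.startswith("http")]
    if limit is not None:
        urls = urls[:limit]
    return {"urls": urls, "template": template}


# Trimmed copy of the sample crawl response above.
crawl_response = {
    "jobId": "5ooygvit1y",
    "results": {
        "hits": [
            "https://pokemondb.net/pokedex/charizard",
            "https://pokemondb.net/pokedex/eevee",
            "...more results...",
        ]
    },
}

payload = build_extract_payload(crawl_response, {"name": "Name of the pokemon"})
print(payload["urls"])
```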

Extract from URLs to get items

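Use /api/extract with two parameters:

url or urls: the page or pages to extract from
template: the fields to extract, each with a plain-language description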

Example:

curl -X POST "https://api.fetchfox.ai/api/extract" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $FETCHFOX_API_KEY" \
-d '{
   "urls": [
     "https://pokemondb.net/pokedex/bulbasaur",
     "https://pokemondb.net/pokedex/ivysaur",
     "https://pokemondb.net/pokedex/venusaur"
   ],
   "template": {
     "name": "Name of the pokemon",
     "number": "National pokedex number",
     "stats": "Base stats as a dictionary"
   }
}'

Typical response:

{
  "jobId": "j8rcgsnxq3",
  "results": {
    "items": [
      {
        "name": "Bulbasaur",
        "number": "0001",
        "stats": {
          "HP": 45,
          "Attack": 49,
          "Defense": 49,
          "Sp. Atk": 65,
          "Sp. Def": 65,
          "Speed": 45,
          "Total": 318
        },
        "_url": "https://pokemondb.net/pokedex/bulbasaur",
        "_htmlUrl": "https://ffcloud.s3.amazonaws.com/visit/html/4h0o70v9fh.html"
      },
      "...more results..."
    ]
  },
  "metrics": { "...cost and usage metrics..." }
}
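Each item carries the template fields plus underscore-prefixed keys such as _url and _htmlUrl. A minimal sketch for separating the two follows. It assumes that every key starting with an underscore is metadata, which the sample response suggests but this page does not state. The helper split_item is hypothetical.

```python
def split_item(item: dict) -> tuple:
    """Split an extracted item into (template fields, underscore-prefixed metadata)."""
    fields = {k: v for k, v in item.items() if not k.startswith("_")}
    meta = {k: v for k, v in item.items() if k.startswith("_")}
    return fields, meta


# Trimmed copy of the first item in the sample response above.
item = {
    "name": "Bulbasaur",
    "number": "0001",
    "stats": {"HP": 45, "Attack": 49, "Total": 318},
    "_url": "https://pokemondb.net/pokedex/bulbasaur",
}

fields, meta = split_item(item)
print(fields["name"], fields["stats"]["Total"])  # Bulbasaur 318
print(meta["_url"])
```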

A single endpoint to crawl and extract

To run both phases in one request, use /api/scrape.

curl -X POST "https://api.fetchfox.ai/api/scrape" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $FETCHFOX_API_KEY" \
  -d '{
     "pattern": "https://www.pokemon.com/us/pokedex/*",
     "template": {
       "name": "Name of the pokemon",
       "number": "National pokedex number"
     },
     "maxVisits": 50,
     "maxExtracts": 10
  }'

Use /api/scrape when you want the convenience of one call. Use /api/crawl + /api/extract directly when you want fine-grained control over each phase.
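The two approaches can be sketched in Python using only the standard library. The endpoints, headers, and request fields are taken from the curl examples on this page. The function names are hypothetical, and the /api/scrape response is returned unparsed because its shape is not shown here. Treat this as an illustration, not an official client.

```python
import json
import urllib.request

API_BASE = "https://api.fetchfox.ai"


def build_request(endpoint: str, payload: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request with bearer authentication."""
    return urllib.request.Request(
        f"{API_BASE}{endpoint}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def post_json(endpoint: str, payload: dict, api_key: str) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(build_request(endpoint, payload, api_key)) as resp:
        return json.load(resp)


def scrape_one_call(api_key: str, pattern: str, template: dict) -> dict:
    """One request to /api/scrape; the raw response is returned as-is."""
    body = {"pattern": pattern, "template": template,
            "maxVisits": 50, "maxExtracts": 10}
    return post_json("/api/scrape", body, api_key)


def scrape_two_calls(api_key: str, pattern: str, template: dict) -> list:
    """Crawl first, then extract, so the URL list can be inspected in between."""
    crawl = post_json("/api/crawl", {"pattern": pattern, "maxVisits": 50}, api_key)
    urls = crawl["results"]["hits"]
    extract = post_json("/api/extract", {"urls": urls, "template": template}, api_key)
    return extract["results"]["items"]
```

To try it, read the key from the FETCHFOX_API_KEY environment variable and call either function with a pattern and a template. In the two-call version, the urls list is the natural place to filter or cap what gets extracted.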
