Most pages on the web fall into one of two categories:

  • Detail pages that describe a single item
  • List pages that show multiple items

FetchFox can extract from both kinds of pages. For detail pages, you want to extract one item for each URL, and for list pages, you want to extract multiple items per page.

The default extraction mode is to extract one item per URL. In this mode, if you pass in 10 URLs, you will get exactly 10 items in your results. This works well for detail pages.

To extract multiple items per page, simply set the per_page parameter to many. This tells FetchFox that each URL you pass in contains many items.

Below is an example of passing several URLs to the extract endpoint and asking FetchFox to extract multiple items from each URL.

curl -X POST "https://api.fetchfox.ai/api/extract" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
   "urls": [
     "https://pokemondb.net/pokedex/bulbasaur",
     "https://pokemondb.net/pokedex/ivysaur",
     "https://pokemondb.net/pokedex/venusaur"
   ],
   "template": {
     "move_name": "Name of the pokemon move",
     "move_type": "Name of the move type",
     "move_power": "The power of the move"
   },
   "per_page": "many"
}'
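
The same request can be sketched in Python. This is a minimal illustration using only the standard library, assuming the endpoint, headers, and body fields shown in the curl example; the helper name build_extract_request is hypothetical and not part of any FetchFox client. It only builds the request, and the sending step is left commented out because it needs a real API key.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
EXTRACT_URL = "https://api.fetchfox.ai/api/extract"


def build_extract_request(urls, template, api_key, per_page=None):
    """Build (but do not send) a POST request for the extract endpoint.

    Hypothetical helper for illustration only.
    """
    body = {"urls": list(urls), "template": dict(template)}
    if per_page is not None:
        # Omitting per_page leaves the default mode: one item per URL.
        body["per_page"] = per_page
    return urllib.request.Request(
        EXTRACT_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


request = build_extract_request(
    urls=[
        "https://pokemondb.net/pokedex/bulbasaur",
        "https://pokemondb.net/pokedex/ivysaur",
        "https://pokemondb.net/pokedex/venusaur",
    ],
    template={
        "move_name": "Name of the pokemon move",
        "move_type": "Name of the move type",
        "move_power": "The power of the move",
    },
    api_key="YOUR_API_KEY",
    per_page="many",  # each URL is treated as a list page
)

# To send it (requires a real API key and network access):
# with urllib.request.urlopen(request) as response:
#     data = json.load(response)
```

Calling the helper without per_page produces a body for the default one-item-per-URL mode, which suits detail pages.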

The response will look something like this:

{
  "job_id": "fjszygdh38",
  "results": {
    "items": [
      {
        "move_name": "Growl",
        "move_type": "Normal",
        "move_power": "100",
        "_url": "https://pokemondb.net/pokedex/ivysaur",
        "_htmlUrl": "https://ffcloud.s3.amazonaws.com/visit/html/xz6rjf8h2v.html"
      },
      {
        "move_name": "Growth",
        "move_type": "Normal",
        "move_power": "—",
        "_url": "https://pokemondb.net/pokedex/ivysaur",
        "_htmlUrl": "https://ffcloud.s3.amazonaws.com/visit/html/xz6rjf8h2v.html"
      },
      ...more items...
    ]
  },
  "metrics": { ...cost and usage metrics... },
  "artifacts": [
    {
      "type": "divide",
      "divide": {
        "reasoning": "Moves are presented in tables with the class 'data-table', where each move (row) is represented by a <tr> inside <tbody>. I focused on extracting each <tr> for coverage.",
        ...more chain of thought...
        "selector": ".data-table tbody tr"
      }
    },
    {
      "type": "schema",
      "schema": { ...JSON schema definition... }
    }
  ]
}
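
Because every item carries the _url it came from, the flat items list can be regrouped by source page. The following is a small sketch, assuming the response shape shown above (results.items, with a _url field on each item); the function name group_items_by_url and the trimmed sample data are illustrative only.

```python
from collections import defaultdict


def group_items_by_url(response):
    """Return {source URL: [items extracted from that URL]}."""
    grouped = defaultdict(list)
    # Tolerate a missing "results" or "items" key by treating it as empty.
    for item in response.get("results", {}).get("items", []):
        grouped[item.get("_url")].append(item)
    return dict(grouped)


# Trimmed-down sample shaped like the response above.
sample = {
    "results": {
        "items": [
            {"move_name": "Growl", "_url": "https://pokemondb.net/pokedex/ivysaur"},
            {"move_name": "Growth", "_url": "https://pokemondb.net/pokedex/ivysaur"},
        ]
    }
}

by_url = group_items_by_url(sample)
for url, items in by_url.items():
    print(url, len(items))
```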

Each URL results in multiple items. Pricing for the extract endpoint is on a per-URL basis, so the FetchFox fee does not go up if you extract hundreds or thousands of items from a single URL. The AI charges may be slightly higher, though.