By default, extractions in FetchFox look at more or less the full HTML of a page. The full HTML contains all the data that could possibly be needed for extraction, but it is almost always overkill: it includes lots of extra content like headers, footers, navigation, and irrelevant attributes. Sending all that extra data inflates the number of tokens the AI has to process, which makes extraction very expensive.

FetchFox offers different extraction modes which reduce the amount of tokens sent to the AI. These modes can be specified using the extract_context parameter.

Below are the allowed values for extract_context and their behavior.

  • text_only context sends only the text from the page to the AI.
  • slim_html context sends a subset of the page HTML to the AI. This subset keeps only a few tags and attributes that often contain valuable data, such as <img src="..." /> and <a href="..."> tags, and converts the rest to text.
  • reduce context uses AI to learn how to reduce the context. It takes the full HTML of the page along with the template, and asks the AI to write code that reduces the context. The AI looks at the page structure and the template, and figures out which data to keep. The code from this process is reused for subsequent extractions. As such, it adds a one-time cost at the start of the process, but the context for all subsequent extractions is reduced.
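
To make the slim_html idea concrete, here is a minimal sketch of that kind of reduction in Python. The allowlist of tags and attributes here is an assumption for illustration, not FetchFox's actual implementation:

```python
from html.parser import HTMLParser

# Hypothetical allowlist: keep only high-value tags/attributes,
# convert everything else to plain text.
KEEP_TAGS = {"a": ["href"], "img": ["src"]}

class SlimHTML(HTMLParser):
    def __init__(self):
        super().__init__()
        self.out = []

    def handle_starttag(self, tag, attrs):
        # Emit allowed tags, keeping only their allowed attributes.
        if tag in KEEP_TAGS:
            kept = " ".join(
                f'{name}="{value}"' for name, value in attrs
                if name in KEEP_TAGS[tag]
            )
            self.out.append(f"<{tag} {kept}>" if kept else f"<{tag}>")

    def handle_endtag(self, tag):
        if tag in KEEP_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # All other markup is dropped; only its text survives.
        text = data.strip()
        if text:
            self.out.append(text)

def slim(html):
    parser = SlimHTML()
    parser.feed(html)
    return " ".join(parser.out)
```

Running this over a page keeps links and image sources while discarding wrapper tags, class names, and styling, which is where most of the token savings come from.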

As an example, let's look at the cost of an extraction that runs on the full HTML.

curl -X POST "https://api.fetchfox.ai/api/extract" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
   "urls": [
     "https://pokemondb.net/pokedex/bulbasaur",
     "https://pokemondb.net/pokedex/ivysaur",
     "https://pokemondb.net/pokedex/venusaur"
   ],
   "template": {
     "name": "Name of the pokemon",
     "number": "National pokedex number",
     "stats": "Base stats as a dictionary"
   },
   "extract_context": "full_html"
}'

The response to this call will be something like this:

{
  "job_id": "5ol5q3h2i9",
  "results": {
    "items": [ ...extraction results... ]
  },
  "metrics": {
    "cost": {
      "ai": 0.0189177,
      "network": 1.545e-05,
      "fetchfox": 0.00303,
      "total": 0.02196315
    },
    "ai": [
      {
        "model": "openai:gpt-4o-mini",
        "tokens": {
          "input": 125158,
          "output": 240,
          "total": 125398
        },
        "cost": {
          "input": 0.018773699999999997,
          "output": 0.000144,
          "total": 0.0189177
        },
        "runtime": {
          "sec": 12.846,
          "msec": 12846
        }
      }
    ],
    ...more cost and usage breakdowns...
  },
  "artifacts": [ ...AI generated artifacts... ]
}

Notice the high number of tokens sent to the AI, and the correspondingly high cost.
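
The per-model cost entries are simply token counts multiplied by per-token rates. Assuming gpt-4o-mini's list price of $0.15 per million input tokens and $0.60 per million output tokens (an assumption about pricing, not part of the API response), the figures above reconcile exactly:

```python
# Assumed gpt-4o-mini rates (not returned by the API), in $ per token.
INPUT_RATE = 0.15 / 1_000_000
OUTPUT_RATE = 0.60 / 1_000_000

# Token counts from the metrics.ai entry above.
input_cost = 125158 * INPUT_RATE
output_cost = 240 * OUTPUT_RATE

print(round(input_cost, 7))                # 0.0187737
print(round(output_cost, 6))               # 0.000144
print(round(input_cost + output_cost, 7))  # 0.0189177
```

This is why input tokens dominate the bill: the output is a few hundred tokens regardless of context mode, so shrinking the input is where the savings are.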

Let's run the same request using the text_only context.

curl -X POST "https://api.fetchfox.ai/api/extract" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
   "urls": [
     "https://pokemondb.net/pokedex/bulbasaur",
     "https://pokemondb.net/pokedex/ivysaur",
     "https://pokemondb.net/pokedex/venusaur"
   ],
   "template": {
     "name": "Name of the pokemon",
     "number": "National pokedex number",
     "stats": "Base stats as a dictionary"
   },
   "extract_context": "text_only"
}'

The response to this call will be something like this:

{
  "job_id": "irrn2aexop",
  "results": {
    "items": [ ...extraction results... ]
  },
  "metrics": {
    "cost": {
      "ai": 0.00285135,
      "network": 1.535e-05,
      "fetchfox": 0.00303,
      "total": 0.0058967
    },
    "ai": [
      {
        "model": "openai:gpt-4o-mini",
        "tokens": {
          "input": 18049,
          "output": 240,
          "total": 18289
        },
        "cost": {
          "input": 0.00270735,
          "output": 0.000144,
          "total": 0.00285135
        },
        "runtime": {
          "sec": 8.774000000000001,
          "msec": 8774
        }
      }
    ],
    ...more cost and usage breakdowns...
  },
  "artifacts": [ ...AI generated artifacts... ]
}

The number of tokens sent to the AI, and with it the AI cost, is nearly 7x lower, and the total cost drops by about 3.7x. The runtime is also shorter with the text_only context.
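
Plugging the totals from the two responses above into a quick comparison:

```python
# Totals copied from the metrics of the two responses above.
full_html = {"tokens": 125398, "total_cost": 0.02196315}
text_only = {"tokens": 18289, "total_cost": 0.0058967}

token_ratio = full_html["tokens"] / text_only["tokens"]
cost_ratio = full_html["total_cost"] / text_only["total_cost"]

print(f"tokens: {token_ratio:.1f}x fewer with text_only")      # 6.9x
print(f"total cost: {cost_ratio:.1f}x lower with text_only")   # 3.7x
```

Note that total cost shrinks less than token count because the fixed fetchfox and network charges are the same in both runs; only the AI portion scales with context size.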