Multiple Pages / Multiple Items
Explanation: Extract several items from multiple pages. For example, let’s say we want to scrape all the thread titles on r/RealEstate and then scrape all the comments in those threads.
Enter the starting URL and a prompt such as the following.
https://www.reddit.com/r/RealEstate/
find URLs of all the comment threads
Next, remove the second step (Find more URLs) and replace it with an Extract data box, then edit the third step to add new fields.
First, let’s start with the second step.
Note: The exact workflow may vary, but in our case, let’s delete the current “Find more URLs” box, add a new second step (+ icon), and choose “Extract”.
Now add the following fields, and make sure to change “Items per page” to “Find multiple items per page”. This means the step will scrape the URL of every comment thread on the page, along with the corresponding thread title.
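To make the effect of the “Items per page” setting concrete, here is a minimal Python sketch. It is not FetchFox code and does not call any FetchFox API. `LISTING_PAGE`, `extract`, and the `multiple` flag are hypothetical names, the example.com URLs are placeholders, and the assumption that a single-item setting keeps only the first match is ours.

```python
# Hypothetical stand-in for one listing page: each dict is one thread found on it.
LISTING_PAGE = [
    {"title": "First-time buyer questions", "url": "https://example.com/thread/1"},
    {"title": "Is now a good time to sell?", "url": "https://example.com/thread/2"},
]


def extract(page_items, multiple):
    """Return every item on the page, or only the first one (assumed behavior)."""
    if multiple:
        return list(page_items)
    return page_items[:1]


single = extract(LISTING_PAGE, multiple=False)
many = extract(LISTING_PAGE, multiple=True)

print(len(single), "item with a single-item setting")
print(len(many), "items with a multiple-items setting")
```

With the multiple-items setting, every thread on the listing page becomes its own row, which is what the next step needs in order to visit each thread.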
For the third step, edit the fields to add the comment text and author, and once again choose “Find multiple items per page” so that every comment’s text and author is extracted from each thread found in the second step.
And there you have it. You just scraped multiple items from multiple pages!
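If it helps to picture how the data flows, the workflow can be sketched in plain Python as two nested extraction steps. This is only an illustration under stated assumptions: `PAGES`, `extract_items`, and `run_workflow` are hypothetical names, the thread URLs and comments are made-up placeholder data, and nothing here reflects how FetchFox is implemented.

```python
# Hypothetical in-memory "site": each URL maps to the items visible on that page.
PAGES = {
    "https://www.reddit.com/r/RealEstate/": [
        {"title": "Thread A", "url": "https://example.com/thread/a"},
        {"title": "Thread B", "url": "https://example.com/thread/b"},
    ],
    "https://example.com/thread/a": [
        {"comment_text": "Great point.", "author": "user1"},
        {"comment_text": "I disagree.", "author": "user2"},
    ],
    "https://example.com/thread/b": [
        {"comment_text": "Thanks for sharing.", "author": "user3"},
    ],
}


def extract_items(url):
    """Stand-in for an Extract step set to find multiple items per page."""
    return PAGES.get(url, [])


def run_workflow(start_url):
    """Step 2 finds threads on the start page; step 3 finds comments in each thread."""
    rows = []
    for thread in extract_items(start_url):
        for comment in extract_items(thread["url"]):
            # Each output row pairs one comment with the title of its thread.
            rows.append({"thread_title": thread["title"], **comment})
    return rows


rows = run_workflow("https://www.reddit.com/r/RealEstate/")
for row in rows:
    print(row)
```

Each page in the second step fans out into several threads, and each thread fans out into several comments, so the final result has one row per comment.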
Awesome! Now you’ve learned FetchFox’s scraping setups. 🎉
In the next section, you’ll learn how pagination works.