Questions tagged [apify]
Apify is a service for running Docker images in the cloud. It is primarily used for web scraping and crawling with headless Chrome and Puppeteer, but can handle a wide variety of tasks. Apify also maintains the Apify SDK, an open-source library for web scraping and crawling in JavaScript.
apify
206
questions
0
votes
0
answers
30
views
Using Apify Proxy With Playwright
I am trying to use a headless Playwright Python script with the Apify trial proxy and run it in the Apify Console.
I always get a different error when trying to use their proxies:
Nx_proxy error, etc.
I ...
0
votes
2
answers
48
views
How can I modify this Python code to return a Property ID on Redfin, given addresses?
I am trying to write and run Python code on Google Colaboratory. I want to use this Redfin scraper (https://apify.com/tri_angle/redfin-search) to pull a property ID for a given list of addresses. The code ...
0
votes
0
answers
36
views
Apify Actor is not being authorized
I am attempting to run three actors and am getting the following errors, one per actor ID. Any insight on this would be very much appreciated.
Error running Apify actor ...
0
votes
0
answers
96
views
Crawlee Apify - How to skip links with rel=nofollow?
The goal of my script is to retrieve all links of a website using the PuppeteerCrawler from Crawlee. I was wondering how to skip the links that have the rel="nofollow" attribute. I have tried ...
0
votes
1
answer
90
views
Apify: weird costs for using key-value stores
I'm working with Node.js but I think my question is language-agnostic.
In my Apify actor, I first do a few long scrapes and store the results in some arrays.
Then I do many successive POST requests ...
1
vote
1
answer
1k
views
Playwright Crawler Error: "Target page, context or browser has been closed"
I am using Playwright with Crawlee to crawl web pages and analyze data. However, I'm encountering a persistent issue where, upon trying to access page data using locator methods, I receive the ...
0
votes
1
answer
161
views
How to wait for specific AJAX request in Puppeteer crawler
I need to fetch the data from an AJAX request made to GraphQL. Pages are crawled by PuppeteerCrawler:
const crawler = new Apify.PuppeteerCrawler({
preNavigationHooks: [
async ({ page }): Promise<...
1
vote
0
answers
216
views
How to fix: "Crawler reached the maxRequestsPerCrawl limit of 1 requests and will shut down soon"
The Cheerio crawler is not crawling when I set maxRequestsPerCrawl to 1.
Even when I set maxRequestsPerCrawl to 10 or 100, nothing is crawled anymore after the 10th or 100th request. How can I ...
0
votes
0
answers
14
views
How to remove pop-up banner while making a rolling GIF using glenn/gif-scroll-animation on Apify
I wanted to take a GIF rolling screenshot of an entire webpage and found this tool on Apify: glenn/gif-scroll-animation
Unfortunately, there's a pop-up banner which blocks the view that I need to get ...
0
votes
1
answer
369
views
Crawlee scraper invoking the same handler multiple times
I've built a Crawlee scraper, but for some reason it invokes the same handler multiple times, creating a lot of duplicate requests and entries in my dataset. Also:
I've already tried manually ...
1
vote
1
answer
606
views
How to send a cookie to authenticate with Crawlee (Apify)
In the documentation I can find only a function for saving:
Optional persistCookiesPerSession
persistCookiesPerSession?: boolean
Inherited from HttpCrawlerOptions.persistCookiesPerSession
...
0
votes
1
answer
147
views
passing `apify_api_token` as a named parameter
I am new to Python and am trying to pass the API key to ApifyWrapper in a notebook. I am getting this error:
ValidationError Traceback (most recent call last) in <...
0
votes
1
answer
72
views
My function isn't running when using my scraper with Apify
Hello everyone and thanks in advance for the help.
When running the scraper below locally, everything goes well and I get the expected value:
2023-10-08 00:15:41 [scrapy.core.scraper] DEBUG: Scraped ...
0
votes
1
answer
101
views
Apify Scrapy template returns Attribute error
I install the Scrapy template from the Apify CLI and run it as instructed. I don't make any changes.
When I deploy it to Apify, the same problem occurs. Even when I used the same template from the Apify Console ...
0
votes
0
answers
148
views
Why does my web crawler work on localhost but not in the docker container?
I'm trying to create a web crawler using Crawlee and Apify with Node.js. The crawler works when I run it on localhost, but when I run it in a Docker container, I receive a timeout error waiting for ...