All Questions
Tagged with web-scraping node.js
1,621
questions
1
vote
1
answer
27
views
Getting unexpected/not-present elements/tags while scraping in Node.js with Cheerio
I am scraping and parsing the content of a web page (https://www.mydealz.de/new). The structure is as follows.
<div class="threadGrid-title">
<strong><a href="">...
2
votes
2
answers
51
views
Puppeteer scraping dynamic content
I'm trying to scrape data from a Looker Studio web page report using Puppeteer in Node.js, but I'm encountering issues because the report is dynamic. When I fetch the data, the body is empty. Here's
...
0
votes
0
answers
26
views
Can't upload files in Puppeteer, Chromium gives "Aw Snap!" error
I have been working with Puppeteer recently and I have been trying to work with file uploading. I've watched multiple tutorials and read the docs for Puppeteer, but everything I try always results in ...
-1
votes
0
answers
17
views
How to Scrape Dynamic Website Content in a MERN Stack Application?
I'm building a MERN stack application and need to scrape content from websites that use dynamic rendering (e.g., JavaScript-rendered pages). I have successfully used Axios and Cheerio to scrape static ...
0
votes
1
answer
57
views
How to download a video with a blob URL using Puppeteer
I am trying to download reels from Instagram. I have done all the navigation; I just have to enter, when running the file, which Instagram ID I want to download the reel ...
2
votes
1
answer
46
views
Cheerio: Why can't I access elements correctly?
The HTML:
<body style="overflow: hidden">
<div class="cookie-box"></div>
<div id="next">
<div></div>
<div&...
0
votes
0
answers
76
views
Puppeteer doesn't load dynamic content (.setContent method)
I am using Puppeteer to take screenshots of the given HTML. The problem is that the setContent method does not load all the resources that are dynamically loaded (in my opinion). As a result, the ...
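For the `setContent` question above, one thing that may be worth trying is the `waitUntil` option, which `page.setContent` accepts in the same way `page.goto` does. The sketch below is only an illustration under that assumption: `screenshotHtml` is a hypothetical helper name, `page` is assumed to be a Puppeteer `Page` created by the caller, and whether `networkidle0` is enough depends on how the resources in the HTML are actually loaded.

```javascript
// Hypothetical helper: render an HTML string and screenshot it only after
// the network has gone quiet, so late-loading resources get a chance to finish.
// `page` is assumed to be a Puppeteer Page created elsewhere.
async function screenshotHtml(page, html, outputPath) {
  // 'networkidle0' resolves once there have been no network connections
  // for at least 500 ms.
  await page.setContent(html, { waitUntil: 'networkidle0' });
  return page.screenshot({ path: outputPath, fullPage: true });
}
```

A call would look like `await screenshotHtml(page, html, 'out.png')`. If some content appears only after a script runs, an additional `page.waitForSelector(...)` on an element that the script creates may be needed before the screenshot.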
0
votes
1
answer
49
views
Scrape Reddit posts and comments
I'm trying to scrape Reddit posts and comments using Node.js and Cheerio, but I'm not getting the expected results. Here’s a brief overview of my setup and the issues I’m facing.
Setup:
Node.js: v14....
0
votes
0
answers
14
views
Problem getting cookies in a new chromium.connectOverCDP instance
I'm starting my main browser like this:
const options = {
  headless: true,
  args: [
    `--ignore-certificate-errors`,
    `--remote-debugging-port=${port}`
  ]
};
const browser = ...
0
votes
0
answers
55
views
TikTok transcript webscrape
Given a TikTok URL that has a video with a transcript, I'm trying to write a Playwright script that can navigate to the transcript and grab it. The problem is the hover command doesn't seem to ...
2
votes
1
answer
172
views
XPath Selector in Puppeteer 22.x
I have read the newest Puppeteer v22.x documentation about XPath, but I still don't know how to use XPath in Puppeteer 22.x.
I want to click an element containing the text 'Next'. Here is the HTML that has the ...
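For the XPath question above: as far as I know, Puppeteer 22 removed `page.$x` and `page.waitForXPath`, and XPath expressions are instead passed to the regular selector methods through the `::-p-xpath(...)` syntax. The sketch below assumes that syntax. `xpathForText` and `clickByText` are hypothetical helper names, and `page` is assumed to be a Puppeteer `Page` created elsewhere; the actual markup in the question is truncated, so the expression matches any element rather than a specific tag.

```javascript
// Hypothetical helper: build a Puppeteer selector that matches any element
// whose text contains `text`, using the ::-p-xpath() selector syntax.
function xpathForText(text) {
  // XPath 1.0 has no escape sequences, so pick a quote style the text lacks.
  let literal;
  if (!text.includes('"')) {
    literal = `"${text}"`;
  } else if (!text.includes("'")) {
    literal = `'${text}'`;
  } else {
    throw new Error('text containing both quote types is not handled in this sketch');
  }
  return `::-p-xpath(//*[contains(text(), ${literal})])`;
}

// Hypothetical helper: wait for the element to appear, then click it.
// `page` is assumed to be a Puppeteer Page (v22.x) created by the caller.
async function clickByText(page, text) {
  const handle = await page.waitForSelector(xpathForText(text));
  await handle.click();
}
```

With a live page this would be called as `await clickByText(page, 'Next')`. Narrowing `//*` to the real tag (for example `//a` or `//button`) is usually preferable once the markup is known.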
1
vote
1
answer
33
views
Null output when scraping with Puppeteer
I'm trying to scrape all of the meaning and similar-word data from this site: https://en.dict.naver.com/#/search?query=${variable}
(e.g. https://en.dict.naver.com/#/search?query=win)
The scrape result is ...
0
votes
0
answers
32
views
Node.js: How to build a real-time worker pool that allows adding and removing a worker dynamically
I have a scraper that must make custom requests to an API. The point is that these requests need to be dynamic, as they can be managed, that is, added or removed. Every time a request is added, the ...
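For the worker-pool question above, one way to picture "add or remove a worker at runtime" is a registry of repeating tasks keyed by ID. This is a minimal in-process sketch, not the asker's design: `WorkerPool`, `add`, `remove`, and the `intervalMs` parameter are all hypothetical names, and each "worker" here is just an async function re-run on a timer rather than a separate thread or process.

```javascript
// Hypothetical sketch: a pool of repeating async tasks that can be added
// and removed while the program is running.
class WorkerPool {
  constructor() {
    this.workers = new Map(); // id -> { active, timer }
  }

  // Register a task under `id` and start running it every `intervalMs`.
  add(id, task, intervalMs = 1000) {
    if (this.workers.has(id)) {
      throw new Error(`worker ${id} already exists`);
    }
    const worker = { active: true, timer: null };
    const tick = async () => {
      if (!worker.active) return;
      try {
        await task(id);
      } catch (err) {
        // A failing task should not take the whole pool down.
        console.error(`worker ${id} failed:`, err);
      }
      // Schedule the next run only if the worker was not removed meanwhile.
      if (worker.active) worker.timer = setTimeout(tick, intervalMs);
    };
    this.workers.set(id, worker);
    worker.timer = setTimeout(tick, 0);
  }

  // Stop the task registered under `id`; returns false if it was unknown.
  remove(id) {
    const worker = this.workers.get(id);
    if (!worker) return false;
    worker.active = false;
    clearTimeout(worker.timer);
    this.workers.delete(id);
    return true;
  }

  get size() {
    return this.workers.size;
  }
}
```

Scheduling the next run only after the previous one finishes keeps runs of the same worker from overlapping. A run that is already in flight when `remove` is called still completes, but it is not rescheduled.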
0
votes
2
answers
138
views
Puppeteer getting 404 when connecting to Chrome browser remote debugging link (localhost:9222), how to fix this? [duplicate]
I am using Puppeteer on an existing browser window using this code:
const browser = await puppeteer.connect({
  browserWSEndpoint: 'ws://localhost:9222'
});
I have started a Chrome window ...
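For the `puppeteer.connect` question above, one likely cause of the 404 is that `ws://localhost:9222` is only the debugging port, not the full WebSocket endpoint; Chrome publishes the full URL as `webSocketDebuggerUrl` in the JSON served at `/json/version`. The sketch below assumes Chrome was started with remote debugging on port 9222 and that Node.js 18+ is used (for the global `fetch`); `extractWsEndpoint` and `resolveWsEndpoint` are hypothetical helper names.

```javascript
// Hypothetical helper: pull the browser-level WebSocket URL out of the JSON
// that Chrome serves at http://<host>:<port>/json/version.
function extractWsEndpoint(versionInfo) {
  const url = versionInfo && versionInfo.webSocketDebuggerUrl;
  if (typeof url !== 'string' || !url.startsWith('ws')) {
    throw new Error('webSocketDebuggerUrl missing from /json/version response');
  }
  return url;
}

// Hypothetical helper: ask the running browser for its WebSocket endpoint.
// Relies on the global fetch available in Node.js 18+.
async function resolveWsEndpoint(debugUrl = 'http://localhost:9222') {
  const res = await fetch(new URL('/json/version', debugUrl));
  if (!res.ok) {
    throw new Error(`GET /json/version failed: ${res.status}`);
  }
  return extractWsEndpoint(await res.json());
}
```

The result can then be passed as `puppeteer.connect({ browserWSEndpoint: await resolveWsEndpoint() })`. Alternatively, `puppeteer.connect({ browserURL: 'http://localhost:9222' })` lets Puppeteer perform the same lookup itself.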
0
votes
1
answer
68
views
Obtaining Correct Redirections Before Triggering Listeners in Puppeteer on Node.js
Currently, I'm learning Node.js with the Puppeteer library for scraping. I have a question regarding redirections. I encountered a scenario like this: I want to scrape the URL 'https://www.facebook....