Questions tagged [web-scraping]
Web scraping is the process of extracting specific information from websites that do not readily provide an API or other methods of automated data retrieval. Questions about "How To Get Started With Scraping" (e.g. with Excel VBA) should be *thoroughly researched* as numerous functional code samples are available. Web scraping methods include 3rd-party applications, development of custom software, or even manual data collection in a standardized way.
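As a rough illustration of the "custom software" route mentioned above, the sketch below pulls link targets out of an HTML string using only the Python standard library. It is a minimal sketch, not a recommended tool: the names `LinkCollector` and `extract_links` and the sample markup are hypothetical, and a real scraper would also need to fetch the page and handle errors.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href attribute of every <a> tag in a document."""

    def __init__(self) -> None:
        super().__init__()
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def extract_links(html: str) -> list[str]:
    """Return all link targets found in the given HTML text."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


if __name__ == "__main__":
    # Hypothetical sample markup, for illustration only.
    sample = '<p><a href="/first">First</a> and <a href="/second">Second</a></p>'
    print(extract_links(sample))
```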
web-scraping
50,890
questions
-1
votes
0
answers
18
views
Trying to scrape a web page that uses this type of authentication
I need help: I am trying to scrape a web page, but I need to log in, and the problem is this type of authentication. The authentication form shown in the screenshot is a basic browser authentication popup, which ...
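With a basic browser authentication popup of the kind this excerpt describes, the credentials usually travel in an `Authorization: Basic ...` request header rather than in an HTML form. The following is a minimal sketch using only the Python standard library, assuming the site really does use HTTP Basic authentication; the function name `basic_auth_header` and the sample credentials are hypothetical.

```python
import base64


def basic_auth_header(username: str, password: str) -> dict[str, str]:
    """Build the Authorization header used by HTTP Basic authentication."""
    # Basic auth joins the credentials with a colon and base64-encodes them.
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


if __name__ == "__main__":
    # Placeholder credentials, for illustration only.
    print(basic_auth_header("user", "pass"))
```

The resulting dictionary can be passed as request headers to an HTTP client. The `requests` library also accepts an `auth=(username, password)` tuple that produces the same header.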
0
votes
0
answers
9
views
Playwright locator doesn't update its reference on each loop iteration
The locator isn't updated on each loop iteration
# loop scrolling last element into view until no more new elements are created
stream_boxes = None
while True:
stream_boxes = page.locator("//...
0
votes
0
answers
39
views
How to Resolve Verify Traffic Error When Scraping Data from Shopee?
I'm currently working on a project where I need to scrape product data from Shopee. I'm using Python with the requests and BeautifulSoup libraries. However, I keep encountering a Verify Traffic error ...
-1
votes
0
answers
26
views
An element exists on the web page but turns out to be null when I scrape it. How can I solve this?
I am scraping a dynamic website that uses JS and React (a blockchain explorer). I attempted to build a program that is supposed to be able to scrape with JS running. However, it returns ...
0
votes
2
answers
37
views
How to obtain data from an IFRAME with Python and Selenium
I am trying to obtain a value from this page: https://www.bbva.com.co/personas/productos/inversion/fondos/pais.html
In the image, I show what I need to obtain.
Inspected page
The first thing that ...
-1
votes
0
answers
25
views
Puppeteer - check if an HTML element is visible 'on-screen'
Hello
My goal is to imitate human behaviour and make sure that only scrolled-to links/hyperlinks are clicked with puppeteer. Just like a human would click only the hyperlinks they can see on the ...
-1
votes
0
answers
21
views
Is it possible to scrape a single text field from an external website URL & display that text on a WordPress site?
For context: I'd like to be able to scrape 'Population' data from publicly available Australian government census information for each suburb, and have that number display dynamically in a text field ...
-5
votes
0
answers
24
views
How can I scrape websites to find those using a specific SaaS software? [closed]
I am trying to identify websites that use a specific SaaS software. I have a sample of the HTML structure where the SaaS software is integrated:
<div id="community_plans">
<div ...
0
votes
0
answers
34
views
html_nodes always returns {xml_nodeset (0)} [duplicate]
I'm trying to scrape this page with the rvest R package: https://www.bienici.com/recherche/achat/dessin-669cc780ec9a6600b7687ce8/2-pieces?prix-max=260000&surface-min=40&surface-max=55&neuf=...
-4
votes
0
answers
38
views
Scraping image challenge [closed]
I failed to scrape the profile picture from this website:
https://www.football.org.il/en/players/player/?player_id=113625
The pictures are being saved not as .png but as unusual "ImageServer/GetImage....
-2
votes
1
answer
49
views
Selenium web scraping reCAPTCHA [closed]
I want to scrape a website, but it is protected by a reCAPTCHA. I obtained the data using an API and injected it into the page, but since the webpage has no submit button, I couldn't submit it. The ...
1
vote
0
answers
35
views
Scraping a website that uses the __doPostBack method
I am trying to scrape with jsoup in Java, obtain information from the "stdregistro" table and save it in a table in my database at this URL:
pad.minem.gob.pe/REINFO_WEB/Index.aspx
but I only ...
-3
votes
1
answer
61
views
How do I fix my code that is returning an empty list?
I am scraping an e-commerce website and it's returning an empty list.
This is the code I wrote.
import requests
from bs4 import BeautifulSoup
baseurl = 'https://www.thewhiskyexchange.com/'
headers = {'...
-4
votes
1
answer
36
views
Best approach to simulate purchases for stock level information using Python and Selenium [closed]
I'm developing a web scraping service, primarily focusing on the fashion industry. My goal is to provide comprehensive data about products, including their stock levels. To achieve this, I need to ...
-3
votes
1
answer
39
views
Crawl data from the IMDb Top 250 Movies
Please, I need someone to help me. I can't understand why I only crawl 25 movies instead of 250. My code:
import pandas as pd
import requests
from bs4 import BeautifulSoup
headers = {'User-Agent': '...