All Questions
Tagged with web-scraping java
1,063
questions
1
vote
0
answers
35
views
Scraping a website that contains a __doPostBack method
I am trying to scrape with jsoup in Java, obtain information from the "stdregistro" table and save it in a table in my database, at this URL:
pad.minem.gob.pe/REINFO_WEB/Index.aspx
but I only ...
0
votes
1
answer
82
views
click() method from element object (HtmlUnit) is doing nothing
I am trying to crawl this website: https://www.softpedia.com/get/Programming/Other-Programming-Files/Apidog.shtml using HtmlUnit.
I want to "click" on the "Download now" button, it ...
-1
votes
1
answer
39
views
HtmlUnitAndroid click does not do anything
I am using HtmlUnit Android to scrape data from a website. I am able to create the web client and connect it to the website, but after clicking a button I cannot obtain further elements. In the code, ...
-1
votes
1
answer
41
views
error: cannot assign a value to final variable android java
I am working on an Android project and I'm stuck on this error: cannot assign a value to final variable count.
I have used the following code.
public final Thread z = new Thread(new Runnable() {
...
2
votes
2
answers
99
views
Extract business information from publicly available G Maps search data but getting NullPointerException
I am using the given code for web scraping. Basically it is a web page scraping app in which the user searches for the target by providing specific data. A webview is used in this code. The listed ...
0
votes
1
answer
110
views
How to scrape a Facebook page posts using jsoup?
I'm trying to scrape a Facebook page in Spring Boot using jsoup.
The method below returns an empty JSON:
@GetMapping("/test-json")
public String scrapeFacebookPageJson() throws IOException {...
0
votes
1
answer
215
views
HtmlUnit: Handling Hidden Google reCAPTCHA Token in Login Automation
I'm using HtmlUnit to automate the login process on a website. The website employs a Google reCAPTCHA to protect its login form, and the reCAPTCHA token is hidden in the HTML. I need to obtain and use ...
0
votes
2
answers
108
views
Could not start a new session - incompatible version - Selenium Java
<dependency>
<groupId>org.seleniumhq.selenium</groupId>
<artifactId>selenium-java</artifactId>
</dependency>
<dependency>
...
0
votes
0
answers
58
views
Java Selenium Stuck in Login
I have a bot that has been running for almost a year, but lately it gets stuck on login. It seems the website was updated; I'm not sure, but I suspect they added bot detection features. When I make a "new tab...
0
votes
0
answers
798
views
Error reading commands.properties, using local instead - Error Selenium Java
My dependencies
<dependency>
<groupId>org.seleniumhq.selenium</groupId>
<artifactId>selenium-java</artifactId>
</dependency>
<...
0
votes
0
answers
54
views
How do I solve this 403 Error on Java Webscraper?
I am running this web scraper to try to find data about job values from Indeed.com.
I keep running into a 403 error (: HTTP error fetching URL. Status=403, URL=[https://www.indeed.com/cmp/Critical-...
0
votes
1
answer
79
views
Jsoup can't find existing element by class name
I'm trying to parse a Pastebin page with all its params (likes, views_count, etc.). Everything works except the raw text.
I know that there is a <paste_id>/raw, but I don't want to use it since I will have to ...
1
vote
1
answer
232
views
Using HtmlUnit in Android Studio to scrape a website in an Android app, but WebClient is not recognized as an import in the Activity class
I want to scrape a website, but I can't use jsoup because jsoup doesn't execute JavaScript. I am trying to run HtmlUnit version 3.3.0 in my Android app, but in the activity class, it's not ...
1
vote
0
answers
44
views
Jsoup gives the same document from two different urls
I am making a Java application for Kakuro and I want to scrape the boards from a site. The dimensions "3x3" and "4x4" don't work, while "5x5", "...
0
votes
0
answers
175
views
Apache Nutch 1.19 Getting Error: 'boolean org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(java.lang.String, int)'
So, I set up Apache Nutch 1.19, Apache Solr 8.11.2, and Hadoop 3.3.5, to the best of my knowledge on a Windows 11 PC.
Afterwards, I went into the Nutch directory and ran this command:
bin/nutch generate ...