Try scraping using Java [Notes]

What I tried

I fetched the IT topics page of Yahoo! News (news.yahoo.co.jp/topics/it) and printed the link and headline of each article.


Notes


Environment: JDK 14.0.1, jsoup-1.13.1.jar


Because my VS Code environment was not fully set up, jsoup was not loaded at first and an error occurred. I fixed this by adding the jar to the referenced libraries in the following file.

settings.json


    "java.project.referencedLibraries": [
        "lib/**/*.jar",
        "C:\\path\\jsoup-1.13.1.jar"
    ],

demojava/demo3/Web.java


package demojava.demo3;

import java.io.IOException;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class Web {
	public static void main(String[] args) throws IOException {
		// Fetch and parse the Yahoo! News IT topics page.
		Document document = Jsoup.connect("https://news.yahoo.co.jp/topics/it").get();
		// Select every article link in the news feed by its CSS class.
		Elements links = document.select(".newsFeed_item_link");
		for (Element link : links) {
			// Print the article URL and headline, separated by " [[::]] ".
			System.out.println(link.attr("href") + " [[::]] " + link.text());
		}
	}
}
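
Each line printed by Web.java joins the URL and the headline with the " [[::]] " marker, so the output can be split back into two fields later. Below is a minimal sketch of that step using only the standard library. The class name FeedLineParser, the method parseLine, and the example.com sample lines are all made up for illustration; they are not part of the original program and not real scraped data.

```java
import java.util.List;

class FeedLineParser {
    // Same delimiter that Web.java prints between the href and the link text.
    static final String DELIMITER = " [[::]] ";

    // Splits one output line into {url, title}.
    // Returns null when the delimiter is not present in the line.
    static String[] parseLine(String line) {
        int idx = line.indexOf(DELIMITER);
        if (idx < 0) {
            return null;
        }
        String url = line.substring(0, idx);
        String title = line.substring(idx + DELIMITER.length());
        return new String[] { url, title };
    }

    public static void main(String[] args) {
        // Hypothetical sample lines in the same shape as the scraper output.
        List<String> lines = List.of(
                "https://example.com/articles/1 [[::]] Sample headline one",
                "https://example.com/articles/2 [[::]] Sample headline two");
        for (String line : lines) {
            String[] parts = parseLine(line);
            if (parts != null) {
                System.out.println("url=" + parts[0] + ", title=" + parts[1]);
            }
        }
    }
}
```

Splitting on the first occurrence of the marker means a headline that happens to contain " [[::]] " still ends up whole in the title field. If the data were going to be processed seriously, writing CSV or JSON would probably be a safer choice than a hand-made delimiter.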

Result

(Screenshot: 2020-10-10_13-25-42.png)
