The following example shows how to fetch a web page using the URL() constructor of the java.net.URL class:
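import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class Main {
    public static void main(String[] args) throws Exception {
        URL url = new URL("https://www.itzixishi.com");
        // try-with-resources closes both streams even if reading or writing fails.
        // UTF-8 is set explicitly because the page declares charset="utf-8".
        try (BufferedReader reader = new BufferedReader(
                 new InputStreamReader(url.openStream(), StandardCharsets.UTF_8));
             BufferedWriter writer = Files.newBufferedWriter(
                 Paths.get("data.html"), StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line); // echo the line to the console
                writer.write(line);       // and append it to data.html
                writer.newLine();
            }
        }
    }
}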
Running the code above prints the page's source code, which is also saved to data.html in the current directory:
<!doctype html>
<html lang="zh-cn">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
<title>IT自习室 - 从这里进入零和一的世界!</title>
……
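The read-and-save loop can also be pulled into a small reusable method. The following is only a minimal sketch: the class name PageSaver and the method save are hypothetical names introduced for illustration, and the sketch assumes the fetched content is UTF-8 text. To keep it runnable without network access, main feeds it a local file: URL in place of a web address.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class PageSaver {
    // Hypothetical helper: copies the text behind `source` into `target`
    // line by line and returns the number of lines written.
    // Assumes the content is UTF-8 encoded.
    static int save(URL source, Path target) throws IOException {
        int lines = 0;
        try (BufferedReader reader = new BufferedReader(
                 new InputStreamReader(source.openStream(), StandardCharsets.UTF_8));
             BufferedWriter writer = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                writer.write(line);
                writer.newLine();
                lines++;
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // A temporary local file stands in for a web page so the sketch runs offline.
        Path in = Files.createTempFile("page", ".html");
        Files.write(in, "<!doctype html>\n<html></html>\n".getBytes(StandardCharsets.UTF_8));
        Path out = Files.createTempFile("copy", ".html");

        int n = save(in.toUri().toURL(), out);
        System.out.println(n + " lines saved");
    }
}
```

Because openStream() works the same way for http:, https: and file: URLs, the same method should accept a web address such as the one used in the example above.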