Scraping IDWR bulletin data: influenza reports per sentinel site, by prefecture

Related: data wrangling the Ministry of Health, Labour and Welfare PDF on influenza outbreaks (pdfplumber)

The National Institute of Infectious Diseases (NIID) publishes the same data as a CSV file, so this post scrapes that instead.

from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

url = "https://www.niid.go.jp/niid/ja/data.html"

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; rv:11.0) like Gecko"
}

r = requests.get(url, headers=headers)
r.raise_for_status()

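soup = BeautifulSoup(r.content, "html.parser")

# Find the first link whose href ends with "-teiten.csv"
tag = soup.select_one(
    'div.leading-0 > table > tbody > tr > td > p.body1 > a[href$="-teiten.csv"]'
)
if tag is None:
    raise ValueError("No link ending in -teiten.csv found; the page layout may have changed")

# Resolve the (possibly relative) href against the page URL
link = urljoin(url, tag.get("href"))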
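The link lookup can be tried offline before hitting the real page. The following is a minimal sketch, not the method used above: the helper name `find_csv_link`, the looser selector `a[href$="-teiten.csv"]`, and the HTML fragment with its `/example/dir/sample-teiten.csv` path are all assumptions made for illustration. It returns `None` instead of failing when no matching link exists.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def find_csv_link(html, base_url, suffix="-teiten.csv"):
    """Return the absolute URL of the first link ending in `suffix`, or None.

    Hypothetical helper: matches any <a> by href suffix, regardless of
    where it sits in the page.
    """
    soup = BeautifulSoup(html, "html.parser")
    tag = soup.select_one(f'a[href$="{suffix}"]')
    if tag is None:
        return None
    # urljoin resolves a relative href against the page URL
    return urljoin(base_url, tag["href"])


# Made-up HTML fragment; the real page structure may differ.
sample_html = (
    '<table><tr><td><p class="body1">'
    '<a href="/example/dir/sample-teiten.csv">CSV</a>'
    "</p></td></tr></table>"
)
base = "https://www.niid.go.jp/niid/ja/data.html"

sample_link = find_csv_link(sample_html, base)
missing_link = find_csv_link("<p>no link here</p>", base)
```

A selector that matches only on the href suffix is less tied to the table layout than the full path of nested elements, at the cost of possibly matching an unrelated link if the page ever contains more than one.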

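import pandas as pd

# Read the CSV straight from the URL; the file is encoded as cp932
df = pd.read_csv(
    link,
    encoding="cp932",
    skiprows=3,         # skip the three lines above the header row
    index_col=0,        # use the first column as the index
    header=0,
    usecols=[0, 1, 2],  # keep only the first three columns
    na_values="-",      # treat "-" as missing
)

# Drop rows whose index is empty
df1 = df[df.index.notna()]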
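To see what these `read_csv` options do without downloading anything, here is a small sketch that runs the same options on an in-memory CSV. The sample content, its column names (`reports`, `per_site`), and its layout (three preamble lines, a header row, and a trailing row with an empty first column) are assumptions for illustration, not a copy of the real file. `encoding` is omitted because the sample is already a Python string.

```python
import io

import pandas as pd

# Made-up sample imitating the assumed layout of the real CSV
sample_csv = "\n".join(
    [
        "Sample bulletin title",  # preamble line 1 (skipped)
        "Week 1",                 # preamble line 2 (skipped)
        "Influenza,,",            # preamble line 3 (skipped)
        ",reports,per_site",      # header row
        "Hokkaido,10,0.05",
        "Aomori,-,-",             # "-" becomes NaN via na_values
        ",,",                     # empty first column -> NaN index
    ]
)

df = pd.read_csv(
    io.StringIO(sample_csv),
    skiprows=3,
    index_col=0,
    header=0,
    usecols=[0, 1, 2],
    na_values="-",
)

# Keep only rows that have an index label
df1 = df[df.index.notna()]
print(df1)
```

In this sample the row with the empty first column is read with a NaN index and is removed by the `notna()` filter, while the `-` cells remain in `df1` as missing values.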
