I tried scraping

Nice to meet you. This is Nagamasa Yamada. I tried making a program that looks up words by scraping an online dictionary. I wrote it in Google Colaboratory: https://colab.research.google.com/notebooks/welcome.ipynb?hl=ja

Purpose

Create a program that looks up the meaning of a word using Python.

How this works

Input ➡ Get the information from an online dictionary by scraping ➡ Display

Online dictionary to use

[Screenshot of the online dictionary, dictionary.goo.ne.jp]

Program

python


from bs4 import BeautifulSoup
import urllib.request  # needed for Request and urlopen below
import urllib.parse
# urllib.parse (the line above) is the one that encodes Japanese for the URL
g = input()  # the word to look up
m = urllib.parse.quote(g)  # percent-encode the word
url = f'https://dictionary.goo.ne.jp/srch/all/{m}/m0u/'
headers = {
    "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:47.0) Gecko/20100101 Firefox/47.0",
}
request = urllib.request.Request(url, headers=headers)
html = urllib.request.urlopen(request)
soup = BeautifulSoup(html, 'html.parser')
# pick out the text of each entry in the search results
a = soup.select('div[class="example_sentence"] ul[class="content_list idiom lsize"] p[class="text"]')
for x in a:
    print(x.text)

python


from bs4 import BeautifulSoup
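import urllib.request

These are the ones you need for scraping. urllib.request fetches the page and BeautifulSoup parses the HTML. Note that a plain `import urllib` is not guaranteed to load urllib.request, so it is safer to import the submodule by name.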

python


import urllib.parse

This is the one you need to percent-encode the search word so it can go into a URL.

python


g=input()
m= urllib.parse.quote(g)

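This percent-encodes whatever you type in. I forgot this step at first and the program didn't work, because urllib cannot send a URL that contains raw Japanese characters.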
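To see what this step does, here is a minimal sketch of urllib.parse.quote applied to a Japanese word. The sample word 猫 is only an illustration and is not part of the original program.

```python
import urllib.parse

word = "猫"  # sample input, chosen only for illustration
encoded = urllib.parse.quote(word)  # UTF-8 bytes written as %XX escapes
print(encoded)  # %E7%8C%AB

# unquote reverses the encoding, so nothing is lost
decoded = urllib.parse.unquote(encoded)
print(decoded == word)  # True

# plain ASCII letters pass through unchanged
print(urllib.parse.quote("python"))  # python
```

One kanji becomes three %XX escapes because it takes three bytes in UTF-8.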

python


url =f'https://dictionary.goo.ne.jp/srch/all/{m}/m0u/'
headers = {
          "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:47.0) Gecko/20100101 Firefox/47.0",
          }
request = urllib.request.Request(url, headers=headers)
html = urllib.request.urlopen(request)
soup = BeautifulSoup(html, 'html.parser')
a = soup.select('div[class="example_sentence"] ul[class="content_list idiom lsize"] p[class="text"]')
for x in a:
  print(x.text)

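The variable url holds the address of the online dictionary, and the word you want to look up goes in where {m} is. The User-Agent header makes the request look like it comes from a browser.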
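As a rough sketch of what ends up in url and request, the snippet below builds both without sending anything over the network. The helper name build_request and the sample word are hypothetical; the URL pattern and the User-Agent string are the ones used in the program.

```python
import urllib.parse
import urllib.request

USER_AGENT = "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:47.0) Gecko/20100101 Firefox/47.0"


def build_request(word):
    """Hypothetical helper: build the search request for one word."""
    m = urllib.parse.quote(word)  # percent-encode the word first
    url = f"https://dictionary.goo.ne.jp/srch/all/{m}/m0u/"
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


request = build_request("猫")  # nothing is sent until urlopen is called
print(request.full_url)
# Request stores header names with only the first letter capitalized
print(request.get_header("User-agent"))
```

Printing request.full_url before calling urlopen is a quick way to check that the word was encoded and placed in the right spot.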

python


a = soup.select('div[class="example_sentence"] ul[class="content_list idiom lsize"] p[class="text"]')

This picks out the part you want to know: every p element with class text that sits inside the content_list list in the example_sentence block.
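The selector can be tried out on a small piece of HTML without touching the real site. The HTML below is made up for illustration and only imitates the class names used in the selector; it is not the dictionary's actual markup.

```python
from bs4 import BeautifulSoup

# Invented sample HTML; only the class names follow the selector in the program.
sample_html = """
<div class="example_sentence">
  <ul class="content_list idiom lsize">
    <li><p class="text">first meaning</p></li>
    <li><p class="text">second meaning</p></li>
  </ul>
</div>
<div class="other"><p class="text">unrelated</p></div>
"""

soup = BeautifulSoup(sample_html, "html.parser")

# Same selector as in the program: [class="..."] compares the whole class attribute.
strict = soup.select(
    'div[class="example_sentence"] ul[class="content_list idiom lsize"] p[class="text"]'
)
strict_texts = [p.text for p in strict]
print(strict_texts)

# Dot syntax matches each class separately, in any order.
loose = soup.select("div.example_sentence ul.content_list.idiom.lsize p.text")
loose_texts = [p.text for p in loose]
print(loose_texts)
```

Both selectors find the same two paragraphs here and skip the unrelated one. The [class="..."] form stops matching if the page adds or reorders a class, so the dot form may be the safer choice if the site changes.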

python


for x in a:
  print(x.text)

Now I print everything in a, one item at a time. (The end) This was my first time writing an article, so it may be hard to follow, but thank you for reading.
