Creating a scraping tool

This is my first post on Qiita. Thank you for reading.

Self-introduction

31 years old, male. Graduated from the Department of Information Science at a national university.
Age 22: Joined an independent SIer and was stationed on-site at a food wholesale company.
Age 26: Moved to the information systems department of a food wholesale company, where I have worked up to now.

Why I wanted to learn Python

After being transferred to the planning section of the information systems department at the food wholesale company, I proposed introducing AI.

However, AI vendors were too expensive, and the user departments rejected the proposal as not cost-effective.

I had a complex about having done nothing but legacy development, so I wondered whether I could build deep learning into a proposal by myself, and I started studying Python.

While studying Python I learned about scraping. I thought there would be demand for it, so I tried turning it into a tool.

About the scraping tool

The food wholesale company has more than 50 branches and more than 100 stores, each with different customers, so the information systems department could not handle everything itself. I therefore designed the tool so that staff at each store can use it as long as they have a little IT knowledge and some patience.

Execution method

A batch file is distributed to each PC's Startup folder, so when the PC is started in the morning the batch file runs the Python program. The program fetches each customer's information, and if anything differs from the previously fetched content, the URL and the new information are shown in a pop-up.
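
The compare-and-notify step described above could look roughly like the following. This is only a minimal sketch, not the code in the repository: the function names (`read_rows`, `find_new_rows`, `check_for_updates`, `show_popup`) are hypothetical, it assumes the previous fetch is kept in the output CSV, and it assumes the pop-up is a tkinter message box.

```python
import csv
from pathlib import Path


def read_rows(path):
    """Return the rows of a CSV file as tuples (empty list if the file is missing)."""
    path = Path(path)
    if not path.exists():
        return []
    with path.open(newline="", encoding="utf-8") as f:
        return [tuple(row) for row in csv.reader(f)]


def find_new_rows(previous_rows, current_rows):
    """Return rows that are in the current fetch but not in the previous one."""
    seen = set(previous_rows)
    return [row for row in current_rows if row not in seen]


def write_rows(path, rows):
    """Overwrite the CSV so that the next run compares against this fetch."""
    with Path(path).open("w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(rows)


def show_popup(url, new_rows):
    """Show the URL and the new rows in a message box (assumes tkinter is usable)."""
    import tkinter as tk
    from tkinter import messagebox

    root = tk.Tk()
    root.withdraw()  # hide the empty main window; only the dialog appears
    text = url + "\n\n" + "\n".join(", ".join(row) for row in new_rows)
    messagebox.showinfo("New information", text)
    root.destroy()


def check_for_updates(url, output_path, current_rows, notify=show_popup):
    """Compare with the previous CSV, notify if anything is new, then save."""
    current_rows = [tuple(row) for row in current_rows]
    new_rows = find_new_rows(read_rows(output_path), current_rows)
    if new_rows:
        notify(url, new_rows)
    write_rows(output_path, current_rows)
    return new_rows
```

In this sketch the very first run has no previous file, so every row counts as new. Passing a different `notify` function makes it possible to try the comparison without opening a window.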

File structure

The input is a simple CSV file so that store staff can create it themselves. The output is also CSV, which makes it easy to compare with the previous fetch.
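
As a rough illustration of reading such an input file, here is a minimal sketch. The column layout is an assumption made for this example (one row per site, no header: URL, then one to three class names, then the output file name), and `Target` and `load_targets` are hypothetical names, not taken from the published tool.

```python
import csv
from dataclasses import dataclass


@dataclass
class Target:
    url: str
    classes: list
    output_file: str


def load_targets(f):
    """Parse rows of the assumed form: url, class1[, class2[, class3]], output_file."""
    targets = []
    for line_no, row in enumerate(csv.reader(f), start=1):
        cells = [cell.strip() for cell in row]
        if not any(cells):
            continue  # skip blank lines
        if len(cells) < 3:
            raise ValueError(f"line {line_no}: need URL, class and output file name")
        url, *classes, output_file = cells
        classes = [c for c in classes if c]  # an empty cell means "class not used"
        if not 1 <= len(classes) <= 3:
            raise ValueError(f"line {line_no}: specify between 1 and 3 classes")
        targets.append(Target(url, classes, output_file))
    return targets
```

A file could then be loaded with `with open("input.csv", newline="", encoding="utf-8") as f: targets = load_targets(f)`, where `input.csv` is a placeholder name. Rejecting malformed rows with the line number is one way to help non-specialist users fix their own files.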

Items to specify

1. URL
2. Class of the items to fetch (up to 3 can be specified)
3. Output file name
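
To make "fetching by class" concrete, here is a minimal sketch that collects the text of every element carrying a given class. It uses only the standard library's `html.parser` so that it runs without extra packages. The published tool may well use a different library, and `ClassTextExtractor` and `extract_by_class` are hypothetical names. It assumes reasonably well-formed HTML.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class ClassTextExtractor(HTMLParser):
    """Collect the text of each element whose class attribute contains class_name."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.depth = 0      # greater than 0 while inside a matching element
        self.buffer = []
        self.results = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1  # nested tag inside a matching element
        elif self.class_name in (dict(attrs).get("class") or "").split():
            self.depth = 1
            self.buffer = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.depth:
            return
        self.depth -= 1
        if self.depth == 0:
            self.results.append("".join(self.buffer).strip())

    def handle_data(self, data):
        if self.depth:
            self.buffer.append(data)


def extract_by_class(html, class_name):
    """Return the text of all elements in html that have the given class."""
    parser = ClassTextExtractor(class_name)
    parser.feed(html)
    return parser.results
```

For example, `extract_by_class('<p class="news">Sale on Friday</p>', "news")` returns `['Sale on Friday']`. Every element with the class is returned, including ones the user did not intend to fetch, and elements without a class cannot be addressed at all.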

Challenges

1. Items that have no class cannot be fetched
   → If no class is defined on the item you want, the tool cannot get it. I considered also supporting id and name, but that would make the input confusing, so I decided not to. This is left for future improvement.
2. Extra items other than the target are sometimes fetched
   → Since the output is not used as input data elsewhere, users are asked to delete the unnecessary parts themselves. I did add handling for fetching weather information, but generalizing it would make things complicated, so I have not done so. This is left for future improvement.
3. Users must take care not to violate a site's terms of service and not to overload its server.
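
For the last point, one common safeguard is to honor a site's robots.txt and to pause between requests. The following is a minimal sketch using the standard library's `urllib.robotparser`. The helper names and the one-second default delay are assumptions for illustration. robots.txt is not the same thing as the terms of service, which still have to be read by a person.

```python
import time
from urllib.robotparser import RobotFileParser


def build_parser(robots_txt):
    """Build a robots.txt parser from the text of the file."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed_urls(parser, urls, user_agent="*"):
    """Keep only the URLs that robots.txt allows this user agent to fetch."""
    return [url for url in urls if parser.can_fetch(user_agent, url)]


def polite_fetch_all(urls, fetch, delay_seconds=1.0, sleep=time.sleep):
    """Call fetch(url) for each URL, waiting delay_seconds between requests."""
    results = []
    for index, url in enumerate(urls):
        if index:
            sleep(delay_seconds)  # no wait before the first request
        results.append(fetch(url))
    return results
```

In real use the robots.txt text would be downloaded from the target site first (`RobotFileParser` can also do this itself through `set_url()` and `read()`), and `fetch` would be whatever function the tool uses to download a page.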

And now, changing jobs

This post has run long, but I want to build web development skills and become an engineer who does not depend on a single company, so I have started looking for a new job. I will use this scraping tool as part of my portfolio.

The tool is published on GitHub. I would be very grateful for any advice.
https://github.com/yamamasa2020/scraping-tool
