The other day I finished the Coursera Machine Learning course, so I wanted to try it out in practice. I will try to predict the three types (Cu, Co, Pa) of Idolmaster Cinderella Girls (https://ja.wikipedia.org/wiki/%E3%82%A2%E3%82%A4%E3%83%89%E3%83%AB%E3%83%9E%E3%82%B9%E3%82%BF%E3%83%BC_%E3%82%B7%E3%83%B3%E3%83%87%E3%83%AC%E3%83%A9%E3%82%AC%E3%83%BC%E3%83%AB%E3%82%BA) idols from their profile data.
First comes getting the data to train on. I looked for a Delemas counterpart of the Pokémon API, but nothing looked promising, so I decided to scrape the data from the Delemas wiki I usually use (https://imascg-slstage-wiki.gamerch.com/).
For the scraping itself, I referred to the following page: http://qiita.com/Azunyan/items/9b3d16428d2bcc7c9406
```python
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import urllib2
import csv
from bs4 import BeautifulSoup
#URL to access
url = "https://imascg-slstage-wiki.gamerch.com/%E3%82%A2%E3%82%A4%E3%83%89%E3%83%AB%E4%B8%80%E8%A6%A7"
#Read URL
html = urllib2.urlopen(url)
#Handle html with Beautiful Soup
soup = BeautifulSoup(html, "html.parser")
#Get all the contents of the first table
table = soup.findAll("table")[0]
#Decompose table row by row
rows = table.findAll("tr")
csvFile = open("aimasudata.csv", 'wt')
writer = csv.writer(csvFile)
for row in rows:
    csvRow = []
    #Collect the text of every header and data cell in this row
    for cell in row.findAll(['td', 'th']):
        csvRow.append(cell.get_text().encode('utf-8'))
    writer.writerow(csvRow)
csvFile.close()
```
That is about all it takes.
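As a quick sanity check (my own addition, not part of the original script), you can read the generated CSV back and print the first few rows. This assumes the script above has already run and produced aimasudata.csv:

```python
# -*- coding: utf-8 -*-
#Quick check: assumes aimasudata.csv was created by the script above
import csv

with open("aimasudata.csv", "r") as f:
    reader = csv.reader(f)
    for i, row in enumerate(reader):
        print(row)      #each cell is a UTF-8 encoded byte string (Python 2)
        if i >= 4:      #only look at the first five rows
            break
```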
- I didn't know how to read HTML tags, so it took me a while to figure out what to pass to soup.findAll. If you just want the data in a table, you can pass "table" and pick the table you need by its index within the page (see the sketch after these notes).
- cell.get_text() returns Japanese text that cannot be written as ASCII, so encoding it to UTF-8 is required.
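For reference, here is a minimal sketch of how to look for the right table index. This snippet is my own illustration, not from the original post, and it reuses the soup object from the script above:

```python
#Print the index and header row of every table on the page, so you can
#see which index to pass to soup.findAll("table")[...]
for i, t in enumerate(soup.findAll("table")):
    first_row = t.find("tr")
    if first_row is None:
        continue
    headers = [c.get_text().encode('utf-8') for c in first_row.findAll(['th', 'td'])]
    print("%d: %s" % (i, headers))
```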