Creating a decision tree with scikit-learn

I had been building decision trees with Weka, but converting the data to ARFF format was a hassle. I wanted to try the Python machine learning library scikit-learn, so I created a decision tree with it instead. Installing scikit-learn is explained carefully on other websites, so I skip it here.

Creating the decision tree object itself is fairly easy. To draw it, install Graphviz with brew (Mac). The pyparsing library has since been updated, and the updated version seems to get in the way of drawing, so when you want to draw, downgrade it while installing pydot:

sudo pip install -U pydot pyparsing==1.5.7

I don't know how to do this on Windows (said in a low voice).

`tree_ex.py`


# -*- coding: utf-8 -*-

# Limitations noticed so far:
# - Null (missing) values cannot be used. What should I do about them?
# - yes/no answers are encoded as 1/-1.
# - Character strings cannot be used as feature values.
from sklearn import tree
import pydot

if __name__ == '__main__':

    X = [
        [0, 1],
        [0, -1],
        [1, 1],
    ]
    Y = [1, 2, 3]  # Labels for the rows of X, in order from the top
    clf = tree.DecisionTreeClassifier()
    clf = clf.fit(X, Y)  # This completes the decision tree object

    # Magic for drawing: export the tree as DOT text, then render it with pydot
    dot_data = tree.export_graphviz(clf, out_file=None)
    graphs = pydot.graph_from_dot_data(dot_data)
    # Newer pydot returns a list of graphs, older versions a single graph
    graph = graphs[0] if isinstance(graphs, list) else graphs
    graph.write_pdf("tree_ex.pdf")

    # predict expects a 2D input: one row per sample
    # pre = clf.predict([[0, 1]])
    # print(pre)  # The result is [1]

X is the data and Y holds the label for each row of the data. The fit function ties X and Y together (I think).

Once the fit function has been applied, clf is both the decision tree object and the classifier. You can use the commented-out predict function to find out which class new data belongs to.
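As a minimal sketch of the fit and predict steps on their own, the snippet below reuses the toy data from tree_ex.py. The random_state argument is my addition for repeatability and is not in the original script. Note that predict takes a list of rows, even when classifying a single sample.

```python
from sklearn import tree

# Same toy data as in tree_ex.py: three samples with two features each.
X = [[0, 1], [0, -1], [1, 1]]
Y = [1, 2, 3]  # label for each row of X, in order

# random_state is an assumption added here so repeated runs behave the same.
clf = tree.DecisionTreeClassifier(random_state=0)
clf.fit(X, Y)

# predict expects a 2D array-like: one row per sample to classify.
pred = clf.predict([[0, 1]])
print(pred)  # [1]
```

Because every training row is distinct, the fully grown tree reproduces the training labels, which is why the first row comes back as class 1.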

The rest is boilerplate (the "magic") that calls pydot to draw the tree.
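If installing pydot and Graphviz is too much trouble, the fitted tree can also be inspected as plain text. This is only a sketch, assuming scikit-learn 0.21 or later, where export_text is available. The feature names q1 and q2 are made up for illustration.

```python
from sklearn import tree
from sklearn.tree import export_text

X = [[0, 1], [0, -1], [1, 1]]
Y = [1, 2, 3]
clf = tree.DecisionTreeClassifier(random_state=0).fit(X, Y)

# q1 and q2 are hypothetical names for the two columns of X.
rules = export_text(clf, feature_names=["q1", "q2"])
print(rules)  # indented text listing each split and the class at each leaf
```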

The above drawing result looks like this

Weka's decision trees are hard to read, so I tried making one in Python. Creating the decision tree itself is very easy, and the result is easy to read. However, I ran into the following problems:

・ Branching conditions are not shown (I could not get them to appear, due to my lack of skill)

・ A questionnaire item whose answer is one of three choices (1, 2, or 3) cannot be handled as is (yes and no can be expressed as [1, -1])

・ Null values and character strings are not accepted
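One common workaround for the last two points is to preprocess the data before calling fit. The sketch below is not what the original script does: it uses scikit-learn's OneHotEncoder to turn a string-valued item into 0/1 columns and SimpleImputer to fill in missing values. The questionnaire answers and labels are invented for illustration.

```python
import numpy as np
from sklearn import tree
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical questionnaire item with three possible string answers.
answers = [["red"], ["green"], ["blue"], ["green"]]
labels = [1, 2, 3, 2]

# One 0/1 column per distinct answer, so no ordering is implied.
encoder = OneHotEncoder(handle_unknown="ignore")
X_encoded = encoder.fit_transform(answers).toarray()

clf = tree.DecisionTreeClassifier(random_state=0).fit(X_encoded, labels)
pred = clf.predict(encoder.transform([["blue"]]).toarray())
print(pred)  # [3]

# Missing values: replace NaN with the most frequent value of each column.
X_missing = [[0, 1], [0, np.nan], [1, 1]]
imputer = SimpleImputer(strategy="most_frequent")
X_filled = imputer.fit_transform(X_missing)
print(X_filled)
```

Whether the most frequent value is a sensible replacement depends on the data, so treat the imputation strategy as a placeholder.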

For now, I feel that Weka is easier to use. Maybe I just need to pass some extra argument options...
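On the question of argument options: export_graphviz accepts feature_names and class_names, which may be what is needed to make the branching conditions readable. The sketch below only produces the DOT text, so it runs without pydot or Graphviz. The names q1, q2 and A, B, C are made up for this example.

```python
from sklearn import tree

X = [[0, 1], [0, -1], [1, 1]]
Y = [1, 2, 3]
clf = tree.DecisionTreeClassifier(random_state=0).fit(X, Y)

dot_text = tree.export_graphviz(
    clf,
    out_file=None,                # return the DOT source as a string
    feature_names=["q1", "q2"],   # hypothetical question names
    class_names=["A", "B", "C"],  # hypothetical names for labels 1, 2, 3
    filled=True,                  # color each node by its majority class
)
print(dot_text)
```

The resulting string can be passed to pydot.graph_from_dot_data in the same way as in tree_ex.py.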
