How to enjoy Coursera / Machine Learning (Week 10)

(Regarding the benchmark test, see the [Addendum](http://qiita.com/TomokIshii/items/b5708a02895847e3588c#%E8%BF%BD%E8%A8%98-theano-gpu%E8%A8%88%E7%AE%97%E3%81%A7mini-batch%E3%82%B5%E3%82%A4%E3%82%BA%E3%81%AB%E7%9D%80%E7%9B%AE%E3%81%97%E3%81%A6%E3%83%99%E3%83%B3%E3%83%81%E3%83%9E%E3%83%BC%E3%82%AF) below.)

The other day, I posted to Qiita about how to enjoy the programming assignments (Matlab) of the Coursera Machine Learning course (by Stanford University, Prof. Andrew Ng) by porting them to Python. I have since continued with the course, only to learn that the programming assignments I had been looking forward to end before Week 10. (Of the 11 weeks in total, programming assignments are available from Week 1 through Week 9; quizzes, on the other hand, continue through Weeks 10 and 11.)

Week 10, "Large Scale Machine Learning", had interesting lectures on Stochastic Gradient Descent (SGD) and Online Learning, so with no programming assignment provided, I decided to study on my own and implemented SGD in Python. (This also doubles as an exercise in the Deep Learning framework "Theano".) I will also introduce some interesting findings (tips) from the benchmark test I ran after implementing SGD.

Video Lecture Outline (Week 10)

In the video, the stochastic gradient descent method is explained in contrast to the normal gradient descent method (Batch Gradient Descent).

**Batch Gradient Descent**

Cost function:

```math
J_{train}(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_{\theta}(x^{(i)}) - y^{(i)} \right)^2
```

The following iterative calculation is performed to minimize this cost function.

Repeat {

```math
\theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_{\theta}(x^{(i)}) - y^{(i)} \right) x_j^{(i)} \qquad (\textbf{for every } j = 0, \ldots, n)
```

}

**Stochastic Gradient Descent**

**1.** Randomly shuffle (reorder) the training examples.

**2.** Then update $\theta$ while referring to the training examples one at a time.

Repeat {

```math
\begin{array}{l}
\textbf{for } i := 1, \ldots, m \ \{ \\
\quad \theta_j := \theta_j - \alpha \left( h_{\theta}(x^{(i)}) - y^{(i)} \right) x_j^{(i)} \qquad (\textbf{for every } j = 0, \ldots, n) \\
\}
\end{array}
```

}

The lecture explains how to update the parameters while referring to the training examples one at a time, and then introduces Mini-Batch Gradient Descent as a method positioned between Batch Gradient Descent and Stochastic Gradient Descent.
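To make the contrast concrete, here is a minimal NumPy sketch of the two update rules for linear regression (the function names and variables are my own, not from the lecture):

```python
import numpy as np

def batch_gd_step(theta, X, y, alpha):
    # one parameter update using all m training examples at once
    m = len(y)
    grad = X.T.dot(X.dot(theta) - y) / m
    return theta - alpha * grad

def sgd_step(theta, x_i, y_i, alpha):
    # one parameter update using a single training example x_i
    return theta - alpha * (x_i.dot(theta) - y_i) * x_i

# usage: shuffle, then sweep the examples one by one
# idx = np.random.permutation(len(y))
# for i in idx:
#     theta = sgd_step(theta, X[i], y[i], alpha)
```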

Implementation of normal gradient descent (logistic regression)

For the data used to check the code, I selected the "Adult" dataset from the UCI Machine Learning Repository. It is extracted from the US Census database and appears to be a popular dataset in machine learning.

```
39, State-gov, 77516, Bachelors, 13, Never-married, Adm-clerical, Not-in-family, White, Male, 2174, 0, 40, United-States, <=50K
50, Self-emp-not-inc, 83311, Bachelors, 13, Married-civ-spouse, Exec-managerial, Husband, White, Male, 0, 0, 13, United-States, <=50K
38, Private, 215646, HS-grad, 9, Divorced, Handlers-cleaners, Not-in-family, White, Male, 0, 0, 40, United-States, <=50K
53, Private, 234721, 11th, 7, Married-civ-spouse, Handlers-cleaners, Husband, Black, Male, 0, 0, 40, United-States, <=50K
28, Private, 338409, Bachelors, 13, Married-civ-spouse, Prof-specialty, Wife, Black, Female, 0, 0, 40, Cuba, <=50K
```

Age, educational background, occupation type, marital status, and so on are listed, and at the end of each line is the income-class label, "<=50K" or ">50K". This is the dependent (explained) variable used for classification. The question is what to choose as the explanatory variable (feature) for the regression; this time I chose just one, the number of years of schooling. Years of schooling gives a fairly fine-grained measure of educational background, and it does seem to be linked to income in the real world.
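The body of load_data() is omitted in the code below; for reference, a possible way to read 'adult.data' is sketched here (the column names are labels I assign following the attribute order documented in 'adult.names'):

```python
import numpy as np
import pandas as pd

# Sketch: read 'adult.data' and extract one feature and the label
cols = ['age', 'workclass', 'fnlwgt', 'education', 'education_num',
        'marital_status', 'occupation', 'relationship', 'race', 'sex',
        'capital_gain', 'capital_loss', 'hours_per_week',
        'native_country', 'income']
df = pd.read_csv('adult.data', names=cols, skipinitialspace=True)

xmat = df['education_num'].values.astype(np.float64)       # years of schooling
ymat = (df['income'] == '>50K').astype(np.float64).values  # 1 if '>50K', else 0
```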

Following the approach of the earlier Coursera Machine Learning assignments, we start by defining the cost function and a function that computes its partial derivatives (gradient).

```python
import numpy as np
import pandas as pd
import timeit

import theano
import theano.tensor as T

def load_data():
    # (omitted)
    return xtr, ytr, xte, yte

def compute_cost(w, b, x, y):
    p_1 = 1 / (1 + T.exp(-T.dot(x, w) - b))  # same as sigmoid(T.dot(x, w) + b)
    income_class = lambda predictor: T.gt(predictor, 0.5)  # 0.5 is the threshold
    prediction = income_class(p_1)

    xent = -y * T.log(p_1) - (1 - y) * T.log(1 - p_1)  # cross-entropy loss
    cost = xent.mean() + 0.01 * (w ** 2).sum()         # with L2 regularization

    return cost, prediction

def compute_grad(cost, w, b):
    gw, gb = T.grad(cost, [w, b])  # symbolic gradients w.r.t. w and b

    return gw, gb
```

A feature of the "Theano" framework is that, once you get used to it (it is hard to follow until you do), statements can be written very concisely. In particular, the gradient calculation takes just one line.

The main processing is performed using these functions.

```python
    xtr, ytr, xte, yte = load_data()

    # Declare Theano symbolic variables
    xtr_shape = xtr.shape
    if len(xtr_shape) == 2:
        w_len = xtr_shape[1]
    else:
        w_len = 1

    x = T.matrix('x')    # for xmat
    y = T.vector('y')    # for ymat, labels
    w = theano.shared(np.zeros(w_len), name='w')    # w, b <- all zeros
    b = theano.shared(0., name='b')

    print ' Initial model: '
    wi = w.get_value()
    bi = b.get_value()
    print 'w : [%12.4f], b : [%12.4f]' % (wi[0], bi)

    cost, prediction = compute_cost(w, b, x, y)  # ... Cost-J
    gw, gb = compute_grad(cost, w, b)            # ... Gradients

    # Compile
    train = theano.function(
          inputs=[x, y],
          outputs=[cost, prediction],
          updates=((w, w - 0.1 * gw), (b, b - 0.1 * gb)),
          allow_input_downcast=True)
    predict = theano.function(inputs=[x], outputs=prediction,
          allow_input_downcast=True)

    # Train (Optimization)
    start_time = timeit.default_timer()
    training_steps = 10000
    xtr = xtr.reshape(len(xtr), 1)  # shape: (m,) to (m,1)
    for i in range(training_steps):
        cost_j, pred = train(xtr, ytr)
```

As described above, the parameters (w, b) that minimize the cost function are obtained by the gradient descent method (Batch Gradient Descent). No convergence test is performed; the solution is obtained by updating the parameters a predetermined number of times.
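If a convergence test were wanted, a simple variant of the training loop could stop once the cost change falls below a tolerance (a sketch; the tolerance 1e-7 is an arbitrary choice of mine, not a value from the course):

```python
    # Hypothetical convergence test (not used in the measurements below)
    prev_cost = np.inf
    for i in range(training_steps):
        cost_j, pred = train(xtr, ytr)
        if abs(prev_cost - cost_j) < 1.e-7:   # stop when the cost stops moving
            print 'converged at step %d' % i
            break
        prev_cost = cost_j
```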

Implementation of Stochastic Gradient Descent

Now for the implementation of Stochastic Gradient Descent. In the Coursera lecture, two variants were explained: Stochastic Gradient Descent, which scans the training data one example at a time, and Mini-Batch Gradient Descent, which scans it in small batches of roughly 2 to 100 examples. I chose the mini-batch variant here.

In SGD, the training data is randomly shuffled as a preprocessing step. In addition, to speed up processing, I decided to put the data into Theano shared variables.

```python
def setup_data(xmat, ymat):
    # store the data into 'shared' variables to be accessible by Theano
    def shared_dataset(xm, ym, borrow=True):
        shared_x = theano.shared(np.asarray(xm, dtype=theano.config.floatX),
                                        borrow=borrow)
        shared_y = theano.shared(np.asarray(ym, dtype=theano.config.floatX),
                                        borrow=borrow)
        #
        return shared_x, shared_y

    def data_shuffle(xm, ym, siz):
        idv = np.arange(siz)
        idv0 = np.array(idv)    # copy numbers
        np.random.shuffle(idv)
        xm[idv0] = xm[idv]
        ym[idv0] = ym[idv]

        return xm, ym

    total_len = ymat.shape[0]
    n_features = np.size(xmat) / total_len
    # Random Shuffle
    xmat, ymat = data_shuffle(xmat, ymat, total_len)
    train_len = int(total_len * 0.7)
    test_len = total_len - train_len

    xtr, ytr = shared_dataset((xmat[:train_len]).reshape(train_len, n_features),
                               ymat[:train_len])
    xte, yte = shared_dataset((xmat[train_len:]).reshape(test_len, n_features),
                               ymat[train_len:])

    rval = [(xtr, ytr), (xte, yte)]
    return rval
```

Because the dataset now lives in shared variables, the theano.function definitions change: the data can no longer be passed directly through **inputs** but must be supplied indirectly through the **givens** keyword.

Input via Theano variables (not shared variables), reprinted from above:

```python
    # Compile
    train = theano.function(
          inputs=[x, y],
          outputs=[cost, prediction],
          updates=((w, w - 0.1 * gw), (b, b - 0.1 * gb)),
          allow_input_downcast=True)
    predict = theano.function(inputs=[x], outputs=prediction,
          allow_input_downcast=True)
```

Input from shared variables (SGD version):

```python
    # Compile
    batch_size = 10
    index = T.lscalar('index')        # mini-batch index (symbolic scalar)
    learning_rate = T.scalar('lr')    # learning rate (symbolic scalar)
    train_model = theano.function(
          inputs=[index, learning_rate],
          outputs=[cost, prediction],
          updates=((w, w - learning_rate * gw), (b, b - learning_rate * gb)),
          givens=[(x, xtr[index * batch_size:(index + 1) * batch_size]),
                  (y, ytr[index * batch_size:(index + 1) * batch_size])],
          allow_input_downcast=True
    )
    predict = theano.function(
          inputs=[],
          outputs=prediction,
          givens=[(x, xte)],
          allow_input_downcast=True
    )
```


The iterative calculation is performed using the Theano functions defined above.

```python
    # Train (Optimization)
    start_time = timeit.default_timer()
    n_epochs = 20
    epoch = 0
    lrate_base = 0.03
    lrate_coef = 20
    n_train_batches = int(ytr.get_value().shape[0] / batch_size)

    while (epoch < n_epochs):
        epoch += 1
        for mini_batch_index in range(n_train_batches):
            # decaying learning-rate schedule: starts near lrate_base,
            # shrinks as the epoch count grows
            l_rate = lrate_base * lrate_coef / (epoch + lrate_coef)
            cost_j, pred = train_model(mini_batch_index, l_rate)

        print 'epoch[%3d] : cost =%f ' % (epoch, cost_j)
```
  

Execution result.

```
 Initial model: 
w : [      0.0000], b : [      0.0000]
epoch[  1] : cost =0.503755 
epoch[  2] : cost =0.510341 
epoch[  3] : cost =0.518218 
epoch[  4] : cost =0.524344 
epoch[  5] : cost =0.528745 
epoch[  6] : cost =0.531842 
epoch[  7] : cost =0.534014 
epoch[  8] : cost =0.535539 
epoch[  9] : cost =0.536614 
epoch[ 10] : cost =0.537375 
epoch[ 11] : cost =0.537913 
epoch[ 12] : cost =0.538294 
epoch[ 13] : cost =0.538563 
epoch[ 14] : cost =0.538751 
epoch[ 15] : cost =0.538880 
epoch[ 16] : cost =0.538966 
epoch[ 17] : cost =0.539021 
epoch[ 18] : cost =0.539053 
epoch[ 19] : cost =0.539067 
epoch[ 20] : cost =0.539069 

 Final model: 
w : [      0.3680], b : [     -4.9370]
Elapsed time:     26.565 [s]
accuracy =       0.7868 
```

I plotted how the parameters change during the calculation.

**Fig. Plot for each epoch** (converge_plot1.png)

**Fig. Plot for each mini-batch** (converge_plot2.png)

When the resolution is increased (per mini-batch rather than per epoch), the parameter movement characteristic of stochastic gradient descent (SGD) can be observed.
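One way to record such a trajectory is to read the shared variables back after each mini-batch update; a sketch (the names `w_hist` and `b_hist` are hypothetical, introduced here for plotting):

```python
    # Sketch: log the parameter trajectory per mini-batch for plotting
    w_hist, b_hist = [], []
    while (epoch < n_epochs):
        epoch += 1
        for mini_batch_index in range(n_train_batches):
            l_rate = lrate_base * lrate_coef / (epoch + lrate_coef)
            cost_j, pred = train_model(mini_batch_index, l_rate)
            w_hist.append(w.get_value()[0])     # first weight component
            b_hist.append(float(b.get_value()))
```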

Increasing the explanatory variables (features) of the logistic regression

Since the "Adult" dataset has many explanatory variables (features), I decided to increase the number of features used in the calculation. In the code, only the input processing of the training data x changes. The features used are as follows:

- Years of education (educational background; 12 years if educated through high school in Japan) (used in the first regression model)
- Role in the household (husband, wife, child, single, etc.) (added in this regression model)
- Working hours per week (added in this regression model)
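Since "Role in the household" (the relationship attribute) is categorical, it must be converted to a number before entering the regression. A sketch of one way to build the 3-feature input, reusing the DataFrame `df` from the loading sketch above (the integer coding `rel_code` is my own assumption, not necessarily the best choice):

```python
# Sketch: build the 3-feature matrix (rel_code is a hypothetical integer coding)
rel_code = {'Husband': 0, 'Wife': 1, 'Own-child': 2, 'Unmarried': 3,
            'Not-in-family': 4, 'Other-relative': 5}
xmat = np.column_stack([
    df['education_num'].values,
    df['relationship'].map(rel_code).values,
    df['hours_per_week'].values,
]).astype(np.float64)
```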

I expected the classification accuracy to improve to some extent, but unfortunately it did not improve over the first regression model. (The purpose this time is the implementation itself, so I did not dig into the data-analysis results.)

The results of the benchmark regarding the calculation time are as follows.

**Comparison of calculation time**

| Optimization method | Model features | CPU / GPU | Epochs (iterations) | Mini-batch size | Time [s] |
|:---|---:|:---:|---:|---:|---:|
| Batch Gradient Descent | 1 | CPU | 10,000 | - | 76.75 |
| Batch Gradient Descent | 1 | GPU | 10,000 | - | 91.14 |
| Stochastic Gradient Descent | 1 | CPU | 20 | 10 | 1.76 |
| Stochastic Gradient Descent | 1 | GPU | 20 | 10 | 23.87 |
| Stochastic Gradient Descent | 3 | CPU | 20 | 10 | 4.51 |
| Stochastic Gradient Descent | 3 | GPU | 20 | 10 | 88.38 |

- Note: no convergence test is performed in any run; each computes a fixed number of loops.
- Note: Batch Gradient Descent required about 10,000 iterations to reach a converged solution (learning rate = 0.1).

Leaving aside the GPU results, comparing Batch G.D. against SGD shows a large saving in computation time; the high computational efficiency of SGD was confirmed.

Effect of Mini-Batch size on GPU calculation

Now for the problem of the inefficient GPU computation. Since the timer is started before training and read after it completes, there is no doubt that the training part is the cause. The usual suspect would be CPU computation (especially numpy processing) mixed into the GPU part. With this in mind I went through the code in detail, but could not find the cause. (The code follows the model of the Theano Tutorial, Deep Learning 0.1 documentation, so a simple mistake seems unlikely.)

I then thought of the overhead of calling a Theano function, so I varied (increased) the mini-batch size and measured the computation time.
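The measurement itself can be a simple sweep over candidate sizes, recompiling the training function for each (a sketch; everything else is as in the training code above):

```python
    # Sketch: sweep the mini-batch size and time the training loop
    for batch_size in [10, 20, 50, 100, 200, 500, 1000, 2000, 5000, 10000]:
        # ... recompile train_model with this batch_size (same code as above) ...
        n_train_batches = int(ytr.get_value().shape[0] / batch_size)
        t0 = timeit.default_timer()
        for epoch in range(1, n_epochs + 1):
            l_rate = lrate_base * lrate_coef / (epoch + lrate_coef)
            for i in range(n_train_batches):
                train_model(i, l_rate)
        print 'batch_size =%6d : %8.3f [s]' % (batch_size,
                                               timeit.default_timer() - t0)
```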

**Fig. Training time vs. mini-batch size** (bm_plot1.png, bm_plot3.png)

The horizontal axis is the mini-batch size, the vertical axis the training time; the left plot uses a linear-linear scale, the right a log-linear scale. One must account for the number of loop iterations decreasing in inverse proportion to the mini-batch size, but the result above shows the computation time falling roughly "exponentially", suggesting that the cost of the training-function calls dominates.

In the Coursera lecture, it was explained that the mini-batch size should be chosen with the processor's parallel (vectorized) computation in mind, with roughly 2 to 100 being practical. There was also a suggestion that, for classification, it is reasonable to match the number of target classes (2 for this binary classification, 10 for MNIST handwritten-digit classification). This time's results, however, indicate that for GPU computation it is better to make the mini-batch size reasonably large.

Since this is logistic regression, the amount of computation per batch is considerably smaller than for a neural network. I would like to investigate the effect of the mini-batch size on a somewhat larger computation, such as a neural network, at a later date. Also, since the function-call overhead comes from data transfer between memories, the situation may differ depending on the hardware. (Perhaps a laptop PC is simply not up to it?)

(The programming environment for this article: Python 2.7.8, Theano 0.7.0, CUDA Driver/Runtime 7.5/7.0.)

References (web site)

- Coursera, Machine Learning (especially Week 10)

(Addendum) Benchmark focusing on Mini-Batch size in Theano GPU calculation

In the article above I wrote that mini-batch stochastic gradient descent appears to be affected by the mini-batch size. Having received a comment about this, I expanded the test conditions and ran the benchmark again.

Benchmark problem - Adult Dataset

As in the article above, I selected "Adult" from the UCI Machine Learning Repository. The "Adult" task is to classify whether a US resident's annual income is at most US$50k or above US$50k, based on attributes such as family structure and educational background. This time I used two classification codes.

  1. Classification by logistic regression. A regression model is created by selecting 3 of the 14 features included in the "Adult" dataset. The data in the file 'adult.data' is split 70%/30% into train and test data, respectively. (Last time I did not notice that the dataset includes a separate test file, 'adult.test', hence this split.)

  2. Classification by a Multi-Layer Perceptron (MLP) model. 11 of the 14 features in the "Adult" dataset are selected and fed into the MLP network. The MLP consists of hidden layer 1 (22 units) + hidden layer 2 (20 units) + output layer (1 unit). The file 'adult.data' was used as train data and 'adult.test' as test data. The number of instances is 32,561 for train and 16,281 for test.

The optimizer is stochastic gradient descent, adjusting the parameters while feeding the data in mini-batches.
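The MLP code is not listed in full here, but a minimal sketch matching the 11-22-20-1 composition above might look like this (the helper `layer`, the tanh activations, and the uniform initialization are my assumptions; `one_update`, `accur`, and `y_` are the symbols referenced in the compile step below):

```python
import numpy as np
import theano
import theano.tensor as T

rng = np.random.RandomState(0)

def layer(inp, n_in, n_out, activation):
    # one fully-connected layer with shared weight/bias parameters
    w = theano.shared(np.asarray(rng.uniform(-0.1, 0.1, (n_in, n_out)),
                                 dtype=theano.config.floatX))
    b = theano.shared(np.zeros(n_out, dtype=theano.config.floatX))
    return activation(T.dot(inp, w) + b), [w, b]

x = T.matrix('x')      # 11 input features
y_ = T.vector('y_')    # binary labels

h1, p1 = layer(x, 11, 22, T.tanh)            # hidden layer 1 (22 units)
h2, p2 = layer(h1, 22, 20, T.tanh)           # hidden layer 2 (20 units)
out, p3 = layer(h2, 20, 1, T.nnet.sigmoid)   # output layer (1 unit)
p_1 = out.flatten()

cost = T.nnet.binary_crossentropy(p_1, y_).mean()
accur = T.mean(T.eq(T.gt(p_1, 0.5), y_))     # classification accuracy

params = p1 + p2 + p3
grads = T.grad(cost, params)
learning_rate = 0.05                         # arbitrary value for the sketch
one_update = [(p, p - learning_rate * g) for p, g in zip(params, grads)]
```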

Calculation process

The training data is treated as one set, divided into pieces of the specified mini-batch size, and fed to the classifier. One pass through the whole set is called an epoch; a fixed number of epochs (here, epoch = 50) is computed, without any convergence test. Below is the code for that part.

```python
    #############################################
    batch_size = 100
    #############################################

    # Compile
    train_model = theano.function(
        inputs=[index],
        outputs=[cost, accur],
        updates=one_update,
        givens=[(x, trXs[index * batch_size:(index + 1) * batch_size]),
                (y_, trYs[index * batch_size:(index + 1) * batch_size])],
        allow_input_downcast=True
    )
    accuracy = theano.function(
        inputs=[],
        outputs=accur,
        givens=[(x, teXs), (y_, teYs)],
        allow_input_downcast=True
    )

    # Train (Optimization)
    start_time = timeit.default_timer()

    n_epochs = 50
    epoch = 0

    n_train_batches = int(trY.shape[0] / batch_size)

    while (epoch < n_epochs):
        epoch += 1
        for mini_batch_index in range(n_train_batches):
            # use a fresh name so the symbolic variable `accur` above
            # is not shadowed by the mini-batch result
            cost_j, accur_j = train_model(mini_batch_index)

        print('epoch[%3d] : cost =%8.4f' % (epoch, cost_j))

    elapsed_time = timeit.default_timer() - start_time
    print('Elapsed time: %10.3f [s]' % elapsed_time)

    last_accur = accuracy()
    print('Accuracy = %10.3f ' % last_accur)
```

One caveat concerns the number of mini-batches per epoch.

```python
    # number of mini-batches = number of train instances / mini-batch size
    n_train_batches = int(trY.shape[0] / batch_size)
```

Any surplus is truncated, and the effect grows with the mini-batch size. For example, with 32,561 instances and a mini-batch size of 10,000, only 30,000 instances are referenced and 2,561 are skipped.
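If one wanted to feed the surplus as well, ceiling division would be enough; the `givens` slice should simply become shorter on the last index, since the slicing follows NumPy semantics and clips the end bound (a sketch, not used in the measurements here):

```python
    # Sketch: ceiling division so a final, smaller mini-batch covers the surplus
    n_train_batches = (trY.shape[0] + batch_size - 1) // batch_size
```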

Benchmark test results

The computer environment is as follows.

  1. Laptop PC (with GPU), OS: Windows 10, Python 2.7.11, Theano 0.7.0
  2. Desktop PC (with GPU), OS: Linux, Ubuntu 14.04LTS, Python 2.7.11, Theano 0.7.0

**Test results (raw data)** (unit: seconds [s])

| batch_siz | Laptop_LR_fastc | Laptop_MLP_fastc | Laptop_LR_fastr | Laptop_MLP_fastr | Desktop_LR_fastr | Desktop_MLP_fastr |
|---:|---:|---:|---:|---:|---:|---:|
| 10 | 113.3 | 1546.6 | 108.8 | 362.7 | 15.3 | 57.4 |
| 20 | 56.9 | 758.6 | 55.5 | 176.1 | 8.0 | 28.5 |
| 50 | 22.6 | 321.6 | 22.2 | 91.4 | 3.2 | 16.6 |
| 100 | 11.6 | 159.8 | 11.5 | 47.0 | 3.1 | 8.6 |
| 200 | 6.2 | 77.0 | 5.9 | 23.8 | 1.6 | 4.5 |
| 500 | 4.4 | 30.6 | 4.3 | 7.9 | 1.0 | 1.8 |
| 1000 | 2.2 | 15.4 | 2.3 | 4.6 | 0.5 | 1.2 |
| 2000 | 1.2 | 9.3 | 1.3 | 3.5 | 0.3 | 0.9 |
| 5000 | 0.4 | 4.6 | 0.5 | 1.9 | 0.2 | 0.6 |
| 10000 | 0.3 | 4.0 | 0.4 | 1.6 | 0.1 | 0.5 |

Description of each column:

- batch_siz: mini-batch size
- Laptop_LR_fastc: logistic regression on the laptop PC, theano.config.mode = fast_compile
- Laptop_MLP_fastc: MLP classification on the laptop PC, theano.config.mode = fast_compile
- Laptop_LR_fastr: logistic regression on the laptop PC, theano.config.mode = fast_run
- Laptop_MLP_fastr: MLP classification on the laptop PC, theano.config.mode = fast_run
- Desktop_LR_fastr: logistic regression on the desktop PC, theano.config.mode = fast_run
- Desktop_MLP_fastr: MLP classification on the desktop PC, theano.config.mode = fast_run

'theano.config.mode' selects the optimization level: 'fast_run' applies full optimization for maximum execution speed, while 'fast_compile' applies only some optimizations (shortening compilation time).
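For reference, the mode can be switched per run through the THEANO_FLAGS environment variable, which Theano reads at import time (here 'bench.py' is just a placeholder name for the benchmark script):

```python
# Run from the shell with, e.g.:
#   THEANO_FLAGS='mode=FAST_RUN,device=gpu,floatX=float32' python bench.py
#   THEANO_FLAGS='mode=FAST_COMPILE,device=gpu,floatX=float32' python bench.py
import theano
print(theano.config.mode)    # confirm which mode is active
```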

Next, we will look at the details while referring to the plot.

**Fig. Logistic Regression vs. MLP model** (Laptop_LR_fastr vs. Laptop_MLP_fastr) (benchmark_01.png)

The horizontal axis is the mini-batch size, the vertical axis the time spent in the training part. First, the comparison across classifiers: logistic regression vs. the MLP model. As expected, the larger amount of computation in the MLP model makes it about 3 to 4 times slower. The influence of the mini-batch size is similar for both: computation time falls as the mini-batch size grows.

**Fig. Theano mode FAST_COMPILE vs. FAST_RUN (Logistic Regression)** (benchmark_02.png)

This compares mode FAST_COMPILE, which applies few CUDA-related optimizations, with mode FAST_RUN, which applies more. As the figure shows, for logistic regression there is not much difference between the two.

**Fig. Theano mode FAST_COMPILE vs. FAST_RUN (MLP classification)** (benchmark_03.png)

For the MLP classification, on the other hand, with its larger amount of computation, the extra optimization of FAST_RUN pays off and reduces the computation time.

**Fig. Laptop PC vs. Desktop PC (MLP classification)** (benchmark_04.png)

This is presumably just a difference in hardware performance. (Theano mode is FAST_RUN in both cases. I have not looked into the effect of the OS difference in detail, but it should be small.)

Discussion

As described above, under all conditions the training time decreases significantly as the mini-batch size increases. The likely cause is that, when the mini-batches are cut small, the number of calls to the function train_model() in the training loop increases, and the overhead of each call becomes significant. (A profiler would be needed to investigate in more detail. I had a quick look with the standard profiler cProfile and could identify the time-consuming parts inside the Theano code, but gave up on the details for lack of skill.)
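For the record, a quick check with the standard profiler can be set up along these lines (a sketch using only the standard library; it wraps one epoch of the training loop):

```python
import cProfile
import pstats

# Sketch: profile one epoch of the training loop with the standard profiler
pr = cProfile.Profile()
pr.enable()
for mini_batch_index in range(n_train_batches):
    train_model(mini_batch_index)
pr.disable()
pstats.Stats(pr).sort_stats('cumulative').print_stats(20)  # top 20 by cum. time
```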

In this test, the outline of the computation and the amount of data fed are the same across conditions (apart from the data "truncation" issue during feeding noted above). In real training, the point is to raise the classifier's accuracy as quickly as possible, so strategies that adjust the computation parameters (learning rate, optimizer parameters, etc.) per mini-batch are common. For efficient learning, it is important to set the mini-batch size appropriately, taking into account both such strategies and the GPU function-call overhead observed here.
