This article explains how to evaluate a model trained with Pylearn2 against test data.

Read test data

Test data can be loaded either from the pylearn2.datasets module or from a pkl file. Once loaded, the input values are available as data.X and the target values as data.y.

When reading the MNIST dataset from the pylearn2.datasets module:

from pylearn2.datasets import mnist

data = mnist.MNIST(which_set="test")

When reading a pkl file:

import pickle

data = pickle.load(open("path/to/test_dataset.pkl", "rb"))
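
Pickle files are binary, so it is generally safer to open them in binary mode and to close the file handle once loading finishes. The following is a minimal sketch of that pattern using only the standard library. The helper name load_pickle, the temporary file, and the plain dict used as the payload are all stand-ins invented for illustration; a real Pylearn2 dataset object exposes its values as attributes (data.X, data.y) rather than dict keys.

```python
import os
import pickle
import tempfile


def load_pickle(path):
    """Load a pickled object, opening the file in binary mode."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Hypothetical stand-in payload: a dict holding inputs and targets.
payload = {"X": [[0.0, 1.0], [1.0, 0.0]], "y": [1, 0]}

# Write the payload to a temporary file, then read it back.
path = os.path.join(tempfile.mkdtemp(), "test_dataset.pkl")
with open(path, "wb") as f:
    pickle.dump(payload, f)

data = load_pickle(path)
print(data["X"])
print(data["y"])
```

The with statement closes the file even if unpickling raises an error, which the one-line open(...) form does not guarantee.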

Load the model

The trained model is saved as a pkl file, so it can be loaded with pickle.

import pickle

model = pickle.load(open("path/to/model.pkl", "rb"))

Calculate the predicted value

Next, define a function that computes the predicted values. Compile a Theano function from the model's fprop method and pass the input values to it. Because the space expected by fprop may differ from the space of the dataset inputs, convert between them with Space.format_as. The compiled function can compute predictions for many inputs in one call, but predicting the whole dataset at once may exhaust memory, so the function below processes the inputs in batches. Pass the dataset's X as inputs and the loaded model as model.

import theano
from pylearn2.space import VectorSpace

...

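def simulate(inputs, model):
    """Return the model's output for every row of inputs, computed in batches."""
    # Describe the dataset inputs as flat vectors, one row per example.
    space = VectorSpace(inputs.shape[1])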
    X = space.make_theano_batch()
    Y = model.fprop(space.format_as(X, model.get_input_space()))
    f = theano.function([X], Y)
    result = []
    batch_size = 100
    # Predict batch_size rows at a time to keep memory usage bounded.
    for start in range(0, len(inputs), batch_size):
        result.extend(f(inputs[start:start + batch_size]))
    return result
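
The batching loop can also be studied in isolation, without Theano or Pylearn2. The sketch below is an illustration only: batch_bounds, predict_in_batches, and double are hypothetical names, and double merely plays the role of the compiled prediction function.

```python
def batch_bounds(n_examples, batch_size):
    """Yield (start, stop) index pairs that cover range(n_examples)."""
    for start in range(0, n_examples, batch_size):
        # The final batch is shorter when batch_size does not divide n_examples.
        yield start, min(start + batch_size, n_examples)


def predict_in_batches(predict_fn, inputs, batch_size=100):
    """Apply predict_fn to slices of inputs and collect every output."""
    result = []
    for start, stop in batch_bounds(len(inputs), batch_size):
        result.extend(predict_fn(inputs[start:stop]))
    return result


def double(batch):
    """Stand-in for a compiled prediction function: doubles each value."""
    return [2 * value for value in batch]


bounds = list(batch_bounds(250, 100))
outputs = predict_in_batches(double, list(range(250)), batch_size=100)
print(bounds)
print(len(outputs))
```

A smaller batch size lowers peak memory use at the cost of more function calls, so the value of 100 is a starting point rather than a fixed rule.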

Aggregate the results

For a classification task, the results can be tallied as follows. Pass the return value of simulate as outputs and the dataset's y as labels. The function uses numpy's argmax to find the index of the largest output for each example and compares it with the corresponding label.

import numpy as np

...

def count_correct(outputs, labels):
    correct = 0
    for output, label in zip(outputs, labels):
        if np.argmax(output) == label:
            correct += 1
    return correct
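
When all outputs fit in memory as a single array, the same count can be computed without a Python loop. The following is a sketch under two assumptions: the outputs can be stacked into a 2-D array with one row per example, and the labels are integer class indices (possibly stored as a column, which is why ravel is applied). The name count_correct_vectorized and the sample values are hypothetical.

```python
import numpy as np


def count_correct_vectorized(outputs, labels):
    """Count rows whose highest-scoring class index equals the label."""
    predicted = np.argmax(np.asarray(outputs), axis=1)
    # ravel() flattens labels stored as an (n, 1) column into shape (n,).
    expected = np.asarray(labels).ravel()
    return int(np.sum(predicted == expected))


# Made-up scores for three examples and three classes.
outputs = [[0.1, 0.7, 0.2],
           [0.8, 0.1, 0.1],
           [0.3, 0.3, 0.4]]
labels = np.array([[1], [2], [2]])

correct = count_correct_vectorized(outputs, labels)
print("{} / {}".format(correct, len(outputs)))
```

If the labels are one-hot encoded instead, this comparison does not apply directly; they would first need converting to class indices.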

Example of use

The following example uses the MNIST test set.

import pickle
from pylearn2.datasets import mnist

model = pickle.load(open("path/to/model.pkl", "rb"))
data = mnist.MNIST(which_set="test")

predicts = simulate(data.X, model)
correct = count_correct(predicts, data.y)
print("{} / {}".format(correct, len(data.X)))

References

http://fastml.com/how-to-get-predictions-from-pylearn2/