Introduction

Python is a programming language often used in machine learning research. It can be hard to use well, though, especially in basic research, because code with many for statements runs slowly. When you try your best to avoid writing for statements, you run into various difficulties. Here I would like to introduce how to use np.newaxis, which gave me trouble. You will often need it when you want to experiment with a range of parameters.

This article assumes prior knowledge of slicing. If needed, please review it beforehand:

[Python] Numpy reference, extraction, combination
Indexing - docs.scipy.org

Main part

1. First, let's understand the data to be used

To begin with, it is a good idea to sort out what data you have and what goal you want to reach. Keep in mind what the data represents, what the shape of each array is, and how many dimensions it has.

For the sake of simplicity, consider the following situation.

`init.py`


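import numpy as np
N, K = 3, 4           # example sizes (hypothetical; any positive integers work)
a = np.zeros(N)       # ndarray of shape (N,)
b = np.zeros((N, K))  # ndarray of shape (N, K)
assert a.shape == (N,)
assert b.shape == (N, K)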

`a` and `b` are both vectors, and I want to compute `a + b`.

a+b_ex.png

Since `b` holds K samples, it is actually stored as a two-dimensional array of shape (N, K).

a+b_but.PNG

Written with a for statement, this becomes:

`naive.py`


c = np.zeros((N, K))  # the shape must be passed as a single tuple
for k in range(K):
    c[:, k] = a + b[:, k]

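However, I want to avoid writing for statements as much as possible. You might be tempted to simply write `a + b`, but broadcasting aligns shapes starting from the last axis, so with shapes (N,) and (N, K) this generally raises an error or silently adds along the wrong axis (for example when N == K). Either way, that approach cannot handle more complicated problems.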
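It is worth checking what plain `a + b` really does with these shapes. The following is a minimal sketch, assuming small hypothetical sizes (N = 3, K = 4) and made-up values; the variable names other than `a` and `b` are only for illustration.

```python
import numpy as np

N, K = 3, 4  # hypothetical sizes, chosen so that N != K
a = np.arange(N, dtype=float)  # shape (N,)
b = np.ones((N, K))            # shape (N, K)

# Broadcasting compares shapes from the last axis, so N is compared with K.
try:
    a + b
    plain_add_failed = False
except ValueError:
    plain_add_failed = True

# When N == K the shapes happen to be compatible, but a is added as a row.
sq = np.zeros((N, N))
wrong = a + sq                 # wrong[i, j] == a[j]
right = a[:, np.newaxis] + sq  # right[i, j] == a[i]
```

With N != K the plain addition raises a ValueError, and with N == K it runs but adds `a` along the other axis, so the shape check alone would not catch the mistake.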

2. Think about your goals.

The goal this time is to compute `a + b` for each of the K samples in `b` and get a result for each sample. Let's pin down the shape of the final result (the target array).

`goal.py`


c = np.zeros((N, K))  # the shape must be passed as a single tuple
hogehoge()  # placeholder for the computation that fills c
assert c.shape == (N, K)

That is the target.

3. Insert np.newaxis

When I picture `a + b` in my head, it normally looks like this:

a+b.PNG

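But since `b` is a two-dimensional array holding multiple samples, it is not obvious how the addition should be carried out.

NumPy does try to work this out automatically through broadcasting, but it matches shapes starting from the last axis, so for (N,) and (N, K) it does not line `a` up with the N axis of `b`. It is better not to rely on that guess.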

a+b_but.PNG

So let's tell NumPy explicitly how to do the calculation. That is what np.newaxis is for. (np.newaxis is actually just None, but let's use np.newaxis without worrying about that.)

When an operation that requires arrays of the same shape comes up (here, +), we tell NumPy to make the shapes match. Since `a.shape == (N,)`, the first step is to get at least `a.shape == (N, 1)`. To do this, write `a[:, np.newaxis]`. NumPy then looks at the other operand of the operator, stretches (N, 1) to (N, K) automatically, and performs the calculation.
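The shape changes described here can be inspected directly. This is a small sketch with hypothetical sizes (N = 3, K = 4); `np.broadcast_to` is used only to make the stretching visible, since the + operator does the same thing internally without you calling it.

```python
import numpy as np

N, K = 3, 4  # hypothetical sizes
a = np.arange(N)      # shape (N,)
b = np.zeros((N, K))  # shape (N, K)

a_col = a[:, np.newaxis]  # same as a[:, None]; shape becomes (N, 1)
# Read-only view showing how (N, 1) is stretched to match b's shape (N, K)
stretched = np.broadcast_to(a_col, b.shape)

print(np.newaxis is None)            # True
print(a_col.shape, stretched.shape)  # (3, 1) (3, 4)
```

Every column of `stretched` is a copy of `a`, which is exactly the "one `a` per sample" picture the addition needs.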

a+b=c.PNG

`add_samples.py`


import numpy as np
assert a.shape == (N,)  # ndarray
assert b.shape == (N, K)  # ndarray

# a has no second axis yet: a[:, (missing here)]
c = a[:, np.newaxis] + b[:, :]
assert c.shape == (N, K)  # doubles as an error check and a note on the expected shape
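To confirm that the np.newaxis version gives the same result as the for statement, the two can be compared on random data. This is a sketch under assumed sizes (N = 5, K = 3) with an arbitrary seed; `c_loop` and `c_vec` are names introduced here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
N, K = 5, 3                     # hypothetical sizes
a = rng.normal(size=N)          # shape (N,)
b = rng.normal(size=(N, K))     # shape (N, K)

# Reference result: one column (sample) at a time
c_loop = np.zeros((N, K))
for k in range(K):
    c_loop[:, k] = a + b[:, k]

# Vectorized result: a is given a length-1 second axis and broadcast over K
c_vec = a[:, np.newaxis] + b

assert c_vec.shape == (N, K)
assert np.allclose(c_loop, c_vec)
```

Keeping a loop version around as a reference like this is a cheap way to test a vectorized rewrite before deleting the loop.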

Summary

In this way, calmly understanding what NumPy does automatically makes it easier to deal with complex problems.

I will add more difficult examples later.
If any explanation is hard to follow, please let me know. Questions are also welcome in the comments.

Python / Numpy np.newaxis thinking tips