Comparison of matrix transpose speeds with Python

1. Background

I wanted to obtain a transposed matrix (a 2D array with rows and columns swapped) from a 2D array built on the Python standard list type. **Because my code was so slow**, I wondered which transposition method is the fastest, so I compared them.

Reference: Transpose matrix (Wikipedia)

I am a beginner with less than half a year of programming experience, so I would appreciate it if you could point out any mistakes or possible improvements. To jump straight to the results, see section 5, Speed comparison result.

2. Environment

| Name | Version |
|---|---|
| Python | 3.7.4 |
| Jupyter Notebook | 6.0.1 |
| NumPy | 1.16.5 |
| Pandas | 0.25.1 |

3. Conditions

- Input is a two-dimensional array of the Python standard list type
- Output is the Python standard list type, numpy.array, or pandas.DataFrame

3-1. Transpose matrix

The matrix to be transposed is named `in_matrix`; for this comparison a 100x100 matrix is generated.

Transpose matrix


# Create the matrix to be transposed, in_matrix
n = 100  # number of columns
m = 100  # number of rows
in_matrix = [[i for i in range(n)] for j in range(m)]
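To make the shape of this input easier to picture, here is the same construction at a much smaller size. The sizes (2 rows, 3 columns) and the names `small_matrix` and `expected` are chosen only for this illustration and do not appear in the measurements.

```python
# Same construction as in_matrix, but small enough to inspect by eye.
n = 3  # number of columns
m = 2  # number of rows
small_matrix = [[i for i in range(n)] for j in range(m)]
print(small_matrix)  # [[0, 1, 2], [0, 1, 2]]

# Every row is identical, so each row of the transpose holds a single repeated value.
expected = [[0, 0], [1, 1], [2, 2]]
print([list(col) for col in zip(*small_matrix)] == expected)  # True
```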

3-2. Measurement of processing time

Processing time is measured with the %timeit magic command of Jupyter Notebook (IPython). Reference: Built-in magic commands, %timeit
Each measurement uses 10000 loops (-n 10000) and 10 repetitions (-r 10).

Measurement of processing time


# Measure processing time (turn_matrix stands for the function under test)
%timeit -n 10000 -r 10 turn_matrix(in_matrix)
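Outside Jupyter, a roughly comparable measurement can be made with the standard `timeit` module. The following is a minimal sketch, not the procedure used for the figures in this article: `turn_matrix` is a stand-in for whichever function is being measured, and the loop count is reduced from 10000 to 200 (an assumption made only to keep the run short).

```python
import statistics
import timeit

n = 100
m = 100
in_matrix = [[i for i in range(n)] for j in range(m)]

def turn_matrix(matrix):
    # Stand-in for whichever transpose function is being measured.
    return [list(x) for x in zip(*matrix)]

loops = 200    # plays the role of -n (the article uses 10000)
repeats = 10   # plays the role of -r

# timeit.repeat returns one total time (in seconds) per repetition.
times = timeit.repeat(lambda: turn_matrix(in_matrix), repeat=repeats, number=loops)
per_loop_us = [t / loops * 1e6 for t in times]

print(f"{statistics.mean(per_loop_us):.1f} µs ± {statistics.pstdev(per_loop_us):.2f} µs per loop "
      f"(mean ± std. dev. of {repeats} runs, {loops} loops each)")
```

The absolute numbers will differ by machine and Python version, so only the relative ordering of methods should be compared.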

4. Considered transposition method

- Transpose by double loop: turn_matrix1() (section 6-1), turn_matrix2() (section 6-2)
- Transpose by the built-in function zip(): turn_matrix3() (section 6-3), turn_matrix4() (section 6-4)
- Transpose by NumPy: turn_matrix5() (section 6-5)
- Transpose by Pandas: turn_matrix6() (section 6-6)

Of the above, the transpositions using the built-in function zip(), NumPy, and Pandas are based on the following article: Swap rows and columns of Python list type 2D array (transpose) | note.nkmk.me

5. Speed comparison result

Comparing the transposition step alone, transpose by NumPy (section 6-5) is **overwhelmingly the fastest (1.67 µs)**, followed by transpose by Pandas (section 6-6) (86.3 µs) and the built-in function zip() + list comprehension (section 6-4) (99.4 µs).

| No. | def | Description | Transpose speed |
|---|---|---|---|
| 1 | turn_matrix1() | Double loop 1 | 777 µs ± 43.6 µs |
| 2 | turn_matrix2() | Double loop 2 | 654 µs ± 83 µs |
| 3 | turn_matrix3() | Function zip() | 105 µs ± 4.22 µs |
| 4 | turn_matrix4() | Function zip() + list comprehension | 99.4 µs ± 1.36 µs |
| 5 | turn_matrix5() | NumPy | 1.67 µs ± 38.9 ns |
| 6 | turn_matrix6() | Pandas | 86.3 µs ± 4.54 µs |

sp.png

However, transpose by NumPy (section 6-5) and transpose by Pandas (section 6-6) first require converting the list: conversion to numpy.array takes **486 µs**, and conversion to pandas.DataFrame takes **6.19 ms**. Including the conversion from the list type, the totals are **487.67 µs** for NumPy and **6.2763 ms** for Pandas. Therefore, when the given matrix is a list, the fastest way to obtain the transposed matrix from it was judged to be **the built-in function zip() + list comprehension (section 6-4) (99.4 µs)**.

| No. | def | Description | Transpose speed | Conversion to np.array or pd.DataFrame |
|---|---|---|---|---|
| 1 | turn_matrix1() | Double loop 1 | 777 µs ± 43.6 µs | |
| 2 | turn_matrix2() | Double loop 2 | 654 µs ± 83 µs | |
| 3 | turn_matrix3() | Function zip() | 105 µs ± 4.22 µs | |
| 4 | turn_matrix4() | Function zip() + list comprehension | 99.4 µs ± 1.36 µs | |
| 5 | turn_matrix5() | NumPy | 1.67 µs ± 38.9 ns | 486 µs ± 10.1 µs |
| 6 | turn_matrix6() | Pandas | 86.3 µs ± 4.54 µs | 6.19 ms ± 43.1 µs |
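The totals quoted above are simply the sum of the mean conversion time and the mean transpose time. The arithmetic can be reproduced as follows; the dictionary names and labels are made up for this sketch, and the numbers are the means reported in the tables, converted to microseconds.

```python
# Mean transpose times reported in this article, in microseconds.
transpose_us = {
    "turn_matrix1 (double loop 1)": 777.0,
    "turn_matrix2 (double loop 2)": 654.0,
    "turn_matrix3 (zip)": 105.0,
    "turn_matrix4 (zip + list comprehension)": 99.4,
    "turn_matrix5 (NumPy)": 1.67,
    "turn_matrix6 (Pandas)": 86.3,
}

# Mean conversion times from list, in microseconds (6.19 ms = 6190 µs).
conversion_us = {
    "turn_matrix5 (NumPy)": 486.0,
    "turn_matrix6 (Pandas)": 6190.0,
}

# Total = conversion (if any) + transpose.
total_us = {name: t + conversion_us.get(name, 0.0) for name, t in transpose_us.items()}

for name, t in sorted(total_us.items(), key=lambda kv: kv[1]):
    print(f"{name}: {t:.2f} µs")

fastest = min(total_us, key=total_us.get)
print("fastest overall:", fastest)
```

Summing means ignores the spread of each measurement, so this is only a rough ranking.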

sp2.png

6. Transpose function

6-1. Transpose by double loop 1

Function name: turn_matrix1()
A double-loop transpose written without consulting any reference. The number of rows x and the number of columns y of the argument matrix are passed to the for loops, and elements are extracted from the argument matrix so that its row count becomes the column count of the result and its column count becomes the row count.

Transpose by double loop 1


# Create the matrix to be transposed, in_matrix
n = 100
m = 100
in_matrix = [[i for i in range(n)] for j in range(m)]

# Transpose function turn_matrix1()
def turn_matrix1(matrix):
    x = len(matrix)     # number of rows
    y = len(matrix[0])  # number of columns
    turned = []
    for i in range(y):
        tmp = []
        for j in range(x):
            tmp.append(matrix[j][i])
        turned.append(tmp)
    return turned

# Measure processing time
%timeit -r 10 -n 10000 turn_matrix1(in_matrix)

Execution result 777 µs ± 43.6 µs per loop (mean ± std. dev. of 10 runs, 10000 loops each)

6-2. Transpose by double loop 2

Function name: turn_matrix2()
In 6-1. Transpose by double loop 1, both the number of rows x and the number of columns y of the argument matrix were obtained. Here the inner for loop instead iterates over the rows of the argument matrix directly, collects the value at the same column index from each row into tmp, and appends tmp to turned.

Transpose by double loop 2


# Create the matrix to be transposed, in_matrix
n = 100
m = 100
in_matrix = [[i for i in range(n)] for j in range(m)]

# Transpose function turn_matrix2()
def turn_matrix2(matrix):
    y = len(matrix[0])  # number of columns
    turned = []
    for i in range(y):
        tmp = []
        for j in matrix:  # j is one row of the matrix
            tmp.append(j[i])
        turned.append(tmp)
    return turned

# Measure processing time
%timeit -r 10 -n 10000 turn_matrix2(in_matrix)

Execution result 654 µs ± 83 µs per loop (mean ± std. dev. of 10 runs, 10000 loops each)

6-3. Transpose by built-in function zip ()

Function name: turn_matrix3 ()

Transpose by built-in function zip()


# Create the matrix to be transposed, in_matrix
n = 100
m = 100
in_matrix = [[i for i in range(n)] for j in range(m)]

# Transpose function turn_matrix3()
def turn_matrix3(matrix):
    turned = []
    for i in zip(*matrix):  # i is one column of the matrix, as a tuple
        turned.append(list(i))
    return turned

# Measure processing time
%timeit -r 10 -n 10000 turn_matrix3(in_matrix)

Execution result 105 µs ± 4.22 µs per loop (mean ± std. dev. of 10 runs, 10000 loops each)
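For readers unfamiliar with the idiom, the following small example shows what `zip(*matrix)` does: the `*` unpacks the rows as separate arguments, and `zip` then groups the elements at the same position, which yields the columns as tuples. The 2x3 `matrix` here is only an illustration.

```python
matrix = [[1, 2, 3], [4, 5, 6]]

# zip(*matrix) is the same call as zip([1, 2, 3], [4, 5, 6]).
columns = list(zip(*matrix))
print(columns)  # [(1, 4), (2, 5), (3, 6)]

# Each column is a tuple, so list() is applied to get a list of lists.
turned = [list(col) for col in columns]
print(turned)  # [[1, 4], [2, 5], [3, 6]]
```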

6-4. Transpose by built-in function zip() + list comprehension

Function name: turn_matrix4 ()

Transpose by built-in function zip() + list comprehension


# Create the matrix to be transposed, in_matrix
n = 100
m = 100
in_matrix = [[i for i in range(n)] for j in range(m)]

# Transpose function turn_matrix4()
def turn_matrix4(matrix):
    return [list(x) for x in zip(*matrix)]

# Measure processing time
%timeit -r 10 -n 10000 turn_matrix4(in_matrix)

Execution result 99.4 µs ± 1.36 µs per loop (mean ± std. dev. of 10 runs, 10000 loops each)

List comprehension


turned_matrix = [list(x) for x in zip(*in_matrix)] 

Swap rows and columns of Python list type 2D array (transpose) | note.nkmk.me
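Before comparing speeds, it is worth confirming that the loop-based and zip-based approaches return the same result, including for a non-square matrix. The sketch below uses hypothetical names (`transpose_loop`, `transpose_zip`, `sample`) and a tiny input rather than the article's 100x100 matrix.

```python
def transpose_loop(matrix):
    # Double loop over the column index, then the rows (same idea as turn_matrix2).
    turned = []
    for i in range(len(matrix[0])):
        tmp = []
        for row in matrix:
            tmp.append(row[i])
        turned.append(tmp)
    return turned

def transpose_zip(matrix):
    # zip() + list comprehension (same idea as turn_matrix4).
    return [list(x) for x in zip(*matrix)]

sample = [[1, 2, 3], [4, 5, 6]]  # 2 rows x 3 columns
assert transpose_loop(sample) == transpose_zip(sample) == [[1, 4], [2, 5], [3, 6]]
print(transpose_zip(sample))  # [[1, 4], [2, 5], [3, 6]]
```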

6-5. Transpose by NumPy

Function name: turn_matrix5 ()

Transpose by NumPy


import numpy as np

# Create the matrix to be transposed, in_matrix
n = 100
m = 100
in_matrix = [[i for i in range(n)] for j in range(m)]

# Time the conversion from in_matrix to numpy.array
%timeit np.array(in_matrix)

# Create numpy_in_matrix (a name assigned inside %timeit is not kept in the notebook namespace)
numpy_in_matrix = np.array(in_matrix)

# Transpose function turn_matrix5()
def turn_matrix5(matrix):
    return matrix.T

# Measure processing time
%timeit -r 10 -n 10000 turn_matrix5(numpy_in_matrix)

Execution result
- Conversion from in_matrix to numpy.array: 486 µs ± 10.1 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
- Transpose: 1.67 µs ± 38.9 ns per loop (mean ± std. dev. of 10 runs, 10000 loops each)

Transpose by NumPy


import numpy as np

turned_matrix = np.array(in_matrix).T
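One likely reason the NumPy transpose itself is so fast is that `.T` on an ndarray returns a view of the same data rather than copying the elements. The small sketch below illustrates this with a 2x3 array; the variable names are chosen for this example only, and it is offered as an explanation to check rather than as part of the measurements.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])
transposed = arr.T

print(transposed.shape)                   # (3, 2)
print(np.shares_memory(arr, transposed))  # True: no element data was copied

# Writing through the view changes the original array as well.
transposed[0, 1] = 99
print(arr[1, 0])  # 99

# If a Python list is needed as output, .tolist() copies the data into new nested lists.
as_list = transposed.tolist()
print(as_list)  # [[1, 99], [2, 5], [3, 6]]
```

If the result must be a standard list, the cost of `.tolist()` would need to be added to the conversion and transpose times.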

Swap rows and columns of Python list type 2D array (transpose) | note.nkmk.me

6-6. Transpose by Pandas

Function name: turn_matrix6 ()

Transpose by Pandas


import pandas as pd

# Create the matrix to be transposed, in_matrix
n = 100
m = 100
in_matrix = [[i for i in range(n)] for j in range(m)]

# Time the conversion from in_matrix to pandas.DataFrame
%timeit pd.DataFrame(in_matrix)

# Create pandas_in_matrix (a name assigned inside %timeit is not kept in the notebook namespace)
pandas_in_matrix = pd.DataFrame(in_matrix)

# Transpose function turn_matrix6()
def turn_matrix6(matrix):
    return matrix.T

# Measure processing time
%timeit -r 10 -n 10000 turn_matrix6(pandas_in_matrix)

Execution result
- Conversion from in_matrix to pandas.DataFrame: 6.19 ms ± 43.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
- Transpose: 86.3 µs ± 4.54 µs per loop (mean ± std. dev. of 10 runs, 10000 loops each)

Transpose by Pandas


import pandas as pd

turned_matrix = pd.DataFrame(in_matrix).T

Swap rows and columns of Python list type 2D array (transpose) | note.nkmk.me
