As the title suggests, when an ndarray is initialized with `numpy.zeros([rows, columns])`, **the memory for an ndarray of that full size is not actually allocated up front.** Internally, physical memory appears to be committed only for the elements that are actually written, i.e. the non-zero data. As a result, even a huge ndarray such as 1,000,000 x 1,000,000 can be handled comfortably on a PC with 32 GB or 64 GB of memory, as long as it is created with `numpy.zeros()`. This can be handy when dealing with sparse matrices in memory-constrained environments, although the more robust choice is simply to use `scipy.sparse` (a minimal sketch is included at the end of this post).
```python
import numpy as np
import sys

# Create a 1,000,000 x 1,000,000 float64 ndarray
zero_array = np.zeros([1000000, 1000000])

# Check the reported memory size
sys.getsizeof(zero_array)
# 8000000000112 (about 8 TB)
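# Note: sys.getsizeof reports the nominal buffer size plus a little object
# overhead; zero_array.nbytes gives the same ~8 TB figure, even though almost
# no physical memory is resident at this point.
zero_array.nbytes
# 8000000000000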
# Printing still works comfortably on a PC with 32 GB of memory
print(zero_array)
#[[0. 0. 0. ... 0. 0. 0.]
# [0. 0. 0. ... 0. 0. 0.]
# [0. 0. 0. ... 0. 0. 0.]
# ...
# [0. 0. 0. ... 0. 0. 0.]
# [0. 0. 0. ... 0. 0. 0.]
# [0. 0. 0. ... 0. 0. 0.]]
zero_array[0,0] = 1
print(zero_array)
#[[1. 0. 0. ... 0. 0. 0.]
# [0. 0. 0. ... 0. 0. 0.]
# [0. 0. 0. ... 0. 0. 0.]
# ...
# [0. 0. 0. ... 0. 0. 0.]
# [0. 0. 0. ... 0. 0. 0.]
# [0. 0. 0. ... 0. 0. 0.]]
```
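To check that physical memory really is committed only when elements are written, one can watch the process's resident set size before and after a write. Below is a minimal sketch, assuming a Unix-like OS where the standard library's `resource` module is available, where `ru_maxrss` is reported in kilobytes (on macOS it is bytes), and where the OS allows this much virtual address space to be reserved (memory overcommit).

```python
import resource

import numpy as np

def peak_rss_mb():
    # Peak resident set size of this process, in MB (Linux reports kilobytes)
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024

big = np.zeros([100000, 100000])   # nominally 80 GB of float64 zeros
print(peak_rss_mb())               # small: no pages have been touched yet

big[::1000, ::1000] = 1.0          # write a scattered 100 x 100 subset
print(peak_rss_mb())               # grows only by the pages actually written
```

On a typical Linux setup the first figure stays small, and the second grows by roughly one page per touched element rather than by the nominal 80 GB.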
Initializing the same shape with `numpy.ones()`, on the other hand, runs out of memory and the process is killed: every element has to be set to 1, so every page of the roughly 8 TB buffer must actually be backed by physical memory.
```python
one_array = np.ones([1000000, 1000000])
# The process terminates abnormally (out of memory)
```
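Finally, for data that is genuinely sparse, `scipy.sparse` stores only the non-zero entries explicitly and does not rely on any of the lazy-allocation behaviour above. A minimal sketch, assuming SciPy is installed:

```python
import numpy as np
from scipy import sparse

# A 1,000,000 x 1,000,000 matrix; only the assigned entries are stored
mat = sparse.lil_matrix((1000000, 1000000), dtype=np.float64)
mat[0, 0] = 1.0
mat[12345, 67890] = 2.5

# Convert to CSR for efficient arithmetic and row slicing
csr = mat.tocsr()
print(csr.nnz)      # 2
print(csr[0, 0])    # 1.0
```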