When $ w $ and $ x $ are n-dimensional vectors, computing $ z = w_1x_1 + w_2x_2 + \dots + w_nx_n + b $ with a for loop is inefficient. In vectorized form the same quantity is $ z = w^T x + b $, which can be computed efficiently. In Python, you can compare the two approaches with the following code (the arrays `a` and `b` play the roles of $ w $ and $ x $, and the bias term is omitted).
import numpy as np
import time

# Two random vectors with 1,000,000 elements each
a = np.random.rand(1000000)
b = np.random.rand(1000000)

# Vectorized version: a single call to np.dot
tic = time.time()
c = np.dot(a, b)
toc = time.time()
print(f'{"Vectorized version":20}:{str(1000 * (toc-tic))} ms')

# Non-vectorized version: accumulate the products in an explicit loop
c = 0
tic = time.time()
for i in range(1000000):
    c += a[i] * b[i]
toc = time.time()
print(f'{"For loop":20}:{str(1000 * (toc-tic))} ms')

Vectorized version  :3.9501190185546875 ms
For loop            :1007.7228546142578 ms
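To tie the timing back to the formula, the following minimal sketch checks that the loop and `np.dot` produce the same $ z = w^T x + b $. The names `w`, `x`, `b` and the helper functions `z_loop` and `z_vectorized` are illustrative assumptions and do not appear in the code above.

```python
import numpy as np

def z_loop(w, x, b):
    """Compute z = w_1*x_1 + ... + w_n*x_n + b with an explicit loop."""
    z = 0.0
    for i in range(len(w)):
        z += w[i] * x[i]
    return z + b

def z_vectorized(w, x, b):
    """Compute the same z as a single dot product plus the bias."""
    return np.dot(w, x) + b

rng = np.random.default_rng(0)  # fixed seed so the check is repeatable
w = rng.random(1000)
x = rng.random(1000)
b = 0.5  # hypothetical bias value, chosen only for illustration

# The two versions should agree up to floating-point rounding.
print(np.isclose(z_loop(w, x, b), z_vectorized(w, x, b)))
```

The two results can differ in the last few digits because the additions happen in a different order, which is why the comparison uses `np.isclose` rather than `==`.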
Next, suppose you want to apply the exponential function to every element of a 1,000,000-dimensional vector $ v $.
import numpy as np
import time
import math

v = np.random.rand(1000000)

# Vectorized version: np.exp acts on the whole array at once
tic = time.time()
u = np.exp(v)
toc = time.time()
print(f'{"Vectorized version":20}:{str(1000 * (toc-tic))} ms')

# Non-vectorized version: fill a preallocated array one element at a time
u = np.zeros(1000000)
tic = time.time()
for i in range(1000000):
    u[i] = math.exp(v[i])
toc = time.time()
print(f'{"For loop":20}:{str(1000 * (toc-tic))} ms')

Vectorized version  :3.992319107055664 ms
For loop            :857.1090698242188 ms
In these measurements the vectorized version is roughly 200 to 250 times faster than the non-vectorized version. Whenever you are about to write a for loop, first consider whether a vectorized operation can replace it.
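The same idea applies to other element-wise computations. As a hedged sketch, the example below rewrites a loop that applies $ 1 / (1 + e^{-v_i}) $ to each element as a single array expression. The function is chosen only for illustration, and the names `sigmoid_loop` and `sigmoid_vectorized` are hypothetical.

```python
import math
import numpy as np

def sigmoid_loop(v):
    """Apply 1 / (1 + exp(-v_i)) one element at a time."""
    out = np.zeros(len(v))
    for i in range(len(v)):
        out[i] = 1.0 / (1.0 + math.exp(-v[i]))
    return out

def sigmoid_vectorized(v):
    """Apply the same function to the whole array at once."""
    return 1.0 / (1.0 + np.exp(-v))

v = np.random.rand(1000)

# Both versions should produce the same values up to rounding.
print(np.allclose(sigmoid_loop(v), sigmoid_vectorized(v)))
```

Because NumPy arithmetic and functions such as `np.exp` operate element-wise on arrays, the loop body can often be written once as an expression over the whole array.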