The newly released Raspberry Pi 2 has a 4-core CPU. With four cores on hand, I wanted to run distributed processing and use all of them to the full.
I ran a simple program with the distributed processing framework Parallel Python and compared the processing speed of the old model and the new model. I also checked the utilization of each core with mpstat.
Compared to the old model, the new model is **about 8.5 times** faster when all 4 cores are used, and **about 2.4 times** faster even on a single core. With all 4 cores in use, mpstat showed every core at 100%.
I am satisfied, because I managed to use all 4 cores. It also turned out that a single core is fast on its own. [There is a report that the desktop screen and Chrome browser start up 2.2 to 2.6 times faster](http://itpro.nikkeibp.co.jp/atcl/column/14/565123/020600005/?ST=oss&P=2). Since that speedup is close to my single-core result, the per-core performance improvement may account for much of it.
The task is to count the even numbers among the natural numbers from 1 to N. Each number is checked one by one by taking its remainder when divided by 2. The natural numbers from 1 to N are split into k ranges, and k cores each process one range. The answer is simply N / 2, so I apologize to the Raspberry Pi in advance for giving it such a boring job.
This time, N = 10 million.
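As a rough sketch of the splitting scheme described above, the snippet below divides 1..N into k contiguous ranges and checks that the per-range counts add up to N / 2. It runs on a single core without Parallel Python; the helper names `split_range` and `count_even` are made up for this illustration, and N is kept small so the check finishes instantly.

```python
def split_range(n, k):
    """Split 1..n into k contiguous (start, end) ranges, both ends inclusive."""
    size = n // k
    ranges = []
    for i in range(k):
        start = size * i + 1
        # The last range absorbs the remainder when n is not divisible by k.
        end = n if i == k - 1 else start + size - 1
        ranges.append((start, end))
    return ranges


def count_even(start, end):
    """Count even numbers in start..end by testing each remainder."""
    return sum(1 for x in range(start, end + 1) if x % 2 == 0)


N = 1000  # small value for illustration; the article uses 10 million
K = 4
chunks = split_range(N, K)
total = sum(count_even(s, e) for s, e in chunks)
print(chunks)
print(total == N // 2)
```

Because the ranges do not overlap and together cover 1..N, the partial counts can be computed independently and simply summed at the end.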
pptest.py
# coding: UTF-8
import pp

# Determine whether n is a multiple of m
def ismultiples(n, m):
    return n % m == 0

# Count how many multiples of m are among the natural numbers from n1 to n2 (inclusive)
def sum_multiples(n1, n2, m):
    cnt = 0
    for x in range(n1, n2 + 1):
        if ismultiples(x, m):
            cnt += 1
    return cnt

# IP address of the Raspberry Pi that does the processing
ppservers = ("192.168.1.241", "192.168.1.241", "192.168.1.241", "192.168.1.241",)  # Use 4 cores
# ppservers = ("192.168.1.241",)  # Use 1 core

# Maximum value of the natural numbers
N = 10000000

# Count multiples of this number
M = 2

# Number of nodes
num_node = len(ppservers)

# Create a server object, registering the nodes to connect to
# (the first argument 0 means no local worker processes are started)
job_server = pp.Server(0, ppservers)

# Generate tasks
jobs = []
for i in range(num_node):
    # Find the range of natural numbers handled by the i-th node
    indSt = N // num_node * i + 1
    if i == num_node - 1:
        indEnd = N
    else:
        indEnd = indSt + N // num_node - 1

    # Submit the task to a node
    jobs.append(job_server.submit(sum_multiples, (indSt, indEnd, M), (ismultiples,), ("math",)))
    print("task%d args:(%d,%d,%d)" % (i, indSt, indEnd, M))

# Collect the results. Calling a job returns the return value of sum_multiples().
# If processing on the node has not finished yet, the call blocks until it does.
result = 0
for job in jobs:
    result += job()

# Print the result
print "Multiples of %d out of natural numbers less than or equal to %d = %d" % (M, N, result)
job_server.print_stats()
ppservers lists the same IP address four times. This sends the four sub-tasks to the same Raspberry Pi, so all 4 cores are put to work. To use only one core, list the address once. To distribute the processing across multiple Raspberry Pis, list the IP address of each one here.
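Following the approach described above, the tuple could also be built from a list of hosts and core counts instead of being typed out by hand. This is only a sketch: `build_ppservers` is a hypothetical helper, not part of Parallel Python, and the second address `192.168.1.242` is made up for the example.

```python
def build_ppservers(hosts_and_cores):
    """Repeat each host once per core to be used, as the article does by hand."""
    servers = []
    for host, cores in hosts_and_cores:
        servers.extend([host] * cores)
    return tuple(servers)


# One Raspberry Pi 2, all 4 cores (the configuration used in this article).
ppservers = build_ppservers([("192.168.1.241", 4)])
print(ppservers)

# Two boards: 4 cores on the first, 1 core on a hypothetical second board.
two_boards = build_ppservers([("192.168.1.241", 4), ("192.168.1.242", 1)])
print(len(two_boards))
```

The resulting tuple has the same shape as the hand-written `ppservers` in pptest.py, so `num_node = len(ppservers)` keeps working unchanged.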
> python setup.py
> ppserver.py &
> python pptest.py
Execution environment | Processing time [sec] | Speed improvement rate (vs. old model) |
---|---|---|
Old model (Raspberry Pi B+) | 48.7 | - |
New model (Raspberry Pi 2 B), 1 core used | 20.1 | 2.4 |
New model (Raspberry Pi 2 B), 4 cores used | 5.7 | 8.5 |
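The speed improvement rates are just the old model's time divided by each new time. The short calculation below reproduces them from the measured times and, as an extra that is not in the original measurements, derives how well the 4-core run scales relative to the 1-core run on the same board.

```python
old_model = 48.7      # Raspberry Pi B+, seconds
new_1core = 20.1      # Raspberry Pi 2 B, 1 core used, seconds
new_4core = 5.7       # Raspberry Pi 2 B, 4 cores used, seconds

speedup_1core = old_model / new_1core   # vs. old model
speedup_4core = old_model / new_4core   # vs. old model
scaling = new_1core / new_4core         # 4 cores vs. 1 core on the new model
efficiency = scaling / 4                # fraction of ideal 4x scaling

print("1 core : %.1fx" % speedup_1core)       # 2.4x
print("4 cores: %.1fx" % speedup_4core)       # 8.5x
print("scaling: %.1fx (%.0f%% of ideal)" % (scaling, efficiency * 100))
```

The 4-core run is about 3.5 times faster than the 1-core run on the same board, so most of the remaining gain over the old model comes from the faster individual core.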
The output of the script for each configuration is as follows.

Old model (Raspberry Pi B+)

task0 args:(1,10000000,2)
Multiples of 2 out of natural numbers less than or equal to 10000000 = 5000000
Job execution statistics:
 job count | % of all jobs | job time sum | time per job | job server
         1 |        100.00 |      48.4202 |    48.420249 | 192.168.1.241:60000
Time elapsed since server creation 48.7006518841
0 active tasks, 0 cores

New model (Raspberry Pi 2 B), 1 core used

task0 args:(1,10000000,2)
Multiples of 2 out of natural numbers less than or equal to 10000000 = 5000000
Job execution statistics:
 job count | % of all jobs | job time sum | time per job | job server
         1 |        100.00 |      19.6110 |    19.610974 | 192.168.1.241:60000
Time elapsed since server creation 20.0955970287
0 active tasks, 0 cores

New model (Raspberry Pi 2 B), 4 cores used

task0 args:(1,2500000,2)
task1 args:(2500001,5000000,2)
task2 args:(5000001,7500000,2)
task3 args:(7500001,10000000,2)
Multiples of 2 out of natural numbers less than or equal to 10000000 = 5000000
Job execution statistics:
 job count | % of all jobs | job time sum | time per job | job server
         4 |        100.00 |      20.6152 |     5.153800 | 192.168.1.241:60000
Time elapsed since server creation 5.68198180199
0 active tasks, 0 cores
sudo apt-get install sysstat
To take 10 samples at 1-second intervals, run the following:
mpstat -P ALL 1 10
I ran the above script while mpstat was running in another terminal.
On the old model there is only one core, so the only CPU number is 0. %usr is 100.
22:30:56     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
22:30:57     all  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
22:30:57       0  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
On the new model with 1 core in use, there are 4 cores, so the CPU numbers run from 0 to 3. Only CPU 3 is at 100% and the others are at 0%. The all row is 1/4 of that, which is 25.0%.
22:35:54     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
22:35:55     all   25.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00   75.00
22:35:55       0    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
22:35:55       1    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
22:35:55       2    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
22:35:55       3  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
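The 25% in the all row is the mean of the four per-core %usr values. The sketch below checks this against the sample above; `parse_usr` is a hypothetical helper that assumes the column layout shown here (time, CPU, %usr, ...) with a 24-hour timestamp and no AM/PM column.

```python
# One mpstat sample copied from the 1-core run on the 4-core model.
SAMPLE = """\
22:35:55 all 25.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 75.00
22:35:55 0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00
22:35:55 1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00
22:35:55 2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00
22:35:55 3 100.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
"""


def parse_usr(text):
    """Map each CPU label ("all", "0", ...) to its %usr value."""
    usr = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3 or fields[1] == "CPU":
            continue  # skip blank lines and the header row
        usr[fields[1]] = float(fields[2])
    return usr


usr = parse_usr(SAMPLE)
per_core = [value for cpu, value in usr.items() if cpu != "all"]
mean_usr = sum(per_core) / len(per_core)
print("mean of cores: %.2f, all row: %.2f" % (mean_usr, usr["all"]))
```

The same check on the 4-core run gives a mean of 100%, which is why the all row there also reads 100.00.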
With 4 cores in use, all 4 cores are at 100%. Satisfied.
22:22:11     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
22:22:12     all  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
22:22:12       0  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
22:22:12       1  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
22:22:12       2  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
22:22:12       3  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00