In Python (CPython), two or more `threading.Thread` threads cannot execute Python bytecode truly in parallel because of the GIL (Global Interpreter Lock), although they do run concurrently. That made me wonder about other implementations, so I tried the same experiment with Jython. Since Jython can also call the Java API directly, I tested `java.lang.Thread` as well.
The workload enumerates the prime numbers in the range 4 to 100,000, split evenly across worker threads. The following worker uses `threading.Thread`.
py_worker.py

```python
from threading import Thread

class Worker(Thread):
    def __init__(self, start, end):
        super(Worker, self).__init__()
        self._start = start
        self._end = end

    def run(self):
        # Collect every i in [start, end) that has no divisor
        self.prime_nums = []
        for i in xrange(self._start, self._end):
            if not 0 in self._remainders(i):
                self.prime_nums.append(i)

    def _remainders(self, end, start=2):
        # Yield end's remainder for each candidate divisor
        for i in xrange(start, end):
            yield end % i
```
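As an aside, the worker's trial division tests every divisor up to `i` itself; a common optimization (not used in the measurements below, so the timings stand as reported) stops at the square root. A minimal sketch, with a hypothetical `is_prime` helper written in Python 3 syntax:

```python
# Sketch only: square-root-bounded trial division.
# is_prime is a hypothetical helper, not part of the original worker.
import math

def is_prime(n):
    # n >= 2 is prime if no integer in [2, sqrt(n)] divides it
    if n < 2:
        return False
    for d in range(2, int(math.sqrt(n)) + 1):
        if n % d == 0:
            return False
    return True

# Same range shape as the worker: candidates in [4, 100)
primes = [i for i in range(4, 100) if is_prime(i)]
```

This reduces the per-candidate work from O(n) to O(sqrt(n)), which matters far more than threading for this particular workload.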
The `java.lang.Thread` version is identical except for the import:
jy_worker.py

```python
from java.lang import Thread

class Worker(Thread):
    def __init__(self, start, end):
        super(Worker, self).__init__()
        self._start = start
        self._end = end

    def run(self):
        self.prime_nums = []
        for i in xrange(self._start, self._end):
            if not 0 in self._remainders(i):
                self.prime_nums.append(i)

    def _remainders(self, end, start=2):
        for i in xrange(start, end):
            yield end % i
```
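Either worker is driven the same way: construct it with a range, `start()` it, `join()` it, then read `prime_nums`. A quick single-thread sanity check with the `threading` version (Python 3 syntax, so `xrange` becomes `range`; not part of the original scripts):

```python
from threading import Thread

class Worker(Thread):
    # Same worker as py_worker.py, with range instead of xrange (Python 3)
    def __init__(self, start, end):
        super(Worker, self).__init__()
        self._start = start
        self._end = end

    def run(self):
        self.prime_nums = []
        for i in range(self._start, self._end):
            if 0 not in self._remainders(i):
                self.prime_nums.append(i)

    def _remainders(self, end, start=2):
        for i in range(start, end):
            yield end % i

w = Worker(4, 30)
w.start()
w.join()
print(w.prime_nums)  # the primes in [4, 30)
```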
The main script that starts the worker threads and measures the elapsed time is as follows.
main.py

```python
import sys
from datetime import datetime

def total_seconds(td):
    # timedelta.total_seconds() equivalent, floored to whole seconds
    return (td.microseconds + (td.seconds + td.days * 24 * 3600) * 10**6) / 10**6

if __name__ == '__main__':
    argv = sys.argv
    argc = len(argv)
    if argc < 4:
        print 'Usage: main.py <worker_module> <n_workers> <max_value>'
        sys.exit(1)
    worker_module = argv[1]
    n_workers = int(argv[2])
    max_value = int(argv[3])
    min_value = 4
    interval = (max_value - min_value) / n_workers

    # Load the Worker class from the module named on the command line
    Worker = __import__(worker_module).Worker
    workers = []
    for start in xrange(min_value, max_value, interval):
        print 'Worker: %s, %s' % (start, start + interval)
        worker = Worker(start, start + interval)
        workers.append(worker)

    start_time = datetime.utcnow()
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    end_time = datetime.utcnow()

    elapsed_time = end_time - start_time
    elapsed_sec = total_seconds(elapsed_time)
    n_primes = sum(len(w.prime_nums) for w in workers)
    print '# of primes = %s, time = %s sec' % (n_primes, elapsed_sec)
```
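main.py defines its own `total_seconds` because `timedelta.total_seconds()` only appeared in Python 2.7 and may be missing on older Jython. On an interpreter that does have it, the two agree for this use; a quick check, written in Python 3 syntax (`//` makes the integer division explicit):

```python
from datetime import timedelta

def total_seconds(td):
    # Same arithmetic as the helper in main.py, floored to whole seconds
    return (td.microseconds + (td.seconds + td.days * 24 * 3600) * 10**6) // 10**6

td = timedelta(days=1, seconds=5, microseconds=250000)
manual = total_seconds(td)
builtin = int(td.total_seconds())  # built-in since Python 2.7
# manual == builtin == 86405
```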
The elapsed time until all workers finished is shown below.
| Implementation | Class | 1 thread | 2 threads |
|---|---|---|---|
| Python | threading.Thread | 100 sec | 125 sec |
| Jython | threading.Thread | 101 sec | 73 sec |
| Jython | java.lang.Thread | 101 sec | 77 sec |
Because CPython runs only one thread's bytecode at a time, splitting the work across two threads does not speed it up (it is actually slower, presumably due to GIL contention), but Jython gave different results.
Since the single-thread times are almost identical for Python and Jython, the baseline performance for this workload seems to be about the same. (I had actually expected Jython to be faster thanks to the JVM's JIT compilation.) With Jython, two threads finished sooner than one, so the two threads evidently did run in parallel. This behavior really is implementation-dependent: Jython has no GIL.