This is a project (?) that grew out of "Making a password inquiry system (clojure reducers)" and "Speculating on methods that could realistically be implemented for JAL's password management system".
Please don't tell me it's too late or outdated.
Please let me know if there are any omissions in the list. I will add them.
I'm adding a thoroughly unremarkable result here.
OS: Windows 7 Home Premium SP1, CPU: i7-4770T @ 2.50GHz, Python: 3.4.1 (Anaconda 2.0.1 (64-bit))
Since this is Python 3, the character string has to be converted to a byte string before it can be hashed. The CPU has 8 logical cores (4 physical cores with Hyper-Threading), but I set the number of parallel processes to 4. The code I used can be found here: https://github.com/soyiharu/md5_time_trial
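As a small illustration of that conversion: `hashlib.md5` only accepts bytes-like objects in Python 3, so the formatted string has to be encoded first. The helper name `make_candidate` is my own for this sketch and does not appear in the repository.

```python
import hashlib


def make_candidate(salt, i):
    """Build the byte string "<salt>$<6-digit number>" (hypothetical helper)."""
    return "{0}${1:06d}".format(salt, i).encode("utf-8")


candidate = make_candidate("hoge", 42)
print(candidate)                           # b'hoge$000042'
print(hashlib.md5(candidate).hexdigest())  # 32 hexadecimal characters

try:
    hashlib.md5("hoge$000042")             # passing a str instead of bytes
except TypeError as exc:
    print("TypeError:", exc)
```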
hash_single.py

```python
import time
import hashlib
import sys


def main():
    # Expect exactly two arguments: the salt and the target MD5 hash
    argv = sys.argv[1:]
    if len(argv) != 2:
        sys.exit(0)

    salt = argv[0]
    hash = argv[1]

    start = time.time()
    # Try every 6-digit number, hashing "<salt>$<number>" each time
    for i in range(1000000):
        pw = "{0}${1:06d}".format(salt, i).encode("utf-8")
        tmp = hashlib.md5(pw).hexdigest()
        if hash == tmp:
            print("match[{0:06d}]".format(i))
    end = time.time()
    print("elapsed time:{0}s".format(end - start))


if __name__ == "__main__":
    main()
```
hash_parallel.py

```python
import time
import hashlib
import sys
from multiprocessing import Pool
from itertools import repeat


def calc_hash(arg):
    # Each task carries the target hash, the salt, and one candidate number
    hash, salt, i = arg
    tmp = hashlib.md5("{0}${1:06d}".format(salt, i).encode("utf-8")).hexdigest()
    return hash == tmp


def main():
    # Expect exactly two arguments: the salt and the target MD5 hash
    argv = sys.argv[1:]
    if len(argv) != 2:
        sys.exit(0)

    salt = argv[0]
    hash = argv[1]

    start = time.time()
    pool = Pool(4)  # 4 worker processes
    result = pool.map(calc_hash, zip(repeat(hash), repeat(salt), range(1000000)))
    # result.index(True) raises ValueError when nothing matches, so check first
    if True in result:
        print("match[{0:06d}]".format(result.index(True)))
    end = time.time()
    print("elapsed time:{0}s".format(end - start))


if __name__ == "__main__":
    main()
```
I ran each script 5 times and measured the time.

```
python hash_single.py hoge 4b364677946ccf79f841114e73ccaf4f
python hash_parallel.py hoge 4b364677946ccf79f841114e73ccaf4f
```
| Version | 1st run | 2nd run | 3rd run | 4th run | 5th run | Average | Standard deviation |
|---|---|---|---|---|---|---|---|
| Single version | 1.724097967 | 1.736099005 | 1.729099035 | 1.733099937 | 1.739099026 | 1.732298994 | 0.005891065 |
| Multi version | 1.086061954 | 1.098062992 | 1.080061913 | 1.113064051 | 1.085062027 | 1.092462587 | 0.013278723 |

All times are in seconds.
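The summary columns can be reproduced from the five runs with the standard `statistics` module. This is just a sketch of the arithmetic; I am assuming the standard deviation in the table is the sample standard deviation (dividing by n - 1), which is what `statistics.stdev` computes.

```python
import statistics

# Measured times in seconds, copied from the table
single = [1.724097967, 1.736099005, 1.729099035, 1.733099937, 1.739099026]
multi = [1.086061954, 1.098062992, 1.080061913, 1.113064051, 1.085062027]

for name, runs in (("Single", single), ("Multi", multi)):
    mean = statistics.mean(runs)
    stdev = statistics.stdev(runs)  # sample standard deviation (n - 1)
    print("{0}: mean={1:.9f}s stdev={2:.9f}s".format(name, mean, stdev))

# Ratio of the parallel time to the single-process time
ratio = statistics.mean(multi) / statistics.mean(single)
print("parallel / single = {0:.1%}".format(ratio))
```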
The parallel execution time is 63% of the non-parallel time.

※ For reference: I took the OpenMP code from "Brute force of MD5 hash value of 6-digit password" (the version that got the time down to 0.70 seconds), changed it slightly, and ran it with 4 threads; the result was 0.912s. (Source Code)
Since parallelization only brings the execution time down to about 63% (with 4 workers the ideal would be around 25%), it falls well short as parallelization goes. Still, I think Python did well to close to within about 16% of the C result. You could also say the C version is underperforming.
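One possible contributor to the modest speedup, which I have not measured here, is that `pool.map` has to ship one million small tuples to the workers and one million booleans back. A variant that hands each worker a single contiguous range keeps that traffic to a handful of messages. The sketch below is only an illustration of that idea; the names `make_chunks`, `scan_range`, and `find_match_parallel` are hypothetical and are not part of the repository.

```python
import hashlib
from multiprocessing import Pool


def make_chunks(total, parts):
    """Split range(total) into contiguous (start, stop) pairs, one per part."""
    step = -(-total // parts)  # ceiling division
    return [(s, min(s + step, total)) for s in range(0, total, step)]


def scan_range(args):
    """Hash every candidate in [start, stop); return the first match or None."""
    salt, target, start, stop = args
    for i in range(start, stop):
        pw = "{0}${1:06d}".format(salt, i).encode("utf-8")
        if hashlib.md5(pw).hexdigest() == target:
            return i
    return None


def find_match_parallel(salt, target, workers=4, total=1000000):
    """Scan one chunk per worker process and return the matching number, if any."""
    tasks = [(salt, target, start, stop)
             for start, stop in make_chunks(total, workers)]
    with Pool(workers) as pool:
        for found in pool.imap_unordered(scan_range, tasks):
            if found is not None:
                return found
    return None
```

On Windows, `find_match_parallel` needs to be called from inside an `if __name__ == "__main__":` block, just as in hash_parallel.py. Whether this actually narrows the gap would have to be confirmed by timing it on the same machine.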
I learned that bytearray exists. Python strings and byte strings are immutable, but a bytearray can be modified in place. In the single-threaded version, I thought it would be faster to overwrite only the 6-digit number part instead of rebuilding the entire byte string every time, but it was only about 0.1 seconds faster, which was less than I expected.
Here are the changes, so you can see what I did.
```python
from itertools import product

pw = bytearray("{0}${1:06d}".format(salt, 0).encode("utf-8"))  # Prepare bytearray
for i, value in enumerate(product(b'0123456789', repeat=6)):
    pw[-6:] = value  # Change only the number part
    # Then hash pw and compare it with the target, as in hash_single.py
```
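For anyone who wants to try the idea in isolation, here is a self-contained sketch that wraps the fragment in a function. The function name `find_match_bytearray` is hypothetical, and the hashing and comparison lines are my reconstruction of what the surrounding loop in hash_single.py would do.

```python
import hashlib
from itertools import product


def find_match_bytearray(salt, target):
    """Return the 6-digit number whose salted MD5 equals target, or None."""
    # Build "<salt>$000000" once as a mutable buffer
    pw = bytearray("{0}${1:06d}".format(salt, 0).encode("utf-8"))
    # product() yields the digit tuples in ascending order, so i is the number
    for i, digits in enumerate(product(b"0123456789", repeat=6)):
        pw[-6:] = digits  # overwrite only the numeric suffix
        if hashlib.md5(pw).hexdigest() == target:
            return i
    return None


target = hashlib.md5(b"hoge$000321").hexdigest()
print(find_match_bytearray("hoge", target))  # 321
```

Iterating over a bytes object yields integers in Python 3, which is why the tuples from `product` can be assigned directly to a bytearray slice.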
In the end, if you really want speed, C/C++ is probably the way to go.