It's probably natural to hear that my friends are studying the implementation of algorithms from time to time and working on the implementation of merge sort. A mysterious game was held in which it was better to implement it with as few lines as possible because it was a big deal. I heard that my friend wanted to do it in Python, so I decided to do it in Python. I won't lose
--Use a data array in which numbers from 1 to 10 are randomly arranged.
--The code of the data array generation and output part is common, so do not mess with it
--Anyway, [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
should be output.
The code of the common part is as follows
from data import generate_data
data = generate_data() #Numbers from 1 to 10 are lined up in pieces
print(data)
#Implemented below this
There is data.py
in another file, and the generate_data
function there will generate the data.
The data will be something like [5, 1, 3, 9, 2, 8, 7, 4, 10, 6]
.
I just made the contents of data.py
properly, so I will omit it here.
When I wrote it according to the above rules, it became like this.
from data import generate_data
data = generate_data() #Numbers from 1 to 10 are lined up in pieces
print(data)
#Implemented below this
divide = lambda data: data if len(data) == 1 else [min(data), max(data)] if len(data) <= 2 else [divide([data[i] for i in range(len(data) // 2)]), divide([data[i] for i in range(len(data) // 2, len(data))])]
merge = lambda left, right: [right.pop(0) if len(left) == 0 else left.pop(0) if len(right) == 0 else left.pop(0) if left[0] < right[0] else right.pop(0) for i in range(len(left) + len(right))] if type(left[0]) is int and type(right[0]) is int else merge(left, merge(right[0], right[1])) if type(left[0]) is int else (merge(merge(left[0], left[1]), right) if type(right[0]) is int else merge(merge(left[0], left[1]), merge(right[0], right[1])))
print(merge(divide(data)[0], divide(data)[1]))
output
[5, 1, 3, 9, 2, 8, 7, 4, 10, 6]
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
There was no problem even if I was told "Long. In three lines". I will explain it for the time being.
Merge sort seems to be based on the "divide-and-conquer method", and it seems that the data array is once divided to the minimum unit, and the data is appropriately selected from each divided data, combined, and then reordered. The person who thought Since many ancestors have explained various details of the algorithm, I will omit it here.
So, the implementation result of the above three lines is Line 1: Split Line 2: Join Line 3: Output The roles are divided like this.
Let's start with "splitting".
This is the code in question.
"Split" code
divide = lambda data: data if len(data) == 1 else [min(data), max(data)] if len(data) <= 2 else [divide([data[i] for i in range(len(data) // 2)]), divide([data[i] for i in range(len(data) // 2, len(data))])]
If the data array is [7, 6, 3, 2, 4, 5, 1, 8]
, the purpose of "splitting" is from this array.
[
[
[6, 7],
[2, 3]
],
[
[4, 5],
[1, 8]
],
]
It means to generate a multiple array with a hierarchical structure like. Divide the minimum unit in half so that the number of elements is 2 or less, and if the number of elements in the minimum unit is 2 or less, order them in ascending order. I'm doing something like that. The ones that are ordered in ascending order are originally the role of "coupling", but it is convenient to do it here, so I will do it at this point.
The "split" function is essentially like this.
Original shape
def divide(data):
if (len(data) <= 2):
return data if len(data) == 1 else [min(data), max[data]]
else:
half_count = len(data) // 2
left = [data[i] for i in range(half_count)]
right = [data[i] for i in range(half_count, len(data))]
return [divide(left), divide(right)]
When the number of elements is 2 or less, the array is returned in ascending order, and in other cases, the above hierarchical structure can be realized by recursing in half for left
and right
. I will.
After `ʻelse, if you use the variable of
half_countdirectly by replacing it with
len (data) // 2, you can avoid using the variable. Oh, I used
leftand
right`` only once! If you do it as it is within the return value
The return value is crazy ~
def divide(data):
if (len(data) <= 2):
return data if len(data) == 1 else [min(data), max[data]]
else:
return [divide([data[i] for i in range(len(data) // 2)]), divide([data[i] for i in range(len(data) // 2, len(data))])]
Will be.
Ah, both ʻif`` and
ʻelsehave only one
return`` sentence. It ’s a ternary operator and it ’s just one line.
One line with ternary operator
def divide(data):
return data if len(data) == 1 else [min(data), max(data)] if len(data) <= 2 else [divide([data[i] for i in range(len(data) // 2)]), divide([data[i] for i in range(len(data) // 2, len(data))])]
It calms down to the feeling. If there is only one return
in the function, use a lambda expression
Make it a lambda expression
divide = lambda data: data if len(data) == 1 else [min(data), max(data)] if len(data) <= 2 else [divide([data[i] for i in range(len(data) // 2)]), divide([data[i] for i in range(len(data) // 2, len(data))])]
It fits perfectly in one line. It's nice to be able to use recursion even with lambda expressions.
This is the code in question.
"Join" code
merge = lambda left, right: [right.pop(0) if len(left) == 0 else left.pop(0) if len(right) == 0 else left.pop(0) if left[0] < right[0] else right.pop(0) for i in range(len(left) + len(right))] if type(left[0]) is int and type(right[0]) is int else merge(left, merge(right[0], right[1])) if type(left[0]) is int else (merge(merge(left[0], left[1]), right) if type(right[0]) is int else merge(merge(left[0], left[1]), merge(right[0], right[1])))
Guhe, Nageyo ... The purpose of "joining" is
Data after division
[
[
[6, 7],
[2, 3]
],
[
[4, 5],
[1, 8]
],
]
It is to combine the divided data in order from the lower layer, and finally combine all the data to return to the original form of one layer. As a flow
Flow 1
[
[2, 3, 6, 7],
[1, 4, 5, 8]
]
Flow 2
[1, 2, 3, 4, 5, 6, 7, 8]
And so on. To achieve this flow, the merge
function
--Both the left and right elements of the argument are combined when their child elements are numeric
--Joining is done by pop
the smaller of the first elements on the left and right and add them as child elements of the new array.
--Otherwise, recurse with the child elements of the left and right elements so that the child elements of the left and right elements are numerical values.
I will do something like that. It's hard to understand even if you explain it in words, so let's look at the code.
The "join" function originally looks like this.
Original shape
def merge(left, right):
if type(left[0]) is int and type(right[0]) is int: #Both when the child element is a number
result = [] #Prepare a new array
for i in range(len(left) + len(right)):
# left,When the right element disappears, pop the one that still exists
if len(left) == 0:
result.append(right.pop(0))
elif len(right) == 0:
result.append(left.pop(0))
#If both are still left, pop the smaller one
elif left[0] < right[0]:
result.append(left.pop(0))
else:
result.append(right.pop(0))
return result
else:
"""
Depending on the hierarchy, the child elements may be numbers or arrays even at the same level.
Recurse leaving the one whose child element is numeric
"""
if type(left[0]) is int:
return merge(left, merge(right[0], right[1]))
elif type(right[0]) is int:
return merge(merge(left[0], left[1]), right)
else:
return merge(merge(left[0], left[1]), merge(right[0], right[1]))
See comments for implementation details. I intentionally made it complete with `ʻif-elif-else``. This means that you can use the ternary operator to make the return value one line, and you can write the function in one line with a lambda expression, just like in the case of "split"!
When I tried to express result
in a list comprehension notation, I was worried that pop
would reduce the elements properly, but when I actually tried it, it decreased properly. I was able to represent the for
part in one line using list comprehensions and ternary operators.
"Output" code
print(merge(divide(data)[0], divide(data)[1]))
Originally here
data = divide(data)
print(merge(data[0], data[1]))
And divide
should be executed only once, but as you can see, it is a waste of the number of lines to exercise 1 for securing variables, so I forcibly execute it twice. Well, if the number of elements is about 10, it's an error.
I'm sure it will be the winning code for the Fucking Code of the Year Awards in me. Thank you very much.
Recommended Posts