[Reinventing the wheel] Human-powered tournament sort [Python]

Environment: Python 2.7

Introduction

When you pass what you want to sort, the elements are combined in a tournament formula and the two elements are manually compared. The final result is a complete ranking with no overlap in ranking. (Total order relation)

Maybe there is a better implementation, but I wanted to see if the idea was correct, so I made it.


__ Addendum (2014/12/04) __ Too natural implementation → Re: Human-powered sorting [Python]

Algorithm overview

The outline of the algorithm is as follows.

tournament.png

First, place all the elements in a tournament format and let them compete (pictured above). At this time, keep the tuple that has the winner at the beginning of the index. The final winner is the strongest of these factors. So this is rank 1 (green arrow).

Second, rank 2 is the second strongest, so of course you should have lost to this champion somewhere. You can search for this from the tuple you saved earlier (the part surrounded by red and the red arrow). Therefore, if you reposition these elements in a tournament format and compete (see the figure below), Rank 2 will definitely win.

After that, Rank 3 must have been defeated by Rank 2 in the previous battles (the part surrounded by blue and the blue arrow), so I will repeat this. By doing this, it is stored in the array rank from the highest rank, and you can see that you can continue until there are no more elements.

Code commentary

Commentary?

I wrote in Python what implements the above algorithm.

As I wrote in the description, this was written for self-analysis (laughs) I wanted something that would eventually become a ranking when I listed things of the same genre and answered which one I liked. is.

--Argument management with argparse. --As you can see in the last 7 lines, the function is executed recursively.

One thing to keep in mind is when you can't play a tournament with one element and when the final end is.

The function compare, which is commented out in the ~~ class Rank, was used when it was troublesome to input each time at the test stage, and was also used to measure the amount of calculation shown later. Other print commentouts are also for testing. If you display this, it will be easier to understand how it works. ~~

(Addition) For commenting out print, by adding "-v" to the option, the display is divided according to the truth of self.verbose. There seems to be a better way. Compare also switches between default_compare and auto_compare with the option "-t".

ranking.py


#!/usr/bin/env python
# -*- coding:utf-8 -*-
#
# written by ssh0, September 2014.

description = """
Ranking interface for self-analysis.
In this application
Create a comparative question based on the given list
The ranking is completed when the user answers it.
              """

import argparse
import sys


class Rank:

    def __init__(self, args):
        self.pair = []  # match pairs
        self.rank = []  # sorted list
        self.args = args
        self.verbose = self.args.verbose
        if self.args.test:
            self.compare = self.auto_compare
        else:
            self.compare = self.default_compare
        self.N = 0

    def default_compare(self, a, b):
        self.N += 1
        print '\nQ[%d] which ones do you like?' % self.N
        print '  [j]: %s, [k]: %s' % (a, b)
        key_input = raw_input(">>> ")
        if key_input == 'j':
            ans = (a, b)
        elif key_input == 'k':
            ans = (b, a)
        else:
            raise ValueError('please select by "j" or "k".')
        return ans

    def auto_compare(self, a, b):  #For measurement
        self.N += 1
        ans = (max(a, b), min(b, a))
        return ans

    def first_matching(self, data):
        n = len(data)
        N = 1
        if n == 0:
            self.func_finally()
        elif n == 1:
            self.rank.append(data[0])
            if self.verbose:
                print "n = 1 ::",
                print self.rank
            return self.first_matching(self.narrowing(data[0]))
        else:
            while n >= N:
                N = N*2
            unseed = 2*n - N
            first_winner = []
            for i in range(unseed/2):
                a, b = data.pop(0), data.pop(0)
                ans = self.compare(a, b)
                g, l = ans[0], ans[1]
                first_winner.append(g)
                self.pair.append((g, l))            
            data += first_winner
            member = [(data[2*i], data[2*i+1]) for i in range(len(data)/2)]
            return member

    def tournament(self, data):
        member = self.first_matching(data)
        if self.verbose:
            print "next tournament",
            print member
        while len(member) > 1:
            winner = []
            for p in member:
                ans = self.compare(p[0], p[1])
                g, l = ans[0], ans[1]
                self.pair.append((g, l))
                winner.append(g)
            member = [(winner[2*i], winner[2*i+1]) for i 
                      in range(len(member)/2)]
        ans = self.compare(member[0][0], member[0][1])
        top, l = ans[0], ans[1]
        self.pair.append((top, l))
        self.rank.append(top)
        if self.verbose:
            print "pairs",
            print self.pair
            print "champion",
            print top
            print "current rank",
            print self.rank
        return top

    def narrowing(self, x):
        unsort_table = []
        for_delete = []
        for i, pair in enumerate(self.pair):
            if x == pair[0]:
                unsort_table.append(pair[1])
                for_delete.append(i)
        for j in for_delete:
            self.pair[j] = None
        self.pair = [k for k in self.pair if k is not None]

        if self.verbose:
            print "unsort_table",
            print unsort_table
        return unsort_table
        
    def func_finally(self):
        result = ''
        for i, x in enumerate(self.rank):
            result += ' %d: %s\n' % (i+1, x)

        if self.args.outfile:
            outfile = self.args.outfile
            with open(outfile, 'a') as out:
                out.write(result)
            if not self.args.test:
                print result
                print "these result is saved in '%s'." % outfile
            else:
                print self.N
        elif self.args.test:
            print self.N
        else:
            print result
        sys.exit()


if __name__ == '__main__':

    parse = argparse.ArgumentParser(description=description)
    parse.add_argument('-l', '--list',
                       dest='obj',
                       nargs='*',
                       type=str,
                       help='list of some objects',
                       default=argparse.SUPPRESS
                       )
    parse.add_argument('-o', '--output',
                       dest='outfile',
                       type=str,
                       help='OPTIONAL: file to save output',
                       default=None
                       )
    parse.add_argument("-v", "--verbose",
                       help="increase output verbosity",
                       action="store_true"
                       )
    parse.add_argument("-t", "--test",
                       help="auto comparing mode (int, descending) for test",
                       action="store_true"
                       )
    args = parse.parse_args()
    data = args.obj

    rank = Rank(args)

    def ranking(pair_lists):
        result = rank.narrowing(rank.tournament(pair_lists))
        return result
        
    a = ranking(data)
    while True:
        a = ranking(a)

Execution example 1

The execution example is as follows.

➤  python ranking.py -l apple orange banana melon berry                      

Q[1] which ones do you like?
  [j]: apple, [k]: orange
>>> j

Q[2] which ones do you like?
  [j]: banana, [k]: melon
>>> j

Q[3] which ones do you like?
  [j]: berry, [k]: apple
>>> j

Q[4] which ones do you like?
  [j]: banana, [k]: berry
>>> k

Q[5] which ones do you like?
  [j]: apple, [k]: banana
>>> j

Q[6] which ones do you like?
  [j]: orange, [k]: banana
>>> j
 1: berry
 2: apple
 3: orange
 4: banana
 5: melon

Execution example 2

Since it is difficult to tell if the numbers are sorted properly from the above example, I will display the debug display and show the case where the numbers from 0 to 9 are sorted in descending order. Pairs is the list of tuples in the figure, and unsorted_table is the elements of the next tournament specified by the arrows in the figure. In the latter half, the number of elements of unsorted_table may become 1, so it is necessary to separate the processing. Ends the process when unsorted_table is empty.

➤  python ranking.py -l 1 0 6 3 4 9 5 7 8 2 -v

Q[1] which ones do you like?
  [j]: 1, [k]: 0
>>> j

Q[2] which ones do you like?
  [j]: 6, [k]: 3
>>> j
next tournament [('4', '9'), ('5', '7'), ('8', '2'), ('1', '6')]

Q[3] which ones do you like?
  [j]: 4, [k]: 9
>>> k

Q[4] which ones do you like?
  [j]: 5, [k]: 7
>>> k

Q[5] which ones do you like?
  [j]: 8, [k]: 2
>>> j

Q[6] which ones do you like?
  [j]: 1, [k]: 6
>>> k

Q[7] which ones do you like?
  [j]: 9, [k]: 7
>>> j

Q[8] which ones do you like?
  [j]: 8, [k]: 6
>>> j

Q[9] which ones do you like?
  [j]: 9, [k]: 8
>>> j
pairs [('1', '0'), ('6', '3'), ('9', '4'), ('7', '5'), ('8', '2'), ('6', '1'), ('9', '7'), ('8', '6'), ('9', '8')]
champion 9
current rank ['9']
unsort_table ['4', '7', '8']

Q[10] which ones do you like?
  [j]: 4, [k]: 7
>>> k
next tournament [('8', '7')]

Q[11] which ones do you like?
  [j]: 8, [k]: 7
>>> j
pairs [('1', '0'), ('6', '3'), ('7', '5'), ('8', '2'), ('6', '1'), ('8', '6'), ('7', '4'), ('8', '7')]
champion 8
current rank ['9', '8']
unsort_table ['2', '6', '7']

Q[12] which ones do you like?
  [j]: 2, [k]: 6
>>> k
next tournament [('7', '6')]

Q[13] which ones do you like?
  [j]: 7, [k]: 6
>>> j
pairs [('1', '0'), ('6', '3'), ('7', '5'), ('6', '1'), ('7', '4'), ('6', '2'), ('7', '6')]
champion 7
current rank ['9', '8', '7']
unsort_table ['5', '4', '6']

Q[14] which ones do you like?
  [j]: 5, [k]: 4
>>> j
next tournament [('6', '5')]

Q[15] which ones do you like?
  [j]: 6, [k]: 5
>>> j
pairs [('1', '0'), ('6', '3'), ('6', '1'), ('6', '2'), ('5', '4'), ('6', '5')]
champion 6
current rank ['9', '8', '7', '6']
unsort_table ['3', '1', '2', '5']
next tournament [('3', '1'), ('2', '5')]

Q[16] which ones do you like?
  [j]: 3, [k]: 1
>>> j

Q[17] which ones do you like?
  [j]: 2, [k]: 5
>>> k

Q[18] which ones do you like?
  [j]: 3, [k]: 5
>>> k
pairs [('1', '0'), ('5', '4'), ('3', '1'), ('5', '2'), ('5', '3')]
champion 5
current rank ['9', '8', '7', '6', '5']
unsort_table ['4', '2', '3']

Q[19] which ones do you like?
  [j]: 4, [k]: 2
>>> j
next tournament [('3', '4')]

Q[20] which ones do you like?
  [j]: 3, [k]: 4
>>> k
pairs [('1', '0'), ('3', '1'), ('4', '2'), ('4', '3')]
champion 4
current rank ['9', '8', '7', '6', '5', '4']
unsort_table ['2', '3']
next tournament [('2', '3')]

Q[21] which ones do you like?
  [j]: 2, [k]: 3
>>> k
pairs [('1', '0'), ('3', '1'), ('3', '2')]
champion 3
current rank ['9', '8', '7', '6', '5', '4', '3']
unsort_table ['1', '2']
next tournament [('1', '2')]

Q[22] which ones do you like?
  [j]: 1, [k]: 2
>>> k
pairs [('1', '0'), ('2', '1')]
champion 2
current rank ['9', '8', '7', '6', '5', '4', '3', '2']
unsort_table ['1']
n = 1 :: ['9', '8', '7', '6', '5', '4', '3', '2', '1']
unsort_table ['0']
n = 1 :: ['9', '8', '7', '6', '5', '4', '3', '2', '1', '0']
unsort_table []
 1: 9
 2: 8
 3: 7
 4: 6
 5: 5
 6: 4
 7: 3
 8: 2
 9: 1
 10: 0

Computational complexity measurement

By the way, since this program itself is premised on human input, I would like the number of comparisons to be the smallest. So, I measured how many comparison operations were performed on N elements.

Prepare a test script as shown below,

test_ranking.py


#!/usr/bin/env python
# -*- coding:utf-8 -*-

import numpy as np
import matplotlib.pyplot as plt
import commands

Ns = [10*i for i in range(1, 11)]
cal = []

for N in Ns:
    raw = [i for i in range(N)]
    x = np.random.choice(raw, N, replace=False)
    lists = ''
    for _x in x:
        lists += str(int(_x)) + ' '

    command = "python ranking.py -l %s -t" % lists
    result = commands.getoutput(command)
    cal.append(result)

f = plt.figure()
ax = f.add_subplot(111)
ax.plot(Ns, cal)
ax.set_xlabel(r"$N$")
ax.set_ylabel("calculate times")
plt.show()

The graph obtained as a result of execution is shown in the figure below. The horizontal axis represents the number of elements N (in 10 increments), and the vertical axis represents the number of comparison operations.

figure_1.png

[Theoretical Limits of Comparative Sorting (Wikipedia)](http://ja.wikipedia.org/wiki/%E3%82%BD%E3%83%BC%E3%83%88#.E6.AF.94.E8 .BC.83.E3.82.BD.E3.83.BC.E3.83.88.E3.81.AE.E7.90.86.E8.AB.96.E9.99.90.E7.95.8C) is $ O ( For n \ log n) $, $ 100 \ log 100 \ simeq 460.5 $, and $ N = 100 $, the amount of comparison operation was 626, so it seems that the order cannot be made smaller. Conversely, I think that the algorithm created this time was a good comparison sorting algorithm.

Summary

The tournament sort itself should have been implemented somewhere, so if you define the comparison operator yourself, you might have been able to do the same thing. However, I am personally satisfied because I was able to think about it myself. I don't care if it's called reinventing the wheel. (If you are familiar with methods, please let me know)

that? What is self-analysis?

Recommended Posts

[Reinventing the wheel] Human-powered tournament sort [Python]
Python # sort
[Python] Sort the list of pathlib.Path in natural sort
[Python] Sort the table by sort_values (pandas DataFrame)
Sort in Python. Next, let's think about the algorithm.
Find the maximum Python
Bubble sort in Python
Python self-made class sort
the zen of Python
Re: Human-powered sorting [Python]
[Memo] Python3 list sort
Python sort cheat sheet
Custom sort in Python3
[Python] Sort collection types
[Python] Split the date
Sort and output the elements in the list as elements and multiples in Python.
Implemented the algorithm of "Algorithm Picture Book" in Python3 (Bubble Sort)
Implemented the algorithm of "Algorithm Picture Book" in Python3 (selection sort)