Comparison of Hungarian law and general-purpose solver for allocation problems

what is this

There was an article in "Assignment problem, Hungarian law and integer programming problem" that the general-purpose solver is slow. In this article, I'll show you that it's faster to modify the code a bit and "create a mathematical model and solve it with a general-purpose solver".

Python code

The code is below. You need pip install numpy pulp munkres to run it.

import random
import time

import numpy as np
from munkres import Munkres
from pulp import PULP_CBC_CMD, LpProblem, LpVariable, lpDot, lpSum, value


class AssigmentProblem:
    def __init__(self, size, seed=0):
        self.size = size
        random.seed(seed)
        rng = range(self.size)
        self.weight = np.array([[random.randint(1, 100) for i in rng] for j in rng])

    def solve_hungarian(self):
        start_tm = time.time()
        m = Munkres()
        result = m.compute(self.weight.tolist())
        val = sum([self.weight[i, j] for i, j in result])
        tm = time.time() - start_tm
        print(f"hungarian {tm = :.2f} {val = }")

    def solve_pulp(self):
        m = LpProblem("AssignmentProblem")
        rng = range(self.size)
        x = np.array(LpVariable.matrix('x', (rng, rng), cat='Binary'))
        m += lpDot(self.weight.flatten(), x.flatten())
        start_tm = time.time()
        for i in rng:
            m += lpSum(x[i]) == 1
            m += lpSum(x[:, i]) == 1
        m.solve(PULP_CBC_CMD(mip=False, msg=False))
        val = value(m.objective)
        tm = time.time() - start_tm
        print(f"pulp      {tm = :.2f} {val = }")


if __name__ == "__main__":
    p1 = AssigmentProblem(300)
    p1.solve_hungarian()
    p1.solve_pulp()

Execution result

Tm: Calculation time (seconds), val: Objective function value
The upper row is the Hungarian law, and the lower row is the general-purpose solver.

hungarian tm = 2.43 val = 352
pulp      tm = 1.94 val = 352.0

As you can see, the general-purpose solver is faster.

The time in the lower part of the above includes both mathematical model creation and general-purpose solver execution.
Mathematical model creation uses a modeler called PuLP, and general-purpose solver execution uses a solver called cbc.

Reference: Python in optimization

Main correction points

――In the original article, it took time ** to create a mathematical model **. The reason is that I used sum instead of lpSum. sum creates wasted memory and is slow. --Please also refer to "Sum of variables in mathematical model". --The variables of the mathematical model are 0-1 binary variables, but they are solved as continuous variables. Because the adjacency matrix of the assignment problem is all unimodular, linear relaxation gives an integer solution. --Reference: https://www.weblio.jp/content/ All unimodularity

Digression

――The general-purpose solver used this time is a free solver called cbc. It will be faster with a paid solver. ――In practice, an approximate solution is often sufficient. Using the approximate solution solver is a trade-off with the accuracy of the solution, but it will be even faster. ――The data this time is a complete bipartite graph, but in practice the data is often sparse. In that case, it will be even faster because there are fewer variables. ――I could solve the allocation problem with NetworkX, but it was too slow to compare.