[DSU Edition] AtCoder Library reading with a green coder ~ Implementation in Python ~

0. Introduction

AtCoder Official Algorithm Collection AtCoder Library (** ACL **) released on September 7, 2020 it was done. I thought it was a good opportunity because most of the algorithms and data structures recorded in ACL were new to me, so I did everything from studying algorithms to implementing them in Python.

In this article we will look at ** DSU **.

Target readers

Untargeted readers

Referenced

There is an easy-to-understand explanation by the AtCoder formula.

1. What is DSU?

** DSU ** (** Disjoint Set Union **, ** Disjoint-set data structure **) is a data structure that divides a data set into elementary sets (groups) and holds them. That is, each data belongs to one group and never to more than one group.

dsu_1.png

This data structure supports two convenient operations:

For this reason, this data structure is sometimes referred to as ** UnionFind **. This name may be more familiar to you.

dsu_2.png

In terms of implementation ...

Union specifies two elements instead of groups and merges the groups to which they belong. In addition, one element is selected as a leader in each group and managed as a so-called "group name". ("Group 1" and "Group 2" in the above figure are, for example, 1 and 4.)

2. Implementation

Let's implement it. Variable names, method names, etc. should be implemented according to ACL as much as possible.

2.1. Constructor

First, create a class DSU and implement the constructor.


class DSU:
    def __init__(self, n):  # n:Element count
        self._n = n
        self.parent_or_size = [-1] * n

n is the number of elements and is stored in the instance variable _n. It also creates a list parent_or_size that stores information about each element. As the name implies, each element parent_or_size [i] in this list represents the parent of element i or the size of the group to which element i belongs. In particular

parent_or_size [i] is negative :
Element i is the leader of the group, and the size of the group to which it belongs is abs (parent_or_size [i])
parent_or_size [i] is 0 or more :
The leader of the group to which element i belongs is parent_or_size [i]

is. In other words, at the time of initialization, all the elements belong to the "group of size 1 with you as the leader (group with your own elements)".

2.2. Find The operation Find is implemented in ACL under the name leader. By specifying an element, this method returns the leader of the group to which the element belongs.

def leader(self, a):
    assert 0 <= a < self._n
    if self.parent_or_size[a] < 0: return a  #Leader if negative
    self.parent_or_size[a] = self.leader(self.parent_or_size[a])
    return self.parent_or_size[a]

First, if parent_or_size [a] is negative on the third line, a is the leader, so it returns as it is. Otherwise parent_or_size [a] is the reader ** (provisional) ** of a. This is because one leader is no longer a leader if there is a group integration. Therefore, we recursively search for a ** true ** reader from this reader (provisional). In the 4th line, the reader information is updated by substituting the right side without returning directly. It may be said that it is a memo. This will shorten the next and subsequent searches.

2.3. Union The operation Union is implemented in ACL under the name merge. This method integrates the groups to which they belong by specifying two elements.

def merge(self, a, b):
    assert 0 <= a < self._n
    assert 0 <= b < self._n
    x, y = self.leader(a), self.leader(b)
    if x == y: return x
    if -self.parent_or_size[x] < -self.parent_or_size[y]: x, y = y, x
    self.parent_or_size[x] += self.parent_or_size[y]  #Add the size of y to the size of x
    self.parent_or_size[y] = x  #The leader of y is x
    return x

First, find the leaders x and y of a and b, respectively. If they match, you don't need to do anything because a and b already belong to the same group (line 5). If not, merge the y group into the x group. At this time, by doing the 6th line, always integrate the small group into the large group. Group members of y need to update the leader's information, which can reduce the amount of calculation.

2.4. Other methods

There are several other methods implemented in the ACL DSU, so we will implement them as well.

same By specifying two elements, the method same returns the boolean value of whether they belong to the same group.

def same(self, a, b):
    assert 0 <= a < self._n
    assert 0 <= b < self._n
    return self.leader(a) == self.leader(b)

size The method size returns the size (number of elements) of the group to which the specified element belongs.

def size(self, a):
    assert 0 <= a < self._n
    return -self.parent_or_size[self.leader(a)]

groups The method groups returns a list of all the elements grouped together.

def groups(self):
    leader_buf = [self.leader(i) for i in range(self._n)]
    result = [[] for _ in range(self._n)]
    for i in range(self._n): result[leader_buf[i]].append(i)
    return [r for r in result if r != []]

2.5. Summary

dsu.py


class DSU:
    def __init__(self, n):
        self._n = n
        self.parent_or_size = [-1] * n
    
    def merge(self, a, b):
        assert 0 <= a < self._n
        assert 0 <= b < self._n
        x, y = self.leader(a), self.leader(b)
        if x == y: return x
        if -self.parent_or_size[x] < -self.parent_or_size[y]: x, y = y, x
        self.parent_or_size[x] += self.parent_or_size[y]
        self.parent_or_size[y] = x
        return x

    def same(self, a, b):
        assert 0 <= a < self._n
        assert 0 <= b < self._n
        return self.leader(a) == self.leader(b)

    def leader(self, a):
        assert 0 <= a < self._n
        if self.parent_or_size[a] < 0: return a
        self.parent_or_size[a] = self.leader(self.parent_or_size[a])
        return self.parent_or_size[a]
    
    def size(self, a):
        assert 0 <= a < self._n
        return -self.parent_or_size[self.leader(a)]
    
    def groups(self):
        leader_buf = [self.leader(i) for i in range(self._n)]
        result = [[] for _ in range(self._n)]
        for i in range(self._n): result[leader_buf[i]].append(i)
        return [r for r in result if r != []]

3. Usage example

n = 8  #Total number of elements
d = DSU(n)
d.merge(0, 1)
d.merge(1, 3)
d.merge(0, 4)
d.merge(5, 6)
d.merge(3, 7)
print(d.groups())  # [[0, 1, 3, 4, 7], [2], [5, 6]]
print(d.leader(3))  # 0
print(d.same(1, 7))  # True
print(d.same(0, 5))  # False
print(d.size(6))  # 2

4. Problem example

AtCoder Library Practice Contest A "Disjoint Set Union" AtCoder Typical Contest 001 B "Union Find"

5. Conclusion

From elucidation of the mechanism of DSU to implementation in Python. In addition, it was found that various measures were taken in terms of implementation. I was particularly impressed with the idea of expressing the necessary information in a single array.

Please let us know if you have any mistakes, bugs, or advice.

Recommended Posts