Get the nth smallest number from the array with O (logN) using a segment tree

What is this article?

The use of a segment tree can be considered as a method for quickly obtaining the kth smallest number from a set in which numbers are added or deleted. At first I didn't understand this explanation well, so I dump it.

Prerequisite knowledge

RSQ using a segment tree. The following is very easy to understand. Although it is an explanation of RMQ, the idea is about 95% the same for Min and Sum. https://www.creativ.xyz/segment-tree-entrance-999/

RSQ itself is not a data structure whose purpose is to obtain the kth smallest value, but this can be achieved at high speed by using RSQ.

The problem you want to solve

ARC033C: Data Structure

--Process the following query for the set of numbers S. --Type 1: Add the number X to S. `` --Type 2: Answer the Xth smallest number in S and remove that number from S. `` --Constraints: 1 <= N, <= 200,000 --Point 1: The same number is not added to S

point

N is a relatively small value. This constraint is due to the memory constraint when creating the $ len (N) $ segment tree and the constraint that O (N) is always applied when initializing.

Binary search of segment tree RSQ

As an example, consider getting the third smallest number out of $ [2, 3, 5, 7] $.

First, set the value in the segment tree with $ index $ of each number as $ 1 $. (Inside the square in the figure) Given the query RSQ $ [0, i) $, it looks like the value below the box.

Looking at this, we can see that the first (leftmost) i for which the query $ [0, i) $ = k is the value we want to obtain (the kth smallest number).

If you get $ i $ that satisfies the leftmost query $ [0, i) $ = k, it is obvious that the index of i is the value you want to get. Hereafter, this method will be considered.

Method 1: Execute the query in a binary search to find the leftmost k O (log ^ 2 N)

First, the method described in the explanation of ARC033 can be considered.

Taking advantage of the fact that the amount of calculation required for RSQ is O (logN) at most, it is possible to obtain the leftmost index by starting with query [0, N) and performing a binary search like query [0, mid). It is realistic.

In this method, the complexity of binary search $ O (logN) $ is applied to the complexity of RSQ $ O (logN) $, so the total is $ O (log ^ 2N) $.

def segmentTreeBisectLeft(segTreeTarget: segmentTreeSum, x):
    lo, hi = 0, st.lenOriginalList
    while lo < hi:
        mid = (lo+hi)//2
        if segTreeTarget.query(0, mid + 1) < x: lo = mid + 1 #RSQ here[0,mid)Query
        else: hi = mid
    return lo

Method 2: Digging a segment tree with a 2-minute search to find the leftmost k O (log N)

Here, consider find (x, n) that searches for the smallest x managed under node n using the tree structure of the segment tree. Next, using the input example above, the operation when you want to know the third smallest number is shown, and each operation is described.

--1: Input to root node 0 (find (3, 0) --2: Pass find (3, 1) to left node 1 —— 3: If the node has a smaller value than x it receives, it subtracts the node's value and returns it to its parent. In the example, x = 3 minus its own 2 is returned to node 0. This is because if you want to find the x-th smallest number but manage only x-a nodes under your control, there is no node you want to find under your control. --4: Root node 0 returns 1 from node 1 on the left, so use that value to send find (1, 2) to node 2 on the right. --5. Node 2 sends find (1,5) to node 5 on the left. ―― 6. Node 5 sends find (1, 11) to node 11 on the left. This leaf has no value, so it returns None. ―― 7. Node 5 sends find (1, 12) to node 11 on the left. ―― 8: This leaf matches x. This returns index = 5. This required the function to have the kth smallest value of 5.

def findNthValueSub(self, x, nodeId):
    """
    [2, 3, 5, 7] = [0, 0, 1, 1, 0, 1, 0, 1, 0, 0, 0]
Function to get the xth smallest value in the segment tree mapped to
    """
    #From here onward, processing to nodes that are not leaves
    if self.dat[nodeId] < x: #When this node has less than requested value
        return (None, x - self.dat[nodeId])
    if nodeId >= self.lenTreeList - 1: #When nodeIf is a node
        return (True, nodeId - (self.lenTreeList - 1)) #Returns its index number
    resLeft = self.findNthValueSub(x, 2 * nodeId + 1) #Do a search on the left
    if resLeft[0] != None: #If there is a value, return it
        return (True, resLeft[1])
    resRight = self.findNthValueSub(resLeft[1], 2 * nodeId + 2) #Do a search on the right
    return resRight

Detailed explanation

https://www.slideshare.net/chokudai/arc033 https://ei1333.hateblo.jp/entry/2017/12/14/000000

Whole source code

https://github.com/recuraki/PythonJunkTest/blob/master/atcoder/lib/treeSegment/segmentTreeSum.py

Recommended Posts

Get the nth smallest number from the array with O (logN) using a segment tree
Get the number of searches with a regular expression. SeleniumBasic VBA Python
Get the average salary of a job with specified conditions from indeed.com
[Python] Get the files in a folder with Python
Create a phylogenetic tree from Biopyton using ClustalW2
Create a decision tree from 0 with Python (1. Overview)
Get the value of a specific key in a list from the dictionary type in the list with Python
How to identify the element with the smallest number of characters in a Python list?
Get the file name in a folder using glob
DJango Note: From the beginning (using a generic view)
A little bit from Python using the Jenkins API
How to know the number of GPUs from python ~ Notes on using multiprocessing with pytorch ~
How to get only the data you need from a structured data set using a versatile method