Since it was written that the segment tree can do various things depending on what you put on it, I implemented what I came up with. It is written in various places that you have to devise something to put on, but it was a good example so I posted it.
--Initialization: $ O (N) $ --Get maximum length of characters with interval: $ O (logN) $ --One character replacement: $ O (logN) $
I want to answer the following problems at high speed.
--A string of n characters s is given. s is expressed as $ s_ {1}, s_ {2} \ dots s_ {n} $ from the first character. --Answer the following 5 types of q queries.
set (i, x)
$ s_ {i} $ to the letter x.contained in the interval of
getCount (s, t)` $ s_ {s}, s_ {s + 1}, \ dots, s_ {t} $.the number of consecutive a
contained in the interval$ s_ {s}, s_ {s + 1}, \ dots, s_ {t} $
getLongestLen (s, t) `.getLeftLen (s, t)
$ s_ {s}, s_ {s + 1}, \ dots, s_ {t} $ interval the number of consecutive a from the left end
getRightLen (s, t)
$ s_ {s}, s_ {s + 1}, \ dots, s_ {t} $ interval the number of consecutive a from the right end
--The constraint is assumed to be about $ 1 \ leq n \ leq 100,000 $, $ 1 \ leq q \ leq 100,000 $ (★ I haven't checked if it passes below)
--In this article, ʻais targeted for each count etc. for the sake of simplicity of explanation (actually, it seems to be a problem to operate the above for the character
Implement a segment tree that carries the following information on each node.
(char
, count
, leftLen
, rightLen
, longestLen
, noneCount
)
An example is shown above.
char
: Characters contained (not present in the figure)
count
: Number of a (5)
leftLen
: Number of consecutive a from the left end of the string (1)
rightLen
: Number of consecutive a from the right end of the string (2)
longestLen
: Maximum length of a in the interval (may be leftLen or rightLen) (3)
noneCount
: Sum of noneCount of L and R (not present in the figure)
Will be. The need for noneCount will be explained later.
Because it is a value of at most one character
--When x = a ("a ", 1, 1, 1, 1, 0)
--x = otherwise (<char>, 0, 0, 0, 0, 0)
And the right end to be filled when making a seg tree is (None, 0, 0, 0, 0, 1)
.
First, the basic operation will be described. Hereinafter, L and R are the left child nodes of the node, and R is the right child node. There are some important exceptions to (*), which will be described later.
count
: Sum of L and R counts
leftLen
: Take over the leftCount of the L node ()
rightLen
: Take over the rightCount of the R node ()
longestLen
: Maximum value (own leftLen, own rightLen, L rightLen + r leftLen, L longest, R Longest)
noneCount
: Sum of noneCount of L and R
Here, the calculation of longestLen
is the maximum of some values. First, it is self-evident that the longest Len
s for L and R are candidates. For L rightLen + r leftLen
, in the parent node, the right end of L and the left end of R are concatenated, so these are continuous character strings and are the longest candidates.
After this, the following two patterns will be described for the continuous length processing of leftLen
and rightLen
, which stated that there is a special case *.
The count of strings from one end is special, if all the nodes under it are consecutive a
, that end has a continuous string combined with the other end. Two examples are shown, and Example 2 corresponds to this.
--Example 1: "aabc" + "aaxz" = "aabcaaxz": The continuous character string from the right end is the same as L 2, and the left end is also 0.
--Example 2: "aaaa" + "aaxz" = "aaaaaaxz": The left end is also 0, but since the right end is all a, it is concatenated with the left end of R and L string length + left end of R
It becomes. This is the basis of string counting from the edge.
Next, due to the characteristics of the segment tree, the problems that clearly occur during initialization are shown in the figure.
In the segment tree, the length of the target list is basically aligned to the power of 2 and stored. If you store the 7 characters "axbaaaa" as above, the right edge will be empty. This time, it is assumed that None is stored. At this time, when node 6 receives a value from nodes 7 and 8, 8 has no value, so the continuous character string on the right side is treated as 0. Therefore, noneCount is used.
--When node 6 receives a value from node 8, this interval length 1 --noneCount value 1 = 0 is included, so set rightLen
as 0 + 2 (rightLen of L).
--Consider updating node 2. Similarly, R contains length 2 --noneCount1 = 1, so set leftLen
as 0 + 2 (rightLen of L).
The implementation of this is as follows. For the subsequent "considered cases", noneCount (padding) processing is added to both the right and left ends.
def funcSegmentValue(self, lNode, rNode, parentNodeId):
lchar, lcnt, lconsLeft, lconsRight, lconsLen, lNoneNum= lNode
rchar, rcnt, rconsLeft, rconsRight, rconsLen, rNoneNum = rNode
# l,Total amount of characters in r
ncnt = lcnt + rcnt
#The right length after concatenation basically matches the left
nconsLeft = lconsLeft
nconsRight = rconsRight
#Even if the number of consecutive character strings at the left end of the right node is insufficient when adding up the right nodes
#If you meet the padding, add it to the number of consecutive strings at the right end of the left node
#print("magic = {0}".format(2 ** (self.depthTreeList - ((parentNodeId + 1).bit_length()))))
if lconsLeft == (2 ** (self.depthTreeList - ((parentNodeId + 1).bit_length())) - lNoneNum):
nconsLeft += rconsLeft
if rconsRight == (2 ** (self.depthTreeList - ((parentNodeId + 1).bit_length())) - rNoneNum):
nconsRight += lconsRight
nconsLen = max(nconsLeft, nconsRight, lconsLen, rconsLen, lconsRight + rconsLeft)
nNoneNum = lNoneNum + rNoneNum
res = [nchar, ncnt, nconsLeft, nconsRight, nconsLen, nNoneNum]
return res
Now consider the query $ [1,4) $ in the four-letter "aaab".
Since the parent node of "aa" is in charge of the interval [0,1], it queries the child node of the interval 0,1 (in this case, the leaf node). At this time, since the node of the leaf of interval 0 is not included in the range, it is answered that the range (1 in this case) is the response load (= None).
def querySub(self, a, b, nodeId, l, r):
"""
[a,b)To node node for the parent query of the interval[l, r)Delegate the search for
For intervals, the subscript of data is 0,1,2,3,When it is 4,
[0,3)Then 0,1,Returns the result of 2
Condition to be None: r <= a or b <= l
"""
if (r <= a or b <= l):
cannotAnswer = (r - l)
return [None, 0, 0, 0, 0, cannotAnswer]
if a <= l and r <= b:
res = self.dat[nodeId]
return res
resLeft = self.querySub(a, b, 2 * nodeId + 1, l, (l + r) // 2)
resRight = self.querySub(a, b, 2 * nodeId + 2, (l + r) // 2, r)
res = self.funcSegmentValue(resLeft, resRight, nodeId)
return res
#Of the target character
#Just a number,Number of consecutive to the right,Number of consecutive left,Consecutive number, (Assumed from the end)Number of consecutive None
#Count
# cons: consecutive
#data is
# char=Data included, cnt, consLeft, consRight, consLen,Number of noneNum
class segmentTreeCharCount():
dat = []
lenTreeList = -1
targetChar = None
depthTreeList = 0
lenPaddingEntry = 0
unitDefault = [None, 0, 0, 0, 0, 1]
def __init__(self):
pass
def load(self, l, tc):
self.targetChar = tc
# len(l)Get the square of 2 greater than the number
self.lenTreeList = 2 ** (len(l) - 1).bit_length() #2 for len 5^3 = 8
self.depthTreeList = (len(l) - 1).bit_length() #Number of steps of wood(0 origin)
#lenPaddingEntry is[1,2,3,4,5]If you give[1,2,3,4,5,None,None,None]Returned 3 because it was treated as
self.lenPaddingEntry = 2 ** (len(l) - 1).bit_length() - len(l) #How many entries were completed
self.dat = [self.unitDefault] * (self.lenTreeList * 2)
#Load value
for i in range(len(l)):
if l[i] == self.targetChar:
self.dat[self.lenTreeList - 1 + i] = [l[i], 1, 1, 1, 1, 0]
else:
self.dat[self.lenTreeList - 1 + i] = [l[i], 0, 0, 0, 0, 0]
self.build()
def funcSegmentValueById(self, nodeId):
l = self.dat[nodeId * 2 + 1]
r = self.dat[nodeId * 2 + 2]
return self.funcSegmentValue(l, r, nodeId)
#Do the calculation to write.
#In this case, the boundary is used in this calculation.,r position a,By inserting b, padding at the right end and the left end is performed.
def funcSegmentValue(self, lNode, rNode, parentNodeId):
#print("funcSegmentValue parentNode={0}".format(parentNodeId))
#print("L:")
lchar, lcnt, lconsLeft, lconsRight, lconsLen, lNoneNum= lNode
#print(lNode)
#print("R:")
rchar, rcnt, rconsLeft, rconsRight, rconsLen, rNoneNum = rNode
#print(rNode)
#Renamed here for convenience(It doesn't mean much)
if lchar is None or rchar is None:
nchar = None
elif rchar is not None:
nchar = rchar
elif lchar is not None:
nchar = lchar
# l,Total amount of characters in r
ncnt = lcnt + rcnt
#The right length after concatenation basically matches the left
nconsLeft = lconsLeft
nconsRight = rconsRight
"""
#print("searchdepth = {0}".format(self.depthTreeList - ((parentNodeId + 1).bit_length() - 1)))
if lcnt == 2 ** (self.depthTreeList - ((parentNodeId + 1).bit_length())):
#print("child!L!")
nconsLeft += rconsLeft
if rcnt == 2 ** (self.depthTreeList - ((parentNodeId + 1).bit_length())):
#print("child!R!")
nconsRight += lconsRight
"""
#In the case of root, even if the number of consecutive character strings at the left end of the right node is insufficient when adding up the right nodes
#If you meet the padding, add it to the number of consecutive strings at the right end of the left node
#print("magic = {0}".format(2 ** (self.depthTreeList - ((parentNodeId + 1).bit_length()))))
if lconsLeft == (2 ** (self.depthTreeList - ((parentNodeId + 1).bit_length())) - lNoneNum):
#print(" parentnodeid {2} l root special cur={0} add={1}".format(nconsRight, rconsLeft, parentNodeId))
nconsLeft += rconsLeft
if rconsRight == (2 ** (self.depthTreeList - ((parentNodeId + 1).bit_length())) - rNoneNum):
#print(" parentnodeid {2} r root special cur={0} add={1}".format(nconsRight, rconsLeft, parentNodeId))
#print(" nconsRight{0} += lconsLeft{1}".format(nconsRight, lconsLeft))
nconsRight += lconsRight
#print("update n={0}, max({1},{2},{3},{4},{5}".format(parentNodeId, nconsLeft, nconsRight, lconsLen, rconsLen, lconsRight + rconsLeft))
nconsLen = max(nconsLeft, nconsRight, lconsLen, rconsLen, lconsRight + rconsLeft)
nNoneNum = lNoneNum + rNoneNum
res = [nchar, ncnt, nconsLeft, nconsRight, nconsLen, nNoneNum]
#print("Return{0}".format(res))
return res
# char=Data included, cnt, consLeft, consRight, consLen
def build(self):
for nodeId in range(self.lenTreeList - 2, -1, -1):
#Important: This code will regenerate the list, so replace it with an assignment!
self.dat[nodeId] = self.funcSegmentValueById(nodeId)
def setValue(self, i, a):
"""
set a to list[i]
"""
#print("setValue: {0}, {1}".format(i, a))
nodeId = (self.lenTreeList - 1) + i
#print(" first nodeId: {0}".format(nodeId))
"""
self.dat[nodeId] = a
if a == self.targetChar:
self.dat[self.lenTreeList - 1 + i] = [a, 1, 1, 1, 1, 0]
else:
self.dat[self.lenTreeList - 1 + i] = [a, 0, 0, 0, 0, 0]
"""
#print("before")
#print(self.dat[nodeId])
self.dat[nodeId] = a
if a == self.targetChar:
self.dat[nodeId] = [a, 1, 1, 1, 1, 0]
else:
self.dat[nodeId] = [a, 0, 0, 0, 0, 0]
#print("after")
#print(self.dat[nodeId])
while nodeId != 0:
nodeId = (nodeId - 1) // 2
#print(" next nodeId: {0}".format(nodeId))
# sum : self.dat[nodeId] = self.dat[nodeId * 2 + 1] + self.dat[nodeId * 2 + 2]
self.dat[nodeId] = self.funcSegmentValueById(nodeId)
def querySub(self, a, b, nodeId, l, r):
"""
[a,b)To node node for the parent query of the interval[l, r)Delegate the search for
For intervals, the subscript of data is 0,1,2,3,When it is 4,
[0,3)Then 0,1,Returns the result of 2
Condition to be None: r <= a or b <= l
"""
#print("querySub: a={0}, b={1}, nodeId={2}, l={3}, r={4}".format(a, b, nodeId, l, r))
if (r <= a or b <= l):
cannotAnswer = (r - l)
#print(" > None") #This should return an unanswerable number
return [None, 0, 0, 0, 0, cannotAnswer]
if a <= l and r <= b:
#print(" > have: {0} [node = {1}]".format(self.dat[nodeId], nodeId))
#print(" > : a={0} <= l={1} and r{2} <= b{3}".format(a,l,r,b))
res = self.dat[nodeId]
return res
#print("querySubcalc: a={0}, b={1}, nodeId={2}, l={3}, r={4}".format(a, b, nodeId, l, r))
resLeft = self.querySub(a, b, 2 * nodeId + 1, l, (l + r) // 2)
resRight = self.querySub(a, b, 2 * nodeId + 2, (l + r) // 2, r)
#print("querySubend: a={0}, b={1}, nodeId={2}, l={3}, r={4}".format(a, b, nodeId, l, r))
#print(" > L")
#print(" node{0}: {1}".format(2 * nodeId + 1, resLeft))
#print(" > R")
#print(" node{0}: {1}".format(2 * nodeId + 2, resRight))
#print(resRight)
res = self.funcSegmentValue(resLeft, resRight, nodeId)
#print(" > res")
#print(res)
return res
def query(self, a, b):
return self.querySub(a, b, 0, 0, self.lenTreeList)
def debugGetSliceStr(self, a, b):
"""
Of the original string list[a:b]return it: str
"""
return "".join(list(map(lambda x: x[0], self.dat[self.lenTreeList - 1 + a:self.lenTreeList - 1 + b])))
from pprint import pprint
def test1(a,b):
pprint(st.query(a, b))
pprint(st.debugGetSliceStr(a, b))
l = list("xaazaaa")
l = list("xaaaaa0A")
l = list("abaaabcaaaaxasaaa")
st = segmentTreeCharCount()
st.load(l, "a")
st.build()
#st.setValue(2, "x")
#st.setValue(4, "x")
#print("-----")
#pprint(st.dat)
print("----------------------------")
test1(0,9)
st.setValue(1, "a")
test1(0,9)
st.setValue(0, "x")
st.setValue(1, "x")
st.setValue(2, "x")
st.setValue(3, "x")
st.setValue(8, "x")
test1(0,9)
Recommended Posts