Given a list of IDs sorted by certain conditions, there was a requirement to somehow match in a close range. Extracting an arbitrary element from an array is O (n), but O (1) is limited to elements in a fixed range from the end or end, so if you extract elements while matching from the back, O ( You should be able to match with n).
It's not difficult at all, and if you write it in Python, it will be like this. It's nice to allow list.pop ()
to index from the end (-1 for the last element).
matching.py
# coding: utf-8
def match(seq, r=100):
from random import randint
#If you don't want the elements around the beginning to be bocce when there are an odd number, go to the back first
#Remove the element and keep it even.
while len(seq) >= 2:
#If the argument is omitted, the last element is extracted..
a = seq.pop()
#Randomly select and retrieve from the end of the array within r.
b = seq.pop(-randint(1, min(r, len(seq))))
yield a, b
def test(n, r):
for a, b in match(range(n), r):
#print a, b
assert a - b < 20
import sys
test(int(sys.argv[1]), 10)
As soon as I heard the requirements, I came up with this algorithm and promised it was "easy and easy", but it wasn't easy at all in php. I can remove the element of the specified key from the associative array, but I can't find it no matter how much I search for a function that retrieves the element in the middle (fills the end) from the array.
After reading the array chapter of the php reference for tens of minutes, there was no equivalent to list.pop ()
. If you cut out the array once and concatenate it, you can do something similar, but since the concatenation cost of the array is O (n), the matching algorithm becomes O (n ^ 2). In the end, I had to write a loop that shifts the elements behind the elements I took out one by one.
In php, arrays and associative arrays are messed up, so array-type functions are difficult to understand how to handle keys. You may not be aware that it's too confusing and php developers don't provide such basic processing.
As you pointed out in the comments, we were able to achieve the expected behavior with ʻarray_splice ()
. It also supports indexes from the end with negative numbers. If you rewrite the above code in php, it will be as follows.
However, the execution time does not seem to be O (n) after all orz As an algorithmic sword, I claim the right not to use php as a basic human right.
matching.php
<?php
function match($seq, $r=100) {
$res = array();
while (count($seq) >= 2) {
$a = array_pop($seq);
$i = mt_rand(1, min($r, count($seq)));
$b = $seq[count($seq)-$i];
array_splice($seq, -$i, 1);
$res[$a] = $b;
}
return $res;
}
function test($n, $r) {
foreach (match(range(0, $n-1), $r) as $a => $b) {
//printf("%d %d\n", $a, $b);
if ($a - $b >= $r*2) {
error_log("XXX");
}
}
}
test($argv[1], 10);
#Execution result
$ time python matching.py 10000
real 0m0.038s
user 0m0.031s
sys 0m0.006s
$ time python matching.py 20000
real 0m0.043s
user 0m0.035s
sys 0m0.007s
$ time php matching.php 10000
real 0m1.821s
user 0m1.806s
sys 0m0.012s
$ time php matching.php 20000
real 0m11.842s
user 0m11.797s
sys 0m0.041s
I thought I could implement it myself, but that was also difficult. I'm afraid, but I feel that CoW occurred when counting () for the reference, so I think that's the reason.
You can't even handle arrays yourself in php to avoid crappy built-in functions, so if you want to work with arrays, create a php extension and write it in C. php is a template language, and it is safe to write programs in C language.
matching2.php
<?php
function list_pop(&$seq, $index) {
$n = count($seq);
if ($index < 0)
$index += $n;
$value = $seq[$index];
for ($i = $index; $i < $n-1; ++$i) {
$seq[$i] = $seq[$i+1];
}
array_pop($seq);
return $value;
}
function match($seq, $r=100) {
$res = array();
while (count($seq) >= 2) {
$a = array_pop($seq);
$i = mt_rand(1, min($r, count($seq)));
$b = list_pop($seq, -$i);
$res[$a] = $b;
}
return $res;
}
function test($n, $r) {
foreach (match(range(0, $n-1), $r) as $a => $b) {
printf("%d %d\n", $a, $b);
if ($a - $b >= $r*2) {
error_log("XXX");
}
}
}
test($argv[1], 10);
#Execution result
$ time php matching2.php 10000
real 0m1.631s
user 0m1.607s
sys 0m0.020s
$ time php matching2.php 20000
real 0m9.160s
user 0m9.105s
sys 0m0.052s
After writing the process that was set to list_pop ()
in Addendum 2, the performance finally came out as expected. Don't manipulate array references in php. That is, the algorithm that manipulates the array should not be a function.
matching3.php
<?php
function match($seq, $r=100) {
$res = array();
while (count($seq) >= 2) {
$a = array_pop($seq);
$n = count($seq);
$i = $n - mt_rand(1, min($r, $n));
$b = $seq[$i];
for (; $i < $n-1; ++$i) {
$seq[$i] = $seq[$i+1];
}
array_pop($seq);
$res[$a] = $b;
}
return $res;
}
function test($n, $r) {
foreach (match(range(0, $n-1), $r) as $a => $b) {
//printf("%d %d\n", $a, $b);
if ($a - $b >= $r*2) {
error_log("XXX");
}
}
}
#Execution result
$ time php matching2.php 10000
real 0m0.064s
user 0m0.052s
sys 0m0.011s
$ time php matching2.php 20000
real 0m0.076s
user 0m0.064s
sys 0m0.011s
Recommended Posts