A while ago I wrote an article along the lines of "let's recognize red objects with Python". The method there was simple: convert the image to HSV (Hue, Saturation, Value) and find the regions with a strong red component.
This time, as an application, let's target a "green" object, and talk about "tracking" it in a continuous video stream. Looked at closely, a video stream is just a sequence of still images, so it is not wrong to think that if you keep recognizing the green object in each still image, tracking follows naturally. A camera, however, is not a human eye. If the angle of the target shifts a little, it may wash out to white depending on the lighting and fall outside the detection criteria (the judgment fails and you lose sight of it), the detected point may jump around, or a small change in ambient light may make you lose the object entirely. What happens when bad conditions happen to coincide and the object is lost? The computer decides that the object is "absent". The next moment the conditions improve and it reappears. It flickers in and out with no sense of stability, and 100% per-frame detection accuracy simply cannot be expected.
So we need to stop treating the video stream as a series of still images to be analyzed from scratch every time, and instead exploit its continuity. In other words, we should predict and search using a bit of probability and statistics, on the reasoning that "it was here last time, so it should be around here now".
There is a good tool for exactly this kind of problem: the "particle filter". I don't know the theory well enough to explain the formal definition, so I'll leave that to the many other commentary pages. Conceptually, though, image analysis with a particle filter boils down to the following procedures:
1. Scatter a large number of particles around the detected target (initialization).
2. Eliminate particles with poor scores and replace them with copies of good ones (resampling).
3. Move every particle by a small random amount (prediction).
4. Evaluate how well each particle's position matches the target (likelihood weighting).
5. Take the weighted average of the particle positions as the estimated position (measurement).
If you keep repeating these procedures 1 to 5, tracking as described above follows naturally. The state of "it was around here last time" is remembered by the scattered particles, and each particle keeps guessing "is it over here next?" and moving randomly. Particles whose predictions keep missing are eliminated; particles that desperately predict the next frame well survive. It feels unexpectedly cruel, and you may even grow attached to the poor particles (just kidding).
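Before touching any images, here is a minimal, self-contained toy sketch of those five steps in one dimension: the "target" is just a number that drifts each frame, and the particles try to follow it. Everything in it (the particle count, the noise levels, the Gaussian scoring) is made up purely for illustration and is not the image tracker built below.

import numpy as np

np.random.seed(0)
target = 50.0
particles = target + np.random.randn(200) * 5.0    # 1. scatter particles around the target
weights = np.ones(200)

for frame in range(20):
    target += 2.0                                   # the target drifts a little every frame
    # 2. resample: draw particles in proportion to their previous weights
    idx = np.searchsorted(weights.cumsum(), np.random.rand(200) * weights.sum())
    particles = particles[idx]
    # 3. predict: move every particle by a random amount
    particles += np.random.randn(200) * 3.0
    # 4. weight: particles close to the observed target score high
    weights = np.exp(-0.5 * ((particles - target) / 3.0) ** 2) + 1e-6
    # 5. measure: the weighted mean of the particles is the estimate
    estimate = (particles * weights).sum() / weights.sum()

print("target:", target, "estimate:", round(estimate, 1))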
Describing this series of processes in C/C++ is fairly painful, but thanks mainly to the power of numpy it takes surprisingly little code, which is the great appeal of Python + numpy.
First, the imports. The code below assumes the following two imports.
import cv2
import numpy as np
First, define the structure of a particle. Let's keep it simple: x and y are the particle's position, and weight is its likelihood (weight).
particle = [x, y, weight]
Next, create a function that calculates a particle's likelihood (weight). It scans a 30x30-pixel window around the specified coordinates and returns the fraction of those 900 pixels that satisfy the criterion: 0.0 if none of them do, 1.0 if all of them do, roughly 0.5 if about half do. The decision function func() is factored out so that the caller can supply it.
def likelihood(x, y, func, image, w=30, h=30):
    # Clip the w x h window around (x, y) to the image boundaries.
    x1 = int(max(0, x - w / 2))
    y1 = int(max(0, y - h / 2))
    x2 = int(min(image.shape[1], x + w / 2))
    y2 = int(min(image.shape[0], y + h / 2))
    region = image[y1:y2, x1:x2]
    # Count how many pixels in the window satisfy the decision function.
    count = region[func(region)].size
    # Ratio of matching pixels in the window; never return exactly 0 so that
    # every particle keeps a tiny chance of being resampled.
    return (float(count) / (w * h)) if count > 0 else 0.0001
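As a quick sanity check (the image and decision function here are made up purely for illustration), a 100x100 image whose left half passes the test should give a likelihood of 1.0 for a window inside that half, and the floor value 0.0001 for a window in the other half:

img = np.zeros((100, 100), dtype=np.uint8)
img[:, :50] = 200                                              # bright left half
print(likelihood(25, 50, lambda region: region > 128, img))    # window fully in the bright half -> 1.0
print(likelihood(75, 50, lambda region: region > 128, img))    # window fully in the dark half  -> 0.0001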
Next is the particle initialization function. As before, the func parameter specifies the externally supplied decision function. This function creates a large number (500) of particles and places them at the centre of the largest region accepted by the func() function. In the call to np.ndarray(), the 500 in (500, 3) is the number of particles, and the 3 is the number of elements of a particle, i.e. x, y and weight.
def init_particles(func, image):
    # Keep only the pixels accepted by the decision function.
    mask = image.copy()
    mask[~func(mask)] = 0
    # Find the largest accepted region and take the centre of its bounding box.
    contours, _ = cv2.findContours(mask, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
    if len(contours) <= 0:
        return None
    max_contour = max(contours, key=cv2.contourArea)
    max_rect = np.array(cv2.boundingRect(max_contour))
    max_rect = max_rect[:2] + max_rect[2:] / 2
    weight = likelihood(max_rect[0], max_rect[1], func, image)
    # All 500 particles start at the same position with the same weight.
    particles = np.ndarray((500, 3), dtype=np.float32)
    particles[:] = [max_rect[0], max_rect[1], weight]
    return particles
Here comes the real movement of the particle filter, which repeats the following four procedures:
1. Resampling: eliminate particles with poor weights and replace them with copies of good ones.
2. Prediction: move every particle by a random amount.
3. Weighting: compute each particle's likelihood against the new frame.
4. Measurement: take the weighted average of the particle positions as the result.
In resampling, we use random numbers to remove particles with poor weights and replace them with copies of particles with good weights. The cumsum() method used to build the weights array computes the cumulative sum, and (weights > weight).argmax() scans the weights array and returns the index of the first element that is larger than weight.
def resample(particles):
    tmp_particles = particles.copy()
    # Cumulative sum of the weights; the last element is the total weight.
    weights = particles[:, 2].cumsum()
    last_weight = weights[-1]
    for i in range(particles.shape[0]):
        # Draw a random value in [0, total weight) and pick the first particle
        # whose cumulative weight exceeds it (weighted sampling with replacement).
        weight = np.random.rand() * last_weight
        particles[i] = tmp_particles[(weights > weight).argmax()]
        particles[i][2] = 1.0
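To see how cumsum() and argmax() implement this weighted drawing, here is a small standalone example with made-up weights: particle 2 owns 60% of the cumulative range, so it is picked whenever the drawn value lands in that range.

import numpy as np

weights = np.array([1.0, 3.0, 6.0]).cumsum()    # -> [1.0, 4.0, 10.0]
for drawn in (0.5, 2.0, 7.0):
    # (weights > drawn) is a boolean array; argmax() returns the index of the
    # first True, i.e. the particle whose cumulative range contains the draw.
    print(drawn, (weights > drawn).argmax())    # -> 0, 1, 2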
Now actually move the particles toward the next frame. Each coordinate gets the result of numpy.random.randn() multiplied by the coefficient specified by variance. This variance coefficient should be set according to how vigorously the target moves. This random movement of the particles is the "prediction": particles that happen to move in the right direction survive, and particles that happen to move in the wrong direction are destined to be eliminated (overwritten) later.
def predict(particles, variance=13.0):
    # Jitter every particle with Gaussian noise scaled by the variance coefficient.
    particles[:, 0] += np.random.randn(particles.shape[0]) * variance
    particles[:, 1] += np.random.randn(particles.shape[0]) * variance
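Since variance should match how vigorously the target moves, you simply pass a different value when calling it; the 25.0 below is an arbitrary number used only as an illustration for a faster-moving target.

predict(particles, variance=25.0)   # hypothetical: spread the particles wider for a fast target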
Next, determine the likelihood (weight) of each particle; this is the material for judging how good the previous prediction was. The likelihood is calculated by calling the likelihood() function created earlier, and the weights are then normalized.
def weight(particles, func, image):
    # Re-evaluate every particle against the current frame.
    for i in range(particles.shape[0]):
        particles[i][2] = likelihood(particles[i][0], particles[i][1], func, image)
    # Normalize so that the weights sum to the number of particles.
    sum_weight = particles[:, 2].sum()
    particles[:, 2] *= (particles.shape[0] / sum_weight)
Then measure the particles to find where the good ones are concentrated, i.e. the weighted average of their positions.
def measure(particles):
    # Weighted average of the particle positions, weighted by likelihood.
    x = (particles[:, 0] * particles[:, 2]).sum()
    y = (particles[:, 1] * particles[:, 2]).sum()
    weight = particles[:, 2].sum()
    return x / weight, y / weight
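As a quick check with made-up numbers: two particles at (10, 10) and (20, 20) with weights 3 and 1 should give a weighted centre of (12.5, 12.5).

demo = np.array([[10.0, 10.0, 3.0],
                 [20.0, 20.0, 1.0]], dtype=np.float32)
print(measure(demo))   # -> (12.5, 12.5)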
Finally, the processing implemented so far is wrapped up in a utility function. If no pixel of the target color is found for more than max_frame consecutive frames (as specified by the argument), the particles are thrown away and will be reinitialized the next time the color appears.
particle_filter_cur_frame = 0

def particle_filter(particles, func, image, max_frame=10):
    global particle_filter_cur_frame
    if image[func(image)].size <= 0:
        # No matching pixel in this frame; give up after max_frame misses in a row.
        if particle_filter_cur_frame >= max_frame:
            return None, -1, -1
        particle_filter_cur_frame = min(particle_filter_cur_frame + 1, max_frame)
    else:
        particle_filter_cur_frame = 0
        if particles is None:
            particles = init_particles(func, image)

    if particles is None:
        return None, -1, -1

    # One full particle-filter step: resample, predict, weight, measure.
    resample(particles)
    predict(particles)
    weight(particles, func, image)
    x, y = measure(particles)
    return particles, x, y
Now let's build a program that tracks a green object using the particle filter implemented above. The frame data (a BGR image) captured with cv2.VideoCapture() is converted to HSV, OTSU thresholding is applied to the S (Saturation) and V (Value) channels, and only pixels with sufficient color depth and brightness are kept: the H channel is masked with the pixels that pass both the S and V tests, and everything else is filled with 0. Since we want to find the green hue range (50-85), filling with 0 is reasonable. (When detecting red, on the other hand, it would go wrong unless you fill with some value other than 0, because red sits right next to 0 on the hue circle.) What is actually passed to the particle_filter() function is the H channel filtered in this way. From there, the particle_filter() function manages the particles and tracks the green region nicely.
import cv2
import numpy as np

if __name__ == "__main__":
    def is_green(region):
        # Hue range for green (H channel after COLOR_BGR2HSV_FULL).
        return (region >= 50) & (region < 85)

    cap = cv2.VideoCapture(0)
    particles = None

    while cv2.waitKey(30) < 0:
        _, frame = cap.read()
        # Keep only the hue of pixels with enough saturation and brightness.
        frame_hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV_FULL)
        frame_h = frame_hsv[:, :, 0]
        _, frame_s = cv2.threshold(frame_hsv[:, :, 1], 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
        _, frame_v = cv2.threshold(frame_hsv[:, :, 2], 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
        frame_h[(frame_s == 0) | (frame_v == 0)] = 0

        particles, x, y = particle_filter(particles, is_green, frame_h)

        if particles is not None:
            # Draw the particles that lie inside the frame, then the estimated position.
            valid_particles = particles[(particles[:, 0] >= 0) & (particles[:, 0] < frame.shape[1]) &
                                        (particles[:, 1] >= 0) & (particles[:, 1] < frame.shape[0])]
            for i in range(valid_particles.shape[0]):
                frame[int(valid_particles[i][1]), int(valid_particles[i][0])] = [255, 0, 0]
            p = np.array([x, y], dtype=np.int32)
            cv2.rectangle(frame, tuple(p - 15), tuple(p + 15), (0, 0, 255), thickness=2)

        cv2.imshow('green', frame)

    cap.release()
    cv2.destroyAllWindows()
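As an aside on the earlier remark about red: red sits at both ends of the hue circle, so a decision function for it has to wrap around 0, and the masked-out pixels would then need to be filled with some neutral value (for example 128) instead of 0. A hypothetical is_red might look like the sketch below; the exact hue boundaries are illustrative only.

def is_red(region):
    # Red wraps around hue 0, so test both ends of the range.
    # Note that 0 itself would count as red, which is why masked pixels
    # must be filled with something other than 0 in the red case.
    return (region >= 240) | (region <= 10)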
This particle filter is extremely resistant to noise and movement, and will chase an object quite persistently. The detected position it reports is also very stable.
Once you have a particle filter implemented, the key is how you implement the likelihood() function: the life and death of the particles depends on it.
Watch the particles move desperately to survive. Gradually, the desperation of each particle, ruthlessly graded by likelihood(), becomes oddly endearing. The filter follows the target accurately, but in its shadow countless particles are born and die, and there is a certain harsh sorrow in that.