Trigger

The giftee office where I work moved on May 8, 2017.

The new office has more types of shared spaces, such as sofa seats and family-restaurant-style seats. https://www.wantedly.com/companies/giftee/post_articles/64703

I wanted to understand which types of shared space are used and how much, so we decided to look into ways of measuring their usage.

Consideration

Motion sensors and pressure sensors were also candidates for measuring usage. However, a motion sensor apparently cannot report the number of people, and pressure sensors would require one sensor per seat. So I decided to photograph each shared space at regular intervals and count the people in each photo.

People-counting method

I decided to use a machine learning framework to get the number of people from the image.

This time I chose darknet, for two reasons:
・Pre-trained weights are publicly available
・It is easy to use

https://pjreddie.com/darknet/

darknet

Installation

Installation is very easy: just clone the repository from GitHub and run make.

git clone https://github.com/pjreddie/darknet
cd darknet
make

Download the pre-trained weights into this directory and you're ready to go.

wget https://pjreddie.com/media/files/yolo.weights

Image analysis

Pass detect as the first argument to darknet, followed by the config file, the weights file, and the target photo. Analyzing data/person.jpg, which is included in the source tree, looks like this:

./darknet detect cfg/yolo.cfg yolo.weights data/person.jpg

The result is written to the same directory under the name "predictions.png".

The original image

Image after analysis

How to get the number of people

The analysis result of darknet is also output to the standard output as follows.

data/person.jpg: Predicted in 14.067749 seconds.
person: 86%
horse: 82%
dog: 86%

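This time, I simply grepped the standard output for lines starting with person: and counted them to get the number of people in the image. (The pattern is anchored to the start of the line because the first line of the output contains the file name, and data/person.jpg would otherwise be counted as well.)

./darknet detect cfg/yolo.cfg yolo.weights data/person.jpg | grep '^person:' | wc -l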
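
The same counting can also be sketched in Python, which makes the matching rule explicit. This is only an illustration, assuming the standard output has the format shown above. The function name count_label is hypothetical and is not part of darknet.

```python
def count_label(darknet_stdout, label="person"):
    """Count detection lines of the form '<label>: <NN>%'.

    Only lines that start with the label are counted, so the
    'data/person.jpg: Predicted in ...' line is not mistaken for a person.
    """
    prefix = label + ":"
    return sum(
        1
        for line in darknet_stdout.splitlines()
        if line.strip().startswith(prefix)
    )


# Standard output quoted earlier in this article.
sample_output = """data/person.jpg: Predicted in 14.067749 seconds.
person: 86%
horse: 82%
dog: 86%
"""

print(count_label(sample_output))  # prints 1
```

A plain substring match on "person" would return 2 for this sample, because the file name in the first line also contains the word.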

Overall flow of the system

The system created this time is roughly divided into three phases.

Take a picture of the target area with an iPhone at regular intervals and upload it to S3
Run the analysis script from cron at regular intervals to get the number of people in each photo, and insert that number into DynamoDB
Visualize the DynamoDB data with re:dash

Phase 1

@koh518 photographs a shared space at regular intervals with an iPhone app (every 5 minutes this time). I didn't have a stand to hold the iPhone, so I propped it up in a mug.

The captured image is uploaded to S3. The S3 key has the form "shared space name/time.jpeg" (e.g. dining/20170620131500.jpeg).
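
As a small illustration of this key convention, the sketch below builds a key from a space name and a capture time, then splits it back apart. The helper names build_key and split_key are hypothetical. The iPhone app's actual implementation is not shown in this article.

```python
from datetime import datetime

# Timestamp layout used in the key, e.g. 20170620131500 (YYYYMMDDhhmmss).
KEY_TIME_FORMAT = "%Y%m%d%H%M%S"


def build_key(shared_space_name, taken_at):
    """Return an S3 key of the form '<shared space name>/<time>.jpeg'."""
    return "{}/{}.jpeg".format(
        shared_space_name, taken_at.strftime(KEY_TIME_FORMAT)
    )


def split_key(key):
    """Split a key back into (shared space name, timestamp string)."""
    shared_space_name, filename = key.split("/", 1)
    created_at, _extension = filename.rsplit(".", 1)
    return shared_space_name, created_at


key = build_key("dining", datetime(2017, 6, 20, 13, 15, 0))
print(key)             # dining/20170620131500.jpeg
print(split_key(key))  # ('dining', '20170620131500')
```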

Phase 2

The analysis script, started from cron, retrieves the images accumulated in S3 one by one and counts the people in each. However, darknet does not honor the EXIF orientation information in a JPEG, so each image is first converted to the correct orientation with ImageMagick's convert command.

convert iphone.jpg -auto-orient converted.png

Without this step, the image is analyzed sideways and accuracy drops considerably.

Then pass the converted image to darknet to get the number of people.

./darknet detect cfg/yolo.cfg yolo.weights ../converted.png | grep person | wc -l

The image was analyzed as follows.

The original image

After analysis

It is a little hard to see because the purple frames overlap, but the three people are recognized correctly.

This number is inserted into DynamoDB, and the analyzed image is moved to another S3 bucket. The output image produced by the analysis is also uploaded to S3 for later verification.

Phase 2 is implemented as a Python script. The source is below.

import os
import subprocess

import boto3

# Bucket and table names are supplied through environment variables.
BUCKET_NAME = os.environ["BUCKET_NAME"]
BUCKET_NAME_DONE = os.environ["BUCKET_NAME_DONE"]
DYNAMODB_REGION = os.environ["DYNAMODB_REGION"]
DYNAMODB_TABLE = os.environ["DYNAMODB_TABLE"]

s3 = boto3.resource('s3')
bucket = s3.Bucket(BUCKET_NAME)
client = boto3.client('s3')

# Create the DynamoDB table handle once instead of once per image.
dynamodb = boto3.resource('dynamodb', region_name=DYNAMODB_REGION)
table = dynamodb.Table(DYNAMODB_TABLE)

for obj in bucket.objects.all():
    key = obj.key
    # Keys have the form "shared space name/time.jpeg".
    shared_space_name, filename = key.split('/')

    # The "prefix/" placeholder object is also listed, so skip it.
    if not filename:
        continue

    created_at, extension = filename.split('.')

    # Download the photo.
    bucket.download_file(key, 'iphone.jpg')

    # Apply the EXIF orientation so that darknet sees an upright image.
    subprocess.run(
        ["convert", "iphone.jpg", "-auto-orient", "converted.png"],
        check=True)

    # Run YOLO from the darknet directory and count the "person" lines.
    command = "./darknet detect cfg/yolo.cfg yolo.weights ../converted.png | grep person | wc -l"
    proc = subprocess.run(
        command,
        shell=True,
        cwd="darknet",
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE)

    value = int(proc.stdout.decode('ascii'))

    # Insert the count into DynamoDB.
    table.put_item(
        Item={
            'shared_space_name': shared_space_name,
            'created_at': created_at,
            'value': value
        }
    )

    # Copy the original photo to the "done" bucket.
    copy_source = {
        'Bucket': BUCKET_NAME,
        'Key': key
    }
    s3.meta.client.copy(copy_source, BUCKET_NAME_DONE, key)

    # Delete the original from the source bucket.
    obj.delete()

    # Upload the annotated image for later verification.
    yolo_key = shared_space_name + '/' + created_at + '_yolo.png'
    client.upload_file('darknet/predictions.png', BUCKET_NAME_DONE, yolo_key)

Phase 3

re:dash is used to visualize the DynamoDB data.

The timestamp is stored as a String in YYYYMMDDhhmmss format. Fetched as-is, it is not treated as a time, so I applied the TIMESTAMP function to the string to display it as a proper time.

Below is the DQL query used to fetch the data.

SCAN shared_space_name,TIMESTAMP(created_at),value FROM tbl_name
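
For reference, the conversion from the stored string to a real time value can be sketched in Python as follows. This only illustrates the YYYYMMDDhhmmss format. It is not how the TIMESTAMP function is implemented, and parse_created_at is a hypothetical helper name.

```python
from datetime import datetime


def parse_created_at(created_at):
    """Parse a 'YYYYMMDDhhmmss' string into a datetime (no time zone attached)."""
    return datetime.strptime(created_at, "%Y%m%d%H%M%S")


print(parse_created_at("20170620131500").isoformat())  # 2017-06-20T13:15:00
```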

The result is a graph like the one below.

Both re:dash and the analysis script run as containers under Docker on a single EC2 instance.

Open issues

Threshold

In rare cases, an image showing only one person was counted as two people. darknet lets you set a confidence threshold with its threshold option (-thresh), so detections with a low probability of being a person can be excluded from the count. This time I ran it with the default of 25%, but tuning this value may improve accuracy.

People in the frame who are not using the space
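
The effect of a stricter threshold can also be tried offline on saved output, without re-running darknet. The sketch below is one assumed way to do that: it parses the "label: NN%" lines and counts only those at or above a minimum percentage. The name count_confident and the sample output are made up for illustration. Because darknet has already dropped anything under the threshold it ran with, this can only make the filter stricter, never looser.

```python
import re

# Matches detection lines such as "person: 86%".
DETECTION_LINE = re.compile(r"^(?P<label>[^:]+):\s*(?P<percent>\d+)%\s*$")


def count_confident(darknet_stdout, label="person", min_percent=25):
    """Count detections of `label` whose confidence is >= min_percent."""
    count = 0
    for line in darknet_stdout.splitlines():
        match = DETECTION_LINE.match(line.strip())
        if (
            match
            and match.group("label") == label
            and int(match.group("percent")) >= min_percent
        ):
            count += 1
    return count


# Hypothetical output for illustration only, not a real measurement.
sample_output = """converted.png: Predicted in 14.067749 seconds.
person: 86%
person: 31%
"""

print(count_confident(sample_output, min_percent=25))  # prints 2
print(count_confident(sample_output, min_percent=50))  # prints 1
```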

If someone who just happens to be walking past the shared space appears in the photo, the measured count will be higher than the actual number of users. If you don't need that much accuracy, you can ignore this. Otherwise, taking several shots a few tens of seconds apart for each measurement and using the minimum count across them should, I think, filter out passers-by. (I haven't tried this yet.)
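
A minimal sketch of that idea, assuming the per-shot counts for one measurement have already been obtained. Like the idea itself, this is untested against real photos, and stable_count is a hypothetical name.

```python
def stable_count(counts_in_burst):
    """Return the minimum count over several shots of one measurement.

    Someone walking past is unlikely to appear in every shot, so the
    minimum approximates the number of people actually using the space.
    """
    if not counts_in_burst:
        raise ValueError("at least one shot is required")
    return min(counts_in_burst)


# Three shots of the same space; a passer-by appears only in the second.
print(stable_count([3, 4, 3]))  # prints 3
```

One trade-off of taking the minimum is that a seated person who is missed by the detector in a single shot also lowers the result.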

Finally

Giftee Co., Ltd. is looking for engineers. If you are interested in this system or would like to visit the new office, feel free to contact us here.

Analysis of shared space usage by machine learning
