Nice to meet you, I'm Pong from China, a new engineer at Nomura Research Institute. My Japanese is still not very good, so please forgive any strange phrasing. Thank you very much.
Travel photos from the corona period are full of white masks, and you can't stand it anymore, right? As part of my new-employee development training, I took this on as a training project: to solve it, I developed an application that converts the white masks in photos into monster masks. Based on the concept of "reducing the amount of work as much as possible", it was built as a serverless LINE photo processing application using various AWS services. **For those who are tired of masked photos** and **for those who are interested in serverless**, please enjoy this development report.
During the summer vacation I traveled with my girlfriend to Choshi in Chiba. We played in the sea, climbed the lighthouse, and took lots of commemorative photos. Unfortunately, the main character in the photos was neither a person nor the scenery but the white masks. In photos from the corona era, white masks have the highest appearance rate and show up everywhere. When she saw the photos, she complained, "I don't want to see these white masks anymore." So, what if the white masks in the photos were converted into something else?
She and I both love superhero movies, and I especially love the monster masks in them (for example, Bane's mask from Batman). Wouldn't it be nice if the white masks became monster masks?
※ This work is a derivative of "[Bane](https://www.flickr.com/photos/istolethetv/30216006787/)" by [istolethetv](https://www.flickr.com/people/istolethetv/), used under [CC BY 2.0](https://creativecommons.org/licenses/by/2.0/).

That is how I came up with the idea and decided to develop this photo processing app. But there were three problems in front of me.
First of all, there are various options for the application's form. A web app served as a web page? An iOS or Android app for smartphones only? You have to design the front-end interface as well as the back-end processing. After weighing the options, I concluded that a LINE app is the most suitable. There are three reasons:
Therefore, **I decided to build the application as a LINE app!**
The next challenge is where to set up the server. Build it on a physical machine such as a Raspberry Pi? Use a cloud server such as AWS EC2? Moreover, a server needs not only initial construction but also ongoing maintenance. As a lazy person whose motto is "reduce the amount of work as much as possible", I didn't want to do any of that... Then why not develop without a server at all? When I investigated, I found that **with AWS API Gateway and Lambda I could go serverless, with no server construction or maintenance**! Alright, I choose you!
Finally, we need a face recognition AI to process the face photos. Questions immediately came up, such as "What AI model architecture should I use?", "Where do I get the training data?", and "How should the data be labeled?" I thought, "I wish there were a ready-to-use face recognition AI," looked it up on AWS, and there really was one! AWS has an image and video analysis service called Rekognition (note: not "Recognition"). **You don't need to build any AI; just call Rekognition and it recognizes and analyzes the faces in the photo.** With this, "reducing the amount of work as much as possible" is achievable.
With this in mind, we decided to develop a serverless LINE photo processing app on AWS!
Now that the application's form is decided, let's build the system! The overall picture of the system created this time is as follows:
Here, communication with the user is assumed to be via smartphone (the PC version of LINE also works). The front end is the LINE Bot, and the entire back end is processed in the AWS cloud. To achieve serverless processing, the work is executed by three Lambdas: "controller", "face recognition", and "new image generation". Following the processing flow, the system can be divided into five parts, as shown in the figure below:
Below, I explain these five parts in the order of the processing flow.
The first part is the input part. Its function is, literally, to load the image that the user sent to the LINE Bot. The entities involved in this part are the LINE Bot, API Gateway, and the controller Lambda. The processing flow is as follows:
First, the user sends a photo to the LINE Bot. The LINE Bot wraps the image in a line_event and sends it to API Gateway, which forwards the event to the controller Lambda unchanged.
To build this part, first create a LINE Bot (Messaging API channel) as the front door. See the LINE official documentation for how: Get Started with Messaging API. After creating the channel, two settings are still required. The first is to issue a "channel access token" for authentication with Lambda. The second is to turn off the Messaging API's auto-response feature and turn on the webhook feature. Don't enter the webhook URL yet; do that after configuring API Gateway.
Next, create an IAM role for running the Lambda functions. Open the IAM service from the dashboard and create a new role. Name it something like serverless-linebot and choose Lambda as the service that will use it. Attach the policies "AmazonS3FullAccess", "AmazonRekognitionFullAccess", and "CloudWatchLogsFullAccess". Also, since the controller Lambda calls the other two Lambdas, add the following policy:
```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "lambda:InvokeFunction",
                "lambda:InvokeAsync"
            ],
            "Resource": [
                "Face recognition Lambda arn",
                "New image generation Lambda arn"
            ]
        }
    ]
}
```
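For reference, if you prefer the AWS CLI to the console, an inline policy like the one above can be attached roughly as follows (a sketch: the role name serverless-linebot, the policy name, and the file invoke_policy.json are just example names):

```bash
# Attach the invoke policy above as an inline policy of the Lambda execution role
aws iam put-role-policy \
    --role-name serverless-linebot \
    --policy-name invoke-other-lambdas \
    --policy-document file://invoke_policy.json
```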
"Face recognition Lambda arn" and "New image generation Lambda arn" are not yet available, so don't forget to rewrite them after creating the Lambda function. All this processing is executed in this role.
Since API Gateway is the "connector", the LINE Bot and controller Lambda at both ends must exist before it can be configured, so next we create the controller Lambda function. Python is used for the functions this time, so select Python 3.x (3.6–3.8) as the runtime. The execution role is the IAM role just created.
After creating it, first set the memory to 512 MB and the timeout to 1 minute in "Basic settings". Then set the following environment variables:
Key | Value
---|---
LINE_CHANNEL_ACCESS_TOKEN | LINE Bot channel access token
LINE_CHANNEL_SECRET | LINE Bot channel secret
As for the function's contents: the controller Lambda communicates with the LINE Bot, so the "line-bot-sdk" package is required. To bundle it with the Lambda function, first install line-bot-sdk locally into a new folder using the following command:
```bash
python -m pip install line-bot-sdk -t <new_folder>
```
After that, create a file named lambda_function.py in the same folder (Lambda looks for this exact name as the entry point, so be sure to use it) and enter the following code:
lambda_function_for_controller.py
```python
import os
import sys
import logging
import boto3
import json

from linebot import LineBotApi, WebhookHandler
from linebot.models import MessageEvent, TextMessage, TextSendMessage, ImageMessage, ImageSendMessage
from linebot.exceptions import LineBotApiError, InvalidSignatureError

logger = logging.getLogger()
logger.setLevel(logging.ERROR)

# Read the LINE Bot channel access token and secret from the environment variables
channel_secret = os.getenv('LINE_CHANNEL_SECRET', None)
channel_access_token = os.getenv('LINE_CHANNEL_ACCESS_TOKEN', None)
if channel_secret is None:
    logger.error('Specify LINE_CHANNEL_SECRET as environment variable.')
    sys.exit(1)
if channel_access_token is None:
    logger.error('Specify LINE_CHANNEL_ACCESS_TOKEN as environment variable.')
    sys.exit(1)

# Generate the API client & webhook handler
line_bot_api = LineBotApi(channel_access_token)
handler = WebhookHandler(channel_secret)

# Connect to the S3 bucket
s3 = boto3.client("s3")
bucket = "<S3 bucket name>"


# Lambda main function
def lambda_handler(event, context):
    # X-Line-Signature header for authentication
    signature = event["headers"]["X-Line-Signature"]
    body = event["body"]

    # Return values
    ok_json = {"isBase64Encoded": False,
               "statusCode": 200,
               "headers": {},
               "body": ""}
    error_json = {"isBase64Encoded": False,
                  "statusCode": 403,
                  "headers": {},
                  "body": "Error"}

    @handler.add(MessageEvent, message=ImageMessage)
    def message(line_event):
        # User profile
        profile = line_bot_api.get_profile(line_event.source.user_id)
        # Extract the sender's user ID (needed for push_message; not necessary for reply)
        # user_id = profile.user_id
        # Extract the message ID
        message_id = line_event.message.id
        # Extract the image file
        message_content = line_bot_api.get_message_content(message_id)
        content = bytes()
        for chunk in message_content.iter_content():
            content += chunk
        # Save the image file
        key = "origin_photo/" + message_id
        new_key = message_id[-3:]
        s3.put_object(Bucket=bucket, Key=key, Body=content)
        # Call the face recognition Lambda
        lambdaRekognitionName = "<Here is arn of face recognition lambda>"
        params = {"Bucket": bucket, "Key": key}  # Image file path information
        payload = json.dumps(params)
        response = boto3.client("lambda").invoke(
            FunctionName=lambdaRekognitionName, InvocationType="RequestResponse", Payload=payload)
        response = json.load(response["Payload"])
        # Call the new image generation Lambda
        lambdaNewMaskName = "<Here is arn of new image generation lambda>"
        params = {"landmarks": str(response),
                  "bucket": bucket,
                  "photo_key": key,
                  "new_photo_key": new_key}
        payload = json.dumps(params)
        boto3.client("lambda").invoke(FunctionName=lambdaNewMaskName,
                                      InvocationType="RequestResponse", Payload=payload)
        # Generate a signed URL for the new image
        presigned_url = s3.generate_presigned_url(ClientMethod="get_object", Params={
            "Bucket": bucket, "Key": new_key}, ExpiresIn=600)
        # Reply with the new image message
        line_bot_api.reply_message(line_event.reply_token, ImageSendMessage(
            original_content_url=presigned_url, preview_image_url=presigned_url))

    try:
        handler.handle(body, signature)
    except LineBotApiError as e:
        logger.error("Got exception from LINE Messaging API: %s\n" % e.message)
        for m in e.error.details:
            logger.error("  %s: %s" % (m.property, m.message))
        return error_json
    except InvalidSignatureError:
        return error_json
    return ok_json
```
The above is the entire controller Lambda function; it is involved in all five parts. The portion relevant to this first part is:
lambda_function_for_controller.py
```python
# Read the LINE Bot channel access token and secret from the environment variables
channel_secret = os.getenv('LINE_CHANNEL_SECRET', None)
channel_access_token = os.getenv('LINE_CHANNEL_ACCESS_TOKEN', None)
if channel_secret is None:
    logger.error('Specify LINE_CHANNEL_SECRET as environment variable.')
    sys.exit(1)
if channel_access_token is None:
    logger.error('Specify LINE_CHANNEL_ACCESS_TOKEN as environment variable.')
    sys.exit(1)

# Generate the API client & webhook handler
line_bot_api = LineBotApi(channel_access_token)
handler = WebhookHandler(channel_secret)
```
lambda_function_for_controller.py
```python
    # X-Line-Signature header for authentication
    signature = event["headers"]["X-Line-Signature"]
    body = event["body"]
```
This authenticates the LINE Bot and receives the event details. After that, zip the contents of that folder and upload it via "Function code" -> "Actions" -> "Upload a .zip file" in Lambda.
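For reference, one common way to zip the *contents* of the folder (not the folder itself) on macOS/Linux looks like this (the folder and zip file names are just examples):

```bash
cd <new_folder>
# Zip everything in the folder so that lambda_function.py sits at the zip root
zip -r ../controller_function.zip .
```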
The last step is creating the API Gateway that connects them. The API type used here is REST API. After creating the API, create a resource and a method: the method is POST, the integration type is Lambda function, and "Use Lambda proxy integration" is enabled. Select the controller Lambda as the Lambda function.
Also, configure the POST method request. First, select "Validate query string parameters and headers" so the request is validated, then add the following header to the HTTP request headers:
Name | Required | Cache
---|---|---
X-Line-Signature | ☑ | ☐
Once set, deploy the API. After the deployment completes, copy the stage's invoke URL and paste it into the LINE Bot's webhook URL field. This completes the first part.
The second part is the image storage part. It is very simple: just save the image that the controller Lambda loaded into an S3 bucket. The processing flow is as follows:
First, create an S3 bucket for this app. In this project, if the bucket name is too long, the "signed URL length problem" occurs (see [3-5](#signed url) for details), so make the bucket name as short as possible (4 characters in my case). Also, you don't want others to see your photos, right? To protect privacy, check "Block all public access" in the permission settings when creating the bucket. After creating it, make a folder called "origin_photo" for the photos users upload and a folder called "masks" for the mask images. This completes the work on the S3 side.
The controller Lambda function was already written in full in Part 1, so there is nothing more to do here; I will just explain the code for this part:
lambda_function_for_controller.py
```python
# Connect to the S3 bucket
s3 = boto3.client("s3")
bucket = "<S3 bucket name>"
```
lambda_function_for_controller.py
```python
        # Extract the message ID
        message_id = line_event.message.id
        # Extract the image file
        message_content = line_bot_api.get_message_content(message_id)
        content = bytes()
        for chunk in message_content.iter_content():
            content += chunk
        # Save the image file
        key = "origin_photo/" + message_id
        new_key = message_id[-3:]
        s3.put_object(Bucket=bucket, Key=key, Body=content)
```
Here, the image file is named with the LINE message ID so that images from multiple users can be distinguished.
The third part is recognition of the saved photo. Specifically, it recognizes the contour of the face and the positions of the eyes and nose, which are later used when compositing the mask image. True to the concept of "reducing the amount of work as much as possible", I didn't want to train a face recognition AI from scratch, so faces are recognized with the AWS service "Rekognition".
Rekognition is a service that "automates image and video analysis with machine learning". Simply put, you "use a trained AI as is". Here's an introduction: Amazon Rekognition
Rekognition has various functions, such as object and scene detection and face comparison, and it can process not only images but also videos. This time, we use the face detection function to obtain the position of the face. The positional information we want is called a "landmark". The figure below shows an image of the landmarks:
The analysis result for this figure:
```json
{
  "FaceDetails": [
    {
      "AgeRange": {
        "High": 43,
        "Low": 26
      },
      "Beard": {
        "Confidence": 97.48941802978516,
        "Value": true
      },
      "BoundingBox": {
        "Height": 0.6968063116073608,
        "Left": 0.26937249302864075,
        "Top": 0.11424895375967026,
        "Width": 0.42325547337532043
      },
      "Confidence": 99.99995422363281,
      "Emotions": [
        { "Confidence": 0.042965151369571686, "Type": "DISGUSTED" },
        { "Confidence": 0.002022328320890665, "Type": "HAPPY" },
        { "Confidence": 0.4482877850532532, "Type": "SURPRISED" },
        { "Confidence": 0.007082826923578978, "Type": "ANGRY" },
        { "Confidence": 0, "Type": "CONFUSED" },
        { "Confidence": 99.47616577148438, "Type": "CALM" },
        { "Confidence": 0.017732391133904457, "Type": "SAD" }
      ],
      "Eyeglasses": {
        "Confidence": 99.42405700683594,
        "Value": false
      },
      "EyesOpen": {
        "Confidence": 99.99604797363281,
        "Value": true
      },
      "Gender": {
        "Confidence": 99.722412109375,
        "Value": "Male"
      },
      "Landmarks": [
        { "Type": "eyeLeft", "X": 0.38549351692199707, "Y": 0.3959200084209442 },
        { "Type": "eyeRight", "X": 0.5773905515670776, "Y": 0.394561767578125 },
        { "Type": "mouthLeft", "X": 0.40410104393959045, "Y": 0.6479480862617493 },
        { "Type": "mouthRight", "X": 0.5623446702957153, "Y": 0.647117555141449 },
        { "Type": "nose", "X": 0.47763553261756897, "Y": 0.5337067246437073 },
        { "Type": "leftEyeBrowLeft", "X": 0.3114689588546753, "Y": 0.3376390337944031 },
        { "Type": "leftEyeBrowRight", "X": 0.4224424660205841, "Y": 0.3232649564743042 },
        { "Type": "leftEyeBrowUp", "X": 0.36654090881347656, "Y": 0.3104579746723175 },
        { "Type": "rightEyeBrowLeft", "X": 0.5353175401687622, "Y": 0.3223199248313904 },
        { "Type": "rightEyeBrowRight", "X": 0.6546239852905273, "Y": 0.3348073363304138 },
        { "Type": "rightEyeBrowUp", "X": 0.5936762094497681, "Y": 0.3080498278141022 },
        { "Type": "leftEyeLeft", "X": 0.3524211347103119, "Y": 0.3936865031719208 },
        { "Type": "leftEyeRight", "X": 0.4229775369167328, "Y": 0.3973258435726166 },
        { "Type": "leftEyeUp", "X": 0.38467878103256226, "Y": 0.3836822807788849 },
        { "Type": "leftEyeDown", "X": 0.38629674911499023, "Y": 0.40618783235549927 },
        { "Type": "rightEyeLeft", "X": 0.5374732613563538, "Y": 0.39637991786003113 },
        { "Type": "rightEyeRight", "X": 0.609208345413208, "Y": 0.391626238822937 },
        { "Type": "rightEyeUp", "X": 0.5750962495803833, "Y": 0.3821527063846588 },
        { "Type": "rightEyeDown", "X": 0.5740782618522644, "Y": 0.40471214056015015 },
        { "Type": "noseLeft", "X": 0.4441811740398407, "Y": 0.5608476400375366 },
        { "Type": "noseRight", "X": 0.5155643820762634, "Y": 0.5569332242012024 },
        { "Type": "mouthUp", "X": 0.47968366742134094, "Y": 0.6176465749740601 },
        { "Type": "mouthDown", "X": 0.4807897210121155, "Y": 0.690782368183136 },
        { "Type": "leftPupil", "X": 0.38549351692199707, "Y": 0.3959200084209442 },
        { "Type": "rightPupil", "X": 0.5773905515670776, "Y": 0.394561767578125 },
        { "Type": "upperJawlineLeft", "X": 0.27245330810546875, "Y": 0.3902156949043274 },
        { "Type": "midJawlineLeft", "X": 0.31561678647994995, "Y": 0.6596118807792664 },
        { "Type": "chinBottom", "X": 0.48385748267173767, "Y": 0.8160444498062134 },
        { "Type": "midJawlineRight", "X": 0.6625112891197205, "Y": 0.656606137752533 },
        { "Type": "upperJawlineRight", "X": 0.7042999863624573, "Y": 0.3863988518714905 }
      ],
      "MouthOpen": {
        "Confidence": 99.83820343017578,
        "Value": false
      },
      "Mustache": {
        "Confidence": 72.20288848876953,
        "Value": false
      },
      "Pose": {
        "Pitch": -4.970901966094971,
        "Roll": -1.4911699295043945,
        "Yaw": -10.983647346496582
      },
      "Quality": {
        "Brightness": 73.81391906738281,
        "Sharpness": 86.86019134521484
      },
      "Smile": {
        "Confidence": 99.93638610839844,
        "Value": false
      },
      "Sunglasses": {
        "Confidence": 99.81478881835938,
        "Value": false
      }
    }
  ]
}
```
What we want from this is the "Landmarks" item. "Type" is the name of the point (see the image above). Note, however, that X and Y are not pixel coordinates: they are ratios of the image width and height, respectively.
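As a minimal sketch of the conversion (the landmark values are taken from the eyeLeft entry above; the image size is a made-up example):

```python
# Convert a Rekognition landmark (ratios) into concrete pixel coordinates
image_width, image_height = 1000, 800  # hypothetical photo dimensions

landmark = {"Type": "eyeLeft", "X": 0.38549351692199707, "Y": 0.3959200084209442}

x_pixel = int(image_width * landmark["X"])   # X is a ratio of the image width
y_pixel = int(image_height * landmark["Y"])  # Y is a ratio of the image height
print(x_pixel, y_pixel)  # -> 385 316
```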
The processing flow of the third part is as follows. Rekognition has two ways to read images: the first is to reference an image stored in an S3 bucket, and the second is to send the image bytes directly in the request. This time we use the first method. Therefore, what is passed from the controller Lambda to the face recognition Lambda is not the image itself but the storage location of the file, and the same goes for what the face recognition Lambda passes to Rekognition.
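For reference, the two input styles of detect_faces look roughly like this (a sketch; the bucket, key, and file names are placeholders):

```python
import boto3

rekognition = boto3.client("rekognition")

# Style 1: reference an object already stored in S3 (the style used in this app)
response = rekognition.detect_faces(
    Image={"S3Object": {"Bucket": "<bucket>", "Name": "origin_photo/<message_id>"}},
    Attributes=["ALL"])

# Style 2: send the image bytes directly in the request
with open("photo.jpg", "rb") as f:
    response = rekognition.detect_faces(Image={"Bytes": f.read()}, Attributes=["ALL"])
```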
Here, the face recognition Lambda runs under the IAM role created in the first part, which has permission to use S3 and Rekognition, so even though the S3 bucket is private, Rekognition can read the images in it.
The result returned from Rekognition looks like the example above. It contains all sorts of things, such as "age" and "gender", but this time I only want "Landmarks", so the face recognition Lambda extracts the landmarks from the result.
Also, there are many landmarks: some points (the mouth, etc.) cannot be recognized well because of the mask, and some (around the eyes, etc.) are finer than we need. So here we extract just the following 5 landmarks and return them to the controller Lambda:
Landmark name | Position
---|---
eyeLeft | Left eye
eyeRight | Right eye
upperJawlineLeft | Left temple
upperJawlineRight | Right temple
chinBottom | Chin
To separate responsibilities, create a face recognition Lambda function in addition to the controller Lambda function. Create it just like the controller Lambda: select Python 3.x, use the same execution role, and likewise set a timeout of 1 minute and 512 MB of memory in "Basic settings".
After creating it, there is no package to bundle this time, so no zip upload is needed; just fill the automatically generated lambda_function.py with the code below.
lambda_function_for_rekognition.py
```python
import json
import boto3

rekognition = boto3.client("rekognition")


def lambda_handler(event, context):
    # Get the image file path from the event
    bucket = event["Bucket"]
    key = event["Key"]
    # Call Rekognition for face recognition
    response = rekognition.detect_faces(
        Image={'S3Object': {'Bucket': bucket, 'Name': key}}, Attributes=['ALL'])
    # How many people are in the photo
    number_of_people = len(response["FaceDetails"])
    # A list to collect the required landmarks for everyone
    all_needed_landmarks = []
    # Process each person
    for i in range(number_of_people):
        # This is a list of dictionaries
        all_landmarks_of_one_person = response["FaceDetails"][i]["Landmarks"]
        # Only eyeLeft, eyeRight, upperJawlineLeft, upperJawlineRight and chinBottom
        # are used this time; extract them into needed_landmarks
        needed_landmarks = []
        for landmark_type in ["eyeLeft", "eyeRight", "upperJawlineLeft", "upperJawlineRight", "chinBottom"]:
            landmark = next(
                item for item in all_landmarks_of_one_person if item["Type"] == landmark_type)
            needed_landmarks.append(landmark)
        all_needed_landmarks.append(needed_landmarks)
    return all_needed_landmarks
```
The controller Lambda function has already been written in full, so this is just the code description for the third part:
lambda_function_for_controller.py
```python
        # Call the face recognition Lambda
        lambdaRekognitionName = "<Here is arn of face recognition lambda>"
        params = {"Bucket": bucket, "Key": key}  # Image file path information
        payload = json.dumps(params)
        response = boto3.client("lambda").invoke(
            FunctionName=lambdaRekognitionName, InvocationType="RequestResponse", Payload=payload)
        response = json.load(response["Payload"])
```
The fourth part is new image generation. In other words, it is the part that composites the photographic image with one of the following new mask images:
Name | Bane | Joker | Immortan Joe
---|---|---|---
Mask image | ※1 | ※2 | ※3
Source | The Dark Knight Rises | The Dark Knight | Mad Max: Fury Road
※1: This work is a derivative of "Bane" by istolethetv, used under CC BY 2.0.
※2: This work is a derivative of this photo, used under CC0 1.0.
※3: This work, "joe's mask", is a derivative of "File:Fan_Expo_2015_-Immortan_Joe(21147179383).jpg" by GabboT, used under CC BY-SA 2.0. "joe's mask" is licensed under CC BY-SA 2.0 by y2-peng.
The processing flow on AWS is as follows:
First, the controller Lambda passes the "photo storage information (S3 bucket name and file path)", the "5 landmarks", and the "new image file name" to the new image generation Lambda.
Next, the new image generation Lambda uses that storage information to load the photo image and a mask image from the S3 bucket. Note that the mask images must be saved to the S3 bucket in advance, and their file paths are stored in the new image generation Lambda (refer to the code for details such as the file paths); an example of uploading them is shown below.
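For reference, uploading the three mask images in advance can be done with the AWS CLI roughly like this (a sketch; the local file names are assumptions, and the keys match what the code below expects):

```bash
aws s3 cp bane.png  s3://<S3 bucket name>/masks/bane.png
aws s3 cp joker.png s3://<S3 bucket name>/masks/joker.png
aws s3 cp joe.png   s3://<S3 bucket name>/masks/joe.png
```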
Then the photo image and a mask image are composited once for each detected person, with one mask image selected at random each time. The compositing steps are as follows:
That's all for processing.
First, create a new Lambda function in AWS Lambda. The runtime and execution roles are the same as before. Also, as before, set the memory and timeout from "Basic settings".
This time, image compositing requires two Python packages, "pillow" and "numpy". So, as before, first create a new folder and install the packages using the following command:
```bash
python -m pip install pillow numpy -t <new_folder>
```
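One caveat (a general Lambda packaging note, not something specific to this project): pillow and numpy contain compiled binaries, so running the command above on Windows or macOS can produce a package that does not run on Lambda's Linux environment. In that case pip can be asked for Linux-compatible wheels, for example:

```bash
python -m pip install pillow numpy -t <new_folder> \
    --platform manylinux2014_x86_64 --only-binary=:all: \
    --implementation cp --python-version 3.8
```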
Then, create "lambda_function.py" in that folder and enter the following code.
lambda_function_for_new_image_generation.py
```python
import json
import boto3
import numpy as np
from PIL import Image, ImageFile
from operator import sub
from io import BytesIO
from random import choice

s3 = boto3.client("s3")


class NewPhotoMaker:
    def __init__(self, all_landmarks, bucket, photo_key, new_photo_key):
        # The landmarks arrive as a stringified list, so parse them back with eval()
        self.all_landmarks = eval(all_landmarks)
        self.bucket = bucket
        self.photo_key = photo_key
        self.new_photo_key = new_photo_key

    # Load the photographic image
    def load_photo_image(self):
        s3.download_file(self.bucket, self.photo_key, "/tmp/photo_file")
        self.photo_image = Image.open("/tmp/photo_file")

    # Load a mask image
    def load_mask_image(self):
        # Random selection from bane (Batman), joker (Batman) and immortan joe (Mad Max)
        mask_key = "masks/" + choice(["bane", "joker", "joe"]) + ".png"
        s3.download_file(self.bucket, mask_key, "/tmp/mask_file")
        self.mask_image = Image.open("/tmp/mask_file")

    # Convert landmarks (ratios) into concrete pixel points
    def landmarks_to_points(self):
        upperJawlineLeft_landmark = next(
            item for item in self.landmarks if item["Type"] == "upperJawlineLeft")
        upperJawlineRight_landmark = next(
            item for item in self.landmarks if item["Type"] == "upperJawlineRight")
        eyeLeft_landmark = next(
            item for item in self.landmarks if item["Type"] == "eyeLeft")
        eyeRight_landmark = next(
            item for item in self.landmarks if item["Type"] == "eyeRight")
        self.upperJawlineLeft_point = [int(self.photo_image.size[0] * upperJawlineLeft_landmark["X"]),
                                       int(self.photo_image.size[1] * upperJawlineLeft_landmark["Y"])]
        self.upperJawlineRight_point = [int(self.photo_image.size[0] * upperJawlineRight_landmark["X"]),
                                        int(self.photo_image.size[1] * upperJawlineRight_landmark["Y"])]
        self.eyeLeft_point = [int(self.photo_image.size[0] * eyeLeft_landmark["X"]),
                              int(self.photo_image.size[1] * eyeLeft_landmark["Y"])]
        self.eyeRight_point = [int(self.photo_image.size[0] * eyeRight_landmark["X"]),
                               int(self.photo_image.size[1] * eyeRight_landmark["Y"])]

    # Resize the mask image to fit the face width
    def resize_mask(self):
        face_width = int(np.linalg.norm(list(map(sub, self.upperJawlineLeft_point, self.upperJawlineRight_point))))
        new_height = int(self.mask_image.size[1] * face_width / self.mask_image.size[0])
        self.mask_image = self.mask_image.resize((face_width, new_height))

    # Rotate the mask image to match the tilt of the face
    # (in-plane tilt from neck rotation, not a face turned sideways)
    def rotate_mask(self):
        angle = np.arctan2(self.upperJawlineRight_point[1] - self.upperJawlineLeft_point[1],
                           self.upperJawlineRight_point[0] - self.upperJawlineLeft_point[0])
        angle = -np.degrees(angle)  # radians to degrees
        self.mask_image = self.mask_image.rotate(angle, expand=True)

    # Combine the photographic image and the mask image
    def match_mask_position(self):
        # Align using the eye positions
        face_center = [int((self.eyeLeft_point[0] + self.eyeRight_point[0]) / 2),
                       int((self.eyeLeft_point[1] + self.eyeRight_point[1]) / 2)]
        mask_center = [int(self.mask_image.size[0] / 2),
                       int(self.mask_image.size[1] / 2)]
        x = face_center[0] - mask_center[0]
        y = face_center[1] - mask_center[1]
        self.photo_image.paste(self.mask_image, (x, y), self.mask_image)

    # Save the new image file to S3
    def save_new_photo(self):
        new_photo_byte_arr = BytesIO()
        self.photo_image.save(new_photo_byte_arr, format="JPEG")
        new_photo_byte_arr = new_photo_byte_arr.getvalue()
        s3.put_object(Bucket=self.bucket, Key=self.new_photo_key,
                      Body=new_photo_byte_arr)

    # Run
    def run(self):
        self.load_photo_image()
        # Process each person
        for i in range(len(self.all_landmarks)):
            self.load_mask_image()  # Load a new mask each time
            self.landmarks = self.all_landmarks[i]
            self.landmarks_to_points()
            self.resize_mask()
            self.rotate_mask()
            self.match_mask_position()
        self.save_new_photo()


# Lambda main function
def lambda_handler(event, context):
    landmarks = event["landmarks"]
    bucket = event["bucket"]
    photo_key = event["photo_key"]
    new_photo_key = event["new_photo_key"]
    photo_maker = NewPhotoMaker(landmarks, bucket, photo_key, new_photo_key)
    photo_maker.run()
```
Finally, zip the entire contents of the folder and upload it to Lambda. This completes the new image generation Lambda.
The controller Lambda code for this part is below:
lambda_function_for_controller.py
```python
        # Call the new image generation Lambda
        lambdaNewMaskName = "<Here is arn of new image generation lambda>"
        params = {"landmarks": str(response),
                  "bucket": bucket,
                  "photo_key": key,
                  "new_photo_key": new_key}
        payload = json.dumps(params)
        boto3.client("lambda").invoke(FunctionName=lambdaNewMaskName,
                                      InvocationType="RequestResponse", Payload=payload)
```
The last part is the output of the new image. This app does its input and output through the LINE Bot: on input the image file is passed directly, but on output the image file cannot be sent directly.
How images are sent to the user is stipulated in the Image message section of the LINE Bot Messaging API documentation: the API accepts the URL of an image, not the image file itself. Also, according to the documentation, communication between the user and the LINE Bot goes through the LINE platform, so the transmission process looks like this:
However, this process makes **S3 bucket permissions a problem**. If access is set to "private", the LINE platform cannot read the image, and what the user receives looks like this: If access is set to "public", anyone who knows the S3 object URL can access the image. That means your photos can be seen by others, which is a privacy problem.
For a while I considered using DynamoDB or the like to authenticate LINE users, but that would increase the amount of work considerably and collide with the concept of "reducing the amount of work as much as possible". Honestly, I didn't want to do it.
After a lot of research, I finally found a good way. It's a "signed URL".
To protect privacy, access to the S3 bucket stays "private": the image cannot be accessed even by someone who knows its S3 object URL. But with a signed URL issued under the authority of the IAM role, a specific object in the private S3 bucket becomes accessible. It is like a Zoom meeting URL with the password embedded in it.
You can also set an expiration time on the signed URL; once it expires, the URL can no longer be used, which makes it one step more secure. One thing to note, however, is the length of the signed URL. A signed URL issued with IAM role privileges contains token information for temporary access, so the URL gets quite long. Meanwhile, the LINE Bot image message API accepts URLs of at most 1000 characters. So if the S3 bucket name, image file path, and image file name are too long, the URL will exceed 1000 characters and cannot be sent. This is why, when creating the S3 bucket in the second part, I said to make the bucket name as short as possible. For the same reason, the new image file name is just the last 3 characters of the message ID (shortening the file name), and the new image file is saved at the root of the S3 bucket (shortening the file path). This solved the signed URL length issue.
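To sanity-check the length before sending, something like the following works (a sketch reusing the same generate_presigned_url call as the controller code; the bucket and key are placeholders):

```python
import boto3

s3 = boto3.client("s3")
presigned_url = s3.generate_presigned_url(
    ClientMethod="get_object",
    Params={"Bucket": "<S3 bucket name>", "Key": "<new image key>"},
    ExpiresIn=600)

# LINE image messages only accept URLs of up to 1000 characters
print(len(presigned_url))
assert len(presigned_url) <= 1000, "URL too long for a LINE image message"
```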
Supplement: there is actually another solution to the signed URL length problem: issuing the URL with IAM user privileges instead of an IAM role. URLs issued by an IAM user need no token and are shorter, but you must use the IAM user's "access key ID" and "secret access key". For security reasons, issuing URLs as an IAM user is not recommended.
Now that we've solved the S3 bucket permissions issue, let's implement this part. The flow of this part is as follows:
First, the controller Lambda function passes the signed URL of the new image to the LINE Bot. The LINE Bot then reads the image file from the S3 bucket (the actual reading happens on the LINE platform) and sends it to the user. That completes the whole process.
As with the part above, we'll cover the controller Lambda function code for this part.
lambda_function_for_controller.py
```python
        # Generate a signed URL for the new image
        presigned_url = s3.generate_presigned_url(ClientMethod="get_object", Params={
            "Bucket": bucket, "Key": new_key}, ExpiresIn=600)
```
lambda_function_for_controller.py
```python
        # Reply with the new image message
        line_bot_api.reply_message(line_event.reply_token, ImageSendMessage(
            original_content_url=presigned_url, preview_image_url=presigned_url))
```
Let's try the app we made!
First, send and receive using the LINE interface. In the LINE Bot's "Messaging API settings" there is a QR code for the bot, which you can use to add it as a friend. Now, let's send a photo...
※ This work, "wearing joe's mask", is a derivative of "File:Fan_Expo_2015_-Immortan_Joe(21147179383).jpg" by GabboT, used under CC BY-SA 2.0. "wearing joe's mask" is licensed under CC BY-SA 2.0 by y2-peng.
It works! Now let's find out which patterns succeed and which don't.
Description | Before | After
---|---|---
1 person, frontal | ※1 | |
1 person, frontal (with rotation) | ※2 | |
Multiple people, frontal | ※3 | |
Even with a very large face | ※4 | |
※1: This work is a derivative of this photo, used under CC0 1.0.
※2: This work, "result 2", is a derivative of "File:Fan_Expo_2015_-Immortan_Joe(21147179383).jpg" by GabboT, used under CC BY-SA 2.0. "result 2" is licensed under CC BY-SA 2.0 by y2-peng.
※3: This work, "masked 4", is a derivative of "File:Fan_Expo_2015_-Immortan_Joe(21147179383).jpg" by GabboT, used under CC BY-SA 2.0, "Bane" by istolethetv, used under CC BY 2.0, and this photo, used under CC0 1.0. "masked 4" is licensed under CC BY-SA 2.0 by y2-peng.
※4: This work is a derivative of "Bane" by istolethetv, used under CC BY 2.0.
Description | Before | After
---|---|---
Diagonal face | ※1 | |
Face too small (the person at the back) | ※2 | |
Blurred (the person at the back) | ※3 | |
※1:This work, "standing 2" is a derivative of "File:Fan_Expo_2015_-Immortan_Joe(21147179383).jpg" by GabboT, used under CC BY-SA 2.0 and "Bane"byistolethetv,usedunderCCBY2.0. "standing 2" is licensed CC BY-SA 2.0 by y2-peng. ※2:This work, "standing 4" is a derivative of "File:Fan_Expo_2015_-Immortan_Joe(21147179383).jpg" by GabboT, used under CC BY-SA 2.0 and "Bane"byistolethetv,usedunderCCBY2.0. "standing 4" is licensed CC BY-SA 2.0 by y2-peng. ※3:This work is a derivative of "Bane"byistolethetv,usedunderCCBY2.0.
Looking at the results: if the face is frontal and clear, processing generally succeeds. If the image is blurred, the face cannot be recognized and no processing happens. If the face is diagonal or too small, processing runs, but the result is not correct.
This time, I developed a LINE application that changes the white masks in photos into monster masks. By making use of AWS services, it could be realized without a server, and the concept of "reducing the amount of work as much as possible" was carried through. If a photo is frontal and clear, the conversion generally works; handling diagonal and blurred faces remains an issue for the future.