My name is Takakura and I am a server and infrastructure engineer for social games at Apiritz Co., Ltd.
This article is the day 24 entry of the AWS & Game Advent Calendar 2020.
# * Point 1
FROM ruby:2.7.2-alpine
...
# Package installation (* Point 2)
RUN apk add --update --no-cache --virtual=.build-dependencies \
      build-base \
      curl-dev \
      libxml2-dev \
      ...
    && \
    apk add --update --no-cache \
      libxml2 \
      libxslt \
      linux-headers \
      mariadb-connector-c \
      ...
    && \
    gem install bundler --no-document && \
    bundle install -j4 && \
    apk del .build-dependencies
...
Point 1: The default official Docker image contains many packages that are unnecessary for running a server, which increases the image size. Choose a distribution with few unnecessary components, such as Alpine or Distroless, according to the language and environment you use.
# Example: size comparison of Ruby base images. Even for the same version, the difference is about 800MB
$ docker images | grep ruby
ruby 2.7.2 7e58098089a4 5 days ago 842MB
ruby 2.7.2-alpine f811257adce0 6 days ago 51.7MB
Point 2: Packages such as libxml2-dev and linux-headers are often needed only while installing libraries, for example during bundle install. They are not needed to run the server application, so you can expect a smaller image by combining --virtual and apk del as in the sample and completing everything within a single RUN command.
→ As an example, this optimization reduced the sample application image from 195.40MB to 111.36MB, a reduction of about 43%.
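As a quick check of the sizes quoted above, the relative reduction can be computed directly from the two numbers. This is only a minimal sketch: the function name `size_reduction_percent` is hypothetical, and the two sizes are the ones reported for the sample application.

```shell
#!/bin/sh
# Hypothetical helper: percentage reduction between two image sizes (in MB).
size_reduction_percent() {
  awk -v before="$1" -v after="$2" \
    'BEGIN { printf "%.1f\n", (before - after) / before * 100 }'
}

# Sizes of the sample application image before and after the optimization.
size_reduction_percent 195.40 111.36   # prints 43.0
```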
Docker's official documentation provides well-organized best practices that cover the points above: https://matsuand.github.io/docs.docker.jp.onthefly/develop/develop-images/dockerfile_best-practices/
CodeBuild(buildspec.yml)
version: 0.2
env:
  variables:
    DOCKER_BUILDKIT: "1"
    DOCKERHUB_USER: "hoge"
    DOCKERHUB_PASS: "fuga"
    AWS_ACCOUNT_ID: "xxx"
    IMAGE_REPO_NAME: "dfast"
    AWS_DEFAULT_REGION: "ap-northeast-1"
    IMAGE_TAG_BASE: "latest"
    ALTER_CACHE_BRANCH_NAME: "main"
phases:
  pre_build:
    commands:
      # Log in to ECR and Docker Hub
      # (aws ecr get-login exists only in AWS CLI v1; AWS CLI v2 provides aws ecr get-login-password instead)
      - $(aws ecr get-login --no-include-email --region $AWS_DEFAULT_REGION)
      - echo $DOCKERHUB_PASS | docker login -u $DOCKERHUB_USER --password-stdin
      # Environment variable settings
      # The branch name can be derived from CODEBUILD_WEBHOOK_TRIGGER
      - export BRANCH_NAME=${CODEBUILD_WEBHOOK_TRIGGER#branch/}
      # "/" cannot be used in an image tag, so convert every "/" to "_"
      - export IMAGE_TAG_NAME=${BRANCH_NAME//\//_}_${IMAGE_TAG_BASE}
      - export REPOSITORY_URI=${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_DEFAULT_REGION}.amazonaws.com/${IMAGE_REPO_NAME}
      - export GIT_HASH=${CODEBUILD_RESOLVED_SOURCE_VERSION}
      # Settings for the cache
      - export CACHE_URI=${REPOSITORY_URI}:${IMAGE_TAG_NAME}
      - export ALTER_IMAGE_TAG_NAME=${ALTER_CACHE_BRANCH_NAME//\//_}_latest
      - export ALTER_CACHE_URI=${REPOSITORY_URI}:${ALTER_IMAGE_TAG_NAME}
      - export use_alter_cache=0
  build:
    commands:
      # * Point 1
      # Get the latest image of the target branch.
      # If CACHE_URI does not exist, the pull fails and the build would stop there, so "||" keeps the build going
      - docker pull ${CACHE_URI} || use_alter_cache=1
      # No latest image (= first build of the target branch) → use the alternative image (= latest image of the main branch) as the cache
      - |
        if [ "${use_alter_cache}" = 1 ] ; then
          CACHE_URI=${ALTER_CACHE_URI}
          docker pull ${CACHE_URI} || echo "error ignore because no cache docker image ..."
        fi
      - echo CACHE_URI:${CACHE_URI}
      - docker build --cache-from ${CACHE_URI} --build-arg BUILDKIT_INLINE_CACHE=1 -t ${REPOSITORY_URI}:${IMAGE_TAG_NAME} .
      - docker tag ${REPOSITORY_URI}:${IMAGE_TAG_NAME} ${REPOSITORY_URI}:${GIT_HASH}
  post_build:
    commands:
      # * Point 2
      # Add both the <branch name>_latest tag and the Git hash tag to the image
      - docker push ${REPOSITORY_URI}:${IMAGE_TAG_NAME}
      - docker push ${REPOSITORY_URI}:${GIT_HASH}
Point 1: Docker can reuse unchanged layers as a cache when an existing image is specified with --cache-from at build time.
Using this cache skips the apk and bundle installation steps, which makes builds much faster. However, because the cache source is specified per branch, **on the first push of a branch the image for that branch name does not yet exist, so the cache is not used and a full build runs**.
This is usually acceptable, but when a dedicated branch is created for an urgent request such as "QA is blocked by a bug in the development environment, so please fix it immediately", even a minor fix triggered a full build without the cache.
To work around this problem, we switch the --cache-from target dynamically depending on whether an image for the branch name exists. This lets the build use the cache whenever possible and keeps build times short.
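The fallback idea can be tried in isolation with a small sketch like the one below. It is not the buildspec itself: `select_cache_uri` and `pull_image` are hypothetical names, the image names are made up, and `pull_image` is a stub standing in for `docker pull "$1"` so that the logic runs without a registry.

```shell
#!/bin/sh
# Stub for demonstration (assumption): pretend only the main branch image exists.
# In a real build this body would be: docker pull "$1"
pull_image() {
  [ "$1" = "example/repo:main_latest" ]
}

# Try each candidate in order and print the first one that can be pulled.
# Prints nothing and returns 1 when no candidate exists (a full build is needed).
select_cache_uri() {
  for candidate in "$@"; do
    if pull_image "$candidate" >/dev/null 2>&1; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# First build of a new branch: its own image is missing, so main is selected.
select_cache_uri "example/repo:feature_x_latest" "example/repo:main_latest"
```

Passing the candidates as an ordered list keeps the rule "own branch first, then main" in one place, and further fallbacks can be added by appending arguments.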
Point 2: To make it easier to manage multiple versions of the Docker image, the tag "<branch name>_latest" is added to the image at build time. However, using this tag as-is for deploying to ECS or running one-shot tasks is the so-called "latest" tag anti-pattern, so we also tag the image with the Git hash and retrieve that tag as follows.
# Get the Git hash tag of the latest image for the branch from ECR
export SEARCH_TAG=${BRANCH_NAME//\//_}_latest
# The latest image has two tags, "<branch name>_latest" and the Git hash, so extract only the Git hash
tags=$(aws ecr describe-images --repository-name "${REPO_NAME}" --image-ids imageTag="${SEARCH_TAG}" | jq -r '.imageDetails[0].imageTags')
len=$(echo "$tags" | jq length)
for i in $( seq 0 $((len - 1)) ); do
  tag=$(echo "$tags" | jq -r ".[$i]")
  # Skip the "<branch name>_latest" tag itself; the remaining tag is the Git hash
  if [ "$tag" != "$SEARCH_TAG" ]; then
    echo "$tag"
    break
  fi
done
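The tag naming and lookup rules above can also be exercised without access to ECR. The following is a minimal sketch under stated assumptions: `branch_to_latest_tag` and `pick_hash_tag` are hypothetical helper names, the branch name and hash are made-up values, and the tag list is passed as plain arguments instead of being read from `aws ecr describe-images`. Note that this sketch replaces every "/" in the branch name.

```shell
#!/bin/sh
# Convert a branch name to its "<branch name>_latest" tag ("/" is not allowed in tags).
branch_to_latest_tag() {
  printf '%s_latest\n' "$(printf '%s' "$1" | tr '/' '_')"
}

# Print the first tag that differs from the tag to skip (the "_latest" tag).
# Returns 1 when every tag equals the skipped tag.
pick_hash_tag() {
  skip=$1
  shift
  for tag in "$@"; do
    if [ "$tag" != "$skip" ]; then
      printf '%s\n' "$tag"
      return 0
    fi
  done
  return 1
}

# Hypothetical values: an image tagged with both the latest tag and a Git hash.
search_tag=$(branch_to_latest_tag "feature/login")
pick_hash_tag "$search_tag" "feature_login_latest" "0123abc"   # prints 0123abc
```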
By the way, since this entry is published on 12/24, I decided to display a Santa banner on the CodeBuild console at the start of the build using netpbm. By chaining netpbm's conversion and processing programs with pipes, an image file can be output as monospaced ASCII art.
buildspec.yml
...
    BANNER_IMG_URL: https://xxx/yyy.png
    BANNER_FILE_NAME: santa.png
    # BANNER_IMG_URL: https://xxx/yyy.jpg
    # BANNER_FILE_NAME: santa.jpg
phases:
  pre_build:
    commands:
      # Display the Santa ASCII-art banner
      - curl ${BANNER_IMG_URL} --output ${BANNER_FILE_NAME}
      - yum install -y netpbm-progs
      - pngtopam -mix -background=#ffffff ${BANNER_FILE_NAME} | pamscale -xscale .3 -yscale .3 | ppmtopgm | pgmtopbm | pbmtoascii
      # For a JPEG image, use this command instead:
      # - jpegtopnm ${BANNER_FILE_NAME} | pamscale -xscale .3 -yscale .3 | ppmtopgm | pgmtopbm | pbmtoascii
...
This time, due to time constraints, I only displayed it on the CodeBuild console. However, since the output is plain text, it can be applied in various ways, such as posting it to Slack with Lambda.
This is just one example, and there are surely many more optimizations and settings to explore. Container technology is a popular, fast-growing field that will continue to evolve. Let's keep pursuing new technology so that we can stay active as engineers!