3 Tips to Improve Lead Time for Docker Build

This article is the Day 19 entry in the DMM Group Advent Calendar 2020.

Overview

We made three improvements that reduced the lead time of our Docker builds. This article shares them as tips.

Environment

The builds run on ubuntu-18.04 on GitHub Actions. For the tool versions on that runner image, cf. https://github.com/actions/virtual-environments

jq is used locally to parse the analysis output.

Tip 1: Measure

Measure build time

Use BuildKit to measure the build time. BuildKit is a build backend that ships with Docker 18.09 and later. It can resolve dependencies in parallel and import and export the build cache.

cf. https://github.com/moby/buildkit

When you build with BuildKit, the time taken by each step and by the whole build is displayed. To enable it, set the environment variable DOCKER_BUILDKIT=1.


DOCKER_BUILDKIT=1 docker build -t test .
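
To compare timings between runs, it can help to save the build output and rank the steps by duration. The sketch below assumes the build was run with `--progress=plain` and its output saved to a file (for example `DOCKER_BUILDKIT=1 docker build --progress=plain -t test . 2> build.log`), and that finished steps are logged as `#<id> DONE <seconds>s`. The function name `slowest_steps` and the file name `build.log` are hypothetical.

```shell
# slowest_steps LOGFILE [N]
# Print the N slowest steps (default 5) from a BuildKit plain-progress log
# as "<seconds><TAB><step name>". Assumes log lines of the form
#   #6 [stage1 2/2] RUN ...
#   #6 DONE 5.3s
slowest_steps() {
  awk '
    # Remember the description of each step, keyed by its "#<id>" prefix.
    /^#[0-9]+ \[/ { id = $1; sub(/^#[0-9]+ /, ""); name[id] = $0; next }
    # When a step finishes, emit its duration without the trailing "s".
    /^#[0-9]+ DONE / { t = $3; sub(/s$/, "", t); print t "\t" name[$1] }
  ' "$1" | LC_ALL=C sort -rn | head -n "${2:-5}"
}
```

For example, `slowest_steps build.log 3` would list the three longest steps of that build.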

Measure image size

There is a tool called wagoodman/dive that analyzes Docker images. You can use this to check the size of layers and files.

dive can export its analysis as JSON. With jq, you can then sort the layers and files by size and display them in descending order, as shown below.

Save the result of dive to a file

dive <IMAGE:TAG> --json <FILENAME>.json

Display the 10 largest layers in descending order of size

cat <FILENAME>.json | jq '.layer | sort_by(.sizeBytes) | reverse | [limit(10; .[])]'

Display the 10 largest files in descending order of size

cat <FILENAME>.json | jq '.image.fileReference | sort_by(.sizeBytes) | reverse | [limit(10; .[])]'
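
For a quick look in the terminal, the same layer filter can emit a compact table instead of JSON. This is a minimal sketch: the function name `top_layers` is hypothetical, and it assumes each entry in `.layer` also carries a `command` field, which is worth checking against the JSON your dive version produces.

```shell
# top_layers JSONFILE [N]
# Print "<size in bytes><TAB><command>" for the N largest layers (default 10)
# found in a JSON file exported by dive.
top_layers() {
  jq -r --argjson n "${2:-10}" '
    .layer
    | sort_by(.sizeBytes)
    | reverse
    | limit($n; .[])
    | "\(.sizeBytes)\t\(.command)"
  ' "$1"
}
```

Calling `top_layers <FILENAME>.json 5` would print the five largest layers, one per line.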

By measuring the build time and checking the image size, we were able to identify the problem areas.

Tip 2: Run multi-stage builds in parallel with BuildKit

BuildKit executes independent stages in parallel during a multi-stage build. Let's check this behavior with the Dockerfile below.

FROM alpine AS stage1
RUN echo "stage1" \
    && sleep 5 \
    && echo "stage1" > stage1.txt

FROM alpine AS stage2
RUN echo "stage2" \
    && sleep 5 \
    && echo "stage2" > stage2.txt

FROM alpine
COPY --from=stage1 /stage1.txt ./
COPY --from=stage2 /stage2.txt ./


Results of a regular Docker build

$ time docker build -t test . --no-cache
Sending build context to Docker daemon  8.192kB
Step 1/7 : FROM alpine AS stage1
 ---> 389fef711851
Step 2/7 : RUN echo "stage1"     && sleep 5     && echo "stage1" > stage1.txt
 ---> Running in 060af4f159d1
stage1
Removing intermediate container 060af4f159d1
 ---> 47cacbebce55
Step 3/7 : FROM alpine AS stage2
 ---> 389fef711851
Step 4/7 : RUN echo "stage2"     && sleep 5     && echo "stage2" > stage2.txt
 ---> Running in 5527e0adf01c
stage2
Removing intermediate container 5527e0adf01c
 ---> c4c36b1aaa7b
Step 5/7 : FROM alpine
 ---> 389fef711851
Step 6/7 : COPY --from=stage1 /stage1.txt ./
 ---> b29d3db1464c
Step 7/7 : COPY --from=stage2 /stage2.txt ./
 ---> 2ae9b56c1d34
Successfully built 2ae9b56c1d34
Successfully tagged test:latest

real    0m11.976s
user    0m0.261s
sys     0m0.190s

Because stage1 and stage2 each run a 5-second sleep one after the other, the build takes more than 10 seconds (about 12 seconds here).

BuildKit results

$ DOCKER_BUILDKIT=1 docker build -t test . --no-cache
[+] Building 5.6s (9/9) FINISHED                                                                                                                                                        
 => [internal] load build definition from Dockerfile                                                                                                                               0.1s
 => => transferring dockerfile: 322B                                                                                                                                               0.0s
 => [internal] load .dockerignore                                                                                                                                                  0.0s
 => => transferring context: 2B                                                                                                                                                    0.0s
 => [internal] load metadata for docker.io/library/alpine:latest                                                                                                                   0.0s
 => [stage2 1/2] FROM docker.io/library/alpine                                                                                                                                     0.0s
 => => resolve docker.io/library/alpine:latest                                                                                                                                     0.0s
 => [stage2 2/2] RUN echo "stage2"     && sleep 5     && echo "stage2" > stage2.txt                                                                                                5.3s
 => [stage1 2/2] RUN echo "stage1"     && sleep 5     && echo "stage1" > stage1.txt                                                                                                5.3s
 => [runner 2/3] COPY --from=stage1 /stage1.txt ./                                                                                                                                 0.1s
 => [runner 3/3] COPY --from=stage2 /stage2.txt ./                                                                                                                                 0.1s
 => exporting to image                                                                                                                                                             0.0s
 => => exporting layers                                                                                                                                                            0.0s
 => => writing image sha256:27621000a30c0338903150a187a8b665a56137b5d14c018d0ac083a716d2fb72                                                                                       0.0s
 => => naming to docker.io/library/test                                                                                                                                            0.0s

This confirms that stage1 and stage2 run in parallel and that the whole build takes only about 5 seconds.

Tip 3: Use the cache on CI

Docker provides an official GitHub Action for building and pushing images: https://github.com/docker/build-push-action. From version 2 onwards, it uses Buildx to build and push images. Combining it with actions/cache@v2 makes the cache work on CI.

What is buildx

buildx is a Docker CLI plugin that supports all the features of BuildKit. However, as of December 2020, it is an experimental feature and is not recommended for production use. cf. https://docs.docker.com/buildx/working-with-buildx/ and https://github.com/docker/buildx

About intermediate stage cache

If you are using a multi-stage build and want to cache the intermediate stages as well, specify mode=max in the --cache-to option. With mode=min, only the layers of the final stage are cached.

About cache type

BuildKit has the following three cache types. This time I used the local type.

- inline: embed the cache in the Docker image
- registry: push the image and the cache separately
- local: export the cache to a local directory
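
Outside of GitHub Actions, the same settings can be passed to `docker buildx build` directly. The helper below is a sketch that only assembles the `--cache-from` and `--cache-to` flags for the local cache type; the function name `local_cache_flags` and the directory used in the usage line are illustrative.

```shell
# local_cache_flags DIR [MODE]
# Print the buildx flags for importing and exporting a local cache in DIR.
# MODE is "max" (also cache intermediate stages) or "min"; defaults to "max".
local_cache_flags() {
  dir=$1
  mode=${2:-max}
  printf '%s %s\n' \
    "--cache-from=type=local,src=${dir}" \
    "--cache-to=type=local,dest=${dir},mode=${mode}"
}
```

A build could then be started with `docker buildx build $(local_cache_flags /tmp/.buildx-cache) -t test .`, leaving the command substitution unquoted so that the two flags are passed as separate arguments.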

Example

The actual workflow looks like this:

.github/workflows/build.yaml


name: Build and Push Container

on:
  push:
    branches:
      - 'main'

jobs:
  build-push-image:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v1

      - name: Cache Docker layers
        uses: actions/cache@v2
        with:
          path: /tmp/.buildx-cache
          key: ${{ runner.os }}-buildx-${{ github.sha }}
          restore-keys: |
            ${{ runner.os }}-buildx-

      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v1
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: ap-northeast-1

      - name: Login to Amazon ECR
        id: login-ecr
        uses: aws-actions/amazon-ecr-login@v1

      - name: Build and Push
        if: github.ref == 'refs/heads/main'
        uses: docker/build-push-action@v2
        env:
          ECR_REGISTRY: ${{ steps.login-ecr.outputs.registry }}
          ECR_REPOSITORY: ${{ github.repository }}
        with:
          push: true
          tags: ${{ env.ECR_REGISTRY }}/${{ env.ECR_REPOSITORY }}:latest
          cache-from: type=local,src=/tmp/.buildx-cache
          cache-to: type=local,dest=/tmp/.buildx-cache,mode=max
          build-args: |
            ARG1=hoge

Specify the cache path with actions/cache@v2 and use the same path in docker/build-push-action@v2. With this setup, we confirmed that the cache is used and that the build time is shorter from the second build onwards.
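
One thing to watch with the local cache type is that the cache directory can keep growing, because entries from older builds are not removed when a new cache is exported. A commonly used workaround is to export to a second directory and swap it in after the build. The sketch below shows that swap as a shell function that could be called from a `run:` step; the function name `rotate_buildx_cache` and the `-new` directory suffix are assumptions, and `cache-to` would need to point at the new directory for this to apply.

```shell
# rotate_buildx_cache OLD_DIR NEW_DIR
# Replace the restored cache (OLD_DIR) with the cache exported by the
# current build (NEW_DIR) so that stale entries are not carried forward.
rotate_buildx_cache() {
  old_dir=$1
  new_dir=$2
  # Nothing to rotate if the build did not export a new cache.
  [ -d "$new_dir" ] || return 0
  rm -rf "$old_dir"
  mv "$new_dir" "$old_dir"
}
```

In the workflow above, this would mean setting `cache-to: type=local,dest=/tmp/.buildx-cache-new,mode=max` and calling `rotate_buildx_cache /tmp/.buildx-cache /tmp/.buildx-cache-new` in a step after the build.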

cf. https://docs.github.com/en/free-pro-team@latest/actions/guides/caching-dependencies-to-speed-up-workflows

Summary

This article covered:

- Measuring Docker builds and images
- Running multi-stage builds in parallel
- Using the cache on CI

When making improvements, start by measuring, weigh the cost of each improvement against its expected benefit, and only then start the actual work. Also make sure you follow the Dockerfile Best Practices (https://docs.docker.com/develop/develop-images/dockerfile_best-practices/) before applying the improvements presented here. If your CI still takes a long time, consider trying these tips.
