In this article, we describe how to collect traces with the Trace feature of Google Cloud Platform (GCP) using the open source OpenTelemetry library. The traced application runs on Cloud Run.

What is OpenTelemetry?

OpenTelemetry is an open source project for analyzing the behavior of increasingly complex systems. By measuring how long each executed operation takes, you can track down the cause of a bottleneck or failure. OpenTelemetry is the successor to OpenCensus, an earlier open source project for distributed tracing.

Trace structure

When a trace starts, an identifier called a TraceID is created first. This ID makes the trace unique and ties it to its measurement sections, called Spans. Spans have parent-child relationships, so a measurement can be subdivided. The figure below shows a structure in which the top-level parent Span is divided into three child Spans, and the middle child, Span2, is traced in more detail.
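The structure described above can be sketched in plain Go. This is only an illustration under stated assumptions: the `Span` and `Trace` types, the string TraceID, and the two sub-spans under Span2 are hypothetical and are not the OpenTelemetry data types.

```go
package main

import "fmt"

// Span is a hypothetical, simplified model of one measured section.
// It is not the OpenTelemetry span type.
type Span struct {
	TraceID  string // shared by every span in the same trace
	SpanID   int    // unique within the trace
	ParentID int    // 0 means "no parent" (the root span)
	Name     string
	Children []*Span
}

// Trace hands out span IDs for a single trace (hypothetical helper).
type Trace struct {
	ID     string
	nextID int
}

// NewRoot creates the top-level parent span of the trace.
func (t *Trace) NewRoot(name string) *Span {
	t.nextID++
	return &Span{TraceID: t.ID, SpanID: t.nextID, Name: name}
}

// NewChild creates a span under parent; it inherits the TraceID.
func (t *Trace) NewChild(parent *Span, name string) *Span {
	t.nextID++
	child := &Span{TraceID: t.ID, SpanID: t.nextID, ParentID: parent.SpanID, Name: name}
	parent.Children = append(parent.Children, child)
	return child
}

// BuildExample builds a parent with three children, where the middle
// child (Span2) is subdivided further. The number of sub-spans under
// Span2 is an assumption made for illustration.
func BuildExample() *Span {
	t := &Trace{ID: "trace-1"}
	root := t.NewRoot("parent")
	t.NewChild(root, "Span1")
	span2 := t.NewChild(root, "Span2")
	t.NewChild(root, "Span3")
	t.NewChild(span2, "Span2-1")
	t.NewChild(span2, "Span2-2")
	return root
}

// printTree prints each span indented under its parent.
func printTree(s *Span, indent string) {
	fmt.Printf("%s%s (trace=%s span=%d parent=%d)\n", indent, s.Name, s.TraceID, s.SpanID, s.ParentID)
	for _, c := range s.Children {
		printTree(c, indent+"  ")
	}
}

func main() {
	printTree(BuildExample(), "")
}
```

Every span carries the same TraceID, and each child records the SpanID of its parent, which is what lets a trace viewer rebuild the tree.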

Implementation in Go

Terminal: MacBook Pro
Go: go1.15.2
Cloud: GCP

Installation

Download Go from https://golang.org/doc/install
Install from the downloaded file
Add Go to your PATH

$ export PATH=/usr/local/go/bin:$PATH

Initialize

Initialize the project so that its modules are managed with **go mod**. The following command creates a file called **go.mod**.

$ go mod init example.com/trace

After that, when you import modules in your code and build, the required modules are added to go.mod automatically.

Directory structure

The final directory structure looks like this:

.
├── Dockerfile
├── go.mod
├── go.sum
└── main.go

Source code

main.go

This is an API that receives HTTP requests. It loads the tracing packages and performs initial setup such as setting the project ID. The code does the following:

Creates a Tracer named **example.com/trace**
Starts a span named **sample**
Measures the processing time of each step


package main

import (
        "fmt"
        "log"
        "net/http"
        "os"
        "time"

        texporter "github.com/GoogleCloudPlatform/opentelemetry-operations-go/exporter/trace"
        "go.opentelemetry.io/otel/api/global"
        sdktrace "go.opentelemetry.io/otel/sdk/trace"
)

// handler traces one request: a parent span "sample" with two child spans.
func handler(w http.ResponseWriter, r *http.Request) {
        tracer := global.TraceProvider().Tracer("example.com/trace")

        // Parent span covering the whole request.
        ctx, span := tracer.Start(r.Context(), "sample")
        defer span.End()

        // Passing ctx makes step1 and step2 children of "sample".
        _, step1 := tracer.Start(ctx, "step1")
        time.Sleep(time.Second * 1)
        step1.End()

        _, step2 := tracer.Start(ctx, "step2")
        time.Sleep(time.Second * 2)
        step2.End()

        fmt.Fprintf(w, "Done\n")
}

func main() {
        // Set up the exporter and trace provider once at startup,
        // rather than on every request.
        exporter, err := texporter.NewExporter(texporter.WithProjectID("{project id}"))
        if err != nil {
                log.Fatalf("texporter.NewExporter: %v", err)
        }

        tp, err := sdktrace.NewProvider(sdktrace.WithSyncer(exporter))
        if err != nil {
                log.Fatal(err)
        }
        global.SetTraceProvider(tp)

        http.HandleFunc("/", handler)

        // Cloud Run passes the port to listen on in the PORT variable.
        port := os.Getenv("PORT")
        if port == "" {
                port = "8080"
        }

        log.Fatal(http.ListenAndServe(fmt.Sprintf(":%s", port), nil))
}

Dockerfile

The first stage of the multi-stage build compiles the executable. The second stage creates the image to deploy by copying only that executable into the alpine image.


FROM golang:1.15 as builder

WORKDIR /app

COPY go.* ./
RUN go mod download
COPY . ./

RUN CGO_ENABLED=0 GOOS=linux go build -v -o server


FROM alpine:3
RUN apk add --no-cache ca-certificates

COPY --from=builder /app/server /server

CMD ["/server"]

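Build and deploy

Deploy to Cloud Run. Use **gcloud builds submit** to build the image and save it in Container Registry, then use **gcloud run deploy** to deploy from the saved image. In the commands below, $1 appears where the GCP project ID belongs, as if the commands were run from a shell script that receives the project ID as its first argument.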


$ gcloud builds submit --tag gcr.io/$1/{Image name}
$ gcloud run deploy {Service name} --image gcr.io/$1/{Image name} --platform managed --memory 256M --region {region} --allow-unauthenticated

Result

If you call the deployed API and look at Trace in the GCP console, you'll see something like this: (screenshot taken 2020-09-23 19:02:43)

The execution times of step1 and step2 are displayed inside the sample span. Measuring is easy: create a tracer and wrap the part you want to measure in a span. Because passing the context returned by one tracer.Start call to the next call nests the spans, you can see which part of the API is taking the time. Collecting trace results also lets you check later whether processing is consistently slow or was slow only on that occasion, which I think is a big benefit.
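To illustrate how passing a context can nest spans, here is a minimal sketch that uses only the Go standard library. The `Span` type, `StartSpan`, and `RunDemo` are hypothetical stand-ins written for this example, not the OpenTelemetry API; they only mimic the idea that each start call reads the parent span from the context and returns a new context carrying the new span.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// Span is a hypothetical stand-in for a tracing span.
type Span struct {
	Name    string
	Parent  *Span
	begin   time.Time
	Elapsed time.Duration
}

// ctxKey is the private key under which the current span is stored.
type ctxKey struct{}

// StartSpan begins a span whose parent is the span carried by ctx (if any)
// and returns a new context that carries the new span.
func StartSpan(ctx context.Context, name string) (context.Context, *Span) {
	parent, _ := ctx.Value(ctxKey{}).(*Span)
	s := &Span{Name: name, Parent: parent, begin: time.Now()}
	return context.WithValue(ctx, ctxKey{}, s), s
}

// End records how long the span lasted.
func (s *Span) End() { s.Elapsed = time.Since(s.begin) }

// Path returns the span names from the root down to s, joined by "/".
func (s *Span) Path() string {
	if s.Parent == nil {
		return s.Name
	}
	return s.Parent.Path() + "/" + s.Name
}

// RunDemo nests three spans by threading each returned context
// into the next StartSpan call.
func RunDemo() []*Span {
	ctx, root := StartSpan(context.Background(), "sample")
	stepCtx, step1 := StartSpan(ctx, "step1")
	_, detail := StartSpan(stepCtx, "step1-detail")

	time.Sleep(10 * time.Millisecond) // the work being measured
	detail.End()
	step1.End()
	root.End()
	return []*Span{root, step1, detail}
}

func main() {
	for _, s := range RunDemo() {
		fmt.Printf("%-28s %v\n", s.Path(), s.Elapsed)
	}
}
```

If `ctx` were passed to the third call instead of `stepCtx`, step1-detail would become a sibling of step1 rather than its child, which is the same effect the choice of context has in the article's code.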

How to use GCP Trace with OpenTelemetry