[Ruby] Learning Digdag from the Official Digdag Documentation: Architecture



This is a translation of the Architecture page of the official Digdag documentation, plus some additions of my own. The final goal is to build a batch job in Rails using Digdag's Ruby API.

Source: http://docs.digdag.io/architecture.html

Table of contents

- Getting started
- Architecture
- Concepts
- Workflow definition
- Scheduling workflow
- Operators
- Command reference
- Language API - Ruby
- Change the setting value for each environment with Digdag (RubyOnRails)
- Batch implementation in RubyOnRails environment using Digdag

## Digdag Architecture

### Automating workflow with Digdag

A workflow automates manual operations. You define a set of tasks as a workflow, and Digdag keeps running it for you. Tasks are defined using operator plugins, so you can control many types of systems from a central workflow engine.

As the runtime framework for these plugins, Digdag takes care of the remaining workload automation concerns so that you can focus on the automation itself. If a task fails, Digdag sends an alert. If a workflow does not complete within the expected time, Digdag sends a notification. Tasks can run on a local machine, on distributed servers, or in Docker containers.

### Group organization of tasks

As you automate complex workflows, the definitions quickly become complicated. Digdag lets you organize tasks into groups. When you review a definition, you start from the overview and then move into the details. In the figure, the upper part shows at a glance the overall flow you intend to run: data preparation, analysis, and evaluation. You can then look at the details of each group, which makes debugging and review easier during development. In production, administrators can quickly see what is happening and how to fix a problem.

[Figure: data preparation group, analysis group, evaluation group]

A task starts when it has no sibling tasks that it depends on, or when all of the siblings it depends on have completed successfully. When the parent task of a group runs, it runs its child tasks. When all of the children complete successfully, the parent task also completes successfully. If a child task fails, its parent task fails as well. When the root task completes or fails, the entire run ends.
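The rules above can be sketched as a small Ruby model. This is only an illustration, not Digdag's implementation: the `Task` struct, the `leaf` and `group` helpers, and the task names are all hypothetical, and siblings are assumed to run one after another, each depending on the previous one.

```ruby
# Toy model of task groups (an assumption for illustration, not Digdag code).
Task = Struct.new(:name, :children, :action) do
  # Runs this task, appends its outcome to log, and returns true on success.
  def run(log = [])
    if children.empty?
      ok = action.call
      log << "#{name}: #{ok ? 'success' : 'failed'}"
      return ok
    end
    # Children run in order; all? stops at the first failing child,
    # so later siblings never start.
    ok = children.all? { |child| child.run(log) }
    log << "#{name}: #{ok ? 'success' : 'failed'}"
    ok
  end
end

# Hypothetical helpers for building the tree.
def leaf(name, ok = true)
  Task.new(name, [], -> { ok })
end

def group(name, *children)
  Task.new(name, children, nil)
end

workflow = group("root",
  group("prepare", leaf("load"), leaf("clean")),
  group("analyze", leaf("model", false), leaf("report")),
  group("evaluate", leaf("score")))

log = []
result = workflow.run(log)
puts log
puts "workflow succeeded: #{result}"
```

In this sketch `model` fails, so `report` and the `evaluate` group never start, and both `analyze` and `root` are marked as failed.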


Task grouping is also used to pass parameters between tasks. A parent task can export variables to its child tasks, much as the UNIX shell `export` command sets environment variables. A parent task can also generate child tasks at run time, so different tasks can run depending on the results of previous tasks.
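Parameter passing from parent to child can be pictured with another small Ruby model. Again this is a sketch under assumptions rather than Digdag's own code: the `Node` struct, the `effective_params` helper, and the variable names are hypothetical, and the rule that a child's own export overrides an inherited value is an assumption of this model.

```ruby
# Toy model: each task may export variables; children inherit the parent's
# exports, and a task's own exports override inherited values (assumption).
Node = Struct.new(:name, :exports, :children)

# Returns a hash mapping each task name to the variables visible to it.
def effective_params(node, inherited = {}, out = {})
  params = inherited.merge(node.exports)
  out[node.name] = params
  node.children.each { |child| effective_params(child, params, out) }
  out
end

tree = Node.new("root", { "env" => "production" }, [
  Node.new("prepare", { "table" => "raw_events" }, [
    Node.new("load", {}, []),
    Node.new("clean", { "table" => "clean_events" }, [])
  ]),
  Node.new("analyze", {}, [])
])

params = effective_params(tree)
params.each { |name, vars| puts "#{name}: #{vars}" }
```

Variables exported by `prepare` reach `load` and `clean` but not the sibling group `analyze`, which sees only what `root` exported.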

### Workflow as code

Digdag workflows are defined as code. This lets you apply software development best practices: version control, code review, testing, and collaboration through pull requests. Once a workflow is pushed to a git repository, anyone can pull it and reproduce the same results.

### Running with local mode

Digdag is a single-file executable command. Creating and running a new workflow is as easy as writing a Makefile.

Files with the `*.dig` extension are used for workflow definitions.

The `digdag run my_workflow.dig` command runs a workflow.

Once you’ve developed a workflow on your local machine and tested it, you need to push it to a server to run it on a regular basis.

### Running on a server

The `*.dig` files and the other files in the same directory are together called a project.

You can push an entire project to a Digdag server so that its workflows run on the server.
Jumping ahead a little, pushing the workflow created in the previous article to the server looks like this.

#### **`Digdag server start`**

    $ digdag init mydag
    $ cd mydag
    $ digdag run mydag.dig
    $ digdag server --memory

#### **`Add project to Workflow`**

    $ digdag push mydag

    2020-07-09 20:39:20 +0900: Digdag v0.9.41
    Creating .digdag/tmp/archive-3560803829245476890.tar.gz…
      Archiving mydag.dig
    Workflows:
      mydag.dig
    Uploaded:
      id: 1
      name: mydag
      revision: b8b9abb8-b156-4089-a009-a01aa0337d9f
      archive type: db
      project created at: 2020-07-09T11:39:21Z
      revision updated at: 2020-07-09T11:39:21Z

Use the `digdag workflows` command to show all workflows.

![Screenshot 2020-07-09 20.39.46.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/108475/d0bcf143-87a3-0d03-7265-4d8ce8a65d9b.png)

### Running tasks on Docker
You can use Docker to run tasks in a container.
If the `docker` option is set, tasks run in a Docker container.

I have not covered task details yet, so the following code is explained in the Workflow definition part.

    _export:
      docker:
        image: ubuntu:14.04

    +step1:
      py>: tasks.MyWorkflow.step1
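Since the goal of this series is Ruby, it may help to picture the Ruby counterpart of the task referenced above. The following is a minimal sketch that assumes a hypothetical file `tasks/my_workflow.rb` whose class and method names mirror the `py>` example. How such a class is wired into a workflow through Digdag's `rb>` operator is left to the Language API part.

```ruby
# Hypothetical tasks/my_workflow.rb. The class and method names mirror the
# py> example above and are assumptions made for illustration.
class MyWorkflow
  # A task body is an ordinary method; here it only reports that it ran.
  def step1
    message = "step1 finished"
    puts message
    message
  end
end

# Calling the method directly makes it easy to check outside Digdag.
MyWorkflow.new.step1
```

Keeping the task body as a plain method means it can be exercised from a test or a console without starting a workflow.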

Digdag caches the pulled image and reuses it. By default, Digdag keeps using the cached image even if the image has been updated. Set the `pull_always: true` option to make Digdag check for updates and pull the latest image of the tag every time a task starts.

    _export:
      docker:
        image: ubuntu:latest
        pull_always: true

    +step1:
      py>: tasks.MyWorkflow.step1
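The difference between the default caching behavior and `pull_always: true` can be illustrated with a toy Ruby model. This is not how Digdag is implemented: the `ImageCache` class and its `image_for` method are hypothetical, and a "pull" here is just a counter rather than a real Docker operation.

```ruby
# Toy model of the image cache decision (an assumption, not Digdag code).
class ImageCache
  attr_reader :pulls

  def initialize
    @cached = {}
    @pulls = 0
  end

  # Returns the image to use for a task, pulling only when needed:
  # on first use of a tag, or on every call when pull_always is true.
  def image_for(tag, pull_always: false)
    if pull_always || !@cached.key?(tag)
      @pulls += 1
      @cached[tag] = "#{tag}@pull-#{@pulls}"
    end
    @cached[tag]
  end
end

cache = ImageCache.new
cache.image_for("ubuntu:14.04")                      # first use: pulled
cache.image_for("ubuntu:14.04")                      # cached: not pulled
cache.image_for("ubuntu:latest", pull_always: true)  # pulled
cache.image_for("ubuntu:latest", pull_always: true)  # pulled again
puts "pulls: #{cache.pulls}"
```

A moving tag such as `latest` is where `pull_always: true` matters most, because the cached copy can silently fall behind the registry.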