s3cmd is a tool for managing S3 objects. It lets you work with S3 without installing the AWS CLI and is often used for backups and restores.
To use s3cmd on EKS, the Pod needs access to S3. In the past, access to S3 was granted by **attaching an IAM Role to the Node** or by **using kube2iam to obtain temporary credentials**. IAM Roles for Service Accounts (IRSA) was released in 2019. The AWS SDKs for each language support it, but s3cmd does not use the SDK, so I implemented the mechanism myself.
- macOS Mojave 10.14.6
- Pulumi 2.1.0
- AWS CLI 1.16.292
- EKS 1.15
- s3cmd 2.1.0
Modify the source code of s3cmd and push the Docker image to ECR.
Download and extract s3cmd with the following commands.
$ wget --no-check-certificate https://github.com/s3tools/s3cmd/releases/download/v2.1.0/s3cmd-2.1.0.tar.gz
$ tar xzvf s3cmd-2.1.0.tar.gz
$ cd s3cmd-2.1.0
The directory structure of s3cmd-2.1.0 is as follows.
├── INSTALL.md
├── LICENSE
├── MANIFEST.in
├── NEWS
├── PKG-INFO
├── README.md
├── S3/
├── s3cmd
├── s3cmd.1
├── s3cmd.egg-info/
├── setup.cfg
└── setup.py
Only `S3/Config.py` is modified. The flow for obtaining S3 access credentials is as follows:

1. Get `AWS_ROLE_ARN` and `AWS_WEB_IDENTITY_TOKEN_FILE` from the environment variables.
2. Read the web identity token from the file and call the STS `AssumeRoleWithWebIdentity` API.
3. Parse the XML response and store the temporary credentials in the s3cmd configuration.

Only the added code is shown below; of it, only the `role_config` function rewrites an existing one.
S3/Config.py
import urllib.request
import urllib.parse
import xml.etree.cElementTree

# (os and re used below are already imported at the top of Config.py)

def _get_url():
    # Build the STS AssumeRoleWithWebIdentity request URL from the IRSA environment variables
    stsUrl = "https://sts.amazonaws.com/"
    roleArn = os.environ.get('AWS_ROLE_ARN')
    path = os.environ.get('AWS_WEB_IDENTITY_TOKEN_FILE')
    with open(path) as f:
        webIdentityToken = f.read()
    params = {
        "Action": "AssumeRoleWithWebIdentity",
        "Version": "2011-06-15",
        "RoleArn": roleArn,
        "RoleSessionName": "s3cmd",
        "WebIdentityToken": webIdentityToken
    }
    url = '{}?{}'.format(stsUrl, urllib.parse.urlencode(params))
    return url

def _build_name_to_xml_node(parent_node):
    # Map child XML elements to a dict keyed by tag name (namespace stripped)
    if isinstance(parent_node, list):
        return _build_name_to_xml_node(parent_node[0])
    xml_dict = {}
    for item in parent_node:
        key = re.compile('{.*}').sub('', item.tag)
        if key in xml_dict:
            if isinstance(xml_dict[key], list):
                xml_dict[key].append(item)
            else:
                xml_dict[key] = [xml_dict[key], item]
        else:
            xml_dict[key] = item
    return xml_dict

def _replace_nodes(parsed):
    # Recursively replace elements that have children with nested dicts, and leaves with their text
    for key, value in parsed.items():
        if list(value):
            sub_dict = _build_name_to_xml_node(value)
            parsed[key] = _replace_nodes(sub_dict)
        else:
            parsed[key] = value.text
    return parsed

def _parse_xml_to_dict(body):
    # Parse the STS XML response body into a nested dict
    parser = xml.etree.cElementTree.XMLParser(target=xml.etree.cElementTree.TreeBuilder(), encoding='utf-8')
    parser.feed(body)
    root = parser.close()
    parsed = _build_name_to_xml_node(root)
    _replace_nodes(parsed)
    return parsed

class Config(object):
    def role_config(self):
        # Fetch temporary credentials via IRSA and store them in the configuration
        url = _get_url()
        req = urllib.request.Request(url, method='POST')
        with urllib.request.urlopen(req) as resp:
            body = resp.read()
        parsed = _parse_xml_to_dict(body)
        Config().update_option('access_key', parsed['AssumeRoleWithWebIdentityResult']['Credentials']['AccessKeyId'])
        Config().update_option('secret_key', parsed['AssumeRoleWithWebIdentityResult']['Credentials']['SecretAccessKey'])
        Config().update_option('access_token', parsed['AssumeRoleWithWebIdentityResult']['Credentials']['SessionToken'])
Let's look at each part.
The function `_get_url` builds the URL for the POST request to the STS API.
Applying IRSA to a Pod creates the environment variables `AWS_ROLE_ARN` and `AWS_WEB_IDENTITY_TOKEN_FILE`. The latter is a file path; the token it contains is read and added to the URL parameters.
def _get_url():
    stsUrl = "https://sts.amazonaws.com/"
    roleArn = os.environ.get('AWS_ROLE_ARN')
    path = os.environ.get('AWS_WEB_IDENTITY_TOKEN_FILE')
    with open(path) as f:
        webIdentityToken = f.read()
    params = {
        "Action": "AssumeRoleWithWebIdentity",
        "Version": "2011-06-15",
        "RoleArn": roleArn,
        "RoleSessionName": "s3cmd",
        "WebIdentityToken": webIdentityToken
    }
    url = '{}?{}'.format(stsUrl, urllib.parse.urlencode(params))
    return url
The function `_build_name_to_xml_node` converts the child elements of an XML node into a dict keyed by tag name (with the namespace stripped); if a tag appears more than once, the values are collected into a list.

def _build_name_to_xml_node(parent_node):
    if isinstance(parent_node, list):
        return _build_name_to_xml_node(parent_node[0])
    xml_dict = {}
    for item in parent_node:
        key = re.compile('{.*}').sub('', item.tag)
        if key in xml_dict:
            if isinstance(xml_dict[key], list):
                xml_dict[key].append(item)
            else:
                xml_dict[key] = [xml_dict[key], item]
        else:
            xml_dict[key] = item
    return xml_dict
The function `_replace_nodes` walks that dict recursively: elements that have children are turned into nested dicts, and leaf elements are replaced by their text.

def _replace_nodes(parsed):
    for key, value in parsed.items():
        if list(value):
            sub_dict = _build_name_to_xml_node(value)
            parsed[key] = _replace_nodes(sub_dict)
        else:
            parsed[key] = value.text
    return parsed
The function `_parse_xml_to_dict` feeds the STS response body to an XML parser and converts the resulting tree into a nested dict using the two helpers above.

def _parse_xml_to_dict(body):
    parser = xml.etree.cElementTree.XMLParser(target=xml.etree.cElementTree.TreeBuilder(), encoding='utf-8')
    parser.feed(body)
    root = parser.close()
    parsed = _build_name_to_xml_node(root)
    _replace_nodes(parsed)
    return parsed
Finally, `role_config` POSTs to the STS endpoint, parses the `AssumeRoleWithWebIdentity` response, and stores the returned `AccessKeyId`, `SecretAccessKey`, and `SessionToken` in the s3cmd configuration as `access_key`, `secret_key`, and `access_token`. This is the only function that replaces an existing one in `Config.py`.

class Config(object):
    def role_config(self):
        url = _get_url()
        req = urllib.request.Request(url, method='POST')
        with urllib.request.urlopen(req) as resp:
            body = resp.read()
        parsed = _parse_xml_to_dict(body)
        Config().update_option('access_key', parsed['AssumeRoleWithWebIdentityResult']['Credentials']['AccessKeyId'])
        Config().update_option('secret_key', parsed['AssumeRoleWithWebIdentityResult']['Credentials']['SecretAccessKey'])
        Config().update_option('access_token', parsed['AssumeRoleWithWebIdentityResult']['Credentials']['SessionToken'])
Compress the modified source as `s3cmd-2.1.0.tar.gz` and place it in the same directory as the `Dockerfile`.
├── Dockerfile
└── s3cmd-2.1.0.tar.gz
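For example, from the directory that contains the modified `s3cmd-2.1.0` source tree, an archive with the layout the `Dockerfile` expects can be created like this (the path is an assumption; adjust it to where you edited the source):
$ tar czf s3cmd-2.1.0.tar.gz s3cmd-2.1.0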
The `Dockerfile` looks like this:
Dockerfile
FROM python:3.8.2-alpine3.11
ARG VERSION=2.1.0
COPY s3cmd-${VERSION}.tar.gz /tmp/
RUN tar -zxf /tmp/s3cmd-${VERSION}.tar.gz -C /tmp && \
    cd /tmp/s3cmd-${VERSION} && \
    python setup.py install && \
    mv s3cmd S3 /usr/local/bin && \
    rm -rf /tmp/*
ENTRYPOINT ["s3cmd"]
CMD ["--help"]
Build the image and push it to ECR. Replace `XXXXXXXXXXXX` with your AWS account ID.
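If the repository does not exist yet, or you are not logged in to ECR, something like the following should be run first (the repository name `s3cmd` and the region are assumptions based on the image tag below; `aws ecr get-login` is the AWS CLI v1 command matching the version listed above):
$ aws ecr create-repository --repository-name s3cmd --region ap-northeast-1
$ $(aws ecr get-login --no-include-email --region ap-northeast-1)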
$ docker build -t XXXXXXXXXXXX.dkr.ecr.ap-northeast-1.amazonaws.com/s3cmd:2.1.0 .
$ docker push XXXXXXXXXXXX.dkr.ecr.ap-northeast-1.amazonaws.com/s3cmd:2.1.0
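As an optional sanity check, running the built image with `--version` should print the s3cmd version, since the entrypoint is `s3cmd`:
$ docker run --rm XXXXXXXXXXXX.dkr.ecr.ap-northeast-1.amazonaws.com/s3cmd:2.1.0 --version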
This time, the whole environment is built with Pulumi. The directory structure is as follows; only `index.ts` and `k8s/s3cmd.yaml` are edited.
├── Pulumi.dev.yaml
├── Pulumi.yaml
├── index.ts *
├── k8s
│ └── s3cmd.yaml *
├── node_modules/
├── package-lock.json
├── package.json
├── stack.json
└── tsconfig.json
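If you are building the project from scratch, a layout like this can be generated with the standard Pulumi TypeScript template, with the extra packages used in `index.ts` installed from npm (these commands are a suggestion, not necessarily how the original project was created):
$ pulumi new aws-typescript
$ npm install @pulumi/awsx @pulumi/eks @pulumi/kubernetes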
Everything other than the Kubernetes manifest file is described in `index.ts`. The EKS cluster must be created with the OpenID Connect provider enabled (`createOidcProvider: true`).
index.ts
import * as aws from "@pulumi/aws";
import * as awsx from "@pulumi/awsx";
import * as eks from "@pulumi/eks";
import * as k8s from "@pulumi/kubernetes";
import * as pulumi from "@pulumi/pulumi";

const vpc = new awsx.ec2.Vpc("custom", {
    cidrBlock: "10.0.0.0/16",
    numberOfAvailabilityZones: 3,
});

// EKS cluster with the OIDC provider enabled (required for IRSA)
const cluster = new eks.Cluster("pulumi-eks-cluster", {
    vpcId: vpc.id,
    subnetIds: vpc.publicSubnetIds,
    deployDashboard: false,
    createOidcProvider: true,
    instanceType: aws.ec2.T3InstanceSmall,
});

// Trust policy that allows the s3-full-access service account to assume the role
const s3PolicyDocument = pulumi.all([cluster.core.oidcProvider?.arn, cluster.core.oidcProvider?.url]).apply(([arn, url]) => {
    return aws.iam.getPolicyDocument({
        statements: [{
            effect: "Allow",
            principals: [
                {
                    type: "Federated",
                    identifiers: [arn]
                },
            ],
            actions: ["sts:AssumeRoleWithWebIdentity"],
            conditions: [
                {
                    test: "StringEquals",
                    // strip the scheme (if present) so the condition key matches the provider
                    variable: url.replace('https://', '') + ":sub",
                    values: [
                        "system:serviceaccount:default:s3-full-access"
                    ]
                },
            ],
        }]
    })
})

const s3FullAccessRole = new aws.iam.Role("s3FullAccessRole", {
    name: "s3-full-access-role",
    assumeRolePolicy: s3PolicyDocument.json,
})

new aws.s3.Bucket("pulumi-s3cmd-test", {
    bucket: "pulumi-s3cmd-test"
});

const s3FullAccessRoleAttachment = new aws.iam.RolePolicyAttachment("s3FullAccessRoleAttachment", {
    role: s3FullAccessRole,
    policyArn: aws.iam.AmazonS3FullAccess,
})

const myk8s = new k8s.Provider("myk8s", {
    kubeconfig: cluster.kubeconfig.apply(JSON.stringify),
});

const s3cmd = new k8s.yaml.ConfigFile("s3cmd", {
    file: "./k8s/s3cmd.yaml"
}, { provider: myk8s })
`k8s/s3cmd.yaml` defines a ServiceAccount and a Deployment. The ServiceAccount must have the `eks.amazonaws.com/role-arn` annotation pointing at the role created above.
s3cmd.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  namespace: default
  name: s3-full-access
  labels:
    app: s3cmd
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::XXXXXXXXXXXX:role/s3-full-access-role
---
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: default
  name: s3cmd
  labels:
    app: s3cmd
spec:
  selector:
    matchLabels:
      app: s3cmd
  replicas: 1
  template:
    metadata:
      labels:
        app: s3cmd
    spec:
      serviceAccountName: s3-full-access
      containers:
      - image: XXXXXXXXXXXX.dkr.ecr.ap-northeast-1.amazonaws.com/s3cmd:2.1.0
        name: s3cmd
        command: ["/bin/sh"]
        args: ["-c", "while true; do echo hello; sleep 10; done"]
All you have to do is deploy with the following command.
$ pulumi up
Confirm that you can run the s3cmd command in the created s3cmd Pod. The S3 bucket created earlier is listed as expected.
$ kubectl get pod
NAME READY STATUS RESTARTS AGE
s3cmd-98985855f-h5lgl 1/1 Running 0 63s
$ kubectl exec -it s3cmd-98985855f-h5lgl -- s3cmd ls
2020-05-02 15:04 s3://pulumi-s3cmd-test
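You can also confirm that the IRSA environment variables described earlier were injected into the Pod; the following should show `AWS_ROLE_ARN` and `AWS_WEB_IDENTITY_TOKEN_FILE` (the Pod name is the one from `kubectl get pod`):
$ kubectl exec -it s3cmd-98985855f-h5lgl -- env | grep AWS_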
I confirmed that an IAM Role can be assigned to the s3cmd Pod with IRSA, without using kube2iam. Considering that kube2iam requires deploying a DaemonSet and adds resources to manage, I think the benefit of IRSA is significant.