I was playing around with boto3 to learn it. I came to think it is better to store the resource information obtained through boto3 in data classes and refer to it from there (the reason is explained later), so I'm writing it down as a note.
In short, boto3 is an SDK for working with AWS resources from Python. It is divided into a low-level API (client) and a high-level API (resource). There are countless other articles about it, so please check them out if you want to know more.
This time we will use the low-level API.
dataclasses is a standard-library module that provides a decorator which automatically adds special methods such as __init__() to a class.
Take a class with the following constructor:
class Wanchan_Nekochan():
    def __init__(self, cat: str, dog: str):
        self.cat = cat
        self.dog = dog
With the dataclass decorator, you can write the same class concisely, without such a constructor:
from dataclasses import dataclass

@dataclass
class Wanchan_Nekochan():
    cat: str
    dog: str
https://docs.python.org/ja/3/library/dataclasses.html
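To make the generated methods concrete, here is a minimal sketch that reuses the Wanchan_Nekochan class from above. The field values ("tama", "pochi") are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Wanchan_Nekochan:
    cat: str
    dog: str


# __init__ is generated from the annotated fields, in declaration order.
pets = Wanchan_Nekochan(cat="tama", dog="pochi")

# __repr__ is generated as well, so printing shows the field values.
print(pets)  # Wanchan_Nekochan(cat='tama', dog='pochi')

# __eq__ is generated too and compares instances field by field.
print(pets == Wanchan_Nekochan("tama", "pochi"))  # True
```

None of these methods had to be written by hand; the annotations alone drive them.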
Compared with named tuples, which serve a similar purpose, there are the following differences:
- A namedtuple is immutable (non-editable) after instantiation. A dataclass is mutable (editable) by default, but becomes immutable if you pass frozen=True.
- A dataclass reads data slightly faster (this still needs verification).
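The mutability difference can be checked with a small sketch. The class names (PetTuple, MutablePets, FrozenPets) and the helper try_rename are hypothetical names introduced only for this illustration.

```python
from collections import namedtuple
from dataclasses import FrozenInstanceError, dataclass

PetTuple = namedtuple("PetTuple", ["cat", "dog"])


@dataclass
class MutablePets:
    cat: str
    dog: str


@dataclass(frozen=True)
class FrozenPets:
    cat: str
    dog: str


def try_rename(obj) -> str:
    """Try to reassign obj.cat and report what happened."""
    try:
        obj.cat = "mike"
    except AttributeError as exc:  # FrozenInstanceError subclasses AttributeError
        return type(exc).__name__
    return "ok"


print(try_rename(PetTuple("tama", "pochi")))     # AttributeError
print(try_rename(MutablePets("tama", "pochi")))  # ok
print(try_rename(FrozenPets("tama", "pochi")))   # FrozenInstanceError
```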
This time, to get the list of S3 buckets, we use the list_buckets() method of the S3.Client class. (Reference) Official documentation: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html#S3.Client.list_buckets
The response is defined as the following dictionary.
{
    'Buckets': [
        {
            'Name': 'string',
            'CreationDate': datetime
        },
    ],
    'Owner': {
        'DisplayName': 'string',
        'ID': 'string'
    },
    "ResponseMetadata": {
        "RequestId": str,
        "HTTPStatusCode": int,
        "HTTPHeaders": dict,
        "RetryAttempts": int
    }
}
When you read information from the returned response, you specify each key as a string, as shown below.
import boto3

session = boto3.session.Session(profile_name='s3_test')
s3_client = session.client('s3')
data = s3_client.list_buckets()
status_code = data["ResponseMetadata"]["HTTPStatusCode"]
display_name = data["Owner"]["DisplayName"]
Because the keys are strings, IDE autocompletion does not work, so there is a risk of misspelling them, and you cannot tell what type a value has without looking it up in the reference.
If you define the response as data classes in advance, you get dot access, so autocompletion works and static type checking with mypy can be applied rigorously.
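A small sketch of the difference, using a hand-written dictionary shaped like the Owner part of the response. The values and the Owner class name are made up for illustration; no AWS call is involved.

```python
from dataclasses import dataclass

# Hand-written stand-in for part of a list_buckets() response (values are made up).
sample = {"Owner": {"DisplayName": "nikujaga-kun", "ID": "example-id"}}

# A typo in a string key is only discovered when this line actually runs.
try:
    sample["Owner"]["DisplyName"]
except KeyError as exc:
    print("KeyError:", exc)  # KeyError: 'DisplyName'


@dataclass
class Owner:
    DisplayName: str
    ID: str


owner = Owner(**sample["Owner"])

# The attribute names are declared on the class, so an IDE can complete them
# and a checker such as mypy can report a misspelling like owner.DisplyName
# without running the program.
print(owner.DisplayName)  # nikujaga-kun
```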
There is no reason not to use dataclass!!!
So...
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List

import boto3
from dacite import from_dict

session = boto3.session.Session(profile_name='s3_test')
# The BaseClient is kept in global scope to avoid calling the API multiple times
# (it is used like a singleton)
s3_client = session.client('s3')


@dataclass
class Boto3_Response():
    RequestId: str
    HostId: str
    HTTPStatusCode: int
    HTTPHeaders: Dict
    RetryAttempts: int


@dataclass
class Inner_Owner():
    DisplayName: str
    ID: str


@dataclass
class Inner_Buckets():
    Name: str
    CreationDate: datetime


@dataclass
class S3_LIST():
    ResponseMetadata: Boto3_Response
    Owner: Inner_Owner
    Buckets: List[Inner_Buckets]

    @classmethod
    def make_s3_name_list(cls):
        # dacite converts the nested dictionaries into the nested data classes
        return from_dict(data_class=cls, data=s3_client.list_buckets())


s3_list_response = S3_LIST.make_s3_name_list()
# Status code
print(s3_list_response.ResponseMetadata.HTTPStatusCode)  # 200
# Owner name
print(s3_list_response.Owner.DisplayName)  # nikujaga-kun
# ID
print(s3_list_response.Owner.ID)
# Bucket list
print(*[bucket.Name for bucket in s3_list_response.Buckets])
from dacite import from_dict
Here I import a third-party library called dacite. dacite (I read it as "de") is, simply put, a library for passing dictionaries to nested data classes and instantiating them. https://pypi.org/project/dacite/
@dataclass
class Boto3_Response():
    RequestId: str
    HostId: str
    HTTPStatusCode: int
    HTTPHeaders: Dict
    RetryAttempts: int

@dataclass
class Inner_Owner():
    DisplayName: str
    ID: str

@dataclass
class Inner_Buckets():
    Name: str
    CreationDate: datetime

@dataclass
class S3_LIST():
    ResponseMetadata: Boto3_Response
    Owner: Inner_Owner
    Buckets: List[Inner_Buckets]

    @classmethod
    def make_s3_name_list(cls):
        return from_dict(data_class=cls, data=s3_client.list_buckets())
In the definition of S3_LIST above, other data classes (Boto3_Response, Inner_Owner, Inner_Buckets) are used as attribute types. If you pass the response of list_buckets to S3_LIST as-is, without dacite's from_dict, the method looks like this:
    @classmethod
    def make_s3_name_list(cls):
        return cls(**s3_client.list_buckets())
The instance is created, but the nested values are stored as plain dictionaries rather than class instances, so accessing an attribute on them fails because a dict has no such attribute.
s3_list_response = S3_LIST.make_s3_name_list()
# An AttributeError is raised here
print(s3_list_response.ResponseMetadata.HTTPStatusCode)
# 'dict' object has no attribute 'HTTPStatusCode'
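The same failure can be reproduced without AWS credentials. This is a minimal sketch with a hand-written dictionary; the class names Metadata and Response are hypothetical, cut-down stand-ins for the classes above.

```python
from dataclasses import dataclass


@dataclass
class Metadata:
    HTTPStatusCode: int


@dataclass
class Response:
    ResponseMetadata: Metadata


# Hand-written stand-in for an API response (not real boto3 output).
raw = {"ResponseMetadata": {"HTTPStatusCode": 200}}

# Data classes do not validate or convert field values, so the inner dict
# is stored as-is despite the Metadata annotation.
response = Response(**raw)
print(type(response.ResponseMetadata).__name__)  # dict

try:
    response.ResponseMetadata.HTTPStatusCode
except AttributeError as exc:
    print(exc)  # 'dict' object has no attribute 'HTTPStatusCode'
```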
Handling this nested conversion properly with only the built-in features would be tedious, so I use dacite, a library that takes care of it. You should always stand on the shoulders of giants.
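To give a feel for what doing it by hand involves, here is a deliberately simplified sketch of a recursive converter. It is not how dacite is implemented; the function name build and the classes Bucket and Listing are hypothetical. It assumes every field is either a plain value, a nested data class, or a parameterized List[...] of those, and it does no type checking and no handling of Optional, Union, or missing keys.

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import Any, List, get_args, get_origin, get_type_hints


def build(tp: Any, value: Any) -> Any:
    """Recursively turn dicts (and lists of dicts) into data class instances."""
    if is_dataclass(tp) and isinstance(value, dict):
        hints = get_type_hints(tp)
        # Only declared fields are read, so extra keys in the dict are ignored.
        kwargs = {f.name: build(hints[f.name], value[f.name]) for f in fields(tp)}
        return tp(**kwargs)
    if get_origin(tp) is list:
        (item_type,) = get_args(tp)  # assumes a parameterized List[...]
        return [build(item_type, item) for item in value]
    return value  # plain values are passed through without any type check


@dataclass
class Bucket:
    Name: str


@dataclass
class Listing:
    Buckets: List[Bucket]


listing = build(Listing, {"Buckets": [{"Name": "a"}, {"Name": "b"}], "Extra": 1})
print([bucket.Name for bucket in listing.Buckets])  # ['a', 'b']
```

Even this toy version needs type introspection and recursion, and it still skips most of the cases a real response can contain, which is the argument for leaving the job to a library.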
Data classes are great.