Amazon Timestream, a fully managed time series database, became generally available on September 30, so let's try it out. I've known about time series databases for a long time, but I've never actually used one, so I'm looking forward to this.
It's so new that neither CloudFormation nor Terraform supports it yet, so this time I'll work from the console. Note that it isn't available in the Tokyo region yet either, so point the console at a region where it is available.
Well, I wasn't sure what I was doing when I tried to set it up on my own, and in a case like this the standard practice is to follow along with the [tutorial](https://aws.amazon.com/jp/blogs/news/store-and-access-time-series-data-at-any-scale-with-amazon-timestream-now-generally-available/).
Press the "Create database" button on the following screen of Timestream.
Then, set the database name on the opened database creation screen.
If you leave the KMS settings blank, the key will be created without permission. Set the tag as you like and press the "Create database" button.
Creation completed!
Now, click the link for the database name we just created. There is a "Create table" button on the database details screen, so press it.
Then set a table name on the table creation screen that opens.
Since this is just a trial, I set the data storage (retention) settings by the book.
Set the tag as you like and press the "Create table" button.
Table creation is complete!
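By the way, the console was only necessary because CloudFormation and Terraform aren't supported yet; the AWS CLI can do the same thing. A minimal sketch, reusing this article's database/table names (the retention values here are just examples, adjust to taste):

```console
$ aws timestream-write create-database --database-name xxxxx-test-timestream
$ aws timestream-write create-table --database-name xxxxx-test-timestream \
    --table-name xxxxx-test-table \
    --retention-properties MemoryStoreRetentionPeriodInHours=24,MagneticStoreRetentionPeriodInDays=7
```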
Making it exactly like the tutorial is no fun, though, so let's register data output by Locust instead.
The data to be registered is as follows.
```csv
Timestamp,User Count,Type,Name,Requests/s,Failures/s,50%,66%,75%,80%,90%,95%,98%,99%,99.9%,99.99%,100%,Total Request Count,Total Failure Count,Total Median Response Time,Total Average Response Time,Total Min Response Time,Total Max Response Time,Total Average Content Size
1603535373,20,GET,/xxxxx/,1.000000,0.000000,5,6,6,6,8,9,9,9,9,9,9,16,0,4.11685699998543,5.413748562499876,4.11685699998543,9.385663000045952,14265.0
```
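For reference, this is Locust's stats history CSV. In recent Locust versions a run along these lines should produce it (the locustfile and host are placeholders; I believe `--csv-full-history` is needed to get the per-endpoint rows rather than only the Aggregated ones):

```console
$ locust -f locustfile.py --host https://xxxxx.example.com --headless \
    --csv example --csv-full-history
```

This writes `example_stats_history.csv`, which is what we feed to the script below.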
I loaded it with the Python script below. You may not be familiar with dimensions; in short, you can think of them as attribute information used for classification. This time I defined the HTTP resource and method as attributes.
```python
import sys
import csv
import time    # unused here; carried over from the official sample
import boto3
import psutil  # unused here; carried over from the official sample
from botocore.config import Config

FILENAME = sys.argv[1]
DATABASE_NAME = "xxxxx-test-timestream"
TABLE_NAME = "xxxxx-test-table"


def write_records(records):
    try:
        result = write_client.write_records(DatabaseName=DATABASE_NAME,
                                            TableName=TABLE_NAME,
                                            Records=records,
                                            CommonAttributes={})
        status = result['ResponseMetadata']['HTTPStatusCode']
        print("Processed %d records. WriteRecords Status: %s" %
              (len(records), status))
    except Exception as err:
        print("Error:", err)


if __name__ == '__main__':
    session = boto3.Session()
    write_client = session.client('timestream-write', config=Config(
        read_timeout=20, max_pool_connections=5000, retries={'max_attempts': 10}))
    # query_client comes in handy later, when we read the data back
    query_client = session.client('timestream-query')

    with open(FILENAME) as f:
        reader = csv.reader(f, quoting=csv.QUOTE_NONE)
        for csv_record in reader:
            # Skip the header row and Locust's "Aggregated" summary rows
            if csv_record[0] == 'Timestamp' or csv_record[3] == 'Aggregated':
                continue
            ts_records = []
            # Indices follow the CSV header above: 4 = Requests/s, 11 = 95%,
            # 19 = Total Median Response Time, 20 = Total Average Response Time
            ts_columns = [
                {'MeasureName': 'Requests/s', 'MeasureValue': csv_record[4]},
                {'MeasureName': '95Percentile Response Time', 'MeasureValue': csv_record[11]},
                {'MeasureName': 'Total Median Response Time', 'MeasureValue': csv_record[19]},
                {'MeasureName': 'Total Average Response Time', 'MeasureValue': csv_record[20]},
            ]
            for ts_column in ts_columns:
                ts_records.append({
                    # Locust timestamps are epoch seconds; Timestream defaults
                    # to milliseconds, hence * 1000
                    'Time': str(int(csv_record[0]) * 1000),
                    'Dimensions': [{'Name': 'resource', 'Value': csv_record[3]},
                                   {'Name': 'method', 'Value': csv_record[2]}],
                    'MeasureName': ts_column['MeasureName'],
                    'MeasureValue': ts_column['MeasureValue'],
                    'MeasureValueType': 'DOUBLE'
                })
            write_records(ts_records)
```
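Run it with the CSV path as the argument; the script name here is just what I happened to save it as. Each Locust row fans out into four Timestream records, one per measure, so on success you should see output along these lines:

```console
$ python3 load_locust.py example_stats_history.csv
Processed 4 records. WriteRecords Status: 200
...
```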
However, this feature has only just been released, so some people may be on a boto3 version that's too old to support it.
So, let's check whether boto3 is the latest version.

```console
$ pip list -o
Package               Version  Latest     Type
--------------------- -------- ---------- -----
boto3                 1.13.26  1.16.4     wheel
```
Update it with pip's `-U` option.

```console
$ pip install -U boto3
```
Also, use `aws configure` to point the default region at the region where you created the database above.
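For the record, that step looks like this; the region below is just an example, use the one where you created the database:

```console
$ aws configure
AWS Access Key ID [None]: AKIA................
AWS Secret Access Key [None]: ....................
Default region name [None]: us-east-1
Default output format [None]: json
```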
If psutil is not installed, install it as follows.

```console
$ yum install python3-devel
$ pip3 install psutil
```
I expect it will be fixed soon, but as of October 25, 2020, the command name in the official blog linked above is wrong, so if you trust the blog and run pip3 as written, the install will fail and you will feel sad.
Now then, did the data load safely?
Select "Query editor" from the menu on the left to open the screen below, and run SQL that narrows things down by the attributes we set earlier. I want to know the average response time of GET requests to /xxxxx/!
When I executed it, only the information I wanted was extracted!
To get this as raw data, you fetch it again with the CLI or boto3, which is a bit of a hassle because a paginator is required. For small amounts of data it would be easier to just use pandas, but in a real usage scenario the whole point is that information collected from thousands of servers can be retrieved quickly, so the volume shouldn't be something you could format locally with pandas anyway. Timestream's true value is that, combined with Grafana, it can be used for real-time monitoring.
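A minimal sketch of the boto3 route, assuming the `query_client` created in the loader script and the same SQL as above:

```python
# Timestream query results come back in pages, so use the paginator
# rather than calling query_client.query() once.
query = '''
SELECT time, measure_value::double AS avg_response_time
FROM "xxxxx-test-timestream"."xxxxx-test-table"
WHERE measure_name = 'Total Average Response Time'
  AND method = 'GET' AND resource = '/xxxxx/'
ORDER BY time
'''

paginator = query_client.get_paginator('query')
for page in paginator.paginate(QueryString=query):
    # Cells in each row line up with page['ColumnInfo']
    for row in page['Rows']:
        print([cell.get('ScalarValue') for cell in row['Data']])
```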