[Ruby] [AWS ElastiCache (Memcached)] Caching queries to RDS

6 minute read

Target

-Build a system that caches queries to RDS using AWS ElastiCache (Memcached) (*).
-The caching method is constructed according to the cache strategy (described later).

Introduction

Since I built the connection of EC2⇔RDS and EC2⇔ElastiCache in the previous articles,
This time, I wrote a program that uses ElastiCache to cache queries to RDS with almost a portfolio.
The language used was Ruby.
It is an implementation that I have never experienced in practice, so please comment if something strange happens lol

What is a cash strategy (*)?

A best practice recommended by AWS when using ElastiCache as a cache.
It is necessary to divide into the following two methods and select a strategy that suits the system use case.
** This time, we will implement using lazy loading. ** **

** ・ Lazy loading **
A method that refers to the cache when reading data, accesses the data source only when there is no hit, acquires the necessary data, and writes it to the cache.
⇒ It is possible to reduce the memory usage of the cache, but the cache data may be old (because the cache is rewritten only when a cache miss occurs).

** ・ Write through **
A method of writing to the cache every time data is written
⇒Although the memory usage of the cache will increase, the latest cache data can always be obtained.

  • For more details, refer to the AWS document.
    ** Cash Strategy **
    https://docs.aws.amazon.com/ja_jp/AmazonElastiCache/latest/mem-ug/Strategies.html

Premise

-The connection between EC2 and RDS (MySQL) has been established (* 1).
-The connection between EC2 and ElactiCache (Memcached) has been established (* 2).

System environment

・ EC2
OS(AMI) : Amazon Linux 2 AMI (HVM), SSD Volume Type
Software: ** Self-made program using Ruby **

・ RDS
Engine: ** MySQL **

・ ElastiChache
Engine: ** Memcached **

Completion flow

The following flow when the cache hits

① Throw a search query from EC2
(2) ElaElastiCache returns the query result and the communication ends.
ElastiCache1 (3).png

The following flow when a cache miss occurs

① Throw a search query from EC2
② A cache miss is returned from ElastiCache
③ Issue SQL query to RDS
④ SQL query result is returned from RDS
⑤ Write the obtained query result to ElastiCache
Copy of ElastiCache2 (2).png

Workflow

|Item number|title|
|:—-:|:————-|
| 1 | Deploy |
| 2 | Operation verification|

procedure

1. Deploy

** ① OS login to EC2 **

** ② Install Ruby execution environment **


$ sudo yum install ruby

** ③ Install gem (Ruby library) of Memocache client for Ruby (*) **


$ gem install Ruby-MemCache

** ③ Install the Mysql client gem (*) **
Install the library required to use the Mysql client gem
(The following is the procedure when using Amazon Linux 2 of EC2. The required library may be different for other distributions.)


$ sudo yum -y install ruby-devel

sudo yum groupinstall "Development Tools"

sudo yum install mysql-devel

Mysql client gem installation


gem install mysql2

** ④ Deploy your own Ruby script (*) on EC2 **
Rewrite <Elasticache_endpoint>, <rds_endpoint>, <db_login_user>, <db_login_password>, <db_name> as appropriate


File name: rds_cache.rb
# **********************************************************************************

#Functional overview: Use AWS ElastiCache (Memcached) to cache query results to RDS
#Function details: Issue a query to ElastiCache and return its value if the cache exists.
#If the cache does not exist, the data source RDS(MySQL)After accessing and displaying the result, cache it in ElastiCache.
#Script usage: ruby<Script path> "<Search SQL query>"

# **********************************************************************************

unless ARGV.size() == 1
    puts "The number of arguments is incorrect."
    exit
end

#package
require 'base64'
require 'memcache'
require 'mysql2'

#variable
sql_query = ARGV[0]                                                     #Execute SQL query
cache_host = "<Elasticache_endpoint>"                                   #Elasticache endpoint
cache_port = 11211                                                      #Elasticache port number
db_host = "<rds_endpoint>"                                              #RDS endpoint
db_user = "<db_login_user>"                                             #DB login user
db_password = "<db_login_password>"                                     #DB password
db_name = "<db_name>"                                                   #Database name

#Base64 encode SQL query (whitespace removal, lowercase conversion) (use as cache key)
encoded_query = Base64.encode64(sql_query.gsub(" ", "").downcase)

#MemCache, instance creation for Mysql connection
memc_connect = MemCache::new "#{cache_host}:#{cache_port}"
db_connect = Mysql2::Client.new(host: db_host, username: db_user, password: db_password, database: db_name)

#Get cache from Elacache
cache_outcome = memc_connect[encoded_query]

if !cache_outcome[0].nil?
    puts "Cache HIT!"
    puts "[Query results from cache]"
    puts cache_outcome[0]
else
    puts "Cache MISS"
    puts "[Query results from datasource]"

    #In case of cache miss, issue SQL query to database
    sql_outcome = db_connect.query(sql_query) 

    cache_val = ""
    for row in sql_outcome do
        puts "--------------------"
        cache_val = cache_val + "--------------------\n"
        for key, value in row do
            puts "#{key} => #{value}"

            cache_val = cache_val + "#{key} => #{value}\n"
        end
    end

    #Set value in Elasticache
    memc_connect[encoded_query] = cache_val
end
  • Implementation policy is as follows
    -Script execution with SQL query as an argument
    -After removing spaces and converting to lowercase letters, the SQL query specified as an argument is encoded by Base64 and used for cache search and saving as an Elasticache key.
    -If a cache hit occurs, the result is output and the script ends.
    -If a cache miss occurs, access RDS (data source), execute SQL query, and output the result. Finally, save the result in Elasticache.

2. Operation verification

DB data for verification


+----+-----------+
| id | Name      |
+----+-----------+
|  1 | Ryosuke   |
|  2 | Tomoharu  |
|  3 | ryosuke   |
|  4 | shunsuke  |
|  5 | sato      |
|  6 | sato      |
|  7 | ryOsuke   |
|  8 | Kawashima |
|  9 | tomoharu  |
| 10 | RYOSUKE   |
+----+-----------+

Execution SQL query and expected result
tempsnip.png

** ① Query first execution (cache miss pattern) **
OK because the Cache MISS message is output and the query result is displayed properly from the data source (RDS).


$ ruby rds_cache.rb "SELECT * FROM test_table WHERE name = 'Ryosuke';"
Cache MISS
[Query results from datasource]
--------------------
id => 1
Name => Ryosuke
--------------------
id => 3
Name => ryosuke
--------------------
id => 7
Name => ryOsuke
--------------------
id => 10
Name => RYOSUKE

** ② Query re-execution (cache hit pattern 1) **
Cache HIT! Is output and Elasticache returns an appropriate query result, so it’s OK.


[[email protected] ~]$ ruby rds_cache.rb "SELECT * FROM test_table WHERE name = 'Ryosuke';"
Cache HIT!
[Query results from cache]
--------------------
id => 1
Name => Ryosuke
--------------------
id => 3
Name => ryosuke
--------------------
id => 7
Name => ryOsuke
--------------------
id => 10
Name => RYOSUKE

** ③ Re-execute a query that is lowercase and tampered with spaces (cache hit pattern 2) **


[[email protected] ~]$ ruby rds_cache.rb "select *            from test_table where name = 'Ryosuke';"
Cache HIT!
[Query results from cache]
--------------------
id => 1
Name => Ryosuke
--------------------
id => 3
Name => ryosuke
--------------------
id => 7
Name => ryOsuke
--------------------
id => 10
Name => RYOSUKE

Impressions

Originally, the purpose of this system is to use cache to speed up query response and reduce database load.
It’s hard to say that the amount of data is too small to confirm the performance benefits … lol
I’m hoping that if I have some time, I’ll be able to check it lightly.