I previously wrote an article along these lines.
At that time I used DBCP2 for the connection pool, because I was migrating from an on-premises Tomcat 8 + MySQL 5.5 environment to a Tomcat 8 + Aurora (MySQL-compatible) cluster on AWS EC2, but I ran into various problems and tuning it was difficult.
This time I replace it with HikariCP (2.7.6) and see what changes.
In a nutshell, DBCP2 is **"just slow"**. In particular, the process of acquiring (borrowing) a connection from the pool is slow, so it is easy to fall into the kind of situation described in that article.
HikariCP is a connection pool that is fast, simple, and reliable. Its introduction page shows how to pull it in with Maven, and since many people probably use it in combination with Hibernate, I will skip the detailed explanation.
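For reference, the Maven coordinates of the version used here are below (groupId/artifactId as published on Maven Central). Note that for the JNDI factory used later to work, the HikariCP jar and its SLF4J dependency also need to be on Tomcat's classpath, e.g. in `$CATALINA_HOME/lib`.

```xml
<!-- HikariCP for Java 8 (the version tested in this article) -->
<dependency>
    <groupId>com.zaxxer</groupId>
    <artifactId>HikariCP</artifactId>
    <version>2.7.6</version>
</dependency>
```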
Here I incorporate it into a Web application (running on Tomcat 8) in driver-based form, using a JNDI DataSource factory.
**DBCP2 connection definition (context.xml):**

```xml
<Context reloadable="true" crossContext="true">
  <Resource
    name="jdbc/[Connection pool name]"
    auth="Container"
    type="javax.sql.DataSource"
    driverClassName="com.mysql.jdbc.Driver"
    factory="org.apache.commons.dbcp2.BasicDataSourceFactory"
    url="jdbc:mysql://[Cluster endpoint]/[DB name]?useUnicode=true&amp;characterEncoding=utf8&amp;connectTimeout=10000&amp;socketTimeout=180000&amp;alwaysSendSetIsolation=false&amp;cacheServerConfiguration=true&amp;elideSetAutoCommits=true&amp;useLocalSessionState=true"
    username="[Connected user name]"
    password="[Connection password]"
    maxTotal="300"
    maxIdle="150"
    minIdle="10"
    initialSize="10"
    testOnBorrow="true"
    validationQuery="select concheck.validation()"
    validationQueryTimeout="10"
    timeBetweenEvictionRunsMillis="60000"
    maxWaitMillis="10000" />
</Context>
```
**HikariCP connection definition (context.xml):**

```xml
<Context reloadable="true" crossContext="true">
  <Resource
    name="jdbc/[Connection pool name]"
    auth="Container"
    type="javax.sql.DataSource"
    driverClassName="com.mysql.jdbc.Driver"
    factory="com.zaxxer.hikari.HikariJNDIFactory"
    jdbcUrl="jdbc:mysql://[Cluster endpoint]/[DB name]?useUnicode=true&amp;characterEncoding=utf8&amp;connectTimeout=10000&amp;socketTimeout=180000&amp;alwaysSendSetIsolation=false&amp;cacheServerConfiguration=true&amp;elideSetAutoCommits=true&amp;useLocalSessionState=true"
    dataSource.implicitCachingEnabled="true"
    dataSource.user="[Connected user name]"
    dataSource.password="[Connection password]"
    minimumIdle="10"
    maximumPoolSize="300"
    connectionTimeout="10000"
    idleTimeout="180000"
    maxLifetime="180000"
    connectionInitSql="select concheck.validation()"
    validationTimeout="9800" />
</Context>
```
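In either case the application sees only a `javax.sql.DataSource` obtained through JNDI, so swapping pools requires no code changes. A minimal lookup sketch (the resource name is the placeholder from the definitions above):

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import javax.naming.InitialContext;
import javax.naming.NamingException;
import javax.sql.DataSource;

public class PoolLookup {
    static DataSource lookup() throws NamingException {
        // "java:comp/env" is the JNDI prefix under which Tomcat exposes
        // the <Resource> entries defined in context.xml.
        InitialContext ctx = new InitialContext();
        return (DataSource) ctx.lookup("java:comp/env/jdbc/[Connection pool name]");
    }

    static void ping(DataSource ds) throws SQLException {
        // try-with-resources returns the connection to the pool promptly,
        // which matters for the borrow-wait behavior measured below.
        try (Connection conn = ds.getConnection();
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT 1")) {
            while (rs.next()) {
                System.out.println(rs.getInt(1));
            }
        }
    }
}
```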
For the validation query used for failover support, `connectionTestQuery` is deprecated (see the article above for details), so I wrote it in `connectionInitSql` instead. Even when placed there, if the query returns an error (= the pool has connected to a Reader instance while in read/write mode), HikariCP closes the connection and reconnects.
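The body of `concheck.validation()` is not shown here, but the idea is a query that fails when it runs on a Reader. On Aurora (MySQL-compatible), one way to express the same check, sketched hypothetically on the application side, is to look at `@@innodb_read_only`, which is 1 on replica instances:

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public final class WriterCheck {
    private WriterCheck() {}

    // Hypothetical equivalent of the validation query: throw if this
    // connection points at an Aurora Reader, so the pool discards it and
    // opens a fresh one (which should then reach the new Writer).
    static void assertWriter(Connection conn) throws SQLException {
        try (Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT @@innodb_read_only")) {
            if (rs.next() && rs.getInt(1) == 1) {
                throw new SQLException("Connected to a Reader instance");
            }
        }
    }
}
```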
Using JMeter, I sent 3,000 requests per minute (750 threads) to the Web application for 45 minutes, and out of the various screen requests I extracted three whose standalone ("raw") processing times are short, medium, and long.
The unit is milliseconds. Note that because the amount of data varies greatly per thread (user), the SQL processing time of the "long" screen in particular fluctuates even standalone (up to about 8.5 seconds). The other conditions are as follows:

- 4 Web servers (EC2 r4.large / Amazon Linux 2017.09 / 2 per AZ / Tomcat 8.0.47)
- Aurora: db.r4.2xlarge as the main DB, 2 clusters with one Writer and one Reader each, Ver. 1.16
- Results aggregated by JMeter (2 JMeter clients)
**DBCP2:**

| Request (screen) | Average | Median | 90th percentile | Minimum | Maximum |
|---|---|---|---|---|---|
| Short-processing screen | 60 | 49 | 78 | 24 | 22,832 |
| Medium-processing screen | 234 | 177 | 332 | 86 | 21,607 |
| Long-processing screen | 728 | 171 | 1,827 | 45 | 24,557 |
**HikariCP:**

| Request (screen) | Average | Median | 90th percentile | Minimum | Maximum |
|---|---|---|---|---|---|
| Short-processing screen | 46 | 45 | 57 | 22 | 347 |
| Medium-processing screen | 172 | 158 | 242 | 83 | 3,087 |
| Long-processing screen | 676 | 146 | 1,703 | 41 | 9,524 |
HikariCP is faster across the board, but **what stands out is the maximums**.
With DBCP2, the maximums for short, medium, and long are all in the 20-second range. This is not because the DB server (Aurora) slows down when SQL is executed in parallel; it indicates that threads spend time waiting, affected by other threads' processing, to acquire (borrow) a connection from the pool.
With HikariCP, on the other hand, at this level of request volume responses come back within roughly **"raw SQL processing time + α"**.
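This borrow wait also has an explicit failure mode: a thread that cannot obtain a connection within `connectionTimeout` (HikariCP) or `maxWaitMillis` (DBCP2) receives an exception instead of waiting forever. A minimal sketch of handling that on the HikariCP side, which throws `SQLTransientConnectionException` on a borrow timeout:

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.SQLTransientConnectionException;
import javax.sql.DataSource;

public class BorrowTimeout {
    // Runs one unit of work, distinguishing "pool saturated" from other errors.
    static void runQuery(DataSource ds) {
        try (Connection conn = ds.getConnection()) {
            // ... execute SQL here ...
        } catch (SQLTransientConnectionException e) {
            // HikariCP: no connection became free within connectionTimeout
            // (10,000 ms in the definition above) - the pool is saturated.
            System.err.println("Borrow timed out: " + e.getMessage());
        } catch (SQLException e) {
            System.err.println("Query failed: " + e.getMessage());
        }
    }
}
```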
With DBCP2, reconnection after a failover tends to get stuck as soon as the application is under heavy load. When I triggered a failover with HikariCP under the load described above, reconnection completed in a short time, as shown in the figure below.
The difference is hardly noticeable when the load is low, but for a system under a certain amount of load, the choice of connection pool changes not only performance but also the headroom in the number of connections (the range over which it scales) and the stability of failover behavior.
In particular, when migrating from on-premises to AWS, or when moving only the DB to Aurora, the communication latency between the SQL-issuing client (the Web server) and the DB grows, so things like **"we never had a problem with DBCP2 before, but as soon as we migrated, the number of connections stopped scaling and everything got stuck"** tend to happen. Careful verification is needed when migrating.