[Java] Use INSERT … ON DUPLICATE KEY UPDATE in MySQL to conveniently insert/update records

8 minute read

Overview

The MySQL INSERT statement has the syntax “INSERT … ON DUPLICATE KEY UPDATE”. This is much more convenient than writing the INSERT and UPDATE statements separately if you want to insert or update records if there are duplicates. In this article, I will summarize how to use such “INSERT … ON DUPLICATE KEY UPDATE”, useful points, and other specifications.

Preparation

Use the following table.

select version();
+-----------+
| version() |
+-----------+
| 8.0.17 |
+-----------+

CREATE TABLE users (
    id INT(8) NOT NULL,
    name VARCHAR(128),
    age INT(3),
    registered_timestamp timestamp not null default current_timestamp,
    updated_timestamp timestamp not null default current_timestamp on update current_timestamp,
    primary key(id)
);

INSERT INTO users (id, name, age) VALUES(1,'taro', 20), (2,'jiro', 18);

+----+------+------+----------------------+------- --------------+
| id | name | age | registered_timestamp | updated_timestamp |
+----+------+------+----------------------+------- --------------+
| 1 | taro | 20 | 2020-06-05 11:05:52 | 2020-06-05 11:05:52 |
2 | jiro | 18 | 2020-06-05 11:05:52 | 2020-06-05 11:05:52 |
+----+------+------+----------------------+------- --------------+

Basic usage

For example, suppose you want to add a record with id:2 to this table, but you want to rewrite it if it already exists. First, check if the record with id: “2” exists, and if it does not exist, insert it. If it does, you will submit an UPDATE statement. The required SQL is as follows.

SELECT * FROM users where id = 2;

- -If the result is not returned
INSERT INTO users (id, name, age) values (2,'jiro', 19);

- -When the results come back
UPDATE users SET age = 19 where id = 2;

Let’s try building and executing this query using placeholders from Java.

//Get a Connection instance connection with MySQL

String sqlSelect = "SELECT * FROM users where id = ?;";
PreparedStatement psSelect = connection.prepareStatement(sqlSelect);
psSelect.setInt(1, 2);
ResultSet rs = psSelect.excuteQuery();

if(!rs.next()){
    String sqlInsert = "INSERT INTO users (id, name, age) values (?, ?, ?);";
    PreparedStatement psInsert = connection.prepareStatement(sqlInsert);
    psInsert.setInt(1, 2);
    psInsert.setString(2, "jiro");
    psInsert.setInt(3, 19);
    psInsert.excute();
} else {
    String sqlUpdate = "UPDATE users SET age = ?where id = ?;";
    PreparedStatement psUpdate = connection.prepareStatement(sqlUpdate);
    psUpdate.setInt(1, 19);
    psUpdate.setInt(2, 2);
    psUpdate.excute();
}
connection.commit();

It is quite troublesome.

“INSERT … ON DUPLICATE KEY UPDATE” does such conditional branch on the query,

  • Whether there is a target record (SELECT statement)
  • Insert if not (INSERT statement)
  • Update if there is (UPDATE statement)

Will combine all three queries into one.

INSERT INTO users (id, name, age) values (2,'jiro', 19) ON DUPLICATE KEY UPDATE age = 19;

+----+------+------+----------------------+------- --------------+
| id | name | age | registered_timestamp | updated_timestamp |
+----+------+------+----------------------+------- --------------+
| 1 | taro | 20 | 2020-06-05 11:05:52 | 2020-06-05 11:05:52 |
| 2 | jiro | 19 | 2020-06-05 11:05:52 | 2020-06-05 11:06:24 |
+----+------+------+----------------------+------- --------------+

In this case, the id column is the primary key, so if there is a duplicate id in the “INSERT INTO” part, it will automatically update that record. If you assemble this from Java

String sql = "INSERT INTO users (id, name, age) values (?, ?, ?) "+
                "ON DUPLICATE KEY UPDATE age = ?;";
PreparedStatement ps = connection.prepareStatement(sql);
ps.setInt(1, 2);
ps.setString(2, "jiro");
ps.setInt(3, 19);
ps.setInt(4, 2);
ps.excute();
connection.commit();

It is absolutely easy.

Using the VALUES() function

Furthermore, using a function called VALUES() that can be used only with “ON DUPLICATE KEY UPDATE” makes query construction easier. You can use this function as follows to refer to the value you tried to insert in the INSERT part.

- -VALUES(age) returns 19
INSERT INTO users (id, name, age) values (2,'jiro', 19) ON DUPLICATE KEY UPDATE age = VALUES(age);

In addition, this function allows you to insert/update multiple rows with “ON DUPLICATE KEY UPDATE”.

- -VALUES(age) returns 20 for id:"1" line and 17 for id:"3" line
INSERT INTO users (id, name, age) values (1,'taro', 20), (3,'saburo', 17) ON DUPLICATE KEY UPDATE age = VALUES(age);

+----+--------+------+----------------------+----- ----------------+
| id | name | age | registered_timestamp | updated_timestamp |
+----+--------+------+----------------------+----- ----------------+
| 1 | taro | 20 | 2020-06-05 11:05:52 | 2020-06-05 11:05:52 |
| 2 | jiro | 19 | 2020-06-05 11:05:52 | 2020-06-05 11:06:24 |
| 3 | saburo | 17 | 2020-06-05 11:07:01 | 2020-06-05 11:07:01 |
+----+--------+------+----------------------+----- ----------------+

As above, if there is a duplicate primary key, it can be updated according to the value of age that was tried to be inserted in each record.

Also, as you can see, when you try to execute UPDATE due to duplicate keys, if the value you are trying to update and the existing value all match, UPDATE is not executed and “ON UPDATE” is added. Columns are not updated either. (Same as the specifications of UPDATE itself)

If you try to assemble a query programmatically without using “ON DUPLICATE KEY UPDATE” for this operation, you need to loop the pattern itself that creates a conditional branch using the above SELECT statement multiple times. However, if you use the VALUES() function with “ON DUPLICATE KEY UPDATE”, you only need to loop the operation of assigning to the placeholder as shown below.

int numRecord = 2;
List<String> placeholderList = new ArrayList<>();
for (int i = 0; i <numRecord; i++){
    placeholdersList.add("(?, ?, ?)");
}

String sql = "INSERT INTO users (id, name, age) values "+
                StringUtils.join(placeholdersList, ", ") +
                "ON DUPLICATE KEY UPDATE age = VALUES(age)";PreparedStatement ps = connection.prepareStatement(sql);

// You only need to loop this part for the number of records
// first record
ps.setInt(1, 1);
ps.setString(2, "taro");
ps.setInt(3, 20);
// second record
ps.setInt(5, 3);
ps.setString(6, "saburo");
ps.setInt(7, 17);

ps.excute();

connection.commit();

The operation to switch between INSERT and UPDATE after checking whether each record already exists in the database can be executed with just one query.

Application

“ON DUPLICATE KEY UPDATE” syntax IF/CASE control flow functions Can be used to make conditional updates.

+----+--------+------+----------------------+----- ----------------+
| id | name | age | registered_timestamp | updated_timestamp |
+----+--------+------+----------------------+----- ----------------+
| 1 | taro | 20 | 2020-06-05 11:05:52 | 2020-06-05 11:05:52 |
| 2 | jiro | 19 | 2020-06-05 11:05:52 | 2020-06-05 11:06:24 |
| 3 | saburo | 17 | 2020-06-05 11:07:01 | 2020-06-05 11:07:01 |
+----+--------+------+----------------------+----- ----------------+

- -Update only when the age value (VALUES(age)) you tried to insert is larger than the original age value
INSERT INTO users (id, name, age) values (2,'jiro', 20), (3,'sabro', 17) ON DUPLICATE KEY UPDATE age = IF(VALUES(age) >age, VALUES(age), age);

+----+--------+------+----------------------+----- ----------------+
| id | name | age | registered_timestamp | updated_timestamp |
+----+--------+------+----------------------+----- ----------------+
| 1 | taro | 20 | 2020-06-05 11:05:52 | 2020-06-05 11:05:52 |
| 2 | jiro | 20 | 2020-06-05 11:05:52 | 2020-06-05 11:07:38 |
| 3 | saburo | 17 | 2020-06-05 11:07:01 | 2020-06-05 11:07:01 |
+----+--------+------+----------------------+----- ----------------+

- -If the age value (VALUES(age)) you tried to insert is larger than the original age value, add 1; if it is equal, keep it; if smaller, update it to a new value
INSERT INTO users (id, name, age) values (2,'jiro', 22), (3,'saburo', 15)
    ON DUPLICATE KEY UPDATE
        age =
        CASE
        WHEN VALUES(age) >age THEN age + 1
        WHEN VALUES(age) = age THEN age
        WHEN VALUES(age) <age THEN VALUES(age)
        END;

+----+--------+------+----------------------+----- ----------------+
| id | name | age | registered_timestamp | updated_timestamp |
+----+--------+------+----------------------+----- ----------------+
| 1 | taro | 20 | 2020-06-05 11:05:52 | 2020-06-05 11:05:52 |
| 2 | jiro | 21 | 2020-06-05 11:05:52 | 2020-06-05 11:08:06 |
| 3 | saburo | 15 | 2020-06-05 11:07:01 | 2020-06-05 11:08:06 |
+----+--------+------+----------------------+----- ----------------+

AUTO_INCREMENT and ON DUPLICATE KEY UPDATE

Article that if the table has a column of automatic numbering (AUTO_INCREMENT), the serial number will be incremented by one even if UPDATE is executed without being updated when “ON DUPLICATE KEY UPDATE” syntax is used Often comes out. (Advantages and notes of INSERT ON DUPLICATE KEY UPDATEetc.)

However, in the MySQL version 5.6 reference

(These effects are not the same for InnoDB tables where a is an autoincrement column. If you use an autoincrement column, the INSERT statement increases the autoincrement value, but not UPDATE.)

There is 1.

I tried to verify.

CREATE TABLE users_increment (
    id INT(8) NOT NULL AUTO_INCREMENT,
    name VARCHAR(128),
    age INT(3),
    registered_timestamp timestamp not null default current_timestamp,
    updated_timestamp timestamp not null default current_timestamp on update current_timestamp,
    primary key(id)
);

INSERT INTO users_increment (name, age) VALUES('taro', 20), ('jiro', 18);
+----+------+------+----------------------+------- --------------+
| id | name | age | registered_timestamp | updated_timestamp |
+----+------+------+----------------------+------- --------------+
| 1 | taro | 20 | 2020-06-05 11:17:58 | 2020-06-05 11:17:58 |
2 | jiro | 18 | 2020-06-05 11:17:58 | 2020-06-05 11:17:58 |
+----+------+------+----------------------+------- --------------+

- -The INSERT statement is not executed because the key is duplicated
- -If the automatic serial number advances even after being UPDATEd, the id of the next INSERTed record should be "4"
INSERT INTO users_increment (id, name, age) values (2,'jiro', 19) ON DUPLICATE KEY UPDATE age = VALUES(age);
+----+------+------+----------------------+------- --------------+
| id | name | age | registered_timestamp | updated_timestamp |
+----+------+------+----------------------+------- --------------+
| 1 | taro | 20 | 2020-06-05 11:17:58 | 2020-06-05 11:17:58 |
| 2 | jiro | 19 | 2020-06-05 11:17:58 | 2020-06-05 11:19:01 |
+----+------+------+----------------------+------- --------------+

- -The record actually inserted next is "3"
INSERT INTO users_increment (name, age) values ('saburo', 17);
+----+--------+------+----------------------+----- ----------------+
| id | name | age | registered_timestamp | updated_timestamp |
+----+--------+------+----------------------+----- ----------------+
| 1 | taro | 20 | 2020-06-05 11:17:58 | 2020-06-05 11:17:58 |
| 2 | jiro | 19 | 2020-06-05 11:17:58 | 2020-06-05 11:19:01 || 3 | saburo | 17 | 2020-06-05 11:19:46 | 2020-06-05 11:19:46 |
+----+--------+------+----------------------+----- ----------------+

As mentioned above, in recent versions of MySQL, even if “ON DUPLICATE KEY UPDATE” is executed, automatic serial numbers will not proceed unless INSERT is performed.

References

MySQL 5.6 Reference Manual Control Flow Functions

MySQL 5.6 Reference Manual INSERT … ON DUPLICATE KEY UPDATE Syntax

MySQL: INSERT…ON DUPLICATE KEY UPDATE Summary

Advantages and notes of INSERT ON DUPLICATE KEY UPDATE

  1. In the version 8 reference that I am using, “(The effects are not identical for an InnoDB table where a is an auto-increment column. With an auto-increment column, an INSERT statement increases the auto- increment value but UPDATE does not.)”.