Draw SQL in your head when using ActiveRecord

Do you guys have SQL in mind when using ActiveRecord?

ActiveRecord is very convenient and you can access the DB to get and update data without being aware of SQL. Therefore, it may be written without being aware of SQL at the time of implementation.

At the time of implementation, you do not have to be aware of plain SQL by using ActiveRecord, but when you finally execute it, SQL is executed. Therefore, when you look at the SQL that is actually issued, you may be surprised that such SQL is issued.

Here are some implementation examples that can be avoided if you write while thinking of SQL.

Joining a useless table

Suppose you have the following model.

def User < ApplicationRecord
  has_many :user_organizations
end

def UserOrganization < ApplicationRecord
  belongs_to :user
  belongs_to :organization
end

def Organization < ApplicationRecord
  has_many :user_organizations
end

"Get the user model narrowed down by organization_id. How do you implement it when you are told?

target_organization_id = 1
users = User.join(user_organizations: :organization)
  .where(user_organizations: { organization: { id: target_organization_id }})

This implementation works correctly as specified. The issued SQL is as follows

SELECT `users`.*
FROM `users`
  INNER JOIN `user_organizations` ON `user_organizations`.`user_id` = `users`.`id`
  INNER JOIN `organizations` ON `organizations`.`id` = `user_organizations`.`organization_id`
WHERE `organization`.`id` = 1

What do you think of this SQL? If you think about it, you'll notice that you don't have to join the organizations table. The improved SQL is as follows.

SELECT `users`.*
FROM `users`
  INNER JOIN `user_organizations` ON `user_organizations`.`user_id` = `users`.`id`
WHERE `user_organizations`.`organization_id` = 1

ActiveRecord that realizes this is as follows.

target_organization_id = 1
users = User.join(:user_organizations)
  .where(user_organizations: { organization_id: target_organization_id })

Considering the implementation centered on the ActiveRecord model, it is easy to JOIN even the model with the specified id like the first implementation. Even if you actually do a code review, you will often see such implementations. SQL has better performance with fewer JOINs, so be aware that you should implement ActiveRecord with as few JOINs as possible.

LEFT OUTER JOIN of eager_load

As before, "Get the user model narrowed down by organization_id. ], And if you want to cache ʻuser.exam_organization` later, how do you implement it?

If you keep the implementation as before, since exam_organizations is not cached, SQL will be issued every time you get exam_organization, and it will be N + 1. Therefore, it will be cached by changing joins to eager_load (or includes) as shown below.

target_organization_id = 1
users = User.eager_load(:user_organizations)
  .where(user_organizations: { organization_id: target_organization_id })

Now it will be cached safely, but if you look at the issued SQL, you will notice that INNER JOIN has changed to LEFT OUTER JOIN.

SELECT `users`.id AS t0_r0, ...(All columns are listed. Omitted because it is long)
FROM `users`
  LEFT OUTER JOIN `user_organizations` ON `user_organizations`.`user_id` = `users`.`id`
WHERE `user_organizations`.`organization_id` = 1

If you are writing Rails without thinking about SQL, I feel that you often don't care if INNER JOIN is a LEFT OUTER JOIN like this example. However, when SQL comes to my mind, I can't choose LEFT OUTER JOIN even though it is a join with data, so it feels strange. In such a case, you can cache the data while joining with INNER JOIN by adding joins as shown below.

target_organization_id = 1
users = User.eager_load(:user_organizations).joins(:user_organizations)
  .where(user_organizations: { organization_id: target_organization_id })

Finally

I've given a few examples, but both are easy to avoid if you're writing with SQL in mind. Even if you do something a little inefficient, in most cases it works fine, so you often do not notice it, but since there is a difference in performance due to stacking, I have not been aware of SQL when using ActiveRecord so far. I think it is good for people to be aware of the issued SQL.

Recommended Posts

Draw SQL in your head when using ActiveRecord
Precautions when using checkboxes in Thymeleaf
Things to keep in mind when using if statements
Differences in code when using the length system in Java