First look at the code below.
review = Review.preload(:user, :book).find_by(id: review_id)
What do you do when you see code like this?
I don't have to add preload
. I think I will point out.
In this article, I'd like to explain why this preload
is unnecessary.
If you add preload, you can get the specified related data at the same time. In the case of this example, when you get the review, you also get the related user and book at the same time.
Below is the result of executing with irb. When you get the review, user and book are also SELECTed, and you can see that SQL is not issued where you actually use it.
irb(main):011:0> review_id = 15
=> 15
irb(main):012:0> review = Review.preload(:user, :book).find_by(id: review_id)
Review Load (0.8ms) SELECT `reviews`.* FROM `reviews` WHERE `reviews`.`id` = 15 LIMIT 1
User Load (0.5ms) SELECT `users`.* FROM `users` WHERE `users`.`id` = 1
Book Load (0.8ms) SELECT `books`.* FROM `books` WHERE `books`.`id` = 1
=> #<Review id: 15, content: "hogehoge", user_id: 1, book_id: 1, status: "draft", created_at: "2020-06-15 14:21:23", updated_at: "2020-06-15 14:21:23">
irb(main):013:0> review.user
=> #<User id: 1, name: "1234567890", created_at: "2019-12-12 05:43:52", updated_at: "2019-12-12 05:43:52">
irb(main):014:0> review.book
=> #<Book id: 1, title: "book1", created_at: "2020-06-15 14:21:15", updated_at: "2020-06-15 14:21:15">
It is mainly used as a countermeasure for N + 1. N + 1 will not be described in detail here, but it is an event in which the acquisition SQL of related data is issued one by one in a loop etc. as shown below.
irb(main):022:0> Review.all.each do |review|
irb(main):023:1* review.book
irb(main):024:1> end
Review Load (0.6ms) SELECT `reviews`.* FROM `reviews`
Book Load (0.3ms) SELECT `books`.* FROM `books` WHERE `books`.`id` = 1 LIMIT 1
Book Load (0.3ms) SELECT `books`.* FROM `books` WHERE `books`.`id` = 2 LIMIT 1
Book Load (0.4ms) SELECT `books`.* FROM `books` WHERE `books`.`id` = 3 LIMIT 1
Book Load (0.3ms) SELECT `books`.* FROM `books` WHERE `books`.`id` = 4 LIMIT 1
Book Load (2.7ms) SELECT `books`.* FROM `books` WHERE `books`.`id` = 5 LIMIT 1
In the above, since the book is not acquired in advance, SQL is issued one by one at review.book. If you attach prealod and get the book in advance, it will be as follows.
irb(main):025:0> Review.all.preload(:book).each do |review|
irb(main):026:1* review.book
irb(main):027:1> end
Review Load (0.8ms) SELECT `reviews`.* FROM `reviews`
Book Load (0.7ms) SELECT `books`.* FROM `books` WHERE `books`.`id` IN (1, 2, 3, 4, 5)
You can see that the book related to the review that was obtained in Review.all before entering the loop is obtained in one SQL. SQL is not issued during the loop because it can be acquired collectively before the loop. In general, issuing SQL is a costly process, so performing SQL once will improve performance. Even in the above example, if you look at the total execution time of SQL, you can see that there is a difference in performance.
But what about the first example? Since we have only obtained one review, it cannot be N + 1 in the loop as before.
I think that preloading means that you may use it at least later. Let's look at the following example.
#Get user
#I will use user and review later, so preload it
review = Review.preload(:user, :book).find_by(id: review_id)
#use user
review.user
#use book
review.book
Since preload is attached, user and book are also acquired when review is acquired. The execution result is as follows.
irb(main):007:0> review = Review.preload(:user, :book).find_by(id: review_id)
Review Load (0.8ms) SELECT `reviews`.* FROM `reviews` WHERE `reviews`.`id` = 36 LIMIT 1
User Load (0.5ms) SELECT `users`.* FROM `users` WHERE `users`.`id` = 1
Book Load (0.4ms) SELECT `books`.* FROM `books` WHERE `books`.`id` = 1
=> #<Review id: 36, content: "", user_id: 1, book_id: 1, status: "draft", created_at: "2020-06-30 15:20:01", updated_at: "2020-06-30 15:20:01">
irb(main):008:0> review.user
=> #<User id: 1, name: "1234567890", created_at: "2019-12-12 05:43:52", updated_at: "2019-12-12 05:43:52">
irb(main):009:0> review.book
=> #<Book id: 1, title: "book1", created_at: "2020-06-15 14:21:15", updated_at: "2020-06-15 14:21:15">
So what if you didn't have preload?
irb(main):010:0> review = Review.find_by(id: review_id)
Review Load (0.7ms) SELECT `reviews`.* FROM `reviews` WHERE `reviews`.`id` = 36 LIMIT 1
=> #<Review id: 36, content: "", user_id: 1, book_id: 1, status: "draft", created_at: "2020-06-30 15:20:01", updated_at: "2020-06-30 15:20:01">
irb(main):011:0> review.user
User Load (0.5ms) SELECT `users`.* FROM `users` WHERE `users`.`id` = 1 LIMIT 1
=> #<User id: 1, name: "1234567890", created_at: "2019-12-12 05:43:52", updated_at: "2019-12-12 05:43:52">
irb(main):012:0> review.book
Book Load (0.6ms) SELECT `books`.* FROM `books` WHERE `books`.`id` = 1 LIMIT 1
=> #<Book id: 1, title: "book1", created_at: "2020-06-15 14:21:15", updated_at: "2020-06-15 14:21:15">
When you get the review, you don't get the user and book, the SQL is issued where you are using it. However, since there is only one review, the number of SQL issued is the same. In the case of this example, the efficiency is the same with or without preload.
But what about the following example?
#Get user
#I will use user and review later, so preload it
review = Review.preload(:user, :book).find_by(id: review_id)
#Use user under certain conditions
if hoge
review.user
end
#Use book under certain conditions
if fuga
review.book
end
Since preload is attached, user and book are also acquired when review is acquired. If hoge or fuga is true, both user and review are used, so the number of SQL is the same with or without preload.
But what about false? For example, if hoge is false, user is not used, so the user obtained by preload is not used. If fuga is false, book will not be used either.
What if I didn't add preload this time?
#Get user
review = Review.find_by(id: review_id)
#Use user under certain conditions
if hoge
review.user
end
#Use book under certain conditions
if fuga
review.book
end
When you get a review, you don't get a user or book. If hoge or fuga is true, review and book will be acquired at the place to be used. If false, it will not be retrieved.
In the case of this implementation, it can be obtained only when using it. By the way, an implementation that retrieves data when needed in this way is called lazy loading.
Did you know which one is more efficient? When preloading one model, the number of SQL is the same as when preload is not attached even if all the models acquired by preload are used. If there is even one pattern that is not used depending on the conditions, the number of SQL will increase.
If you want to get only one item like the first example, it is meaningless to preload it, but it will be inefficient, so be careful.
If you are new to Rails and somehow know preload and eager_load, you should just add it, right? I think many people think that.
Many reviewers point out N + 1, but I feel that few people point out useless preloads like this one (individual impression).
It is easy to think that there is nothing wrong with adding preload and eager_load that defeat N + 1, but there are patterns that become inefficient like this time, so those who were not aware of it should be aware of it. Let's go.