[Ruby] [Rails] Don’t use the select method just to narrow down columns!

About the ActiveRecord select method

When you fetch data with ActiveRecord, you get every column of the corresponding table by default. As the generated SQL shows, all columns are selected with *. Because every column is loaded, you can reference any of them in later processing.

pry(main)> user = User.first
  User Load (0.7ms) SELECT `users`.* FROM `users` ORDER BY `users`.`id` ASC LIMIT 1
=> #<User id: 1, name: "ham", created_at: "2020-03-10 01:03:37", updated_at: "2020-06-16 02:18:39">
pry(main)> user.id
=> 1
pry(main)> user.name
=> "ham"

However, you rarely need every column, so some would argue that it is better to fetch only the columns you actually use. In that case, you can narrow down the fetched columns with the select method. For details on select, see the Rails Guides.

Specifying select fetches only the listed columns. Columns that were not fetched cannot be referenced in later processing.

pry(main)> user = User.select(:id, :created_at).first
  User Load (0.7ms) SELECT `users`.`id`, `users`.`created_at` FROM `users` ORDER BY `users`.`id` ASC LIMIT 1
=> #<User id: 1, created_at: "2020-03-10 01:03:37">
pry(main)> user.id
=> 1
pry(main)> user.name
ActiveModel::MissingAttributeError: missing attribute: name
from /usr/local/bundle/gems/activemodel-6.0.3.2/lib/active_model/attribute.rb:221:in `value'

Don’t use select just to narrow down columns!

This is my personal opinion, but when several people work on the same codebase, as in team development, I think it is better not to use select to narrow down columns.

Why?

See the code below.

def hoge(user_id)
  # Get only id and name with select
  user = User.select(:id, :name).find(user_id)

  # ... (various processing)

  generate_response(user)
end

private

def generate_response(user)
  {id: user.id, name: user.name}
end

What if you later need to add email to the response of the hoge method? Assume the User model has an email column.

You would probably think: I just have to find the relevant part and add email to generate_response! So you modify it as follows.

def generate_response(user)
- {id: user.id, name: user.name}
+ {id: user.id, name: user.name, email: user.email}
end

All right! Done in one line! Now run the test!

pry(main)> {id: user.id, name: user.name, email: user.email}
ActiveModel::MissingAttributeError: missing attribute: email
from /usr/local/bundle/gems/activemodel-6.0.3.2/lib/active_model/attribute.rb:221:in `value'

Huh? It doesn’t work... Was the wrong user passed in? I traced it back...

That’s right. Because select narrows down the columns that are fetched, email has to be added there as well. Once you also make the following change, it works.

def hoge(user_id)
  # Get only id and name with select
- user = User.select(:id, :name).find(user_id)
+ user = User.select(:id, :name, :email).find(user_id)

The test passed!

pry(main)> {id: user.id, name: user.name, email: user.email}
=> {:id=>1, :name=>"hoge", :email=>"[email protected]"}

What do you think?

With Rails ActiveRecord you normally get all columns, so I suspect many people have been tripped up like this at least once.

Each occurrence may not cost much time, but as long as the system keeps being developed, the same thing will happen again and again. That adds up to a real cost. In the worst case, it can even introduce a bug that goes unnoticed.

Is this select worth having if it raises both the development cost and the risk of bugs? I prefer code that is hard for others to misread, even if it is slightly suboptimal. That is why I think it is better not to use select just to narrow down columns.
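To see why the one-line change is safe without select but breaks with it, the situation can be mimicked in plain Ruby without a database. This is only a sketch under assumptions: PartialRecord, MissingAttribute, and load_user are hypothetical stand-ins invented for illustration and are not part of ActiveRecord. They merely imitate the rule that reading an unfetched column raises an error.

```ruby
# Hypothetical stand-in for ActiveModel::MissingAttributeError.
class MissingAttribute < StandardError; end

# Hypothetical stand-in for a record. It only imitates the rule
# "reading a column that was not fetched raises an error".
class PartialRecord
  COLUMNS = %i[id name email].freeze

  def initialize(fetched)
    @fetched = fetched
  end

  # Define one reader per column; unfetched columns raise on access.
  COLUMNS.each do |column|
    define_method(column) do
      @fetched.fetch(column) { raise MissingAttribute, "missing attribute: #{column}" }
    end
  end
end

# Stand-in for the query. Passing no columns imitates User.find (all columns);
# passing a list imitates User.select(...).find.
def load_user(*columns)
  row = { id: 1, name: "ham", email: "ham@example.com" }
  PartialRecord.new(columns.empty? ? row : row.slice(*columns))
end

# The response builder after the one-line change that adds email.
def generate_response(user)
  { id: user.id, name: user.name, email: user.email }
end

# All columns fetched: the one-line change just works.
p generate_response(load_user)

# Only id and name fetched: the same change fails, far away from the query.
begin
  generate_response(load_user(:id, :name))
rescue MissingAttribute => e
  puts e.message
end
```

The builder is identical in both calls; only the column list chosen elsewhere decides whether it works, which is the hidden coupling this article is warning about.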

Where to use select

So far this article has argued entirely against select, but of course it has its uses. One is with aggregate functions, as follows.

users_group_by_name = User.select('name, count(*) AS cnt').group(:name)
users_group_by_name.each do |u|
  p u.name
  # You can get the count with u.cnt
  p u.cnt
end

However, even in this case, naming the variable users or something similar is likely to cause misunderstanding, so it is better to use a variable name that makes clear what it holds.

Also, select is sometimes used to read columns of a joined table directly, but this is hard to follow as well, so I recommend avoiding it.

review = Review.select('reviews.id, users.name').joins(:user).find_by(id: 1)
# Now the joined users.name is available as review.name
review.name

Instead, access it normally through the association, or define a delegate.

app/models/review.rb


review = Review.find_by(id: 1)
# Access via association
review.user.name
# Or define delegate in Review model (delegate :name, to: :user, prefix: true)
review.user_name
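For reference, with prefix: true the delegate line mentioned in the comment defines a user_name method on Review that forwards to user.name. The following plain-Ruby sketch shows a handwritten equivalent. The Struct and the constructor are stand-ins assumed for illustration so that it runs without Rails, and it is not how Rails implements delegate internally.

```ruby
# Plain-Ruby stand-in for the User model (an ActiveRecord class in Rails).
User = Struct.new(:id, :name)

# Plain-Ruby stand-in for the Review model.
class Review
  attr_reader :id, :user

  def initialize(id:, user:)
    @id = id
    @user = user
  end

  # Handwritten equivalent of `delegate :name, to: :user, prefix: true`.
  def user_name
    user.name
  end
end

review = Review.new(id: 1, user: User.new(1, "ham"))
puts review.user.name # access via the association
puts review.user_name # access via the delegating method
```

Either way, the reader can tell from the call site that name belongs to the user, which the select version with review.name hides.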

Summary

Although this article focused on select, my broader point is that in team development, where several people touch the same code, it is important to write code that is easy for others to understand and hard to misread. Such code speeds up development and reduces bugs.