Difference between "find_in_batches" and "in_batches" in Ruby on Rails

In Rails, find_in_batches and in_batches are two similar methods, and many developers find it confusing to know when to use which. Some even think they are identical and can be used interchangeably. While their names and usage look similar, they behave very differently under certain conditions.

Before jumping into the details, let's have a quick refresher on the two methods. Here is what you'll find in the official API documentation:

  • find_in_batches: Yields each batch of records that was found by the find options as an array.
  • in_batches: Yields ActiveRecord::Relation objects to work with a batch of records.

So how are they different? Let's compare the following code using find_in_batches:

Post.find_in_batches do |group|
   group.each { |post| puts post.title }
end

with the more or less "equivalent" code using in_batches:

Post.in_batches do |group|
   group.each { |post| puts post.title }
end

Both produce exactly the same output. However, with find_in_batches Rails sends only one query per batch to the database, retrieving all of the posts' data for that batch:

SELECT "posts".* FROM "posts" WHERE ...

With in_batches, on the other hand, Rails sends two queries to the database for each batch when the records are enumerated as above. The first query gets the list of post IDs for the batch:

SELECT "posts"."id" FROM "posts" WHERE ...

The second query then loads the posts' data for those IDs:

SELECT "posts".* FROM "posts" WHERE ...
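
The difference matters most when you don't need the records themselves. Because in_batches yields a relation, you can call relation methods such as update_all on each batch, and the second SELECT is never issued. Here is a minimal sketch of that idea. The archive_all helper and the archived column are hypothetical names used only for illustration; in_batches(of:) and update_all are the real Active Record calls:

```ruby
# Marks every row of `model` as archived, one batch at a time, without
# instantiating any records. `model` is expected to be an Active Record
# class (or relation) that responds to in_batches.
def archive_all(model, batch_size: 1000)
  total = 0
  model.in_batches(of: batch_size) do |relation|
    # Runs a single UPDATE scoped to this batch; update_all returns the
    # number of affected rows.
    total += relation.update_all(archived: true)
  end
  total
end
```

In a Rails app you would call it as archive_all(Post). Doing the same with find_in_batches would load every post into memory first, since it yields arrays of records rather than relations.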

Further Reading

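If you look into the source code of these two methods, you will see that find_in_batches actually calls in_batches with load: true. In in_batches itself, however, load defaults to false.

If you look further into in_batches at the part that uses the value of load, you will find the snippet below. With load: true, the records are fetched first and attached to the yielded relation, so enumerating it needs no further query. With load: false, only the IDs are plucked and the relation is left unloaded, so enumerating it triggers the second query: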

    if load
      records = batch_relation.records
      ids = records.map(&:id)
      yielded_relation = where(primary_key => ids)
      yielded_relation.load_records(records)
    else
      ids = batch_relation.pluck(primary_key)
      yielded_relation = where(primary_key => ids)
    end
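
To make the effect of that branch easier to see, here is a toy model in plain Ruby. It is not Rails code: ToyTable, ToyRelation and queries_for are made-up stand-ins, and the "queries" are just strings appended to a log. It only mimics the load: true and load: false paths so the simulated queries can be counted:

```ruby
# Hypothetical stand-in for a relation scoped to a list of ids.
class ToyRelation
  def initialize(table, ids, records = nil)
    @table = table
    @ids = ids
    @records = records # already loaded when load: true was used
  end

  # Enumerating an unloaded relation costs one extra simulated query.
  def each(&block)
    @records ||= @table.fetch(@ids)
    @records.each(&block)
  end
end

# Hypothetical in-memory "table" that logs every simulated query.
class ToyTable
  attr_reader :query_log

  def initialize(rows)
    @rows = rows
    @query_log = []
  end

  # Mimics the branch above: load: true fetches full rows up front,
  # load: false only collects the ids and yields an unloaded relation.
  def in_batches(of:, load: false)
    @rows.each_slice(of) do |slice|
      ids = slice.map { |row| row[:id] }
      if load
        @query_log << "SELECT * (batch)"
        yield ToyRelation.new(self, ids, slice)
      else
        @query_log << "SELECT id (batch)"
        yield ToyRelation.new(self, ids)
      end
    end
  end

  def fetch(ids)
    @query_log << "SELECT * WHERE id IN (...)"
    @rows.select { |row| ids.include?(row[:id]) }
  end
end

# Counts the simulated queries needed to enumerate 5 rows in batches of 2.
def queries_for(load:)
  table = ToyTable.new((1..5).map { |id| { id: id, title: "Post #{id}" } })
  table.in_batches(of: 2, load: load) do |relation|
    relation.each { |row| row[:title] }
  end
  table.query_log.size
end

puts queries_for(load: true)  # one simulated query per batch
puts queries_for(load: false) # two simulated queries per batch
```

With five rows and a batch size of two there are three batches, so the toy model logs three queries on the load: true path and six on the load: false path. Real Rails may issue a slightly different number of queries depending on the version and the table contents; the point here is only the one-versus-two pattern per batch.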

I hope this short article will help you clearly understand the difference between find_in_batches and in_batches in Rails.
