[RUBY] Ractor super introduction

Introduction

This article is for readers interested in Ractor, the new concurrency feature introduced in Ruby 3. I wrote it to deepen my own understanding by walking through simple sample code (so it may end up being mostly a note to my future self ...).

It also summarizes what I found when investigating what Ractor itself can do. The latter half therefore explains, with code, how Ractor behaves when you experiment with it in various ways.

environment

What is Ractor?

I'm sure some people don't know about Ractor in the first place, so I'll briefly introduce it.

Ractor is a new feature for parallel and concurrent execution introduced in Ruby 3. The feature itself had been proposed several years earlier, at the time under the name Guild.

However, people in the game industry said, "We already use the name Guild, so we'd like you to use a different name," and it was changed to the current name, Ractor.

It seems that it is based on the Actor model, which is why it was renamed Ractor (Ruby's Actor).

A Ractor is a unit of parallel execution, and Ractors run in parallel with one another. For example, in the code below, puts :hello and puts :world are executed in parallel.

Ractor.new do
  5.times do
    puts :hello
  end
end

5.times do
    puts :world
end

Running this code gives the following result:

world
helloworld

hello
world
helloworld

helloworld

hello

In this way, each process can be executed in parallel.

A Ractor can also send objects to and receive objects from other Ractors, synchronizing with them as it does so. There are two ways to synchronize: push type and pull type.

For example, in the case of push type, the code will be as follows.

r1 = Ractor.new do
    :hoge
end

r2 = Ractor.new do
    puts :fuga, Ractor.recv
end

r2.send(r1.take)

r2.take
# => fuga
# => hoge

With Ractor, you can use the send method to send an object to another Ractor. In the code above, this line sends an object to r2:

r2.send(r1.take)

Objects that have been sent can be received inside the Ractor with Ractor.recv:

r2 = Ractor.new do
    puts :fuga, Ractor.recv
end

This receives the object sent by r2.send(r1.take) and passes it to the puts method.

The take method is used to receive the result of a Ractor's execution, so r1.take receives :hoge.

In other words, r2.send(r1.take) receives the execution result of r1 and sends it to r2. Then puts :fuga, Ractor.recv in r2 becomes puts :fuga, :hoge, so fuga and hoge are each output.

This is the flow of exchanging objects with push type.

On the other hand, the pull type has the following code.

r1 = Ractor.new 42 do |arg|
    Ractor.yield arg
end

r2 = Ractor.new r1 do |r1|
    r1.take
end

puts r2.take

An argument passed to Ractor.new can be received as a block parameter such as |arg| and used as a variable inside the block.

For example, the following code waits at Ractor.yield until the take method is called on r1.

r1 = Ractor.new 42 do |arg|
    Ractor.yield arg
end

You can also pass another Ractor to Ractor.new, so you can write:

r2 = Ractor.new r1 do |r1|
    r1.take
end

Now you can receive the 42 that r1 received as an argument in r2.

Finally, puts r2.take receives and outputs 42.

That is how the pull type works.

Roughly speaking:

- push type: Ractor#send + Ractor.recv
- pull type: Ractor.yield + Ractor#take

For a more detailed explanation of Ractor, please refer to the link below.

Ractor code

Ractor generation

With Ractor, you write the processing you want to execute in the block passed to Ractor.new.

Ractor.new do
  # This block runs in parallel
end

The processing in this block is executed in parallel.

In other words, in the case of the following code

Ractor.new do
    10.times do
        puts :hoge
    end
end

10.times do
    puts :fuga
end

:hoge and :fuga are output in parallel.

Also, since the process you want to execute is passed as a block to Ractor.new, you can also write as follows.

Ractor.new{
    10.times{
        puts :hoge
    }
}

You can also name a Ractor with the name keyword argument and read the name back with Ractor#name.

r = Ractor.new name: 'r1' do
    puts :hoge
end

p r.name
# => "r1"

This will also allow you to see which Ractor is performing the process.
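As a minimal sketch of that idea, the block itself can report which Ractor it is running in through Ractor.current. The method names ractor_result and current_ractor_name are hypothetical helpers, not part of the article or of Ruby, and the fallback to Ractor#value is an assumption for newer Ruby releases where take may no longer be available.

```ruby
# Hypothetical helper: the article uses Ractor#take (Ruby 3.0).
# Assumption: on newer Rubies without #take, Ractor#value plays the same role.
def ractor_result(ractor)
  ractor.respond_to?(:take) ? ractor.take : ractor.value
end

# Hypothetical helper: start a named Ractor and ask it for its own name.
def current_ractor_name(name)
  r = Ractor.new(name: name) do
    # Ractor.current is the Ractor that is executing this block.
    Ractor.current.name
  end
  ractor_result(r)
end

p current_ractor_name('r1')
# => "r1"
```

Inside a worker, the same call could be used to prefix log lines so that output from several Ractors can be told apart.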

Pass arguments to Ractor

You can pass an object into the block by passing it as an argument to Ractor.new.

r = Ractor.new :hoge do |a|
    p a
end

r.take
# => :hoge

You can pass an object via an argument like this.

You can also pass multiple arguments

r = Ractor.new :hoge, :fuga do |a, b|
    p a
    p b
end

r.take
# => :hoge
# => :fuga

You can also pass an Array like this.

r = Ractor.new [:hoge, :fuga] do |a|
    p a.inspect
end

r.take
# => "[:hoge, :fuga]"

By the way, if you change |a| to |a, b|:

r = Ractor.new [:hoge, :fuga] do |a, b|
    p a
    p b
end

r.take
# => :hoge
# => :fuga

This is the output. It seems to be interpreted in the same way as a, b = [:hoge, :fuga].
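The same destructuring can be observed with ordinary blocks. The sketch below uses plain procs in place of the block given to Ractor.new, on the assumption that the Ractor block follows the same parameter rules; describe_one and describe_two are hypothetical helper names used only for illustration.

```ruby
# Plain procs stand in for the Ractor.new block here
# (assumption: the Ractor block follows the same parameter rules).

# One block parameter: the Array arrives as a single object.
def describe_one(*args)
  proc { |a| "a=#{a.inspect}" }.call(*args)
end

# Two block parameters: a single Array argument is spread across them.
def describe_two(*args)
  proc { |a, b| "a=#{a.inspect} b=#{b.inspect}" }.call(*args)
end

puts describe_one([:hoge, :fuga])         # a=[:hoge, :fuga]
puts describe_two([:hoge, :fuga])         # a=:hoge b=:fuga
puts describe_two([:hoge, :fuga, :piyo])  # a=:hoge b=:fuga (extra element dropped)
puts describe_two([:hoge])                # a=:hoge b=nil (missing element is nil)
```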

A Hash can be passed in the same way:

r = Ractor.new({:hoge => 42, :fuga => 21}) do |a|
    p a
    p a[:hoge]
end

r.take
# => {:hoge=>42, :fuga=>21}
# => 42

By the way, if you do not enclose the Hash in () after Ractor.new, the { is parsed as the start of a block and you get a SyntaxError, so be careful.

r = Ractor.new {:hoge => 42, :fuga => 21} do |a|
    p a
    p a[:hoge]
end

r.take
# => SyntaxError

Return value in Ractor

The return value of the block executed in a Ractor can be received with the take method.

r = Ractor.new do
    :hoge
end

p r.take
# => :hoge

By the way, if you use return in the block, it seems to raise a LocalJumpError.

r = Ractor.new do
    return :fuga
    :hoge
end

p r.take
# => LocalJumpError

Exceptions within Ractor

Exceptions within Ractor can be received as follows:

r = Ractor.new do
    raise 'error'
end

begin
    r.take
rescue Ractor::RemoteError => e
    p e.message
end

By the way, you can also write exception handling in Ractor.

r = Ractor.new name: 'r1' do
    begin
        raise 'error'
    rescue => e
        p e.message
    end
end

r.take

Also, according to the documentation, the exception can be caught by whichever Ractor receives the value returned from the block. In other words, you can also write the following code.


r1 = Ractor.new do
    raise 'error'
end

r2 = Ractor.new r1 do |r1|
    begin
        r1.take
    rescue Ractor::RemoteError => e
        p e.message
    end
end 

r2.take
# => "thrown by remote Ractor."
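The message shown above belongs to the wrapper exception, not to the original raise. As a minimal sketch, the original exception can be reached through the cause of Ractor::RemoteError. The names ractor_result and remote_error_details are hypothetical helpers, and the fallback to Ractor#value is an assumption for newer Ruby releases where take may no longer be available.

```ruby
# Hypothetical helper: the article uses Ractor#take (Ruby 3.0).
# Assumption: on newer Rubies without #take, Ractor#value plays the same role.
def ractor_result(ractor)
  ractor.respond_to?(:take) ? ractor.take : ractor.value
end

# Hypothetical helper: run a failing Ractor and describe what the caller sees.
def remote_error_details
  r = Ractor.new { raise ArgumentError, 'bad input' }
  ractor_result(r) # re-raises in the caller as Ractor::RemoteError
  nil              # not reached
rescue Ractor::RemoteError => e
  # e is the wrapper; e.cause is the exception raised inside the Ractor.
  { wrapper: e.class, cause: e.cause.class, message: e.cause.message }
end

details = remote_error_details
puts details[:wrapper]  # Ractor::RemoteError
puts details[:cause]    # ArgumentError
puts details[:message]  # bad input
```

Rescuing the wrapper and then branching on e.cause keeps the error handling in the receiving Ractor while still distinguishing what actually went wrong.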

Parallel execution in Ractor

Simple example

You can execute in parallel with Ractor like this.

Ractor.new do
    3.times do
        puts 42
    end
end

3.times do
    puts 21
end

When executed, 42 and 21 are each output.

A slightly more complex example

You can also create multiple workers with Ractor, feed them values through a pipe, and collect the results, as shown below.

require 'prime'

pipe = Ractor.new do
  loop do
    Ractor.yield Ractor.recv
  end
end

N = 1000
RN = 10
workers = (1..RN).map do
  Ractor.new pipe do |pipe|
    while n = pipe.take
      Ractor.yield [n, n.prime?]
    end
  end
end

(1..N).each{|i|
  pipe << i
}

pp (1..N).map{
  r, (n, b) = Ractor.select(*workers)
  [n, b]
}.sort_by{|(n, b)| n}
# => Outputs whether each number from 1 to 1000 is a prime number

This code creates 10 workers and passes an object to each worker via pipe. Each worker then returns its result with Ractor.yield [n, n.prime?].

You can create multiple workers like this, have them process values via pipe, and receive the results.

Let's write a class that creates workers and handles the processing

In the previous code, the processing inside the worker was likely to grow later, so I wrote the following class to create workers neatly.

class Ninsoku
    def initialize(task, worker_count: 10)
      @task = task
      @pipe = create_pipe
      @workers = create_workers(worker_count)
    end

    def send(arg)
        @pipe.send arg
    end

    def run
        yield Ractor.select(*@workers)
    end

    def create_pipe
        Ractor.new do
            loop do
                Ractor.yield Ractor.recv
            end
        end
    end

    def create_workers(worker_count)
        (1..worker_count).map do
            Ractor.new @pipe, @task do |pipe, task|
                loop do 
                  arg = pipe.take
                  task.send arg
                  Ractor.yield task.take
                end
            end
        end
    end
end

Ninsoku.new creates the pipe and the workers. The task argument is a Ractor containing the processing you want to run, and the workers execute it.

In actual use, it looks like this.

task = Ractor.new do
  func = lambda{|n| n.downcase }
  loop do
    Ractor.yield func.call(Ractor.recv)
  end
end

ninsoku = Ninsoku.new(task)

('A'..'Z').each{|i|
  ninsoku.send i
}

('A'..'Z').map{
    ninsoku.run{|r, n|
        puts n
    }
}
# => a through z are output

~~I think I'll turn this class into a gem later.~~

I turned it into a gem (though I haven't pushed it to RubyGems yet ...)

S-H-GAMELINKS/rorker

For example, let's use Ractor to process the AED location data for Hamada City, Shimane Prefecture, where I live. For the AED locations in Shimane Prefecture, I used the open data published by Shimane Prefecture. I would like to take this opportunity to thank them.

Shimane Prefecture Open Data Catalog Site

require "rorker"
require "csv"

task = Ractor.new do
  func = lambda{|row| 
    row.map{|value|
      if value =~ /Hamada City/
        row
      end  
    }.compact
  }
  loop do
    Ractor.yield func.call(Ractor.recv)
  end
end

rorker = Rorker.new(task)

csv = CSV.read "a.csv"

csv.each do |row|
  rorker.send row
end

n = 0

while n < csv.count
  rorker.run{|worker, result|
    if !result.empty?
      puts result
    end
  }
  n += 1
end

You can also pass a CSV that has been read like this to the workers row by row and extract the necessary data in parallel.

Use Numbered Parameter in Ractor

Since the processing is passed to Ractor as a block, you can also use Numbered Parameter to receive arguments.

r = Ractor.new :hoge do
    puts _1
end

r.take
# => hoge

By the way, it also works with multiple arguments.

r = Ractor.new :hoge, :fuga do
    puts _1
    puts _2
end

r.take
# => hoge
# => fuga

If you pass more than one argument, they seem to be assigned to _1 through _9 in the order in which they were passed.

By the way, if you pass Hash, it will look like this.

r = Ractor.new ({hoge: 1, fuga: 2}) do
    _1.map do |key, value|
        p ":#{key} => #{value}"
    end
end

r.take
# => ":hoge => 1"
# => ":fuga => 2"

Hash with => gave similar results

r = Ractor.new({:hoge => 1, :fuga => 2}) do
    _1.map do |key, value|
        p ":#{key} => #{value}"
    end
end

r.take
# => ":hoge => 1"
# => ":fuga => 2"

However, in the case of Array, the behavior is slightly different.

r = Ractor.new [1, 2, 3] do
    puts _1
    puts _1.class
    puts _2
    puts _2.class
    puts _3
    puts _3.class    
end

r.take
#=> 1
#=> Integer
#=> 2
#=> Integer
#=> 3
#=> Integer

Apparently, the elements are passed in order from the beginning of the Array, just as when multiple arguments are passed. Perhaps it is interpreted as follows.

_1, _2, _3 = [1, 2, 3]
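Whether the Array is spread this way depends on which numbered parameters the block uses. The sketch below uses plain procs in place of the Ractor.new block, on the assumption that the same rules apply there; only_first and first_two are hypothetical helper names. A block that uses only _1 behaves like |a| and receives the whole object, which is also why _1.map works on a Hash, while a block that uses _2 or higher behaves like |a, b, ...| and spreads the Array.

```ruby
# Plain procs stand in for the Ractor.new block here
# (assumption: the Ractor block follows the same parameter rules).

# Only _1 is used: the block takes one parameter and gets the whole object.
def only_first(*args)
  proc { _1 }.call(*args)
end

# _1 and _2 are used: a single Array argument is spread across them.
def first_two(*args)
  proc { [_1, _2] }.call(*args)
end

p only_first([1, 2, 3])  # => [1, 2, 3]
p first_two([1, 2, 3])   # => [1, 2]
p first_two([1])         # => [1, nil]
```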

By the way, if you pass an Array that has more elements than Numbered Parameter can receive:

r = Ractor.new [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] do
    puts _1
    puts _2
    puts _3
    puts _4
    puts _5
    puts _6
    puts _7
    puts _8
    puts _9
end

r.take
#=> 1
#=> 2
#=> 3
#=> 4
#=> 5
#=> 6
#=> 7
#=> 8
#=> 9

It seems that you can get the elements like this, up to the number that Numbered Parameter can receive.

Numbered Parameter in Ractor seems to be useful when a Hash is passed as an argument:

r = Ractor.new ({hoge: 1, fuga: 2}) do |hash|
    hash.map do
        p ":#{_1} => #{_2}"
    end
end

r.take
# => ":hoge => 1"
# => ":fuga => 2"

It also seems useful when you want to omit the block parameters when passing several arguments:

r = Ractor.new :hoge, :fuga do
    p _1
    p _2
end

r.take
# => :hoge
# => :fuga

In conclusion

I hope this article gets you interested in Ractor. I will keep adding code that I try out with Ractor.

reference
