--Unix system calls, culture, and ideas show through everywhere in Ruby.
  --With Ruby you can leave the low-level details to the language and concentrate on learning the ideas of Unix itself.
--All code runs in a process.
  --When traffic grows and resources get tight, you have to look beyond the application code.
--The ideas and techniques of Unix programming will remain useful for the next 40 years.
--System calls
  --A program cannot operate on the kernel directly; everything must go through system calls.
  --The system call interface is the intermediary between the kernel and userland.
--man pages
  --They hold the documentation you need for learning Unix programming.
  --They are worth consulting whenever you need the details of a system call or command.
--man page section
--Process: the Unix atom
  --All code runs in a process.
  --Launching ruby from the command line spawns a new process to execute your code; when the code finishes, the process exits.
  --A MySQL server stays up because its dedicated server process keeps running.
--Cross-referencing
  --The ps(1) command lets you cross-reference a pid with the information the kernel holds about that process.
  --Pids often appear in log files.
  --If several processes log to one file, you need to tell which process wrote each line. Including the pid in every log line solves this.
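A minimal sketch of tagging log lines with the pid, so that output from several processes writing to one file can be told apart. The helper name `log_line` is hypothetical; only `Process.pid` comes from Ruby itself.

```ruby
# Hypothetical helper: prefix a message with the pid of the current process.
def log_line(message)
  "[pid #{Process.pid}] #{message}"
end

puts log_line("worker started")
```

The number printed here is the same one that ps(1) shows for the running ruby process.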
--Other commands for cross-referencing with information provided by the OS
  --top(1): displays running processes in real time.
  --lsof(8): lists open files.
--Every process has a parent process.
--Parent process --The process that started the process.
--Example
  --When you launch "Terminal.app" on Mac OS X, you get a bash prompt.
  --Since everything is a process, this means the "Terminal.app" process started first and then started a bash process.
  --The parent of that bash process is the "Terminal.app" process.
  --If you run the ls(1) command from the bash prompt, the parent of the ls process is the bash process.
--Practical example
  --The ppid is rarely needed in practice, but it can matter when you want to detect a daemon process.
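As a small illustration, Ruby exposes both ids through the `Process` module. The variable names below are only for this sketch.

```ruby
# Process.pid identifies this process; Process.ppid identifies the process that started it.
current_pid = Process.pid
parent_pid  = Process.ppid

puts "pid=#{current_pid} ppid=#{parent_pid}"
```

When the script is started from a shell prompt, the ppid printed is the pid of that shell.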
--An open file is identified by a number, its file descriptor, just as a running process is identified by its pid.
--Everything is a file
  --This is part of the Unix philosophy.
  --Devices, sockets, pipes, regular files, and so on are all treated as files.
--A file descriptor represents a resource
  --When a running process opens a resource, it is assigned a file descriptor.
  --Open file descriptions are not shared between unrelated processes.
  --They are normally shared through parent-child relationships, although there is also a way to share them between completely unrelated processes: ancillary data on UNIX domain sockets (see man7/unix.7.html#lbAD). (Thanks to @angel_p_57!)
  --A file descriptor is destroyed when the process that opened the resource exits.
  --The open file description is released once every file descriptor that refers to it has been destroyed.
--File descriptors live with the process and die with the process.
--In Ruby, open resources are represented by the IO class. Every IO object knows its own file descriptor.
  --You can get it with IO#fileno.
--File descriptors are assigned starting from the lowest unused integer.
  --When a resource is closed, its file descriptor becomes available again.
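The allocation rule can be observed with a short sketch like the following. It opens the null device (`File::NULL`) purely as a convenient resource; the exact numbers depend on what else the process already has open, so none are assumed here.

```ruby
first  = File.open(File::NULL)
second = File.open(File::NULL)
first_fd  = first.fileno
second_fd = second.fileno

# Closing the first resource frees its descriptor number.
first.close

# The next resource opened picks up the lowest unused number,
# which is normally the one that was just freed.
third = File.open(File::NULL)
reopened_fd = third.fileno

puts "first=#{first_fd} second=#{second_fd} reopened=#{reopened_fd}"
[second, third].each(&:close)
```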
--Standard streams
  --Every Unix process comes with three open resources.
    0. Standard input (STDIN)
    1. Standard output (STDOUT)
    2. Standard error (STDERR)
--STDIN provides a generic way to read from input sources such as keyboard devices and pipes.
--STDOUT and STDERR provide a generic way to write to output destinations such as monitors, files, and printers.
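The three standard streams are ordinary IO objects, so their descriptors can be read with IO#fileno. A minimal sketch:

```ruby
# The standard streams occupy the first three file descriptors.
standard_fds = [STDIN, STDOUT, STDERR].map(&:fileno)

puts standard_fds.inspect
```

This is why the first file a fresh script opens itself usually gets a descriptor of 3 or higher.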
--Practical example
  --File descriptors are at the heart of network programming with sockets and pipes.
--How many file descriptors can one process have?
  --It depends on the system configuration.
  --The kernel sets resource limits for each process.
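The limit on open file descriptors can be inspected from Ruby with `Process.getrlimit`. A small sketch; the values printed depend entirely on the system configuration.

```ruby
# getrlimit returns a pair: the soft limit (currently enforced)
# and the hard limit (the ceiling the soft limit may be raised to).
soft_limit, hard_limit = Process.getrlimit(:NOFILE)

puts "soft limit: #{soft_limit}"
puts "hard limit: #{hard_limit}"
```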
--Every process inherits environment variables from its parent process.
  --Environment variables are set by the parent process and passed on to its child processes.
  --Each process has its own environment variables, and they are accessible globally within that process.
--ENV implements part of the Enumerable and Hash APIs, but it is not a Hash and does not offer everything a Hash does.
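A short sketch of both points: ENV behaves like a Hash in some respects without being one, and a variable set in the current process is visible to a child started from it. The variable name `GREETING` is made up for the example, and the child here is simply `sh`.

```ruby
ENV["GREETING"] = "hello"   # hypothetical variable name

# ENV answers Hash-style calls such as key? ...
has_key = ENV.key?("GREETING")
# ... but it is a special object, not an instance of Hash.
is_hash = ENV.is_a?(Hash)

# A child process started from here inherits the variable.
child_view = IO.popen(["sh", "-c", 'printf %s "$GREETING"'], &:read)

puts "key?=#{has_key} hash?=#{is_hash} child sees #{child_view.inspect}"
```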
--Practical example
$ RAILS_ENV=production rails server
$ EDITOR=mate bundle open actionpack
$ QUEUE=default rake resque:work
--Environment variables are often used as a way to pass input to command line tools.
--ARGV is a special array available to every Ruby process.
--argv stands for "argument vector", that is, an array of arguments.
  --It holds the arguments passed to the process on the command line.
$ cat argv.rb
p ARGV
$ ruby argv.rb foo bar -va
["foo", "bar", "-va"]
--Practical example
  --Passing file names to a program, for example a program that takes one or more file names on the command line and processes those files.
  --Parsing command-line arguments.
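One way to handle both cases is the standard library's OptionParser. The sketch below is only an illustration: the helper `parse_arguments`, the `-v` flag, and the file names are all hypothetical, and a real script would pass ARGV instead of a literal array.

```ruby
require "optparse"

# Hypothetical helper: split a command line into options and file names.
def parse_arguments(argv)
  options = { verbose: false }
  parser = OptionParser.new do |opts|
    opts.banner = "Usage: process_files.rb [-v] FILE..."
    opts.on("-v", "--verbose", "Print each file name as it is processed") do
      options[:verbose] = true
    end
  end
  files = parser.parse(argv)   # returns the arguments that are not options
  [options, files]
end

options, files = parse_arguments(%w[-v foo.txt bar.txt])
puts "verbose=#{options[:verbose]} files=#{files.inspect}"
```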
--Unix processes have few ways to report their own state.
  --Programmers invented log files.
    --A log file lets a process write whatever it wants to share to the file system.
    --But this works at the file system level rather than at the level of the process itself.
  --A process can open a socket and use the network.
    --That lets it communicate with other processes, but because it depends on the network, it too sits at a different level from the process itself.
--Two mechanisms convey information at the level of the process itself: the process name and the exit code.
--Process name
  --In Ruby it is available through the global variable $PROGRAM_NAME.
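A minimal sketch of using the process name to report state. Assigning to `$PROGRAM_NAME` is standard Ruby; the status text is made up for the example.

```ruby
original_name = $PROGRAM_NAME

# Advertise what the process is doing through its name.
$PROGRAM_NAME = "worker: waiting for jobs"   # hypothetical status text
current_name = $PROGRAM_NAME

puts "#{original_name} is now called #{current_name}"
```

On systems that support changing the process title, tools such as ps(1) then show the new name.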
--Exit code
  --Every process exits with an exit code (0-255) that indicates whether it ended normally or abnormally.
  --Exit code 0 means a normal exit; any other exit code indicates an error.
--Ways to end a process
  1. exit
  2. exit!
  3. abort
  4. raise
--Kernel#exit
  --The simplest way.
  --When a script reaches its end without exiting explicitly, the same thing happens implicitly.
--Kernel#exit!
  --The default exit code is 1 (failure).
  --Blocks registered with Kernel#at_exit are not run.
--Kernel#abort
  --Often used to end a process that has run into a problem.
--Kernel#raise
  --An exception raised with raise ends the process if nothing rescues it.
  --raise does not end the process immediately; the exception simply propagates up to the caller.
  --If the exception is not rescued anywhere, the process exits as a result.
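The four ways of exiting can be compared by running each in a throwaway child process and reading its exit code. This sketch uses fork and Process.waitpid2, which the notes cover further on; the helper `exit_code_of` is hypothetical.

```ruby
# Hypothetical helper: run a block in a child process and return its exit code.
def exit_code_of
  pid = fork do
    $stderr.reopen(File::NULL, "w")   # keep abort/raise messages out of the output
    yield
  end
  _pid, status = Process.waitpid2(pid)
  status.exitstatus
end

codes = {
  exit_default: exit_code_of { exit },
  exit_custom:  exit_code_of { exit 22 },
  exit_bang:    exit_code_of { exit! },
  abort:        exit_code_of { abort "something went wrong" },
  raise:        exit_code_of { raise "nobody rescues this" },
}

codes.each { |how, code| puts "#{how}: #{code}" }
```

Only the plain `exit` reports success; `exit!`, `abort`, and an unrescued exception all leave a non-zero code for the parent to see.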
--Creating processes
  --The process that calls fork(2) is the "parent process", and the newly created process is the "child process".
--Child process
  --The child process receives a copy of all the memory in use by the parent process.
  --Suppose a process loads a large amount of software and uses 500MB of memory (e.g. a Rails app). If you spawn two child processes from it, each child holds its own copy of all that software without having to load it again.
  --fork returns almost immediately, and you then have three processes each using 500MB of memory.
  --This is very convenient when you want to launch several instances of an application at once.
  --File descriptors opened by the parent process are inherited in the same way.
  --The child gets the same file descriptors as the parent, so the two processes can share open files, sockets, and so on.
  --Because it is a brand-new process, the child is assigned its own unique pid.
  --Its ppid is the pid of the process that called fork(2).
  --The child can modify its copy of the memory freely without affecting the parent.
--fork method
  --fork is the method that spawns a new process.
  --It is called once but returns twice: once in the calling parent process and once in the newly spawned child process.
# Both the if clause and the else clause are executed.
# In the parent process fork returns the pid of the new child process; in the child process it returns nil.
if fork
  puts "entered the if block"
else
  puts "entered the else block"
end
=> entered the if block
entered the else block
--Is fork multi-core programming?
  --The new processes can be spread across multiple CPU cores and run in parallel, but there is no guarantee that they will be.
  --For example, on a machine with four CPUs, all four processes could end up being handled by a single CPU.
--Using a block
  --The common Ruby idiom is to pass a block to fork.
  --When fork is called with a block, the block is executed only in the child process and is ignored by the parent.
  --The child process exits when the block finishes; it does not go on to run the code that the parent runs after the fork.
fork do
  # Code to run in the child process goes here.
end
# Code to run in the parent process goes here.
--The child process stays alive even if the parent process dies.
  --Once child processes exist, it can become unclear which process an action such as Ctrl-C should terminate, the parent or the child.
--Managing orphaned processes
  --Daemon processes: processes that are orphaned on purpose and are meant to keep running indefinitely.
  --Unix signals: a way to communicate with a process that has no terminal.
--Having the child physically copy all the data the parent holds in memory would be a considerable overhead.
--Copy-on-write (CoW)
  --A mechanism that delays the actual copying of memory until something needs to be written.
  --Until then, the parent and child processes physically share the same data in memory.
  --Memory is copied only when the parent or the child needs to modify it, which keeps the two processes independent.
--CoW saves resources and makes spawning child processes with fork(2) fast.
  --Only the data that the child process modifies has to be copied; the rest is shared.
--For CoW to work well, the Ruby implementation must be written so that it does not defeat this kernel feature.
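The physical sharing done by CoW is invisible to Ruby code, but the independence it preserves can be observed. In this sketch the child modifies an array it inherited and reports through its exit code that it saw the change, while the parent's array stays as it was.

```ruby
numbers = [1, 2, 3]

pid = fork do
  numbers << 4                          # modifies only the child's copy
  exit(numbers == [1, 2, 3, 4] ? 0 : 1) # report back through the exit code
end
_pid, status = Process.waitpid2(pid)

child_saw_change = status.success?
puts "child saw its change: #{child_saw_change}"
puts "parent still has: #{numbers.inspect}"
```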
--Fire and forget
  --Use this when you want the child process to do some work asynchronously while the parent process carries on independently.
message = 'Good Morning'
recipient = '[email protected]'
fork do
  # The child process sends the data to the statistics collector
  # while the parent process carries on with actually sending the message.
  #
  # The parent does not want to be slowed down by this work,
  # and does not care if sending to the statistics collector fails for some reason.
  StatsCollector.record message, recipient
end
# Send the message to its actual destination.
--Babysitter
  --Apart from the fire-and-forget case above, most uses of fork(2) need some mechanism for keeping an eye on the child processes.
Before:
fork do
  5.times do
    sleep 1
    puts "I'm an orphan!"
  end
end
abort "Parent process died..."
After:
fork do
  5.times do
    sleep 1
    puts "I am an orphan!"
  end
end
Process.wait
abort "Parent process died..."
I am an orphan!
I am an orphan!
I am an orphan!
I am an orphan!
I am an orphan!
Parent process died...
--The exit code is a means of communication between processes.
  --Process.wait2 returns the exit status along with the pid, so the parent can read the child's exit code directly.
Example of interprocess communication without the file system or the network:
# Spawn 5 child processes.
5.times do
  fork do
    # Each child generates a random number.
    # It exits with code 111 if the number is even and 112 if it is odd.
    if rand(5).even?
      exit 111
    else
      exit 112
    end
  end
end
5.times do
  # Wait for one of the spawned child processes to exit.
  pid, status = Process.wait2
  # Exit code 111 means the child generated an even number.
  if status.exitstatus == 111
    puts "#{pid} encountered an even number!"
  else
    puts "#{pid} encountered an odd number!"
  end
end
favourite = fork do
  exit 77
end
middle_child = fork do
  abort "I want to be waited on!"
end
pid, status = Process.waitpid2 favourite
puts status.exitstatus
--Process.wait and Process.waitpid are actually the same function under two names.
  --You can pass a pid to Process.wait to wait for a particular child process, and you can pass -1 to Process.waitpid to wait for any child process.
  --As a programmer you should choose the tool that best expresses your intent, so even though the two methods are identical, use them as follows.
    --Process.wait to wait for any child process
    --Process.waitpid to wait for a specific child process
--The kernel queues information about exited processes, so the parent can always receive a child's exit information, however late it asks.
  --It is therefore no problem if the parent takes its time handling each child's exit.
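The queueing can be seen in a sketch like this one: both children exit straight away, the parent stays busy for a while (simulated with a short sleep), and the exit codes are still there when it finally asks. The exit codes 3 and 4 are arbitrary.

```ruby
# Two children that exit immediately with different codes.
pids = [3, 4].map { |code| fork { exit code } }

# The parent is busy elsewhere; the children have very likely exited by now.
sleep 0.5

# The kernel has kept each exit status until the parent asks for it.
collected = pids.map do |pid|
  _pid, status = Process.waitpid2(pid)
  status.exitstatus
end

puts collected.inspect
```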
--Practical example
  --Putting child processes to work is one of the most common patterns in Unix programming.
  --It goes by names such as babysitter process, master/worker, and prefork.
  --One prepared process creates several child processes to do work in parallel and then looks after them.
  --The Unicorn web server
    --You tell Unicorn how many worker processes to use when you start the server.
    --If you ask for 5 instances, the unicorn process spawns 5 child processes after startup to handle web requests. The parent (master) process monitors each child so that the workers keep responding properly.
--Detaching child processes
  --If you are not going to wait for a child process with Process.wait, you have to detach it.
--The kernel keeps information about an exited child process until the parent asks for it with Process.wait.
  --If the parent never asks for the child's exit status, that information is never removed from the kernel.
  --Spawning child processes "fire and forget" style and never collecting their exit status wastes kernel resources.
Example:
message = 'Good Morning'
recipient = '[email protected]'
pid = fork do
  # The child process sends the data to the statistics collector
  # while the parent process carries on with actually sending the message.
  #
  # The parent does not want to be slowed down by this work,
  # and does not care if sending to the statistics collector fails for some reason.
  StatsCollector.record message, recipient
end
# Make sure the child process that collects statistics does not become a zombie.
Process.detach(pid)
--A child process that dies while its parent is not waiting on it becomes a zombie process, without exception.
  --If the child exits while the parent is busy with other work (not waiting for the child), it is certain to become a zombie.
  --Once the parent collects the zombie's exit status, that information is cleaned up and no longer wastes kernel resources.
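What Process.detach does can be sketched as follows: it returns a thread that waits on the child, so the exit status is collected without the parent calling Process.wait itself. The exit code 7 is arbitrary.

```ruby
pid = fork { exit 7 }

# Process.detach returns a Thread whose only job is to wait on the child.
watcher = Process.detach(pid)

# The thread's value is the child's Process::Status once it has exited.
status = watcher.value

# The child has already been reaped, so waiting on it again fails.
already_reaped =
  begin
    Process.wait(pid)
    false
  rescue Errno::ECHILD
    true
  end

puts "exit code #{status.exitstatus}, already reaped: #{already_reaped}"
```

In fire-and-forget code the returned thread is usually just ignored; it is joined here only to show what it collects.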
--Process.wait is a blocking call
  --Process.wait lets the parent manage its child processes, but the parent cannot do anything else until a child exits.
--Example of trapping SIGCHLD
child_processes = 3
dead_processes = 0
# Spawn 3 child processes.
child_processes.times do
  fork do
    # Each child sleeps for 3 seconds.
    sleep 3
  end
end
# From here on the parent is busy with heavy computation,
# but it still wants to notice when a child process exits.
# So it traps the :CHLD signal; the kernel then notifies it
# whenever a child process exits.
trap(:CHLD) do
  # Process.wait returns the pid of the exited child,
  # which tells us which of the spawned children has exited.
  puts Process.wait
  dead_processes += 1
  # Exit the parent explicitly once all children have exited.
  exit if dead_processes == child_processes
end
# Heavy computation
loop do
  (Math.sqrt(rand(44)) ** 8).floor
  sleep 1
end
--SIGCHLD and concurrency
  --Signal delivery is unreliable.
  --If another child process exits while a CHLD signal is being handled, there is no guarantee that a second CHLD signal will be received.
--Handling CHLD properly
  --Call Process.wait in a loop until every pending notification of a dead child process has been dealt with.
--The second argument to Process.wait
  --It addresses the case where several CHLD signals may arrive while one is being handled.
  --The first argument takes a pid; the second takes a flag that tells the kernel not to block when no child process has exited yet.
Process.wait(-1, Process::WNOHANG)
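Putting the loop and the non-blocking flag together, a CHLD handler might look like the sketch below. The number of children, the sleep times, and the 5-second safety deadline are arbitrary choices for the example.

```ruby
child_count = 3
reaped = []

previous_handler = trap(:CHLD) do
  begin
    # Keep reaping until no exited child is left. With WNOHANG, wait returns
    # nil instead of blocking when the remaining children are still running.
    while (pid = Process.wait(-1, Process::WNOHANG))
      reaped << pid
    end
  rescue Errno::ECHILD
    # No child processes remain at all.
  end
end

pids = Array.new(child_count) { fork { sleep 0.2 } }

# Stand-in for the parent's real work: poll until every child is reaped,
# giving up after a deadline so the sketch can never hang.
deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + 5
sleep 0.05 until reaped.size == child_count ||
                 Process.clock_gettime(Process::CLOCK_MONOTONIC) > deadline

trap(:CHLD, previous_handler || "DEFAULT")   # restore the earlier handler
puts "reaped #{reaped.size} of #{child_count} children"
```

Because the handler drains every exited child each time it runs, it does not matter if several children exit while only one CHLD signal is delivered.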
--Signal guide
  --Unix signals are a form of asynchronous communication.
  --When a process receives a signal from the kernel, it does one of the following: ignores the signal, runs a handler it has registered, or performs the default action.
--Signals are delivered by the kernel, but every signal has a sender.
  --Signals are sent from one process to another, with the kernel acting as the intermediary.
--Signals were originally a way to specify how a process should be killed.
--Signals are a great tool and work very well in certain situations.
  --But keep in mind that trapping signals is like using global variables.
--A process can receive a signal at any time.
  --Signal delivery is asynchronous.
  --Whenever a process receives a signal, control jumps to the signal handler.
  --It makes no difference whether the process is in a busy loop or a long sleep.
  --When the handler finishes, execution returns to the interrupted code and carries on.
--If you know its pid, you can use signals to communicate with any process on the system.
  --Signals are a very powerful means of communication between processes.
  --Sending signals with kill(1) from the terminal is a common sight.
--In the real world, signals are mostly used with long-running processes such as servers and daemons.
  --In that case the sender of the signal is more likely to be a human than an automated program.
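A sketch of one process signalling another from Ruby, using `Process.kill` in place of kill(1). The child traps TERM and exits with a recognizable code; the pipe exists only so the parent does not send the signal before the handler is installed. The exit code 42 is arbitrary.

```ruby
reader, writer = IO.pipe

pid = fork do
  reader.close
  trap(:TERM) { exit 42 }   # handler runs whenever TERM arrives
  writer.puts "ready"       # tell the parent the handler is in place
  writer.close
  sleep                     # a long sleep; the signal interrupts it
end

writer.close
reader.gets                 # wait until the child is ready
reader.close

Process.kill(:TERM, pid)    # same effect as `kill -TERM <pid>` in a terminal
_pid, status = Process.waitpid2(pid)

puts "child exited with #{status.exitstatus}"
```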