Now I get it! Shortest match

Introduction

Regular expressions support a shortest (non-greedy) match. Let's check how to use it while watching how it behaves in a few samples.

Contents

Suppose you have a string such as str = "aaa「a」i「uu」ee「o」kaka" and want to extract only the parts enclosed in 「」 brackets (that is, the three parts 「a」, 「uu」, and 「o」). This is where you use the shortest match.

Before that, let's start by checking some simple behavior.

**+** matches one or more characters, so it does not match in this example, where there is nothing between the brackets.

str = "aaa「」"
puts str.scan(/「.+」/)
# => no match (scan returns an empty array)

**\*** matches zero or more characters, so it matches in this example.

str = "aaa「」"
puts str.scan(/「.*」/)
# => 「」

What about the following cases?

Since there is one character, **a**, between the brackets, it matches.

str = "aaa「a」"
puts str.scan(/「.+」/)
# => 「a」

This also matches, because **\*** accepts zero or more characters and there is one character, **a**, between the brackets. The result is the same.

str = "aaa「a」"
puts str.scan(/「.*」/)
# => 「a」
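
The four basic cases above can be checked side by side. This is only a minimal sketch: the variable names are illustrative and not from the original post, and the strings are a romanized stand-in for the sample text.

```ruby
# Compare the greedy quantifiers + and * on an empty and a non-empty bracket pair.
empty_pair  = "aaa「」"   # nothing between the brackets
filled_pair = "aaa「a」"  # one character between the brackets

plus_on_empty  = empty_pair.scan(/「.+」/)   # => []        (+ needs at least one character)
star_on_empty  = empty_pair.scan(/「.*」/)   # => ["「」"]  (* accepts zero characters)
plus_on_filled = filled_pair.scan(/「.+」/)  # => ["「a」"]
star_on_filled = filled_pair.scan(/「.*」/)  # => ["「a」"]

# Print how many matches each pattern found.
puts plus_on_empty.size, star_on_empty.size, plus_on_filled.size, star_on_filled.size
```

The only difference between the two quantifiers shows up when the brackets are empty.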

Now let's get to the main subject

In this example, everything from the first **「** to the last **」** matches as a single result. This is not what I want.

str = "aaa「a」i「uu」ee「o」"
puts str.scan(/「.+」/)
# => 「a」i「uu」ee「o」
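
To see why only one result comes back, it may help to compare the greedy match with the span between the first opening bracket and the last closing bracket. The following is a rough sketch under that reading; the variable names are illustrative and the string is a romanized stand-in for the sample text.

```ruby
str = "aaa「a」i「uu」ee「o」"

greedy = str.scan(/「.+」/)
match_count = greedy.size          # => 1

# Character positions of the first 「 and the last 」 in the string.
first_open = str.index("「")       # => 3
last_close = str.rindex("」")      # => 15

# The single greedy match is exactly the text between those two positions.
spans_first_to_last = (str[first_open..last_close] == greedy.first)  # => true
puts spans_first_to_last
```

Because `.+` takes as many characters as it can, the inner brackets are swallowed as ordinary characters.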

So what should I do?

If you add **?** after the quantifier, it becomes the shortest match.

str = "aaa「a」i「uu」ee「o」"
puts str.scan(/「.+?」/)
# => 「a」 「uu」 「o」
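
If only the text inside the brackets is wanted, without the brackets themselves, one possible variation is to add a capture group. This is a sketch rather than part of the original post; the name `inner` is illustrative, and the string is a romanized stand-in for the sample text.

```ruby
str = "aaa「a」i「uu」ee「o」"

# With a capture group, scan returns one array per match holding the group contents,
# so flatten turns [["a"], ["uu"], ["o"]] into a flat list.
inner = str.scan(/「(.+?)」/).flatten  # => ["a", "uu", "o"]
puts inner
```

The shortest match still decides where each pair of brackets ends; the group only selects which part of the match is returned.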

Congratulations. :clap::clap::clap:

What if I do this?

If you use **\*** instead of **+**, the result is the same here, but ...

str = "aaa「a」i「uu」ee「o」"
puts str.scan(/「.*?」/)
# => 「a」 「uu」 「o」

Depending on whether or not you want to match an empty **「」**, you need to choose between **+** and **\***.

If you don't want to match an empty **「」**, use **+**, which matches one or more characters.

str = "aaa「a」i「uu」ee「o」kaka「」"
puts str.scan(/「.+?」/)
# => 「a」 「uu」 「o」

If you want to match an empty **「」**, use **\***, which matches zero or more characters.

str = "aaa「a」i「uu」ee「o」kaka「」"
puts str.scan(/「.*?」/)
# => 「a」 「uu」 「o」 「」
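
The choice between the two patterns could be wrapped in a small helper. This is a sketch only: `bracketed_parts` and its `allow_empty` keyword are hypothetical names made up for illustration, not something from the original post or from Ruby itself.

```ruby
# Hypothetical helper: picks the lazy quantifier depending on whether
# an empty 「」 should count as a match.
def bracketed_parts(text, allow_empty: false)
  pattern = allow_empty ? /「.*?」/ : /「.+?」/
  text.scan(pattern)
end

str = "aaa「a」i「uu」ee「o」kaka「」"
without_empty = bracketed_parts(str)                     # => ["「a」", "「uu」", "「o」"]
with_empty    = bracketed_parts(str, allow_empty: true)  # => ["「a」", "「uu」", "「o」", "「」"]

# Caveat: with +?, an empty pair that is followed by a later 」 is not skipped.
# The match simply grows until it reaches that later 」.
caveat = bracketed_parts("「」x「a」")                   # => ["「」x「a」"]
puts without_empty.size, with_empty.size, caveat.size
```

In the sample string the empty pair comes last, so **+?** simply finds nothing there. When an empty pair appears earlier in the string, the caveat in the last lines applies.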

C'est fini :sweat_smile:
