[Ruby] Extracting strings that start with an uppercase letter using a regular expression

What we will do this time

As the title says, this article extracts the strings that start with an uppercase letter by using a regular expression.

Code

The code I wrote this time is as follows:


# Method to extract the strings that start with an uppercase letter from an array
def upperstr(array)

  # Variable used as the subscript of the array
  count = 0

  # An array that stores a character string starting with uppercase letters
  upper = []

  # Iterate over every element of the array with the each method
  array.each{|arraystr|

    # Extract a string starting with an uppercase letter using a regular expression and store it in the array (upper)
    upper[count] = arraystr.slice(/^[A-Z].*/)

    # Add 1 to the array subscript
    count += 1
  }
  
  # If a string does not start with an uppercase letter, slice returns nil and nil is stored in the array, so remove it with the delete method.
  upper.delete(nil)

  # Return the array that stores the strings starting with an uppercase letter as the method's return value.
  return upper

end


# Prompt for the user
p "Please enter the characters separated by spaces."

# Assign the string entered on the console to a variable
str = gets

# Split the input string on spaces and assign the result to an array
ary = str.split(" ")

# Assign the return value of the upperstr method to a variable
# (upperstr method is a method to extract the character string starting with capital letters)
upperary = upperstr(ary)

# Message for the user
p "Retrieved a string starting with a capital letter"

# retrieve all elements contained in the array with each method
upperary.each{|ustr|
  # Show contents of array
  p ustr
}
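
As a side note, the same filtering can be sketched more compactly. This is not the approach used in this article: it swaps slice for Enumerable#grep, and it anchors with \A (start of the string) instead of ^. The method name upper_words is a hypothetical name chosen only for this illustration. Note that grep returns the whole matching element rather than the matched part, which makes no difference for single-line words like the ones used here.

```ruby
# Hypothetical helper, not part of the original program.
# Returns only the strings that begin with an ASCII uppercase letter.
def upper_words(words)
  # grep keeps every element that matches the pattern.
  # \A anchors the match at the very start of the string.
  words.grep(/\A[A-Z]/)
end

sample = "aaa AAA B b CC cc DDDDD ddddd".split(" ")
p upper_words(sample)
# prints ["AAA", "B", "CC", "DDDDD"]
```

Because non-matching elements are simply skipped, there is no nil to delete afterwards and no subscript counter to maintain.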

Regular expression

Note: each method and processing step is described in the code comments, so I will not explain them again here. The point of interest this time is the regular expression passed to the slice method inside the upperstr method. The explanation is as follows:

① Code: upper[count] = arraystr.slice(/^[A-Z].*/)

② / / The part enclosed between the two slashes is the regular expression pattern.

③ ^ Anchors the match at the beginning of a line, so whatever is written immediately after ^ must match the first character.

④ [A-Z] Matches any one character from A to Z, that is, an uppercase letter. By writing ^[A-Z], you can search for a string that has an uppercase letter at the beginning. If you want lowercase, use [a-z].

⑤ . Matches any single character (except a newline).

⑥ * Means that the element immediately before it is repeated 0 or more times. Writing .* therefore means “any character, repeated 0 or more times”. It may not be easy to grasp, but in effect it means “I don’t care about anything other than the first letter.”
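
To make the pieces above concrete, here is a small sketch that applies the same pattern to a few sample strings. The sample strings and variable names are made up for illustration. The last two lines show one detail worth knowing: in Ruby, ^ matches at the start of any line, while \A matches only at the start of the whole string.

```ruby
pattern = /^[A-Z].*/

# slice returns the matched part of the string, or nil when nothing matches.
capitalized = "Ruby".slice(pattern)   # starts with uppercase, so the whole word matches
lowercase   = "ruby".slice(pattern)   # starts with lowercase, so there is no match
single      = "R".slice(pattern)      # .* is allowed to match zero characters

p capitalized  # prints "Ruby"
p lowercase    # prints nil
p single       # prints "R"

# ^ also matches right after a newline, so the second line is found here.
line_anchor   = "abc\nDef".slice(/^[A-Z].*/)
# \A matches only at the very start of the string, so nothing is found here.
string_anchor = "abc\nDef".slice(/\A[A-Z].*/)

p line_anchor    # prints "Def"
p string_anchor  # prints nil
```

In the program in this article the input is split on spaces first, so each element is a single word and the difference between ^ and \A does not change the result.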

Operation check

Here is the result of actually running the program. The input value is “aaa AAA B b CC cc DDDDD ddddd”. The expected result is “AAA B CC DDDDD”, because the program extracts only the strings that start with an uppercase letter.

% ruby regex.rb
"Please enter the characters separated by spaces."
aaa AAA B b CC cc DDDDD ddddd
"Retrieved a string starting with a capital letter"
"AAA"
"B"
"CC"
"DDDDD"

The result was as expected! That’s it for this article!