How to extract the desired string from a line: 4 commands

Python Crawling & Scraping-Practical Development Guide for Data Collection and Analysis- https://www.amazon.co.jp/dp/B01NGWKE0P/ref=dp-kindle-redirect?_encoding=UTF8&btkr=1

What I learned from section 1.4.1, "Get the total number of e-books", of the book above.

The task is to extract only the desired string from HTML code that has been picked out with grep. Four methods are introduced:

1. Extract the part that matches a regular expression with the sed command
2. Remove the matched part with the sed command and keep the remaining part
3. Use the cut command to extract the nth string from a string separated by a specific character
4. Use the awk command to extract the nth string from a string aligned with spaces

I didn't know these commands in the first place... Fortunately, the previous page of the book explained sed and cut.

sed (short for Stream EDitor)

When to use: Replace or delete text that matches specific conditions.

Usage: 's/regular expression to search/string to replace/option'

【Example of use】

# Replace every . with a space. The g option replaces all matches on a line, not just the first one.
# The . is escaped as \. because an unescaped . matches any single character.
XX | sed 's/\./ /g'
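To see what the g option and the escaping of the dot each contribute, here is a minimal sketch. The input string `a.b.c` and the variable names are made up for illustration.

```shell
# Sample input, made up for this illustration.
line='a.b.c'

# Without g: only the first literal dot on the line is replaced.
first_only=$(printf '%s\n' "$line" | sed 's/\./ /')

# With g: every literal dot on the line is replaced.
all_dots=$(printf '%s\n' "$line" | sed 's/\./ /g')

# Unescaped . matches any character, so every character becomes a space.
any_char=$(printf '%s\n' "$line" | sed 's/./ /g')

# Brackets make the spaces visible.
printf '[%s]\n' "$first_only" "$all_dots" "$any_char"
```

The three results differ only in whether g is present and whether the dot is escaped, so it is worth checking both when a substitution replaces too little or too much.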

cut

When to use: Extract some columns from text separated by a specific character.

【Example of use】

# Output only the first and second columns of comma-separated text. -d specifies the delimiter, -f the column numbers.
XX | cut -d , -f 1,2
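As a rough sketch of how -f behaves on several lines at once, the example below uses a small two-line CSV. The data and the variable names are made up for illustration.

```shell
# Two-line CSV sample, made up for this illustration.
csv='1,baseball,Hanshin
2,soccer,Vissel'

# -f 1,2 keeps the first and second columns of every line.
first_two=$(printf '%s\n' "$csv" | cut -d , -f 1,2)

# -f 2- keeps the second column and everything after it.
from_second=$(printf '%s\n' "$csv" | cut -d , -f 2-)

printf '%s\n' "$first_two"
printf '%s\n' "$from_second"
```

Note that cut keeps the delimiter between the columns it outputs, so the result is still comma-separated.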

Now let's go through methods 1 to 4, the main topic, one by one.

1. Use the sed command to extract the part that matches the regular expression

Usage: sed -E 's/.*(regular expression that matches the part you want to extract).*/\1/'

Decoding: . matches any single character, * repeats the preceding expression zero or more times, and \1 stands for the text captured by the first ( ).

【Example of use】

echo hello_world | sed -E 's/.*(hello).*/\1/'
#Output result
hello
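One thing to watch with this pattern is that the leading .* is greedy. The sketch below applies the same idea to the string 130/2098 to pull out the number after the slash. The variable names and the choice of # as the sed delimiter are my own, not from the book.

```shell
# The page counter text used later in this article.
text='130/2098'

# Anchoring on the slash: .*/ consumes "130/", the group captures the digits after it.
# '#' is used as the s command delimiter so the slash needs no escaping.
total=$(printf '%s\n' "$text" | sed -E 's#.*/([0-9]+).*#\1#')

# Without the anchor, the greedy .* swallows as much as it can
# and leaves only the last digit for the group.
greedy=$(printf '%s\n' "$text" | sed -E 's/.*([0-9]+).*/\1/')

printf '%s\n' "$total" "$greedy"
```

If the result is shorter than expected, adding a literal character just before the group, as with the slash here, is usually the simplest fix.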

2. Use the sed command to remove the matched part and keep the remaining part.

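Decoding: ^ at the start of [] indicates negation, so [^>] matches any single character other than >. The pattern <[^>]*> therefore matches one whole tag.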

【Example of use】

echo '<li class="pagingnumber">130/2098</li>' | sed -E 's/<[^>]*>//g'
#Output result
130/2098
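Combined with grep, the tag removal can be sketched as a small pipeline. Only the li line with class pagingnumber comes from the example above. The surrounding HTML and the variable names are made up for illustration.

```shell
# Tiny HTML sample; everything except the pagingnumber line is made up.
html='<ul>
<li class="title">E-books</li>
<li class="pagingnumber">130/2098</li>
</ul>'

# grep keeps only the line of interest, then sed strips every tag on it.
paging=$(printf '%s\n' "$html" | grep 'pagingnumber' | sed -E 's/<[^>]*>//g')

printf '%s\n' "$paging"
```

This approach assumes each tag starts and ends on the same line, since sed processes its input one line at a time.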

3. Use the cut command to extract the nth string from a string separated by a specific character.

When to use: When extracting a string from CSV.

Decoding: in '-d , -f 2', -d , sets the comma as the delimiter and -f 2 selects the second item of the delimited string.

echo '1,baseball,Hanshin' | cut -d , -f 2
#Output result
baseball
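The delimiter does not have to be a comma. As a minimal sketch, the same command can split the 130/2098 string from method 2 on the slash. The variable names here are made up for illustration.

```shell
# Page counter text as produced by the tag removal in method 2.
paging='130/2098'

# Split on / and take each side.
current=$(printf '%s\n' "$paging" | cut -d / -f 1)
total=$(printf '%s\n' "$paging" | cut -d / -f 2)

printf '%s\n' "$current" "$total"
```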

4. Use the awk command to extract the nth string from a string aligned with spaces

Use this when columns are aligned with spaces, so that several delimiters appear in a row. (cut is not suitable when delimiters are consecutive, because it treats each delimiter as a separate field boundary.) If you give awk the program {print $n}, it extracts the nth string.

echo 'A B C D E' | awk '{print $4}'
#Output result
D
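The difference between cut and awk on consecutive spaces can be sketched as follows. The input line and the variable names are made up for illustration.

```shell
# Columns padded with a varying number of spaces (made-up sample).
line='A  B   C D'

# cut treats every single space as a boundary, so the second field
# is the empty string between the first two spaces.
with_cut=$(printf '%s\n' "$line" | cut -d ' ' -f 2)

# awk, by default, treats a run of spaces as one separator.
with_awk=$(printf '%s\n' "$line" | awk '{print $2}')

printf 'cut: [%s]\nawk: [%s]\n' "$with_cut" "$with_awk"
```

For input with a single, consistent delimiter cut is enough. awk is the safer choice once the spacing varies.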
