If you are using `grep -f` and **only lines that end with the search string** are hit, the line feed code may be the cause.
→ In short: if you correct the line feed code with `sed` or `echo`, the search works.
(* This was on Windows. It may behave differently on Mac.)
There are two ways to search using each line of a file: `grep -f`, or a `while` statement that reads one line at a time (`while read line`).
grep -f [search string file] [search target file]
while read line; do grep "$line" [search target file]; done < [search string file]
(* With this method, the last line of **[search string file]** must end with a line break, otherwise the last line is not read. Also note that a line matching multiple search strings is output multiple times.)
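The two caveats in the note above can be worked around. The following is a minimal sketch, not the method used in this article: the sample files are hypothetical and created in a temporary directory, `|| [ -n "$line" ]` picks up a final line that has no trailing line break, and `sort -u` collapses repeated output (at the cost of changing the output order).

```shell
# Hypothetical sample files, created only for this illustration.
tmp=$(mktemp -d)
printf 'AAA\nBBB' > "$tmp/file.txt"    # note: no line break after the last line
printf 'xxAAAxx\nxxBBBxx\nxxAAAxxBBBx\nxxCCCxx\n' > "$tmp/test.txt"

# read returns non-zero on a final line without a newline, but still fills $line;
# the extra [ -n "$line" ] test lets the loop body run for that line too.
# sort -u removes lines that matched more than one search string.
result=$(
  while IFS= read -r line || [ -n "$line" ]; do
    grep -- "$line" "$tmp/test.txt"
  done < "$tmp/file.txt" | sort -u
)
printf '%s\n' "$result"
rm -rf "$tmp"
```

If the original order of the target file matters, `grep -f` is the simpler choice, since it prints each matching line once.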
When searching with each line of a file as shown below, it can happen (or rather, it did happen) that only lines with the search string at the end are selected.
Search string file (file.txt)
AAA
BBB
Search target file (test.txt)
AAAxxxxxxxx
xxxxxAAAxxx
xxxxxxxxAAA
BBBx BBB xx
xxxxxxxxBBB
xxAAAxxBBBx
xxxCCCxxxxx
Only lines that end with a search string are selected
$ grep -f file.txt test.txt
xxxxxxxxAAA
xxxxxxxxBBB
$ while read line
> do
> grep "$line" test.txt
> done < file.txt
xxxxxxxxAAA
xxxxxxxxBBB
Any edit will do, even one that changes nothing: if you use `sed` to recreate **[search string file]**, the search works.
$ sed 's/^//' file.txt > file2.txt
$ grep -f file2.txt test.txt
AAAxxxxxxxx
xxxxxAAAxxx
xxxxxxxxAAA
BBBx BBB xx
xxxxxxxxBBB
xxAAAxxBBBx
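The `sed 's/^//'` step works here because the `sed` in this environment drops the `\r` as a side effect. That may not happen elsewhere (on Linux, for example, GNU sed normally leaves `\r` untouched). As a hedged alternative, the carriage returns can be deleted explicitly with `tr`. The file contents below are hypothetical and smaller than the ones above, and the target file is assumed to use LF.

```shell
# Hypothetical sample files, created only for this illustration.
tmp=$(mktemp -d)
printf 'AAA\r\nBBB\r\n' > "$tmp/file.txt"              # search strings, CRLF
printf 'AAAxxx\nxxxBBB\nxxCCCxx\n' > "$tmp/test.txt"   # search target, LF

# Delete every carriage return so each pattern is exactly "AAA" or "BBB".
tr -d '\r' < "$tmp/file.txt" > "$tmp/file2.txt"

# grep -c exits with status 1 when nothing matches, hence the "|| true".
before=$(grep -c -f "$tmp/file.txt" "$tmp/test.txt" || true)
after=$(grep -c -f "$tmp/file2.txt" "$tmp/test.txt" || true)
echo "matching lines before: $before, after: $after"
rm -rf "$tmp"
```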
For the `while` statement, passing `$line` through `echo` again also works.
$ while read line
> do
> grep `echo $line` test.txt
> done < file.txt
AAAxxxxxxxx
xxxxxAAAxxx
xxxxxxxxAAA
xxAAAxxBBBx ##
BBBx BBB xx
xxxxxxxxBBB
xxAAAxxBBBx ##Appears many times if multiple conditions are met
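Whether the backtick `echo` removes the `\r` depends on the shell in use, so as a hedged alternative the carriage return can be trimmed from `$line` explicitly. This sketch assumes a POSIX shell; the variable `cr` and the sample files are introduced only for illustration.

```shell
# Hypothetical sample files, created only for this illustration.
tmp=$(mktemp -d)
printf 'AAA\r\nBBB\r\n' > "$tmp/file.txt"              # search strings, CRLF
printf 'AAAxxx\nxxxBBB\nxxCCCxx\n' > "$tmp/test.txt"   # search target, LF

cr=$(printf '\r')    # a literal carriage return character
hits=$(
  while IFS= read -r line; do
    line=${line%"$cr"}               # drop one trailing \r, if present
    grep -- "$line" "$tmp/test.txt"
  done < "$tmp/file.txt"
)
printf '%s\n' "$hits"
rm -rf "$tmp"
```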
For other ways to convert line feed codes, see: Conversion of line feed code
The cause of this behavior is that the line feed code differs between Windows and Unix.
In other words, if a file created on Windows is used as the search string file, the **\r** becomes part of each search string (the search is for "**search string + \r**"), so a line is hit only when the search string sits at the very end of it, right before that line's own \r.
(In fact, if you remove the line break on the second line of **file.txt**, "BBB" is searched normally. Conversely, if you run `sed` on **test.txt** to change its line feed code from CRLF to LF, nothing is output.)
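The mechanism can be reproduced with a small sketch in which both files are written with CRLF endings via `printf`. The file contents are hypothetical and shorter than the ones above. Each pattern is effectively `AAA\r` or `BBB\r`, so it can match only where the string is immediately followed by the `\r` at the end of a line.

```shell
# Hypothetical sample files, created only for this illustration.
tmp=$(mktemp -d)
printf 'AAA\r\nBBB\r\n' > "$tmp/file.txt"    # search strings, CRLF
printf 'AAAxxx\r\nxxxAAA\r\nBBBxxx\r\nxxxBBB\r\nxxCCCxx\r\n' > "$tmp/test.txt"   # target, CRLF

# Matched lines still carry their trailing \r, so strip it for display.
matched=$(grep -f "$tmp/file.txt" "$tmp/test.txt" | tr -d '\r')
printf '%s\n' "$matched"
rm -rf "$tmp"
```

Only the two lines that end with `AAA` or `BBB` are printed; `AAAxxx` and `BBBxxx` contain the strings but not followed by `\r`.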
OS | Line feed code | How it looks in `od -c` |
---|---|---|
Unix | LF | \n |
Mac(OSX) | LF | \n |
Mac(OS9) | CR | \r |
Windows | CR+LF | \r\n |
Source: Check line feed code
When there is no line break on the second line (BBB) of file.txt
$ grep -f file.txt test.txt
xxxxxxxxAAA
BBBx BBB xx
xxxxxxxxBBB
xxAAAxxBBBx
When `sed` is run on the search target file (test.txt) (nothing is output)
$ sed 's/^//' test.txt > test2.txt
$ grep -f file.txt test2.txt
In this environment, passing the file through `sed` or `echo` converts the line feed code from **CRLF (\r\n)** to **LF (\n)**, so the search works.
Before and after the sed command
## ----------------------Before sed(CRLF)
$ file file.txt
file.txt: ASCII text, with CRLF line terminators
$ od -c file.txt
0000000 A A A \r \n B B B \r \n
0000012
## ----------------------After sed(LF)
$ file file2.txt
file2.txt: ASCII text
$ od -c file2.txt
0000000 A A A \n B B B \n
0000010
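As a quick check that does not require reading `od` output, the number of lines ending in a carriage return can be counted with `grep -c`. This is a minimal sketch with hypothetical file names; `cr` is a helper variable introduced for illustration.

```shell
# Hypothetical sample files, created only for this illustration.
tmp=$(mktemp -d)
printf 'AAA\r\nBBB\r\n' > "$tmp/crlf.txt"
printf 'AAA\nBBB\n' > "$tmp/lf.txt"

cr=$(printf '\r')    # a literal carriage return character
# Count lines whose last character is \r.
# grep -c exits with status 1 when the count is 0, hence the "|| true".
crlf_count=$(grep -c "$cr\$" "$tmp/crlf.txt" || true)
lf_count=$(grep -c "$cr\$" "$tmp/lf.txt" || true)
echo "crlf.txt: $crlf_count line(s) end with CR, lf.txt: $lf_count"
rm -rf "$tmp"
```

A non-zero count for the search string file suggests it should be converted before being passed to `grep -f`.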
Before and after the echo command
$ cat hoge.txt
hoge
$ while read line
> do
> echo `echo $line` > hoge2.txt
> done < hoge.txt
## ----------------------before echo(CRLF)
$ od -c hoge.txt
0000000 h o g e \r \n
0000006
## ----------------------After echo(LF)
$ od -c hoge2.txt
0000000 h o g e \n
0000005
Reference: ・[Sed] Command (Basic) - Edit text file ・How to return the line feed code from LF to CRLF after replacing with sed on Windows changed it
On Mac, CR was used in the classic Mac OS, but from Mac OS X onward it is reportedly LF, the same as Unix-like OSes.