A review note for the class java.util.Scanner

Overview

This is a review note on the java.util.Scanner class, which was introduced in Java 1.5. Most usage examples of the Scanner class read from standard input (System.in); this article instead focuses on reading text files and does not deal with standard input. It also does not cover every API of the Scanner class.

Environment

Reference

How to create an instance

From File

signature


public Scanner(File source) throws FileNotFoundException

public Scanner(File source, String charsetName) throws FileNotFoundException

** Code example **

example


Path in = Paths.get("path/to/sample.in");
Scanner scanner = new Scanner(in.toFile());
try (scanner) {
  // ...abridgement...
}

From InputStream

signature


public Scanner(InputStream source)

public Scanner(InputStream source, String charsetName)

** Code example **

example


InputStream in = Files.newInputStream(Paths.get("path/to/sample.in"));
Scanner scanner = new Scanner(in);
try (scanner) {
  // ...abridgement...
}

As described in the JavaDoc, when a Scanner is closed, its input source is also closed if that source implements the Closeable interface. The source therefore does not need to be listed separately in the try clause.
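
The closing behavior described above can be checked with a small sketch. TrackingStream and ScannerCloseDemo are hypothetical names introduced only for this illustration; the stream simply records whether close() was called on it.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

class ScannerCloseDemo {

    // Hypothetical helper stream that remembers whether close() was called.
    static class TrackingStream extends ByteArrayInputStream {
        boolean closed = false;

        TrackingStream(String text) {
            super(text.getBytes(StandardCharsets.UTF_8));
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    // Reads one token, lets try-with-resources close the Scanner,
    // then reports whether the underlying stream was closed as well.
    static boolean sourceClosedWithScanner() {
        TrackingStream in = new TrackingStream("apple banana");
        try (Scanner scanner = new Scanner(in, "UTF-8")) {
            scanner.next();
        }
        return in.closed;
    }

    public static void main(String[] args) {
        System.out.println(sourceClosedWithScanner());
    }
}
```

Only the Scanner is declared in the try clause here, and the stream is still closed through it.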

From Path

signature


public Scanner(Path source) throws IOException

public Scanner(Path source, String charsetName) throws IOException

** Code example **

example


Path in = Paths.get("path", "to", "sample.in");
Scanner scanner = new Scanner(in);
try (scanner) {
  // ...abridgement...
}
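
As a self-contained variation on the Path constructor, the following sketch writes a small temporary file and reads it back with the two-argument constructor. The file contents, the temporary file name, and the class name ScannerFromPathDemo are assumptions made for this illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class ScannerFromPathDemo {

    // Writes a temporary file, reads its tokens through Scanner(Path, String),
    // and deletes the file afterwards.
    static List<String> readTokens() {
        try {
            Path in = Files.createTempFile("sample", ".in"); // hypothetical temp file
            try {
                Files.write(in, "apple banana\ncherry".getBytes(StandardCharsets.UTF_8));
                List<String> tokens = new ArrayList<>();
                try (Scanner scanner = new Scanner(in, "UTF-8")) {
                    while (scanner.hasNext()) {
                        tokens.add(scanner.next());
                    }
                }
                return tokens;
            } finally {
                Files.deleteIfExists(in);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readTokens());
    }
}
```

Passing the charset name explicitly avoids depending on the platform default encoding.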

From String

signature


public Scanner(String source)

** Code example **

example


String in = "apple banana cherry durian elderberry";
Scanner scanner = new Scanner(in);
try (scanner) {
  // ...abridgement...
}

From Readable

signature


public Scanner(Readable source)

** Code example **

example


Readable in = new FileReader(new File("path/to/sample.in"));
Scanner scanner = new Scanner(in);
try (scanner) {
  // ...abridgement...
}
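
Any Readable works as a source, not only a FileReader. The sketch below uses a StringReader so that it runs without a file; the class name ScannerFromReadableDemo and the input text are assumptions for illustration.

```java
import java.io.StringReader;
import java.util.Scanner;

class ScannerFromReadableDemo {

    // Sums the leading integer tokens read from a Readable source.
    static int sum(String text) {
        int total = 0;
        try (Scanner scanner = new Scanner(new StringReader(text))) {
            while (scanner.hasNextInt()) {
                total += scanner.nextInt();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum("10 20 30"));
    }
}
```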

Read a text file

** Sample file **

The sample is postal code data that can be downloaded from the website of Japan Post Co., Ltd. The code examples in this article process this comma-separated text directly with Scanner, but in practice you would normally use a library such as opencsv.

sample.in


32343,"69917","6991701","Shimanen","Nitagun Okuizumo","Kameda","Shimane Prefecture","Okuizumo Town, Nita District","Tortoise",0,0,0,0,0,0
32343,"69915","6991515","Shimanen","Nitagun Okuizumo","Kamokura","Shimane Prefecture","Okuizumo Town, Nita District","Kamokura",0,0,0,0,0,0
32343,"69915","6991514","Shimanen","Nitagun Okuizumo","Kawachi","Shimane Prefecture","Okuizumo Town, Nita District","Kawachi",0,0,0,0,0,0

Token delimiter

The default token delimiter is whitespace. Here "whitespace" follows the Java definition (any character for which Character.isWhitespace returns true), so in addition to the half-width space, characters such as the following are also recognized as delimiters.

** Output result **

Characters for which the method returns true are whitespace by the Java definition.

output


Character.isWhitespace(' ');       // Half-width space
// → true
Character.isWhitespace('\u0020');  // Half-width space
// → true
Character.isWhitespace('\u3000');  // Full-width space
// → true
Character.isWhitespace('\t');      // Tab
// → true
Character.isWhitespace('\n');      // Line feed
// → true
Character.isWhitespace('\f');      // Form feed
// → true
Character.isWhitespace('\r');      // Carriage return
// → true
Character.isWhitespace('\u001C');  // File separator
// → true
Character.isWhitespace('\u001D');  // Group separator
// → true
Character.isWhitespace('\u001E');  // Record separator
// → true
Character.isWhitespace('\u001F');  // Unit separator
// → true

Character.isWhitespace('\u00a0');  // No-break space (&nbsp; in HTML)
// → false
Character.isWhitespace('a');
// → false
Character.isWhitespace('あ');
// → false
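
To see how these rules affect tokenizing, here is a minimal sketch that splits a string with the default delimiter. The input string and the class name DefaultDelimiterDemo are assumptions for illustration. The full-width space and the tab separate tokens, while the no-break space stays inside a token.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class DefaultDelimiterDemo {

    // Splits the text into tokens using the Scanner's default delimiter.
    static List<String> split(String text) {
        List<String> tokens = new ArrayList<>();
        try (Scanner scanner = new Scanner(text)) {
            while (scanner.hasNext()) {
                tokens.add(scanner.next());
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // \u3000 = full-width space, \t = tab, \u00a0 = no-break space
        List<String> tokens = split("apple\u3000banana\tcherry\u00a0durian");
        System.out.println(tokens.size()); // prints 3
    }
}
```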

Checking the Pattern used for delimiter matching

signature


public Pattern delimiter()

** Code example **

example


scanner.delimiter().pattern();

Output result

output


\p{javaWhitespace}+

Specifying the Pattern used for delimiter matching

Use the useDelimiter method to set any pattern as the delimiter.

signature


public Scanner useDelimiter(Pattern pattern)

public Scanner useDelimiter(String pattern)

** Code example **

example


String in = "apple :  banana   : cherry  : durian  :  elderberry";
Scanner scanner = new Scanner(in);
try (scanner) {
  scanner.useDelimiter("\\s*:\\s*");
  while (scanner.hasNext()) {
    System.out.println("[" + scanner.next() + "]");
  }
}

Output result

output


[apple]
[banana]
[cherry]
[durian]
[elderberry]
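
The example above uses the String overload. As a sketch of the Pattern overload, the following precompiles the same delimiter once and reuses it; the class name PatternDelimiterDemo and the constant COLON are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Pattern;

class PatternDelimiterDemo {

    // Precompiled delimiter: a colon with optional whitespace on both sides.
    private static final Pattern COLON = Pattern.compile("\\s*:\\s*");

    // Splits the text on the precompiled delimiter pattern.
    static List<String> split(String text) {
        List<String> tokens = new ArrayList<>();
        // useDelimiter returns the same Scanner, so the call can be chained.
        try (Scanner scanner = new Scanner(text).useDelimiter(COLON)) {
            while (scanner.hasNext()) {
                tokens.add(scanner.next());
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(split("apple :  banana   : cherry"));
    }
}
```

A precompiled Pattern can be shared by several scanners, which may be convenient when many inputs use the same delimiter.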

Read line by line from a text file

Use the hasNextLine and nextLine methods. The hasNextLine method returns true if the scanner's input has another line. The nextLine method returns the text from the scanner's current position to the end of the line, excluding the line separator, and moves the scanner's position to the beginning of the next line.

signature


public boolean hasNextLine()

public String nextLine()

** Code example **

example


File in = new File("path/to/sample.in");
Scanner scanner = new Scanner(in);
try (scanner) {
  int counter = 0;
  while (scanner.hasNextLine()) {
    System.out.println(String.format("%2d: %s", ++counter, scanner.nextLine()));
  }
}

Output result

output


 1: 32343,"69917","6991701","Shimanen","Nitagun Okuizumo","Kameda","Shimane Prefecture","Okuizumo Town, Nita District","Tortoise",0,0,0,0,0,0
 2: 32343,"69915","6991515","Shimanen","Nitagun Okuizumo","Kamokura","Shimane Prefecture","Okuizumo Town, Nita District","Kamokura",0,0,0,0,0,0
 3: 32343,"69915","6991514","Shimanen","Nitagun Okuizumo","Kawachi","Shimane Prefecture","Okuizumo Town, Nita District","Kawachi",0,0,0,0,0,0
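
The same loop can be tried without a file. This sketch, which assumes a String source with mixed line endings and a hypothetical class name LineReadingDemo, shows that nextLine drops the line separator and returns an empty string for a blank line.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class LineReadingDemo {

    // Collects every line of the text, without line separators.
    static List<String> lines(String text) {
        List<String> result = new ArrayList<>();
        try (Scanner scanner = new Scanner(text)) {
            while (scanner.hasNextLine()) {
                result.add(scanner.nextLine());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // CRLF after "first", an empty line, then LF after "third"
        List<String> result = lines("first\r\n\r\nthird\n");
        System.out.println(result.size()); // prints 3
    }
}
```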

Read from a text file in token units

The hasNext method returns true if the scanner's input has another token. The next method returns the next token from the scanner's current position and moves the scanner's position to the following delimiter.

The default token delimiters are whitespace characters (half-width and full-width spaces, tabs, line feeds, and so on). In this sample data the tokens are separated by commas, so the delimiter must be set explicitly with the useDelimiter method, and the line feed must also be included in the delimiter pattern.

signature


public boolean hasNext()

public String next()

** Code example **

example


File in = new File("path/to/sample.in");
Scanner scanner = new Scanner(in);
try (scanner) {
  scanner.useDelimiter(",|\n");
  int counter = 0;
  while (scanner.hasNext()) {
    System.out.println(String.format("%2d: %s", ++counter, scanner.next()));
  }
}

Output result

output


 1: 32343
 2: "69917"
 3: "6991701"
 4: "Shimanen"
 5: "Nitagun Okuizumo"
 6: "Kameda"
 7: "Shimane Prefecture"
 8: "Okuizumo Town, Nita District"
 9: "Tortoise"
10: 0
11: 0
12: 0
13: 0
14: 0
15: 0
16: 32343
17: "69915"
18: "6991515"
19: "Shimanen"
20: "Nitagun Okuizumo"
21: "Kamokura"
22: "Shimane Prefecture"
23: "Okuizumo Town, Nita District"
24: "Kamokura"
25: 0
26: 0
27: 0
28: 0
29: 0
30: 0
31: 32343
32: "69915"
33: "6991514"
34: "Shimanen"
35: "Nitagun Okuizumo"
36: "Kawachi"
37: "Shimane Prefecture"
38: "Okuizumo Town, Nita District"
39: "Kawachi"
40: 0
41: 0
42: 0
43: 0
44: 0
45: 0
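
If the input may use Windows line endings, a delimiter of ",|\n" would leave a carriage return at the end of the last token of each line. One possible variation, sketched here with a hypothetical class name CsvTokenDemo and shortened sample rows, is to accept an optional carriage return before the line feed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class CsvTokenDemo {

    // Splits the text on commas and on LF or CRLF line endings.
    static List<String> tokens(String text) {
        List<String> result = new ArrayList<>();
        try (Scanner scanner = new Scanner(text)) {
            scanner.useDelimiter(",|\r?\n");
            while (scanner.hasNext()) {
                result.add(scanner.next());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "32343,\"69917\"\r\n32343,\"69915\"\r\n";
        System.out.println(tokens(text).size()); // prints 4
    }
}
```

Note that Scanner skips only one delimiter match at a time, so two adjacent commas produce an empty token rather than being merged.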

Combined use of next and nextLine

You can also combine the two methods: use next to read the tokens you need from the start of a line, then use nextLine to skip the remaining data up to the end of the line.

** Code example **

example


File in = new File("path/to/sample.in");
Scanner scanner = new Scanner(in);
try (scanner) {
  scanner.useDelimiter(",");
  int counter = 0;
  while (scanner.hasNextLine()) {
    int code = scanner.nextInt();        //National local government code
    String zip5 = scanner.next();        //Zip code (5 digits)
    String zip7 = scanner.next();        //Zip code (7 digits)
    scanner.next();                      //skip Prefecture name Half-width katakana
    scanner.next();                      //skip City / ward / town / village name Half-width katakana
    scanner.next();                      //skip Town area name Half-width katakana
    String prefectures = scanner.next(); //Name of prefectures
    String city = scanner.next();        //City name
    String townArea = scanner.next();    //Town area name
    System.out.println(String.format("%2d: %d %s %s %s %s %s", ++counter, code, zip5, zip7, prefectures, city, townArea));
    scanner.nextLine();                  // next line
  }
}

Output result

output


 1: 32343 "69917" "6991701" "Shimane Prefecture" "Okuizumo Town, Nita District" "Tortoise"
 2: 32343 "69915" "6991515" "Shimane Prefecture" "Okuizumo Town, Nita District" "Kamokura"
 3: 32343 "69915" "6991514" "Shimane Prefecture" "Okuizumo Town, Nita District" "Kawachi"
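
An alternative arrangement, shown here only as a sketch, reads each line with nextLine and then parses that line with a second Scanner. The class name PerLineScannerDemo, the shortened rows, and the choice to strip the double quotes are assumptions for this illustration, not part of the original example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class PerLineScannerDemo {

    // Returns "code zip7" for each non-empty line of comma-separated text.
    static List<String> parse(String text) {
        List<String> rows = new ArrayList<>();
        try (Scanner lines = new Scanner(text)) {
            while (lines.hasNextLine()) {
                String line = lines.nextLine();
                if (line.isEmpty()) {
                    continue; // ignore blank lines
                }
                // A second Scanner handles the fields of this one line.
                try (Scanner fields = new Scanner(line).useDelimiter(",")) {
                    int code = fields.nextInt();                     // local government code
                    fields.next();                                   // skip the 5-digit zip code
                    String zip7 = fields.next().replace("\"", "");   // 7-digit zip code, quotes removed
                    rows.add(code + " " + zip7);
                }
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        String text = "32343,\"69917\",\"6991701\",0\n32343,\"69915\",\"6991515\",0\n";
        System.out.println(parse(text));
    }
}
```

With this layout the field scanner never sees a line separator, so the delimiter can stay a plain comma.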

Search for patterns with findInLine

The findInLine method searches for a string that matches the specified pattern, from the scanner's current position to the end of the line. It returns null if no matching string is found.

signature


public String findInLine(String pattern)

public String findInLine(Pattern pattern)

** Code example **

example


File in = new File("path/to/sample.in");
Scanner scanner = new Scanner(in);
try (scanner) {
  int counter = 0;
  while (scanner.hasNextLine()) {
    String find = scanner.findInLine("69915[0-9]{2}");
    System.out.println(String.format("%2d: %s", ++counter, find));
    scanner.nextLine(); // next line
  }
}

Output result

output


 1: null
 2: 6991515
 3: 6991514
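
When the pattern contains capture groups, the Scanner's match method gives access to them after a successful findInLine. The following is a minimal sketch under that assumption; the class name FindInLineDemo, the quoted pattern, and the shortened rows are hypothetical.

```java
import java.util.Scanner;
import java.util.regex.MatchResult;

class FindInLineDemo {

    // Returns the 7-digit code starting with 69915 found in the line, or null.
    static String findZip(String line) {
        try (Scanner scanner = new Scanner(line)) {
            String found = scanner.findInLine("\"(69915[0-9]{2})\"");
            if (found == null) {
                return null; // no match in this line
            }
            // match() exposes the result of the last scanning operation.
            MatchResult result = scanner.match();
            return result.group(1); // digits only, without the quotes
        }
    }

    public static void main(String[] args) {
        System.out.println(findZip("32343,\"69915\",\"6991515\",0")); // prints 6991515
        System.out.println(findZip("32343,\"69917\",\"6991701\",0")); // prints null
    }
}
```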

API added in Java 9

tokens

Returns a stream of the delimiter-separated tokens from the scanner.

signature


public Stream<String> tokens()

** Code example **

example


String in = "apple banana cherry durian elderberry";
Scanner scanner = new Scanner(in);
try (scanner) {
  final List<String> fruits = scanner.tokens()
    .map(String::toUpperCase)
    .collect(Collectors.toUnmodifiableList());
  System.out.println(fruits);
}

Output result

output


[APPLE, BANANA, CHERRY, DURIAN, ELDERBERRY]
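
The tokens method also respects a custom delimiter. As a sketch, assuming Java 9 or later and a hypothetical class name TokensSumDemo, the following sums comma-separated integers.

```java
import java.util.Scanner;

class TokensSumDemo {

    // Sums integers separated by commas with optional surrounding whitespace.
    static int sum(String csv) {
        try (Scanner scanner = new Scanner(csv).useDelimiter("\\s*,\\s*")) {
            return scanner.tokens()
                .mapToInt(Integer::parseInt)
                .sum();
        }
    }

    public static void main(String[] args) {
        System.out.println(sum("1, 2,3 , 4")); // prints 10
    }
}
```

The stream is consumed inside the try block, before the Scanner is closed.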

findAll

Returns a stream of match results for the specified pattern from the scanner.

signature


public Stream<MatchResult> findAll(Pattern pattern)

public Stream<MatchResult> findAll(String patString)

** Code example **

example


File in = new File("path/to/sample.in");
Scanner scanner = new Scanner(in);
try (scanner) {
  List<String> list = scanner.findAll("\"[0-9]{5,}\"")
    .map(MatchResult::group)
    .collect(Collectors.toUnmodifiableList());
  System.out.println(list);
}

Output result

output


["69917", "6991701", "69915", "6991515", "69915", "6991514"]
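
Because findAll yields MatchResult objects, capture groups can be used to drop the surrounding quotes. This is a sketch assuming Java 9 or later; the class name FindAllGroupsDemo, the 7-digit pattern, and the shortened rows are hypothetical.

```java
import java.util.List;
import java.util.Scanner;
import java.util.stream.Collectors;

class FindAllGroupsDemo {

    // Extracts every quoted 7-digit number, returning only the digits.
    static List<String> zipCodes(String text) {
        try (Scanner scanner = new Scanner(text)) {
            return scanner.findAll("\"([0-9]{7})\"")
                .map(m -> m.group(1)) // group 1 is the part inside the quotes
                .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) {
        String text = "32343,\"69917\",\"6991701\",0\n32343,\"69915\",\"6991515\",0\n";
        System.out.println(zipCodes(text)); // prints [6991701, 6991515]
    }
}
```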

API added in Java 10

New constructors

New constructors that take a Charset as the second argument have been added.

signature


public Scanner(InputStream source, Charset charset)

public Scanner(File source, Charset charset) throws IOException

public Scanner(Path source, Charset charset) throws IOException

public Scanner(ReadableByteChannel source, Charset charset)
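
As a minimal sketch of the InputStream variant, assuming Java 10 or later, the following decodes UTF-8 bytes with a Charset constant instead of a charset name. The class name CharsetConstructorDemo and the input text are assumptions for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

class CharsetConstructorDemo {

    // Decodes the bytes as UTF-8 and returns the first token.
    static String firstToken(byte[] bytes) {
        try (Scanner scanner = new Scanner(new ByteArrayInputStream(bytes), StandardCharsets.UTF_8)) {
            return scanner.next();
        }
    }

    public static void main(String[] args) {
        // \u00e9 is a two-byte character in UTF-8
        byte[] bytes = "caf\u00e9 au lait".getBytes(StandardCharsets.UTF_8);
        System.out.println(firstToken(bytes).length()); // prints 4
    }
}
```

Using a Charset object avoids the runtime lookup of a charset name and the exception raised when that name is misspelled.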
