Overview

This is a review note on the java.util.Scanner class, introduced in Java 1.5. Many Scanner usage examples read from standard input (System.in); this article instead focuses on reading text files and does not deal with standard input. It also does not cover every API of the Scanner class.

Environment

Windows 10 Professional
OpenJDK 10.0.1

How to create an instance

From File

`signature`


public Scanner(File source) throws FileNotFoundException

public Scanner(File source, String charsetName) throws FileNotFoundException

** Code example **

`example`


Path in = Paths.get("path/to/sample.in");
Scanner scanner = new Scanner(in.toFile());
try (scanner) {
  // ...omitted...
}

From InputStream

`signature`


public Scanner(InputStream source)

public Scanner(InputStream source, String charsetName)

** Code example **

`example`


InputStream in = Files.newInputStream(Paths.get("path/to/sample.in"));
Scanner scanner = new Scanner(in);
try (scanner) {
  // ...omitted...
}

As the JavaDoc states, when a Scanner is closed, its input source is also closed if that source implements the Closeable interface. The InputStream therefore does not need to be declared as a separate resource in the try clause.
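
The closing behavior described above can be observed directly. The following is a minimal sketch, not code from the original article: TrackingInputStream is a hypothetical helper introduced only to record whether close() was called on the underlying stream.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

final class ScannerCloseDemo {

  // Hypothetical helper: an InputStream wrapper that records whether close() was called.
  static final class TrackingInputStream extends FilterInputStream {
    boolean closed = false;

    TrackingInputStream(InputStream in) {
      super(in);
    }

    @Override
    public void close() throws IOException {
      closed = true;
      super.close();
    }
  }

  // Reads one token through a Scanner, closes only the Scanner,
  // and reports whether the wrapped stream was closed as well.
  static boolean sourceClosedAfterScannerClose(String text) {
    TrackingInputStream in = new TrackingInputStream(
        new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8)));
    try (Scanner scanner = new Scanner(in, "UTF-8")) {
      scanner.next();
    }
    return in.closed;
  }

  public static void main(String[] args) {
    System.out.println(sourceClosedAfterScannerClose("apple banana")); // true
  }
}
```

Only the Scanner appears in the try clause here, yet the stream is closed too.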

From Path

`signature`


public Scanner(Path source) throws IOException

public Scanner(Path source, String charsetName) throws IOException

** Code example **

`example`


Path in = Paths.get("path", "to", "sample.in");
Scanner scanner = new Scanner(in);
try (scanner) {
  // ...omitted...
}
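
The examples so far use the one-argument constructors, which decode the file with the platform default charset. As a rough sketch of the charsetName overload, the following writes a temporary file (a hypothetical stand-in for path/to/sample.in) as UTF-8 and reads it back with an explicit charset name. It also assumes that an unknown charset name is rejected with an IllegalArgumentException, as the JavaDoc describes.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Scanner;

final class ScannerCharsetNameDemo {

  // Writes the text to a temporary file as UTF-8, then reads the first line back
  // through a Scanner created with the given charset name.
  static String firstLine(String text, String charsetName) {
    try {
      Path tmp = Files.createTempFile("scanner-demo", ".in"); // hypothetical temp file
      try {
        Files.write(tmp, text.getBytes(StandardCharsets.UTF_8));
        try (Scanner scanner = new Scanner(tmp, charsetName)) {
          return scanner.nextLine();
        }
      } finally {
        Files.deleteIfExists(tmp);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    // "caf\u00e9" contains one non-ASCII character, so the charset matters.
    System.out.println(firstLine("caf\u00e9\n", "UTF-8").length()); // 4
    try {
      firstLine("abc\n", "no-such-charset");
    } catch (IllegalArgumentException e) {
      System.out.println("rejected");
    }
  }
}
```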

From String

`signature`


public Scanner(String source)

** Code example **

`example`


String in = "apple banana cherry durian elderberry";
Scanner scanner = new Scanner(in);
try (scanner) {
  // ...omitted...
}

From Readable

`signature`


public Scanner(Readable source)

** Code example **

`example`


Readable in = new FileReader(new File("path/to/sample.in"));
Scanner scanner = new Scanner(in);
try (scanner) {
  // ...omitted...
}

Read a text file

** Sample file **

The sample is postal code data that can be downloaded from the Japan Post Co., Ltd. website. The code examples in this article process this comma-separated text directly with Scanner, but in practice you would usually use a library such as opencsv.

`sample.in`


32343,"69917","6991701","Shimanen","Nitagun Okuizumo","Kameda","Shimane Prefecture","Okuizumo Town Nita District","Tortoise",0,0,0,0,0,0
32343,"69915","6991515","Shimanen","Nitagun Okuizumo","Kamokura","Shimane Prefecture","Okuizumo Town Nita District","Kamokura",0,0,0,0,0,0
32343,"69915","6991514","Shimanen","Nitagun Okuizumo","Kawachi","Shimane Prefecture","Okuizumo Town Nita District","Kawachi",0,0,0,0,0,0

Token delimiter

The default token delimiter is whitespace. Here, whitespace means any character for which Character.isWhitespace returns true, so in addition to the half-width space, characters such as the following are also recognized as delimiters.

** Output result **

A character for which the call returns true is whitespace by Java's definition.

`output`


Character.isWhitespace(' ');       //Half-width space
// → true
Character.isWhitespace('\u0020');  //Half-width space
// → true
Character.isWhitespace('　');      //Full-width space
// → true
Character.isWhitespace('\t');      //tab
// → true
Character.isWhitespace('\n');      //new line
// → true
Character.isWhitespace('\f');      //Form feed
// → true
Character.isWhitespace('\r');      //Carriage return
// → true
Character.isWhitespace('\u001C');  //File separator
// → true
Character.isWhitespace('\u001D');  //Group separator
// → true
Character.isWhitespace('\u001E');  //Record separator
// → true
Character.isWhitespace('\u001F');  //Unit separator
// → true

Character.isWhitespace('\u00a0');  //No-break space (the so-called &nbsp;)
// → false
Character.isWhitespace('a');
// → false
Character.isWhitespace('あ');      //Hiragana letter a
// → false
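
To connect this with Scanner itself, here is a small sketch (the class and method names are made up for illustration) that tokenizes strings with the default delimiter. It assumes only the behavior listed above: the full-width space (\u3000) separates tokens, while the no-break space (\u00a0) does not.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

final class DefaultDelimiterDemo {

  // Splits the text with the scanner's default delimiter and returns the tokens.
  static List<String> tokens(String text) {
    List<String> result = new ArrayList<>();
    try (Scanner scanner = new Scanner(text)) {
      while (scanner.hasNext()) {
        result.add(scanner.next());
      }
    }
    return result;
  }

  public static void main(String[] args) {
    // \u3000 is the full-width space; it and the tab both separate tokens.
    System.out.println(tokens("apple\u3000banana\tcherry")); // [apple, banana, cherry]
    // \u00a0 is the no-break space; the whole string stays a single token.
    System.out.println(tokens("apple\u00a0banana").size()); // 1
  }
}
```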

Checking the Pattern used for delimiter matching

`signature`


public Pattern delimiter()

** Code example **

`example`


scanner.delimiter().pattern();

Output result

`output`


\p{javaWhitespace}+

Specifying the Pattern used for delimiter matching

Use the useDelimiter method to set any pattern as the delimiter.

`signature`


public Scanner useDelimiter(Pattern pattern)

public Scanner useDelimiter(String pattern)

** Code example **

`example`


String in = "apple :  banana   : cherry  : durian  :  elderberry";
Scanner scanner = new Scanner(in);
try (scanner) {
  scanner.useDelimiter("\\s*:\\s*");
  while (scanner.hasNext()) {
    System.out.println("[" + scanner.next() + "]");
  }
}

Output result

`output`


[apple]
[banana]
[cherry]
[durian]
[elderberry]
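
The two delimiter methods can be combined to see what a scanner is currently using. The sketch below is an illustration rather than part of the original article: it uses the Pattern overload of useDelimiter and the reset method, which restores the default delimiter along with the default locale and radix.

```java
import java.util.Scanner;
import java.util.regex.Pattern;

final class DelimiterResetDemo {

  // Returns the delimiter pattern seen before useDelimiter, after it, and after reset().
  static String[] delimiterHistory() {
    String[] seen = new String[3];
    try (Scanner scanner = new Scanner("apple : banana")) {
      seen[0] = scanner.delimiter().pattern();
      scanner.useDelimiter(Pattern.compile("\\s*:\\s*")); // Pattern overload
      seen[1] = scanner.delimiter().pattern();
      scanner.reset(); // back to the default delimiter, locale and radix
      seen[2] = scanner.delimiter().pattern();
    }
    return seen;
  }

  public static void main(String[] args) {
    for (String pattern : delimiterHistory()) {
      System.out.println(pattern);
    }
    // \p{javaWhitespace}+
    // \s*:\s*
    // \p{javaWhitespace}+
  }
}
```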

Read line by line from a text file

Use the hasNextLine and nextLine methods. The hasNextLine method returns true if the scanner has another line of input. The nextLine method returns the text from the scanner's current position to the end of the line, excluding the line separator, and moves the scanner's position to the beginning of the next line.

`signature`


public boolean hasNextLine()

public String nextLine()

** Code example **

`example`


File in = new File("path/to/sample.in");
Scanner scanner = new Scanner(in);
try (scanner) {
  int counter = 0;
  while (scanner.hasNextLine()) {
    System.out.println(String.format("%2d: %s", ++counter, scanner.nextLine()));
  }
}

Output result

`output`


 1: 32343,"69917","6991701","Shimanen","Nitagun Okuizumo","Kameda","Shimane Prefecture","Okuizumo Town Nita District","Tortoise",0,0,0,0,0,0
 2: 32343,"69915","6991515","Shimanen","Nitagun Okuizumo","Kamokura","Shimane Prefecture","Okuizumo Town Nita District","Kamokura",0,0,0,0,0,0
 3: 32343,"69915","6991514","Shimanen","Nitagun Okuizumo","Kawachi","Shimane Prefecture","Okuizumo Town Nita District","Kawachi",0,0,0,0,0,0
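
A few edge cases of nextLine are easy to check with an in-memory string instead of a file. The following sketch uses made-up input that mixes CRLF and LF line endings, contains an empty line, and has no line separator after the last line.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

final class NextLineDemo {

  // Collects every line of the text; line separators are not part of the returned strings.
  static List<String> lines(String text) {
    List<String> result = new ArrayList<>();
    try (Scanner scanner = new Scanner(text)) {
      while (scanner.hasNextLine()) {
        result.add(scanner.nextLine());
      }
    }
    return result;
  }

  public static void main(String[] args) {
    // CRLF after "first", LF after "second", an empty line, and no newline after "last".
    List<String> result = lines("first\r\nsecond\n\nlast");
    System.out.println(result.size()); // 4
    System.out.println(result);        // [first, second, , last]
  }
}
```

The empty line is returned as an empty string, and the final line is returned even though nothing terminates it.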

Read from a text file in token units

The hasNext method returns true if the scanner has another token in its input. The next method returns the next token from the scanner's current position and advances the scanner past that token.

The default token delimiters are whitespace characters (half-width and full-width spaces, tabs, line breaks, and so on). In this sample data the fields are separated by commas, so the comma must be specified explicitly with the useDelimiter method. Because that replaces the default delimiter, the line break must also be included in the delimiter pattern.

`signature`


public boolean hasNext()

public String next()

** Code example **

`example`


File in = new File("path/to/sample.in");
Scanner scanner = new Scanner(in);
try (scanner) {
  scanner.useDelimiter(",|\n");
  int counter = 0;
  while (scanner.hasNext()) {
    System.out.println(String.format("%2d: %s", ++counter, scanner.next()));
  }
}

Output result

`output`


 1: 32343
 2: "69917"
 3: "6991701"
 4: "Shimanen"
 5: "Nitagun Okuizumo"
 6: "Kameda"
 7: "Shimane Prefecture"
 8: "Okuizumo Town Nita District"
 9: "Tortoise"
10: 0
11: 0
12: 0
13: 0
14: 0
15: 0
16: 32343
17: "69915"
18: "6991515"
19: "Shimanen"
20: "Nitagun Okuizumo"
21: "Kamokura"
22: "Shimane Prefecture"
23: "Okuizumo Town Nita District"
24: "Kamokura"
25: 0
26: 0
27: 0
28: 0
29: 0
30: 0
31: 32343
32: "69915"
33: "6991514"
34: "Shimanen"
35: "Nitagun Okuizumo"
36: "Kawachi"
37: "Shimane Prefecture"
38: "Okuizumo Town Nita District"
39: "Kawachi"
40: 0
41: 0
42: 0
43: 0
44: 0
45: 0
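
One caveat about the delimiter pattern: if the file uses Windows-style CRLF line endings, a delimiter of ",|\n" leaves the carriage return attached to the last token of each line. A possible alternative, sketched here with hypothetical in-memory data, is the regex linebreak matcher \R (available since Java 8), which treats any line break sequence as one delimiter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

final class LineBreakDelimiterDemo {

  // Splits the text into tokens using the given delimiter pattern.
  static List<String> split(String text, String delimiter) {
    List<String> result = new ArrayList<>();
    try (Scanner scanner = new Scanner(text)) {
      scanner.useDelimiter(delimiter);
      while (scanner.hasNext()) {
        result.add(scanner.next());
      }
    }
    return result;
  }

  public static void main(String[] args) {
    String crlfText = "1,2\r\n3,4\r\n"; // hypothetical data with Windows line endings
    // With ",|\n" the second token is "2\r": the carriage return is still attached.
    System.out.println(split(crlfText, ",|\n").get(1).length()); // 2
    // With ",|\\R" the CRLF pair is consumed as a single delimiter.
    System.out.println(split(crlfText, ",|\\R")); // [1, 2, 3, 4]
  }
}
```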

Combined use of next and nextLine

You can also combine the two: use the next method to read tokens up to the fields you need, then call the nextLine method to skip the rest of the line.

** Code example **

`example`


File in = new File("path/to/sample.in");
Scanner scanner = new Scanner(in);
try (scanner) {
  scanner.useDelimiter(",");
  int counter = 0;
  while (scanner.hasNextLine()) {
    int code = scanner.nextInt();        //National local government code
    String zip5 = scanner.next();        //Zip code (5 digits)
    String zip7 = scanner.next();        //Zip code (7 digits)
    scanner.next();                      //skip Prefecture name Half-width katakana
    scanner.next();                      //skip City / ward / town / village name Half-width katakana
    scanner.next();                      //skip Town area name Half-width katakana
    String prefectures = scanner.next(); //Name of prefectures
    String city = scanner.next();        //City name
    String townArea = scanner.next();    //Town area name
    System.out.println(String.format("%2d: %d %s %s %s %s %s", ++counter, code, zip5, zip7, prefectures, city, townArea));
    scanner.nextLine();                  // next line
  }
}

Output result

`output`


 1: 32343 "69917" "6991701" "Shimane Prefecture" "Okuizumo Town Nita District" "Tortoise"
 2: 32343 "69915" "6991515" "Shimane Prefecture" "Okuizumo Town Nita District" "Kamokura"
 3: 32343 "69915" "6991514" "Shimane Prefecture" "Okuizumo Town Nita District" "Kawachi"
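
A variation on this idea, shown here only as a sketch, is to read each record with nextLine first and then tokenize that single line with a second Scanner. A record with an unexpected number of fields then cannot shift the reading position of the records after it. The input is hypothetical data shaped like the first columns of the sample file.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

final class LineThenTokenDemo {

  // Extracts "code zip7" from each comma-separated record and ignores the remaining fields.
  static List<String> summarize(String text) {
    List<String> result = new ArrayList<>();
    try (Scanner lines = new Scanner(text)) {
      while (lines.hasNextLine()) {
        String line = lines.nextLine();
        if (line.isEmpty()) {
          continue; // skip blank lines
        }
        // A second scanner tokenizes just this one line.
        try (Scanner fields = new Scanner(line).useDelimiter(",")) {
          int code = fields.nextInt();   // local government code
          fields.next();                 // skip the 5-digit zip code
          String zip7 = fields.next();   // 7-digit zip code, still quoted
          result.add(code + " " + zip7);
        }
      }
    }
    return result;
  }

  public static void main(String[] args) {
    String text = "32343,\"69917\",\"6991701\",0\n32343,\"69915\",\"6991515\",0\n";
    summarize(text).forEach(System.out::println);
    // 32343 "6991701"
    // 32343 "6991515"
  }
}
```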

Search for patterns with findInLine

The findInLine method searches for a string that matches the specified pattern from the scanner's current position to the end of the line, ignoring delimiters. It returns null if no matching string is found.

`signature`


public String findInLine(String pattern)

public String findInLine(Pattern pattern)

** Code example **

`example`


File in = new File("path/to/sample.in");
Scanner scanner = new Scanner(in);
try (scanner) {
  int counter = 0;
  while (scanner.hasNextLine()) {
    String find = scanner.findInLine("69915[0-9]{2}");
    System.out.println(String.format("%2d: %s", ++counter, find));
    scanner.nextLine(); // next line
  }
}

Output result

`output`


 1: null
 2: 6991515
 3: 6991514
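
When the pattern contains capturing groups, the groups of the last match can be read through the match method, which returns the MatchResult of the most recent scanning operation. The sketch below uses a hypothetical pattern and input line to reformat a quoted 7-digit zip code.

```java
import java.util.Scanner;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;

final class FindInLineGroupsDemo {

  // Hypothetical pattern: a quoted 7-digit zip code, captured as a 3-digit and a 4-digit group.
  private static final Pattern ZIP7 = Pattern.compile("\"([0-9]{3})([0-9]{4})\"");

  // Returns the first 7-digit zip code on the first line as NNN-NNNN, or null if there is none.
  static String hyphenatedZip(String text) {
    try (Scanner scanner = new Scanner(text)) {
      if (scanner.findInLine(ZIP7) == null) {
        return null;
      }
      MatchResult match = scanner.match(); // result of the findInLine call above
      return match.group(1) + "-" + match.group(2);
    }
  }

  public static void main(String[] args) {
    System.out.println(hyphenatedZip("32343,\"69917\",\"6991701\",0")); // 699-1701
    System.out.println(hyphenatedZip("no zip code here"));              // null
  }
}
```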

API added in Java 9

tokens

Returns a stream of tokens.

`signature`


public Stream<String> tokens()

** Code example **

`example`


String in = "apple banana cherry durian elderberry";
Scanner scanner = new Scanner(in);
try (scanner) {
  final List<String> fruits = scanner.tokens()
    .map(String::toUpperCase)
    .collect(Collectors.toUnmodifiableList());
  System.out.println(fruits);
}

Output result

`output`


[APPLE, BANANA, CHERRY, DURIAN, ELDERBERRY]
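
The stream returned by tokens honors the delimiter set with useDelimiter. As a small sketch with invented input, the following sums comma-separated integers.

```java
import java.util.Scanner;

final class TokensSumDemo {

  // Sums comma-separated integers by streaming the scanner's tokens.
  static int sum(String text) {
    // The delimiter also swallows whitespace around each comma.
    try (Scanner scanner = new Scanner(text).useDelimiter("\\s*,\\s*")) {
      return scanner.tokens()
          .mapToInt(Integer::parseInt)
          .sum();
    }
  }

  public static void main(String[] args) {
    System.out.println(sum("10, 20 ,30")); // 60
  }
}
```

The terminal operation runs inside the try block, so the stream is fully consumed before the scanner is closed.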

findAll

Returns a stream of match results for the specified pattern, searching the scanner's remaining input.

`signature`


public Stream<MatchResult> findAll(Pattern pattern)

public Stream<MatchResult> findAll(String patString)

** Code example **

`example`


File in = new File("path/to/sample.in");
Scanner scanner = new Scanner(in);
try (scanner) {
  List<String> list = scanner.findAll("\"[0-9]{5,}\"")
    .map(MatchResult::group)
    .collect(Collectors.toUnmodifiableList());
  System.out.println(list);
}

Output result

`output`


["69917", "6991701", "69915", "6991515", "69915", "6991514"]
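
Because findAll yields MatchResult objects, capturing groups are available as well. The sketch below, which uses hypothetical two-line input shaped like the sample file, returns the digits without the surrounding quotes by reading group 1 instead of the whole match.

```java
import java.util.List;
import java.util.Scanner;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class FindAllGroupsDemo {

  // Returns every quoted run of five or more digits, without the surrounding quotes.
  static List<String> quotedNumbers(String text) {
    Pattern quoted = Pattern.compile("\"([0-9]{5,})\"");
    try (Scanner scanner = new Scanner(text)) {
      return scanner.findAll(quoted)
          .map(result -> result.group(1)) // group 1 is the digits between the quotes
          .collect(Collectors.toList());
    }
  }

  public static void main(String[] args) {
    String text = "32343,\"69917\",\"6991701\",0\n32343,\"69915\",\"6991515\",0\n";
    System.out.println(quotedNumbers(text)); // [69917, 6991701, 69915, 6991515]
  }
}
```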

API added in Java 10

New constructors

New constructors have been added that take a Charset as the second argument.

`signature`


public Scanner(InputStream source, Charset charset)

public Scanner(File source, Charset charset) throws IOException

public Scanner(Path source, Charset charset) throws IOException

public Scanner(ReadableByteChannel source, Charset charset)
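
As a brief sketch of how these overloads might be used, the following decodes the same UTF-8 bytes twice through the InputStream constructor, once with the matching charset and once with ISO-8859-1. The input string is invented for illustration, and the code requires Java 10 or later.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

final class ScannerCharsetDemo {

  // Decodes the bytes with the given charset and returns the first token.
  static String firstToken(byte[] bytes, Charset charset) {
    try (Scanner scanner = new Scanner(new ByteArrayInputStream(bytes), charset)) {
      return scanner.next();
    }
  }

  public static void main(String[] args) {
    byte[] utf8Bytes = "caf\u00e9 au lait".getBytes(StandardCharsets.UTF_8);
    // Decoded with the charset the bytes were written in: the token has 4 characters.
    System.out.println(firstToken(utf8Bytes, StandardCharsets.UTF_8).length()); // 4
    // Decoded as ISO-8859-1: the 2-byte sequence for U+00E9 becomes two characters.
    System.out.println(firstToken(utf8Bytes, StandardCharsets.ISO_8859_1).length()); // 5
  }
}
```

Passing a Charset constant avoids the risk of a misspelled charset name, which the String overloads can only detect at run time.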
