Java file input/output processing you can use without knowing the historical background

Introduction

Historically, Java's file I/O was notorious for being cumbersome. It was a frequent target when the scripting language camp touted their own high "productivity": processing that takes a few lines in a scripting language was said to take dozens of lines in Java.

Time has passed, and the java.nio.file package (NIO2 File API) and the try-with-resources syntax, both introduced in Java 7 in 2011, have greatly improved the situation. Today, Java can do file input/output with about the same amount of code as a scripting language.

However, the articles that rank highest when you search for "Java file I/O" are dominated by outdated code that uses only the APIs from before these improvements. This article therefore summarizes, systematically and concisely, the simplest file input/output techniques that Java beginners should know today, leaving the historical background aside.

Prerequisite knowledge

Representation of the path

The java.nio.file.Path interface (reference: Javadoc) is used to represent a file system path. There are various ways to create a Path instance, but in practice it is mostly created with the java.nio.file.Paths utility class (reference: Javadoc).

Path path = Paths.get("items.csv");

The path string can be specified as either a relative path or an absolute path.

Path path = Paths.get("/etc/passwd");

For portability of path separators, it is better to pass the path elements as variable-length arguments.

Path path = Paths.get("/etc", "passwd");

Starting with Java 11, the Path.of(String, String...) method is available in addition to Paths.get(String, String...) (reference: Javadoc). The functionality is the same, but Path.of is more consistent with List.of(), Map.of(), and similar factory methods introduced in Java 9, and reads more naturally.

Path path = Path.of("items.csv");
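As a minimal sketch of the equivalence described above, the following class builds the same relative path in three ways and checks that the results compare equal. The class name PathCreationDemo, the method name, and the "data" directory are made up for illustration, and Path.of assumes Java 11 or later.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class PathCreationDemo {

    // Returns true when all three ways of building the path yield equal Path objects.
    static boolean sameByAllApis() {
        Path single = Paths.get("data/items.csv");      // one string containing a separator
        Path varargs = Paths.get("data", "items.csv");  // elements joined by the file system
        Path viaOf = Path.of("data", "items.csv");      // Java 11+ counterpart of Paths.get
        return single.equals(varargs) && varargs.equals(viaOf);
    }

    public static void main(String[] args) {
        System.out.println(sameByAllApis());
    }
}
```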

Basics of file processing

File-related processing is done by combining the Path interface with the java.nio.file.Files utility class (reference: Javadoc). For example:

List<String> lines = Files.readAllLines(Paths.get("items.csv"));

(Reference) Relationship with java.io.File

When you think of a class that represents a file in Java, java.io.File (reference: Javadoc) may come to mind. This is the older approach and can be replaced by the functionality of the java.nio.file package. If an older API accepts only java.io.File rather than java.nio.file.Path, you can convert between the two with Path#toFile() and File#toPath().
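The conversion in both directions can be sketched as follows. The class name PathFileConversionDemo and the roundTrip helper are hypothetical; only Path#toFile() and File#toPath() come from the standard library.

```java
import java.io.File;
import java.nio.file.Path;
import java.nio.file.Paths;

public class PathFileConversionDemo {

    // Converts a Path to a legacy File and back again.
    static Path roundTrip(Path path) {
        File legacy = path.toFile();  // for APIs that accept only java.io.File
        return legacy.toPath();       // back to the NIO2 representation
    }

    public static void main(String[] args) {
        Path original = Paths.get("items.csv");
        System.out.println(roundTrip(original).equals(original));
    }
}
```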

Processing text files

What is a text file?

This article does not go into the details of what distinguishes text from binary. From Java's perspective, a file that is read and written as a String is simply called a text file.

Bulk text reading

The simplest way to read text is the Files.readString(Path) method. It returns the entire contents of the file as a String.

String content = Files.readString(Paths.get("items.csv"));

Be careful about the character set when reading and writing text. The code above does not specify one, so UTF-8 is used by default. To specify the character set explicitly:

String content = Files.readString(Paths.get("items.csv"), StandardCharsets.UTF_8);

String content = Files.readString(Paths.get("items.csv"), Charset.forName("MS932"));

Now for some unfortunate news: the Files.readString(Path) and Files.readString(Path, Charset) methods described above are available only in Java 11 and later. Here is an alternative that has been available since Java 7. It returns the contents of the file as a List<String>.

List<String> lines = Files.readAllLines(Paths.get("items.csv"), StandardCharsets.UTF_8);
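If a single String is needed on a version without Files.readString, one possible approach is to join the lines returned by Files.readAllLines. The sketch below assumes Java 8 or later (for String.join and UncheckedIOException); the class and method names are made up for illustration. Note that joining with "\n" does not preserve the original line terminators or a trailing newline.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class ReadAllLinesDemo {

    // Reads the whole file as lines and joins them with "\n".
    static String readAsString(Path path) {
        try {
            List<String> lines = Files.readAllLines(path, StandardCharsets.UTF_8);
            return String.join("\n", lines);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes a small sample file, reads it back, and cleans up.
    static String demo() {
        try {
            Path tmp = Files.createTempFile("items", ".csv");
            try {
                Files.write(tmp, "a,b\r\nc,d\n".getBytes(StandardCharsets.UTF_8));
                return readAsString(tmp);
            } finally {
                Files.delete(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```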

Chunked text reading

If the file is small, you can read it all at once as described above. However, if you may need to process files of tens or hundreds of megabytes or more, you should also know how to handle them as a Stream<String>. Because the file is read lazily rather than loaded into memory all at once, this approach scales better to large files.

try (Stream<String> lines = Files.lines(Paths.get("items.csv"), StandardCharsets.UTF_8)) {
    lines.forEach(line -> System.out.println(line));
} catch (IOException e) {
    throw new UncheckedIOException(e);
}

When using this method, use the try-with-resources syntax so that you do not forget to close the stream. The same applies to the other chunked reading and writing methods. For more information, see Java Exception Handling in a Step-by-Step Understanding (https://qiita.com/ts7i/items/d7f6c1cd5a14e55943d4).
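Because Files.lines returns an ordinary Stream<String>, the usual stream operations can be applied without holding the whole file in memory. As a rough sketch (the class name LineStreamDemo, the helper names, and the sample data are assumptions for illustration), counting non-blank lines might look like this:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.stream.Stream;

public class LineStreamDemo {

    // Counts non-blank lines; the stream is closed by try-with-resources.
    static long countNonBlankLines(Path path) {
        try (Stream<String> lines = Files.lines(path, StandardCharsets.UTF_8)) {
            return lines.filter(line -> !line.trim().isEmpty()).count();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes a three-line sample file (one line blank), counts, and cleans up.
    static long demo() {
        try {
            Path tmp = Files.createTempFile("items", ".csv");
            try {
                Files.write(tmp, Arrays.asList("a,b", "", "c,d"), StandardCharsets.UTF_8);
                return countNonBlankLines(tmp);
            } finally {
                Files.delete(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```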

Also, if you need to read text in small pieces that are not line-oriented, use the java.io.BufferedReader (reference: Javadoc) returned by Files.newBufferedReader(Path, Charset). You will need this approach if, for example, you want to read a huge JSON file that contains no line breaks.

try (BufferedReader in = Files.newBufferedReader(Paths.get("items.csv"), StandardCharsets.UTF_8)) {
    // read() returns -1 at end of stream; 0 is a valid character (NUL), so compare against -1
    for (int ch; (ch = in.read()) != -1;) {
        System.out.print((char) ch);
    }
} catch (IOException e) {
    throw new UncheckedIOException(e);
}
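Reading one character at a time keeps the example short, but BufferedReader also offers read(char[]), which fills a buffer and returns the number of characters read. The following sketch, with a made-up class name, helper names, and sample JSON, counts occurrences of a character in a file that has no line breaks:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class CharBufferReadDemo {

    // Counts how often target occurs, reading the file in blocks of up to 8192 chars.
    static long countChar(Path path, char target) {
        long count = 0;
        char[] buffer = new char[8192];
        try (BufferedReader in = Files.newBufferedReader(path, StandardCharsets.UTF_8)) {
            int read;
            while ((read = in.read(buffer)) != -1) {  // -1 signals end of stream
                for (int i = 0; i < read; i++) {
                    if (buffer[i] == target) {
                        count++;
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    // Writes a tiny JSON document without line breaks and counts its commas.
    static long demo() {
        try {
            Path tmp = Files.createTempFile("items", ".json");
            try {
                Files.write(tmp, "{\"a\":[1,2,3]}".getBytes(StandardCharsets.UTF_8));
                return countChar(tmp, ',');
            } finally {
                Files.delete(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```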

Combining it with java.util.Scanner (reference: Javadoc) gives you a higher-level API.

try (BufferedReader in = Files.newBufferedReader(Paths.get("items.csv"), StandardCharsets.UTF_8);
        Scanner sc = new Scanner(in)) {
    sc.useDelimiter("(,|\\n)");
    while (sc.hasNext()) {
        System.out.println(sc.next());
    }
} catch (IOException e) {
    throw new UncheckedIOException(e);
}

Bulk text writing

The simplest way to write text is the Files.writeString(Path, CharSequence, OpenOption...) method. It writes the entire contents of the given CharSequence to a file. Note that String and StringBuilder are both CharSequence implementations (reference: Javadoc https://docs.oracle.com/javase/jp/8/docs/api/java/lang/CharSequence.html).

String content = "0-0\t0-1\t0-2\n1-0\t1-1\t1-2\n";
Files.writeString(Paths.get("items.tsv"), content);

You can append to an existing file by specifying an OpenOption.

String content = "0-0\t0-1\t0-2\n1-0\t1-1\t1-2\n";
Files.writeString(Paths.get("items.tsv"), content, StandardOpenOption.APPEND);

You can specify the character set here as well as when reading.

String content = "0-0\t0-1\t0-2\n1-0\t1-1\t1-2\n";
Files.writeString(Paths.get("items.tsv"), content, StandardCharsets.UTF_8);

String content = "0-0\t0-1\t0-2\n1-0\t1-1\t1-2\n";
Files.writeString(Paths.get("items.tsv"), content, Charset.forName("MS932"));

Now for more unfortunate news: the Files.writeString(Path, CharSequence, OpenOption...) and Files.writeString(Path, CharSequence, Charset, OpenOption...) methods above are available only in Java 11 and later. On earlier versions, you can use the same approach as for writing in small pieces.
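As one possible alternative on versions before Java 11, the string can be encoded to bytes and passed to Files.write(Path, byte[], OpenOption...), which has existed since Java 7. This is only a sketch: the class name BulkWriteDemo and the helper names are made up, and the use of UncheckedIOException assumes Java 8 or later.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class BulkWriteDemo {

    // Encodes the string as UTF-8 and writes it in a single call.
    static void writeAll(Path path, String content) {
        try {
            Files.write(path, content.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes sample TSV content to a temporary file, reads it back, and cleans up.
    static String demo() {
        try {
            Path tmp = Files.createTempFile("items", ".tsv");
            try {
                writeAll(tmp, "0-0\t0-1\t0-2\n1-0\t1-1\t1-2\n");
                return new String(Files.readAllBytes(tmp), StandardCharsets.UTF_8);
            } finally {
                Files.delete(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print(demo());
    }
}
```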

Chunked text writing

When writing text in small pieces, use the java.io.BufferedWriter (reference: Javadoc) returned by Files.newBufferedWriter(Path, Charset, OpenOption...).

List<String> lines = new ArrayList<String>();
...
try (BufferedWriter out = Files.newBufferedWriter(Paths.get("items.tsv"), StandardCharsets.UTF_8)) {
    for (String line : lines) {
        out.write(line);
        out.newLine();
    }
} catch (IOException e) {
    throw new UncheckedIOException(e);
}
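The OpenOption arguments work here in the same way as for bulk writing. As a hedged sketch (the class name AppendLinesDemo, the helper names, and the sample rows are hypothetical), appending lines to a file across several calls could be written as:

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;
import java.util.List;

public class AppendLinesDemo {

    // Appends the given lines, creating the file first if it does not exist.
    static void appendLines(Path path, List<String> lines) {
        try (BufferedWriter out = Files.newBufferedWriter(path, StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND)) {
            for (String line : lines) {
                out.write(line);
                out.newLine();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Appends twice to a temporary file and returns everything that was written.
    static List<String> demo() {
        try {
            Path tmp = Files.createTempFile("items", ".tsv");
            try {
                appendLines(tmp, Arrays.asList("0-0\t0-1"));
                appendLines(tmp, Arrays.asList("1-0\t1-1"));
                return Files.readAllLines(tmp, StandardCharsets.UTF_8);
            } finally {
                Files.delete(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo().size());
    }
}
```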

Processing binary files

What is a binary file?

In this article, a file that is read and written as a byte string from the perspective of Java is simply called a binary file.

Bulk binary reading

The simplest way to read a binary file is the Files.readAllBytes(Path) method. It returns the entire contents of the file as a byte[].

byte[] content = Files.readAllBytes(Paths.get("selfie.jpg"));

For reference, here is how to convert a byte[] to a String or a java.io.InputStream. Such conversions may be needed depending on the API you pass the loaded data to.

byte[] content = Files.readAllBytes(Paths.get("items.csv"));
String contentAsString = new String(content, StandardCharsets.UTF_8);

byte[] content = Files.readAllBytes(Paths.get("selfie.jpg"));
InputStream in = new ByteArrayInputStream(content);

Chunked binary reading

As with text, reading everything at once is not recommended for large files. Instead, use the java.io.InputStream (reference: Javadoc) returned by the Files.newInputStream(Path, OpenOption...) method. In practice, the resulting InputStream is more often handed to an existing library than processed in a hand-written loop. As an example, the following code passes it to Apache POI.

try (InputStream in = Files.newInputStream(Paths.get("items.xlsx"))) {
    XSSFWorkbook book = new XSSFWorkbook(in);
    ...
} catch (IOException e) {
    throw new UncheckedIOException(e);
}
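For the less common case where you do write the loop yourself, the usual pattern is to read into a fixed-size byte buffer until read returns -1. The sketch below computes a CRC32 checksum that way; the class name ChunkedReadDemo, the buffer size, and the sample data are assumptions for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.CRC32;

public class ChunkedReadDemo {

    // Computes a CRC32 checksum while holding at most 8192 bytes of the file at a time.
    static long checksum(Path path) {
        CRC32 crc = new CRC32();
        byte[] buffer = new byte[8192];
        try (InputStream in = Files.newInputStream(path)) {
            int read;
            while ((read = in.read(buffer)) != -1) {  // -1 signals end of stream
                crc.update(buffer, 0, read);          // only the bytes actually read
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return crc.getValue();
    }

    // Compares the chunked checksum with one computed over the whole array at once.
    static boolean demo() {
        byte[] data = new byte[20000];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) (i % 251);
        }
        CRC32 expected = new CRC32();
        expected.update(data);
        try {
            Path tmp = Files.createTempFile("sample", ".bin");
            try {
                Files.write(tmp, data);
                return checksum(tmp) == expected.getValue();
            } finally {
                Files.delete(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```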

Bulk binary writing

The simplest way to write a binary file is the Files.write(Path, byte[], OpenOption...) method.

byte[] content = ...;
Files.write(Paths.get("selfie.jpg"), content);

Chunked binary writing

When writing binary data in small pieces, use the java.io.OutputStream (reference: Javadoc) returned by Files.newOutputStream(Path, OpenOption...). Here too, an existing library will handle the stream more often than your own code. As an example, the following code passes it to Apache POI.

Item item = ...;
try (OutputStream out = Files.newOutputStream(Paths.get("items.xlsx"))) {
    XSSFWorkbook book = new XSSFWorkbook();
    ...
    book.write(out);
} catch (IOException e) {
    throw new UncheckedIOException(e);
}
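When no library is involved, the same buffer loop used for reading can feed an OutputStream. As a minimal sketch (ChunkedCopyDemo and its helpers are made-up names; for a plain copy, Files.copy would normally be simpler), the following copies a file in 8192-byte pieces:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

public class ChunkedCopyDemo {

    // Copies source to target piece by piece; both streams are closed automatically.
    static void copy(Path source, Path target) {
        byte[] buffer = new byte[8192];
        try (InputStream in = Files.newInputStream(source);
                OutputStream out = Files.newOutputStream(target)) {
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);  // write only the bytes actually read
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Copies a generated sample file and checks that the bytes match.
    static boolean demo() {
        byte[] data = new byte[20000];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) (i % 127);
        }
        try {
            Path source = Files.createTempFile("source", ".bin");
            Path target = Files.createTempFile("target", ".bin");
            try {
                Files.write(source, data);
                copy(source, target);
                return Arrays.equals(Files.readAllBytes(target), data);
            } finally {
                Files.delete(source);
                Files.delete(target);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```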

(Reference) Old complicated writing style

Below is an example of the kind of code that was common before the NIO2 File API and try-with-resources existed. With complexity like this, there was little room to object when the scripting language camp attacked. As for the nested try-catch, I recall an argument along the lines of: "Normally, closing the BufferedReader also closes the underlying InputStream, but if initializing the BufferedReader fails, a resource leak may occur, so this is necessary." I have forgotten the details, and I don't think there is any need to remember them anymore.

String content = null;
InputStream is = null;
try {
    is = new FileInputStream("items.csv");
    BufferedReader br = null;
    try {
        br = new BufferedReader(new InputStreamReader(is, "UTF-8"));
        StringBuilder sb = new StringBuilder();
        String line = null;
        while ((line = br.readLine()) != null) {
            sb.append(line);
            sb.append(System.lineSeparator());
        }
        content = sb.toString();
    } catch (IOException e) {
        throw new RuntimeException(e);
    } finally {
        try {
            if (br != null) {
                br.close();
            }
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
} catch (IOException e) {
    throw new RuntimeException(e);
} finally {
    try {
        if (is != null) {
            is.close();
        }
    } catch (IOException e) {
        throw new RuntimeException(e);
    }
}

Again, what you're trying to do with the code above can now be achieved with:

String content = Files.readString(Paths.get("items.csv"));
