[JAVA] Summary of FileInputStream and BufferedInputStream

Overview

I once read an article saying that wrapping a FileInputStream in a BufferedInputStream speeds up reading. So this time I am writing up what I found about FileInputStream and BufferedInputStream.

Postscript (2019/10/06): In the comments, a reader explained why reading is fast even with a plain FileInputStream when InputStreamReader is used. Please refer to that comment.

What is the difference between the two?

The following explanation made the difference between the two easy to understand.

In FileInputStream, read() is a native call to the OS that uses the disk to read a single byte. This is a heavy operation.

With BufferedInputStream, the method delegates to an overloaded read() method that reads 8192 bytes and buffers them until they are needed. It still returns only a single byte (but keeps the others in reserve). In this way, BufferedInputStream makes fewer native calls to the OS to read from the file.

Why is using BufferedInputStream to read a file byte by byte faster than using FileInputStream?

In other words, FileInputStream.read() fetches only one byte per call, which causes a very large number of disk accesses, while BufferedInputStream fetches a large block of bytes at once, so the same data can be read with far fewer accesses.
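The effect described above can be observed without timing anything, by counting how many read requests actually reach the underlying stream. The following is only a sketch: `CountingInputStream` and the class name `ReadCallCounter` are hypothetical helpers made up for this illustration, and a `ByteArrayInputStream` stands in for the file, so no native calls are involved.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ReadCallCounter {

    /** Hypothetical helper: counts how often the wrapped stream is asked for data. */
    static class CountingInputStream extends FilterInputStream {
        long calls = 0;

        CountingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            calls++;
            return super.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return super.read(b, off, len);
        }
    }

    /** Reads the stream one byte at a time until the end of the stream. */
    static void drain(InputStream in) throws IOException {
        while (in.read() != -1) {
            // discard the byte; only the number of calls matters here
        }
    }

    /** Read requests reaching the source when read() is called on it directly. */
    public static long callsUnbuffered(int size) throws IOException {
        CountingInputStream source = new CountingInputStream(new ByteArrayInputStream(new byte[size]));
        drain(source);
        return source.calls;
    }

    /** Read requests reaching the source when it is wrapped in a BufferedInputStream. */
    public static long callsBuffered(int size) throws IOException {
        CountingInputStream source = new CountingInputStream(new ByteArrayInputStream(new byte[size]));
        drain(new BufferedInputStream(source));
        return source.calls;
    }

    public static void main(String[] args) throws IOException {
        int size = 100_000;
        System.out.println("unbuffered calls: " + callsUnbuffered(size));
        System.out.println("buffered calls:   " + callsBuffered(size));
    }
}
```

With a real FileInputStream as the source, each counted call would correspond to a native read, which is where the cost comes from.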

The results of an actual experiment are shown below. First, I created the file to read with the following code.

Creating the file to read



final FileOutputStream fos = new FileOutputStream(filePath);
final BufferedOutputStream bos = new BufferedOutputStream(fos);
final OutputStreamWriter osw = new OutputStreamWriter(bos, Charset.defaultCharset());
final PrintWriter pr = new PrintWriter(osw);
for (int i = 0; i < 500000; i++) {
    pr.println("Ah");
}
pr.close();

In process 1, I read the above file with FileInputStream and appended what was read to a StringBuilder.

Process 1_FileInputStream


//FileInputStream
StringBuilder sb = new StringBuilder();
final FileInputStream inputStream = new FileInputStream(filePath);

//processing
long startTime = System.currentTimeMillis();
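int line; // despite the name, this holds a single byte (0-255), or -1 at end of file
while (true) {
    line = inputStream.read();
    if (line == -1) {
        break;
    }
    sb.append(line); // append(int) appends the decimal value of the byte, not a character
}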
long endTime = System.currentTimeMillis();
System.out.println("processing time: " + (endTime - startTime));
inputStream.close();

In process 2, the FileInputStream was wrapped in a BufferedInputStream.

Process 2_BufferedInputStream


//BufferedInputStream
StringBuilder sb = new StringBuilder();
final FileInputStream fis = new FileInputStream(filePath);
BufferedInputStream inputStream = new BufferedInputStream(fis);

//processing
long startTime = System.currentTimeMillis();
int line;
while (true) {
    line = inputStream.read();
    if (line == -1) {
        break;
    }
    sb.append(line);
}
long endTime = System.currentTimeMillis();
System.out.println("processing time: " + (endTime - startTime));
inputStream.close(); // closing the BufferedInputStream also closes the wrapped FileInputStream

The results are as follows (unit: ms).

FileInputStream
First time: 3840
Second time: 3820
Third time: 3772

BufferedInputStream
First time: 109
Second time: 111
Third time: 117

The difference is obvious at a glance: BufferedInputStream is more than 30 times faster in these runs. Incidentally, when the iteration count of the for loop that writes the file was reduced to 50, the times were 2761196 ns for FileInputStream and 2198539 ns for BufferedInputStream, which are not very different.
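For anyone who wants to rerun the comparison, the following is a self-contained sketch of the same idea. It is not the exact code measured above: the class name `ReadBenchmark`, the temporary file, the UTF-8 charset, and the use of try-with-resources and System.nanoTime() are all choices made for this illustration, and the bytes are only counted rather than appended to a StringBuilder. A single timed run like this is also affected by JIT warm-up and the OS file cache, so the numbers should be read as rough indications only.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class ReadBenchmark {

    /** Writes the given number of lines to a temporary file and returns its path. */
    public static Path createTestFile(int lines) throws IOException {
        Path path = Files.createTempFile("read-benchmark", ".txt");
        try (PrintWriter writer = new PrintWriter(Files.newBufferedWriter(path, StandardCharsets.UTF_8))) {
            for (int i = 0; i < lines; i++) {
                writer.println("Ah");
            }
        }
        return path;
    }

    /** Reads the stream one byte at a time and returns the number of bytes read. */
    public static long countBytes(InputStream in) throws IOException {
        long count = 0;
        while (in.read() != -1) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        Path path = createTestFile(500_000);
        try {
            // Unbuffered: every read() goes straight to the FileInputStream.
            long start = System.nanoTime();
            try (InputStream in = new FileInputStream(path.toFile())) {
                countBytes(in);
            }
            long unbufferedNanos = System.nanoTime() - start;

            // Buffered: read() is served from the BufferedInputStream's internal buffer.
            start = System.nanoTime();
            try (InputStream in = new BufferedInputStream(new FileInputStream(path.toFile()))) {
                countBytes(in);
            }
            long bufferedNanos = System.nanoTime() - start;

            System.out.println("FileInputStream:     " + unbufferedNanos / 1_000_000 + " ms");
            System.out.println("BufferedInputStream: " + bufferedNanos / 1_000_000 + " ms");
        } finally {
            Files.deleteIfExists(path);
        }
    }
}
```

Passing a small value such as 50 to createTestFile reproduces the small-file case, where the two timings end up close together.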

Is there any difference if I use InputStreamReader?

The InputStreamReader class converts the bytes read from a file into characters. It turned out that when InputStreamReader is used, the difference between FileInputStream and BufferedInputStream almost disappears.

Process 3_FileInputStream+InputStreamReader


//FileInputStream + InputStreamReader
StringBuilder sb = new StringBuilder();
final FileInputStream inputStream = new FileInputStream(filePath);
final InputStreamReader reader = new InputStreamReader(inputStream);

//processing
long startTime = System.currentTimeMillis();
int line;
while (true) {
    line = reader.read();
    if (line == -1) {
        break;
    }
    sb.append(line);
}
long endTime = System.currentTimeMillis();
System.out.println("processing time: " + (endTime - startTime));
reader.close(); // closing the reader also closes the underlying FileInputStream

Process 4_BufferedInputStream+InputStreamReader


//BufferedInputStream + InputStreamReader
StringBuilder sb = new StringBuilder();
final FileInputStream inputStream = new FileInputStream(filePath);
final BufferedInputStream bufferedInputStream = new BufferedInputStream(inputStream);
final InputStreamReader reader = new InputStreamReader(bufferedInputStream);

//processing
long startTime = System.currentTimeMillis();
int line;
while (true) {
    line = reader.read();
    if (line == -1) {
        break;
    }
    sb.append(line);
}
long endTime = System.currentTimeMillis();
System.out.println("processing time: " + (endTime - startTime));
reader.close(); // closing the reader also closes the wrapped BufferedInputStream and FileInputStream

The results are as follows (unit: ms).

FileInputStream + InputStreamReader
First time: 114
Second time: 131
Third time: 154

BufferedInputStream + InputStreamReader
First time: 163
Second time: 167
Third time: 150

Perhaps because of the time needed to create the BufferedInputStream instance, these results alone suggest that, when InputStreamReader is used, reading without wrapping the FileInputStream is slightly faster. According to its documentation, InputStreamReader may itself read ahead more bytes from the underlying stream than a single read() needs, which would explain why an extra buffer underneath adds little.

If you want to put the contents of a file into a StringBuilder or similar as a character string, you will probably use InputStreamReader, which means there is not much difference between FileInputStream and BufferedInputStream in that case. See below for a description of InputStreamReader.

What is InputStreamReader: JavaA2Z
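One way to check why the two variants end up so close is to count how many read requests an InputStreamReader sends to its underlying stream when characters are read one at a time. The sketch below assumes an OpenJDK-style implementation that reads ahead in blocks; `CountingInputStream` and `ReaderReadAhead` are hypothetical names for this illustration, and an in-memory stream of ASCII characters replaces the file.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ReaderReadAhead {

    /** Hypothetical helper that counts read requests reaching the wrapped stream. */
    static class CountingInputStream extends FilterInputStream {
        long calls = 0;

        CountingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            calls++;
            return super.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return super.read(b, off, len);
        }
    }

    /** Reads size ASCII characters one at a time; returns {charsRead, callsOnSource}. */
    public static long[] readCharByChar(int size) throws IOException {
        byte[] data = new byte[size];
        Arrays.fill(data, (byte) 'a');
        CountingInputStream source = new CountingInputStream(new ByteArrayInputStream(data));
        long chars = 0;
        // No BufferedInputStream here: the reader sits directly on the source.
        try (Reader reader = new InputStreamReader(source, StandardCharsets.UTF_8)) {
            while (reader.read() != -1) {
                chars++;
            }
        }
        return new long[] { chars, source.calls };
    }

    public static void main(String[] args) throws IOException {
        long[] result = readCharByChar(100_000);
        System.out.println("characters read: " + result[0]);
        System.out.println("calls on source: " + result[1]);
    }
}
```

If the number of calls on the source is far smaller than the number of characters read, the reader is already doing the batching that BufferedInputStream would otherwise provide.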

So should I always use BufferedInputStream?

Obviously, BufferedInputStream is faster, so you might think it's better to use it, but there are cases where using it isn't very effective.

That is the case when a byte array is passed to the read method. InputStream's read(byte[]) reads up to the length of the array in a single call, so the size of the array controls how many bytes are read at once. The no-argument read() reads one byte per call (BufferedInputStream being the exception, because it fills its internal buffer).

So, for example, if the code of process 1 and process 2 is rewritten as follows, there is little difference in speed.

Specifying the buffer size in read() of process 1



//Process 1 FileInputStream
StringBuilder sb = new StringBuilder();
final FileInputStream inputStream = new FileInputStream(filePath);
// BufferedInputStream bufferedInputStream = new BufferedInputStream(inputStream);
long startTime = System.currentTimeMillis();
int line; // number of bytes read by each call, not the data itself
while (true) {
    line = inputStream.read(new byte[8192]); // reads up to 8192 bytes per call
    if (line == -1) {
        break;
    }
    sb.append(line);
}
long endTime = System.currentTimeMillis();
System.out.println("processing time: " + (endTime - startTime));
inputStream.close();

When both FileInputStream and BufferedInputStream were read with read(new byte[8192]), the execution time was 3 ms in both cases, so the speed hardly differed.

Regarding this, the description on the following site was helpful, so I will quote it.

It makes sense if you are likely to perform a large number of small reads (one byte or a few bytes at a time), or if you want to use some of the higher-level functionality offered by the buffered APIs, for example the BufferedReader.readLine() method.

However, if you only perform large block reads using the read(byte[]) or read(byte[], int, int) methods, wrapping the InputStream in a BufferedInputStream does not help.

java – Should I always wrap the InputStream as a BufferedInputStream
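The block-read pattern mentioned in the quotation can be sketched as follows. This is an illustration rather than the code measured in this article: the class name `BlockRead`, the reused 8192-byte buffer, and the UTF-8 decoding are assumptions for the example. The point it shows is that read(byte[]) returns the number of bytes actually stored, and only that many bytes of the buffer are valid after each call.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class BlockRead {

    /** Reads the whole stream in chunks of up to 8192 bytes and decodes it as UTF-8. */
    public static String readAll(InputStream in) throws IOException {
        byte[] buffer = new byte[8192]; // reused for every read call
        ByteArrayOutputStream collected = new ByteArrayOutputStream();
        int bytesRead;
        // read(byte[]) returns how many bytes were stored in the buffer, or -1 at end of stream.
        while ((bytesRead = in.read(buffer)) != -1) {
            // Only the first bytesRead bytes of the buffer are valid for this call.
            collected.write(buffer, 0, bytesRead);
        }
        // Decode once at the end so multi-byte characters split across chunks stay intact.
        return new String(collected.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        byte[] sample = "Ah\nAh\nAh\n".getBytes(StandardCharsets.UTF_8);
        String text = readAll(new ByteArrayInputStream(sample));
        System.out.println("characters read: " + text.length());
    }
}
```

Because each call already moves a large block, putting a BufferedInputStream underneath this loop would mostly add another copy of the data rather than save any reads.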

In addition, as mentioned above, when the input file is extremely small the speed is about the same whichever one is used, so wrapping has little effect.

Summary

Before writing this article, I thought that BufferedInputStream would always be faster than FileInputStream, but I found that the two are not so different once InputStreamReader is used on top of them.

I also learned that there are some cases where using BufferedInputStream is not very effective.

As with any class, it seems important to choose the right class for each occasion.

Other references:
Experiment the power of BufferedInputStream
FIO10-J. When reading data into an array using read(), make sure that the reading to the array was done as intended (/fio10-j.html)
