Reading and writing gzip files in Java

Introduction

Large files are sometimes exchanged as they are, compressed with gzip. This post summarizes how to read and write gzip files in Java. The code was written for Java 8, but it should also work on Java 11.

Read

Take a gzip-compressed CSV file as an example.

The key points are:

  1. Use the try-with-resources syntax
  2. Wrap the InputStream with GZIPInputStream
  3. Wrap the GZIPInputStream with InputStreamReader
  4. The encoding can be specified on the InputStreamReader
  5. Wrap the InputStreamReader with a BufferedReader

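// Needs: java.io.*, java.nio.charset.StandardCharsets, java.nio.file.*, java.util.zip.GZIPInputStream
Path path = Paths.get("read_test.csv.gz");
try (
  InputStream is = Files.newInputStream(path);                                     // compressed bytes from the file
  GZIPInputStream gis = new GZIPInputStream(is);                                   // decompresses on the fly
  InputStreamReader isReader = new InputStreamReader(gis, StandardCharsets.UTF_8); // decodes bytes to characters
  BufferedReader br = new BufferedReader(isReader)                                 // buffers and reads line by line
) {
  br.lines().forEach(System.out::println);
}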
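As a complete, compilable version of the five points above, here is a minimal sketch. The class name `GzipLineReader` and the method `readLines` are hypothetical names chosen for illustration, and the `main` method first creates a small temporary gzip file so that the example does not depend on an existing `read_test.csv.gz`. The charset is a parameter to show point 4.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class; the names are illustrative only.
final class GzipLineReader {

    // Reads all lines of a gzip-compressed text file, decoding with the given charset.
    static List<String> readLines(Path path, Charset charset) throws IOException {
        try (InputStream is = Files.newInputStream(path);
             GZIPInputStream gis = new GZIPInputStream(is);
             InputStreamReader isReader = new InputStreamReader(gis, charset);
             BufferedReader br = new BufferedReader(isReader)) {
            return br.lines().collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Create a tiny gzip file encoded in ISO-8859-1 to read back.
        Path path = Files.createTempFile("read_test", ".csv.gz");
        try (OutputStream os = Files.newOutputStream(path);
             GZIPOutputStream gos = new GZIPOutputStream(os)) {
            gos.write("id,name\n1,\u00e9\n".getBytes(StandardCharsets.ISO_8859_1));
        }
        // The reader must use the same charset the file was written with.
        readLines(path, StandardCharsets.ISO_8859_1).forEach(System.out::println);
        Files.delete(path);
    }
}
```

Collecting into a list is only suitable for small files. For large files, process the lines inside the try block as in the snippet above.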

Wrapping with a BufferedReader is for performance. In practice, it is common to use a CSV parsing library. Here is a sample using univocity-parsers.

This example processes the records through an iterator, assuming the file is large. TestDTO is the bean class that each row is mapped to (not shown here).

Path path = Paths.get("read_test.csv.gz");
try(
  InputStream is = Files.newInputStream(path);
  GZIPInputStream gis = new GZIPInputStream(is);
  InputStreamReader isReader = new InputStreamReader(gis, StandardCharsets.UTF_8);
  BufferedReader br = new BufferedReader(isReader); 
) {
  CsvParserSettings parserSettings = new CsvParserSettings();
  CsvRoutines routines = new CsvRoutines(parserSettings);
  Iterator<TestDTO> iterator = routines.iterate(TestDTO.class, br).iterator();
  iterator.forEachRemaining(x -> System.out.println(x.toString()));
}

Write

The key points are almost the same as for reading:

  1. Use the try-with-resources syntax
  2. Wrap the OutputStream with GZIPOutputStream
  3. Wrap the GZIPOutputStream with OutputStreamWriter
  4. The encoding can be specified on the OutputStreamWriter
  5. Wrap the OutputStreamWriter with a BufferedWriter

Path path = Paths.get("write_test.csv.gz");
try (
  OutputStream os = Files.newOutputStream(path, StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING, StandardOpenOption.WRITE);
  GZIPOutputStream gzip = new GZIPOutputStream(os);
  OutputStreamWriter ow = new OutputStreamWriter(gzip, StandardCharsets.UTF_8);
  BufferedWriter bw = new BufferedWriter(ow)
) {
  List<String> rows = ...;
  // bw.write throws the checked IOException, so a plain loop is used instead of a lambda
  for (String row : rows) {
    bw.write(row);
    bw.newLine(); // assumes the rows do not already end with a line terminator
  }
}
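The reason point 1 matters for writing is that the gzip trailer is only written when the GZIPOutputStream is finished or closed, so the data is not a complete gzip stream until the try block exits. Below is a minimal in-memory round trip that sketches this. The class name `GzipRoundTrip` and its methods are hypothetical, and a ByteArrayOutputStream stands in for the file.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class; the names are illustrative only.
final class GzipRoundTrip {

    // Compresses the rows into gzip bytes, one row per line, encoded as UTF-8.
    static byte[] compress(List<String> rows) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(bos);
             OutputStreamWriter ow = new OutputStreamWriter(gzip, StandardCharsets.UTF_8);
             BufferedWriter bw = new BufferedWriter(ow)) {
            for (String row : rows) {
                bw.write(row);
                bw.newLine();
            }
        } // closing flushes the buffers and writes the gzip trailer
        return bos.toByteArray(); // read the bytes only after the streams are closed
    }

    // Decompresses gzip bytes back into lines.
    static List<String> decompress(byte[] data) throws IOException {
        try (GZIPInputStream gis = new GZIPInputStream(new ByteArrayInputStream(data));
             BufferedReader br = new BufferedReader(new InputStreamReader(gis, StandardCharsets.UTF_8))) {
            return br.lines().collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        List<String> rows = Arrays.asList("id,name", "1,alice", "2,bob");
        byte[] data = compress(rows);
        System.out.println(decompress(data).equals(rows));
    }
}
```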

Here is an example of writing with the same CSV library (univocity-parsers).

Path path = Paths.get("write_test.csv.gz");
try (
  OutputStream os = Files.newOutputStream(path, StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING, StandardOpenOption.WRITE);
  GZIPOutputStream gzip = new GZIPOutputStream(os);
  OutputStreamWriter ow = new OutputStreamWriter(gzip, StandardCharsets.UTF_8);
  BufferedWriter bw = new BufferedWriter(ow)
) {
  List<TestDTO> rows = ...;
  CsvWriterSettings writerSettings = new CsvWriterSettings();
  // processRecord needs a row writer processor to convert each bean into a row;
  // BeanWriterProcessor assumes TestDTO carries the univocity-parsers annotations
  writerSettings.setRowWriterProcessor(new BeanWriterProcessor<>(TestDTO.class));
  CsvWriter writer = new CsvWriter(bw, writerSettings);
  rows.forEach(row -> writer.processRecord(row));
}

Impressions

It feels a bit odd to wrap one object this many times. Still, if you pass the resulting Reader or Writer to a CSV library, the library will usually do the rest for you. In that respect, the repeated wrapping is worth it.
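One way to keep the wrapping in a single place is a pair of small factory methods that return the outermost Reader or Writer. This is only a sketch: the class name `GzipIo` and the methods `newReader` and `newWriter` are hypothetical, not part of any library. Closing the returned object closes the whole chain, so it can be used directly in try-with-resources.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class; the names are illustrative only.
final class GzipIo {

    // Opens a gzip-compressed text file for reading.
    static BufferedReader newReader(Path path, Charset charset) throws IOException {
        InputStream is = Files.newInputStream(path);
        try {
            return new BufferedReader(new InputStreamReader(new GZIPInputStream(is), charset));
        } catch (IOException e) {
            is.close(); // do not leak the file handle if the gzip header cannot be read
            throw e;
        }
    }

    // Opens a gzip-compressed text file for writing.
    // With no options, Files.newOutputStream behaves as CREATE, TRUNCATE_EXISTING, WRITE.
    static BufferedWriter newWriter(Path path, Charset charset) throws IOException {
        OutputStream os = Files.newOutputStream(path);
        try {
            return new BufferedWriter(new OutputStreamWriter(new GZIPOutputStream(os), charset));
        } catch (IOException e) {
            os.close(); // do not leak the file handle if the gzip header cannot be written
            throw e;
        }
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("write_test", ".csv.gz");
        try (BufferedWriter bw = newWriter(path, StandardCharsets.UTF_8)) {
            bw.write("id,name");
            bw.newLine();
            bw.write("1,alice");
            bw.newLine();
        }
        try (BufferedReader br = newReader(path, StandardCharsets.UTF_8)) {
            br.lines().forEach(System.out::println);
        }
        Files.delete(path);
    }
}
```

The returned BufferedReader or BufferedWriter can then be handed to a CSV library in place of the four-line resource list.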
