Large files are sometimes exchanged as they are, compressed with gzip. This post summarizes how to read and write gzip files in Java. The code was written for Java 8, but it should also work on Java 11.
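The examples use a gzip-compressed CSV file.
For reading, the point is to wrap the file's InputStream in a GZIPInputStream, then in an InputStreamReader with an explicit charset, and finally in a BufferedReader: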
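// Uses java.io.*, java.nio.file.*, java.nio.charset.StandardCharsets, java.util.zip.GZIPInputStream
Path path = Paths.get("read_test.csv.gz");
try (
    InputStream is = Files.newInputStream(path);
    GZIPInputStream gis = new GZIPInputStream(is);                                    // decompress
    InputStreamReader isReader = new InputStreamReader(gis, StandardCharsets.UTF_8);  // bytes -> chars
    BufferedReader br = new BufferedReader(isReader)                                  // buffered line reads
) {
    // Print each decompressed line
    br.lines().forEach(System.out::println);
}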
The BufferedReader wrapper is there for performance. In practice, you would usually hand the reader to a CSV parsing library. Here is a sample using univocity-parsers.
It processes the records through an iterator, on the assumption that the file is too large to load into memory at once.
// CsvParserSettings and CsvRoutines come from com.univocity.parsers.csv
Path path = Paths.get("read_test.csv.gz");
try (
    InputStream is = Files.newInputStream(path);
    GZIPInputStream gis = new GZIPInputStream(is);
    InputStreamReader isReader = new InputStreamReader(gis, StandardCharsets.UTF_8);
    BufferedReader br = new BufferedReader(isReader)
) {
    CsvParserSettings parserSettings = new CsvParserSettings();
    CsvRoutines routines = new CsvRoutines(parserSettings);
    // Records are mapped to TestDTO one at a time instead of being loaded all at once
    Iterator<TestDTO> iterator = routines.iterate(TestDTO.class, br).iterator();
    iterator.forEachRemaining(x -> System.out.println(x.toString()));
}
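The same one-record-at-a-time idea also works with the JDK alone. The following is a minimal sketch, not part of the original samples: the class `GzipLineCounter` and the method `countNonBlankLines` are hypothetical names, and the file is assumed to be UTF-8 text. It streams lines out of the gzip file and counts the non-blank ones without holding the whole file in memory.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.GZIPInputStream;

public class GzipLineCounter {

    // Hypothetical helper: counts the non-blank lines of a gzip-compressed UTF-8 text file.
    public static long countNonBlankLines(Path path) throws IOException {
        try (InputStream is = Files.newInputStream(path);
             GZIPInputStream gis = new GZIPInputStream(is);
             InputStreamReader isReader = new InputStreamReader(gis, StandardCharsets.UTF_8);
             BufferedReader br = new BufferedReader(isReader)) {
            // lines() is lazy, so only the current line is held at any moment
            return br.lines()
                     .filter(line -> !line.trim().isEmpty())
                     .count();
        }
    }
}
```

A call such as `GzipLineCounter.countNonBlankLines(Paths.get("read_test.csv.gz"))` would be one way to use it.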
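Writing follows almost the same pattern as reading: wrap the file's OutputStream in a GZIPOutputStream, then in an OutputStreamWriter, and finally in a BufferedWriter.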
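// Uses java.io.*, java.nio.file.*, java.nio.charset.StandardCharsets, java.util.zip.GZIPOutputStream
Path path = Paths.get("write_test.csv.gz");
try (
    OutputStream os = Files.newOutputStream(path, StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING, StandardOpenOption.WRITE);
    GZIPOutputStream gzip = new GZIPOutputStream(os);
    OutputStreamWriter ow = new OutputStreamWriter(gzip, StandardCharsets.UTF_8);
    BufferedWriter bw = new BufferedWriter(ow)
) {
    List<String> rows = ...;
    // A plain loop is used because bw.write throws IOException, which a forEach lambda cannot propagate
    for (String row : rows) {
        bw.write(row);
        bw.newLine(); // terminate each row; write() alone adds no line separator
    }
}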
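For reference, the writing pattern above can be packaged as a self-contained class. This is only a sketch under some assumptions: `GzipLineWriter` and `writeLines` are hypothetical names, rows are terminated with `\n` rather than the platform line separator, and `Files.newOutputStream` is called without options, which by default creates the file or truncates an existing one.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.zip.GZIPOutputStream;

public class GzipLineWriter {

    // Hypothetical helper: writes each row as one line of a gzip-compressed UTF-8 file.
    public static void writeLines(Path path, List<String> rows) throws IOException {
        try (OutputStream os = Files.newOutputStream(path); // default options: CREATE, TRUNCATE_EXISTING, WRITE
             GZIPOutputStream gzip = new GZIPOutputStream(os);
             OutputStreamWriter ow = new OutputStreamWriter(gzip, StandardCharsets.UTF_8);
             BufferedWriter bw = new BufferedWriter(ow)) {
            for (String row : rows) {
                bw.write(row);
                bw.write('\n'); // fixed separator so the output does not depend on the platform
            }
        } // closing bw flushes the buffer and lets GZIPOutputStream write the gzip trailer
    }
}
```

The resources are closed in reverse order of declaration, so the gzip trailer is written before the underlying file stream is closed. Skipping the close, for example by not using try-with-resources, tends to leave a truncated file that cannot be decompressed.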
Here is an example of writing with the CSV library.
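// CsvWriterSettings and CsvWriter come from com.univocity.parsers.csv,
// BeanWriterProcessor from com.univocity.parsers.common.processor
Path path = Paths.get("write_test.csv.gz");
try (
    OutputStream os = Files.newOutputStream(path, StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING, StandardOpenOption.WRITE);
    GZIPOutputStream gzip = new GZIPOutputStream(os);
    OutputStreamWriter ow = new OutputStreamWriter(gzip, StandardCharsets.UTF_8);
    BufferedWriter bw = new BufferedWriter(ow)
) {
    List<TestDTO> rows = ...;
    CsvWriterSettings writerSettings = new CsvWriterSettings();
    // processRecord needs a row writer processor; this assumes TestDTO carries univocity field annotations
    writerSettings.setRowWriterProcessor(new BeanWriterProcessor<>(TestDTO.class));
    CsvWriter writer = new CsvWriter(bw, writerSettings);
    // The lambda parameter must not reuse the name of the enclosing variable `rows`
    rows.forEach(row -> writer.processRecord(row));
    writer.flush(); // push pending output to bw before try-with-resources closes the streams
}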
Wrapping an object this many times can feel excessive. However, CSV libraries generally accept a Reader or Writer, so once you have built one, the library usually takes care of the rest. In that respect, the repeated wrapping is worth it.
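One way to keep the repeated wrapping out of sight is to hide it behind small factory methods. The sketch below is an illustration rather than anything from the samples above: `GzipIo`, `newReader`, and `newWriter` are hypothetical names, and UTF-8 is assumed. The returned BufferedReader or BufferedWriter can then be passed to a CSV library or used directly.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public final class GzipIo {

    private GzipIo() {
    }

    // Hypothetical helper: opens a gzip-compressed UTF-8 file for buffered reading.
    public static BufferedReader newReader(Path path) throws IOException {
        InputStream is = Files.newInputStream(path);
        try {
            return new BufferedReader(
                    new InputStreamReader(new GZIPInputStream(is), StandardCharsets.UTF_8));
        } catch (IOException e) {
            is.close(); // e.g. the file is not in gzip format; do not leak the file handle
            throw e;
        }
    }

    // Hypothetical helper: opens (creates or truncates) a gzip-compressed UTF-8 file for buffered writing.
    public static BufferedWriter newWriter(Path path) throws IOException {
        OutputStream os = Files.newOutputStream(path);
        try {
            return new BufferedWriter(
                    new OutputStreamWriter(new GZIPOutputStream(os), StandardCharsets.UTF_8));
        } catch (IOException e) {
            os.close();
            throw e;
        }
    }
}
```

With this in place, a caller only writes `try (BufferedReader br = GzipIo.newReader(path)) { ... }`. Closing the outermost reader or writer closes the whole chain of wrapped streams, so a single resource in the try-with-resources statement is enough.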