Hello, this is Misuda, an engineer. This time I'll describe how I tried writing and reading UTF-8 with BOM for CSV import!
When developing a business system, we sometimes get a request like "I want to create data in bulk!" When that happens, CSV import is the first thing I consider. Building an API would also impose development costs on the other party.
**The problem here is how to edit the CSV.** Do you use *Microsoft Excel*? After all, it makes editing easy!
If you want the file to be editable in *Microsoft Excel*, you can create the CSV in Shift-JIS. But if the DB is UTF-8, the server then has to convert the character encoding. At that point it becomes a battle with character encodings. To be honest, I don't expect to win it.
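To give a feel for the server-side conversion mentioned above, here is a minimal sketch. It assumes the upload is encoded as Windows-31J (MS932), the Shift-JIS variant commonly produced on Japanese Windows; the class and method names are hypothetical and exist only for this illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical class name, used only for this sketch.
class SjisToUtf8Sketch {

    // Assumption: the uploaded CSV is Windows-31J (MS932), not strict Shift_JIS.
    static final Charset SJIS = Charset.forName("windows-31j");

    /** Decodes Shift-JIS bytes and re-encodes the text as UTF-8. */
    static byte[] toUtf8(byte[] sjisBytes) {
        // Malformed input is silently replaced with U+FFFD by this constructor.
        String text = new String(sjisBytes, SJIS);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Japanese word for "apple" followed by ",Apple", written with Unicode escapes
        String original = "\u308a\u3093\u3054,Apple";
        byte[] sjis = original.getBytes(SJIS);
        byte[] utf8 = toUtf8(sjis);
        // Check that the text survives the round trip
        System.out.println(new String(utf8, StandardCharsets.UTF_8).equals(original));
    }
}
```

The conversion itself is short; the hard part is that the server has to guess which encoding each uploaded file actually uses, and a wrong guess produces garbled text without raising an error.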
In such a case, UTF-8 with a BOM (byte order mark) apparently opens in *Microsoft Excel* without garbled characters!
This time, I generate the file with Java. For UTF-8 with a BOM, the file starts with the bytes [0xEF 0xBB 0xBF].
import java.io.*;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

public class Main {
    /**
     * Create a CSV file with a BOM (character encoding: UTF-8).
     *
     * @param args not used
     */
    public static void main(String[] args) {
        File file = new File("File path");
        List<String> header = Arrays.asList("Apple", "Mandarin orange", "banana", "Strawberry", "melon", "Grape");
        try (FileOutputStream fos = new FileOutputStream(file);
             OutputStreamWriter osw = new OutputStreamWriter(fos, StandardCharsets.UTF_8);
             PrintWriter writer = new PrintWriter(osw)) {
            // Write the BOM bytes directly to the stream before any text goes through the writer
            fos.write(0xef);
            fos.write(0xbb);
            fos.write(0xbf);
            // Join the values with commas so the row has no trailing comma, then end the line
            writer.println(String.join(",", header));
        } catch (IOException e) {
            System.out.println("Failed to generate the file.");
        }
    }
}
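As a variation on the code above, the BOM can also be written through the writer itself as the character U+FEFF, which the UTF-8 encoder turns into the same three bytes. The following is a minimal sketch, not a full CSV writer: the class and method names are hypothetical, it writes to memory instead of a file, and it assumes the values contain no commas, quotes, or line breaks.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical class name, used only for this sketch.
class BomCsvWriterSketch {

    /** Encodes one CSV row as UTF-8 bytes, prefixed with a BOM. */
    static byte[] toCsvBytesWithBom(List<String> row) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
            // U+FEFF is encoded in UTF-8 as the bytes EF BB BF
            writer.write('\uFEFF');
            // Assumption: values need no quoting or escaping
            writer.write(String.join(",", row));
            writer.write("\r\n");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] bytes = toCsvBytesWithBom(Arrays.asList("Apple", "banana"));
        // Print the first three bytes in hex to confirm the BOM
        System.out.printf("%02x %02x %02x%n", bytes[0] & 0xff, bytes[1] & 0xff, bytes[2] & 0xff);
    }
}
```

Writing the BOM through the writer keeps all output on one path, so there is no need to reason about the order of raw stream writes and buffered writer output.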
It would be fine if the file we receive were always UTF-8 with a BOM, but sometimes it isn't. So check for the BOM when reading.
import java.io.*;
import java.nio.charset.StandardCharsets;
import org.apache.commons.codec.binary.Hex;

public class Main {
    /**
     * Read a CSV file that may start with a BOM (character encoding: UTF-8).
     *
     * @param args not used
     */
    public static void main(String[] args) {
        File file = new File("File path");
        try (FileInputStream fs = new FileInputStream(file);
             InputStreamReader isr = new InputStreamReader(fs, StandardCharsets.UTF_8);
             LineNumberReader lnr = new LineNumberReader(isr)) {
            // The first line
            String row = lnr.readLine();
            if (row != null && !row.isEmpty()) {
                // Get the first character
                String bom = row.substring(0, 1);
                // Encode the first character as UTF-8 and convert the bytes to a hex string
                // (uses the Apache Commons Codec Hex class). The charset is passed explicitly
                // because getBytes() without arguments uses the platform default encoding.
                String bomByte = new String(Hex.encodeHex(bom.getBytes(StandardCharsets.UTF_8)));
                if ("efbbbf".equals(bomByte)) {
                    // Remove the BOM
                    row = row.substring(1);
                }
                System.out.println(row);
            }
            // Process the second and subsequent lines here
        } catch (IOException e) {
            System.out.println("Failed to read the file.");
        }
    }
}
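The same check can be done without Apache Commons Codec: when the file is decoded as UTF-8, the BOM bytes arrive as the single character U+FEFF, so it is enough to compare the start of the first line with that character. Below is a minimal sketch that accepts input both with and without a BOM. The class and method names are hypothetical, and it reads from a byte array instead of a file to stay self-contained.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical class name, used only for this sketch.
class BomCsvReaderSketch {

    /** Removes a leading U+FEFF if present; otherwise returns the line unchanged. */
    static String stripBom(String line) {
        if (line != null && line.startsWith("\uFEFF")) {
            return line.substring(1);
        }
        return line;
    }

    /** Decodes the bytes as UTF-8 and returns the first line without a BOM (null for empty input). */
    static String readFirstLine(byte[] csvBytes) {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(csvBytes), StandardCharsets.UTF_8))) {
            return stripBom(reader.readLine());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] withBom = {(byte) 0xef, (byte) 0xbb, (byte) 0xbf, 'A', ',', 'B'};
        byte[] withoutBom = {'A', ',', 'B'};
        System.out.println(readFirstLine(withBom));    // prints "A,B"
        System.out.println(readFirstLine(withoutBom)); // prints "A,B"
    }
}
```

Comparing against U+FEFF does not depend on the platform default encoding, which matters when the same code runs on both Windows and macOS or Linux servers.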
On both macOS and Windows, the file opened in *Microsoft Excel* without garbled characters and could be edited! Beyond that, I suppose some people will edit the file in a text editor. I guess there is no choice but to support that too.