A Java program I wrote failed with the following error.
Caused by: java.nio.charset.MalformedInputException: Input length = 1
at java.base/java.nio.charset.CoderResult.throwException(CoderResult.java:274)
at java.base/sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:306)
at java.base/sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:281)
at java.base/sun.nio.cs.StreamEncoder.write(StreamEncoder.java:125)
at java.base/java.io.OutputStreamWriter.write(OutputStreamWriter.java:211)
at java.base/java.io.BufferedWriter.flushBuffer(BufferedWriter.java:120)
at java.base/java.io.BufferedWriter.flush(BufferedWriter.java:256)
(The application-code frames of the stack trace are omitted from here on.)
It is clearly a character-encoding problem, but looking at the relevant part of the application code below, I could not see what would cause it.
Path file = ...; // java.nio.file.Path: Files.newBufferedWriter takes a Path, not a java.io.File
try (BufferedWriter writer = java.nio.file.Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
    String s = ...;
    writer.write(s);
    writer.flush(); // The exception is thrown here, so the argument of the preceding write() is suspect.
}
In a case like this, the usual first step is to dump the string s and inspect its contents.
System.err.println("[" + s + "]");
The output was as follows.
[?]
What is this "?"? Is the text garbled? And why does System.err.println not throw an error in the first place?
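A "?" in println output is ambiguous: it may be a real question mark or a substitute for something the stream could not encode. One way to remove the ambiguity is to print the UTF-16 code units as hex instead of the text itself. The following is a minimal sketch; the class name CodeUnitDump and the method dumpCodeUnits are made up for illustration and are not part of the original program.

```java
public class CodeUnitDump {
    /** Formats every UTF-16 code unit of s as four hex digits, separated by spaces. */
    public static String dumpCodeUnits(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // A genuine question mark shows up as 003F.
        System.err.println(dumpCodeUnits("a?")); // prints "0061 003F"
    }
}
```

If the dump shows anything other than 003F at the position where println printed "?", the "?" was a substitution made during output rather than the actual content of s.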
I lost some time here because I did not immediately think of surrogate pairs.
When I checked each char of s with the following code, it printed "High." with no matching "Low.", so s contained an unpaired high surrogate. This is the cause of the original error.
for (char c : s.toCharArray()) {
    if (Character.isHighSurrogate(c)) {
        System.err.println("High.");
    }
    if (Character.isLowSurrogate(c)) {
        System.err.println("Low.");
    }
}
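This also suggests why the same string fails in one place and prints "?" in another. An encoder can either report malformed input or replace it. The sketch below contrasts the two, assuming a string that holds only a lone high surrogate (U+D83D, chosen arbitrarily). The class name EncoderPolicyDemo and its methods are hypothetical. String.getBytes(Charset) is documented to always replace malformed input, and the OutputStreamWriter Javadoc says the same about malformed surrogate elements. I am assuming System.err behaves the same way, which would match the "[?]" output above.

```java
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class EncoderPolicyDemo {
    /** Returns true if a UTF-8 encoder that reports errors accepts s. */
    public static boolean encodesStrictly(String s) {
        CharsetEncoder strict = StandardCharsets.UTF_8.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            strict.encode(CharBuffer.wrap(s));
            return true;
        } catch (CharacterCodingException e) {
            // MalformedInputException is a subclass of CharacterCodingException.
            return false;
        }
    }

    /** Encodes with String.getBytes, which replaces malformed input, then decodes again. */
    public static String encodeLeniently(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String lone = "\uD83D";        // high surrogate without its low half (assumed input)
        String pair = "\uD83D\uDE00";  // a complete surrogate pair
        System.out.println(encodesStrictly(lone));  // false
        System.out.println(encodeLeniently(lone));  // ?
        System.out.println(encodesStrictly(pair));  // true
    }
}
```

The strict encoder corresponds to what the stack trace shows for the writer returned by Files.newBufferedWriter, and the lenient path corresponds to the println dump.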
As of 2020, is it considered common practice in Java to write code that properly handles surrogate pairs?
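One modest form of such handling is to validate a string before passing it to a writer that reports encoding errors. The following sketch looks for the first surrogate that is not part of a valid high/low pair. Unlike the per-char loop above, it stays silent for well-formed pairs. The class name SurrogateCheck and the method firstUnpairedSurrogate are hypothetical names for illustration.

```java
public class SurrogateCheck {
    /** Returns the index of the first unpaired surrogate in s, or -1 if there is none. */
    public static int firstUnpairedSurrogate(String s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (Character.isHighSurrogate(c)) {
                if (i + 1 < s.length() && Character.isLowSurrogate(s.charAt(i + 1))) {
                    i++; // valid pair: skip the low half
                } else {
                    return i; // high surrogate with no low half after it
                }
            } else if (Character.isLowSurrogate(c)) {
                return i; // low surrogate with no high half before it
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String pair = "\uD83D\uDE00";
        System.out.println(firstUnpairedSurrogate(pair));                 // -1
        // Cutting a string by char index can split a pair in half.
        System.out.println(firstUnpairedSurrogate(pair.substring(0, 1))); // 0
    }
}
```

The substring call in main is only one possible way to end up with such a string. The original program does not show how s was built, so this is not a claim about its actual cause.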