[Java] Align characters even with mixed half-width and full-width characters

Introduction

When you use String.format in Java on text that mixes half-width and full-width characters, the columns can end up misaligned. I searched quite a bit and found no simple fix, so I wrote a small function myself.

Solution

Let's get straight to the solution. (For now, copying and pasting the format() and getByteLength() methods below is enough to solve the problem.)

main


    import java.nio.charset.Charset;

    public class Main {
        public static void main(String[] args) {
            // Plain String.format: the "|" column shifts
            System.out.println(String.format("%-20s", "あああああ") + " | ");
            System.out.println(String.format("%-20s", "aaaaa") + " | ");
            // Width-adjusted format: the "|" column stays aligned
            System.out.println(format("あああああ", 20) + " | ");
            System.out.println(format("aaaaa", 20) + " | ");
        }

        // Left-aligns target in a field that is `length` half-width columns wide.
        private static String format(String target, int length) {
            // A full-width character is 1 char but 3 bytes in UTF-8,
            // so (bytes - chars) / 2 counts the full-width characters.
            int byteDiff = (getByteLength(target, Charset.forName("UTF-8")) - target.length()) / 2;
            return String.format("%-" + (length - byteDiff) + "s", target);
        }

        private static int getByteLength(String string, Charset charset) {
            return string.getBytes(charset).length;
        }
    }

Result


あああああ                | 
aaaaa                | 
あああああ           | 
aaaaa                | 

(Let's keep it between us that the | drifted out of alignment when I pasted this into Qiita...) What follows is a digression.

Cause of the misalignment and the fix

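To begin with, String.format pads by the number of characters, but a full-width character occupies two columns on screen. Half-width and full-width characters can be told apart because they take a different number of bytes when encoded in `UTF-8`.

That is what `int byteDiff` in the `main` listing adjusts for: each full-width character adds 2 bytes beyond its 1 char, so dividing the excess by 2 gives the number of full-width characters, and the padding width is reduced by that amount. The `analyze` listing below also shows that a Japanese character takes 3 bytes and an English letter takes 1 byte.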
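To make the division by 2 concrete, here is a minimal sketch of the arithmetic. It assumes, as the approach in this post does, that every character is either a 1-byte half-width character or a 3-byte full-width character in UTF-8. The class name and `displayWidth` are hypothetical helpers that are not part of the original code, and the hiragana are written as `\u3042` escapes so the file compiles regardless of source encoding.

```java
import java.nio.charset.StandardCharsets;

class WidthArithmetic {
    // Counts full-width characters, ASSUMING every character is either
    // 1 byte (half-width) or 3 bytes (full-width) in UTF-8.
    static int byteDiff(String s) {
        return (s.getBytes(StandardCharsets.UTF_8).length - s.length()) / 2;
    }

    // Hypothetical helper: estimated number of columns the string occupies,
    // counting each full-width character as two columns.
    static int displayWidth(String s) {
        return s.length() + byteDiff(s);
    }

    public static void main(String[] args) {
        String fullWidth = "\u3042\u3042\u3042\u3042\u3042"; // five hiragana "a" (U+3042)
        String mixed = "a\u3042b";                           // one full-width char between two ASCII letters

        System.out.println(byteDiff(fullWidth));     // (15 - 5) / 2 = 5
        System.out.println(displayWidth(fullWidth)); // 5 + 5 = 10
        System.out.println(displayWidth("aaaaa"));   // 5 + 0 = 5
        System.out.println(displayWidth(mixed));     // 3 + 1 = 4
    }
}
```

With a target of 20 columns, the five full-width characters already fill 10 of them, so format() asks String.format for a field of 20 - 5 = 15 chars: 5 characters plus 10 spaces.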

analyze


    // Add to the same class as format() and getByteLength(), in place of the earlier main.
    public static void main(String[] args) {
        analyze("あああああ");
        analyze("aaaaa");
    }

    // Prints the char count, the UTF-8 byte count, and the resulting width adjustment.
    private static void analyze(String target) {
        System.out.println("\n" + target);
        int length = 20;
        int utf8Length = getByteLength(target, Charset.forName("UTF-8"));
        System.out.println(format("length()", length) + ":" + target.length());
        System.out.println(format("UTF-8 length()", length) + ":" + utf8Length);
        System.out.println(format("diff", length) + ":" + (utf8Length - target.length()) / 2);
    }

Result


あああああ
length()            :5
UTF-8 length()      :15
diff                :5

aaaaa
length()            :5
UTF-8 length()      :5
diff                :0
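One caveat follows from these byte counts: the adjustment treats every 3-byte character as full-width. Half-width katakana such as U+FF71 take 3 bytes in UTF-8 but are only one column wide, so they get over-corrected, and 2-byte characters such as accented Latin letters skew the division. As a possible alternative, the sketch below pads by an estimated column width per code point instead of by byte count. The `isWide` ranges are a rough, hand-picked approximation of the East Asian wide and full-width ranges, not a complete table, and `padRight` and the class name are hypothetical.

```java
class ColumnPad {
    // ASSUMPTION: rough approximation of "occupies two columns".
    // These ranges are illustrative and do not cover every wide character.
    static boolean isWide(int cp) {
        return (cp >= 0x1100 && cp <= 0x115F)   // Hangul Jamo initial consonants
            || (cp >= 0x2E80 && cp <= 0xA4CF)   // CJK radicals, kana, CJK ideographs, Yi
            || (cp >= 0xAC00 && cp <= 0xD7A3)   // Hangul syllables
            || (cp >= 0xF900 && cp <= 0xFAFF)   // CJK compatibility ideographs
            || (cp >= 0xFE30 && cp <= 0xFE4F)   // CJK compatibility forms
            || (cp >= 0xFF01 && cp <= 0xFF60)   // full-width ASCII variants
            || (cp >= 0xFFE0 && cp <= 0xFFE6);  // full-width currency and signs
    }

    // Hypothetical helper: left-aligns s by appending spaces until the
    // estimated column width reaches `columns`. Longer strings are returned as-is.
    static String padRight(String s, int columns) {
        int width = s.codePoints().map(cp -> isWide(cp) ? 2 : 1).sum();
        StringBuilder sb = new StringBuilder(s);
        for (int i = width; i < columns; i++) {
            sb.append(' ');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(padRight("\u3042\u3042\u3042\u3042\u3042", 20) + " | "); // five full-width hiragana
        System.out.println(padRight("\uFF71\uFF72\uFF73", 20) + " | ");             // three half-width katakana
        System.out.println(padRight("aaaaa", 20) + " | ");
    }
}
```

The trade-off is that the range table has to be maintained by hand, whereas the byte-count version needs no table and is enough when the input contains only ASCII and ordinary full-width Japanese.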
