Overview

My knowledge of Java has more or less stopped at "1.4". To change that, I am going to bring it up to date little by little.

The app I mainly work on at my job currently runs on "5", but it was originally written before "1.4", and old-style code remains in it. With a version upgrade of this app in mind, this time I will describe only what I looked into about the features added in "6".

What I looked up

Java 6 seems to have consisted mostly of desktop-related enhancements. I looked into two features that I thought I might actually use in the app I work on.

Unicode normalization

Unicode normalization is now possible through `java.text.Normalizer`. It gets difficult right away ...

Reference: Unicode Normalization

There are four types of Unicode normalization:

| Form | Overview |
|------|----------|
| NFC | Decompose by canonical equivalence, then recompose by canonical equivalence |
| NFD | Decompose by canonical equivalence |
| NFKC | Decompose by compatibility equivalence, then recompose by canonical equivalence |
| NFKD | Decompose by compatibility equivalence |

- What is composition? Converting a combining character sequence into a precomposed character.
- What is decomposition? Converting a precomposed character into a combining character sequence.
- What is a combining character sequence? A base character followed by one or more combining characters, which together are displayed as a single character.
- What is a precomposed character? A character such as "ga", "gi", or "gu" that expresses, as a single character, the combination of "ka", "ki", or "ku" with a dakuten (゛). Reference: [Precomposed characters](https://ja.wikipedia.org/wiki/%E5%90%88%E6%88%90%E6%B8%88%E3%81%BF%E6%96%87%E5%AD%97)
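To make composition and decomposition concrete, here is a minimal sketch that decomposes the precomposed hiragana "ga" (U+304C) with NFD and recomposes it with NFC. The class name `ComposeDecomposeSketch` and the `hex` helper are made up for this illustration; only `Normalizer.normalize` comes from the JDK.

```java
import java.text.Normalizer;

class ComposeDecomposeSketch {
    // Formats each UTF-16 code unit (char) as lowercase hex, separated by spaces.
    static String hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(Integer.toHexString(s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String precomposed = "\u304C"; // "ga" stored as one precomposed character

        // Decomposition: one character becomes base "ka" (U+304B) + combining dakuten (U+3099).
        String decomposed = Normalizer.normalize(precomposed, Normalizer.Form.NFD);

        // Composition: the combining character sequence becomes one character again.
        String recomposed = Normalizer.normalize(decomposed, Normalizer.Form.NFC);

        System.out.println(hex(precomposed) + " -> NFD -> " + hex(decomposed)); // 304c -> NFD -> 304b 3099
        System.out.println(hex(decomposed) + " -> NFC -> " + hex(recomposed));  // 304b 3099 -> NFC -> 304c
    }
}
```

Both strings look identical on screen, which is why comparing text from different sources usually means normalizing both sides to the same form first.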

Unicode normalization sample

NFC

System.out.println("1:C-> " + Normalizer.normalize("C", Normalizer.Form.NFC));
System.out.println("2:Ba-> " + Normalizer.normalize("Ba", Normalizer.Form.NFC));
System.out.println("3:Ba-> " + Normalizer.normalize("Ba", Normalizer.Form.NFC));
System.out.println("4:If-> " + Normalizer.normalize("If", Normalizer.Form.NFC));
System.out.println("5:Pacific League-> " + Normalizer.normalize("Pacific League", Normalizer.Form.NFC));
System.out.println("6:Pacific League-> " + Normalizer.normalize("Pacific League", Normalizer.Form.NFC));
System.out.println("7:Pa-> " + Normalizer.normalize("Pa", Normalizer.Form.NFC));
System.out.println("8:㍉ -> " + Normalizer.normalize("㍉", Normalizer.Form.NFC));
System.out.println("9：① -> " + Normalizer.normalize("①", Normalizer.Form.NFC));

NFKC

System.out.println("1:C-> " + Normalizer.normalize("C", Normalizer.Form.NFKC));
System.out.println("2:Ba-> " + Normalizer.normalize("Ba", Normalizer.Form.NFKC));
System.out.println("3:Ba-> " + Normalizer.normalize("Ba", Normalizer.Form.NFKC));
System.out.println("4:If-> " + Normalizer.normalize("If", Normalizer.Form.NFKC));
System.out.println("5:Pacific League-> " + Normalizer.normalize("Pacific League", Normalizer.Form.NFKC));
System.out.println("6:Pacific League-> " + Normalizer.normalize("Pacific League", Normalizer.Form.NFKC));
System.out.println("7:Pa-> " + Normalizer.normalize("Pa", Normalizer.Form.NFKC));
System.out.println("8:㍉ -> " + Normalizer.normalize("㍉", Normalizer.Form.NFKC));
System.out.println("9：① -> " + Normalizer.normalize("①", Normalizer.Form.NFKC));

NFD

System.out.println("1:C-> " + Normalizer.normalize("C", Normalizer.Form.NFD));
System.out.println("2:Ba-> " + Normalizer.normalize("Ba", Normalizer.Form.NFD));
System.out.println("3:Ba-> " + Normalizer.normalize("Ba", Normalizer.Form.NFD));
System.out.println("4:If-> " + Normalizer.normalize("If", Normalizer.Form.NFD));
System.out.println("5:Pacific League-> " + Normalizer.normalize("Pacific League", Normalizer.Form.NFD));
System.out.println("6:Pacific League-> " + Normalizer.normalize("Pacific League", Normalizer.Form.NFD));
System.out.println("7:Pa-> " + Normalizer.normalize("Pa", Normalizer.Form.NFD));
System.out.println("8:㍉ -> " + Normalizer.normalize("㍉", Normalizer.Form.NFD));
System.out.println("9：① -> " + Normalizer.normalize("①", Normalizer.Form.NFD));

NFKD

System.out.println("1:C-> " + Normalizer.normalize("C", Normalizer.Form.NFKD));
System.out.println("2:Ba-> " + Normalizer.normalize("Ba", Normalizer.Form.NFKD));
System.out.println("3:Ba-> " + Normalizer.normalize("Ba", Normalizer.Form.NFKD));
System.out.println("4:If-> " + Normalizer.normalize("If", Normalizer.Form.NFKD));
System.out.println("5:Pacific League-> " + Normalizer.normalize("Pacific League", Normalizer.Form.NFKD));
System.out.println("6:Pacific League-> " + Normalizer.normalize("Pacific League", Normalizer.Form.NFKD));
System.out.println("7:Pa-> " + Normalizer.normalize("Pa", Normalizer.Form.NFKD));
System.out.println("8:㍉ -> " + Normalizer.normalize("㍉", Normalizer.Form.NFKD));
System.out.println("9：① -> " + Normalizer.normalize("①", Normalizer.Form.NFKD));

When I ran the code above, the results were as follows:

- NFC: nothing is converted.
- NFKC: half-width katakana is converted to full-width katakana, and special characters are converted to their corresponding ordinary characters.
- NFD: half-width katakana is not converted. Voiced (and semi-voiced) characters are split into the unvoiced base character and the voiced (semi-voiced) sound mark. Special characters are not converted.
- NFKD: half-width katakana is converted to full-width katakana, and voiced (semi-voiced) characters are split into the unvoiced base character and the voiced (semi-voiced) sound mark. Special characters are converted to their corresponding ordinary characters.
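Because decomposed and precomposed text look the same when printed, the differences between the four forms are easier to see as code points. The following is a rough sketch along those lines; the class name `NormalizationFormsSketch` and the helpers `hex` and `dump` are hypothetical names chosen for this example.

```java
import java.text.Normalizer;

class NormalizationFormsSketch {
    // Formats each UTF-16 code unit (char) as lowercase hex, separated by spaces.
    static String hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(Integer.toHexString(s.charAt(i)));
        }
        return sb.toString();
    }

    // Normalizes the input with the given form and returns the hex dump of the result.
    static String dump(String input, Normalizer.Form form) {
        return hex(Normalizer.normalize(input, form));
    }

    public static void main(String[] args) {
        String halfWidthBa = "\uFF8A\uFF9E"; // half-width "ha" + half-width voiced sound mark
        String fullWidthBa = "\u30D0";       // full-width precomposed "ba"
        String circledOne = "\u2460";        // circled digit one

        // Print the result of every normalization form for each input.
        for (Normalizer.Form form : Normalizer.Form.values()) {
            System.out.println(form
                    + " half-width ba: " + dump(halfWidthBa, form)
                    + " | full-width ba: " + dump(fullWidthBa, form)
                    + " | circled one: " + dump(circledOne, form));
        }
    }
}
```

In this sketch the half-width input stays `ff8a ff9e` under NFC and NFD, becomes `30d0` under NFKC, and becomes `30cf 3099` under NFKD, which matches the observations listed above.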

Reference: Unicode Normalizer

NFKC looks usable as a way to convert half-width kana to full-width kana, but there seem to be a few points to be careful about.

String before = "Converting Hankaku Kana to full-width Kana";
String after = Normalizer.normalize(before, Normalizer.Form.NFKC);
System.out.println(before);
System.out.println(after);

http://d.hatena.ne.jp/stealthinu/20140826/p1

The example above causes no problem. In the following case, however, I expected the full-width yen sign (￥) to be output as is, but it was converted to the half-width yen sign (¥).

String before = "ｺｰﾋｰ牛乳 ￥110"; // half-width kana, kanji, and a full-width yen sign (U+FFE5)
String after = Normalizer.normalize(before, Normalizer.Form.NFKC);
System.out.println(before);
System.out.println(after);

The code points before and after the conversion are as follows.

before:ff7a ff70 ff8b ff70 725b 4e73 20 ffe5 31 31 30 
after :30b3 30fc 30d2 30fc 725b 4e73 20 a5 31 31 30

If the goal is only to convert half-width kana to full-width kana, it may be better to prepare something like a mapping table and write logic that converts only the characters that match it.
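As one possible alternative to a hand-written mapping table, the sketch below applies NFKC only to runs of characters in the half-width range U+FF61..U+FF9F (half-width katakana and half-width CJK punctuation) and copies everything else unchanged. The class and method names (`HalfWidthKanaSketch`, `isHalfWidthKana`, `toFullWidthKana`) are made up for this example, and the approach is only an assumption about what such logic could look like, not a tested production routine.

```java
import java.text.Normalizer;

class HalfWidthKanaSketch {
    // True for half-width katakana and half-width CJK punctuation (U+FF61..U+FF9F).
    static boolean isHalfWidthKana(char c) {
        return c >= '\uFF61' && c <= '\uFF9F';
    }

    // Converts only half-width kana runs with NFKC; all other characters are copied as is.
    static String toFullWidthKana(String input) {
        StringBuilder out = new StringBuilder(input.length());
        int i = 0;
        while (i < input.length()) {
            if (!isHalfWidthKana(input.charAt(i))) {
                out.append(input.charAt(i));
                i++;
                continue;
            }
            // Normalize the whole run at once so that a base character and a
            // following half-width (semi-)voiced sound mark can be composed.
            int start = i;
            while (i < input.length() && isHalfWidthKana(input.charAt(i))) {
                i++;
            }
            out.append(Normalizer.normalize(input.substring(start, i), Normalizer.Form.NFKC));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Half-width "ko-hi-", two kanji, a space, a full-width yen sign, and "110".
        String before = "\uFF7A\uFF70\uFF8B\uFF70\u725B\u4E73 \uFFE5110";
        String after = toFullWidthKana(before);
        System.out.println(before);
        System.out.println(after); // the full-width yen sign (U+FFE5) is left untouched
    }
}
```

Normalizing per run rather than per character matters for voiced sounds: a half-width "ha" followed by a half-width voiced sound mark becomes the single full-width character "ba" only when the two are normalized together.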

Copy of array

You can copy an array by using `Arrays.copyOf`.

String[] before = {"coffee", "milk", "110 yen"};
String[] after = Arrays.copyOf(before, before.length);
System.out.println("before:" + Arrays.toString(before));
System.out.println("after :" + Arrays.toString(after));
before[0] = "Strawberry";
after[2] = "150 yen";
System.out.println("before:" + Arrays.toString(before));
System.out.println("after :" + Arrays.toString(after));

before:[coffee, milk, 110 yen]
after :[coffee, milk, 110 yen]
before:[Strawberry, milk, 110 yen]
after :[coffee, milk, 150 yen]

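The copy is a separate array, so reassigning an element in one array does not affect the other. Note that `Arrays.copyOf` copies only the element references (a shallow copy); it behaves like a deep copy here because `String` is immutable. The `toString` method of the `Arrays` class was added in "1.5".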
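One point worth checking is what happens when the elements are mutable objects: `Arrays.copyOf` creates a new array but copies the element references, so both arrays still point at the same objects. The following minimal sketch uses `StringBuilder` elements to show this; the class name `CopyOfSketch` and the `run` method are hypothetical names for this illustration.

```java
import java.util.Arrays;

class CopyOfSketch {
    // Returns the three lines that main prints, so the behavior can be checked.
    static String[] run() {
        StringBuilder[] before = { new StringBuilder("coffee"), new StringBuilder("milk") };
        StringBuilder[] after = Arrays.copyOf(before, before.length);

        before[0].append(" milk");            // mutates the object shared by both arrays
        before[1] = new StringBuilder("tea"); // replaces the reference in "before" only

        // A new length larger than the original pads the copy with null.
        String[] padded = Arrays.copyOf(new String[] {"a", "b"}, 3);

        return new String[] {
            Arrays.toString(before),
            Arrays.toString(after),
            Arrays.toString(padded)
        };
    }

    public static void main(String[] args) {
        String[] lines = run();
        System.out.println("before:" + lines[0]); // before:[coffee milk, tea]
        System.out.println("after :" + lines[1]); // after :[coffee milk, milk]
        System.out.println("padded:" + lines[2]); // padded:[a, b, null]
    }
}
```

If fully independent elements are needed, each element has to be copied explicitly, for example by constructing a new object per element in a loop.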

Next, I plan to look into "7", "8", and "9" together.

What I researched about Java 6
