My knowledge of Java more or less stopped at 1.4. To change that, I am going to catch up little by little.
The app I mainly work on at my job currently runs on Java 5, but it was originally written before 1.4 and still contains old-style code. As part of upgrading that app, this post covers only what I looked into about the features added in Java 6.
Java 6 seems to have been mostly about desktop-related enhancements. I picked out and investigated two features that I thought I might actually use in the app I work on.
Unicode normalization is now possible with `java.text.Normalizer`. This got difficult right away...
Reference: Unicode Normalization
There are four types of Unicode normalization:
Form | Overview |
---|---|
NFC | Canonical decomposition, followed by canonical composition |
NFD | Canonical decomposition |
NFKC | Compatibility decomposition, followed by canonical composition |
NFKD | Compatibility decomposition |
- What is composition? Converting a combining character sequence into a precomposed character.
- What is decomposition? Converting a precomposed character into a combining character sequence.
- What is a combining character sequence? A string made up of multiple characters that is displayed as a single character.
- What is a precomposed character? A character such as "ga", "gi", or "gu" that is represented as a single character even though it combines "ka", "ki", or "ku" with a dakuten (゛). Reference: [Precomposed character](https://ja.wikipedia.org/wiki/%E5%90%88%E6%88%90%E6%B8%88%E3%81%BF%E6%96%87%E5%AD%97)
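To make these terms concrete, here is a minimal sketch of a round trip between a precomposed character and its combining sequence. The class name `ComposeDecomposeDemo` and the choice of the katakana GA (U+30AC) as the sample are my own assumptions for illustration; only `Normalizer.normalize` and `Normalizer.isNormalized` come from the JDK.

```java
import java.text.Normalizer;

class ComposeDecomposeDemo {
    // Precomposed character: KATAKANA LETTER GA (U+30AC)
    static final String PRECOMPOSED = "\u30AC";
    // Combining sequence: KATAKANA LETTER KA (U+30AB)
    // followed by COMBINING VOICED SOUND MARK (U+3099)
    static final String COMBINING = "\u30AB\u3099";

    // Decomposition: precomposed character -> combining sequence (NFD)
    static String decompose(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD);
    }

    // Composition: combining sequence -> precomposed character (NFC)
    static String compose(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    public static void main(String[] args) {
        // The two strings look the same on screen but are different char sequences.
        System.out.println(PRECOMPOSED.equals(COMBINING));
        System.out.println(PRECOMPOSED.length() + " vs " + COMBINING.length());

        // Normalizing both to the same form makes them comparable.
        System.out.println(decompose(PRECOMPOSED).equals(COMBINING));
        System.out.println(compose(COMBINING).equals(PRECOMPOSED));

        // isNormalized reports whether a string is already in the given form.
        System.out.println(Normalizer.isNormalized(COMBINING, Normalizer.Form.NFC));
    }
}
```

The practical takeaway is that strings should be normalized to one form before comparing them with `equals`.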
System.out.println("1:C-> " + Normalizer.normalize("C", Normalizer.Form.NFC));
System.out.println("2:Ba-> " + Normalizer.normalize("Ba", Normalizer.Form.NFC));
System.out.println("3:Ba-> " + Normalizer.normalize("Ba", Normalizer.Form.NFC));
System.out.println("4:If-> " + Normalizer.normalize("If", Normalizer.Form.NFC));
System.out.println("5:Pacific League-> " + Normalizer.normalize("Pacific League", Normalizer.Form.NFC));
System.out.println("6:Pacific League-> " + Normalizer.normalize("Pacific League", Normalizer.Form.NFC));
System.out.println("7:Pa-> " + Normalizer.normalize("Pa", Normalizer.Form.NFC));
System.out.println("8:㍉ -> " + Normalizer.normalize("㍉", Normalizer.Form.NFC));
System.out.println("9:① -> " + Normalizer.normalize("①", Normalizer.Form.NFC));
System.out.println("1:C-> " + Normalizer.normalize("C", Normalizer.Form.NFKC));
System.out.println("2:Ba-> " + Normalizer.normalize("Ba", Normalizer.Form.NFKC));
System.out.println("3:Ba-> " + Normalizer.normalize("Ba", Normalizer.Form.NFKC));
System.out.println("4:If-> " + Normalizer.normalize("If", Normalizer.Form.NFKC));
System.out.println("5:Pacific League-> " + Normalizer.normalize("Pacific League", Normalizer.Form.NFKC));
System.out.println("6:Pacific League-> " + Normalizer.normalize("Pacific League", Normalizer.Form.NFKC));
System.out.println("7:Pa-> " + Normalizer.normalize("Pa", Normalizer.Form.NFKC));
System.out.println("8:㍉ -> " + Normalizer.normalize("㍉", Normalizer.Form.NFKC));
System.out.println("9:① -> " + Normalizer.normalize("①", Normalizer.Form.NFKC));
System.out.println("1:C-> " + Normalizer.normalize("C", Normalizer.Form.NFD));
System.out.println("2:Ba-> " + Normalizer.normalize("Ba", Normalizer.Form.NFD));
System.out.println("3:Ba-> " + Normalizer.normalize("Ba", Normalizer.Form.NFD));
System.out.println("4:If-> " + Normalizer.normalize("If", Normalizer.Form.NFD));
System.out.println("5:Pacific League-> " + Normalizer.normalize("Pacific League", Normalizer.Form.NFD));
System.out.println("6:Pacific League-> " + Normalizer.normalize("Pacific League", Normalizer.Form.NFD));
System.out.println("7:Pa-> " + Normalizer.normalize("Pa", Normalizer.Form.NFD));
System.out.println("8:㍉ -> " + Normalizer.normalize("㍉", Normalizer.Form.NFD));
System.out.println("9:① -> " + Normalizer.normalize("①", Normalizer.Form.NFD));
System.out.println("1:C-> " + Normalizer.normalize("C", Normalizer.Form.NFKD));
System.out.println("2:Ba-> " + Normalizer.normalize("Ba", Normalizer.Form.NFKD));
System.out.println("3:Ba-> " + Normalizer.normalize("Ba", Normalizer.Form.NFKD));
System.out.println("4:If-> " + Normalizer.normalize("If", Normalizer.Form.NFKD));
System.out.println("5:Pacific League-> " + Normalizer.normalize("Pacific League", Normalizer.Form.NFKD));
System.out.println("6:Pacific League-> " + Normalizer.normalize("Pacific League", Normalizer.Form.NFKD));
System.out.println("7:Pa-> " + Normalizer.normalize("Pa", Normalizer.Form.NFKD));
System.out.println("8:㍉ -> " + Normalizer.normalize("㍉", Normalizer.Form.NFKD));
System.out.println("9:① -> " + Normalizer.normalize("①", Normalizer.Form.NFKD));
Running the code above gave the following results:
- NFC: no conversion.
- NFKC: half-width katakana is converted to full-width katakana, and special characters are converted to their corresponding characters.
- NFD: half-width katakana is not converted. Voiced (and semi-voiced) characters are split into the unvoiced base character and the voiced (semi-voiced) sound mark. Special characters are not converted.
- NFKD: half-width katakana is converted to full-width katakana, and voiced (semi-voiced) characters are split into the unvoiced base character and the voiced (semi-voiced) sound mark. Special characters are converted to their corresponding characters.
Reference: Unicode Normalizer
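The summary above can be checked with a small sketch that reports, for each form, whether an input is changed at all. The sample inputs are written as `\u` escapes so the exact characters are unambiguous. They are my own picks (half-width KA plus the half-width voiced sound mark, full-width GA, ㍉, and ①), and the class and method names are hypothetical.

```java
import java.text.Normalizer;

class NormalizationFormsDemo {
    // True if normalizing s with the given form produces a different string.
    static boolean changes(String s, Normalizer.Form form) {
        return !Normalizer.normalize(s, form).equals(s);
    }

    public static void main(String[] args) {
        String[] labels = {"half-width KA + voiced mark", "full-width GA", "SQUARE MIRI", "CIRCLED ONE"};
        String[] inputs = {
            "\uFF76\uFF9E", // half-width KA followed by half-width voiced sound mark
            "\u30AC",       // full-width GA (precomposed)
            "\u3349",       // square "miri" symbol
            "\u2460"        // circled digit one
        };
        Normalizer.Form[] forms = {
            Normalizer.Form.NFC, Normalizer.Form.NFKC,
            Normalizer.Form.NFD, Normalizer.Form.NFKD
        };
        for (Normalizer.Form form : forms) {
            for (int i = 0; i < inputs.length; i++) {
                String result = changes(inputs[i], form) ? "converted" : "no conversion";
                System.out.println(form + " / " + labels[i] + ": " + result);
            }
        }
    }
}
```

With these inputs, only the compatibility forms (NFKC and NFKD) touch the half-width and special characters, while the canonical forms only affect the precomposed full-width character.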
It looks like NFKC could be used as a way to convert half-width kana to full-width kana, but there seem to be some points to be careful about.
String before = "Converting Hankaku Kana to full-width Kana";
String after = Normalizer.normalize(before, Normalizer.Form.NFKC);
System.out.println(before);
System.out.println(after);
http://d.hatena.ne.jp/stealthinu/20140826/p1
The case above is fine, but in the following case I expected the full-width yen sign (￥) to be output as it is. Instead, it was converted to the half-width yen sign (¥).
String before = "ｺｰﾋｰ牛乳 ￥110";
String after = Normalizer.normalize(before, Normalizer.Form.NFKC);
System.out.println(before);
System.out.println(after);
The character codes (in hexadecimal) are as follows.
before:ff7a ff70 ff8b ff70 725b 4e73 20 ffe5 31 31 30
after :30b3 30fc 30d2 30fc 725b 4e73 20 a5 31 31 30
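For reference, a listing like the one above can be produced with a few lines of code. This is only a sketch: the helper name `toHex` is hypothetical, the input is rebuilt from the code values listed above, and the loop walks UTF-16 `char` values, which is sufficient here because every character is in the Basic Multilingual Plane.

```java
import java.text.Normalizer;

class CodePointDump {
    // Render each UTF-16 char of s as lowercase hex, separated by spaces.
    static String toHex(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(Integer.toHexString(s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Half-width "coffee", the kanji for "milk", a space, full-width yen sign, "110"
        String before = "\uFF7A\uFF70\uFF8B\uFF70\u725B\u4E73 \uFFE5110";
        String after = Normalizer.normalize(before, Normalizer.Form.NFKC);
        System.out.println("before:" + toHex(before));
        System.out.println("after :" + toHex(after));
    }
}
```

The only difference outside the kana is `ffe5` becoming `a5`, which is the yen sign conversion described above.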
If the goal is only to convert half-width kana to full-width kana, it may be better to build something like a mapping table and write logic that converts only the characters that match it.
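A rough sketch of that mapping-table idea might look like the following. Everything here is hypothetical: the class name `HalfWidthKanaTable`, the method `toFullWidth`, and above all the table itself, which is deliberately incomplete and holds only a handful of entries for illustration. A real table would need every half-width kana, including the voiced and semi-voiced pairs.

```java
import java.util.HashMap;
import java.util.Map;

class HalfWidthKanaTable {
    // Hypothetical, incomplete mapping: half-width kana -> full-width kana.
    private static final Map<String, String> TABLE = new HashMap<String, String>();
    static {
        TABLE.put("\uFF7A", "\u30B3");       // half-width KO -> full-width KO
        TABLE.put("\uFF8B", "\u30D2");       // half-width HI -> full-width HI
        TABLE.put("\uFF70", "\u30FC");       // half-width prolonged sound mark -> full-width
        TABLE.put("\uFF76", "\u30AB");       // half-width KA -> full-width KA
        TABLE.put("\uFF76\uFF9E", "\u30AC"); // half-width KA + voiced mark -> GA
        TABLE.put("\uFF8A\uFF9F", "\u30D1"); // half-width HA + semi-voiced mark -> PA
    }

    // Convert only the characters found in TABLE; everything else is copied as is.
    static String toFullWidth(String s) {
        StringBuilder sb = new StringBuilder();
        int i = 0;
        while (i < s.length()) {
            // Try a two-char match first (kana followed by a sound mark).
            if (i + 1 < s.length()) {
                String pair = TABLE.get(s.substring(i, i + 2));
                if (pair != null) {
                    sb.append(pair);
                    i += 2;
                    continue;
                }
            }
            String single = TABLE.get(s.substring(i, i + 1));
            sb.append(single != null ? single : s.substring(i, i + 1));
            i++;
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Same input as the yen sign example: the full-width yen sign is not in the table.
        String before = "\uFF7A\uFF70\uFF8B\uFF70\u725B\u4E73 \uFFE5110";
        System.out.println(toFullWidth(before));
    }
}
```

Because the full-width yen sign (U+FFE5) is not in the table, it passes through unchanged, which is the behavior NFKC could not provide.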
You can copy an array by using `Arrays.copyOf`.
String[] before = {"coffee", "milk", "110 yen"};
String[] after = Arrays.copyOf(before, before.length);
System.out.println("before:" + Arrays.toString(before));
System.out.println("after :" + Arrays.toString(after));
before[0] = "Strawberry";
after[2] = "150 yen";
System.out.println("before:" + Arrays.toString(before));
System.out.println("after :" + Arrays.toString(after));
before:[coffee, milk, 110 yen]
after :[coffee, milk, 110 yen]
before:[Strawberry, milk, 110 yen]
after :[coffee, milk, 150 yen]
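The copy is a separate array, so reassigning an element in one array does not affect the other. Note that it is a shallow copy, not a deep copy: the array is new, but the elements are the same objects. The `toString` method of the `Arrays` class was added in 1.5.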
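To check how far the copy actually goes, here is a small sketch using mutable elements. `Arrays.copyOf` allocates a new array but copies the element references, which is hard to notice with immutable `String` elements. The class name `CopyOfDemo` and the sample values are my own; the second half also shows what the `newLength` argument does when it differs from the original length.

```java
import java.util.Arrays;

class CopyOfDemo {
    public static void main(String[] args) {
        StringBuilder[] before = { new StringBuilder("coffee"), new StringBuilder("milk") };
        StringBuilder[] after = Arrays.copyOf(before, before.length);

        // Reassigning a slot changes only the array it is assigned in.
        after[1] = new StringBuilder("tea");

        // Mutating an element is visible through both arrays,
        // because both slots still point at the same StringBuilder.
        before[0].append(" (hot)");

        System.out.println("before:" + Arrays.toString(before)); // before:[coffee (hot), milk]
        System.out.println("after :" + Arrays.toString(after));  // after :[coffee (hot), tea]

        // A shorter newLength truncates; a longer one pads with null for object arrays.
        String[] src = {"a", "b", "c"};
        System.out.println(Arrays.toString(Arrays.copyOf(src, 2))); // [a, b]
        System.out.println(Arrays.toString(Arrays.copyOf(src, 4))); // [a, b, c, null]
    }
}
```

If the elements are mutable and the two arrays must be fully independent, each element has to be copied as well.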
Next, I plan to look into Java 7, 8, and 9 together.