When a yen mark (half-width ¥) was entered on Android, its character code ended up being treated as a backslash (half-width \) on Windows. This article summarizes the difference in code points between the yen mark and the backslash, and a countermeasure.
With UTF-8, each environment uses the following values for the yen mark and the backslash.
Windows (Web): yen mark 5C, backslash 5C
Mac / iOS / Android (Web): yen mark A5, backslash 5C
iOS / Android (native app): yen mark C2A5, backslash 5C
Note that C2A5 is the two-byte UTF-8 encoding of the code point A5 (U+00A5).
[Reference URL] Character code of yen mark and backslash
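The relationship between the values A5 and C2A5 in the list above can be checked directly in Java. The following is a minimal sketch, not part of the original class; the class name YenMarkBytes and the helper utf8Hex are hypothetical names introduced only for this illustration.

```java
import java.nio.charset.StandardCharsets;

public class YenMarkBytes {
    /** Returns the UTF-8 bytes of the given string as uppercase hex (hypothetical helper). */
    static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            // Mask to 0..255 so negative byte values print as two hex digits.
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String yen = "\u00A5";    // yen mark, code point A5
        String backslash = "\\";  // backslash, code point 5C
        System.out.println(Integer.toHexString(yen.codePointAt(0))); // a5
        System.out.println(utf8Hex(yen));                            // C2A5
        System.out.println(utf8Hex(backslash));                      // 5C
    }
}
```

In other words, a Java String that was decoded from UTF-8 holds the yen mark as the single code point A5, and C2A5 only appears when looking at the encoded bytes.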
As a countermeasure, all yen mark code points are unified to the value used on Windows Web (code point: A5 or C2A5 ⇒ 5C).
I created a class (CodePointConversion.java) that performs the code point conversion. It converts the target code points (A5 or C2A5) to the code point 5C.
CodePointConversion.java
import java.util.HashMap;
import java.util.Map;

/**
 * Code point conversion class
 * @author HogeHoge
 */
public class CodePointConversion {
    // Code point conversion table. Converts from the KEY code point to the VALUE code point.
    private static Map<Integer, Integer> conversion_map = new HashMap<Integer, Integer>() {
        // ¥ → \
        {
            put(0xA5, 0x5C);
            // The UTF-8 byte value C2A5, registered here as a code point.
            put(0xC2A5, 0x5C);
        }
    };

    /**
     * Perform code point conversion
     * @param str String to convert
     * @return String after code point conversion
     */
    public static String convertCordPoint(String str) {
        // null check
        if (str == null) {
            return str;
        }
        StringBuilder sb = new StringBuilder(str);
        // Loop over each character
        for (int i = 0; i < sb.length(); i++) {
            // Get the code point of the current character
            int code_point = sb.codePointAt(i);
            for (Map.Entry<Integer, Integer> entry : conversion_map.entrySet()) {
                if (code_point == entry.getKey()) {
                    // If it is a conversion target, replace it with the mapped code point.
                    String converted_char = new String(Character.toChars(entry.getValue()));
                    sb.replace(i, i + 1, converted_char);
                }
            }
        }
        return sb.toString();
    }
}
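As a possible variation, the same table lookup can be written by walking the string one code point at a time with String.codePoints(), which also handles characters stored as surrogate pairs. This is only a sketch under assumptions: the class name CodePointMapper and the method convert are hypothetical, and the table is passed in as a parameter rather than held in a static field.

```java
import java.util.HashMap;
import java.util.Map;

public class CodePointMapper {
    /** Replaces every code point that is a key of table with its mapped value. */
    static String convert(String str, Map<Integer, Integer> table) {
        if (str == null) {
            return null;
        }
        StringBuilder sb = new StringBuilder(str.length());
        // codePoints() yields whole code points, so surrogate pairs are never split.
        str.codePoints().forEach(cp -> sb.appendCodePoint(table.getOrDefault(cp, cp)));
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<Integer, Integer> yenToBackslash = new HashMap<>();
        yenToBackslash.put(0xA5, 0x5C); // ¥ → backslash
        String result = convert("C:\u00A5temp", yenToBackslash);
        System.out.println(Integer.toHexString(result.codePointAt(2))); // 5c
    }
}
```

Code points that are not in the table are copied through unchanged, so the method can be called on arbitrary input.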
This is the test code for checking the operation. See the execution result for the input and output values.
CodePointConversionTest.java
import org.junit.jupiter.api.Test;
import static org.hamcrest.MatcherAssert.assertThat;
import static org.hamcrest.CoreMatchers.*;

class CodePointConversionTest {
    @Test
    void testConvertCordPoint001() {
        // [Input value] ¥¥ (half-width yen marks)
        String input_str = new String(Character.toChars(0xA5)) + new String(Character.toChars(0xC2A5));
        // [Expected value] \\ (in the string literal, each backslash is escaped by a preceding backslash)
        String expect_str = "\\\\";
        System.out.println("【Input value】\n" + input_str);
        System.out.println("[Input code point]\n" + Integer.toHexString(input_str.codePointAt(0)) + "\n" + Integer.toHexString(input_str.codePointAt(1)));
        // Code point conversion: ¥ is converted to a backslash.
        String result_str = CodePointConversion.convertCordPoint(input_str);
        // Output value
        System.out.println("\n【Output value】\n" + result_str);
        System.out.println("[Output code point]\n" + Integer.toHexString(result_str.codePointAt(0)) + "\n" + Integer.toHexString(result_str.codePointAt(1)));
        // Check the output result.
        assertThat(result_str, is(expect_str));
    }
}
The execution result is as follows.
【Input value】
\?
[Input code point]
a5
c2a5
【Output value】
\\
[Output code point]
5c
5c
The input value above is shown with a "?", but that is only a display problem on the Windows terminal. The input string actually contains the two characters "yen mark: A5" and "yen mark: C2A5", as the input code point lines show.
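Because the terminal may not render these characters faithfully, it can help to print the code points rather than the characters themselves. The sketch below shows one way to do that; the class name CodePointDump and the method dump are hypothetical and not part of the original test code.

```java
import java.util.stream.Collectors;

public class CodePointDump {
    /** Formats every code point of s as U+XXXX, separated by spaces. */
    static String dump(String s) {
        return s.codePoints()
                .mapToObj(cp -> String.format("U+%04X", cp))
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        // A yen mark followed by a backslash.
        System.out.println(dump("\u00A5\\")); // U+00A5 U+005C
    }
}
```

The output uses only ASCII characters, so it reads the same regardless of the terminal's character encoding.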
So far, this article has described how to convert the yen mark to a backslash (code point: A5 or C2A5 ⇒ 5C). On Android, however, there may be cases where you want to go the other way and treat a backslash as a yen mark (code point: 5C ⇒ A5).
In that case, swap the KEY and VALUE in conversion_map (the code point conversion table).
(Example) Treating the code point 5C as A5.
// Code point conversion table. Converts from the KEY code point to the VALUE code point.
private static Map<Integer, Integer> conversion_map = new HashMap<Integer, Integer>() {
    // \ → ¥
    {
        put(0x5C, 0xA5);
    }
};
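When only this single pair needs to be swapped, the standard String.replace(char, char) method may be enough without a conversion table. The following is a small sketch of that alternative, not the table-based approach described above; the class name BackslashToYen and the method toYen are hypothetical.

```java
public class BackslashToYen {
    /** Replaces every backslash (5C) with a yen mark (A5); null is returned as-is. */
    static String toYen(String s) {
        return s == null ? null : s.replace('\\', '\u00A5');
    }

    public static void main(String[] args) {
        String result = toYen("\\1000");
        System.out.println(Integer.toHexString(result.codePointAt(0))); // a5
    }
}
```

The table-based class remains the more flexible choice once more than one mapping is involved.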
This article described how to convert between the yen mark and backslash code points (code point: A5 or C2A5 ⇔ 5C). Other code points can be converted as well, simply by adding entries to conversion_map (the code point conversion table).
Feel free to use it when you need to perform code point conversion in Java.