A single letter with a diacritical mark, such as the umlauts used in German, is displayed as two garbled characters. For example, the place name Kärnten becomes KÃ¤rnten.
Most other letters are not garbled, so the problem is easy to miss (in fact, if you google "KÃ¤rnten" you'll find plenty of sites garbled this way).
I ran into this problem when reading and writing the Exif metadata of an image in Java.
The cause is that a string saved as `utf-8` was read as `iso-8859-1`.
Below is an example session in a Java REPL.
```java
java> String s = new String("Kärnten")
java> byte[] iso = s.getBytes("ISO-8859-1")
byte[] iso = [75, -28, 114, 110, 116, 101, 110]
java> byte[] utf8 = s.getBytes("UTF-8")
byte[] utf8 = [75, -61, -92, 114, 110, 116, 101, 110]
```
Thus, "ä" is represented by 1 byte (`-28`) in ISO-8859-1 and by 2 bytes (`-61, -92`) in UTF-8. If you save the byte string in UTF-8 and then read it as ISO-8859-1, `-61` will be interpreted as "Ã" and `-92` will be interpreted as "¤". So:
```java
java> new String(utf8, "ISO-8859-1")
KÃ¤rnten
```
This is how the garbled string arises.
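As a standalone illustration of the mechanism, here is a minimal sketch outside the REPL. The class and method names (`MojibakeDemo`, `garble`, `repair`) are made up for this example and are not from any library. It also shows something the REPL session does not: because ISO-8859-1 maps every byte value to exactly one character, the garbling can be reversed, assuming the text really was decoded as ISO-8859-1 and nothing else has altered it since.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration only.
final class MojibakeDemo {

    // Encode as UTF-8, then (wrongly) decode those bytes as ISO-8859-1.
    static String garble(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Undo garble(): recover the original bytes, then decode them as UTF-8.
    static String repair(String garbled) {
        byte[] original = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(original, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "Kärnten", written with a Unicode escape so that the encoding
        // of this source file cannot affect the result.
        String name = "K\u00e4rnten";
        String garbled = garble(name);

        System.out.println(garbled.length());             // 8: one char per UTF-8 byte
        System.out.println(repair(garbled).equals(name)); // true
    }
}
```

Using the `StandardCharsets` constants instead of charset name strings avoids the checked `UnsupportedEncodingException` and rules out typos in the charset name.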
The same applies to other letters with diacritical marks. For example, ö becomes Ã¶ and ü becomes Ã¼.
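To see why these letters all garble in a similar way, it helps to look at their UTF-8 bytes. The sketch below is only an illustration: `DiacriticsDemo` and `utf8Hex` are hypothetical names, and the sample letters are my own choice. Every character from U+00C0 to U+00FF encodes to two bytes starting with 0xC3, which ISO-8859-1 reads as "Ã".

```java
import java.nio.charset.StandardCharsets;

// Hypothetical class for illustration only.
final class DiacriticsDemo {

    // Returns the UTF-8 bytes of the text as space-separated uppercase hex.
    static String utf8Hex(String text) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            // Mask to 0..255 because Java bytes are signed.
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Escapes keep the example independent of the source file encoding.
        System.out.println(utf8Hex("\u00e4")); // ä: C3 A4
        System.out.println(utf8Hex("\u00f6")); // ö: C3 B6
        System.out.println(utf8Hex("\u00fc")); // ü: C3 BC
        System.out.println(utf8Hex("\u00e9")); // é: C3 A9
    }
}
```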
The fix, obviously, is to specify the correct character encoding for both reading and writing, and to use the same one on each side.
```java
java> new String(utf8, "UTF-8");
Kärnten
java> new String(iso, "ISO-8859-1");
Kärnten
```
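The same rule applies when the bytes go through a file. The following is a rough sketch using a plain temporary text file as a stand-in for the image metadata (the Exif library itself is not shown, and `ExplicitCharsetIo` and `roundTrip` are names invented for this example). The point is simply that the charset is named explicitly on both the write and the read, rather than left to the platform default.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical class for illustration only.
final class ExplicitCharsetIo {

    // Writes the text to a temporary file as UTF-8 and reads it back as UTF-8.
    static String roundTrip(String text) {
        try {
            Path file = Files.createTempFile("charset-demo", ".txt");
            try {
                // Write: the encoding is stated explicitly.
                Files.write(file, text.getBytes(StandardCharsets.UTF_8));
                // Read: the same encoding is stated explicitly.
                return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String name = "K\u00e4rnten"; // "Kärnten"
        System.out.println(roundTrip(name).equals(name)); // true
    }
}
```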
https://forum.httrack.com/readmsg/18923/indexhtml