Japanese text in an R script sometimes causes problems, so I work around it as follows: build the string from its code points instead of typing it directly.
# R
intToUtf8(c(12371, 12435, 12395, 12385, 12399))
## [1] "こんにちは"
To find which number corresponds to each character you want:
# R
utf8ToInt("こんにちは")
## [1] 12371 12435 12395 12385 12399
I look the numbers up once like this and then write them in the script, so the script itself contains no Japanese.
You can also look it up in Python.
# python3
[ord(s) for s in "こんにちは"]
## [12371, 12435, 12395, 12385, 12399]
In Python 2, the u"" prefix is required.
# python2
[ord(s) for s in u"こんにちは"]
## [12371, 12435, 12395, 12385, 12399]
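If the goal is to paste the result straight into an R script, the lookup can also print the whole R expression. This is only a minimal Python 3 sketch; the helper name `to_r_int_call` is made up for illustration and is not part of any library.

```python
# python3
def to_r_int_call(text):
    """Return an R expression that rebuilds `text` from its code points.

    `to_r_int_call` is a hypothetical helper name, not a library function.
    """
    # ord() gives the Unicode code point of each character
    code_points = ", ".join(str(ord(ch)) for ch in text)
    return "intToUtf8(c({}))".format(code_points)


print(to_r_int_call("こんにちは"))
## intToUtf8(c(12371, 12435, 12395, 12385, 12399))
```

The printed line is pure ASCII, so it can be copied into the R script as-is.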
It seems that R also accepts Unicode escapes in string literals.
"\u3053\u3093\u306b\u3061\u306f"
## [1] "こんにちは"
The code points here appear to be specified in hexadecimal. There are many ways to get the hexadecimal values.
In R, it looks like this.
# R
sprintf("%x", utf8ToInt("こんにちは"))
## [1] "3053" "3093" "306b" "3061" "306f"
You can use hex in Python.
# python3
[hex(ord(s)) for s in "こんにちは"]
['0x3053', '0x3093', '0x306b', '0x3061', '0x306f']
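The hexadecimal values can be joined directly into an R string literal with Unicode escapes. The following is a rough Python 3 sketch; `to_r_escape` is a hypothetical helper name, and it assumes the ASCII part of the text contains no control characters such as newlines.

```python
# python3
def to_r_escape(text):
    """Build an ASCII-only R string literal for `text`.

    Hypothetical helper; assumes no control characters in the ASCII part.
    """
    parts = []
    for ch in text:
        cp = ord(ch)
        if ch in ('"', "\\"):
            # escape characters that would otherwise break the R literal
            parts.append("\\" + ch)
        elif cp < 0x80:
            # plain ASCII is kept as-is
            parts.append(ch)
        elif cp <= 0xFFFF:
            # 4 hex digits after \u
            parts.append("\\u{:04x}".format(cp))
        else:
            # code points above U+FFFF use \U with 8 hex digits
            parts.append("\\U{:08x}".format(cp))
    return '"' + "".join(parts) + '"'


print(to_r_escape("こんにちは"))
## "\u3053\u3093\u306b\u3061\u306f"
```

Padding to a fixed number of hex digits keeps a following ASCII character such as "a" or "f" from being read as part of the escape.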
By the way, when putting this into an R package, using a "\u..." string in a function definition seems to produce the following warning.
plotat.Rd: non-ASCII input and no declared encoding
It seems that using double-byte (non-ASCII) characters in R help files is not recommended.
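Since the warning is about non-ASCII input, it may help to check where such characters remain before building the package. This is a minimal Python 3 sketch, assuming the file content has already been read into a string; `find_non_ascii` and the `sample` text are made up for illustration.

```python
# python3
def find_non_ascii(source):
    """Return (line number, column, character) for each non-ASCII character.

    Hypothetical helper; line and column numbers start at 1.
    """
    hits = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                hits.append((line_no, col, ch))
    return hits


# made-up sample standing in for the contents of a script or help file
sample = 'x <- 1\ntitle <- "こんにちは"\n'
for line_no, col, ch in find_non_ascii(sample):
    print(line_no, col, hex(ord(ch)))
```

Each reported position is a candidate for replacement with one of the ASCII-only forms above.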