Myanmar is one of the regions where Internet technology has lagged behind the rest of the world for historical reasons. The Zawgyi character encoding used to be the mainstream, but as the market opens up and internationalization progresses, it is being replaced by Unicode.
This resembles Japan's history, where websites encoded in EUC-JP and Shift_JIS (SJIS) gradually moved to UTF-8. https://enjoy-yangon.com/ja/enyanblog/351-change-myanmar-font-zawgyi-to-unicode
If you are not a local, the text looks garbled either way, so we engineers and programmers honestly cannot tell what the problem is. Still, as an engineer, you need to work on solving it.
In other words, you need to define the requirements for solving the problem and then solve it with software.
Requirement 1: Determine whether a piece of text is Zawgyi or Unicode.
Requirement 2: Convert the character encoding from Zawgyi to Unicode.
These two points are essential requirements.
I searched GitHub and elsewhere, and Google's Myanmar Tools came up. https://github.com/google/myanmar-tools
Its documentation says it can judge whether text is Zawgyi or Unicode, so use it for the detection step.
It also contains a further hint: use Rabbit to convert the character encoding from Zawgyi to Unicode.
Rabbit-Converter https://github.com/Rabbit-Converter
Two libraries were found.
With PHP, all you have to do is install the libraries with Composer, load the classes, and pass the text through them. It's easy to use.
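php
// Load the Composer-installed classes; add `use` statements for the packages' namespaces as needed.
require 'vendor/autoload.php';
$ZawgyiDetector = new ZawgyiDetector();
$Rabbit = new Rabbit();
$text = 'Myanmar text';
// Probability that $text is Zawgyi-encoded.
$check = $ZawgyiDetector->getZawgyiProbability($text);
// Default to the original text so $newtext is always defined.
$newtext = $text;
if ($check >= 0.95) {
    // Very likely Zawgyi: convert it to Unicode.
    $newtext = $Rabbit->zg2uni($text);
}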
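If you call this from several places, it may help to wrap the detect-then-convert step in one function. The following is only a minimal sketch: `toUnicodeIfZawgyi` is a hypothetical helper name, and the `$detect` and `$convert` callables are assumptions standing in for `getZawgyiProbability()` and `zg2uni()` from the snippet above, so the function itself does not depend on either library.

```php
<?php
/**
 * Convert $text to Unicode only when the detector is confident it is Zawgyi.
 *
 * $detect  : hypothetical callable, string -> probability that the text is Zawgyi
 * $convert : hypothetical callable, Zawgyi string -> Unicode string
 */
function toUnicodeIfZawgyi(
    string $text,
    callable $detect,
    callable $convert,
    float $threshold = 0.95
): string {
    // Empty input has nothing to detect or convert.
    if ($text === '') {
        return $text;
    }

    $probability = (float) $detect($text);

    // Below the threshold, leave the text untouched.
    return $probability >= $threshold ? (string) $convert($text) : $text;
}
```

With the two libraries loaded, a call might look like `toUnicodeIfZawgyi($text, [$ZawgyiDetector, 'getZawgyiProbability'], [$Rabbit, 'zg2uni'])`. Keeping the threshold as a parameter makes it easy to tune if 0.95 turns out to be too strict or too loose for your content.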
If you correct the character encoding like this, the text will be displayed correctly as Unicode. To render it, a Unicode-compliant Myanmar web font must be applied in CSS.
When using a CMS or similar, running this code either when the text is stored in the database or when it is output will fix the garbled characters. I think it is better to run the check when storing to the database: if you run this logic on every output, rendering will slow down depending on the number of characters.
This is a niche topic, but if you work on a Myanmar-related website, please use it as a reference.
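As a rough illustration of doing the work once at storage time, the sketch below normalizes a value just before it is written, and skips the detector entirely for text that contains no characters from the Myanmar Unicode block (U+1000 to U+109F, the range Zawgyi also occupies). `containsMyanmarScript` and `prepareForStorage` are hypothetical names, and `$normalize` is an assumed callable that wraps whatever detection and conversion logic you use.

```php
<?php
/**
 * Return true when $text contains at least one code point from the
 * Myanmar Unicode block (U+1000 to U+109F).
 */
function containsMyanmarScript(string $text): bool
{
    // The /u modifier makes PCRE treat the pattern and subject as UTF-8.
    return preg_match('/[\x{1000}-\x{109F}]/u', $text) === 1;
}

/**
 * Normalize a value once, right before it is written to the database.
 * $normalize is a hypothetical callable: string -> Unicode string.
 */
function prepareForStorage(string $text, callable $normalize): string
{
    // Text without Myanmar characters cannot be Zawgyi, so skip the work.
    return containsMyanmarScript($text) ? (string) $normalize($text) : $text;
}
```

Because the conversion runs only on write, page rendering reads already-normalized text and pays no per-request cost. The pre-check is a cheap filter, not a replacement for the detector.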
See you again.