Verification environment

- Using Oracle Cloud
- Oracle Linux 7.7 (VM.Standard2.1)
- Python 3.6
- cx_Oracle 7.3
- Oracle Database 19.5 (ATP, 1 OCPU)
- Oracle Instant Client 18.5

Introduction

So far in this series, we have not dealt with Japanese data. However, most readers of this series probably work in a Japanese environment and store Japanese data in Oracle Database. This installment explains how to SELECT from a table that holds Japanese data without getting garbled characters.

Advance preparation

Run the following script using SQL*Plus, SQL Developer, or a similar tool. You can also substitute an existing table that contains Japanese data.

`sample05a.sql`


create table sample05a (col1 varchar2(50));
insert into sample05a values('日本オラクル株式会社');
commit;

NLS_LANG

The following application is provided as a sample. If you prepared a different table in advance, change the SELECT statement so that the column containing Japanese data comes first, then run it. Because this sample displays only one row, you can avoid unnecessary processing time by narrowing the result to a single row, for example by specifying the primary key in the WHERE clause or by limiting the result to the first row.

`sample05b.py`


import cx_Oracle

# Connection settings; replace with values for your own environment.
USERID = "admin"
PASSWORD = "FooBar"
DESTINATION = "atp1_low"
SQL = "select * from sample05a"

with cx_Oracle.connect(USERID, PASSWORD, DESTINATION) as connection:
    with connection.cursor() as cursor:
        # Fetch only the first row and print its first column.
        print(cursor.execute(SQL).fetchone()[0])
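
If you substitute your own table, the advice given before the sample about narrowing the result to one row could be sketched as follows. The helper name `build_single_row_query` and the table and column names in the usage line are hypothetical; `fetch first 1 rows only` is the row-limiting clause available in Oracle Database 12c and later.

```python
def build_single_row_query(table, column):
    """Return a SELECT that puts `column` first and stops after one row.

    Illustrative helper only (not part of cx_Oracle). The identifiers are
    interpolated as-is, so pass trusted names, never user input.
    """
    return f"select {column} from {table} fetch first 1 rows only"


# Hypothetical table and column names, used only for illustration.
SQL = build_single_row_query("customers", "customer_name")
print(SQL)
```

The resulting string can be assigned to SQL in the sample application in place of the original SELECT statement.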

Generally, when an application that accesses Oracle Database needs to support Japanese, you must set the environment variable NLS_LANG in most environments. NLS_LANG also applies to cx_Oracle. The following compares running the sample application above with and without NLS_LANG set.

$ echo $LANG
en_US.UTF-8
$ echo $NLS_LANG

$ python sample05b.py
??????????
$ export NLS_LANG=japanese_japan.al32utf8
$ python sample05b.py
日本オラクル株式会社

Before NLS_LANG is set, the data is displayed incorrectly as "??????????". Oracle Database displays "?" for characters that cannot be converted between the database character set and the client character set. After NLS_LANG is set, the data is displayed correctly.
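
The "?" substitution can be imitated in plain Python, without a database. This is only an analogy: it assumes that the client falls back to a 7-bit ASCII character set when NLS_LANG is unset, and it does not reproduce how Oracle performs the conversion internally.

```python
# A Japanese string of 10 characters, used as sample data.
text = "日本オラクル株式会社"

# Converting to a character set that cannot represent these characters
# replaces each one with "?", much like the garbled client output.
lossy = text.encode("ascii", errors="replace").decode("ascii")
print(lossy)

# A UTF-8 round trip keeps every character intact.
lossless = text.encode("utf-8").decode("utf-8")
print(lossless)
```

One "?" appears per character that could not be converted, which is why the length of the garbled output matches the length of the stored string.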

encoding argument

In cx_Oracle, separately from NLS_LANG, you can specify the character encoding with the encoding argument when connecting (cx_Oracle's connect() method). The default value of encoding is None, meaning that no encoding is specified. Because UTF-8 is the standard in Python 3, connecting with encoding set to UTF-8 lets you display Japanese data without setting NLS_LANG. However, NLS_LANG is still required if, for example, you want to receive Oracle Database error messages in Japanese. You also need to consider non-Python applications such as SQL*Plus, so it is a good idea either to standardize on NLS_LANG or to set both NLS_LANG and encoding.

`sample05c.py` (excerpt)


with cx_Oracle.connect(USERID, PASSWORD, DESTINATION, encoding="UTF-8") as connection:
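
Setting both NLS_LANG and encoding from within a script might look like the sketch below. It assumes that NLS_LANG has to be present in the environment before the Oracle client libraries are loaded, so cx_Oracle is imported only after the variable is set. The helper names `prepare_nls_lang` and `fetch_first_value` are hypothetical, and the values in the usage comment are the placeholders from sample05b.py.

```python
import os


def prepare_nls_lang(value="japanese_japan.al32utf8"):
    """Set NLS_LANG only if the environment has not set it already.

    Returns the value that is in effect afterwards.
    """
    return os.environ.setdefault("NLS_LANG", value)


def fetch_first_value(sql, user, password, dsn):
    """Return the first column of the first row, or None if there is no row."""
    prepare_nls_lang()
    # Imported here (an assumption of this sketch) so that NLS_LANG is
    # already in place when the Oracle client libraries are loaded.
    import cx_Oracle

    connection = cx_Oracle.connect(user, password, dsn, encoding="UTF-8")
    try:
        with connection.cursor() as cursor:
            row = cursor.execute(sql).fetchone()
            return row[0] if row else None
    finally:
        connection.close()


# Usage (requires a reachable database):
# print(fetch_first_value("select * from sample05a", "admin", "FooBar", "atp1_low"))
```

Using `setdefault` leaves a value exported in the shell untouched, so a setting shared with other tools such as SQL*Plus still takes precedence.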

[Introduction to cx_Oracle] (5th) Handling of Japanese data