Redash supports many data sources by default. Furthermore, even if some data sources are not supported, you can add new data sources by writing query_runner in Python. You can write query_runner straightforwardly if you have a Python library to connect to the target data source, but some DBs may only provide JDBC drivers. Since the JDBC driver can only be called from the JVM language, it cannot be directly incorporated into redash's query_runner. So I devised a way to incorporate the JDBC driver into redash, and this article will show you how to do it.
This time it is a library for calling JDBC driver from Python. Use JayDeBeApi. https://github.com/baztian/jaydebeapi
This library uses JPype under the CPython environment to bridge between Python and Java. https://github.com/jpype-project/jpype
There is Py4J in the library for calling Java from Python, and there is an introductory article on Qiita, but in comparison, JPype seems to be lighter because it does not need to start a Java process.
https://www.py4j.org/ https://qiita.com/riverwell/items/e90cbbfdac439e6e9d30
This time we will embed the JDBC driver in redash v5.0.2. The JDBC driver to be included is from MySQL, and you can download it from the following URL. https://dev.mysql.com/downloads/connector/j/5.1.html
Since redash supports MySQL as standard, there is no need to forcibly incorporate the JDBC driver, but since it is a widely popular DB, it is used for explanation.
jdbc.py Next, write query_runner. How to write query_runner is well organized in the following articles. https://discuss.redash.io/t/creating-a-new-query-runner-data-source-in-redash/347
I created the following query_runner by referring to the above article.
import json
import logging
import os
import jaydebeapi
from redash.query_runner import *
from redash.settings import parse_boolean
from redash.utils import JSONEncoder
logger = logging.getLogger(__name__)
types_map = {
jaydebeapi.STRING: TYPE_STRING,
jaydebeapi.TEXT: TYPE_STRING,
jaydebeapi.NUMBER: TYPE_INTEGER,
jaydebeapi.FLOAT: TYPE_FLOAT,
jaydebeapi.DECIMAL: TYPE_FLOAT,
jaydebeapi.DATE: TYPE_DATE,
jaydebeapi.DATETIME: TYPE_DATETIME,
jaydebeapi.ROWID: TYPE_STRING,
}
class Jdbc(BaseSQLQueryRunner):
noop_query = "SELECT 1"
@classmethod
def configuration_schema(cls):
schema = {
'type': 'object',
'properties': {
'host': {
'type': 'string',
},
'port': {
'type': 'number',
'default': 3306,
},
'user': {
'type': 'string'
},
'password': {
'type': 'string',
},
'database': {
'type': 'string',
},
},
"order": ['host', 'port', 'user', 'password', 'database'],
'required': ['host', 'port', 'user', 'database'],
'secret': ['password']
}
return schema
@classmethod
def name(cls):
return "jdbc"
@classmethod
def enabled(cls):
try:
import jaydebeapi
except ImportError:
return False
return True
def _get_tables(self, schema):
query = """
SELECT col.table_schema,
col.table_name,
col.column_name
FROM `information_schema`.`columns` col
WHERE col.table_schema NOT IN ('information_schema', 'performance_schema', 'mysql');
"""
results, error = self.run_query(query, None)
if error is not None:
raise Exception("Failed getting schema.")
results = json.loads(results)
for row in results['rows']:
if row['TABLE_SCHEMA'] != self.configuration['database']:
table_name = u'{}.{}'.format(row['TABLE_SCHEMA'], row['TABLE_NAME'])
else:
table_name = row['TABLE_NAME']
if table_name not in schema:
schema[table_name] = {'name': table_name, 'columns': []}
schema[table_name]['columns'].append(row['COLUMN_NAME'])
return schema.values()
def run_query(self, query, user):
import jaydebeapi
connection = None
try:
host = self.configuration.get('host', '')
port = self.configuration.get('port', '')
user = self.configuration.get('user', '')
password = self.configuration.get('password', '')
database = self.configuration.get('database', '')
jclassname = 'com.mysql.jdbc.Driver'
url = 'jdbc:mysql://{}:{}/{}'.format(host, port, database)
driver_args = {'user': user, 'password': password}
jar_path = '/app/redash/query_runner/mysql-connector-java-5.1.47.jar'
connection = jaydebeapi.connect(jclassname, url, driver_args, jar_path)
cursor = connection.cursor()
logger.info("JDBC running query: %s", query)
cursor.execute(query)
data = cursor.fetchall()
if cursor.description is not None:
columns = self.fetch_columns([(i[0], types_map.get(i[1], None)) for i in cursor.description])
rows = [dict(zip((c['name'] for c in columns), row)) for row in data]
data = {'columns': columns, 'rows': rows}
json_data = json.dumps(data, cls=JSONEncoder)
error = None
else:
json_data = None
error = "No data was returned."
cursor.close()
except jaydebeapi.Error as e:
json_data = None
error = str(e.message)
except KeyboardInterrupt:
cursor.close()
error = "Query cancelled by user."
json_data = None
finally:
if connection:
connection.close()
return json_data, error
register(Jdbc)
I think that most of the processing except configuration_schema can be used in common regardless of which JDBC driver is used.
Place the jdbc.py
created in the above procedure and the JDBC driver jar file in redash / query_runner
.
Dockerfile Finally, create a Docker container to run Redash. Since JRE is required for JPype to work, patch the redash standard Dockerfile. Also, create requirements.txt for installing jaydebeapi.
FROM redash/base:latest
#Added JRE installation
RUN apt-get update && apt-get install -y openjdk-8-jre
COPY redash/requirements.txt redash/requirements_dev.txt redash/requirements_all_ds.txt requirements_jaydebeapi.txt ./
RUN pip install -r requirements.txt -r requirements_dev.txt -r requirements_all_ds.txt -r requirements_jaydebeapi.txt
COPY . ./
RUN npm install && npm run build && rm -rf node_modules
RUN chown -R redash /app
USER redash
ENTRYPOINT ["/app/bin/docker-entrypoint"]
CMD ["server"]
requirements.txt
JayDeBeApi==1.1.1
JPype1==0.6.3
At this point, you can start redash.
In addition, in order to load the added query_runner into redash, it is necessary to add redash.query_runner.jdbc
to the REDASH_ADDITIONAL_QUERY_RUNNERS
environment variable and start it.
If you start redash and look at the DataSource screen, you can see that a data source called JDBC has been added.
At the moment, it looks a bit disappointing because there is no icon image showing the data source.
To put an image in this part, place the image jdbc.png
in client / app / assets / images / db-logos
and build the container again.
Now that we have registered the JDBC driver as query_runner, let's run the query.
By using JayDeBeApi I was able to use the JDBC driver with redash. With this, the variety of redash data sources has expanded, and you can have a wonderful BI life.
Recommended Posts