Since I installed Redash in my local environment the other day, this time I'll load data into a database and run queries against it from Redash. See here for how to install Redash.
I'm using a MacBook Air; the OS and tool versions are as follows.
$sw_vers
ProductName: Mac OS X
ProductVersion: 10.13.6
BuildVersion: 17G14033
$python --version
Python 3.7.3
$docker --version
Docker version 20.10.0, build 7287ab3
It is assumed that Redash is running.
$docker container ls
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
caa3348d587a redash_scheduler "/app/bin/docker-ent…" 43 minutes ago Up 42 minutes 5000/tcp redash_scheduler_1
00e34fe2ef97 redash_worker "/app/bin/docker-ent…" 43 minutes ago Up 42 minutes 5000/tcp redash_worker_1
0f60c28cecbd redash_server "/app/bin/docker-ent…" 43 minutes ago Up 43 minutes 0.0.0.0:5000->5000/tcp, 0.0.0.0:5678->5678/tcp redash_server_1
518d2c7aad0d postgres:9.5-alpine "docker-entrypoint.s…" 43 minutes ago Up 43 minutes 0.0.0.0:15432->5432/tcp redash_postgres_1
c1e85b4cfa69 redis:3-alpine "docker-entrypoint.s…" 27 hours ago Up 27 hours 6379/tcp redash_redis_1
d32426258263 djfarrelly/maildev "bin/maildev --web 8…" 27 hours ago Up 27 hours 25/tcp, 0.0.0.0:1080->80/tcp redash_email_1
The data is the easily available Boston dataset from scikit-learn: Boston house-price data with 506 rows and 14 columns. The data items are described in detail here, so I'll omit the explanation.
First, I'll fetch this data from scikit-learn and dump it to a CSV file with the following Python code.
data.py
```python
import pandas as pd
import sklearn.datasets as skd

# Load the Boston house-price dataset (506 rows, 13 features + target).
# Note: load_boston was removed in scikit-learn 1.2, so this requires an
# older scikit-learn version.
data = skd.load_boston()

# Combine the features and the target into one DataFrame.
df_X = pd.DataFrame(data.data, columns=data.feature_names)
df_y = pd.DataFrame(data.target, columns=['y'])
df = pd.concat([df_X, df_y], axis=1)

# Write the row index out as an ID column.
df.to_csv('data.csv', index=True, index_label='ID')
```
Running `python data.py` creates data.csv.
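Incidentally, the `index=True, index_label='ID'` combination in data.py is what produces the leading ID column that the table created later expects. A minimal sketch with a synthetic two-column frame (not the real Boston data) shows the resulting layout:

```python
import io
import pandas as pd

# Tiny synthetic stand-in for the Boston frame (2 rows, 1 feature + target).
df = pd.DataFrame({"CRIM": [0.00632, 0.02731], "y": [24.0, 21.6]})

buf = io.StringIO()
df.to_csv(buf, index=True, index_label="ID")
print(buf.getvalue())
# ID,CRIM,y
# 0,0.00632,24.0
# 1,0.02731,21.6
```

The row index becomes the first column, named ID by `index_label`, which is exactly the column order the table definition below relies on.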
Next, load this data into the DB.
# Database preparation
For the database, I'll use the PostgreSQL instance that is created when Redash starts.
I want to make the csv and sql files available inside the container created by `docker-compose`, so I mount a local folder.
To tweak the docker-compose configuration a little, I added lines 61 and 62 to the following file.
#### **`docker-compose.yml`**
```yml
(Omission)
50 postgres:
51 image: postgres:9.5-alpine
52 # The following turns the DB into less durable, but gains significant performance improvements for the tests run (x3
53 # improvement on my personal machine). We should consider moving this into a dedicated Docker Compose configuration for
54 # tests.
55 ports:
56 - "15432:5432"
57 command: "postgres -c fsync=off -c full_page_writes=off -c synchronous_commit=OFF"
58 restart: unless-stopped
59 environment:
60 POSTGRES_HOST_AUTH_METHOD: "trust"
61 volumes:
62 - ../data:/tmp/data
(Omission)
```

Rebuild and restart to apply the settings:

$docker-compose build
$docker-compose up -d
Create SQL files for creating the database and the table.
data/setup_db.sql
DROP DATABASE IF EXISTS boston_datasets;
CREATE DATABASE boston_datasets;
data/setup_tbl.sql
CREATE TABLE realestate
(
ID INTEGER,
CRIM NUMERIC,
ZN NUMERIC,
INDUS NUMERIC,
CHAS NUMERIC,
NOX NUMERIC,
RM NUMERIC,
AGE NUMERIC,
DIS NUMERIC,
RAD NUMERIC,
TAX NUMERIC,
PTRATIO NUMERIC,
B NUMERIC,
LSTAT NUMERIC,
y NUMERIC
);
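The column list in this DDL mirrors the CSV header exactly: ID first, then the 13 features, then y. If you'd rather not hand-write it, a throwaway sketch like this can generate the statement from the header line (ID as INTEGER, everything else NUMERIC, matching the DDL above):

```python
# Generate the CREATE TABLE statement from the csv header line.
header = "ID,CRIM,ZN,INDUS,CHAS,NOX,RM,AGE,DIS,RAD,TAX,PTRATIO,B,LSTAT,y"
cols = header.split(",")

# ID is INTEGER, every other column is NUMERIC.
defs = [f"{c} {'INTEGER' if c == 'ID' else 'NUMERIC'}" for c in cols]
ddl = "CREATE TABLE realestate\n(\n" + ",\n".join(f"  {d}" for d in defs) + "\n);"
print(ddl)
```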
Execute the above two sql files.
$docker-compose run --rm postgres psql -f /tmp/data/setup_db.sql -h postgres -U postgres
Creating redash_postgres_run ... done
DROP DATABASE
CREATE DATABASE
$cat ../data/setup_db.sql
DROP DATABASE IF EXISTS boston_datasets;
CREATE DATABASE boston_datasets;
$docker-compose run --rm postgres psql -f /tmp/data/setup_tbl.sql -h postgres -U postgres -d boston_datasets
Creating redash_postgres_run ... done
CREATE TABLE
Enter the container and check.
$docker exec -it redash_postgres_1 bash
bash-5.0# psql -U postgres -h postgres -d boston_datasets
psql (9.5.24)
Type "help" for help.
boston_datasets=# \dt
List of relations
Schema | Name | Type | Owner
--------+------------+-------+----------
public | realestate | table | postgres
(1 row)
boston_datasets=# \q
bash-5.0# exit
The realestate table has indeed been created. Now insert the CSV data into this table.
$docker-compose run --rm postgres psql -h postgres -U postgres -d boston_datasets -c "copy realestate from '/tmp/data/data.csv' encoding 'utf8' csv header delimiter ','"
Creating redash_postgres_run ... done
COPY 506
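For reference, `copy ... csv header` skips the first line of the file and maps the remaining columns positionally onto the table, which is why the ID-first layout matters. The same semantics can be sketched with Python's stdlib `sqlite3` standing in for PostgreSQL (synthetic, abbreviated data, not the real file):

```python
import csv
import io
import sqlite3

# Synthetic CSV in the same shape data.py writes: header row, ID column first.
# (Abbreviated column set for the sketch.)
csv_text = (
    "ID,CRIM,ZN,y\n"
    "0,0.00632,18.0,24.0\n"
    "1,0.02731,0.0,21.6\n"
)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE realestate (ID INTEGER, CRIM NUMERIC, ZN NUMERIC, y NUMERIC)")

reader = csv.reader(io.StringIO(csv_text))
next(reader)  # skip the header line, as COPY ... csv header does
conn.executemany("INSERT INTO realestate VALUES (?, ?, ?, ?)", list(reader))

count = conn.execute("SELECT count(*) FROM realestate").fetchone()[0]
print(count)  # 2 rows loaded here; the real COPY above reports COPY 506
```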
At this point, the database, table, and data are all in PostgreSQL.
Next, let's check the connection from Redash to PostgreSQL. Configure the data source as shown in the figure below (in my case: Host postgres, Port 5432, User postgres, Database Name boston_datasets).
The Password can be left blank here. Just in case, I ran Test Connection and confirmed the connection works.
If you select a data source from "Queries" on the left side of the screen, you can see the DB and table you created earlier.
When I ran `select * from realestate` as the query, the result came back as shown in the figure below.
That covers everything from preparing the data to issuing queries. Next time, I'd finally like to visualize the results.