This runs on Python 2 with Spark 1.6 in IBM Data Science Experience (DSX).
(1) Define credentials
The dashDB instance was registered as a connection in DSX beforehand, so clicking "insert to code" generates the credentials dictionary. (The username and password are masked.)
credentials_2 = {
'port':'50000',
'db':'BLUDB',
'username':'dashXXXXX',
'ssljdbcurl':'jdbc:db2://dashdb-entry-yp-dal09-07.services.dal.bluemix.net:50001/BLUDB:sslConnection=true;',
'host':'dashdb-entry-yp-dal09-07.services.dal.bluemix.net',
'https_url':'https://dashdb-entry-yp-dal09-07.services.dal.bluemix.net:8443',
'dsn':'DATABASE=BLUDB;HOSTNAME=dashdb-entry-yp-dal09-07.services.dal.bluemix.net;PORT=50000;PROTOCOL=TCPIP;UID=dashXXXX;PWD=XXXXXXXXXXX;',
'hostname':'dashdb-entry-yp-dal09-07.services.dal.bluemix.net',
'jdbcurl':'jdbc:db2://dashdb-entry-yp-dal09-07.services.dal.bluemix.net:50000/BLUDB',
'ssldsn':'DATABASE=BLUDB;HOSTNAME=dashdb-entry-yp-dal09-07.services.dal.bluemix.net;PORT=50001;PROTOCOL=TCPIP;UID=dashXXXX;PWD=XXXXXXXXXXX;Security=SSL;',
'uri':'db2://dashXXXX:XXXXXXXXXXX@dashdb-entry-yp-dal09-07.services.dal.bluemix.net:50000/BLUDB',
'password':"""XXXXXXXXXXXXX"""
}
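The generated dictionary repeats the username and password inside several fields (dsn, ssldsn, uri), so masking only the 'username' and 'password' entries is not enough before sharing a notebook. The following is a minimal sketch of one way to mask them everywhere. The helper name mask_credentials and the example values are hypothetical and not part of DSX.

```python
def mask_credentials(credentials, mask='XXXXXXXX'):
    """Return a copy of the dict with the username and password masked in every field."""
    secrets = [credentials.get('password'), credentials.get('username')]
    # Replace the longest secret first so that overlapping values are handled safely.
    secrets = sorted((s for s in secrets if s), key=len, reverse=True)
    masked = {}
    for key, value in credentials.items():
        text = str(value)
        for secret in secrets:
            text = text.replace(secret, mask)
        masked[key] = text
    return masked


# Hypothetical example values, not a real account.
example = {
    'username': 'dash0000',
    'password': 'secret-pw',
    'jdbcurl': 'jdbc:db2://example.invalid:50000/BLUDB',
    'dsn': 'DATABASE=BLUDB;HOSTNAME=example.invalid;PORT=50000;UID=dash0000;PWD=secret-pw;',
}
safe = mask_credentials(example)
print(safe['dsn'])
```

The match is case-sensitive, so an upper-case schema name derived from the username would still need to be masked by hand.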
(2) Define a function for data acquisition
Reference: http://stackoverflow.com/questions/37688993/how-to-use-pandas-on-spark-notebook-data-on-dashdb-in-python
def getDashData(credentials, schemaName, tableName):
    # Read one dashDB table into a Spark DataFrame over JDBC.
    from pyspark.sql import SQLContext
    sqlContext = SQLContext(sc)  # sc is the SparkContext predefined in the notebook
    props = {}
    props['user'] = credentials['username']
    props['password'] = credentials['password']
    table = schemaName + '.' + tableName  # schema-qualified table name
    return sqlContext.read.jdbc(credentials['jdbcurl'], table, properties=props)
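The function above always uses the plain 'jdbcurl' and reads the whole table. As a sketch of a variation, the helper below only assembles the three arguments that read.jdbc takes, optionally choosing the 'ssljdbcurl' entry and wrapping the table in a subquery so that dashDB returns a limited number of rows. The name build_jdbc_args, its parameters, and the example values are hypothetical. It assumes the SSL URL works with the Db2 JDBC driver available in the notebook, which has not been verified here.

```python
def build_jdbc_args(credentials, schema_name, table_name, use_ssl=False, limit=None):
    """Return (url, table_expr, props) to pass to sqlContext.read.jdbc."""
    url = credentials['ssljdbcurl'] if use_ssl else credentials['jdbcurl']
    props = {'user': credentials['username'], 'password': credentials['password']}
    table_expr = schema_name + '.' + table_name
    if limit is not None:
        # Spark accepts a parenthesized subquery with an alias in place of a table name;
        # FETCH FIRST n ROWS ONLY is the Db2 syntax for limiting rows.
        table_expr = '(SELECT * FROM {0} FETCH FIRST {1} ROWS ONLY) AS t'.format(
            table_expr, int(limit))
    return url, table_expr, props


# Hypothetical example values, not a real account.
example = {
    'username': 'dash0000',
    'password': 'secret-pw',
    'jdbcurl': 'jdbc:db2://example.invalid:50000/BLUDB',
    'ssljdbcurl': 'jdbc:db2://example.invalid:50001/BLUDB:sslConnection=true;',
}
url, table_expr, props = build_jdbc_args(example, 'MYSCHEMA', 'TEST1', use_ssl=True, limit=10)
print(table_expr)

# In the notebook, the result would be used as:
# df = sqlContext.read.jdbc(url, table_expr, properties=props)
```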
(3) Extract data from dashDB & check the first 10 records
df_dash = getDashData(credentials_2, 'DASH7836', 'TEST1')
df_dash.toPandas().head(10)
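Note that toPandas() collects the entire Spark DataFrame to the driver before head(10) trims it, which can be slow or run out of memory on a large table. A minimal sketch of an alternative is to apply limit() on the Spark side first. The helper name preview is hypothetical, and whether the limit is also pushed down to dashDB depends on the Spark version.

```python
def preview(df, n=10):
    """Return the first n rows of a Spark DataFrame as a pandas DataFrame."""
    # limit() runs in Spark, so at most n rows are collected to the driver.
    return df.limit(n).toPandas()


# In the notebook, this would be called as:
# preview(df_dash, 10)
```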