From the place where you can pull out the data table in Azure SQL Server with the python library called pyodbc
, you can convert the table to numpy
's ʻarray format or
pandas's
DataFrame` format. This is a memo.
→ Definition of library, environment and function is written here. ,, So please refer to that first.
## libraries
import pyodbc
import numpy as np
import pandas as pd
## initial setting
##Please change the set value as appropriate
server = 'abc_server.database.windows.net'
database = 'abc_database'
username = 'abc_user'
password = 'abc_password'
##Function definition
###Define DB connection
def db_connection(sv=server, db=database, un=username, pw=password):
cnxn = pyodbc.connect('DRIVER={ODBC Driver 13 for SQL Server};SERVER='+sv+';DATABASE='+db+';UID='+un+';PWD='+ pw)
return cnxn.cursor()
###Issue SQL
def query_output(sql):
cursor.execute(sql)
row = cursor.fetchone()
while row:
print row[0]
row = cursor.fetchone()
→ For the meaning of the settings, see Trial Premise Etc.
###DB connection
cursor = db_connection()
###SQL settings
sql = 'select * from SalesLT.Customer;'
cursor.execute(sql)
rows = cursor.fetchall()
Check the contents of rows
around here:
>>> rows
[(1, False, u'Mr.', u'Orlando', u'N.', u'Gee', None, u'A Bike Store', u'adventure-works\\pamela0', u'[email protected]', u'245-555-0173', u'L/Rlwxzp4w7RWmEgXX+/A7cXaePEPcp+KwQhl2fJL7w=', u'1KjXYs4=', u'3F5AE95E-B87D-4AED-95B4-C3797AFCB74F', datetime.datetime(2005, 8, 1, 0, 0)),
(2, False, u'Mr.', u'Keith', None, u'Harris', None, u'Progressive Sports', u'adventure-works\\david8', u'[email protected]', u'170-555-0127', u'YPdtRdvqeAhj6wyxEsFdshBDNXxkCXn+CRgbvJItknw=', u'fs1ZGhY=', u'E552F657-A9AF-4A7D-A645-C429D6E02491', datetime.datetime(2006, 8, 1, 0, 0)),
(3, False, u'Ms.', u'Donna', u'F.', u'Carreras', None, u'Advanced Bike Components', u'adventure-works\\jillian0', u'[email protected]', u'279-555-0130', u'LNoK27abGQo48gGue3EBV/UrlYSToV0/s87dCRV7uJk=', u'YTNH5Rw=', u'130774B1-DB21-4EF3-98C8-C104BCD6ED6D', datetime.datetime(2005, 9, 1, 0, 0)),
:
It's like an array inside an array (likely, in the exact format pyodbc.Row
), so
I will try plunging into numpy's array
array_format = np.array(rows)
dataframe_format = pd.DataFrame(array_format)
Then the numpy
ʻarray` format is ...
>>>array_format
array([[1, False, u'Mr.', ..., u'1KjXYs4=',
u'3F5AE95E-B87D-4AED-95B4-C3797AFCB74F',
datetime.datetime(2005, 8, 1, 0, 0)],
[2, False, u'Mr.', ..., u'fs1ZGhY=',
u'E552F657-A9AF-4A7D-A645-C429D6E02491',
datetime.datetime(2006, 8, 1, 0, 0)],
[3, False, u'Ms.', ..., u'YTNH5Rw=',
u'130774B1-DB21-4EF3-98C8-C104BCD6ED6D',
datetime.datetime(2005, 9, 1, 0, 0)],
...,
[30116, False, u'Ms.', ..., u'HDCU1Bk=',
u'EC409609-D25D-41B8-9D15-A1AA6E89FC77',
datetime.datetime(2007, 7, 1, 0, 0)],
[30117, False, u'Mr.', ..., u'iES3IZA=',
u'6F08E2FB-1CD3-4F6E-A2E6-385669598B19',
datetime.datetime(2005, 8, 1, 0, 0)],
[30118, False, u'Ms.', ..., u'QhHP+y8=',
u'2495B4EB-FE8B-459E-A1B6-DBA25C04E626',
datetime.datetime(2006, 9, 1, 0, 0)]], dtype=object)
>>> array_format.shape
(847L, 15L)
Also, the DataFrame
of pandas
>>>dataframe_format.head()
0 1 2 3 4 5 6 7 \
0 1 False Mr. Orlando N. Gee None A Bike Store
1 2 False Mr. Keith None Harris None Progressive Sports
2 3 False Ms. Donna F. Carreras None Advanced Bike Components
3 4 False Ms. Janet M. Gates None Modular Cycle Systems
4 5 False Mr. Lucy None Harrington None Metropolitan Sports Supply
8 9 10 \
0 adventure-works\pamela0 [email protected] 245-555-0173
1 adventure-works\david8 [email protected] 170-555-0127
2 adventure-works\jillian0 [email protected] 279-555-0130
3 adventure-works\jillian0 [email protected] 710-555-0173
4 adventure-works\shu0 [email protected] 828-555-0186
11 12 \
0 L/Rlwxzp4w7RWmEgXX+/A7cXaePEPcp+KwQhl2fJL7w= 1KjXYs4=
1 YPdtRdvqeAhj6wyxEsFdshBDNXxkCXn+CRgbvJItknw= fs1ZGhY=
2 LNoK27abGQo48gGue3EBV/UrlYSToV0/s87dCRV7uJk= YTNH5Rw=
3 ElzTpSNbUW1Ut+L5cWlfR7MF6nBZia8WpmGaQPjLOJA= nm7D5e4=
4 KJqV15wsX3PG8TS5GSddp6LFFVdd3CoRftZM/tP0+R4= cNFKU4w=
13 14
0 3F5AE95E-B87D-4AED-95B4-C3797AFCB74F 2005-08-01 00:00:00
1 E552F657-A9AF-4A7D-A645-C429D6E02491 2006-08-01 00:00:00
2 130774B1-DB21-4EF3-98C8-C104BCD6ED6D 2005-09-01 00:00:00
3 FF862851-1DAA-4044-BE7C-3E85583C054D 2006-07-01 00:00:00
4 83905BDC-6F5E-4F71-B162-C98DA069F38A 2006-09-01 00:00:00
>>> dataframe_format.shape
(847, 15)
It looks like it's okay.
Reference -Try to extract Azure SQL Server data table with pyodbc