SAS Viya is an AI platform. It is available through languages such as Python, Java and R. A table object called CASTable is used in SAS Viya (CAS stands for Cloud Analytic Services). This time, I will explain how to get data by specifying columns using loc / iloc / ix in CASTable.
First, connect to SAS Viya.
import swat
conn = swat.CAS('server-name.mycompany.com', 5570, 'username', 'password')
Then get the CASTable. This time, I will use CSV of IRIS data.
tbl = conn.loadtable('data/iris.csv', caslib='casuser').casTable
Use loc to specify the column and specify the column name in the second argument.
tbl.loc[:, 'petal_width'].head()
Then you can only get the value of the specified column.
0 2.0
1 2.3
2 2.0
3 2.3
4 2.2
Name: petal_width, dtype: float64
This can be specified in multiple columns.
tbl.loc[:, 'sepal_length':'petal_length'].head()
For multiple columns, the header row is displayed. Since it is specified by the range, the sepal_width that exists between them is also displayed.
sepal_length | sepal_width | petal_length | |
---|---|---|---|
0 | 7.9 | 3.8 | 6.4 |
1 | 7.7 | 2.6 | 6.9 |
2 | 7.7 | 2.8 | 6.7 |
3 | 7.7 | 3.0 | 6.1 |
4 | 7.7 | 3.8 | 6.7 |
If you want to specify the column, specify it as an array.
tbl.loc[:, ['petal_width', 'sepal_width']].head()
petal_width | sepal_width | |
---|---|---|
0 | 2.0 | 3.8 |
1 | 2.3 | 2.6 |
2 | 2.0 | 2.8 |
3 | 2.3 | 3.0 |
4 | 2.2 | 3.8 |
You can specify numbers instead of column names.
tbl.loc[:, 3].head()
0 2.0
1 2.3
2 2.0
3 2.3
4 2.2
Name: petal_width, dtype: float64
The same applies to multiple column specifications.
tbl.iloc[:, 0:3].head()
sepal_length | sepal_width | petal_length | |
---|---|---|---|
0 | 7.9 | 3.8 | 6.4 |
1 | 7.7 | 2.6 | 6.9 |
2 | 7.7 | 2.8 | 6.7 |
3 | 7.7 | 3.0 | 6.1 |
4 | 7.7 | 3.8 | 6.7 |
When specifying a column, the same is true for an array.
tbl.iloc[:, [3, 1]].head()
petal_width | sepal_width | |
---|---|---|
0 | 2.0 | 3.8 |
1 | 2.3 | 2.6 |
2 | 2.0 | 2.8 |
3 | 2.3 | 3.0 |
4 | 2.2 | 3.8 |
Combinations of numbers and column names are also possible.
tbl.ix[:, [3, 'sepal_width']].head()
petal_width | sepal_width | |
---|---|---|
0 | 2.0 | 3.8 |
1 | 2.3 | 2.6 |
2 | 2.0 | 2.8 |
3 | 2.3 | 3.0 |
4 | 2.2 | 3.8 |
There are many situations where you want to analyze only some data. In such a case, use loc, iloc, or ix to quickly extract the data. Numbers are also easy to use for looping, so they should also be useful for automating analytical and computational processing.
Recommended Posts