`.ix` is a convenient way to reference values in a pandas DataFrame, but if you don't understand how it behaves, you can fall into a trap.
For example, suppose you have the following DataFrame.
from pandas import DataFrame
df = DataFrame(
[['a0', 'b0'], ['a1', 'b1'], ['a2', 'b2']],
index=[2, 4, 6],
columns=['a', 'b'])
| | a | b |
|---|---|---|
| 2 | a0 | b0 |
| 4 | a1 | b1 |
| 6 | a2 | b2 |
What if you want the value in row 2 (counting from 0) of column `a`? The expected result is `a2`.
`.ix` accepts both positions and index labels, so suppose you reference it like this:
df.ix[2, 'a']
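The result is `a0`. This is because `.ix` is primarily label-based: an integer is looked up as an index label first, and positional lookup is only a fallback that applies when the index is not integer-typed. Here the index contains the label 2, so the first row is returned.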
Because of this ambiguity, `.ix` was deprecated in pandas 0.20 (and later removed in pandas 1.0).
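The two readings of the integer `2` can be made explicit with the label-only and position-only accessors. This is a minimal sketch, assuming a pandas version where `.ix` may no longer be available, so `.loc` and `.iloc` stand in for the two interpretations:

```python
from pandas import DataFrame

df = DataFrame(
    [['a0', 'b0'], ['a1', 'b1'], ['a2', 'b2']],
    index=[2, 4, 6],
    columns=['a', 'b'])

# Label reading: 2 is the index label of the first row.
by_label = df.loc[2, 'a']

# Position reading: 2 is the third row, counting from 0.
by_position = df.iloc[2, 0]

print(by_label, by_position)
```

The label reading is the one `.ix` chose for this DataFrame, which is why the result was not the expected one.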
It may not be the best solution, but I will use `.iloc` instead.
However, `.iloc` only accepts positions, so use it together with `pandas.Index.get_loc`.
This method looks up a row label (or column name) and returns its position.
df.iloc[2, df.columns.get_loc('a')]
The expected result, `a2`, is now returned.
The example above specifies the column by name. To specify the row by label instead, do the following.
df.iloc[df.index.get_loc(6), 0]
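To see what `get_loc` contributes in the two calls above, here is a small sketch that prints the intermediate positions. The variable names (`col_pos`, `row_pos`, and so on) are only for illustration and are not from the original post:

```python
from pandas import DataFrame

df = DataFrame(
    [['a0', 'b0'], ['a1', 'b1'], ['a2', 'b2']],
    index=[2, 4, 6],
    columns=['a', 'b'])

# get_loc converts a label into its integer position.
col_pos = df.columns.get_loc('a')  # position of column 'a'
row_pos = df.index.get_loc(6)      # position of row label 6
print(col_pos, row_pos)

# Row by position, column by name.
by_pos_and_name = df.iloc[2, col_pos]

# Row by label, column by position.
by_label_and_pos = df.iloc[row_pos, 0]
print(by_pos_and_name, by_label_and_pos)
```

Note that this assumes the labels are unique. With duplicate labels, `get_loc` can return a slice or a boolean mask rather than a single integer, so the result passed to `.iloc` would select several rows or columns.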
If you know the row and column labels in advance, you can simply use `.loc`.
df.loc[6, 'a']
Similarly, `.iloc` is fine if you know the row and column positions in advance.
df.iloc[2, 0]
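When a position and a label are mixed, the conversion can also go the other way: look up the label at a given position with `df.index[...]` and pass it to `.loc`. This is a small sketch of that alternative, not something from the original post, and `row_label` is just an illustrative name:

```python
from pandas import DataFrame

df = DataFrame(
    [['a0', 'b0'], ['a1', 'b1'], ['a2', 'b2']],
    index=[2, 4, 6],
    columns=['a', 'b'])

# Convert the row position (2) into its index label.
row_label = df.index[2]

# Then use label-based access for both row and column.
value = df.loc[row_label, 'a']
print(row_label, value)
```

Which direction reads better is a matter of taste. Either way, every lookup is explicitly by label or explicitly by position.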
If you don't understand how `.ix` behaves, it will do something you did not intend.
Even if you do understand it, you need to know what the index contains, so it seems better to avoid `.ix` as much as possible.