pandas memorandum

Overview

This is just a memo, but it gives a quick summary of basic pandas operations. I will update it from time to time (probably).

Method

Read a CSV file

In [1]: import pandas as pd
   ...: df = pd.read_csv('./input/hoge.csv')

The file path is a string, so it is easy to forget the quotes around it.
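
To try read_csv without a file on disk, the following sketch feeds it an in-memory string instead. The CSV text is an assumption: it is reconstructed from the table shown in this memo and is not the real hoge.csv.

```python
import io

import pandas as pd

# Stand-in for ./input/hoge.csv (assumed contents, mirroring the memo's table).
csv_text = """label,a,b,c,d
aa,1,11,111,e
bb,2,22,222,e
cc,3,33,333,e
dd,4,44,444,e
"""

# read_csv accepts a path or any file-like object, so StringIO works here.
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)
print(list(df.columns))
```

Passing a path string such as './input/hoge.csv' works the same way once the file exists.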

Check the contents

In [2]: df
Out[2]:
  label  a   b    c  d
0    aa  1  11  111  e
1    bb  2  22  222  e
2    cc  3  33  333  e
3    dd  4  44  444  e

Extract a column by position (position 1, the second column)

In [2]: df.iloc[:, [1]]
Out[2]:
   a
0  1
1  2
2  3
3  4

Extract the 'a' column

In [3]: df['a']
Out[3]:
0    1
1    2
2    3
3    4
Name: a, dtype: int64
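
The two selections above return different types, which is easy to overlook. The sketch below rebuilds the sample frame by hand (an assumption, since hoge.csv itself is not available) and compares a single label, a list of labels, and a positional selection.

```python
import pandas as pd

# Hand-built copy of the memo's sample table.
df = pd.DataFrame({
    "label": ["aa", "bb", "cc", "dd"],
    "a": [1, 2, 3, 4],
    "b": [11, 22, 33, 44],
    "c": [111, 222, 333, 444],
    "d": ["e", "e", "e", "e"],
})

series_a = df["a"]             # single label -> Series
frame_a = df[["a"]]            # list of labels -> one-column DataFrame
by_position = df.iloc[:, [1]]  # column at position 1 -> one-column DataFrame

print(type(series_a).__name__)      # Series
print(type(frame_a).__name__)       # DataFrame
print(frame_a.equals(by_position))  # True
```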

Access elements by location, part 1: loc (label-based)

In [4]: df.loc[:,['a','b']]
Out[4]:
   a   b
0  1  11
1  2  22
2  3  33
3  4  44

Access elements by location, part 2: iloc (integer position)

In [5]: df.iloc[:,[1,2]]
Out[5]:
   a   b
0  1  11
1  2  22
2  3  33
3  4  44
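
One difference between loc and iloc that the examples above do not show is how slices behave: label slices include the end point, positional slices do not. A minimal sketch, again using a hand-built copy of the sample frame (an assumption):

```python
import pandas as pd

# Hand-built copy of the memo's sample table.
df = pd.DataFrame({
    "label": ["aa", "bb", "cc", "dd"],
    "a": [1, 2, 3, 4],
    "b": [11, 22, 33, 44],
    "c": [111, 222, 333, 444],
    "d": ["e", "e", "e", "e"],
})

# loc slices by label and INCLUDES the end label: rows 0, 1, 2.
by_label = df.loc[0:2, "a":"b"]

# iloc slices by position and EXCLUDES the end position: rows 0, 1.
by_position = df.iloc[0:2, 1:3]

print(by_label.shape)     # (3, 2)
print(by_position.shape)  # (2, 2)
```

Here the row labels happen to be the integers 0 to 3, so the same numbers mean labels under loc and positions under iloc.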

Summary statistics for the numeric columns

In [6]: df.describe()
Out[6]:
              a          b           c
count  4.000000   4.000000    4.000000
mean   2.500000  27.500000  277.500000
std    1.290994  14.200939  143.300384
min    1.000000  11.000000  111.000000
25%    1.750000  19.250000  194.250000
50%    2.500000  27.500000  277.500000
75%    3.250000  35.750000  360.750000
max    4.000000  44.000000  444.000000
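
By default describe() skips the non-numeric columns (label and d in this table). As a sketch, passing include='all' summarizes every column; the frame below is a hand-built copy of the sample data (an assumption).

```python
import pandas as pd

# Hand-built copy of the memo's sample table.
df = pd.DataFrame({
    "label": ["aa", "bb", "cc", "dd"],
    "a": [1, 2, 3, 4],
    "b": [11, 22, 33, 44],
    "c": [111, 222, 333, 444],
    "d": ["e", "e", "e", "e"],
})

numeric_only = df.describe()             # numeric columns only
everything = df.describe(include="all")  # adds unique/top/freq for text columns

print(list(numeric_only.columns))
print(everything)
```

Cells that do not apply to a column, such as mean for a text column, are filled with NaN.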

Convert to a numpy.ndarray

In [7]: df.values
Out[7]:
array([['aa', 1, 11, 111, 'e'],
       ['bb', 2, 22, 222, 'e'],
       ['cc', 3, 33, 333, 'e'],
       ['dd', 4, 44, 444, 'e']], dtype=object)

Get the data type of each column

>>> df.dtypes
target      int64
v1        float64
v2        float64
v3         object
v4        float64
v5        float64
v6        float64
v7        float64
v8          int64
v9        float64
dtype: object

Select columns by data type

>>> import numpy as np
>>> df.loc[:, df.dtypes == np.int64]
      target  v8
No.1       1   2
No.2       2   2
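
An alternative to building the boolean mask by hand is select_dtypes(). The sketch below uses a small hypothetical frame: only the column names, the index labels and the target/v8 values come from this memo, and the v1 and v3 values are made up for illustration.

```python
import pandas as pd

# Hypothetical data: v1 and v3 values are invented for this example.
df = pd.DataFrame(
    {
        "target": [1, 2],
        "v1": [0.5, 1.5],
        "v3": ["C", "D"],
        "v8": [2, 2],
    },
    index=["No.1", "No.2"],
)

ints = df.select_dtypes(include="int64")      # integer columns only
numbers = df.select_dtypes(include="number")  # integers and floats

print(list(ints.columns))     # ['target', 'v8']
print(list(numbers.columns))  # ['target', 'v1', 'v8']
```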

items(), iterrows()

>>> df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': ['x', 'y', 'z']}, index=['a', 'b', 'c'])
>>> df
   A  B  C
a  1  4  x
b  2  5  y
c  3  6  z

>>> # items() replaces iteritems(), which was removed in pandas 2.0
>>> for key, column in df.items():
...     print(key)
...     print(column)
...
A
a    1
b    2
c    3
Name: A, dtype: int64
B
a    4
b    5
c    6
Name: B, dtype: int64
C
a    x
b    y
c    z
Name: C, dtype: object
>>> for key, row in df.iterrows():
...     print(key)
...     print(row)
...
a
A    1
B    4
C    x
Name: a, dtype: object
b
A    2
B    5
C    y
Name: b, dtype: object
c
A    3
B    6
C    z
Name: c, dtype: object
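
Each row printed by iterrows() is a Series with dtype object, because one row mixes integers and strings. As a sketch of an alternative, itertuples() yields namedtuples that keep each value's own type, and simple arithmetic often needs no loop at all.

```python
import pandas as pd

df = pd.DataFrame(
    {"A": [1, 2, 3], "B": [4, 5, 6], "C": ["x", "y", "z"]},
    index=["a", "b", "c"],
)

# itertuples() yields one namedtuple per row; the index label is in .Index.
rows = list(df.itertuples())
for row in rows:
    print(row.Index, row.A + row.B, row.C)

# The same totals computed column-wise, without an explicit loop.
totals = df["A"] + df["B"]
print(totals.tolist())  # [5, 7, 9]
```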

factorize

>>> pd.factorize(df['A'])
(array([0, 1, 2]), Index([1, 2, 3], dtype='int64'))
>>> pd.factorize(df['B'])
(array([0, 1, 2]), Index([4, 5, 6], dtype='int64'))
>>> pd.factorize(df['C'])
(array([0, 1, 2]), Index(['x', 'y', 'z'], dtype='object'))
>>> df['C'], indexer = pd.factorize(df['C'])
>>> df
   A  B  C
a  1  4  0
b  2  5  1
c  3  6  2
>>> indexer
Index(['x', 'y', 'z'], dtype='object')
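
In the example above every value is unique, so the codes are simply 0, 1, 2. The sketch below uses a hypothetical Series with a repeated value to show that equal values share a code, and that indexing the returned uniques with the codes restores the original values.

```python
import pandas as pd

# Hypothetical input with a repeated value.
values = pd.Series(["x", "y", "x", "z"])

codes, uniques = pd.factorize(values)
print(codes.tolist())    # [0, 1, 0, 2]
print(uniques.tolist())  # ['x', 'y', 'z']

# Indexing the uniques with the codes decodes back to the original values.
restored = uniques[codes]
print(restored.tolist())  # ['x', 'y', 'x', 'z']
```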
