[Python] Operation memo of pandas DataFrame

Introduction

I had the chance to do some data analysis as a beginner, so here is a summary of the pandas DataFrame operations I picked up along the way.

Premise

product.csv

id	name	price	category	isPopular
1	eraser	100	stationery	1
2	pencil	200	stationery	0
3	socks	400	clothes	1
4	pants	1000	clothes	0
5	apple	100	food	0

`analyze.py`


import pandas as pd

# Load the sample data shown above into a DataFrame
df = pd.read_csv('product.csv')

Extract the distinct values of a column

df['category'].value_counts().index

Execution result

Index(['stationery', 'clothes', 'food'], dtype='object')
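As a small sketch of the same idea, `unique()` returns the distinct values in order of first appearance, while `value_counts().index` orders them by frequency. The DataFrame here is rebuilt inline from the sample table so the snippet runs without product.csv; that inline construction is an assumption for illustration only.

```python
import pandas as pd

# Inline copy of the sample data (assumed, so the snippet runs without product.csv)
df = pd.DataFrame({
    'id': [1, 2, 3, 4, 5],
    'name': ['eraser', 'pencil', 'socks', 'pants', 'apple'],
    'price': [100, 200, 400, 1000, 100],
    'category': ['stationery', 'stationery', 'clothes', 'clothes', 'food'],
    'isPopular': [1, 0, 1, 0, 0],
})

# Distinct values, in order of first appearance
categories = df['category'].unique().tolist()

# Number of distinct values
n_categories = df['category'].nunique()

print(categories, n_categories)
```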

Change or add DataFrame values by specifying a condition

df.loc[df.name == 'socks', 'price'] = 500
df.loc[df.category == 'stationery', 'category_id'] = 0
df.loc[df.category == 'clothes', 'category_id'] = 1
df.loc[df.category == 'food', 'category_id'] = 2
df

Execution result

id	name	price	category	isPopular	category_id
1	eraser	100	stationery	1	0.0
2	pencil	200	stationery	0	0.0
3	socks	500	clothes	1	1.0
4	pants	1000	clothes	0	1.0
5	apple	100	food	0	2.0
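The category_id values above come out as floats because the new column starts out as NaN and is filled in one condition at a time. A possible alternative, sketched here, is to build the column in a single step with `Series.map`. The `category_to_id` dictionary is a hypothetical helper introduced for this example, and the DataFrame is rebuilt inline so the snippet runs on its own.

```python
import pandas as pd

# Inline copy of the relevant sample columns (assumed for illustration)
df = pd.DataFrame({
    'name': ['eraser', 'pencil', 'socks', 'pants', 'apple'],
    'price': [100, 200, 400, 1000, 100],
    'category': ['stationery', 'stationery', 'clothes', 'clothes', 'food'],
})

# Conditional update of existing values, same pattern as above
df.loc[df['name'] == 'socks', 'price'] = 500

# Hypothetical lookup table from category name to numeric id
category_to_id = {'stationery': 0, 'clothes': 1, 'food': 2}

# map() looks up every row at once; the result stays integer
# as long as every category has an entry in the dictionary
df['category_id'] = df['category'].map(category_to_id)

print(df)
```

Any category missing from the dictionary would become NaN, which would turn the column back into floats.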

Convert to a one-hot representation

# Keep the id column so it can be joined back after encoding
df_id = df[['id']]
# Keep only the isPopular and category_id columns (the values must be integers for this to work)
df_X = df.drop(['id','name','price','category'], axis=1)

from sklearn.preprocessing import OneHotEncoder
enc = OneHotEncoder()
enc.fit(df_X)
# transform() returns a sparse matrix; toarray() converts it to a dense array
onehot_array = enc.transform(df_X).toarray()
onehot_df = pd.DataFrame(onehot_array)
df = pd.concat([df_id, onehot_df], axis=1)
df

Execution result

id	0	1	2	3	4
1	0.0	1.0	1.0	0.0	0.0
2	1.0	0.0	1.0	0.0	0.0
3	0.0	1.0	0.0	1.0	0.0
4	1.0	0.0	0.0	1.0	0.0
5	1.0	0.0	0.0	0.0	1.0
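In the result above the encoded columns are just numbered 0 to 4, so it is hard to tell which column belongs to which original value. As a rough sketch of an alternative (not the method used in this memo), `pd.get_dummies` produces the same one-hot values with descriptive column names. The DataFrame below is an inline stand-in for the isPopular and category_id columns built earlier.

```python
import pandas as pd

# Inline stand-in for the columns used in the encoding step (assumed for illustration)
df = pd.DataFrame({
    'id': [1, 2, 3, 4, 5],
    'isPopular': [1, 0, 1, 0, 0],
    'category_id': [0, 0, 1, 1, 2],
})

# Encode only the listed columns; id is left untouched.
# dtype=float gives 0.0 / 1.0 values like the OneHotEncoder output.
onehot_df = pd.get_dummies(df, columns=['isPopular', 'category_id'], dtype=float)

print(onehot_df.columns.tolist())
```

Each new column is named after its source column and value, for example `isPopular_1` or `category_id_2`.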
