Introduction

This article summarizes the basics of manipulating data with Pandas, which is essential for data analysis in Python.

It covers everything from important syntax that is easy to forget to a few practical tips.

Recommended for people like this:
→ I want to try Pandas for the first time!
→ I want to do in Python what I usually do in R.
→ I can't remember the Pandas syntax; it would be convenient if there were a list somewhere...
→ How much data handling can Python do in the first place?

If you want to know about creating and manipulating data, please start with the first half.

◆ Basic summary of data manipulation with Python Pandas-First half: Data creation & operation http://qiita.com/hik0107/items/d991cc44c2d1778bb82e

Let's do some calculations

◆ Calculating statistics

Find statistics for each row or column of a data frame

`math.py`


 
#Column-direction total
df_sample["score1"].sum(axis=0) #Calculate the sum of the score1 values
        #axis=0 sums in the vertical direction (down the rows). It is the default, so it can be omitted.
 
df_sample[["score1","score2"]].sum(axis=0)  #Sum score1 and score2 separately. Two results are output
 
 
#Row-direction total
df_sample[["score1","score2"]].sum(axis=1)  
        #Sum the score1 and score2 values in each row. One result is output per row
        #axis=1 sums in the horizontal direction (across the columns).
        #In Pandas, axis is used often to tell the row direction from the column direction, so remember it.
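
To make the two axis directions concrete, here is a minimal sketch. The small `df_sample` built here is a hypothetical stand-in (the real one comes from the first half), and its values are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for df_sample; the values are invented.
df_sample = pd.DataFrame({
    "id": ["a", "b", "c"],
    "score1": [10, 20, 30],
    "score2": [1, 2, 3],
})

# axis=0: collapse the rows, giving one total per column.
column_totals = df_sample[["score1", "score2"]].sum(axis=0)

# axis=1: collapse the columns, giving one total per row.
row_totals = df_sample[["score1", "score2"]].sum(axis=1)

print(column_totals.tolist())  # [60, 6]
print(row_totals.tolist())     # [11, 22, 33]
```

The result of `axis=0` is indexed by column name, while the result of `axis=1` keeps the original row index.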

◆ Pivoting

Pivot-table-style cross tabulation and data structure conversion

`pivot.py`


 
df_sample.pivot_table(values="score1",  #Variable to aggregate
                       aggfunc="sum",  #How to aggregate
                       fill_value=0,   #Value used to fill cells that have no corresponding data
                       index="class",     #Variable to keep in the row direction (named rows= in old Pandas versions)
                       columns="day_no")   #Variable to expand in the column direction
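
As a rough illustration of what the pivot produces, the sketch below builds a tiny hypothetical `df_sample` (the `class`, `day_no` and `score1` values are invented) and pivots it. One class/day combination is deliberately missing so that `fill_value` has something to fill.

```python
import pandas as pd

# Hypothetical data; class B has no record for day_no 2.
df_sample = pd.DataFrame({
    "class": ["A", "A", "B", "B", "A"],
    "day_no": [1, 2, 1, 1, 1],
    "score1": [10, 20, 30, 40, 50],
})

pivoted = df_sample.pivot_table(
    values="score1",    # variable to aggregate
    index="class",      # one row per class
    columns="day_no",   # one column per day_no
    aggfunc="sum",      # sum the scores that fall in the same cell
    fill_value=0,       # cells with no data become 0 instead of NaN
)

print(pivoted)
```

Here the cell for class A on day 1 holds 10 + 50, and the empty cell for class B on day 2 is filled with 0.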

◆ Group-by operation

`groupby.py`


#In Pandas, the group-by operation and the aggregation that follows it are performed separately.
#The groupby method returns something that looks like an ordinary dataframe, but it is an object that carries the group-by key information.
#R works the same way: in dplyr, group_by() sets the key and summarise() aggregates according to that key.
 
df_sample_grouped = df_sample.groupby("day_no")  #Group by day_no.
df_sample_grouped[["score1","score2"]].sum()          
  #Sum over the grouped object.
  #If desired, you can specify which variables to sum.
 
#The group-by key is forcibly treated as the index of the result,
#so it can no longer be handled as a column variable the way it was before the group-by.

df_sample_grouped = df_sample.groupby("day_no", as_index=False)
   #If as_index=False is specified, the key is no longer treated as the index.
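
The difference that `as_index` makes is easiest to see side by side. The following is a minimal sketch using a hypothetical `df_sample` with invented values.

```python
import pandas as pd

# Hypothetical stand-in for df_sample; the values are invented.
df_sample = pd.DataFrame({
    "day_no": [1, 1, 2, 2],
    "score1": [10, 20, 30, 40],
    "score2": [1, 2, 3, 4],
})

# Default behaviour: the group-by key becomes the index of the result.
by_index = df_sample.groupby("day_no")[["score1", "score2"]].sum()

# as_index=False: the key stays in the result as an ordinary column.
by_column = df_sample.groupby("day_no", as_index=False)[["score1", "score2"]].sum()

print(by_index.index.name)      # day_no
print(list(by_column.columns))  # ['day_no', 'score1', 'score2']
```

If you already have a result with the key in the index, calling `reset_index()` on it is another way to turn the key back into a column.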

Let's read and write data

◆ Data import and export

Create a DataFrame from a CSV file, or export a DataFrame to a CSV file

`file.py`


 
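import pandas as pd

#Import csv data
df_sample = pd.read_csv("path_of_data")
 
#Export csv data
#to_csv is a method of the DataFrame itself, not a top-level pandas function
df_sample.to_csv("path_of_exported_file")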
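
For a quick check that export and import round-trip cleanly, here is a small sketch. The file name `sample.csv`, the temporary directory and the data are all assumptions made for the example, and `index=False` is a choice made here to keep the row index out of the file.

```python
import os
import tempfile

import pandas as pd

# Hypothetical data to write out and read back.
df_sample = pd.DataFrame({"day_no": [1, 2], "score1": [10, 20]})

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "sample.csv")

    # Export: index=False avoids writing the row index as an extra column.
    df_sample.to_csv(csv_path, index=False)

    # Import: read the same file back into a new DataFrame.
    df_loaded = pd.read_csv(csv_path)

print(df_loaded)
```

Without `index=False`, the exported file gains an unnamed first column holding the row index, and that column shows up again when the file is read back.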

Basic summary of data manipulation in Python Pandas-Second half: Data aggregation