Summary of methods often used in pandas

This is a collection of pandas methods that I use often. Looking each one up every time I forget how to use it is tedious, so I wrote this article as a memo to myself. (I plan to update it from time to time.)

Data frame display settings (set_option)

`python`


# Import pandas (needed for the pd.set_option calls below)
import pandas as pd

# Display floating-point values with 3 digits after the decimal point
pd.set_option('display.float_format', lambda x: '{:.3f}'.format(x))

# Display all columns; by default, pandas elides columns when there are too many
pd.set_option('display.max_columns', None)
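
To see what the float format setting does, and how to undo it, here is a minimal sketch. The data frame `df_demo` and its values are made up for illustration; `pd.reset_option` is used here as one way to return to the default.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df_demo = pd.DataFrame({"x": [3.14159265, 2.0], "y": [1.23456789, 10.5]})

# Show floats with 3 digits after the decimal point
pd.set_option("display.float_format", "{:.3f}".format)
formatted = df_demo.to_string()
print(formatted)

# Restore the default float formatting when it is no longer needed
pd.reset_option("display.float_format")
```

Only the display changes; the values stored in the data frame keep their full precision.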

Creating a data frame (DataFrame)

`python`


# Import pandas
import pandas as pd

# Define the data, index labels, and column names
val = [[1, 2, 3], [21, 22, 23], [31, 32, 33]]
index = ["row1", "row2", "row3"]
columns = ["col1", "col2", "col3"]

# Create a data frame, specifying the index labels and column names
df = pd.DataFrame(data=val, index=index, columns=columns)
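
As a quick check that the labels ended up where expected, the following sketch rebuilds the same data frame and reads values back by label. The variable names `cell` and `first_col` are only for illustration.

```python
import pandas as pd

val = [[1, 2, 3], [21, 22, 23], [31, 32, 33]]
df = pd.DataFrame(
    data=val,
    index=["row1", "row2", "row3"],
    columns=["col1", "col2", "col3"],
)

# Look up a single value by row label and column label
cell = df.loc["row2", "col3"]

# Pull out one column as a plain Python list
first_col = df["col1"].tolist()

print(df.shape)   # (3, 3)
print(cell)       # 23
print(first_col)  # [1, 21, 31]
```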

Create a data frame by reading a CSV file (read_csv)

`python`


# Read the CSV file (df.csv); the first line is treated as the header and becomes the column names
df = pd.read_csv("df.csv")
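
To try this without a file on disk, the sketch below feeds `read_csv` an in-memory string through `io.StringIO`. The CSV content is made up for illustration and stands in for `df.csv`.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for df.csv
csv_text = "col1,col2,col3\n1,2,3\n21,22,23\n"

# The first line becomes the column names; the remaining lines become rows
df = pd.read_csv(io.StringIO(csv_text))

print(df.columns.tolist())  # ['col1', 'col2', 'col3']
print(len(df))              # 2
```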

Create a data frame by reading a CSV file (no header row)

`python`


# Read the CSV file (df.csv); column names are assigned automatically as sequential integers (0, 1, 2, ...)
df = pd.read_csv("df.csv", header=None)
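
The sketch below shows the automatically numbered columns, again using a made-up in-memory CSV in place of `df.csv`. It also shows the `names` parameter of `read_csv`, which is one way to supply column names when the file has none; the names chosen here are hypothetical.

```python
import io

import pandas as pd

# Hypothetical CSV content without a header row
csv_text = "1,2,3\n21,22,23\n"

# With header=None, every line is data and columns are numbered 0, 1, 2, ...
df = pd.read_csv(io.StringIO(csv_text), header=None)
print(df.columns.tolist())  # [0, 1, 2]

# Optionally supply your own column names at read time
df_named = pd.read_csv(
    io.StringIO(csv_text), header=None, names=["col1", "col2", "col3"]
)
print(df_named.columns.tolist())  # ['col1', 'col2', 'col3']
```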

Change data type (astype)

`python`


# Column.astype(type) changes the column's data type; here it is converted to str
df["A"] = df["A"].astype(str)
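
A minimal sketch of the conversion, using a made-up integer column. The exact dtype name that pandas reports for string columns can differ between pandas versions, so the example inspects the values rather than the dtype.

```python
import pandas as pd

# Hypothetical integer column, used only for illustration
df = pd.DataFrame({"A": [1, 2, 3]})

# Convert every value in column A to a string
df["A"] = df["A"].astype(str)

values = df["A"].tolist()
print(values)  # ['1', '2', '3']
```

After the conversion, arithmetic on the column no longer works as it does for numbers, so convert back with `astype(int)` if the values are needed for calculation.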

Apply a function (apply)

`python`


# Column.apply(function) applies the function to every value in the specified column
# Here we apply the round function
df["A"] = df["A"].apply(round)

# Column.apply(anonymous function) applies a lambda to every value in the specified column
# Here, split keeps only the part before the first comma in every value of column A
# (this assumes column A holds strings)
df["A"] = df["A"].apply(lambda x: x.split(",")[0])
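
The two `apply` calls above expect different kinds of data (numbers for `round`, strings for `split`), so this sketch uses two separate made-up columns, `A` and `B`, to show each one working.

```python
import pandas as pd

# Hypothetical data: A holds floats, B holds comma-separated strings
df = pd.DataFrame(
    {"A": [1.4, 2.6, 3.2], "B": ["tokyo,jp", "paris,fr", "lima"]}
)

# Round every value in column A to the nearest integer
df["A"] = df["A"].apply(round)

# Keep only the text before the first comma in column B
df["B"] = df["B"].apply(lambda x: x.split(",")[0])

print(df["A"].tolist())  # [1, 3, 3]
print(df["B"].tolist())  # ['tokyo', 'paris', 'lima']
```

A value without a comma, such as `"lima"`, is returned unchanged because `split` yields a one-element list.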

Concatenate data frames (concat)

`python`


# Combine data frames df1 and df2 vertically (stack rows) and renumber the index
df3 = pd.concat([df1, df2]).reset_index(drop=True)
# Combine data frames df1 and df2 horizontally (side by side); rows are aligned on the index
df3 = pd.concat([df1, df2], axis=1).reset_index(drop=True)
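
The sketch below shows the resulting shapes for both directions, using two small made-up data frames. It also uses `ignore_index=True`, a `pd.concat` parameter that gives the same renumbered index as `reset_index(drop=True)` for the vertical case.

```python
import pandas as pd

# Hypothetical inputs with the same columns and the same index (0, 1)
df1 = pd.DataFrame({"x": [1, 2], "y": [3, 4]})
df2 = pd.DataFrame({"x": [5, 6], "y": [7, 8]})

# Vertical: rows of df2 are appended below df1, index renumbered 0..3
vertical = pd.concat([df1, df2]).reset_index(drop=True)

# Same result, renumbering the index inside concat itself
vertical_alt = pd.concat([df1, df2], ignore_index=True)

# Horizontal: columns of df2 are placed to the right of df1
horizontal = pd.concat([df1, df2], axis=1)

print(vertical.shape)    # (4, 2)
print(horizontal.shape)  # (2, 4)
```

For the horizontal case, rows are matched by index label, so inputs whose indexes differ will produce missing values where there is no match.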

Transform data grouped by another column (transform)

`python`


# Grouped column.transform(function) applies the function to each group and returns a result aligned with the original rows
# For each group in column A, fill the missing values in column B with the median of B within that group
df["B"] = df.groupby("A")["B"].transform(lambda x: x.fillna(x.median()))
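
Here is a small worked example of the group-wise fill, with made-up data. Group `a` has the known values 1 and 3 (median 2), and group `b` has the single known value 10.

```python
import numpy as np
import pandas as pd

# Hypothetical data: column A is the group key, column B has missing values
df = pd.DataFrame(
    {
        "A": ["a", "a", "a", "b", "b"],
        "B": [1.0, 3.0, np.nan, 10.0, np.nan],
    }
)

# Fill each missing B with the median of B inside the same group of A
df["B"] = df.groupby("A")["B"].transform(lambda x: x.fillna(x.median()))

print(df["B"].tolist())  # [1.0, 3.0, 2.0, 10.0, 10.0]
```

Because `transform` returns a result with the same index as the input, it can be assigned straight back to the column.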

List the columns that contain missing values (isnull)

`python`


# Store the names of the columns that contain null values in a list
null_col = df.isnull().sum()[df.isnull().sum()>0].index.tolist()
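
The sketch below runs the same idea on a made-up data frame, computing the null counts once and reusing them. It also shows a shorter variant based on `any()`, offered as an alternative rather than as the article's method.

```python
import pandas as pd

# Hypothetical data: columns a and c contain missing values, b does not
df = pd.DataFrame(
    {"a": [1, None, 3], "b": ["x", "y", "z"], "c": [None, None, 1.0]}
)

# Count nulls per column once, then keep the columns with at least one
null_counts = df.isnull().sum()
null_col = null_counts[null_counts > 0].index.tolist()

# Alternative: select the column labels where any value is null
null_col_alt = df.columns[df.isnull().any()].tolist()

print(null_col)  # ['a', 'c']
```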

List the columns of a given data type (dtypes)

`python`


# Store the names of the object-type columns as a list in ob_col
ob_col = df.dtypes[df.dtypes=="object"].index.tolist()
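
A minimal sketch with made-up data follows. The `name` column is created with an explicit `dtype=object` because, depending on the pandas version, string columns may be inferred as a dedicated string dtype instead of `object`; that version difference is an assumption worth checking against your own environment.

```python
import pandas as pd

# Hypothetical data: one object column and two numeric columns
df = pd.DataFrame(
    {
        "name": pd.Series(["a", "b"], dtype=object),
        "n": [1, 2],
        "f": [0.5, 1.5],
    }
)

# Keep the labels of the columns whose dtype is object
ob_col = df.dtypes[df.dtypes == "object"].index.tolist()

print(ob_col)  # ['name']
```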

pandas has many useful methods, far more than I can cover here, but I'll keep adding to this article little by little.
