Summary of things that were convenient when using pandas

This article is the day 3 entry of the Furukawa Lab Advent Calendar. It was written by a student at Furukawa Lab as part of their studies, so some of the content may be ambiguous or slightly imprecise.

Introduction

In this article, I summarize the commands that I found useful when using pandas to format data. I am still a beginner at programming, so I would appreciate it if you could read this with a forgiving eye ^^

Main text

python


import pandas as pd

df = pd.read_csv('File Path')

This is the basic way to read a CSV file. In practice, there are situations where you have to read several files, so the method I used in those cases is shown below.

When you want to read multiple files at once

python


import glob

# Files in the same directory
file_pass = glob.glob('*.csv')

# You can also specify a directory
file_pass = glob.glob('○○/○○/*.csv')

This collects the paths of the .csv files in the specified directory. If the directory ○○/○○ contains data_1.csv, data_1.txt, data_2.csv, and data_2.txt, then

python


['○○/○○/data_1.csv', '○○/○○/data_2.csv']

is returned (glob does not guarantee the order of the list). After that, just loop over it with a for statement.

python


counter = -1
for i in file_pass:
    df = pd.read_csv(i)
    counter = counter + 1
    # Add some operation here
    # To save without the index, just add index=False to to_csv
    df.to_csv('new_name_{0}.csv'.format(counter))

This way you can format all the data in one pass (adjust to_csv, counter, and so on as needed).
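As a self-contained sketch of the loop above, the following creates a few throwaway CSV files in a temporary directory, processes each one, and saves the result without the index. The `value` column, the doubling operation, and the helper name `format_all` are assumptions made only for illustration. `sorted` is added because `glob.glob` does not guarantee the order of the returned paths, and `enumerate` stands in for the manual counter.

```python
import glob
import os
import tempfile

import pandas as pd


def format_all(src_dir, dst_dir):
    """Read every .csv in src_dir, apply one operation, and save a numbered copy."""
    # glob.glob makes no promise about order, so sort for reproducible numbering
    file_paths = sorted(glob.glob(os.path.join(src_dir, '*.csv')))
    saved = []
    for counter, path in enumerate(file_paths):
        df = pd.read_csv(path)
        # Placeholder operation (hypothetical): double the 'value' column
        df['value'] = df['value'] * 2
        out_path = os.path.join(dst_dir, 'new_name_{0}.csv'.format(counter))
        # index=False keeps the row index out of the saved file
        df.to_csv(out_path, index=False)
        saved.append(out_path)
    return saved


# Demo with throwaway files
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, 'src')
    dst = os.path.join(tmp, 'dst')
    os.makedirs(src)
    os.makedirs(dst)
    for name, values in [('data_1.csv', [1, 2]), ('data_2.csv', [3, 4])]:
        pd.DataFrame({'value': values}).to_csv(os.path.join(src, name), index=False)
    # A .txt file in the same directory is ignored by the '*.csv' pattern
    with open(os.path.join(src, 'data_1.txt'), 'w') as f:
        f.write('not a csv')

    saved_paths = format_all(src, dst)
    saved_names = [os.path.basename(p) for p in saved_paths]
    first_values = pd.read_csv(saved_paths[0])['value'].tolist()
    first_columns = pd.read_csv(saved_paths[0]).columns.tolist()
```

Because the files are written with `index=False`, reading them back gives only the `value` column, with no extra unnamed index column.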

os: convenient when naming files

python


import os

# Returns the absolute path of the path given in (); "../" refers to the directory one level above the current working directory
path = os.path.abspath(filepath)

# Extracts the file name from the path
# This was convenient to use together with glob
name = os.path.basename(filepath)

# Sometimes the extension is not needed; split on "." with split
name = name.split(".")
name = name[0]
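As a small sketch of the naming step, the example below compares `name.split(".")[0]` with the standard library's `os.path.splitext`, which removes only the last extension. This matters when a file name contains more than one dot. The sample paths and the helper name `stem` are made up for illustration.

```python
import os


def stem(filepath):
    """Return the file name without its directory and without its last extension."""
    name = os.path.basename(filepath)
    # splitext splits only at the final dot, so 'data.v2.csv' -> ('data.v2', '.csv')
    root, _ext = os.path.splitext(name)
    return root


simple = stem('some_dir/data_1.csv')
dotted = stem('some_dir/data.v2.csv')
# For comparison: splitting on every dot keeps only the part before the first dot
dotted_split = os.path.basename('some_dir/data.v2.csv').split('.')[0]
```

For a name like data_1.csv both approaches give the same result, so `split` is fine as long as the file names contain a single dot.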

Summary

In short, glob and os make it easy to read CSV files in bulk and apply the same operation to each of them. One thing I noticed while doing this myself: pandas operations are easy to find by googling "pandas ○○", but I often could not tell whether what I needed was a pandas function or part of another Python library, and in those cases I could not search effectively. I want to learn what is possible and develop the ability to google properly ╭ (・ ㅂ ・) و
