Basics of pandas for beginners ② Understanding data overview

What is pandas

pandas is a Python library built around the DataFrame, an object for handling structured (tabular) data. It makes it easy to read files and then run SQL-like operations on them, and it is essential for work such as machine learning, where data has to be processed, aggregated, and visualized. This series is a memo of commonly used data-manipulation syntax. This part covers how to get an overview of the data.

Library import

Import pandas under the alias pd.

python


import pandas as pd

Check the number of rows

Check the number of rows in the DataFrame (held here in a variable named dataflame).

python


print(len(dataflame))
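As a minimal sketch of the row count, the following builds a small DataFrame and compares len() with the shape attribute, which returns rows and columns together. The sample data and the column names (name, age) are made up for illustration; only the variable name dataflame follows this article.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
dataflame = pd.DataFrame({
    "name": ["Alice", "Bob", "Carol"],
    "age": [25, 32, 47],
})

# len() returns the number of rows.
n_rows = len(dataflame)

# shape returns a (rows, columns) tuple.
shape_rows, n_cols = dataflame.shape

print(n_rows, shape_rows, n_cols)
```

shape is handy when you also want to confirm that the expected number of columns was read.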

Check data types

python


dataflame.dtypes
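A small sketch of what dtypes returns, assuming made-up sample columns (name, age, height). The result is a Series with one dtype per column. The exact dtype reported for text columns depends on the pandas version (object in older versions, a string dtype in newer ones), so the example only checks the numeric columns.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
dataflame = pd.DataFrame({
    "name": ["Alice", "Bob", "Carol"],
    "age": [25, 32, 47],
    "height": [160.5, 172.0, 168.2],
})

# dtypes is a Series indexed by column name.
dtypes = dataflame.dtypes
print(dtypes)

# Helper functions in pd.api.types test a dtype without hard-coding its name.
age_is_integer = pd.api.types.is_integer_dtype(dtypes["age"])
height_is_float = pd.api.types.is_float_dtype(dtypes["height"])
print(age_is_integer, height_is_float)
```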

Display summary statistics (numerical data)

python


dataflame.describe()

describe() aggregates, for each numerical column, count (number of non-missing values), mean, std (standard deviation), min (minimum), 25%, 50%, and 75% (quartiles), and max (maximum).
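The statistics listed above can be checked on a tiny example. This is only a sketch with made-up columns (age, score). Note that describe is a method, so it needs parentheses to be called.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
dataflame = pd.DataFrame({
    "age": [20, 30, 40, 50],
    "score": [1.0, 2.0, 3.0, 4.0],
})

# One row per statistic, one column per numerical column.
summary = dataflame.describe()
print(summary)

# Individual statistics can be picked out with .loc[row, column].
mean_age = summary.loc["mean", "age"]
print(mean_age)
```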

Display summary statistics (categorical data)

python


dataflame.describe(include='O')

The argument is the uppercase letter O (for object), not the digit zero. This aggregates count (number of non-missing values), unique (number of distinct values), top (the most frequent value), and freq (the number of times the top value occurs). To display these together with the numerical statistics, use describe(include='all').
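Here is a hedged sketch of both calls on made-up data (columns city, grade, age). The text columns are created with dtype=object explicitly, an assumption made so that include='O' selects them regardless of how the installed pandas version infers string columns.

```python
import pandas as pd

# Hypothetical sample data; dtype=object is set explicitly on the text columns.
dataflame = pd.DataFrame({
    "city": pd.Series(["Tokyo", "Osaka", "Tokyo"], dtype=object),
    "grade": pd.Series(["A", "B", "C"], dtype=object),
    "age": [25, 32, 47],
})

# Only object columns: rows are count, unique, top, freq.
categorical = dataflame.describe(include='O')
print(categorical)

# All columns: statistics that do not apply to a column are shown as NaN.
everything = dataflame.describe(include='all')
print(everything)
```

With include='all', numerical columns get NaN for unique, top, and freq, and object columns get NaN for mean, std, and the quantiles.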

Check for missing values

Check that values were loaded correctly, both right after reading the initial data and after joining tables. The following counts the missing values in each column.

python


dataflame.isnull().sum()
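To illustrate the "after joining" case, the sketch below performs a left join in which one key has no match, then counts the missing values per column. The two tables and their columns (id, name, city) are hypothetical.

```python
import pandas as pd

# Hypothetical tables, used only for illustration.
left = pd.DataFrame({"id": [1, 2, 3], "name": ["Alice", "Bob", "Carol"]})
right = pd.DataFrame({"id": [1, 3], "city": ["Tokyo", "Kyoto"]})

# A left join keeps every row of `left`; id 2 has no match, so its city is missing.
dataflame = left.merge(right, on="id", how="left")

# isnull() marks missing cells as True; sum() counts them per column.
missing = dataflame.isnull().sum()
print(missing)
```

A non-zero count in a column that came from the joined table is a quick hint that some keys did not match.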
