pandas is a library for handling structured (tabular) data in Python. It makes it easy to read files and then process or extract the data (much as you would with SQL), and it is indispensable for data preprocessing in fields such as machine learning. The table of contents for other items is here.
This article covers how to control the number of digits. The first thing to understand is the difference between adjusting the number of digits for pandas as a whole and adjusting it for individual DataFrames and variables. Also note that pandas does not round half up; it rounds half to even (banker's rounding). If you are not familiar with rounding to even, look it up.
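As a quick illustration of rounding half to even, here is a minimal sketch. The Series `halves` holds made-up values chosen for illustration and is not part of the Titanic data. Each value sits exactly halfway between two integers, so it goes to the nearest even integer rather than always upward.

```python
import pandas as pd

# Hypothetical values that sit exactly halfway between two integers.
halves = pd.Series([0.5, 1.5, 2.5, 3.5])

# Halves are rounded to the nearest even integer, not always up.
rounded = halves.round()
print(rounded.tolist())  # [0.0, 2.0, 2.0, 4.0]
```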
First, import the library. Import pandas under the name pd.
python
import pandas as pd
I will use the Titanic data for the examples. If you are not familiar with it, search for "kaggle Titanic".
python
dataframe = pd.read_csv('train.csv')
The various settings of pandas are managed through `options`. (There are many other options, so check them out if you are interested.) The overall format of floating-point numbers is controlled by `display.float_format`, which takes a formatting function, and the number of digits after the decimal point is controlled by `display.precision`.
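To make `display.float_format` more concrete, here is a small sketch. The frame `sample` and its values are made up for illustration. It assumes you only want the setting temporarily, so it uses `pd.option_context`, which restores the previous value when the `with` block ends.

```python
import pandas as pd

# Hypothetical one-column frame used only for illustration.
sample = pd.DataFrame({"value": [3.14159, 1234.5]})

# float_format takes a callable that turns each float into a string.
# option_context applies the option only inside the with-block.
with pd.option_context("display.float_format", "{:.2f}".format):
    formatted = repr(sample)

print(formatted)
```

Only the displayed text changes here; the values stored in `sample` are untouched.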
Let's actually check it.
In
print(pd.options.display.float_format)
print(pd.options.display.precision)
Out
None
6
No custom float format is set (`None`), and up to 6 digits are displayed after the decimal point. Looking at the actual data, Fare, for example, is displayed with up to four digits after the decimal point. This is because the original CSV data has only 4 decimal digits; values with more digits would be displayed with up to 6.
Next, change this value so that two decimal places are displayed. (Fare will be shown with 2 decimal places.)
python
pd.options.display.precision = 2
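The sketch below shows that `display.precision` affects only how values are displayed, not the values themselves. The frame `sample` is a made-up example, and `pd.option_context` is used here so that the setting does not leak outside the `with` block.

```python
import pandas as pd

# Hypothetical frame with one long decimal value.
sample = pd.DataFrame({"value": [1.23456789]})

# Temporarily show two digits after the decimal point.
with pd.option_context("display.precision", 2):
    shown = repr(sample)

print(shown)                    # displayed with 2 decimal places
print(sample["value"].iloc[0])  # the stored value is unchanged
```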
Use reset_option if you want to restore the default value.
python
pd.reset_option('display.precision')
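As a small round-trip check, the following sketch changes the option with `pd.set_option`, reads it back with `pd.get_option`, and then resets it. The variable names `changed` and `restored` are just for illustration.

```python
import pandas as pd

# Change the option and read the new value back.
pd.set_option("display.precision", 2)
changed = pd.get_option("display.precision")

# Restore the default and read it again.
pd.reset_option("display.precision")
restored = pd.get_option("display.precision")

print(changed, restored)  # 2 6
```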
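Use round() to set the number of digits for an individual DataFrame. Unlike the display options, round() returns a new DataFrame whose values are actually rounded. For two digits after the decimal point, write the following. (Fare will be shown with 2 decimal places.)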
python
dataframe.round(2)
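Note that `round()` does not modify the DataFrame in place. The sketch below uses a tiny made-up frame (`sample`, with hypothetical Fare values) instead of the full Titanic data to show that the result has to be assigned to a variable if you want to keep it.

```python
import pandas as pd

# Hypothetical Fare values used only for illustration.
sample = pd.DataFrame({"Fare": [7.2512, 71.2833]})

# round() returns a new DataFrame; sample itself is left as it was.
rounded = sample.round(2)

print(rounded["Fare"].tolist())  # rounded copy
print(sample["Fare"].tolist())   # original values
```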
To set the number of digits for each column, pass a dictionary as follows. (Example: Age is 1 digit and Fare is 3 digits.)
python
dataframe.round({'Age':1, 'Fare':3})
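Here is a minimal sketch of the per-column form on a small made-up frame. The values in `sample` are hypothetical, and only the column names follow the Titanic data. Columns that are not listed in the dictionary, such as Pclass here, are left as they are.

```python
import pandas as pd

# Hypothetical rows; only the column names follow the Titanic data.
sample = pd.DataFrame({
    "Age": [22.46, 38.04],
    "Fare": [7.25678, 71.28334],
    "Pclass": [3, 1],
})

# Age gets 1 decimal place, Fare gets 3; Pclass is not listed.
rounded = sample.round({"Age": 1, "Fare": 3})
print(rounded)
```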
We have summarized the knowledge needed to implement machine learning in Python as short articles that beginners can follow. The table of contents is here, so I hope you will refer to the other articles as well.