I started studying machine learning theory as an IT engineer and got stuck because I did not know what a DataFrame was. This post is the result of looking into it. While I was at it, I looked into Series as well.
The two primary data structures of pandas, Series (1-dimensional) and DataFrame (2-dimensional), handle the vast majority of typical use cases in finance, statistics, social science, and many areas of engineering. For R users, DataFrame provides everything that R’s data.frame provides and much more.
Roughly translated: pandas has two main data structures. **Series is one-dimensional** and **DataFrame is two-dimensional**. They are used in many fields (finance, statistics, ...). From an R user's point of view, DataFrame provides everything R's data.frame provides and more.
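To make the "one-dimensional vs. two-dimensional" distinction concrete, here is a minimal sketch that checks `ndim` and `shape` on each structure. The sample values and the column names `a` and `b` are made up for illustration and do not come from the pandas documentation.

```python
import pandas as pd

# A Series holds a single labeled column of values (1 dimension).
s = pd.Series([10, 20, 30])

# A DataFrame holds rows and columns (2 dimensions).
# The column names "a" and "b" are illustrative only.
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

print(s.ndim, s.shape)    # 1 (3,)
print(df.ndim, df.shape)  # 2 (3, 2)
```

The shape of the Series has one entry (its length), while the shape of the DataFrame has two (rows, columns).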
# Import the numerical library
import numpy
# Import Series and DataFrame from the data analysis library
from pandas import Series, DataFrame
# Series
# data parameter: the data. array-like, dict, or scalar value
# index parameter: labels for the data. array-like or Index (1d)
# dtype parameter: the data type. numpy.dtype or None
# copy parameter: whether to copy the input data. Default is False
# name parameter: name given to the Series
#1
print(Series(data=[0,1]))
#2
print(Series(data=[2,3], index=['x', 'y'], name='value'))
# DataFrame
# data parameter: the data (numpy ndarray (structured or homogeneous), dict, or DataFrame)
# index parameter: row labels. Defaults to 0, 1, 2, ... like array subscripts
# columns parameter: column labels. Defaults to 0, 1, 2, ...
# dtype parameter: the data type. dtype, default None
# copy parameter: whether to copy the input data. Default is False in older pandas versions (None since pandas 1.3)
#3
print(DataFrame(numpy.array([[0,0],[1,1]])))
#4
print(DataFrame(numpy.array([[0,0],[1,1]]), index=['a', 'b']))
#5
print(DataFrame(numpy.array([[0,0],[1,1]]), index=['a', 'b'], columns=['x', 'y']))
Output:
#1
0    0
1    1
dtype: int64
#2
x    2
y    3
Name: value, dtype: int64
#3
   0  1
0  0  0
1  1  1
#4
   0  1
a  0  0
b  1  1
#5
   x  y
a  0  0
b  1  1
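One way to see how the two structures relate is to take a single column or row out of a DataFrame: the result is a Series. The sketch below reuses the DataFrame from example #5. The variable names `df`, `col`, and `row` are my own and are only for illustration.

```python
import numpy
from pandas import Series, DataFrame

# Same DataFrame as example #5 above.
df = DataFrame(numpy.array([[0, 0], [1, 1]]), index=['a', 'b'], columns=['x', 'y'])

# Selecting one column returns a Series labeled by the row index.
col = df['x']
print(type(col).__name__)  # Series
print(col.name)            # x
print(list(col.index))     # ['a', 'b']

# Selecting one row by label also returns a Series,
# this time labeled by the column names.
row = df.loc['b']
print(row.name)            # b
print(list(row.index))     # ['x', 'y']
```

In other words, the `index` and `columns` labels of the DataFrame become the `index` or `name` of the Series, depending on whether a column or a row was extracted.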