pandas provides a data frame object for handling structured data in Python. You can read files easily and then run SQL-like operations on them, which makes it essential for work such as machine learning that involves processing, calculating, and visualizing data. This is a memo of commonly used syntax for data manipulation. This section covers reading and processing data.
Import pandas under the alias pd.
python
import pandas as pd
Read a CSV file into a data frame.
python
dataflame = pd.read_csv('file.csv')
Excel and other formats can be read in the same way. See the official pandas documentation, [Input / output].
Display the first rows of the data. Pass the number of rows you want in the parentheses.
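As a self-contained sketch of the read step, the following reads CSV text from an in-memory buffer instead of a file on disk. The sample data and its column names are made up for illustration; a file path such as 'file.csv' goes in the same position.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for the contents of 'file.csv'
csv_text = "key,column1,column2\na,1,10\nb,2,20\nc,3,30\n"

# read_csv accepts a file path or any file-like object
dataflame = pd.read_csv(io.StringIO(csv_text))

print(dataflame.shape)          # (number of rows, number of columns)
print(list(dataflame.columns))  # column names taken from the header row
```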
python
dataflame.head(10)
"head" displays rows from the beginning, and "tail" displays rows from the end.
Add the existing columns "column1" and "column2" to create a new column, "column3".
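A small sketch of "head" and "tail" side by side, assuming a made-up data frame with 20 rows. When no number is given, both return 5 rows.

```python
import pandas as pd

# Hypothetical data frame with 20 rows (values 0 to 19)
dataflame = pd.DataFrame({"n": range(20)})

first_rows = dataflame.head(3)   # first 3 rows
last_rows = dataflame.tail(3)    # last 3 rows
default_rows = dataflame.head()  # no argument: 5 rows

print(first_rows)
print(last_rows)
```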
python
dataflame['column3'] = dataflame['column1'] + dataflame['column2']
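The addition is element-wise, so a missing value in either column makes the result missing for that row. The sketch below uses made-up data to show this, along with `Series.add` and its `fill_value` argument as one possible way to treat a missing value as 0.

```python
import pandas as pd

dataflame = pd.DataFrame({"column1": [1, 2, 3], "column2": [10, 20, 30]})
# Element-wise addition, row by row
dataflame["column3"] = dataflame["column1"] + dataflame["column2"]

# Hypothetical data with a missing value in column1
with_missing = pd.DataFrame({"column1": [1.0, None], "column2": [10.0, 20.0]})

# Plain "+" gives NaN wherever either side is missing
plain = with_missing["column1"] + with_missing["column2"]

# add(..., fill_value=0) treats a missing value on one side as 0
filled = with_missing["column1"].add(with_missing["column2"], fill_value=0)
```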
Perform a left outer join of "dataflame1" and "dataflame2" on the column "key", and store the result as "join_dataflame".
python
join_dataflame = pd.merge(dataflame1, dataflame2, on = 'key', how = 'left')
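If you want to limit the columns, pass a selection such as dataflame1[['key', 'column1']] instead of the whole data frame. The join column "key" must be included in the selection.
Write the data out as a CSV file.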
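Here is a minimal sketch of the left outer join with a column selection, using made-up data. The "extra" column is hypothetical and only exists to show that it is dropped from the result. Rows of the left side with no match on the right are kept, and the right-side columns are filled with NaN.

```python
import pandas as pd

# Hypothetical input data
dataflame1 = pd.DataFrame(
    {"key": ["a", "b", "c"], "column1": [1, 2, 3], "extra": [0, 0, 0]}
)
dataflame2 = pd.DataFrame({"key": ["a", "b"], "column2": [10, 20]})

# Keep only "key" and "column1" from the left side before joining
join_dataflame = pd.merge(
    dataflame1[["key", "column1"]],
    dataflame2,
    on="key",
    how="left",
)

print(join_dataflame)  # "c" has no match, so its column2 is NaN
```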
python
dataflame.to_csv('dump_file.csv', index=False, encoding='utf-8', sep=',')
"index" specifies whether the row index is written (the header row is controlled separately by "header"), "encoding" specifies the character encoding, and "sep" specifies the delimiter.
Check the number of rows in "dataflame".
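To see what each option changes, this sketch calls to_csv without a file path, in which case it returns the CSV text as a string instead of writing a file. The data is made up for illustration.

```python
import pandas as pd

dataflame = pd.DataFrame({"column1": [1, 2], "column2": [3, 4]})

# With no path, to_csv returns the CSV text as a string
with_index = dataflame.to_csv()                          # row index written as first column
without_index = dataflame.to_csv(index=False)            # row index omitted
no_header = dataflame.to_csv(index=False, header=False)  # header row omitted as well
tab_separated = dataflame.to_csv(index=False, sep="\t")  # tab as the delimiter

print(with_index)
print(without_index)
```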
python
print(len(dataflame))
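len counts rows only. If you also want the number of columns, the shape attribute returns both as a tuple. A short sketch with made-up data:

```python
import pandas as pd

dataflame = pd.DataFrame({"column1": [1, 2, 3], "column2": [4, 5, 6]})

n_rows = len(dataflame)                 # number of rows
shape_rows, n_cols = dataflame.shape    # (rows, columns)

print(n_rows, shape_rows, n_cols)
```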