I had to shape some messy data at my part-time job, so this is a memo from that time. My scribbles on what I did in R around January are in a separate post.
import
Import the familiar data analysis tool.
import pandas as pd
However, since I often handle data that spans many files, glob is imported as well!
import glob
pandas
data = pd.read_csv("file name.csv")
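--pd.read_<file format>() supports various file formats (read_csv, read_excel, read_json, and so on)
--header=None tells pandas the file has no header row!
--names=['1', '2', '3'] etc. lets you set the column names yourself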
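As a minimal sketch of these options, assuming a small header-less CSV held in memory instead of a real file (the sample values and column names are made up for illustration):

```python
import io

import pandas as pd

# Hypothetical CSV content with no header row (stands in for a real file).
csv_text = "1,2,3\n4,5,6\n"

# header=None tells pandas that the first line is data, not column names.
no_header = pd.read_csv(io.StringIO(csv_text), header=None)

# names=[...] supplies the column names yourself.
named = pd.read_csv(io.StringIO(csv_text), names=["1", "2", "3"])
```

With header=None the columns are labelled with the integers 0, 1, 2; with names they get the strings you passed.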
datas = glob.glob('*')
If you have a large number of files, this gives you a list with the name of every entry in the current directory.
With pandas, a file you read in becomes a DataFrame rather than a numpy array.
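Putting glob and read_csv together might look like the following sketch. It assumes a throwaway directory with two tiny CSV files; the file names and contents are hypothetical and only there to make the example self-contained.

```python
import glob
import os
import tempfile

import pandas as pd

# Build a throwaway directory with two tiny CSV files (illustrative data only).
tmp_dir = tempfile.mkdtemp()
for name, rows in [("a.csv", "x,y\n1,2\n"), ("b.csv", "x,y\n3,4\n")]:
    with open(os.path.join(tmp_dir, name), "w") as f:
        f.write(rows)

# glob.glob returns matching paths in arbitrary order, so sort them.
paths = sorted(glob.glob(os.path.join(tmp_dir, "*.csv")))

# Read every file; each result is a pandas DataFrame, not a numpy array.
frames = [pd.read_csv(p) for p in paths]
```

Using a pattern such as '*.csv' instead of '*' avoids picking up folders and unrelated files.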
Stripping off the parts that get in the way.
data.drop([1,2])
#Delete rows
data.drop([1,2],axis=1)
#Delete columns
This removes the rows or columns with those labels. drop returns a new DataFrame, so assign the result if you want to keep it.
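A small sketch of both calls, assuming a made-up 4x4 frame whose row and column labels are the integers 0 to 3:

```python
import pandas as pd

# Illustrative frame with integer row labels 0..3 and column labels 0..3.
data = pd.DataFrame([[r * 10 + c for c in range(4)] for r in range(4)])

rows_dropped = data.drop([1, 2])          # removes the rows labelled 1 and 2
cols_dropped = data.drop([1, 2], axis=1)  # removes the columns labelled 1 and 2
```

The original data is left unchanged; only the returned frames are smaller.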
pd.concat([data[1],data[0]])
#Join rows (stack one below the other)
pd.concat([data[1],data[0]],axis=1)
#Join columns (place side by side)
This is useful when you have a lot of files and data!
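To illustrate the two directions, here is a sketch with two made-up frames that share the same columns. The names first and second are hypothetical, and ignore_index=True is an optional extra not used in the memo.

```python
import pandas as pd

# Two illustrative frames with the same columns.
first = pd.DataFrame({"x": [1, 2], "y": [3, 4]})
second = pd.DataFrame({"x": [5, 6], "y": [7, 8]})

# The default axis=0 stacks rows; ignore_index=True renumbers them 0..n-1.
stacked = pd.concat([first, second], ignore_index=True)

# axis=1 places the frames side by side, aligning on the row index.
side_by_side = pd.concat([first, second], axis=1)
```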
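Keep only the data you want!
data.query("`1` == 2")
This keeps only the rows where the column named '1' has the value 2. The backticks are needed because a bare 1 in the query string is read as the number 1, not as a column name.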
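A small sketch of this kind of row filter, assuming the column name is the string "1" (the sample values are made up). Because a bare 1 inside a query string is read as the number 1, the column name is wrapped in backticks here; the boolean-mask line is an equivalent alternative.

```python
import pandas as pd

# Illustrative frame whose column names are the strings "1" and "2".
data = pd.DataFrame({"1": [2, 5, 2], "2": [10, 20, 30]})

# Backticks mark `1` as a column name rather than the number 1.
with_query = data.query("`1` == 2")

# The same filter written as a boolean mask.
with_mask = data[data["1"] == 2]
```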
Indispensable when shaping files to work out relationships in the data!
pd.merge(data1, data, on='Column name')
This joins the two tables, matching up the rows that have the same value in that column.
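As a sketch of how the join behaves, assume two made-up tables that share a hypothetical "id" column. The how="outer" variant is an addition for comparison and is not part of the memo.

```python
import pandas as pd

# Two illustrative tables sharing a hypothetical "id" column.
data1 = pd.DataFrame({"id": [1, 2, 3], "height": [150, 160, 170]})
data = pd.DataFrame({"id": [2, 3, 4], "weight": [55, 65, 75]})

# The default how="inner" keeps only the ids present in both tables.
merged = pd.merge(data1, data, on="id")

# how="outer" keeps every id and fills the gaps with NaN.
merged_all = pd.merge(data1, data, on="id", how="outer")
```

With the default inner join, rows whose key appears in only one table are silently dropped, so it is worth checking the row count afterwards.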
Patience is important for this kind of unglamorous data shaping! Once it works for one file, you can process everything in one go with a for loop. Swapping rows and columns is also easy if you go through a numpy array or an intermediate file along the way. Think about the data you want and work steadily toward it. Thank you for reading my rough memo.