Only the parts (X1, Y1) to (X4, Y4) of the Excel data (shown below) read by pandas Do you want to make a data frame separately? I often do. (Because I create Excel data without thinking ahead ...)
(Figure below) The X and Y parts can be read quickly by the method described in (1) above.
It is very convenient because you can enjoy it later if it is a data frame. With such an Excel data sheet, creating a data frame is troublesome in the first place ...
This time, it is a memo to create a child data frame from this data frame. There seems to be some code that can be solved with just one line, which is easier than this, While I was playing around with it, I was able to understand the data frame structure in various ways, so I would like to make a note of it.
Excel files can be read in one shot with pandas and it's easy to win, but in most cases my fucking Excel data does not have a proper index or column.
So, I would like to create a new data frame (child) by specifying the range of the read data (parent).
At the beginning, specify the same import settings and file path as usual.
# coding: utf-8
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.font_manager import FontProperties
#Specify the location of the font file ubuntu(C:\Windows\Fonts\Any Font)
fp = FontProperties(fname="/usr/share/fonts/truetype/fonts-japanese-gothic.ttf")
####################################################################
filepath='~/Desktop/sample2.xls' #Xls file location
Read parent data frame
#Parent data frame
df=pd.read_excel(filepath) #Load into Pandas DataFrame
Immediately, specify the range of the parent df and extract the data. Here, it is specified using iloc.
#Specify the cell range you want to graph
df1=df.iloc[8:20,4:6] #[Line start: end,Column start:End]Specify with
df1=df1.reset_index(drop=True) #Re-roll the index
The entire label including the X1 and Y1 labels in the figure below (rows 10 to 21, columns E-F) is specified. Maybe because of the header and the rule of starting from zero Reading the specified position directly from the cell of the sheet is a little sensuously off. Therefore, it is good to specify the range while adding a hit and printing (df1).
Here, if you look at df1 read from the specified range, In this way (shown below), it is read based on the column data (Unnamed :) of the parent df, and X1 and Y1 are also data. These X1 and Y1 will be used as labels for the next "child data frame" to be created.
Therefore, let's extract the column data (X1, Y1) of df1.
#List the columns in the first row → Use in the columns of the child data frame
COL_LIST1=df1[0:1].values.flatten()
Then COL_LIST1 will return `['X1''Y1']`
.
I got the column data I wanted.
Create a child data frame (co_df1) based on the extracted column data. At this point, there is only column data and no contents.
#Child data frame(empty)Create, the column is the COL created earlier_LIST
co_df1=pd.DataFrame({},columns=COL_LIST1,index=[])
Using the for statement, each column data (COL_LIST1) is hung in each column of the specified range data frame (df1). Move the data to a child data frame.
#Write data to each column of the child data frame(〜[1:]Get the data after the column label by)
#Graph1
for i,col in enumerate(COL_LIST1):
#Read the data in each column
co_df1[col]=df1[[i]][1:].reset_index(drop=True)
With df1 [[i]], the data in the first and second columns of df1 that was specified and read by iloc is read. However, it also contains the label data X1 and Y1. Therefore df1[[i]][1:] By doing so, you can retrieve only the numerical data excluding the label data.
However, if this is left as it is, the original index information remains and it will be troublesome later. df1[[i]][1:].reset_index(drop=True) Reset the index with.
If you put this in the empty child data frame co_df1 sequentially (only in COL_LIST1), the target child data frame will be completed.
After that, for x and y axis data for plot Just throw it in.
x1=co_df1[[0]]
y1=co_df1[[1]]
I made a graph of all the Excel data X1, Y1 to X4, Y4 shown at the beginning as child data frames.
Paste the graph with 4 child data and the one with only one.
# coding: utf-8
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.font_manager import FontProperties
#Specify the location of the font file ubuntu(C:\Windows\Fonts\Any Font)
fp = FontProperties(fname="/usr/share/fonts/truetype/fonts-japanese-gothic.ttf")
####################################################################
filepath='~/Desktop/sample2.xls' #Xls file location
#Parent data frame
df=pd.read_excel(filepath) #Load into Pandas DataFrame
#Create a child data frame from a parent data frame
#Specify the cell range you want to graph
df1=df.iloc[8:20,4:6] #[Line start: end,Column start:End]Specify with
df1=df1.reset_index(drop=True) #Re-roll the index
df2=df.iloc[8:20,7:9]
df2=df2.reset_index(drop=True)
df3=df.iloc[21:33,4:6]
df3=df3.reset_index(drop=True)
df4=df.iloc[21:33,7:9]
df4=df4.reset_index(drop=True)
#Column(Name)Read
COL_LIST1=df1[0:1].values.flatten() #List the columns in the first row → Use in the columns of the child data frame
COL_LIST2=df2[0:1].values.flatten()
COL_LIST3=df3[0:1].values.flatten()
COL_LIST4=df4[0:1].values.flatten()
#Child data frame(empty)Create, the column is the COL created earlier_LIST
co_df1=pd.DataFrame({},columns=COL_LIST1,index=[])
co_df2=pd.DataFrame({},columns=COL_LIST2,index=[])
co_df3=pd.DataFrame({},columns=COL_LIST3,index=[])
co_df4=pd.DataFrame({},columns=COL_LIST4,index=[])
#Write data to each column of the child data frame
#Graph1
for i,col in enumerate(COL_LIST1):
#Read the data in each column
co_df1[col]=df1[[i]][1:].reset_index(drop=True)
#Graph2
for i,col in enumerate(COL_LIST2):
#Read the data in each column
co_df2[col]=df2[[i]][1:].reset_index(drop=True)
#Graph3
for i,col in enumerate(COL_LIST3):
#Read the data in each column
co_df3[col]=df3[[i]][1:].reset_index(drop=True)
#Graph4
for i,col in enumerate(COL_LIST4):
#Read the data in each column
co_df4[col]=df4[[i]][1:].reset_index(drop=True)
#X in the graph,Extraction of Y data
#Parent data frame
x=df[[0]]
y=df[[1]]
#Child data frame
x1=co_df1[[0]]
y1=co_df1[[1]]
x2=co_df2[[0]]
y2=co_df2[[1]]
x3=co_df3[[0]]
y3=co_df3[[1]]
x4=co_df4[[0]]
y4=co_df4[[1]]
####################################################################
#Graph
####################################################################
fig = plt.figure()
#Multiple graph settings
ax1 = fig.add_subplot(221) #Graph1
ax2 = fig.add_subplot(222)
ax3 = fig.add_subplot(223)
ax4 = fig.add_subplot(224)
#y-axis scale setting
ax1.set_yscale('linear')
ax2.set_yscale('linear')
ax3.set_yscale('log')
ax4.set_yscale('log')
#Axis range
ax1.set_ylim(-1.1, 1.1)
ax1.set_xlim(0,300)
ax2.set_ylim(0,10)
ax2.set_xlim(0,10)
ax3.set_ylim(1E+0,1E+5)
ax3.set_xlim(0,10)
ax4.set_ylim(1E-13,1E+1)
ax4.set_xlim(0,10)
#Individual title
ax1.set_title("Graph1",fontdict = {"fontproperties": fp},fontsize=12)
ax2.set_title("Graph2",fontdict = {"fontproperties": fp},fontsize=12)
ax3.set_title("Graph3",fontdict = {"fontproperties": fp},fontsize=12)
ax4.set_title("Graph4",fontdict = {"fontproperties": fp},fontsize=12)
#axis
ax1.set_xlabel("x",fontdict = {"fontproperties": fp},fontsize=12)
ax1.set_ylabel("y",fontdict = {"fontproperties": fp},fontsize=12)
#plot
ax1.plot(x1, y1,'blue',label='graph1')
ax2.plot(x2, y2,'green',label='graph2')
ax3.plot(x3, y3,'red',label='graph3')
ax4.plot(x4, y4,'black',label='graph4')
#Legend position
ax1.legend(loc="upper right")
ax2.legend(loc="upper left")
ax3.legend(loc="upper left")
ax4.legend(loc="upper left")
#Layout adjustment
plt.tight_layout()
#The title of the entire graph
fig.suptitle('Graph', fontsize=14)
plt.subplots_adjust(top=0.85)
#Save file Save as both png and eps
plt.savefig("sample.png ")
plt.savefig("sample.eps")
plt.show()
# coding: utf-8
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.font_manager import FontProperties
#Specify the location of the font file ubuntu(C:\Windows\Fonts\Any Font)
fp = FontProperties(fname="/usr/share/fonts/truetype/fonts-japanese-gothic.ttf")
####################################################################
filepath='~/Desktop/sample2.xls' #Xls file location
#Parent data frame
df=pd.read_excel(filepath) #Load into Pandas DataFrame
#Create a child data frame from a parent data frame
#Specify the cell range you want to graph
df1=df.iloc[8:20,4:6] #[Line start: end,Column start:End]Specify with
df1=df1.reset_index(drop=True) #Re-roll the index
#Column(Name)Read
COL_LIST1=df1[0:1].values.flatten() #List the columns in the first row → Use in the columns of the child data frame
#Child data frame(empty)Create, the column is the COL created earlier_LIST
co_df1=pd.DataFrame({},columns=COL_LIST1,index=[])
#Write data to each column of the child data frame(〜[1:]Get the data after the column label by)
#Graph1
for i,col in enumerate(COL_LIST1):
#Read the data in each column
co_df1[col]=df1[[i]][1:].reset_index(drop=True)
#X in the graph,Extraction of Y data
#Parent data frame
x=df[[0]]
y=df[[1]]
#Child data frame
x1=co_df1[[0]]
y1=co_df1[[1]]
####################################################################
#Graph
####################################################################
fig = plt.figure()
#Multiple graph settings
ax1 = fig.add_subplot(111) #Graph1
#y-axis scale setting
ax1.set_yscale('linear')
#Axis range
ax1.set_ylim(-1.1, 1.1)
ax1.set_xlim(0,300)
#Individual title
ax1.set_title("Graph1",fontdict = {"fontproperties": fp},fontsize=12)
#axis
ax1.set_xlabel("x",fontdict = {"fontproperties": fp},fontsize=12)
ax1.set_ylabel("y",fontdict = {"fontproperties": fp},fontsize=12)
#plot
ax1.plot(x1, y1,'blue',label='graph1')
#Legend position
ax1.legend(loc="upper right")
#Layout adjustment
plt.tight_layout()
#The title of the entire graph
fig.suptitle('Graph', fontsize=14)
plt.subplots_adjust(top=0.85)
#Save file Save as both png and eps
plt.savefig("sample.png ")
plt.savefig("sample.eps")
plt.show()
Recommended Posts