One day, Prof. "T-kun, I'll send you the data that came out of the measuring instrument, so make a graph yourself." T "OK!" At a meeting some days later ... T "Here is the graph." Prof. "What is the value here? Did you calculate it?" (...?!) Prof. "There are still XX more pieces of data in the same format as this one. Please do the calculation for them." ((^ω^) ... I'm done for)
That is the background to why I started learning Python: to automate and speed up organizing data. Python was my first experience with programming. Having studied little by little and finished writing my master's thesis, I would like to record my progress with Python. This is my first time writing an article, and I'm not sure I'll finish writing it (3/18). Sample data and a notebook are also on GitHub. Please get them from here.
T "I'm going to write a program while studying Python, so could you wait a little?" Prof. "?? Understood!" (Later, "a little" swells to a few months)
Now, let's open the data. The data I received was in txt format, which can also be opened in Excel. The semicolon delimiter is a bit unpleasant. Below is an example of the data.
The first textbook I picked up was [python introductory note](https://www.amazon.co.jp/%E8%A9%B3%E7%B4%B0-Python-%E5%85%A5%E9%96%80%E3%83%8E%E3%83%BC%E3%83%88-%E5%A4%A7%E9%87%8D-%E7%BE%8E%E5%B9%B8/dp/4800711673). After installing Anaconda as this textbook instructed, I studied by writing code in Spyder.
The textbook described reading and writing files using numpy, so for the time being I read the data the way it showed. I passed the file path using tkinter. I don't really understand tkinter, but I used it like a magic spell.
.py
import tkinter as tk
import tkinter.filedialog as fd
import numpy as np

root = tk.Tk()
root.withdraw()  # hide the empty main window; only the file dialog is needed
path = fd.askopenfilename(
    title="file---",
    filetypes=[("csv", "csv"), ("CSV", "csv")])
if path:
    # Read semicolon-separated data, skipping the first 3 lines
    fileobj = np.genfromtxt(path, delimiter=";", skip_header=3)
    f = fileobj[:, 0]  # first column data
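To see what `np.genfromtxt` does with this kind of file without opening a dialog, here is a minimal sketch. The three header lines and the numbers are made up for illustration (they are not the real instrument output), and the text is read from an in-memory `io.StringIO` instead of a file path.

```python
import io
import numpy as np

# Made-up stand-in for the instrument file: 3 header lines, then
# semicolon-separated rows (hypothetical frequency and power values).
sample_text = """header line 1
header line 2
header line 3
1.0e9;-30.0
2.0e9;-10.0
3.0e9;-30.0
"""

# Same options as in the article: semicolon delimiter, skip 3 header lines
data = np.genfromtxt(io.StringIO(sample_text), delimiter=";", skip_header=3)
freq = data[:, 0]   # first column
power = data[:, 1]  # second column
print(data.shape)
```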
That is how I was reading the data at the time. Later, I came across a useful module called pandas. I used [Introduction to Jupyter [Practice] for Python users](https://www.amazon.co.jp/Python%E3%83%A6%E3%83%BC%E3%82%B6%E3%81%AE%E3%81%9F%E3%82%81%E3%81%AEJupyter-%E5%AE%9F%E8%B7%B5-%E5%85%A5%E9%96%80-%E6%B1%A0%E5%86%85-%E5%AD%9D%E5%95%93/dp/4774192236) as a reference to set up the notebook environment. So I started over with Jupyter Notebook and pandas.
.ipynb
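import tkinter
from tkinter import filedialog
import pandas as pd

root = tkinter.Tk()
root.withdraw()  # hide the empty main window
path = filedialog.askopenfilename(
    title="file___",
    filetypes=[("txt", "txt"), ("csv", "csv")])
if path:
    # Semicolon-separated, no header row, first 3 lines skipped.
    # Column 0 stays an ordinary column; it is set as the index when plotting.
    df = pd.read_csv(path, engine="python", header=None, sep=';', skiprows=3)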
If you read it with pandas, the table data will look like this.
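As a rough illustration of the resulting table, the sketch below reads a made-up semicolon-separated text with the same kind of `read_csv` options. The content of `sample_text` and the name `df_demo` are assumptions for this example only. With `header=None`, pandas labels the columns with the integers 0, 1, ...

```python
import io
import pandas as pd

# Made-up stand-in for the instrument file (not the real data)
sample_text = """header line 1
header line 2
header line 3
1.0e9;-30.0
2.0e9;-10.0
3.0e9;-30.0
"""

# No header row in the data, so columns get integer labels 0 and 1
df_demo = pd.read_csv(io.StringIO(sample_text), sep=";", header=None, skiprows=3)
print(df_demo)
```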
Let's graph it quickly with pandas' plotting function. In this experiment, we use the data in the first and second columns. In a DataFrame, data is selected with an indexer (loc, iloc); note that the returned object becomes a Series when a single row or a single column is specified.
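The Series-versus-DataFrame point can be checked with a small sketch. The DataFrame `df_demo` and its values are made up for this example.

```python
import pandas as pd

# Small made-up table with integer column labels, like a header-less file
df_demo = pd.DataFrame({0: [1.0, 2.0, 3.0], 1: [-30.0, -10.0, -30.0]})

one_col_series = df_demo.iloc[:, 0]    # single position  -> Series
one_col_frame = df_demo.iloc[:, [0]]   # list of positions -> DataFrame
one_row_series = df_demo.loc[0]        # single row label  -> Series

print(type(one_col_series).__name__)
print(type(one_col_frame).__name__)
print(type(one_row_series).__name__)
```

Passing a list such as `[0]` or `[0, 1]` keeps the result a DataFrame, which is why the code in this article selects columns with `iloc[:, [0, 1]]`.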
.ipynb
import matplotlib.pyplot as plt

df = df.iloc[:, [0, 1]]         # keep the first and second columns
df.set_index(0, inplace=True)   # overwrite df, using column 0 as the index
df.plot()
plt.show()
When plotting with pandas, the index seems to be taken automatically as the horizontal axis, so I set the index in advance with .set_index(column name). As the image shows, a waveform with a sharp peak appeared near the center. There was noise at the edges. Depending on the number of data points, everything up to this step can also be done in Excel.
Everything went smoothly up to creating the graph, but the calculation was troublesome. The challenge this time was to **evaluate the sharpness of the peak**. When I googled for a fitting tool, scipy's curve_fit came up right away, so I tried it. The vertical axis in the graph above is the input power, in decibels (dBm). Converting the unit to mW and normalizing by the maximum value gives the figure below.
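df.index = df.index * pow(10, -9)  # scale the frequency axis by 1e-9 (to GHz)
df.index.rename('freq[GHz]', inplace=True)
# dBm -> mW, normalized by the maximum value
df['mag'] = 10 ** ((df.iloc[:, 0] - df.iloc[:, 0].max()) / 10)
df['mag'].plot()
plt.show()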
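The conversion relies on the relation P[mW] = 10^(P[dBm]/10). Dividing by the maximum in mW is the same as subtracting the maximum in dBm before taking the power of ten, which is what the code above does in one step. A minimal check with made-up values (the array `power_dbm` is hypothetical):

```python
import numpy as np

power_dbm = np.array([-30.0, -10.0, -20.0])  # made-up dBm values

power_mw = 10 ** (power_dbm / 10)            # dBm -> mW
normalized_a = power_mw / power_mw.max()     # normalize in linear units
normalized_b = 10 ** ((power_dbm - power_dbm.max()) / 10)  # normalize in dB first

print(np.allclose(normalized_a, normalized_b))
```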
The function I want to fit is the Lorentz function plus a baseline. Everything except x is a fit parameter.
init holds the initial values of the parameters; I made the list from roughly estimated values. curve_fit(function name, x, y, initial values of the parameters) returns the optimal parameters opt and the covariance cov.
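Read off from the `lorentz` function defined in the code that follows, the model can be written as below. This is only a transcription of that code, so the factor $x$ in the numerator is kept exactly as the code has it; a textbook Lorentzian has no such factor.

$$
f(x) = \frac{A\,x\,\gamma}{(x-\mu)^2 + \gamma^2} + Bx + C
$$

Here $\mu$ is the peak center, $\gamma$ sets the peak width, and $Bx + C$ is the linear baseline.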
import numpy as np
import scipy.optimize

def lorentz(x, A, mu, gamma, B, C):  # function to fit
    return A * x * gamma / ((x - mu)**2 + gamma**2) + B * x + C

# Rough initial guesses derived from the frequency range
A = -np.pi * (df.index.max() - df.index.min()) / 20
mu = df.index.values.mean()
gamma = (df.index.max() - df.index.min()) / 20
B = 10
C = 0
init = [A, mu, gamma, B, C]  # initial values of the parameters to fit
opt, cov = scipy.optimize.curve_fit(lorentz, df.index, df['mag'], init)  # fitting
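To try `curve_fit` without the measurement data, the sketch below fits synthetic data generated from known parameters. It is an assumption-laden example, not the fit above: `model` is a simplified, hypothetical function (a plain Lorentzian peak plus a constant offset, without the `x` factor and the linear term of `lorentz`), and the data, noise level, and initial values are made up.

```python
import numpy as np
from scipy.optimize import curve_fit

def model(x, A, mu, gamma, C):
    # Plain Lorentzian peak of height A plus a constant offset C
    # (simplified, hypothetical model for this example)
    return A * gamma**2 / ((x - mu) ** 2 + gamma**2) + C

# Synthetic data from known parameters, with a little seeded noise
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 201)
true_params = (1.0, 5.0, 0.5, 0.1)
y = model(x, *true_params) + rng.normal(0.0, 0.005, x.size)

init = [0.8, 4.8, 0.7, 0.0]                  # rough initial guess
opt, cov = curve_fit(model, x, y, p0=init)   # optimal parameters and covariance
perr = np.sqrt(np.diag(cov))                 # 1-sigma estimates from the diagonal

print(opt)
print(perr)
```

The square roots of the diagonal of `cov` give a rough uncertainty for each parameter, which is a quick way to judge whether a fit can be trusted.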
Plot the results. Creating a new column with df[column name] = ... is very convenient. If you want to add a column to a Series object, first convert it to a DataFrame via pd.DataFrame(Series object) or .reset_index(). It's nice that the column names are used as-is in the legend. The vertical axis is ...
df['fit'] = lorentz(df.index, *opt)  # evaluate the fitted curve on the same x values
df.loc[:, ['mag', 'fit']].plot()
plt.show()
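The two Series-to-DataFrame conversions mentioned above behave slightly differently, as this small sketch with made-up values shows. The names `s`, `frame_a`, and `frame_b` are hypothetical.

```python
import pandas as pd

# Made-up Series standing in for a single measured column
s = pd.Series([0.1, 1.0, 0.1], index=[1.0, 2.0, 3.0], name="mag")

frame_a = pd.DataFrame(s)   # index kept; one column named "mag"
frame_b = s.reset_index()   # old index becomes an ordinary column named "index"

# Once it is a DataFrame, a new column can be added by assignment
frame_a["fit"] = frame_a["mag"] * 0.5   # placeholder values, not a real fit

print(frame_a.columns.tolist())
print(frame_b.columns.tolist())
```

`pd.DataFrame(s)` is the closer match when the index should stay on the horizontal axis of the plot, since `.reset_index()` moves it into a column.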
It fits nicely.
The sharpness of the peak was evaluated using the obtained $\mu$ and $\gamma$.
・ $\mu$: Center of peak
・ $\gamma$
Analysis of measurement data (2) -Histogram and fitting, lmfit recommendation-
The sample data represented the frequency characteristics of a certain resonant circuit. I started tinkering on my own, and finally the analysis program was complete. When it ran for the first time, I was moved. By the time I graduated, I had processed thousands of pieces of data. Terrifying. I'm glad I made it ... I want to keep studying pandas so that I can handle data freely.