How to display line graphs on Jupyter Notebook using Matplotlib.
The content of this article was tested in a Jupyter Notebook environment set up according to the following article: Easy installation and startup of Jupyter Notebook using Docker (also supports nbextensions and Scala) - Qiita
In this environment, you can use Jupyter Notebook by opening port 8888 in a browser. To open a new notebook, choose New > Python 3 from the button at the upper right.
See the following article for histograms and scatter plots: Display histogram / scatter plot on Jupyter Notebook - Qiita
We prepared two sample data files in which the first column is the x value and the second column is the y value.
test1.csv
0,100
1,110
2,108
4,120
6,124
test2.csv
0,90
1,95
2,99
3,104
4,108
5,111
6,115
Open Jupyter Notebook and import the required libraries.
%matplotlib inline
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
Read the data.
df1 = pd.read_csv("test1.csv", names=["x", "y"])
df2 = pd.read_csv("test2.csv", names=["x", "y"])
df1 and df2 are Pandas DataFrame objects.
See the previous article for reading from CSV and handling DataFrames: Try basic operation for DataFrame - Qiita
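As a quick check of what read_csv produces here, the following is a minimal sketch that loads the same rows as test1.csv from an in-memory string instead of a file. The use of io.StringIO and the variable name csv_text are assumptions made only so the example runs without the CSV file on disk.

```python
import io

import pandas as pd

# Same rows as test1.csv, held in memory (assumption: avoids needing the file).
csv_text = "0,100\n1,110\n2,108\n4,120\n6,124\n"

# names= supplies the column labels because the data has no header row.
df1 = pd.read_csv(io.StringIO(csv_text), names=["x", "y"])

print(type(df1).__name__)       # DataFrame
print(df1.shape)                # (5, 2)
print(type(df1["x"]).__name__)  # Series
```

Selecting a single column with df1["x"] gives a Series, which is the type passed to plt.plot in the next step.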
plt.plot(df1["x"], df1["y"])
df1["x"] and df1["y"] are Pandas Series objects, which can be passed to plt.plot.
matplotlib.pyplot.plot — Matplotlib 3.1.1 documentation
You can also overlay two graphs.
fig = plt.figure()
ax1 = fig.add_subplot(111)
ax1.plot(df1["x"], df1["y"])
ax1.plot(df2["x"], df2["y"])
Let's draw a graph of the function y = x^2 + 3x + 80.
x3 = np.linspace(0, 6, 13)
y3 = x3 * x3 + 3.0 * x3 + 80.0
plt.plot(x3, y3)
np.linspace(0, 6, 13) returns a NumPy ndarray with the elements 0, 0.5, 1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5, 5.5, 6. It is an array of 13 evenly spaced numbers with 0 and 6 at both ends.
numpy.linspace — NumPy v1.17 Manual
If you apply the four basic arithmetic operations to an ndarray, the result is an ndarray with the same number of elements, so x3 * x3 + 3.0 * x3 + 80.0 is also an ndarray containing 13 numbers.
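The element counts and spacing can be confirmed directly. This is a small sketch using only the x3 and y3 definitions from the article; the print calls are added just for inspection.

```python
import numpy as np

# 13 evenly spaced points from 0 to 6, both endpoints included.
x3 = np.linspace(0, 6, 13)

# Arithmetic on an ndarray is applied element by element.
y3 = x3 * x3 + 3.0 * x3 + 80.0

print(x3.shape, y3.shape)  # both arrays hold 13 elements
print(x3[1] - x3[0])       # spacing between neighbouring points: 0.5
print(y3[0], y3[-1])       # values at x = 0 and x = 6: 80.0 134.0
```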
If you prepare x and y as arrays and pass them to plt.plot, you can draw a graph in the same way as with the CSV data. Earlier a Pandas Series was passed to plt.plot, but a NumPy ndarray can be passed as well.
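To illustrate that both input types end up as the same plotted data, here is a minimal sketch that plots the test1.csv values once as Pandas Series and once as NumPy ndarrays, then compares what the resulting lines hold. The Agg backend line and the variable names are assumptions so the example runs outside a notebook; in Jupyter you would keep %matplotlib inline instead.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend for running outside a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

x_values = [0, 1, 2, 4, 6]
y_values = [100, 110, 108, 120, 124]

# plt.plot returns a list of Line2D objects; each call here creates one line.
(series_line,) = plt.plot(pd.Series(x_values), pd.Series(y_values))
(array_line,) = plt.plot(np.array(x_values), np.array(y_values))

# Both lines carry the same y data regardless of the input type.
same = np.array_equal(
    np.asarray(series_line.get_ydata()),
    np.asarray(array_line.get_ydata()),
)
print(same)  # True
```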
It can also be displayed superimposed on the CSV file data.
fig = plt.figure()
ax1 = fig.add_subplot(111)
ax1.plot(df1["x"], df1["y"])
ax1.plot(df2["x"], df2["y"])
ax1.plot(x3, y3)
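Each call to ax1.plot adds one more line to the same Axes, which is why the graphs overlap. The sketch below repeats the overlay in a self-contained form and, as an optional addition that is not in the original article, passes label and calls ax1.legend() so the three lines can be told apart. The in-memory CSV strings and the Agg backend are assumptions so it runs without the files or a notebook.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend for running outside a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Same rows as test1.csv and test2.csv, held in memory.
df1 = pd.read_csv(io.StringIO("0,100\n1,110\n2,108\n4,120\n6,124\n"), names=["x", "y"])
df2 = pd.read_csv(io.StringIO("0,90\n1,95\n2,99\n3,104\n4,108\n5,111\n6,115\n"), names=["x", "y"])

x3 = np.linspace(0, 6, 13)
y3 = x3 * x3 + 3.0 * x3 + 80.0

fig = plt.figure()
ax1 = fig.add_subplot(111)

# Every plot call on the same Axes adds another line to it.
ax1.plot(df1["x"], df1["y"], label="test1.csv")
ax1.plot(df2["x"], df2["y"], label="test2.csv")
ax1.plot(x3, y3, label="y = x^2 + 3x + 80")
ax1.legend()  # optional: shows which line is which

print(len(ax1.get_lines()))  # 3
```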
You can specify the shape of the data points by passing the marker argument to plot.
x3 = np.linspace(0, 6, 13)
y3 = x3 * x3 + 3.0 * x3 + 80.0
plt.plot(x3, y3, marker=".")
x3 = np.linspace(0, 6, 13)
y3 = x3 * x3 + 3.0 * x3 + 80.0
plt.plot(x3, y3, marker="o")
See the reference below for the strings that can be specified for marker.
matplotlib.markers — Matplotlib 3.1.1 documentation
You can specify the line width with the linewidth argument. If 0 is specified, no line is drawn.
x3 = np.linspace(0, 6, 13)
y3 = x3 * x3 + 3.0 * x3 + 80.0
plt.plot(x3, y3, marker="o", linewidth=0)
Other options are also listed in the references below. matplotlib.pyplot.plot — Matplotlib 3.1.1 documentation
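To see how these options are stored, the following sketch keeps the Line2D object that plt.plot returns and reads the settings back. The Agg backend line is an assumption so the example runs outside a notebook; the getter calls are only for inspection and are not needed for drawing.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend for running outside a notebook
import matplotlib.pyplot as plt
import numpy as np

x3 = np.linspace(0, 6, 13)
y3 = x3 * x3 + 3.0 * x3 + 80.0

# plt.plot returns a list of Line2D objects; unpack the single line.
(line,) = plt.plot(x3, y3, marker="o", linewidth=0)

print(line.get_marker())     # o
print(line.get_linewidth())  # 0.0, so only the markers are drawn
```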
That's all.