Here is what I made.
The example is a scatter plot of MNIST reduced to two dimensions by t-SNE. Hover the mouse cursor over a point to see the image that corresponds to that point.
This article focuses on how to use matplotlib. The explanation of MNIST and t-SNE is omitted.
However, the variable names and how they were created are specific to this program, so I post that part of the code here.
import numpy as np
from sklearn.datasets import fetch_openml
from sklearn.manifold import TSNE
width = 28
nskip = 35
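# as_frame=False keeps the data as NumPy arrays; recent scikit-learn versions
# may otherwise return a pandas DataFrame, which breaks the slicing below.
mnist = fetch_openml("mnist_784", version=1, as_frame=False)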
mnist_img = mnist["data"][::nskip, :]
mnist_label = mnist["target"][::nskip]
mnist_int = np.asarray(mnist_label, dtype=int)
x_embedded = TSNE(n_components=2).fit_transform(mnist_img)
width is the width of the image in pixels. nskip is the reciprocal of the sampling rate. The full dataset has 70,000 samples, which is too many to plot, so I keep 1 in 35 and use 2,000 samples.
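The effect of nskip can be checked without downloading MNIST. The sketch below uses a stand-in array of 70,000 row indices (an assumption for illustration, not the real data) and applies the same [::nskip] slicing to it.

```python
import numpy as np

nskip = 35
n_total = 70000  # size of the full MNIST set, as stated above

# Stand-in for the real data: one row index per sample.
indices = np.arange(n_total)

# Same slicing as mnist["data"][::nskip, :], applied to the stand-in.
kept = indices[::nskip]

print(len(kept))   # number of samples kept
print(kept[:3])    # first few row indices that survive
```

Note that this is a fixed stride rather than a random draw, so the same rows are selected on every run.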
The details of the other arrays are as follows.
mnist_img: an array of double-precision floating point numbers with shape (2000, 784). Raw image data, stored with pixel values from 0 to 255.
mnist_label: an array with shape (2000,). The digit labels are stored as character strings.
mnist_int: an array with shape (2000,). mnist_label converted to integers.
Probably the second most straightforward plot.
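import matplotlib.pyplot as plt
# plt.text does not rescale the axes, so set the limits from the data.
plt.xlim(x_embedded[:, 0].min(), x_embedded[:, 0].max())
plt.ylim(x_embedded[:, 1].min(), x_embedded[:, 1].max())
# Draw each sample's digit label at its embedded position.
for x, label in zip(x_embedded, mnist_label):
    plt.text(x[0], x[1], label)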
plt.xlabel("component 0")
plt.ylabel("component 1")
plt.show()
The idea of plotting the digits themselves is borrowed from https://qiita.com/stfate/items/8988d01aad9596f9d586.
If you simply use scatter, the ranges of the x and y axes are adjusted automatically. Because this approach places text at each point instead, you have to set xlim and ylim yourself.
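The point about the axis limits can be checked in isolation. The following is a minimal sketch on made-up coordinates (the points array is a stand-in for x_embedded); it builds a Figure directly so that no window is opened.

```python
import numpy as np
from matplotlib.figure import Figure

# Stand-in for x_embedded: five points spread between -50 and 50.
coords = np.linspace(-50.0, 50.0, 5)
points = np.column_stack([coords, coords])

fig = Figure()
ax = fig.subplots()

# Placing text does not trigger autoscaling of the axes.
for x, y in points:
    ax.text(x, y, "7")
limits_after_text = tuple(ax.get_xlim())

# Setting the limits from the data makes every label visible.
ax.set_xlim(points[:, 0].min(), points[:, 0].max())
ax.set_ylim(points[:, 1].min(), points[:, 1].max())
limits_after_fix = tuple(ax.get_xlim())

print(limits_after_text, limits_after_fix)
```

After the text calls alone the x range does not reach the outer points, so most labels would fall outside the visible area until the limits are set by hand.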
Even so, you can see at a glance that a cluster forms for each digit, and that other digits are sometimes mixed in like noise.
Looking at this plot got me thinking. So, as a first step, let's display the label on mouse-over.
I found the answer on Stack Overflow.
https://stackoverflow.com/questions/7908636/possible-to-make-labels-appear-when-hovering-over-a-point-in-matplotlib
I adapted that code to this MNIST example.
fig, ax = plt.subplots()
cmap = plt.cm.RdYlGn
sc = plt.scatter(x_embedded[:, 0], x_embedded[:, 1], c=mnist_int/10.0, cmap=cmap, s=3)
# Empty annotation that is moved and filled in on hover.
annot = ax.annotate("", xy=(0,0), xytext=(20,20), textcoords="offset points",
                    bbox=dict(boxstyle="round", fc="w"),
                    arrowprops=dict(arrowstyle="->"))
annot.set_visible(False)
def update_annot(ind):
    # Use the first of the points under the cursor.
    i = ind["ind"][0]
    pos = sc.get_offsets()[i]
    annot.xy = pos
    text = mnist_label[i]
    annot.set_text(text)
    annot.get_bbox_patch().set_facecolor(cmap(int(text)/10))
def hover(event):
    vis = annot.get_visible()
    if event.inaxes == ax:
        cont, ind = sc.contains(event)
        if cont:
            # Cursor is on a point: update and show the annotation.
            update_annot(ind)
            annot.set_visible(True)
            fig.canvas.draw_idle()
        else:
            # Cursor left the points: hide the annotation if it is shown.
            if vis:
                annot.set_visible(False)
                fig.canvas.draw_idle()
fig.canvas.mpl_connect("motion_notify_event", hover)
plt.show()
It uses the kind of event handling that is common in GUIs.
First, create an empty Annotation object annot, and then define a function update_annot that updates its position, contents, and so on. Register the hover function with fig.canvas.mpl_connect("motion_notify_event", hover). Inside hover, if the cursor is over a point, call update_annot and show annot; if it is not over any point, hide annot.
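To see what sc.contains hands to the hover function, it can help to feed it a simulated mouse event. The sketch below does that on three made-up points with the off-screen Agg backend; the data and the hand-built MouseEvent are assumptions for illustration, since in the real program matplotlib creates the event itself.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend: this sketch needs no window
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.backend_bases import MouseEvent

points = np.array([[0.0, 0.0], [1.0, 2.0], [3.0, 1.0]])  # stand-in data
fig, ax = plt.subplots()
sc = ax.scatter(points[:, 0], points[:, 1], s=30)
fig.canvas.draw()  # finalise axis limits and transforms

# Pixel position of point 1, where the simulated cursor is placed.
px, py = ax.transData.transform(points[1])
event = MouseEvent("motion_notify_event", fig.canvas, px, py)

# cont: whether the cursor is on any point.
# ind["ind"]: indices of the points under the cursor.
cont, ind = sc.contains(event)
print(cont, ind["ind"])
```

Because several points can overlap under the cursor, ind["ind"] is an array, and update_annot simply takes its first element.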
To set the colors with c=mnist_int/10.0 in scatter, I prepared the mnist_int array, which holds the labels as integers.
Now you can draw an interactive scatter plot like the one in the video above.
I skipped it this time, but it would be kinder to display a legend mapping colors to digits somewhere.
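One possible way to add such a legend is to build one proxy marker per digit. This is only a sketch on random stand-in data (xy and digits are made up), and the vmin/vmax arguments are my own addition: without them scatter rescales c to its own minimum and maximum, so digit 9 would be drawn with cmap(1.0) rather than cmap(0.9) and would not match the legend exactly.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D

cmap = plt.cm.RdYlGn
rng = np.random.default_rng(0)
xy = rng.normal(size=(200, 2))          # stand-in for x_embedded
digits = rng.integers(0, 10, size=200)  # stand-in for mnist_int

fig, ax = plt.subplots()
# Pin the color scale to [0, 1] so digit d is drawn with cmap(d / 10).
sc = ax.scatter(xy[:, 0], xy[:, 1], c=digits / 10.0, cmap=cmap,
                vmin=0.0, vmax=1.0, s=3)

# One proxy marker per digit, colored the same way as the points.
handles = [
    Line2D([], [], marker="o", linestyle="", color=cmap(d / 10), label=str(d))
    for d in range(10)
]
legend = ax.legend(handles=handles, title="digit", loc="upper right", ncol=2)
```

Calling plt.show() afterwards displays the figure; the hover handler from the article can be connected to the same fig and sc unchanged.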
Having come this far, I found something new to be dissatisfied with.
"What does a noisy point, for example a 7 inside a cluster of 1s, actually look like? Maybe the algorithm turned a perfectly normal point into noise, so I want to check the raw data as well, not just the label."
To make this possible, displaying the original image on mouse-over seems like a good approach.
The official documentation has a demo that displays images in annotations.
https://matplotlib.org/3.1.0/gallery/text_labels_and_annotations/demo_annotation_box.html
Combine this with the event registration mentioned earlier.
Earlier the annotation was text only, so Annotation was enough, but images are a little more involved and require a two-step setup. First, import the required classes.
from matplotlib.offsetbox import OffsetImage, AnnotationBbox
Prepare the required figure objects.
fig, ax = plt.subplots()
cmap = plt.cm.RdYlGn
Next, prepare an OffsetImage, using the 0th image as a dummy.
img = mnist_img[0, :].reshape((width, width))
imagebox = OffsetImage(img, zoom=1.0)
imagebox.image.axes = ax
Then build an AnnotationBbox from it.
annot = AnnotationBbox(imagebox, xy=(0,0), xybox=(width,width),
                       xycoords="data", boxcoords="offset points", pad=0.5,
                       arrowprops=dict(arrowstyle="->", connectionstyle="arc3,rad=-0.3"))
annot.set_visible(False)
ax.add_artist(annot)
Note that xybox does not specify the size of annot, but its position relative to the annotated point xy.
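The distinction between the annotated point and the box offset is easier to see with different values. In this sketch the random image, the point (1, 1), and the offset (-40, -40) are all made up for illustration; cmap="gray" is also my choice, not something the article uses.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.offsetbox import OffsetImage, AnnotationBbox

fig, ax = plt.subplots()
ax.plot([0, 1], [0, 1], "o")

img = np.random.default_rng(0).random((28, 28))  # stand-in 28x28 image
imagebox = OffsetImage(img, zoom=1.0, cmap="gray")

# xy is the annotated point in data coordinates.
# xybox is where the box goes, here 40 points left of and below xy.
annot = AnnotationBbox(imagebox, xy=(1, 1), xybox=(-40, -40),
                       xycoords="data", boxcoords="offset points",
                       arrowprops=dict(arrowstyle="->"))
ax.add_artist(annot)

print(annot.xy, annot.xybox)
```

The displayed size of the box comes from the image array and the zoom argument of OffsetImage, so changing xybox only moves the box.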
Next come the scatter plot and the image update.
sc = plt.scatter(x_embedded[:, 0], x_embedded[:, 1], c=mnist_int/10.0, cmap=cmap, s=3)
def update_annot(ind):
    i = ind["ind"][0]
    pos = sc.get_offsets()[i]
    annot.xy = (pos[0], pos[1])
    # Reshape the flat 784-pixel row back into a 28x28 image.
    img = mnist_img[i, :].reshape((width, width))
    imagebox.set_data(img)
I set the new image data on imagebox, but it seems that no update is needed for annot itself.
Also, in a separate experiment I found that it adapts dynamically even when the size of img changes, so I think images of various sizes can be handled without any extra processing.
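The size experiment can be reproduced on the OffsetImage alone. A minimal sketch, with two random arrays of different shapes standing in for real images:

```python
import numpy as np
from matplotlib.offsetbox import OffsetImage

rng = np.random.default_rng(0)
small = rng.random((14, 14))  # stand-in images of two different sizes
large = rng.random((56, 56))

imagebox = OffsetImage(small, zoom=1.0)
shape_before = imagebox.get_data().shape

# Swap in an image of a different size; no other object is touched.
imagebox.set_data(large)
shape_after = imagebox.get_data().shape

print(shape_before, shape_after)
```

The box simply holds whatever array was set last, which is consistent with annot needing no explicit update.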
The rest, the event registration, is the same as before.
def hover(event):
    vis = annot.get_visible()
    if event.inaxes == ax:
        cont, ind = sc.contains(event)
        if cont:
            update_annot(ind)
            annot.set_visible(True)
            fig.canvas.draw_idle()
        else:
            if vis:
                annot.set_visible(False)
                fig.canvas.draw_idle()
fig.canvas.mpl_connect("motion_notify_event", hover)
plt.show()
Sorry that it is only displayed for a moment, but I found that the noisy points do indeed have shapes close to another digit and are easy to misread.
This technique is pointless for static image output, but I think it is quite useful during trial and error.
While working on this, I learned about Artist, the abstract base class that represents objects in matplotlib.
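A quick way to confirm this relationship is to check the class hierarchy. The sketch below lists the classes that appear in this article's code; the selection is mine.

```python
from matplotlib.artist import Artist
from matplotlib.collections import PathCollection
from matplotlib.offsetbox import AnnotationBbox, OffsetImage
from matplotlib.text import Annotation

# Annotation: returned by ax.annotate; PathCollection: returned by scatter.
classes = [Annotation, AnnotationBbox, OffsetImage, PathCollection]

for cls in classes:
    print(cls.__name__, issubclass(cls, Artist))

all_artists = all(issubclass(cls, Artist) for cls in classes)
```

This is why the same set_visible and get_visible calls work on both the text annotation and the image annotation.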
For the plot you want to display, change if False: to if True:.
I have not tested what happens when more than one is set to True.
import numpy as np
from sklearn.datasets import fetch_openml
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt
from matplotlib.offsetbox import OffsetImage, AnnotationBbox
width = 28
nskip = 35
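# as_frame=False keeps the data as NumPy arrays; recent scikit-learn versions
# may otherwise return a pandas DataFrame, which breaks the slicing below.
mnist = fetch_openml("mnist_784", version=1, as_frame=False)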
mnist_img = mnist["data"][::nskip, :]
mnist_label = mnist["target"][::nskip]
mnist_int = np.asarray(mnist_label, dtype=int)
# Quick check of the array type, value range, and dtype.
print(type(mnist_img))
print(mnist_img.max())
print(mnist_img.dtype)
x_embedded = TSNE(n_components=2).fit_transform(mnist_img)
if True:
    # Plot 1: draw each digit label as text at its embedded position.
    plt.xlim(x_embedded[:, 0].min(), x_embedded[:, 0].max())
    plt.ylim(x_embedded[:, 1].min(), x_embedded[:, 1].max())
    for x, label in zip(x_embedded, mnist_label):
        plt.text(x[0], x[1], label)
    plt.xlabel("component 0")
    plt.ylabel("component 1")
    plt.show()
    exit()
fig, ax = plt.subplots()
cmap = plt.cm.RdYlGn
if False:
    # Plot 2: show the digit label on mouse-over.
    sc = plt.scatter(x_embedded[:, 0], x_embedded[:, 1], c=mnist_int/10.0, cmap=cmap, s=3)
    annot = ax.annotate("", xy=(0,0), xytext=(20,20), textcoords="offset points",
                        bbox=dict(boxstyle="round", fc="w"),
                        arrowprops=dict(arrowstyle="->"))
    annot.set_visible(False)
    def update_annot(ind):
        i = ind["ind"][0]
        pos = sc.get_offsets()[i]
        annot.xy = pos
        text = mnist_label[i]
        annot.set_text(text)
        annot.get_bbox_patch().set_facecolor(cmap(int(text)/10))
    def hover(event):
        vis = annot.get_visible()
        if event.inaxes == ax:
            cont, ind = sc.contains(event)
            if cont:
                update_annot(ind)
                annot.set_visible(True)
                fig.canvas.draw_idle()
            else:
                if vis:
                    annot.set_visible(False)
                    fig.canvas.draw_idle()
    fig.canvas.mpl_connect("motion_notify_event", hover)
    plt.show()
if False:
    # Plot 3: show the original image on mouse-over.
    img = mnist_img[0, :].reshape((width, width))
    imagebox = OffsetImage(img, zoom=1.0)
    imagebox.image.axes = ax
    sc = plt.scatter(x_embedded[:, 0], x_embedded[:, 1], c=mnist_int/10.0, cmap=cmap, s=3)
    annot = AnnotationBbox(imagebox, xy=(0,0), xybox=(width,width),
                           xycoords="data", boxcoords="offset points", pad=0.5,
                           arrowprops=dict(arrowstyle="->", connectionstyle="arc3,rad=-0.3"))
    annot.set_visible(False)
    ax.add_artist(annot)
    def update_annot(ind):
        i = ind["ind"][0]
        pos = sc.get_offsets()[i]
        annot.xy = (pos[0], pos[1])
        img = mnist_img[i, :].reshape((width, width))
        imagebox.set_data(img)
    def hover(event):
        vis = annot.get_visible()
        if event.inaxes == ax:
            cont, ind = sc.contains(event)
            if cont:
                update_annot(ind)
                annot.set_visible(True)
                fig.canvas.draw_idle()
            else:
                if vis:
                    annot.set_visible(False)
                    fig.canvas.draw_idle()
    fig.canvas.mpl_connect("motion_notify_event", hover)
    plt.show()