[Python] Extract values from (line) graph images


Do you have a graph image but not the data behind it? If you have the image, you can extract the values from it.

[Image: image.png] pandas 0.7.3 documentation - Plotting with matplotlib
↓
array([-0.4028436 , -0.09518499, 0.21247362, ..., 39.12322275, 39.12322275, 39.12322275])
+
[Image: image.png]

↑ ~~With a graph this finely detailed, you cannot expect much accuracy...~~

Process flow

Select the curve you want by its color
↓
Average in the vertical direction
↓
Interpolate to the number of samples you want
↓
Adjust the scale
↓
Output
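The flow above can be sketched non-interactively on a synthetic image. This is only a minimal sketch, not the code from this article: the function name `extract_curve`, its parameters (`target_rgb`, `y_range`, `n_samples`, `tol`), and the test image are all assumptions made for illustration.

```python
import numpy as np

def extract_curve(img, target_rgb, y_range, n_samples=50, tol=0.05):
    """Hypothetical helper: recover the y-values of one colored curve.

    img        : float array (H, W, 3) in [0, 1], already cropped to the plot area
    target_rgb : color of the curve to pick
    y_range    : (y_min, y_max) of the cropped plot area in data units
    """
    h = img.shape[0]
    # 1. select pixels whose squared color distance to the target is small
    mask = ((img - np.asarray(target_rgb)) ** 2).sum(axis=2) < tol
    cols = np.where(mask.any(axis=0))[0]
    # 2. average the matching row indices in each column (vertical direction)
    rows = np.array([np.where(mask[:, c])[0].mean() for c in cols])
    # 3. interpolate to the requested number of samples
    x = np.linspace(cols.min(), cols.max(), n_samples)
    rows_i = np.interp(x, cols, rows)
    # 4. scale: row 0 is the top (y_max), row h-1 is the bottom (y_min)
    y = y_range[1] - rows_i / (h - 1) * (y_range[1] - y_range[0])
    return x, y

# Synthetic image: white background, red line rising from bottom-left to top-right
h, w = 101, 101
img = np.ones((h, w, 3))
for c in range(w):
    img[h - 1 - c, c] = (1.0, 0.0, 0.0)

x, y = extract_curve(img, (1.0, 0.0, 0.0), y_range=(0.0, 10.0), n_samples=11)
print(y)
```

The interactive cells in this article do the same steps one at a time, with sliders in place of the fixed arguments.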


**You can run it in Colab here**

import numpy as np
import matplotlib.pyplot as plt
from ipywidgets import interact
import requests
from PIL import Image
import io

Load the image data, ignoring the alpha channel for now

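path = "Image path"
im = plt.imread(path)
if im.shape[2] == 4: im = im[:, :, :-1]  # drop the alpha channel
if im.max() > 1: im = im / 255  # scale 8-bit values to [0, 1]; in-place /= fails on integer arrays
h, w, _ = im.shape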
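The normalization step can be checked without an image file. The sketch below assumes a hypothetical helper `to_float_rgb` (not part of this article) and a dummy 8-bit RGBA array. `plt.imread` returns float arrays in the 0-1 range for PNG and integer arrays for other formats, which is why the division is written so that it creates a new float array.

```python
import numpy as np

def to_float_rgb(im):
    """Hypothetical helper: drop an alpha channel if present and scale to [0, 1]."""
    im = np.asarray(im)
    if im.ndim == 3 and im.shape[2] == 4:
        im = im[:, :, :3]      # keep R, G, B only
    if im.max() > 1:
        im = im / 255          # new float array; in-place /= would fail for uint8
    return im.astype(float)

# Dummy 2x2 RGBA image with 8-bit values
rgba = np.full((2, 2, 4), 255, dtype=np.uint8)
rgb = to_float_rgb(rgba)
print(rgb.shape, rgb.dtype, rgb.max())
```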

Crop the image to just the plot area, so that the scale can be adjusted later

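@interact(x_min=(0, w), x_max=(0, w), y_min=(0, h), y_max=(0, h))
def Plot(x_min=0, x_max=w, y_min=0, y_max=h):
    global imag
    plt.figure(figsize=(7, 7))
    # min/max guard against the min and max sliders crossing each other
    imag = im[min(y_min,y_max-1):max(y_min+1, y_max), min(x_min,x_max-1):max(x_min+1, x_max)]
    plt.imshow(imag)  # display call assumed; the listing ends at the slice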

Pick the curve you want to extract by selecting a pixel of its color, and adjust the threshold so that unwanted parts are not included.

@interact(x=(0, imag.shape[1]-1), y=(0, imag.shape[0]-1), thresh=(1,10))  # -1 keeps imag[y, x] in bounds
def Plot(x, y, thresh):
    global p
    # True where the squared RGB distance to the picked pixel is below 1 / 2**thresh
    p = ((imag - imag[y, x]) ** 2).sum(axis=2) < (1 / (1<<thresh))
    plt.plot([x, x], [0, imag.shape[0]], color="r")
    plt.plot([0, imag.shape[1]], [imag.shape[0]-y, imag.shape[0]-y], color="r")
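To see what the threshold slider does: the bound on the squared color distance is 1 / 2**thresh, so each step up halves the tolerance. The following is a small sketch on a hand-made 1x3 image. The helper name `color_mask` is hypothetical and only wraps the same comparison used in the cell above.

```python
import numpy as np

def color_mask(img, y, x, thresh):
    """Hypothetical helper: True where the squared RGB distance to the
    reference pixel img[y, x] is below 1 / 2**thresh."""
    ref = img[y, x]
    dist2 = ((img - ref) ** 2).sum(axis=2)
    return dist2 < 1 / (1 << thresh)

# 1x3 test image: pure blue, slightly lighter blue, white
img = np.array([[[0.0, 0.0, 1.0], [0.2, 0.2, 1.0], [1.0, 1.0, 1.0]]])

loose = color_mask(img, 0, 0, thresh=1)    # bound 0.5: nearby shades also match
tight = color_mask(img, 0, 0, thresh=10)   # bound about 0.001: near-exact match only
print(loose, tight)
```

A low threshold value is more forgiving of anti-aliased edges, while a high one helps separate curves of similar color.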

Take the average in the vertical direction

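p = np.pad(p, 1, "constant")  # pad with False so that row 0 is never selected
sx = np.arange(len(p[0]))[p.argmax(axis=0)!=0]  # columns containing at least one selected pixel
sy = []

for i in p.T:
    j = np.where(i!=0)[0]  # row indices of the selected pixels in this column
    if j.tolist():
        # body missing from the listing; assumed reconstruction: mean row, flipped so 0 is the bottom
        sy.append(p.shape[0] - j.mean())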
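The per-column average can also be written without an explicit loop. The following is a sketch under assumptions: the helper `column_means` is hypothetical, and it returns row indices counted from the top of the mask, without the flip needed to measure height from the bottom.

```python
import numpy as np

def column_means(mask):
    """Hypothetical helper: for each column with at least one True pixel,
    return the column index and the mean row index of its True pixels."""
    counts = mask.sum(axis=0)                    # selected pixels per column
    cols = np.where(counts > 0)[0]               # skip empty columns
    row_idx = np.arange(mask.shape[0])[:, None]  # column vector of row indices
    means = (mask * row_idx).sum(axis=0)[cols] / counts[cols]
    return cols, means

mask = np.zeros((5, 4), dtype=bool)
mask[1:4, 0] = True   # rows 1, 2, 3 in column 0
mask[4, 2] = True     # row 4 in column 2
cols, means = column_means(mask)
print(cols, means)
```

Averaging the rows gives the center of a line that is several pixels thick, which is why a thick or anti-aliased curve still yields one value per column.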

Choose the number of samples, and remove noise with a moving average (convolution)

@interact(sample=(5, 1250), conv_size=(1, 21, 2))
def fit(sample, conv_size):
    global y
    x = np.linspace(sx.min(), sx.max(), sample)
    y = np.convolve(np.pad(np.interp(x, sx, sy), (conv_size-1)//2, "edge"), np.ones(conv_size) / conv_size, "valid")
    plt.plot(x, y)
    plt.ylim(0, len(p))
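The padding in the cell above is what keeps the output the same length as the input: padding by (conv_size-1)//2 on each side with edge values and then convolving in "valid" mode returns exactly one value per sample when conv_size is odd. The following is a small sketch with a hypothetical helper name, `smooth`.

```python
import numpy as np

def smooth(values, conv_size):
    """Moving average with edge padding; conv_size is assumed to be odd."""
    half = (conv_size - 1) // 2
    padded = np.pad(values, half, mode="edge")   # repeat the first/last value
    kernel = np.ones(conv_size) / conv_size      # uniform weights summing to 1
    return np.convolve(padded, kernel, mode="valid")

raw = np.array([0.0, 0.0, 3.0, 0.0, 0.0])
smoothed = smooth(raw, 3)    # the spike is spread over three samples
unchanged = smooth(raw, 1)   # conv_size=1 leaves the data as-is
print(smoothed, unchanged)
```

This is also why the slider steps by 2: an even window size would change the output length.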

Enter the y-axis range of the plot area that you cropped in the first step

yl = list(map(float, input("Y-range of trimmed graph (min,max)? ").split(",")))  # float also accepts non-integer limits

Scale adjustment, output

y_out = y * (yl[1] - yl[0]) / p.shape[0] + yl[0]
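The scale adjustment is a linear map from pixel height to data units. The following is a sketch, assuming the height is measured in pixels from the bottom of the cropped area. The helper `pixels_to_data` and its argument names are hypothetical.

```python
import numpy as np

def pixels_to_data(y_px, height_px, y_min, y_max):
    """Hypothetical helper: map a height in pixels (0 = bottom of the cropped
    area, height_px = top) linearly onto the data range [y_min, y_max]."""
    return np.asarray(y_px) * (y_max - y_min) / height_px + y_min

# A 100-pixel-tall plot area whose y-axis runs from -10 to 40
vals = pixels_to_data([0, 50, 100], height_px=100, y_min=-10, y_max=40)
print(vals)
```

The accuracy of the result depends directly on how closely the crop matches the axis limits you enter.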

Graph output


↑ The code above is written for Jupyter, so it will not run unless each part is placed in its own cell.


Making the cropping and color selection more interactive with HTML would make this easier to use.

