The trigger was that while organizing programs I wrote in the past, I came across my old Bezier-curve fitting (curve fitting) code. It felt nostalgic, but I could no longer understand the formula at all.
How could I approximate anything with such a mysterious formula? Past me was impressive.
Apparently it was approximated using the least squares method, so after reviewing it I'd like to explain it in Python. I pulled out my reference book and studied it again. The reference, by the way, is Dr. Kenichi Kanatani's ["An Applied Mathematics Classroom You Can Understand: From Least Squares to Wavelets"](https://www.amazon.co.jp/%E3%81%93%E3%82%8C%E3%81%AA%E3%82%89%E5%88%86%E3%81%8B%E3%82%8B%E5%BF%9C%E7%94%A8%E6%95%B0%E5%AD%A6%E6%95%99%E5%AE%A4%E2%80%95%E6%9C%80%E5%B0%8F%E4%BA%8C%E4%B9%97%E6%B3%95%E3%81%8B%E3%82%89%E3%82%A6%E3%82%A7%E3%83%BC%E3%83%96%E3%83%AC%E3%83%83%E3%83%88%E3%81%BE%E3%81%A7-%E9%87%91%E8%B0%B7-%E5%81%A5%E4%B8%80/dp/4320017382). I recommend this book because it is very easy to understand.
Least squares is a classic technique for approximating complex data with functions, and it is one of the most important foundations of data analysis. Its range of application is very wide: any differentiable function can be fitted, which gives the method both practicality and elegance.
For example, suppose you have 20 points here.
Let's draw the straight line that passes closest to those 20 points.
For the line $f(x) = ax + b$, we find the parameters $a$ and $b$ so that the line passes as close as possible to all the points. The classic way to find $a$ and $b$ is the **least squares method**.
It is called the least squares method because it fits the function to the plotted data so that the **sum of the squares of the errors is minimized**.
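As a tiny illustration of that definition (the points and lines here are made up, not from the article), the quantity being minimized can be written in a few lines of Python:

```python
# The least squares method minimizes the sum of squared vertical errors
# between the data points and the candidate line f(x) = a*x + b.
def sum_squared_error(points, a, b):
    return sum((y - (a * x + b)) ** 2 for x, y in points)

points = [(0, 1), (1, 3), (2, 5)]
# The line y = 2x + 1 passes through all three points, so its error is 0;
# any other line has a strictly larger error.
print(sum_squared_error(points, 2, 1))  # 0
print(sum_squared_error(points, 2, 0))  # 3
```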
First, let's try the simplest case, a straight-line fit, and draw a line approximating the following four points.
```python
points = [(107, 101), (449, 617), (816, 876), (1105, 1153)]
```
The line's function is $f(x) = ax + b$, where $a$ is the slope and $b$ is the intercept. We find the parameters $a$ and $b$ that minimize the squared error between $f(x)$ and the specified points.
Given a target value $y$, the function $j(x, y)$ for the squared error of a single point is:

```math
j(x, y) = (y - (ax + b))^2
```
Since we only need to minimize the sum of $j$ over all $N$ points, the least squares objective is:

```math
J = \frac{1}{2}\sum_{k=1}^{N} j(x_k, y_k)
```
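Written out for the straight line, with the per-point error $j$ substituted in, the objective becomes:

```math
J(a, b) = \frac{1}{2}\sum_{k=1}^{N} \left(y_k - (a x_k + b)\right)^2
```

The best $a$ and $b$ are where $J$ is smallest, which is where both of its partial derivatives vanish.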
To find the minimum of $J$ we need its **partial derivatives**. That sounds like an esoteric word, but don't worry: once you know the mechanical rules, even an elementary school student can do the calculation.
Differentiating a function of two or more variables with respect to just one of them is called **partial differentiation**, and the function obtained this way is called a **partial derivative**.
The power rule for differentiation is $(x^n)' = nx^{n-1}$: the exponent $n$ comes down as a coefficient and the exponent decreases to $n-1$.
Example: $(x^3)' = 3x^2$.
Differentiation gives the slope of the tangent to a graph, which can be rephrased as finding its instantaneous rate of change. For a convex graph, a point where the derivative is 0 is an extremum, which is exactly what minimization needs.
If you understand ordinary derivatives, there is nothing fundamentally new about partial derivatives. And even if the details feel difficult, it's fine: we have Sympy (and Jupyter)!
If you don't know Sympy, please read "Create the strongest calculator environment with Sympy + Jupyter".
Let's start Jupyter right away.
```python
from sympy import *
init_session()
```
This completes the Sympy and Jupyter setup (`init_session` also predefines common symbols such as `x` and `y`). Now define the error function $j$.
```python
a, b = symbols("a b")
j = (y - (a * x + b)) ** 2
```
You can compute partial derivatives with the `sympy.diff` function; let's start with $a$. The partial derivative with respect to $a$ is written $\frac{\partial j}{\partial a}$, and the one with respect to $b$ is written $\frac{\partial j}{\partial b}$.
```python
j_a = diff(j, a)
j_b = diff(j, b)
```
```math
\frac{\partial j}{\partial a} = - 2 x \left(- a x - b + y\right) \\
\frac{\partial j}{\partial b} = 2 a x + 2 b - 2 y
```
We've found the partial derivatives of the linear function; with Sympy it's an easy win.
Since the least squares objective carries a factor of $\frac{1}{2}$, let's sum the partial derivatives over all points and divide by 2.
Substituting values into an expression is done with the `subs` method.
```python
sum_a = sum([j_a.subs([(x, _x), (y, _y)]) for _x, _y in points]) / 2.
sum_b = sum([j_b.subs([(x, _x), (y, _y)]) for _x, _y in points]) / 2.
```
```math
\frac{1}{2}\sum_{k=1}^{N} \frac{\partial j}{\partial a} = 2099931.0 a + 2477.0 b - 2276721.0 \\
\frac{1}{2}\sum_{k=1}^{N} \frac{\partial j}{\partial b} = 2477.0 a + 4 b - 2747.0
```
Setting each of these sums to zero gives two linear equations, and all that's left is to solve them as simultaneous equations.
```math
\begin{bmatrix}
2099931.0 & 2477.0 \\
2477.0 & 4
\end{bmatrix}
\begin{bmatrix}
a \\
b
\end{bmatrix}
=
\begin{bmatrix}
2276721.0 \\
2747.0
\end{bmatrix}
```
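As a sanity check (not in the original article), the same normal equations can be assembled and solved directly with NumPy:

```python
import numpy as np

points = [(107, 101), (449, 617), (816, 876), (1105, 1153)]
xs = np.array([p[0] for p in points], dtype=float)
ys = np.array([p[1] for p in points], dtype=float)

# Normal equations: [[sum(x^2), sum(x)], [sum(x), N]] @ [a, b] = [sum(x*y), sum(y)]
A = np.array([[np.sum(xs * xs), np.sum(xs)],
              [np.sum(xs),      len(xs)]])
rhs = np.array([np.sum(xs * ys), np.sum(ys)])
a, b = np.linalg.solve(A, rhs)
print(a, b)  # matches the Sympy solution below
```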
```python
solve([sum_a, sum_b], [a, b])
# {a: 1.01694642025091, b: 57.0059292596265}
```
The answer is `a = 1.01694642025091`, `b = 57.0059292596265`. All that's left is to substitute these into the line equation $y = ax + b$.
Let's plot it right away.
Since it is a straight line it cannot pass through every point exactly, but it passes close to all of them.
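The plotting cell isn't shown above, so here is a minimal sketch of it (assumes matplotlib; `a` and `b` are the values found by `solve` above):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs anywhere
import matplotlib.pyplot as plt

points = [(107, 101), (449, 617), (816, 876), (1105, 1153)]
a, b = 1.01694642025091, 57.0059292596265  # parameters found above

xs = [p[0] for p in points]
ys = [p[1] for p in points]
line_x = [min(xs), max(xs)]               # two ends are enough for a line
line_y = [a * x + b for x in line_x]      # y = ax + b at those ends

plt.scatter(xs, ys)       # the data points
plt.plot(line_x, line_y)  # the fitted line
plt.savefig("line_fit.png")
```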
We fitted a simple straight line above, but a cubic Bezier curve can be fitted by exactly the same procedure.
In the Bezier curve, $p_1$ is the start point, $p_4$ is the end point, and $p_2$ and $p_3$ are the control points.
The formula for the cubic Bezier curve is:
```math
bz = p_{1} \left(1 - t\right)^{3} + 3 p_{2} t \left(1 - t\right)^{2} + 3 p_{3} t^{2} \left(1 - t\right) + p_{4} t^{3}
```
The range of t is 0 ≤ t ≤ 1.
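To make the formula concrete, here is a direct plain-Python evaluation of it (a scalar version, as in the article, where each point is a pair of $t$ and a value):

```python
# Direct evaluation of the cubic Bezier formula above.
def bezier(t, p1, p2, p3, p4):
    return ((1 - t) ** 3 * p1
            + 3 * (1 - t) ** 2 * t * p2
            + 3 * (1 - t) * t ** 2 * p3
            + t ** 3 * p4)

# At t = 0 the curve is at the start point p1, at t = 1 at the end point p4;
# the control points p2 and p3 only shape the curve in between.
print(bezier(0.0, 0, 180, -53, 100))  # 0.0   (= p1)
print(bezier(1.0, 0, 180, -53, 100))  # 100.0 (= p4)
```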
With the Bezier curve formula written as $bz$ and the target value as $p_x$, the error function $j$ is as follows.

```math
j = (p_x - bz)^{2}
```
The definition of the least squares objective was as follows.

```math
J = \frac{1}{2}\sum_{k=1}^{N} j(x_k, y_k)
```
The parameters we want to find are $p_2$ and $p_3$, so we take the partial derivative with respect to each. Since the definition of $J$ carries a factor of $\frac{1}{2}$, let's fold that $\frac{1}{2}$ into the partial derivatives in advance.
Let's write it in Sympy.
```python
p0, p1, p2, p3, p4, px, t = symbols("p0 p1 p2 p3 p4 px t")
bz = (1-t)**3*p1 + 3*(1-t)**2*t*p2 + 3*(1-t)*t**2*p3 + t**3*p4
j = (px - bz) ** 2
jp2 = simplify(diff(j, p2) / 2)
jp3 = simplify(diff(j, p3) / 2)
```
Half of the partial derivatives with respect to $p_2$ and $p_3$ are, respectively:
```math
\frac{1}{2} \frac{\partial j}{\partial p_2} = 3 t \left(t - 1\right)^{2} \left(- p_{1} \left(t - 1\right)^{3} + 3 p_{2} t \left(t - 1\right)^{2} - 3 p_{3} t^{2} \left(t - 1\right) + p_{4} t^{3} - px\right) \\
\frac{1}{2} \frac{\partial j}{\partial p_3} = 3 t^{2} \left(t - 1\right) \left(p_{1} \left(t - 1\right)^{3} - 3 p_{2} t \left(t - 1\right)^{2} + 3 p_{3} t^{2} \left(t - 1\right) - p_{4} t^{3} + px\right)
```
Define pairs of the target point $p_x$ and the corresponding $t$. As an example, suppose we have the following four points.
```python
points = [(0, 0), (0.2, 65), (0.7, 45), (1.0, 100)]
```
The start and end points are $p_1$ and $p_4$, so these two are known values. For the unknowns $p_2$ and $p_3$, we set the sums of the partial derivatives to zero and solve the simultaneous equations.
```python
const = ((p1, points[0][1]), (p4, points[-1][1]))
tp2 = sum([jp2.subs(const + ((t, x[0]), (px, x[1]))) for x in points[1:-1]])
tp3 = sum([jp3.subs(const + ((t, x[0]), (px, x[1]))) for x in points[1:-1]])
solve([tp2, tp3], [p2, p3])
# {p2: 180.456349206349, p3: -53.0753968253968}
```
The answer is `p2 = 180.456349206349`, `p3 = -53.0753968253968`.
You can approximate it properly!
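A quick check (reusing the solution above): with two unknowns and only two interior points, the normal equations here have an exact solution, so in this particular case the fitted curve actually passes through all four points:

```python
def bezier(t, p1, p2, p3, p4):
    return ((1 - t) ** 3 * p1
            + 3 * (1 - t) ** 2 * t * p2
            + 3 * (1 - t) * t ** 2 * p3
            + t ** 3 * p4)

# Solution found above for points = [(0, 0), (0.2, 65), (0.7, 45), (1.0, 100)]
p2, p3 = 180.456349206349, -53.0753968253968
for t, px in [(0, 0), (0.2, 65), (0.7, 45), (1.0, 100)]:
    print(t, bezier(t, 0, p2, p3, 100))  # each value comes out equal to px
```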
Let's try various point sets.
```python
points = [(0, 0), (0.3, 50), (0.7, 80), (1.0, 100)]
```
```python
points = [(0, 0), (0.2, 34), (0.4, 44), (0.6, 46), (0.8, 60), (1.0, 100)]
```
Unless the values are very extreme, the curve passes through them properly. Of course there are point sets it cannot pass through exactly, but even then it draws a plausible line.
Finally, here is the full Jupyter cell that fits the Bezier curve with the least squares method and plots it.
```python
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np

p0, p1, p2, p3, p4, px, t = symbols("p0 p1 p2 p3 p4 px t")
bz = (1-t)**3*p1 + 3*(1-t)**2*t*p2 + 3*(1-t)*t**2*p3 + t**3*p4
j = (px - bz) ** 2
jp2 = simplify(diff(j, p2) / 2)
jp3 = simplify(diff(j, p3) / 2)

def plot(points):
    # Start and end points are known values
    const = ((p1, points[0][1]), (p4, points[-1][1]))
    # Sum the halved partial derivatives over the interior points
    tp2 = sum([jp2.subs(const + ((t, x[0]), (px, x[1]))) for x in points[1:-1]])
    tp3 = sum([jp3.subs(const + ((t, x[0]), (px, x[1]))) for x in points[1:-1]])
    # Solve the simultaneous equations for the control points
    vec = solve([tp2, tp3], [p2, p3])
    const = {p1: points[0][1], p2: vec[p2], p3: vec[p3], p4: points[-1][1]}
    bz2 = bz.subs(const)
    x = np.arange(0, 1 + 0.01, 0.01)
    y = [bz2.subs(t, _t) for _t in x]
    plt.plot(x, y)
    plt.scatter([p[0] for p in points], [p[1] for p in points])
```
If you pass the `plot` function a list of points you want the curve to go through, it does its best to pass through them using the least squares method. It is called as follows.
```python
points = [(0, 0), (0.1, -40), (0.5, -50), (0.8, 30), (1.0, 100)]
plot(points)
```
Finally, let's write an implementation for environments without Sympy, such as C/C++.
```math
\frac{1}{2} \frac{\partial j}{\partial p_2} = 3 t \left(t - 1\right)^{2} \left(- p_{1} \left(t - 1\right)^{3} + 3 p_{2} t \left(t - 1\right)^{2} - 3 p_{3} t^{2} \left(t - 1\right) + p_{4} t^{3} - px\right) \\
\frac{1}{2} \frac{\partial j}{\partial p_3} = 3 t^{2} \left(t - 1\right) \left(p_{1} \left(t - 1\right)^{3} - 3 p_{2} t \left(t - 1\right)^{2} + 3 p_{3} t^{2} \left(t - 1\right) - p_{4} t^{3} + px\right)
```
We use only the two formulas calculated above. To extract the coefficient of each term, treat the unknowns $p_2$ and $p_3$ as if they were 1.0 and evaluate each term. I'll write it in a style that ports easily to C/C++.
```python
def lsm(points):
    '''Build the least squares system: coefficient matrix and right-hand side'''
    p1, p4 = points[0][1], points[-1][1]
    a = b = c = d = e = f = 0.
    for _t, _px in points[1:-1]:
        _t_1 = _t - 1.
        _t_1_d = _t_1 ** 2
        _p1 = p1 * _t_1 ** 3          # p1 * (t - 1)^3 (cube, per the formula)
        _p2 = 3 * _t * _t_1_d         # coefficient of p2: 3t(t-1)^2
        _p3 = 3 * _t ** 2 * _t_1      # coefficient of p3: 3t^2(t-1)
        _p4 = p4 * _t ** 3
        _j_p2 = 3 * _t * _t_1_d       # outer factor of (1/2) dj/dp2
        _j_p3 = 3 * _t ** 2 * _t_1    # outer factor of (1/2) dj/dp3
        a += _p2 * _j_p2
        b += -_p3 * _j_p2
        c += -_p2 * _j_p3
        d += _p3 * _j_p3
        e += -((-_p1 + _p4 - _px) * _j_p2)
        f += -((_p1 - _p4 + _px) * _j_p3)
    return [[a, b],
            [c, d]], [e, f]
```
```python
def inv(mat, vec):
    '''Solve the 2x2 simultaneous equations via the inverse matrix'''
    dm = mat[0][0] * mat[1][1] - mat[1][0] * mat[0][1]
    assert dm != 0.0
    a = mat[1][1] / dm
    b = -mat[0][1] / dm
    c = -mat[1][0] / dm
    d = mat[0][0] / dm
    return (vec[0] * a + vec[1] * b,
            vec[0] * c + vec[1] * d)
```
```python
points = [(0, 0), (0.2, 65), (0.7, 45), (1.0, 100)]
inv(*lsm(points))
```
The least squares method is very powerful: it can fit any differentiable function.
Bezier-curve fitting looks like it would be fun to apply in game programming.
If the partial derivatives are computed in advance with Sympy, other languages such as C/C++ only need an algorithm for solving simultaneous equations.
(There are many such algorithms, for example Gauss-Jordan elimination.)
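As a sketch of what that looks like, here is a minimal Gauss-Jordan elimination (no pivot selection, so it assumes no zero ever appears on the diagonal), applied to the 2x2 system from the straight-line fit earlier:

```python
def gauss_jordan(mat, vec):
    '''Minimal Gauss-Jordan elimination sketch (assumes nonzero diagonal).'''
    n = len(mat)
    # Build the augmented matrix [mat | vec].
    aug = [row[:] + [v] for row, v in zip(mat, vec)]
    for i in range(n):
        pivot = aug[i][i]
        aug[i] = [v / pivot for v in aug[i]]  # normalize the pivot row
        for k in range(n):
            if k != i:
                factor = aug[k][i]
                aug[k] = [a - factor * b for a, b in zip(aug[k], aug[i])]
    return [row[-1] for row in aug]

# The 2x2 system from the straight-line fit above:
sol = gauss_jordan([[2099931.0, 2477.0], [2477.0, 4.0]],
                   [2276721.0, 2747.0])
print(sol)  # same a and b as the Sympy solve
```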
After all, Sympy is a very powerful tool that easily simplifies and evaluates formulas with many variables.
I'm not good at hand calculation, so it is very useful.