There is a book called "Deep Learning from Scratch: The Theory and Implementation of Deep Learning Learned with Python". I have read it twice, and I feel like I understand it, yet I don't. Part of the problem is that it is implemented in Python, so as a Java developer I feel somewhat cheated. Because of the dynamic typing, the argument of the same method is sometimes a number and sometimes an array, depending on what the caller passes in ... too tricky ... ~~I should probably just be a good student and learn Deeplearning4j~~ "Fine, let's implement it in Java." This post covers only the implementation, so please refer to the book for the explanations.
Can differentiation and gradients even be implemented in Java in the first place (P97 4.3 Numerical Differentiation / P103 4.4 Gradient)? If not, there is little chance I can do anything else, so I experimented with that first (Java 8 or later).
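For reference, what the book implements (and what the Java code below computes) is the central difference approximation

f'(x) ≈ (f(x + h) − f(x − h)) / (2h)

with h = 1e-4 as the "very small number".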
ArrayUtil.java
import java.util.function.DoubleUnaryOperator;

private static double h = 1e-4; // very small number

// numerical differentiation by central difference: (f(x+h) - f(x-h)) / (2h)
public double numericalDiff(DoubleUnaryOperator func, double x) {
    return (func.applyAsDouble(x + h) - func.applyAsDouble(x - h)) / (2 * h);
}
The test reproduces P103. The results come out exactly as in the book, so I'll call it good.
ArrayUtilTest.java
@Test
public void numericalDiff1() {
    // analytically: d/dx(x^2 + 4^2) = 2x = 6 at x = 3, and d/dx(3^2 + x^2) = 2x = 8 at x = 4
    assertThat(target.numericalDiff(p -> p * p + 4 * 4, 3.0), is(6.00000000000378));
    assertThat(target.numericalDiff(p -> 3 * 3 + p * p, 4.0), is(7.999999999999119));
}
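As an extra sanity check beyond the book's values: the book's function_1 example, f(x) = 0.01x² + 0.1x, has the analytic derivative 0.02x + 0.1, so the numerical result at x = 5 should land very close to 0.2. A minimal sketch, assuming ArrayUtil can simply be instantiated like the target above:

ArrayUtil util = new ArrayUtil();
// f(x) = 0.01x^2 + 0.1x, true derivative f'(x) = 0.02x + 0.1
double d = util.numericalDiff(p -> 0.01 * p * p + 0.1 * p, 5.0);
System.out.println(d); // prints a value very close to 0.2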
Next, I implemented P104 of the book. ~~In the book's (Python) implementation, the original value is saved into tmp_val and written back after the calculation. If you do that in Java, however, the original data still ends up changed, because the reference points to the same array. So I hold on to the original data with a deep copy instead.~~ → I received a comment pointing out that there is no problem as long as you compute immediately after the assignment and then restore the value. Fair enough (a sketch of that in-place variant follows after the listing below).
ArrayUtil.java
import java.util.function.ToDoubleFunction;

private static double h = 1e-4; // very small number

public double[][] numericalGradient(ToDoubleFunction<double[][]> func, double[][] x) {
    int cntRow = x.length;
    int cntCol = x[0].length;
    double[][] result = new double[cntRow][cntCol];
    for (int i = 0; i < cntRow; i++) {
        for (int j = 0; j < cntCol; j++) {
            // perturb only element (i, j), leaving the caller's array untouched
            double[][] xPlus = deepCopy(x);
            xPlus[i][j] = xPlus[i][j] + h;
            double[][] xMinus = deepCopy(x);
            xMinus[i][j] = xMinus[i][j] - h;
            // partial derivative with respect to x[i][j], again by central difference
            result[i][j] = (func.applyAsDouble(xPlus) - func.applyAsDouble(xMinus)) / (2 * h);
        }
    }
    return result;
}
public double[][] deepCopy(double[][] x) {
    double[][] copy = new double[x.length][];
    for (int i = 0; i < copy.length; i++) {
        copy[i] = new double[x[i].length];
        System.arraycopy(x[i], 0, copy[i], 0, x[i].length);
    }
    return copy;
}
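As mentioned above, here is a minimal sketch of the in-place variant suggested in the comment: save the element, perturb it, evaluate immediately, and restore it, so no deep copies are needed. The method name numericalGradientInPlace is mine, not from the original code:

public double[][] numericalGradientInPlace(ToDoubleFunction<double[][]> func, double[][] x) {
    double[][] result = new double[x.length][x[0].length];
    for (int i = 0; i < x.length; i++) {
        for (int j = 0; j < x[i].length; j++) {
            double tmp = x[i][j]; // save the original value (like tmp_val in the book)
            x[i][j] = tmp + h;
            double plus = func.applyAsDouble(x);
            x[i][j] = tmp - h;
            double minus = func.applyAsDouble(x);
            x[i][j] = tmp; // restore immediately, so the caller's array is unchanged
            result[i][j] = (plus - minus) / (2 * h);
        }
    }
    return result;
}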
The test follows P104. Again, the results match the book, so I'll call it good.
ArrayUtilTest.java
@Test
public void numericalGradient() {
    // f(x0, x1) = x0^2 + x1^2, so the analytic gradient is (2*x0, 2*x1)
    ToDoubleFunction<double[][]> function = p -> p[0][0] * p[0][0] + p[0][1] * p[0][1];
    double[][] x = {{3, 4}};
    double[][] result = target.numericalGradient(function, x);
    assertThat(result[0][0], is(6.00000000000378));
    assertThat(result[0][1], is(7.999999999999119));
    result = target.numericalGradient(function, new double[][]{{0, 2}});
    assertThat(result[0][0], is(closeTo(0.0, 0.000001)));
    assertThat(result[0][1], is(closeTo(4.0, 0.000001)));
}
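One natural consumer of numericalGradient is the gradient descent that the book builds on top of it: repeatedly step against the gradient. A minimal sketch, where gradientDescent is my own hypothetical helper (not part of the post's ArrayUtil); starting from (-3.0, 4.0) with learning rate 0.1 for 100 steps on f(x0, x1) = x0² + x1², the result should approach (0, 0):

public double[][] gradientDescent(ToDoubleFunction<double[][]> func, double[][] init,
                                  double lr, int steps) {
    double[][] x = deepCopy(init); // keep the caller's starting point intact
    for (int s = 0; s < steps; s++) {
        double[][] grad = numericalGradient(func, x);
        for (int i = 0; i < x.length; i++) {
            for (int j = 0; j < x[i].length; j++) {
                x[i][j] -= lr * grad[i][j]; // step against the gradient
            }
        }
    }
    return x;
}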
Differentiation and partial differentiation seem to work. Incidentally, I went ahead and implemented everything else as well. The problem is that my PC is slow, so I cannot yet verify whether the final output is actually correct orz