[Introduction to Computer Science No. 0: Try Machine Learning] Let's implement k-means clustering in Java

Introduction

Hello, this is Sumiyama water.

I'm thinking of starting a series here on Qiita, writing things up while I review the clustering work I used to do.

To begin with, I would like to explain k-means clustering, a basic method in the kind of data classification known as data mining or clustering. I will do this over several installments while gradually implementing it in Java (under the pretext of reviewing it myself).

In this 0th installment, I will cover only the introduction and the prerequisites. The concrete explanation will probably start next time.

What I will and won't cover, and other notes

In this series, I will not explain the various kinds of "data mining" and "clustering" or what they are used for. The goal is simply to implement one method, k-means clustering, all the way through.

The idea is to get a feel for the technique by working hands-on rather than by accumulating knowledge classroom-style.

For that reason, I will implement everything myself without using an existing analysis library.

Also, I learned about this area more than 10 years ago and haven't kept up since, so please be aware that the information may be out of date.

Prerequisites

I will assume that you have some knowledge of the programming language. Also, I admit to an ulterior motive: I want to play with Spring Boot, which I have recently been using at work, in my private time too, so I will proceed on top of Spring Boot.

That said, I won't be writing business logic, so Spring Boot itself will rarely come up, and the code will be almost entirely plain Java. If I use an annotation without explaining it first, please forgive me.

What can the k-means method do?

The detailed logic will be covered from next time onward, so this is only a very rough overview.

image.png

Data like this

image.png

It can be classified like this.

In the figure, I labeled the axes X and Y arbitrarily, but it may help to imagine something like "the purchase amount and time of day for customers of a certain convenience store".

Of course, real data is rarely separated this neatly. And with this number of samples, a human can see the groups just by looking. Even so, what we need is a technique that lets the computer find those groups without any prior information.

Once the amount of data grows, or there are more axes than just X and Y, you need the power of a computer.
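To give a rough feel for what "letting the computer find the groups" might look like, here is a minimal sketch of the general k-means idea: repeatedly assign each point to its nearest center, then move each center to the mean of its points. This is not the implementation this series will build. The names `KMeansSketch` and `cluster`, the sample points, and the choice to use the first k points as the initial centers are all assumptions made purely for illustration.

```java
import java.util.Arrays;

// A hypothetical, minimal k-means sketch for illustration only.
class KMeansSketch {

    // Returns, for each point, the index (0..k-1) of the cluster it ends up in.
    static int[] cluster(double[][] points, int k, int maxIterations) {
        int dims = points[0].length;

        // Assumption: take the first k points as the initial centers.
        // Practical implementations usually choose them randomly.
        double[][] centers = new double[k][];
        for (int c = 0; c < k; c++) {
            centers[c] = points[c].clone();
        }

        int[] labels = new int[points.length];
        Arrays.fill(labels, -1);

        for (int iter = 0; iter < maxIterations; iter++) {
            // Step 1: assign each point to its nearest center.
            boolean changed = false;
            for (int i = 0; i < points.length; i++) {
                int nearest = 0;
                for (int c = 1; c < k; c++) {
                    if (squaredDistance(points[i], centers[c])
                            < squaredDistance(points[i], centers[nearest])) {
                        nearest = c;
                    }
                }
                if (labels[i] != nearest) {
                    labels[i] = nearest;
                    changed = true;
                }
            }
            if (!changed) {
                break; // assignments are stable, so stop
            }

            // Step 2: move each center to the mean of its assigned points.
            double[][] sums = new double[k][dims];
            int[] counts = new int[k];
            for (int i = 0; i < points.length; i++) {
                counts[labels[i]]++;
                for (int d = 0; d < dims; d++) {
                    sums[labels[i]][d] += points[i][d];
                }
            }
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) {
                    continue; // leave the center of an empty cluster where it is
                }
                for (int d = 0; d < dims; d++) {
                    centers[c][d] = sums[c][d] / counts[c];
                }
            }
        }
        return labels;
    }

    // Squared Euclidean distance; works for any number of axes.
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int d = 0; d < a.length; d++) {
            double diff = a[d] - b[d];
            sum += diff * diff;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Made-up sample data: two loose groups on an X-Y plane.
        double[][] points = {
            {1.0, 1.0}, {9.0, 9.0}, {1.5, 2.0}, {8.0, 8.5}, {1.0, 0.5}, {9.5, 8.0}
        };
        System.out.println(Arrays.toString(cluster(points, 2, 100)));
        // prints [0, 1, 0, 1, 0, 1]
    }
}
```

Because the points are plain `double[]` arrays, the same sketch works unchanged when there are more axes than X and Y, which is exactly the case where eyeballing the data stops being practical.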

Next time

This time, I briefly covered the prerequisites and what k-means can do.

Starting next time, I would like to implement it step by step while explaining the parts the classification logic actually needs.

Next time: [Introduction to Computer Science Part 1: Let's try machine learning] Let's implement k-means clustering in Java-About the concept of coordinates-
