[Roughly] Clustering by KMeans

Overview

Approach

  1. Store the data in a pandas DataFrame
  2. Decide on the number of clusters
  3. Run KMeans clustering
  4. Output the clustering result (cluster sizes and per-cluster means)
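Step 2 is often the least obvious one. As a rough sketch of one common way to pick the number of clusters, the example below fits KMeans for several candidate values of k and compares the inertia and the silhouette score. The synthetic data from make_blobs, the column names, and the candidate range 2 to 8 are all assumptions made for illustration; they are not part of the original post.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Hypothetical stand-in data: 300 points around 4 well-separated centers
X, _ = make_blobs(
    n_samples=300,
    centers=[[0, 0], [10, 10], [0, 10], [10, 0]],
    cluster_std=0.5,
    random_state=0,
)
df = pd.DataFrame(X, columns=["x", "y"])

scores = {}
for k in range(2, 9):  # candidate numbers of clusters (an arbitrary range)
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(df)
    # inertia_: within-cluster sum of squared distances (lower is tighter)
    # silhouette_score: between -1 and 1 (higher means better separated)
    scores[k] = silhouette_score(df, model.labels_)
    print(k, round(model.inertia_, 1), round(scores[k], 3))

# Pick the k with the highest silhouette score
best_k = max(scores, key=scores.get)
print("best k by silhouette:", best_k)
```

On real data the scores are rarely this clear-cut, so it usually helps to look at the whole table of values rather than only the maximum.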

Code

import pandas as pd
import numpy as np
from sklearn.cluster import KMeans
from collections import Counter

# Prepare the DataFrame df here (numeric feature columns only)

num_clus = 4  # Number of clusters
kmeans = KMeans(n_clusters=num_clus, random_state=0).fit(df)

print(Counter(kmeans.labels_))  # Number of rows (e.g. people) in each cluster

df['cluster_id'] = kmeans.labels_  # Add the cluster ID to the original DataFrame

for i in range(num_clus):  # Print the mean of each column for each cluster
    print(df[df['cluster_id'] == i].mean())
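The code above leaves df as a placeholder, so it does not run on its own. Below is a minimal self-contained variant, assuming synthetic data from make_blobs and the made-up column names feature_a and feature_b. It also swaps the loop for groupby, which collects the per-cluster means into a single table; that is a stylistic alternative, not a change in what is computed for the feature columns.

```python
import pandas as pd
from collections import Counter
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Hypothetical stand-in for the real data: 200 points around 4 centers
X, _ = make_blobs(
    n_samples=200,
    centers=[[0, 0], [10, 10], [0, 10], [10, 0]],
    cluster_std=0.5,
    random_state=0,
)
df = pd.DataFrame(X, columns=["feature_a", "feature_b"])

num_clus = 4  # Number of clusters
kmeans = KMeans(n_clusters=num_clus, n_init=10, random_state=0).fit(df)

# Number of rows assigned to each cluster
counts = Counter(kmeans.labels_)
print(counts)

# Add the cluster ID only after fitting, so it is not used as a feature
df["cluster_id"] = kmeans.labels_

# One row per cluster, one column per feature
cluster_means = df.groupby("cluster_id").mean()
print(cluster_means)
```

Adding cluster_id after calling fit matters: if the column were already present, KMeans would treat it as one more feature.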
