Extract the dominant colors of an image by k-means clustering

Dominant color is a hue that dominates the overall color scheme. --Google

Extract 5 colors and draw a pie chart of their proportions.

[Example images: three pairs of input.jpg and the resulting out.png]

Packages to use

$ pip install opencv-python
$ pip install scikit-learn
$ pip install matplotlib

Load image

Flatten the image into a list of RGB values, one entry per pixel, so that it can be clustered with k-means.

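import cv2

# cv2.imread returns None instead of raising if the file cannot be read
image = cv2.imread('./input.jpg')
if image is None:
    raise FileNotFoundError('./input.jpg')

# OpenCV loads pixels in BGR order; convert to RGB
rgbs = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# Flatten (height, width, 3) to (height * width, 3): one row per pixel
rgb_list = rgbs.reshape(-1, 3)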
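
To make the shape of the data concrete, here is a minimal sketch in plain Python. The 2x2 image and its pixel values are made up for illustration. It only mimics what the color conversion and flattening do to the data; it is not how OpenCV works internally.

```python
import itertools

# Hypothetical 2x2 image in OpenCV's BGR channel order (rows of pixels).
bgr_image = [
    [[255, 0, 0], [0, 255, 0]],
    [[0, 0, 255], [10, 20, 30]],
]

# Reverse each pixel's channels: BGR -> RGB.
rgb_image = [[pixel[::-1] for pixel in row] for row in bgr_image]

# Flatten the rows into one list of [R, G, B] samples, one per pixel.
rgb_list = list(itertools.chain.from_iterable(rgb_image))

print(rgb_list)
```

Each entry of the flattened list is one sample for the clustering step, so an image of height h and width w yields h * w samples with 3 features each.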

k-means

The number of colors to extract is the number of clusters. This example uses 5:

from sklearn.cluster import KMeans

clusters = KMeans(n_clusters=5).fit(rgb_list)
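
For intuition about what the clustering step does, the following is a toy sketch of the basic k-means loop (Lloyd's algorithm) in plain Python. The function name `simple_kmeans`, the sample pixels, and the "first k points" initialization are assumptions made for illustration; scikit-learn's `KMeans` uses a more careful initialization and an optimized implementation, so this is not a substitute for it.

```python
def squared_distance(a, b):
    # Squared Euclidean distance between two RGB points.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def simple_kmeans(points, k, iterations=20):
    """Toy k-means; returns (centers, labels). Hypothetical helper."""
    # Simplistic initialization: use the first k points as centers.
    centers = [list(p) for p in points[:k]]
    labels = [None] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        new_labels = [
            min(range(k), key=lambda i: squared_distance(p, centers[i]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing: converged
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for i in range(k):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:  # keep the old center if a cluster is empty
                centers[i] = [sum(ch) / len(members) for ch in zip(*members)]
    return centers, labels

# Made-up pixels: three dark blue ones and three near-white ones.
pixels = [[10, 10, 40], [12, 8, 44], [8, 12, 36],
          [250, 250, 250], [245, 255, 250], [255, 245, 250]]
centers, labels = simple_kmeans(pixels, k=2)
print(centers)
print(labels)
```

The two centers end up at the mean color of each group, which is the sense in which a cluster center represents a dominant color.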

The center of each cluster is one of the dominant colors

colors = clusters.cluster_centers_
[[ 25.29093216 119.84721127 142.13737995]
 [223.23362209 201.96734673 193.59849205]
 [176.3426999  108.01350558 118.93074255]
 [  8.36396613  14.71480369  27.54413049]
 [ 98.95068783  32.240443    48.93265647]]
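
The centers are floating-point RGB values, which are awkward to read or to paste into a design tool. As a small sketch, they could be converted to hex color codes like this. The helper name `center_to_hex` is hypothetical; the input values are the centers printed above.

```python
def center_to_hex(center):
    # Round each channel to the nearest integer and clamp it to 0..255.
    r, g, b = (min(255, max(0, round(value))) for value in center)
    return '#{:02x}{:02x}{:02x}'.format(r, g, b)

centers = [
    [25.29093216, 119.84721127, 142.13737995],
    [223.23362209, 201.96734673, 193.59849205],
    [176.3426999, 108.01350558, 118.93074255],
    [8.36396613, 14.71480369, 27.54413049],
    [98.95068783, 32.240443, 48.93265647],
]

hex_colors = [center_to_hex(c) for c in centers]
print(hex_colors)
# ['#19788e', '#dfcac2', '#b06c77', '#080f1c', '#632031']
```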

Calculate the percentage of each cluster

import numpy as np

def cluster_percents(labels):
    total = len(labels)
    percents = []
    # np.unique returns the labels in ascending order, so percents[i]
    # lines up with clusters.cluster_centers_[i]
    for i in np.unique(labels):
        percent = (np.count_nonzero(labels == i) / total) * 100
        percents.append(round(percent, 2))
    return percents

percents = cluster_percents(clusters.labels_)
[9.16, 9.6, 11.51, 48.37, 21.35]
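
The percentages only make sense if entry i refers to the same cluster as center i. The sketch below shows one way to make that alignment explicit by counting labels and then reading the counts back in index order. The function name `cluster_percents_from_counts` and the label list are made up for illustration; with real scikit-learn output, passing `clusters.labels_.tolist()` should work.

```python
from collections import Counter

def cluster_percents_from_counts(labels, n_clusters):
    # Count how many pixels carry each label.
    counts = Counter(labels)
    total = len(labels)
    # Read counts in label order 0..n_clusters-1 so that the result
    # is aligned with the list of cluster centers.
    return [round(counts[i] / total * 100, 2) for i in range(n_clusters)]

# Hypothetical labels for 8 pixels and 4 clusters.
labels = [0, 3, 3, 1, 3, 0, 2, 3]
percents = cluster_percents_from_counts(labels, n_clusters=4)
print(percents)
# [25.0, 12.5, 12.5, 50.0]
```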

Draw a pie chart

Scale the cluster centers to the range 0 to 1, because matplotlib expects RGB color values in that range.

import matplotlib.pyplot as plt

colors = clusters.cluster_centers_ / 255
colors = colors.tolist()

Sort the proportions from largest to smallest, keeping each color paired with its percentage, so that the pie chart looks tidy.

percents = cluster_percents(clusters.labels_)
tup = zip(colors, percents)
sorted_tup = sorted(tup, key=lambda n: n[1], reverse=True)
sorted_colors = [c for c,p in sorted_tup]
sorted_percents = [p for c,p in sorted_tup]
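
As a quick check of the pairing logic, here is the same sort applied to the percentages printed earlier, with placeholder strings standing in for the colors. The names `c0` to `c4` are hypothetical labels for clusters 0 to 4, used only so the reordering is easy to follow.

```python
percents = [9.16, 9.6, 11.51, 48.37, 21.35]
colors = ['c0', 'c1', 'c2', 'c3', 'c4']  # placeholders for the RGB triples

# Sort (color, percent) pairs by percent, largest first.
pairs = sorted(zip(colors, percents), key=lambda pair: pair[1], reverse=True)

# Unzip the sorted pairs back into two parallel sequences.
sorted_colors, sorted_percents = zip(*pairs)

print(sorted_colors)    # ('c3', 'c4', 'c2', 'c1', 'c0')
print(sorted_percents)  # (48.37, 21.35, 11.51, 9.6, 9.16)
```

Sorting the pairs rather than the two lists separately is what keeps each slice of the pie in its own color.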

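Plot the sorted proportions in their cluster colors and save the chart:

plt.pie(sorted_percents, colors=sorted_colors, counterclock=False, startangle=90)
# Write the chart to a file (use plt.show() instead to display it)
plt.savefig('./out.png')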

