epilys

Mapping concepts to colors (terribly) with the Oklab perceptual colorspace

TL;DR

Oklab HSL RGB
[‘0.45’, ‘-0.04’, ‘-0.06’] [‘200.00’, ‘0.44’, ‘0.32’] [‘46.00’, ‘92.00’, ‘120.00’]
#2d5b76
Oklab HSL RGB
[‘0.60’, ‘0.18’, ‘0.04’] [‘350.00’, ‘0.66’, ‘0.54’] [‘220.00’, ‘62.00’, ‘95.00’]
#d73e5f
Oklab HSL RGB
[‘0.56’, ‘-0.07’, ‘-0.05’] [‘190.00’, ‘0.49’, ‘0.39’] [‘51.00’, ‘130.00’, ‘150.00’]
#338193
Oklab HSL RGB
[‘0.45’, ‘-0.02’, ‘-0.16’] [‘220.00’, ‘0.70’, ‘0.40’] [‘30.00’, ‘79.00’, ‘180.00’]
#1e4eaf
Oklab HSL RGB
[‘0.53’, ‘-0.08’, ‘0.03’] [‘150.00’, ‘0.36’, ‘0.36’] [‘59.00’, ‘120.00’, ‘89.00’]
#3a7c59
Oklab HSL RGB
[‘0.32’, ‘-0.00’, ‘-0.15’] [‘230.00’, ‘0.70’, ‘0.29’] [‘23.00’, ‘34.00’, ‘130.00’]
#16217e
Oklab HSL RGB
[‘0.76’, ‘-0.11’, ‘0.11’] [‘93.00’, ‘0.49’, ‘0.56’] [‘140.00’, ‘200.00’, ‘88.00’]
#89c658
Oklab HSL RGB
[‘0.77’, ‘0.05’, ‘0.11’] [‘31.00’, ‘0.77’, ‘0.63’] [‘230.00’, ‘160.00’, ‘86.00’]
#e9a156
Results for “programming”

What’s a colorspace and what’s “Perceptually uniform”?

A colorspace is basically a way to describe colors with numeric attributes. The well-known RGB colorspace describes each color by its Red, Green and Blue components.

If a space has three attributes, we can view them as coordinates in a 3D space (any n attributes can be viewed as a vector in an n-dimensional space). Then we define color distance as the usual Euclidean distance we use for tangible stuff in the real world.
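As a minimal sketch of that idea, here is the distance between colors treated as plain 3D points. The function name `color_distance` and the sample colors are made up for illustration; in RGB this number does not track perceived difference well, which is the motivation for what follows.

```python
import math


def color_distance(c1, c2):
    """Euclidean distance between two colors given as coordinate tuples."""
    return math.dist(c1, c2)


red = (255, 0, 0)
dark_red = (200, 0, 0)
blue = (0, 0, 255)

print(color_distance(red, dark_red))  # 55.0
print(color_distance(red, blue))      # sqrt(255**2 + 255**2), about 360.6
```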

A perceptually uniform colorspace aims to have the following property: “identical spatial distance between two colors equals identical amount of perceived color difference”. The actual definitions of those terms can be found in color science books and research.

Oklab is a perceptual color space designed by Björn Ottosson to make working with colors in image processing easier. After reading the introductory blog post, I wondered if I could apply it to finding dominant colors of an image.
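To make the conversion concrete, here is a rough sketch of going from 8-bit sRGB to Oklab in plain Python. The matrix constants are the ones from Ottosson's published reference implementation; the function names are hypothetical, and this is not how colorio (used in the sample code at the end) does it internally.

```python
def srgb_to_linear(channel):
    """Undo the sRGB transfer curve for one channel in the range 0..1."""
    if channel <= 0.04045:
        return channel / 12.92
    return ((channel + 0.055) / 1.055) ** 2.4


def rgb255_to_oklab(rgb):
    """Convert an (R, G, B) tuple with values 0..255 to Oklab (L, a, b)."""
    r, g, b = (srgb_to_linear(v / 255.0) for v in rgb)

    # Linear sRGB to an LMS-like cone response
    l = 0.4122214708 * r + 0.5363325363 * g + 0.0514459929 * b
    m = 0.2119034982 * r + 0.6806995451 * g + 0.1073969566 * b
    s = 0.0883024619 * r + 0.2817188376 * g + 0.6299787005 * b

    # Cube root non-linearity (inputs are non-negative for in-gamut colors)
    l_, m_, s_ = l ** (1 / 3), m ** (1 / 3), s ** (1 / 3)

    return (
        0.2104542553 * l_ + 0.7936177850 * m_ - 0.0040720468 * s_,
        1.9779984951 * l_ - 2.4285922050 * m_ + 0.4505937099 * s_,
        0.0259040371 * l_ + 0.7827717662 * m_ - 0.8086757660 * s_,
    )


# #2d5b76, the first swatch in the TL;DR, should land near its listed Oklab values
print(rgb255_to_oklab((0x2D, 0x5B, 0x76)))
```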

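Oklab has three coordinates: L (perceived lightness), a (how green or red the color is) and b (how blue or yellow the color is).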

Uniformly sampling the Oklab colorspace in 8 parts per coordinate.
Uniformly sampling the Oklab colorspace in 16 parts per coordinate.

Dominant colors

I guess we would take an image and average all colors. What would that produce?

#d2c6b6

Terrible. Obviously this approach can’t work when a picture contains several distinct colors. If the picture were mostly one color it’d be somewhat useful:

#94706f
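A toy example of why plain averaging falls apart, using made-up pixels rather than a real image: a picture that is half pure red and half pure blue averages to a purple that appears nowhere in it. The helper name `average_color` is hypothetical.

```python
def average_color(colors):
    """Component-wise mean of a list of 3-tuples."""
    n = len(colors)
    return tuple(sum(c[i] for c in colors) / n for i in range(3))


# Half red, half blue "image"
pixels = [(255, 0, 0)] * 50 + [(0, 0, 255)] * 50

print(average_color(pixels))  # (127.5, 0.0, 127.5)
```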

k-means clustering

From signal processing comes this dazzling technique: given a set of colors c, partition them into k buckets as follows:

  1. Initially assign k average colors somehow, for example by picking them randomly. We will incrementally improve on those averages to arrive at a centroid color, i.e. the mean (average) color of a cluster.
  2. Assign every color c to the cluster κ whose average m_κ is closest to it, by calculating the Euclidean distance to each m.
  3. Recalculate each m_κ as the average of the updated cluster κ.
  4. Repeat steps 2 and 3 until the assignments are the same as in the previous iteration. At that point we’ve reached convergence, though the result is not necessarily the optimal clustering.
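The four steps above can be sketched in plain Python as follows. This is an illustrative toy, not the scipy routine the sample code at the end relies on; the name `lloyd_kmeans`, the seeded random initialisation and the `max_iter` cap are all assumptions made for the sketch.

```python
import math
import random


def lloyd_kmeans(points, k, max_iter=100, seed=0):
    """Naive k-means over a list of coordinate tuples.

    Returns (means, assignments), where assignments[i] is the index of
    the mean that points[i] ended up with.
    """
    rng = random.Random(seed)
    # Step 1: pick k of the input points as the initial averages
    means = rng.sample(points, k)
    assignments = None
    for _ in range(max_iter):
        # Step 2: assign every point to its closest average
        new_assignments = [
            min(range(k), key=lambda j: math.dist(p, means[j])) for p in points
        ]
        # Step 4: stop once the assignments no longer change
        if new_assignments == assignments:
            break
        assignments = new_assignments
        # Step 3: recalculate each average from its updated cluster
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:  # an empty cluster keeps its previous average
                means[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return means, assignments


# Two obvious groups of Oklab-like triples: dark neutrals and light warm tones
colors = [
    (0.10, 0.00, 0.00), (0.12, 0.00, 0.00), (0.11, 0.01, 0.00),
    (0.90, 0.10, 0.10), (0.88, 0.10, 0.10), (0.92, 0.10, 0.10),
]
means, assignments = lloyd_kmeans(colors, k=2)
print(means)
print(assignments)
```

Different seeds can give different clusterings on less tidy data, which is the "not necessarily optimal" caveat in practice.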

Since we will use a perceptually uniform colorspace, we expect each cluster’s centroid to be perceptually close to the actual colors it contains.

And since we will be working with lots of sample images, we can calculate the overall dominant colors by putting all the colors together.

Implementation

To visualize the results, I chose to calculate the dominant colors for each image, then calculate the overall dominant colors from those.

I also uniformly split the Oklab colorspace into a grid of colors and assigned all the dominant colors to their nearest grid color, in order to see the difference between the calculated dominant colors and the uniformly sampled ones:

Uniformly partitioning dominant colors.
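A small sketch of that uniform partitioning, assuming an evenly spaced grid from -1 to 1 on every coordinate. The helper names are hypothetical. Because the grid is axis-aligned, snapping each coordinate separately gives the same answer as a nearest-neighbour search over all grid points.

```python
from collections import Counter


def nearest_grid_point(color, steps):
    """Snap each coordinate of a color to the closest grid value."""
    return tuple(min(steps, key=lambda s: abs(s - x)) for x in color)


def grid_histogram(colors, n=8):
    """Count how many colors fall on each point of an n x n x n grid."""
    steps = [-1.0 + 2.0 * i / (n - 1) for i in range(n)]
    return Counter(nearest_grid_point(c, steps) for c in colors)


# Three of the "programming" colors from the TL;DR, on a coarse 5-step grid
sample = [(0.45, -0.04, -0.06), (0.45, -0.02, -0.16), (0.76, -0.11, 0.11)]
hist = grid_histogram(sample, n=5)
for point, count in hist.most_common():
    print(point, count)
```

A coarse grid lumps the two blues together; a finer grid (larger n) would separate them at the cost of many empty buckets.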

The image results for most queries are stock photos or text, hence there is a lot of black and white. We can tell how black or greyscale a color looks from its coordinates. In Oklab, the a and b coordinates of a greyscale color will be close to zero. In HSL (Hue-Saturation-Lightness), a low L value means the color is close to black. We can discard such colors by checking those values.
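One way such a check might look, as a sketch only: it measures how far (a, b) is from zero and, as a swapped-in simplification, uses the Oklab lightness instead of the HSL one to catch near-black colors. The function name and both thresholds are arbitrary assumptions and would need tuning by eye.

```python
import math


def is_colorful(oklab, min_chroma=0.03, min_lightness=0.25):
    """Return False for colors that look grey, white or almost black."""
    L, a, b = oklab
    chroma = math.hypot(a, b)  # distance of (a, b) from the grey axis
    return chroma >= min_chroma and L >= min_lightness


candidates = [
    (0.95, -0.00, 0.01),  # near-white from the "ethics" results
    (0.60, 0.18, 0.04),   # red from the "programming" results
    (0.22, -0.00, 0.02),  # near-black from the "banana" results
]
print([c for c in candidates if is_colorful(c)])  # [(0.6, 0.18, 0.04)]
```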

Results

Searching for non-abstract things such as fruits returns pictures of the things themselves, so we get good results:

Oklab HSL RGB
[‘0.74’, ‘-0.01’, ‘0.12’] [‘48.00’, ‘0.52’, ‘0.52’] [‘200.00’, ‘170.00’, ‘70.00’]
#c4ab46
Oklab HSL RGB
[‘0.38’, ‘-0.02’, ‘0.05’] [‘59.00’, ‘0.38’, ‘0.20’] [‘70.00’, ‘69.00’, ‘31.00’]
#46451f
Oklab HSL RGB
[‘0.89’, ‘-0.04’, ‘0.12’] [‘60.00’, ‘0.63’, ‘0.69’] [‘230.00’, ‘230.00’, ‘130.00’]
#e1e27f
Oklab HSL RGB
[‘0.22’, ‘-0.00’, ‘0.02’] [‘49.00’, ‘0.39’, ‘0.08’] [‘29.00’, ‘26.00’, ‘13.00’]
#1c190c
Oklab HSL RGB
[‘0.62’, ‘-0.00’, ‘0.10’] [‘46.00’, ‘0.51’, ‘0.41’] [‘160.00’, ‘130.00’, ‘51.00’]
#9d8433
Oklab HSL RGB
[‘0.52’, ‘0.00’, ‘0.08’] [‘42.00’, ‘0.42’, ‘0.34’] [‘120.00’, ‘100.00’, ‘51.00’]
#7c6732
Results for “banana” (is that an impostor?)

Searching for pharmaceuticals returns lots of pictures of colorful pills:

Oklab HSL RGB
[‘0.59’, ‘-0.04’, ‘-0.08’] [‘210.00’, ‘0.39’, ‘0.49’] [‘77.00’, ‘130.00’, ‘170.00’]
#4d81ae
Oklab HSL RGB
[‘0.45’, ‘-0.01’, ‘-0.01’] [‘190.00’, ‘0.08’, ‘0.33’] [‘78.00’, ‘88.00’, ‘91.00’]
#4d585a
Oklab HSL RGB
[‘0.43’, ‘-0.02’, ‘-0.15’] [‘220.00’, ‘0.66’, ‘0.37’] [‘33.00’, ‘72.00’, ‘160.00’]
#20479d
Oklab HSL RGB
[‘0.84’, ‘-0.02’, ‘-0.05’] [‘210.00’, ‘0.59’, ‘0.81’] [‘180.00’, ‘210.00’, ‘230.00’]
#b1cfea
Oklab HSL RGB
[‘0.54’, ‘0.15’, ‘0.07’] [‘360.00’, ‘0.50’, ‘0.49’] [‘190.00’, ‘61.00’, ‘62.00’]
#ba3d3e
Oklab HSL RGB
[‘0.58’, ‘-0.10’, ‘0.05’] [‘140.00’, ‘0.36’, ‘0.40’] [‘66.00’, ‘140.00’, ‘88.00’]
#418b58
Results for “pharmaceuticals”

Searching for ethics returns pictures of signs that point to stuff such as “Right” and “Wrong” and “Principles”:

Oklab HSL RGB
[‘0.74’, ‘-0.07’, ‘0.11’] [‘79.00’, ‘0.40’, ‘0.53’] [‘150.00’, ‘180.00’, ‘88.00’]
#99b757
Oklab HSL RGB
[‘0.71’, ‘-0.04’, ‘-0.06’] [‘200.00’, ‘0.43’, ‘0.62’] [‘120.00’, ‘170.00’, ‘200.00’]
#76a6c8
Oklab HSL RGB
[‘0.95’, ‘-0.00’, ‘0.01’] [‘64.00’, ‘0.16’, ‘0.92’] [‘240.00’, ‘240.00’, ‘230.00’]
#edede6
Oklab HSL RGB
[‘0.68’, ‘0.02’, ‘0.02’] [‘20.00’, ‘0.17’, ‘0.60’] [‘170.00’, ‘150.00’, ‘140.00’]
#a99387
Results for “ethics”

Searching for design returns a boring sea of brown and beige thanks to interior design trends:

Oklab HSL RGB
[‘0.64’, ‘0.01’, ‘0.02’] [‘34.00’, ‘0.10’, ‘0.53’] [‘150.00’, ‘140.00’, ‘120.00’]
#93887b
Oklab HSL RGB
[‘0.79’, ‘0.01’, ‘0.02’] [‘35.00’, ‘0.17’, ‘0.71’] [‘190.00’, ‘180.00’, ‘170.00’]
#c2b7a9
Oklab HSL RGB
[‘0.84’, ‘0.00’, ‘0.01’] [‘43.00’, ‘0.08’, ‘0.78’] [‘200.00’, ‘200.00’, ‘190.00’]
#cbc8c2
Oklab HSL RGB
[‘0.53’, ‘0.01’, ‘0.02’] [‘28.00’, ‘0.12’, ‘0.41’] [‘120.00’, ‘100.00’, ‘92.00’]
#74675c
Results for “design”

Searching for programming identifies the classic green terminal color along with other syntax highlighting palettes:

Oklab HSL RGB
[‘0.60’, ‘0.18’, ‘0.04’] [‘350.00’, ‘0.66’, ‘0.54’] [‘220.00’, ‘62.00’, ‘95.00’]
#d73e5f
Oklab HSL RGB
[‘0.45’, ‘-0.02’, ‘-0.16’] [‘220.00’, ‘0.70’, ‘0.40’] [‘30.00’, ‘79.00’, ‘180.00’]
#1e4eaf
Oklab HSL RGB
[‘0.76’, ‘-0.11’, ‘0.11’] [‘93.00’, ‘0.49’, ‘0.56’] [‘140.00’, ‘200.00’, ‘88.00’]
#89c658
Oklab HSL RGB
[‘0.77’, ‘0.05’, ‘0.11’] [‘31.00’, ‘0.77’, ‘0.63’] [‘230.00’, ‘160.00’, ‘86.00’]
#e9a156
Results for “programming”

Finally, philosophy returns pictures of books and statues, so the results are predictable and omitted:

Results for “philosophy”

Improving the sample source

I’ve had some luck getting “better” results by searching for “book about {query}” and “book about {query} cover” expecting topical books to share color schemes, like the distinctive palettes O’Reilly uses in its programming books.

I found Google Images to show fewer junk results, but it has no API you can use without an account.

Conclusions and notes

As expected, this doesn’t produce particularly mind-blowing results, since abstract concepts generally lack color associations. Even if you have some type of vision-related synesthesia, the colors you see are usually unique to you.

To get back to the original motivation behind this experiment, which was associating post tags with colors: you can achieve this by clustering the colors already in use, calculating the dominant colors for each new tag, and choosing the one that falls in the smallest cluster. That way you avoid common colors like black/white/blue/orange saturating your tag cloud.
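A rough sketch of that selection step, assuming the cluster centroids for the existing tag colors have already been computed somehow. The function name `pick_tag_color` and the sample values are hypothetical; colors are Oklab triples.

```python
import math
from collections import Counter


def pick_tag_color(candidates, existing, centroids):
    """Pick the candidate whose nearest centroid holds the fewest existing colors."""

    def nearest(color):
        return min(range(len(centroids)),
                   key=lambda i: math.dist(color, centroids[i]))

    # How many already-used colors sit in each cluster
    population = Counter(nearest(c) for c in existing)
    return min(candidates, key=lambda c: population[nearest(c)])


centroids = [
    (0.45, -0.02, -0.16),  # blue
    (0.77, 0.05, 0.11),    # orange
    (0.76, -0.11, 0.11),   # green
]
existing = [(0.45, -0.04, -0.06), (0.59, -0.04, -0.08), (0.77, 0.05, 0.11)]
candidates = [(0.43, -0.02, -0.15), (0.74, -0.07, 0.11)]

print(pick_tag_color(candidates, existing, centroids))  # (0.74, -0.07, 0.11)
```

Here the blue cluster already holds two tag colors and the green one holds none, so the greenish candidate wins over the bluish one.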

Sample code

import decimal
import itertools
from wand.image import Image
import numpy as np
from scipy.cluster.vq import vq, kmeans
import colorio

wand_color_to_arr = lambda c: np.array([c.red_int8, c.green_int8, c.blue_int8])

OKLAB = colorio.cs.OKLAB()
color_abs = lambda v: 0xFF if v > 0xFF else v if v >= 0 else 0

oklab_to_rgb255 = lambda o: OKLAB.to_rgb255(o)
rgb_to_hex = lambda rgb: "#%s" % "".join(("%02x" % p for p in rgb))
oklab_to_hex = lambda o: rgb_to_hex(map(color_abs, map(int, oklab_to_rgb255(o))))

dec_ctx = decimal.Context(prec=2, rounding=decimal.ROUND_HALF_DOWN)
arr_display = lambda arr: ["%.2f" % dec_ctx.create_decimal_from_float(i) for i in arr]


def image_to_colors(img: Image):
    img.thumbnail(200, 200)
    colors = set(c for row in img for c in row)
    ret = []
    for c in colors:
        ret.append(OKLAB.from_rgb255(wand_color_to_arr(c)))
    return ret


class Bucket:
    def __init__(self, rep):
        self.rep = rep
        self.colors = []

    def __len__(self):
        return len(self.colors)

    def append(self, color):
        self.colors.append(color)


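def dominant_colors(oks, n=20):
    """Return up to n k-means centroids of oks, largest cluster first."""
    # kmeans returns (codebook, distortion); only the centroids are needed
    centroids, _ = kmeans(oks, min(n, len(oks)))
    buckets = [Bucket(rep) for rep in centroids]
    # vq returns, for each color, the index of its nearest centroid
    codes, _ = vq(oks, centroids)
    for color, bucket_idx in zip(oks, codes):
        buckets[bucket_idx].append(color)
    # sort dominant colors by cluster size
    buckets.sort(key=len, reverse=True)
    return [b.rep for b in buckets]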


def make_uniform_clusters(oks, n=20):
    def make_grid(n=20):
        code_steps = np.linspace(-1.0, 1.0, num=n)
        return list(itertools.product(code_steps, code_steps, code_steps))

    prod = make_grid(n)
    buckets = [Bucket(rep) for rep in prod]
    _r, _ = vq(oks, prod)
    for idx, c in enumerate(oks):
        bucket_idx = _r[idx]
        buckets[bucket_idx].append(c)
    buckets.sort(key=lambda b: len(b), reverse=True)
    return buckets