Reference Palette Corpus Analysis

Status: draft
Scope: structural analysis of the stabilized categorical reference corpus before pairwise Leonardo review
Source: Reference Categorical Palette Corpus, Reference Categorical Palette Sheet

This note summarizes the reference corpus at a structural level before doing pairwise accessibility analysis. The goal is to understand how the corpus tends to open, how the first four slots are commonly organized, and what broad strategy families show up across BI tools, libraries, and editorial systems.

Dataset

27 palette rows
categories represented: BI and Analytics, OSS and Chart Libraries, Editorial and Journalism
exact duplicate rows already collapsed in the source corpus: Tableau / Vega and D3 / RAWGraphs

First-Color Bias

The clearest pattern in the corpus is strong blue-first ordering.

20 of 27 palettes are blue-first, about 74%
2 are cyan-first: Apache Superset, Datawrapper
2 are green-first: Qlik Sense, Grafana (light theme)
1 is red-first: Apache ECharts
1 is pink-first: Financial Times
1 is neutral-first: Reuters Graphics

This means the corpus is not merely "often blue." It is overwhelmingly blue-led at slot 1, especially for general-purpose tools. That makes blue-first the baseline convention rather than an incidental trend.
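The shares behind that claim can be reproduced directly from the counts listed above. This is a minimal sketch that only restates the note's own numbers; the variable names are illustrative.

```python
from collections import Counter

# First-slot hue counts as reported in this note (27 palette rows in total).
first_slot = Counter(
    {"blue": 20, "cyan": 2, "green": 2, "red": 1, "pink": 1, "neutral": 1}
)

total = sum(first_slot.values())
shares = {hue: count / total for hue, count in first_slot.most_common()}

# Print each lead hue with its row count and share of the corpus.
for hue, share in shares.items():
    print(f"{hue:<8}{first_slot[hue]:>3} of {total}  {share:.0%}")

# Everything that does not open on blue, taken together.
non_blue = total - first_slot["blue"]
print(f"non-blue leads: {non_blue} of {total} ({non_blue / total:.0%})")
```

Blue accounts for 20 of 27 rows (about 74%), and all other lead hues combined account for 7 of 27 (about 26%).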

By category:

BI and Analytics is especially blue-led. 9 of 12 rows open on blue.
OSS and Chart Libraries also leans blue-first. 6 of 8 rows open on blue.
Editorial and Journalism is more varied. It still has several blue-first rows, but also includes cyan-first, pink-first, and neutral-first systems.

First-Step Blue Families

Within the blue-first majority, the first swatches are not all doing the same job. Looking at them in OKLCH separates them into three practical families:

Muted Reference Blues: lower-chroma, steadier blues that read as trustworthy defaults.
Electric Indigo Blues: more saturated and more purple-leaning. They feel product-bright and digitally assertive.
Bright Sky Blues: lighter and cleaner, often pushing upward in lightness while retaining moderate chroma.
Figure: First-step blue groups in OKLCH (PNG)
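To make the OKLCH grouping concrete, the sketch below converts an sRGB hex value to OKLCH using the published OKLab matrices and then sorts a blue into one of the three families. The conversion is standard; the three cut-off constants and the `blue_family` helper are hypothetical, chosen only to illustrate the split. They are not the thresholds used to produce the figure.

```python
import math


def srgb_hex_to_oklch(hex_color: str) -> tuple[float, float, float]:
    """Convert an sRGB hex string such as '#0000ff' to (L, C, hue_degrees)."""
    value = hex_color.lstrip("#")
    channels = [int(value[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    # Undo the sRGB transfer curve to get linear-light RGB.
    r, g, b = (
        c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
        for c in channels
    )
    # Linear sRGB -> LMS cone response, then cube root (OKLab definition).
    l = (0.4122214708 * r + 0.5363325363 * g + 0.0514459929 * b) ** (1 / 3)
    m = (0.2119034982 * r + 0.6806995451 * g + 0.1073969566 * b) ** (1 / 3)
    s = (0.0883024619 * r + 0.2817188376 * g + 0.6299787005 * b) ** (1 / 3)
    # LMS' -> OKLab lightness and the two opponent axes.
    lightness = 0.2104542553 * l + 0.7936177850 * m - 0.0040720468 * s
    a_axis = 1.9779984951 * l - 2.4285922050 * m + 0.4505937099 * s
    b_axis = 0.0259040371 * l + 0.7827717662 * m - 0.8086757660 * s
    # Polar form: chroma is the radius, hue is the angle in degrees.
    chroma = math.hypot(a_axis, b_axis)
    hue = math.degrees(math.atan2(b_axis, a_axis)) % 360
    return lightness, chroma, hue


# HYPOTHETICAL cut-offs, for illustration only.
INDIGO_MIN_HUE = 265.0      # more purple-leaning than a plain blue
INDIGO_MIN_CHROMA = 0.15    # "more saturated"
SKY_MIN_LIGHTNESS = 0.65    # "pushing upward in lightness"


def blue_family(lightness: float, chroma: float, hue: float) -> str:
    """Assign an already-blue swatch to one of the three practical families."""
    if hue >= INDIGO_MIN_HUE and chroma >= INDIGO_MIN_CHROMA:
        return "electric indigo"
    if lightness >= SKY_MIN_LIGHTNESS:
        return "bright sky"
    return "muted reference"
```

A first swatch would be classified with `blue_family(*srgb_hex_to_oklch(hex_value))`. Any real use would need cut-offs tuned against the actual first-slot swatches in the corpus.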

This split is useful because "blue-first" turns out to hide multiple design intents:

a muted reference/default blue
an electric app/brand blue
a lighter sky or cyan-leaning blue

That is helpful for DFT, because the main design question is not just whether the first color should be blue. It is what kind of blue it should be.

First-Four Slot Patterns

The first four positions are much more structured than later slots.

Slot 1

blue dominates decisively
non-blue leads are unusual enough to feel like a deliberate stance

Slot 2

the most common second-slot hues are red (8 of 27 rows) and orange (7 of 27)
blue sometimes repeats in slot 2, but that is less common and often more brand-led than category-led

Slot 3

green is the most common third-slot hue (8 of 27 rows)
red (6) and orange (5) are the next most common

Slot 4

red is the most common fourth-slot hue (8 of 27 rows), but the field broadens
cyan (6), orange (5), and purple (4) become much more normal here

Slot 5

there is no single dominant fifth-slot behavior: blue, cyan, green, purple, and yellow each appear in 4 of 27 rows
by slot 5 the corpus starts to fan out into multiple strategies rather than one shared convention

The practical takeaway is that the corpus has a strong shared logic in slots 1-4 and much weaker consensus after that. That supports using slots 1-5 as the main scope for deeper evaluation: the shared opening core plus the first slot where the corpus fans out.
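One way to put a number on "shared logic" is the share of rows covered by the most common hue groups in each slot. The sketch below uses per-slot totals tallied from the transition counts in the Sankey View data of this note. The top-k share measure and the `top_share` helper are an assumption made for illustration, not a metric the corpus defines.

```python
# Per-slot hue totals, tallied from the transition counts in the Sankey View data.
slot_totals = {
    1: {"Blue": 20, "Cyan": 2, "Green": 2, "Red": 1, "Pink": 1, "Neutral": 1},
    2: {"Red": 8, "Orange": 7, "Blue": 4, "Green": 3, "Cyan": 2,
        "Purple": 1, "Yellow": 1, "Neutral": 1},
    3: {"Green": 8, "Red": 6, "Orange": 5, "Cyan": 3, "Blue": 2,
        "Neutral": 2, "Purple": 1},
    4: {"Red": 8, "Cyan": 6, "Orange": 5, "Purple": 4, "Blue": 2,
        "Green": 1, "Yellow": 1},
    5: {"Blue": 4, "Cyan": 4, "Green": 4, "Purple": 4, "Yellow": 4,
        "Orange": 3, "Pink": 2, "Red": 1, "Neutral": 1},
}


def top_share(counts: dict[str, int], k: int) -> float:
    """Fraction of rows covered by the k most common hue groups in a slot."""
    ordered = sorted(counts.values(), reverse=True)
    return sum(ordered[:k]) / sum(ordered)


for slot, counts in slot_totals.items():
    print(
        f"slot {slot}: top hue {top_share(counts, 1):.0%}, "
        f"top three {top_share(counts, 3):.0%}"
    )
```

On this measure the top-three share is about 89% in slot 1, holds at about 70% across slots 2-4, and drops to about 44% in slot 5.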

Sankey View

To make the slot-by-slot structure easier to see, the corpus can be abstracted into a small set of hue families and rendered as a transition diagram across the first five positions.

Hue groups used here:

Blue
Cyan
Green
Yellow
Orange
Red
Purple
Pink
Neutral

This abstraction intentionally compresses small shade differences and brand nuance so the structural movement is easier to read.
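As a rough sketch of how such transition data can be produced, the snippet below turns rows of hue-group labels into `sankey-beta` lines. The four example rows are the opening shapes this note lists for those tools under Common Opening Shapes. The `sankey_lines` helper is illustrative and is not the script that generated the diagram.

```python
from collections import Counter

# First four hue groups per tool, as listed under Common Opening Shapes.
rows = {
    "Tableau / Vega": ["Blue", "Orange", "Red", "Cyan"],
    "Plotly.js": ["Blue", "Red", "Green", "Purple"],
    "Google Charts": ["Blue", "Red", "Orange", "Green"],
    "ApexCharts": ["Blue", "Green", "Orange", "Red"],
}


def sankey_lines(rows: dict[str, list[str]]) -> list[str]:
    """Count slot-to-slot hue transitions and format them as sankey-beta rows."""
    transitions = Counter()
    for hues in rows.values():
        # Pair each slot with the next one; slots are numbered from 1.
        for slot, (src, dst) in enumerate(zip(hues, hues[1:]), start=1):
            transitions[(f"{slot} {src}", f"{slot + 1} {dst}")] += 1
    return [
        f'"{src}","{dst}",{count}'
        for (src, dst), count in sorted(transitions.items())
    ]


lines = sankey_lines(rows)
print("sankey-beta")
for line in lines:
    print("  " + line)
```

Prefixing each node with its slot number keeps "2 Red" and "3 Red" as separate nodes, which is what lets the diagram read left to right by position.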

Reference palette corpus sankey

sankey-beta
  "1 Blue","2 Blue",2
  "1 Blue","2 Cyan",1
  "1 Blue","2 Green",2
  "1 Blue","2 Orange",7
  "1 Blue","2 Purple",1
  "1 Blue","2 Red",7
  "1 Cyan","2 Blue",1
  "1 Cyan","2 Cyan",1
  "1 Green","2 Green",1
  "1 Green","2 Yellow",1
  "1 Neutral","2 Neutral",1
  "1 Pink","2 Red",1
  "1 Red","2 Blue",1
  "2 Blue","3 Cyan",1
  "2 Blue","3 Green",2
  "2 Blue","3 Red",1
  "2 Cyan","3 Cyan",1
  "2 Cyan","3 Orange",1
  "2 Green","3 Blue",1
  "2 Green","3 Orange",1
  "2 Green","3 Purple",1
  "2 Neutral","3 Neutral",1
  "2 Orange","3 Green",2
  "2 Orange","3 Neutral",1
  "2 Orange","3 Orange",1
  "2 Orange","3 Red",3
  "2 Purple","3 Red",1
  "2 Red","3 Cyan",1
  "2 Red","3 Green",4
  "2 Red","3 Orange",2
  "2 Red","3 Red",1
  "2 Yellow","3 Blue",1
  "3 Blue","4 Blue",1
  "3 Blue","4 Orange",1
  "3 Cyan","4 Cyan",2
  "3 Cyan","4 Red",1
  "3 Green","4 Blue",1
  "3 Green","4 Orange",1
  "3 Green","4 Purple",2
  "3 Green","4 Red",4
  "3 Neutral","4 Orange",1
  "3 Neutral","4 Yellow",1
  "3 Orange","4 Green",1
  "3 Orange","4 Orange",1
  "3 Orange","4 Purple",1
  "3 Orange","4 Red",2
  "3 Purple","4 Red",1
  "3 Red","4 Cyan",4
  "3 Red","4 Orange",1
  "3 Red","4 Purple",1
  "4 Blue","5 Blue",1
  "4 Blue","5 Cyan",1
  "4 Cyan","5 Cyan",2
  "4 Cyan","5 Green",3
  "4 Cyan","5 Orange",1
  "4 Green","5 Purple",1
  "4 Orange","5 Cyan",1
  "4 Orange","5 Purple",1
  "4 Orange","5 Red",1
  "4 Orange","5 Yellow",2
  "4 Purple","5 Orange",1
  "4 Purple","5 Pink",2
  "4 Purple","5 Yellow",1
  "4 Red","5 Blue",2
  "4 Red","5 Green",1
  "4 Red","5 Neutral",1
  "4 Red","5 Orange",1
  "4 Red","5 Purple",2
  "4 Red","5 Yellow",1
  "4 Yellow","5 Blue",1

What the sankey makes obvious:

slot 1 is dominated by Blue
slot 2 quickly branches toward Orange and Red
slot 3 consolidates around Green, Red, and Orange
slot 4 becomes more mixed, especially with Cyan, Orange, Red, and Purple
by slot 5 the corpus is visibly plural rather than convergent
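These readings can be checked against the diagram data itself. The sketch below parses the slot 1 to slot 2 transitions, copied from the data above, and sums the flow into and out of each node. Only the first transition is included to keep the example short, and the `node_totals` helper is illustrative.

```python
import csv
from collections import Counter

# The slot 1 -> slot 2 transitions, copied from the diagram data above.
edges_text = '''
"1 Blue","2 Blue",2
"1 Blue","2 Cyan",1
"1 Blue","2 Green",2
"1 Blue","2 Orange",7
"1 Blue","2 Purple",1
"1 Blue","2 Red",7
"1 Cyan","2 Blue",1
"1 Cyan","2 Cyan",1
"1 Green","2 Green",1
"1 Green","2 Yellow",1
"1 Neutral","2 Neutral",1
"1 Pink","2 Red",1
"1 Red","2 Blue",1
'''


def node_totals(text: str) -> tuple[Counter, Counter]:
    """Return (outflow per source node, inflow per target node)."""
    outflow, inflow = Counter(), Counter()
    lines = (line.strip() for line in text.splitlines() if line.strip())
    for source, target, value in csv.reader(lines):
        outflow[source] += int(value)
        inflow[target] += int(value)
    return outflow, inflow


outflow, inflow = node_totals(edges_text)
print(outflow.most_common(3))
print(inflow.most_common(3))
```

Both sides sum to 27 rows. "1 Blue" carries 20 of them, and the largest slot 2 nodes are Red (8), Orange (7), and Blue (4). Running the same check on each later transition is a quick way to confirm that no row is dropped between slots.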

Common Opening Shapes

Several opening structures repeat across the corpus:

blue -> orange -> red -> cyan. Examples: Tableau / Vega, Hex, Observable Plot
blue -> red -> green -> purple. Examples: Plotly.js, Metabase
blue -> red -> orange -> green. Examples: Google Charts, Looker Studio
blue -> green -> orange -> red. Examples: ApexCharts
blue -> blue -> red -> purple, or another adjacent-blue variant. Examples: Power BI, Highcharts, Redash

These patterns suggest that most tools are not trying to maximize hue distance greedily at every slot. Instead they tend to prioritize:

a trustworthy lead color
a warm contrast in slot 2
a cooler or greener balancing color in slot 3
a fourth slot that broadens the field without necessarily preserving perfect alternation
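For contrast, here is a sketch of what a greedy hue-distance strategy would choose. Each step picks the hue group whose smallest circular distance to the already chosen hues is largest. The representative hue angles are hypothetical round numbers, not measurements from the corpus, and Neutral is left out because it has no hue angle.

```python
# HYPOTHETICAL representative hue angles in degrees, one per chromatic hue group.
HUE_ANGLE = {
    "Red": 25, "Orange": 60, "Yellow": 100, "Green": 145,
    "Cyan": 200, "Blue": 255, "Purple": 300, "Pink": 350,
}


def hue_gap(a: float, b: float) -> float:
    """Shortest distance between two angles on the hue circle, in degrees."""
    d = abs(a - b) % 360
    return min(d, 360 - d)


def greedy_order(lead: str, length: int) -> list[str]:
    """Farthest-first ordering: maximize the minimum gap to all chosen hues."""
    chosen = [lead]
    while len(chosen) < length:
        remaining = [hue for hue in HUE_ANGLE if hue not in chosen]
        best = max(
            remaining,
            key=lambda hue: min(
                hue_gap(HUE_ANGLE[hue], HUE_ANGLE[c]) for c in chosen
            ),
        )
        chosen.append(best)
    return chosen


greedy = greedy_order("Blue", 4)
print(greedy)

# Gap between slots 2 and 3 in the common blue -> orange -> red -> cyan shape.
orange_red_gap = hue_gap(HUE_ANGLE["Orange"], HUE_ANGLE["Red"])
print(orange_red_gap)
```

Under these assumed angles the greedy order from Blue is Blue, Orange, Green, Pink. The common blue -> orange -> red -> cyan opening instead places two neighbors only 35 degrees apart in slots 2 and 3, which a distance-maximizing rule would not pick.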

Strategy Families

The corpus falls into a few repeatable families.

Blue-Led Balanced Defaults

These feel like the canonical categorical pattern most people expect:

Tableau / Vega
D3 / RAWGraphs
Google Charts
Observable Plot
Hex

Common traits:

blue lead
early warm contrast
strong 2-4 category readability
little visible brand eccentricity in the opening slots

Brand-Led BI and App Palettes

These are still categorical, but they behave more like product signatures:

Power BI
Looker Studio
Redash
Highcharts
ApexCharts
Chart.js Colors Plugin

Common traits:

higher saturation
stronger reliance on product-bright blues and warm accents
more frequent repeated hue families in the first four slots

Editorially Restrained Systems

These prioritize tone, context, and narrative control over generic palette neutrality:

The Economist
Financial Times
Bloomberg Graphics
Our World in Data

Common traits:

more muted or selective opening slots
more willingness to use earth tones, dusty reds, or restrained secondary colors
less pressure to behave like a universal dashboard default

System-Led Outliers

These appear in the corpus but do not behave like conventional balanced categorical defaults:

Datawrapper
Reuters Graphics

Common traits:

strong system logic rather than evenly spaced category logic
opening slots that feel editorial, semantic, or utility-driven
useful as references, but not ideal baseline comparators for generic categorical design

Implications For Next Analysis

Before pairwise Leonardo analysis, the corpus already suggests a few important guardrails:

blue-first is still the dominant market convention
slots 1-4 matter much more than total palette length
later slots diverge sharply across tools, so they should be judged more generously than the opening core
editorial references are valuable, but some of them are not aiming at the same problem as generic BI defaults

That supports the next phase:

keep corpus-wide structural analysis broad
run Leonardo pairwise checks on a curated subset rather than every row
focus the pairwise pass on colors 1-5, not the entire palette length
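The size of that pairwise pass is easy to estimate. The sketch below enumerates every unordered pair among the first five slots of one palette. The ten-color palette and its placeholder color names are hypothetical, and the Leonardo check itself is deliberately not modeled here; each pair would simply be handed to whatever check the review uses.

```python
from itertools import combinations
from math import comb


def opening_pairs(palette: list[str], depth: int = 5):
    """Return every unordered pair among the first `depth` colors.

    Each color is tagged with its 1-based slot number so that a failing
    pair can be traced back to its positions in the palette.
    """
    core = list(enumerate(palette[:depth], start=1))
    return list(combinations(core, 2))


# Placeholder names; real hex values would come from the corpus rows.
palette = [f"color-{n}" for n in range(1, 11)]

pairs = opening_pairs(palette)
print(len(pairs), "pairs in slots 1-5")
print(comb(len(palette), 2), "pairs across all", len(palette), "colors")
```

Limiting the pass to slots 1-5 means 10 pairs per palette, compared with 45 for a full ten-color palette, which keeps a curated subset of rows manageable.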