
Reference Palette Corpus Analysis

Status: draft
Scope: structural analysis of the stabilized categorical reference corpus before pairwise Leonardo review
Source: Reference Categorical Palette Corpus, Reference Categorical Palette Sheet

This note summarizes the reference corpus at a structural level before doing pairwise accessibility analysis. The goal is to understand how the corpus tends to open, how the first four slots are commonly organized, and what broad strategy families show up across BI tools, libraries, and editorial systems.

Dataset

  • 27 palette rows
  • categories represented: BI and Analytics, OSS and Chart Libraries, Editorial and Journalism
  • exact duplicate rows already collapsed in the source corpus: Tableau / Vega and D3 / RAWGraphs

First-Color Bias

The clearest pattern in the corpus is strong blue-first ordering.

  • 20 of 27 palettes are blue-first, about 74%
  • 2 are cyan-first: Apache Superset, Datawrapper
  • 2 are green-first: Qlik Sense, Grafana (light theme)
  • 1 is red-first: Apache ECharts
  • 1 is pink-first: Financial Times
  • 1 is neutral-first: Reuters Graphics

This means the corpus is not merely "often blue." It is overwhelmingly blue-led at slot 1, especially for general-purpose tools. That makes blue-first the baseline convention rather than an incidental trend.
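The shares behind that claim can be recomputed directly from the counts listed above. A minimal Python sketch, where the dictionary only restates the bullets and the variable names are illustrative:

```python
from collections import Counter

# First-slot hue counts as listed in this note (27 palette rows in total).
first_slot = Counter({
    "Blue": 20,
    "Cyan": 2,
    "Green": 2,
    "Red": 1,
    "Pink": 1,
    "Neutral": 1,
})

total = sum(first_slot.values())
shares = {hue: count / total for hue, count in first_slot.items()}

# Print each hue with its count and share, largest first.
for hue, count in first_slot.most_common():
    print(f"{hue:<8} {count:>2}  {shares[hue]:.0%}")
```

Every non-blue lead accounts for at most 2 of the 27 rows, which is why a non-blue opening reads as a deliberate stance rather than a competing convention.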

By category:

  • BI and Analytics is especially blue-led. 9 of 12 rows open on blue.
  • OSS and Chart Libraries also leans blue-first. 6 of 8 rows open on blue.
  • Editorial and Journalism is more varied. It still has several blue-first rows, but also includes cyan-first, pink-first, and neutral-first systems.

First-Step Blue Families

Within the blue-first majority, the first swatches are not all doing the same job. Looking at them in OKLCH separates them into three practical families:

  • Muted Reference Blues: lower-chroma, steadier blues that read as trustworthy defaults.
  • Electric Indigo Blues: more saturated and more purple-leaning. They feel product-bright and digitally assertive.
  • Bright Sky Blues: lighter and cleaner, often pushing upward in lightness while retaining moderate chroma.

[Figure: First-step blue groups in OKLCH]
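One way to make the three families operational is to convert each first swatch to OKLCH and bucket it by lightness, chroma, and hue. The sketch below uses the published OKLab conversion from sRGB. The function names and, in particular, the family thresholds are assumptions chosen for illustration, not values taken from the corpus.

```python
import math

def srgb_hex_to_oklch(hex_color: str) -> tuple[float, float, float]:
    """Convert '#rrggbb' to (lightness, chroma, hue in degrees) in OKLCH."""
    hex_color = hex_color.lstrip("#")
    r, g, b = (int(hex_color[i:i + 2], 16) / 255 for i in (0, 2, 4))

    def to_linear(c: float) -> float:
        # Undo the sRGB transfer curve.
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = to_linear(r), to_linear(g), to_linear(b)

    # Linear sRGB -> LMS-like cone response, then a cube-root nonlinearity.
    l = 0.4122214708 * r + 0.5363325363 * g + 0.0514459929 * b
    m = 0.2119034982 * r + 0.6806995451 * g + 0.1073969566 * b
    s = 0.0883024619 * r + 0.2817188376 * g + 0.6299787005 * b
    l_, m_, s_ = l ** (1 / 3), m ** (1 / 3), s ** (1 / 3)

    lightness = 0.2104542553 * l_ + 0.7936177850 * m_ - 0.0040720468 * s_
    a = 1.9779984951 * l_ - 2.4285922050 * m_ + 0.4505937099 * s_
    b_axis = 0.0259040371 * l_ + 0.7827717662 * m_ - 0.8086757660 * s_

    chroma = math.hypot(a, b_axis)
    hue = math.degrees(math.atan2(b_axis, a)) % 360
    return lightness, chroma, hue

def blue_family(lightness: float, chroma: float, hue: float) -> str:
    """Bucket a blue-first swatch. All thresholds are hypothetical."""
    if chroma >= 0.18 and hue >= 260:
        return "Electric Indigo"   # saturated and purple-leaning
    if lightness >= 0.70:
        return "Bright Sky"        # lighter, moderate chroma
    return "Muted Reference"       # lower chroma, steadier
```

Calling `blue_family(*srgb_hex_to_oklch(first_swatch))` for each blue-first row would reproduce a grouping like the one in the figure, provided the thresholds are tuned against the actual swatches.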

This split is useful because "blue-first" turns out to hide multiple design intents:

  • a muted reference/default blue
  • an electric app/brand blue
  • a lighter sky or cyan-leaning blue

That is helpful for DFT, because the main design question is not just whether the first color should be blue. It is what kind of blue it should be.

First-Four Slot Patterns

The first four positions are much more structured than later slots.

Slot 1

  • blue dominates decisively
  • non-blue leads are unusual enough to feel like a deliberate stance

Slot 2

  • the most common second-slot hues are red and orange
  • blue sometimes repeats in slot 2, but that is less common and often more brand-led than category-led

Slot 3

  • green is the most common third-slot hue
  • red and orange are the next most common

Slot 4

  • red is still common, but the field broadens
  • cyan, orange, and purple become much more normal here

Slot 5

  • there is no single dominant fifth-slot behavior
  • by slot 5 the corpus starts to fan out into multiple strategies rather than one shared convention

The practical takeaway is that the corpus has a strong shared logic in slots 1-4 and much weaker consensus after that. That supports using slots 1-5 as the main scope for deeper evaluation: the structured core plus the first slot where strategies visibly diverge.
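The shift from shared convention to plural strategies can be quantified per slot. The sketch below uses hue counts tallied from the edge weights of the sankey source in this note and reports the leading hue's share and the Shannon entropy of each slot. The choice of entropy as the dispersion measure is an assumption of this sketch, not part of the original analysis.

```python
import math

# Hue counts per slot, tallied from the sankey edge weights in this note.
SLOT_COUNTS = {
    1: {"Blue": 20, "Cyan": 2, "Green": 2, "Red": 1, "Pink": 1, "Neutral": 1},
    2: {"Red": 8, "Orange": 7, "Blue": 4, "Green": 3, "Cyan": 2,
        "Purple": 1, "Yellow": 1, "Neutral": 1},
    3: {"Green": 8, "Red": 6, "Orange": 5, "Cyan": 3, "Blue": 2,
        "Neutral": 2, "Purple": 1},
    4: {"Red": 8, "Cyan": 6, "Orange": 5, "Purple": 4, "Blue": 2,
        "Green": 1, "Yellow": 1},
    5: {"Blue": 4, "Cyan": 4, "Green": 4, "Purple": 4, "Yellow": 4,
        "Orange": 3, "Pink": 2, "Red": 1, "Neutral": 1},
}

def slot_summary(counts: dict[str, int]) -> dict:
    """Return the leading hue, its share, and the entropy (bits) of one slot."""
    total = sum(counts.values())
    top_hue, top_count = max(counts.items(), key=lambda kv: kv[1])
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return {
        "total": total,
        "top_hue": top_hue,
        "top_share": top_count / total,
        "entropy_bits": entropy,
    }

summaries = {slot: slot_summary(counts) for slot, counts in SLOT_COUNTS.items()}

for slot, info in summaries.items():
    print(slot, info["top_hue"], f"{info['top_share']:.0%}",
          f"{info['entropy_bits']:.2f} bits")
```

Lower entropy means the slot is dominated by one or two hues. Slot 5 has several hues tied for the lead, so its `top_hue` is only the first of those and should not be read as a winner.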

Sankey View

To make the slot-by-slot structure easier to see, the corpus can be abstracted into a small set of hue families and rendered as a transition diagram across the first five positions.

Hue groups used here:

  • Blue
  • Cyan
  • Green
  • Yellow
  • Orange
  • Red
  • Purple
  • Pink
  • Neutral

This abstraction intentionally compresses small shade differences and brand nuance so the structural movement is easier to read.

Reference palette corpus sankey

sankey-beta
  "1 Blue","2 Blue",2
  "1 Blue","2 Cyan",1
  "1 Blue","2 Green",2
  "1 Blue","2 Orange",7
  "1 Blue","2 Purple",1
  "1 Blue","2 Red",7
  "1 Cyan","2 Blue",1
  "1 Cyan","2 Cyan",1
  "1 Green","2 Green",1
  "1 Green","2 Yellow",1
  "1 Neutral","2 Neutral",1
  "1 Pink","2 Red",1
  "1 Red","2 Blue",1
  "2 Blue","3 Cyan",1
  "2 Blue","3 Green",2
  "2 Blue","3 Red",1
  "2 Cyan","3 Cyan",1
  "2 Cyan","3 Orange",1
  "2 Green","3 Blue",1
  "2 Green","3 Orange",1
  "2 Green","3 Purple",1
  "2 Neutral","3 Neutral",1
  "2 Orange","3 Green",2
  "2 Orange","3 Neutral",1
  "2 Orange","3 Orange",1
  "2 Orange","3 Red",3
  "2 Purple","3 Red",1
  "2 Red","3 Cyan",1
  "2 Red","3 Green",4
  "2 Red","3 Orange",2
  "2 Red","3 Red",1
  "2 Yellow","3 Blue",1
  "3 Blue","4 Blue",1
  "3 Blue","4 Orange",1
  "3 Cyan","4 Cyan",2
  "3 Cyan","4 Red",1
  "3 Green","4 Blue",1
  "3 Green","4 Orange",1
  "3 Green","4 Purple",2
  "3 Green","4 Red",4
  "3 Neutral","4 Orange",1
  "3 Neutral","4 Yellow",1
  "3 Orange","4 Green",1
  "3 Orange","4 Orange",1
  "3 Orange","4 Purple",1
  "3 Orange","4 Red",2
  "3 Purple","4 Red",1
  "3 Red","4 Cyan",4
  "3 Red","4 Orange",1
  "3 Red","4 Purple",1
  "4 Blue","5 Blue",1
  "4 Blue","5 Cyan",1
  "4 Cyan","5 Cyan",2
  "4 Cyan","5 Green",3
  "4 Cyan","5 Orange",1
  "4 Green","5 Purple",1
  "4 Orange","5 Cyan",1
  "4 Orange","5 Purple",1
  "4 Orange","5 Red",1
  "4 Orange","5 Yellow",2
  "4 Purple","5 Orange",1
  "4 Purple","5 Pink",2
  "4 Purple","5 Yellow",1
  "4 Red","5 Blue",2
  "4 Red","5 Green",1
  "4 Red","5 Neutral",1
  "4 Red","5 Orange",1
  "4 Red","5 Purple",2
  "4 Red","5 Yellow",1
  "4 Yellow","5 Blue",1

What the sankey makes obvious:

  • slot 1 is dominated by Blue
  • slot 2 quickly branches toward Orange and Red
  • slot 3 consolidates around Green, Red, and Orange
  • slot 4 becomes more mixed, especially with Cyan, Orange, Red, and Purple
  • by slot 5 the corpus is visibly plural rather than convergent
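The per-slot totals can be recovered from the sankey source itself by summing edge weights on each side of a transition. A minimal sketch, with only the slot 1 to slot 2 edges embedded (copied from the source above) and a hypothetical helper name:

```python
import csv
from collections import defaultdict

# Slot 1 -> slot 2 edges, copied from the sankey source in this note.
EDGES = '''\
"1 Blue","2 Blue",2
"1 Blue","2 Cyan",1
"1 Blue","2 Green",2
"1 Blue","2 Orange",7
"1 Blue","2 Purple",1
"1 Blue","2 Red",7
"1 Cyan","2 Blue",1
"1 Cyan","2 Cyan",1
"1 Green","2 Green",1
"1 Green","2 Yellow",1
"1 Neutral","2 Neutral",1
"1 Pink","2 Red",1
"1 Red","2 Blue",1
'''

def slot_totals(edge_text: str) -> tuple[dict[str, int], dict[str, int]]:
    """Sum edge weights by source node (outflow) and target node (inflow)."""
    outflow: dict[str, int] = defaultdict(int)
    inflow: dict[str, int] = defaultdict(int)
    for source, target, weight in csv.reader(edge_text.splitlines()):
        outflow[source] += int(weight)
        inflow[target] += int(weight)
    return dict(outflow), dict(inflow)

slot1, slot2 = slot_totals(EDGES)

# Every palette row enters and leaves each transition exactly once,
# so both sides must sum to the corpus size.
assert sum(slot1.values()) == sum(slot2.values()) == 27
```

The same function applied to the later edge groups gives the slot 3, 4, and 5 tallies, and the conservation check is a quick guard against transcription errors in the diagram source.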

Common Opening Shapes

Several opening structures repeat across the corpus:

  • blue -> orange -> red -> cyan. Examples: Tableau / Vega, Hex, Observable Plot
  • blue -> red -> green -> purple. Examples: Plotly.js, Metabase
  • blue -> red -> orange -> green. Examples: Google Charts, Looker Studio
  • blue -> green -> orange -> red. Examples: ApexCharts
  • blue -> blue -> red -> purple, or another adjacent-blue variant. Examples: Power BI, Highcharts, Redash

These patterns suggest that most tools are not trying to maximize hue distance greedily at every slot. Instead they tend to prioritize:

  • a trustworthy lead color
  • a warm contrast in slot 2
  • a cooler or greener balancing color in slot 3
  • a fourth slot that broadens the field without necessarily preserving perfect alternation
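For contrast, here is what a greedy hue-distance strategy would produce: at each slot, pick the hue family whose smallest circular distance to the already chosen hues is largest. The hue angles below are rough, hypothetical placements for each family on an OKLCH-style hue wheel, not values measured from the corpus.

```python
# Hypothetical hue angles in degrees for each hue family (illustrative only).
HUE_ANGLES = {
    "Blue": 250, "Cyan": 200, "Green": 145, "Yellow": 100,
    "Orange": 60, "Red": 25, "Purple": 300, "Pink": 350,
}

def hue_distance(a: float, b: float) -> float:
    """Shortest distance between two angles on the hue wheel."""
    d = abs(a - b) % 360
    return min(d, 360 - d)

def greedy_order(angles: dict[str, float], lead: str, length: int) -> list[str]:
    """Greedy maximin ordering: each pick maximizes its minimum hue
    distance to everything already chosen."""
    order = [lead]
    remaining = set(angles) - {lead}
    while remaining and len(order) < length:
        best = max(
            sorted(remaining),  # sorted for deterministic tie-breaking
            key=lambda name: min(
                hue_distance(angles[name], angles[chosen]) for chosen in order
            ),
        )
        order.append(best)
        remaining.remove(best)
    return order

greedy = greedy_order(HUE_ANGLES, "Blue", 4)
observed = ["Blue", "Orange", "Red", "Cyan"]  # one common opening from the corpus
print("greedy:  ", " -> ".join(greedy))
print("observed:", " -> ".join(observed))
```

With these assumed angles, the greedy order agrees with the observed opening only through slot 2 and then departs from it. That is consistent with the reading that tools weigh convention and warm/cool balance over pure hue separation.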

Strategy Families

The corpus falls into a few repeatable families.

Blue-Led Balanced Defaults

These feel like the canonical categorical pattern most people expect:

  • Tableau / Vega
  • D3 / RAWGraphs
  • Google Charts
  • Observable Plot
  • Hex

Common traits:

  • blue lead
  • early warm contrast
  • strong 2-4 category readability
  • little visible brand eccentricity in the opening slots

Brand-Led BI and App Palettes

These are still categorical, but they behave more like product signatures:

  • Power BI
  • Looker Studio
  • Redash
  • Highcharts
  • ApexCharts
  • Chart.js Colors Plugin

Common traits:

  • higher saturation
  • stronger reliance on product-bright blues and warm accents
  • more frequent repeated hue families in the first four slots

Editorially Restrained Systems

These prioritize tone, context, and narrative control over generic palette neutrality:

  • The Economist
  • Financial Times
  • Bloomberg Graphics
  • Our World in Data

Common traits:

  • more muted or selective opening slots
  • more willingness to use earth tones, dusty reds, or restrained secondary colors
  • less pressure to behave like a universal dashboard default

System-Led Outliers

These appear in the corpus but do not behave like conventional balanced categorical defaults:

  • Datawrapper
  • Reuters Graphics

Common traits:

  • strong system logic rather than evenly spaced category logic
  • opening slots that feel editorial, semantic, or utility-driven
  • useful as references, but not ideal baseline comparators for generic categorical design

Implications For Next Analysis

Before pairwise Leonardo analysis, the corpus already suggests a few important guardrails:

  • blue-first is still the dominant market convention
  • slots 1-4 matter much more than total palette length
  • later slots diverge sharply across tools, so they should be judged more generously than the opening core
  • editorial references are valuable, but some of them are not aiming at the same problem as generic BI defaults

That supports the next phase:

  • keep corpus-wide structural analysis broad
  • run Leonardo pairwise checks on a curated subset rather than every row
  • focus the pairwise pass on colors 1-5, not the entire palette length
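Restricting the pairwise pass to the first five colors keeps the work at 10 pairs per palette regardless of palette length. The sketch below enumerates those pairs and scores each with the WCAG relative-luminance contrast ratio as a stand-in metric. It does not use Leonardo's API, and the palette hex values are placeholders rather than rows from the corpus.

```python
from itertools import combinations

def relative_luminance(hex_color: str) -> float:
    """WCAG relative luminance of an sRGB color given as '#rrggbb'."""
    hex_color = hex_color.lstrip("#")
    channels = [int(hex_color[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    r, g, b = (
        c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
        for c in channels
    )
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(color_a: str, color_b: str) -> float:
    """WCAG contrast ratio between two colors (1.0 to 21.0)."""
    lum_a, lum_b = relative_luminance(color_a), relative_luminance(color_b)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)

def pairwise_first_slots(palette: list[str], slots: int = 5) -> dict:
    """Score every pair among the first `slots` colors, keyed by 1-based slot."""
    head = palette[:slots]
    return {
        (i + 1, j + 1): contrast_ratio(head[i], head[j])
        for i, j in combinations(range(len(head)), 2)
    }

# Placeholder palette for illustration only; not taken from the corpus.
example_palette = ["#3366cc", "#dc3912", "#ff9900", "#109618",
                   "#990099", "#0099c6", "#dd4477", "#66aa00"]

scores = pairwise_first_slots(example_palette)
for (a, b), ratio in sorted(scores.items(), key=lambda kv: kv[1]):
    print(f"slots {a}-{b}: {ratio:.2f}")
```

Sorting by score surfaces the weakest pairs first, which is usually where a curated review wants to start.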