YAML Language Design Guide
Principles for evolving the Dataface YAML dashboard language. This guide is for contributors designing new syntax — fields, chart config, layout types, variable inputs, query types, or style properties. It captures the patterns that make the existing syntax coherent so new additions stay consistent.
For how to write face YAML files, see the YAML Style Guide. For the full field catalog, see the Field Reference.
Related: dataface/core/render/chart/DESIGN.md
for chart rendering philosophy and the Vega-Lite wrapper contract.
1. Structure Is the Type Declaration
The presence of a key determines the type. Don't add explicit type: fields
when structure alone is unambiguous.
    # Layout type is inferred from the key:
    rows: [...]   # row layout
    cols: [...]   # column layout
    grid: {...}   # grid layout
    tabs: {...}   # tab layout

    # Query type is inferred from its fields:
    sql: |            # → SQL query
      SELECT ...
    metrics: [...]    # → MetricFlow query
    model: ref(...)   # → dbt model query
    rows: [...]       # → values query (inline data)
When designing new syntax: if you're adding a new variant of something
(new query type, new variable input, new layout), first ask whether the new
fields are distinctive enough to infer the type. Only add an explicit type:
field when structural inference would be ambiguous.
The order of checking matters — document it. For queries, the existing
precedence is: metrics → model → sql → string (default SQL).
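The documented precedence can be sketched as a small inference function. This is a minimal illustration, not Dataface's actual normalizer: infer_query_type and the returned type labels are hypothetical names, and values queries (rows:) are left out because the precedence above does not place them.

```python
from typing import Any, Union

# Documented precedence for dict-form queries: metrics → model → sql.
# The labels on the right are illustrative, not Dataface's real type names.
_PRECEDENCE = [("metrics", "metricflow"), ("model", "dbt_model"), ("sql", "sql")]


def infer_query_type(query: Union[str, dict[str, Any]]) -> str:
    """Infer a query type from structure alone (hypothetical helper)."""
    if isinstance(query, str):
        # A bare string is the default case: treat it as SQL.
        return "sql"
    for key, query_type in _PRECEDENCE:
        if key in query:
            # First matching key wins, so check order is part of the contract.
            return query_type
    raise ValueError(f"cannot infer query type from keys: {sorted(query)}")
```

Keeping the precedence in one ordered table makes the check order visible in review, which is the point of documenting it.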
2. Data-Binding Key: column, Not field
Every place Dataface YAML binds a data position uses column: — the
channel grammar (color.column, background.column, etc.), table column
configs (style.columns[*].column), and variable options
(variables.x.options.column). The corresponding Pydantic attributes are
named column too.
Historical note. v1.x used the Vega-Lite term field: for the channel grammar and table columns. It was renamed to column: pre-launch (tasks/workstreams/dft-core/tasks/rename-field-to-column-across-all-dataface-yaml-and-types.md) because every Dataface data source is a table and every binding target is a column; the Vega-Lite word was a euphemism. Old field: keys now raise ValidationError.
The Vega-Lite output boundary still uses field: inside the rendered
VL spec — that's VL's own key, not Dataface's. The rename only covers
the authored surface and Python-type attributes; render-layer output
dicts that flow into VL retain "field".
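The boundary described above can be sketched as follows. This is an illustration under assumptions: channel_to_vl is a hypothetical helper, and a plain ValueError stands in for the ValidationError the authored surface raises.

```python
from typing import Any


def channel_to_vl(channel: dict[str, Any]) -> dict[str, Any]:
    """Translate an authored channel dict into a Vega-Lite encoding entry.

    Hypothetical helper: the real mapper and its error type may differ.
    """
    if "field" in channel:
        # The authored surface rejects the old key.
        raise ValueError("use 'column:' instead of 'field:'")
    vl = {key: value for key, value in channel.items() if key != "column"}
    if "column" in channel:
        # Only the rendered VL spec uses VL's own key name.
        vl["field"] = channel["column"]
    return vl
```

The authored dict never contains field, and the rendered dict never contains column, so the rename stays confined to one translation step.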
3. Naming: snake_case Only, No camelCase Ever
All Dataface YAML property names use snake_case. No exceptions. Never use
camelCase, even when porting directly from Vega-Lite or JavaScript sources.
If a property arrives as camelCase from an upstream spec, convert it to
snake_case at the Dataface boundary.
When wrapping a Vega-Lite concept, use VL's own name converted to snake_case:
| Vega-Lite | Dataface |
|---|---|
| cornerRadius | corner_radius |
| labelFontSize | label_font_size |
| strokeWidth | stroke_width |
Do NOT invent parallel names unless there is a deliberate, documented
reason. If Vega-Lite calls it encoding.y.axis, don't create
settings.y_axis or y_axis_config. Map it to encoding.y.axis (or a
shorthand that resolves there). When you do diverge from a VL name, record
it in the Known Divergences table below so it doesn't look like an
accident.
When adding a concept that has no Vega-Lite equivalent (KPI, table, spark, interactions), name it descriptively in snake_case and follow the naming style of surrounding fields.
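The conversion at the Dataface boundary is mechanical. A minimal sketch, assuming simple lowerCamelCase input (camel_to_snake and snake_keys are hypothetical helper names; runs of capitals such as acronyms would need extra handling):

```python
import re

# Zero-width match before every capital letter except at the start of the name.
_BOUNDARY = re.compile(r"(?<!^)(?=[A-Z])")


def camel_to_snake(name: str) -> str:
    """Convert a lowerCamelCase name to snake_case."""
    return _BOUNDARY.sub("_", name).lower()


def snake_keys(value):
    """Recursively convert every dict key in a nested structure."""
    if isinstance(value, dict):
        return {camel_to_snake(k): snake_keys(v) for k, v in value.items()}
    if isinstance(value, list):
        return [snake_keys(v) for v in value]
    return value
```

For example, camel_to_snake("labelFontSize") gives label_font_size, matching the table above.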
Namespace Repeated Prefixes
When three or more keys share a prefix (e.g. font_size, font_weight,
font_family), that prefix should be a nested namespace instead of a flat
prefix:
    # BAD: flat prefix repeated 3+ times
    title_font_family: "Source Serif 4"
    title_font_weight: 600
    title_overflow: wrap-two
    title_height: 40

    # GOOD: namespace the shared prefix
    title:
      font:
        family: "Source Serif 4"
        weight: 600
      overflow: wrap-two
      min_height: 40
This reduces repetition, makes the hierarchy self-documenting, and mirrors
how CSS groups properties (the font shorthand, border shorthand, etc.).
When you see a flat prefix_* pattern accumulating, refactor it into
prefix: with nested keys before it spreads further.
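A review-time check for accumulating prefixes could look like the sketch below. It is a heuristic, not an existing Dataface lint: repeated_prefixes is a hypothetical name, and it only looks at the first underscore-separated segment of each key.

```python
from collections import Counter


def repeated_prefixes(keys: list[str], threshold: int = 3) -> list[str]:
    """Return first-segment prefixes shared by `threshold` or more flat keys."""
    counts = Counter(key.split("_", 1)[0] for key in keys if "_" in key)
    return sorted(prefix for prefix, n in counts.items() if n >= threshold)
```

Run against the BAD example above, it reports title as a candidate namespace; two keys sharing a prefix stay below the threshold and are left alone.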
4. Shorthand + Full Form
When a field has a common single-value case but also supports richer configuration, it should support two forms:
- Shorthand — a string or scalar for the common case
- Full form — a dict for the complete configuration
Not every field needs this. Only apply the pattern when there's a clear 80%
case that benefits from being a single value. Fields that are always
structured (like grid:) don't need a shorthand.
    # Shorthand
    source: my_database
    format: "$,.0f"
    spark: line
    projection: albersUsa

    # Full form
    sources:
      default: my_database
      other_db:
        type: postgres
        connection_string: ...
    format:
      spec: ",.0f"
      prefix: "$"
      suffix: " USD"
    spark:
      type: line
      color: "#3b82f6"
      show_last: true
    # Progress spark: max is optional; omitting it auto-scales to column max
    spark:
      type: progress
      # max: 50000  ← explicit cap; omit to auto-scale to the column's observed max
    projection:
      type: albersUsa
      center: [-98, 38]
      scale: 1000
When designing new syntax: always ask "what's the 80% case?" and make that expressible as a single string or scalar. The full-form dict handles the remaining 20%.
The normalizer is responsible for expanding shorthand to full form. Downstream code should only see the full form.
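For fields whose shorthand is just the type value (spark: line, projection: albersUsa), the expansion step can be sketched as below. expand_shorthand is a hypothetical helper, not the real normalizer; shorthands like format, which split one string into several keys, would need their own parsing.

```python
from typing import Any


def expand_shorthand(value: Any, key: str = "type") -> dict[str, Any]:
    """Expand a scalar shorthand into its dict full form (hypothetical helper)."""
    if isinstance(value, dict):
        # Already full form; copy so callers can mutate the result safely.
        return dict(value)
    return {key: value}
```

After this step, downstream code can read spark["type"] without checking which form the author wrote.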
5. Relationship to Vega-Lite
Dataface is deeply influenced by Vega-Lite's design: its declarative grammar, its encoding channel model, and its layered config/style inheritance. Today, Vega-Lite is the rendering backend for most chart types. But Dataface is not a Vega-Lite wrapper — it's a chart language that happens to target Vega-Lite for many chart types, and renders KPI, table, spark, and map charts through its own SVG renderers.
The goal is one cohesive chart library. A user should not perceive a seam between "Vega-Lite charts" and "Dataface-native charts." All chart types should feel like they belong to the same language — same field names, same style system, same shorthand patterns, same config inheritance. If Dataface eventually replaces the VL backend, the authored YAML should not need to change.
Thin Ergonomic Layer
For concepts Vega-Lite supports natively, the Dataface YAML should be a thin ergonomic layer — not a parallel chart language.
- x, y, color, size, shape, theta map directly to VL encoding channels
- type: bar means VL mark: bar
- Axis, scale, legend config passes through to VL's native properties
Adding Dataface shorthand is appropriate when it significantly simplifies a common pattern:
    # Dataface shorthand: common case is one field name
    x: date
    y: revenue

    # What this resolves to in Vega-Lite
    encoding:
      x: { field: date, type: temporal }
      y: { field: revenue, type: quantitative }
Adding Dataface shorthand is NOT appropriate when it just renames a Vega-Lite property:
    # BAD: don't do this
    y_axis_side: right   # Just renaming VL's axis.orient

    # GOOD: pass through the VL property
    style:
      axis_y:
        orient: right   # Maps directly to VL axisY.orient
Test: Would a Vega-Lite user recognize the field? If not, it needs a very good reason to exist.
Known Divergences from Vega-Lite
These are intentional naming or structural differences. They exist for good reasons — don't "fix" them back to VL names, but don't add new divergences without equally good reasons.
| Dataface | Vega-Lite | Why |
|---|---|---|
| style (board/chart) | config | Dataface splits VL's config into style presets (scaffold) and themes (painting). style is the user-facing name because it better describes what the user is doing: styling their chart. config is reserved for project-level settings (dataface.yml). |
| style.charts.* sub-models use snake_case | VL config.* uses camelCase | Dataface is a Python/YAML ecosystem, where snake_case is idiomatic. The style_to_vega_lite() mapper handles translation. |
| style.charts.table, style.charts.kpi, style.charts.inference | (no VL equivalent) | Non-VL sections that live alongside VL-mapped sections under style.charts. Consumed directly by Dataface renderers. |
| Style presets + themes (two layers) | Single config object | Dataface separates scaffold (HTML-like: what exists, where it sits) from painting (CSS-like: fonts, colors, strokes). See chart DESIGN.md. |
When porting a VL property, check this table first. If the Dataface location differs from where VL puts it, there's likely a reason. Follow the existing pattern, don't create a third convention.
Stylistic Influence
Even where Dataface diverges from VL's names, it draws heavily from VL's design philosophy:
- Declarative grammar: the chart is described, not imperatively drawn
- Layered config inheritance: base defaults → style preset → theme → board-level style → chart-level style, each layer overriding the one below
- Encoding channels: x, y, color, size, shape as the primary way to map data fields to visual properties
- Mark types as the fundamental chart taxonomy
- Config/style composition: a standalone chart should look good with zero style overrides; each layer adds specificity
New features should feel like they belong in this grammar. If something feels imperative or procedural ("first do X, then apply Y"), rethink it as a declarative property.
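The layered inheritance in the list above (base defaults → style preset → theme → board-level style → chart-level style) amounts to a deep merge applied in order. A minimal sketch, assuming each layer is a plain nested dict; merge_layer and resolve_style are hypothetical names and the property names in the usage are illustrative:

```python
from functools import reduce
from typing import Any


def merge_layer(base: dict[str, Any], override: dict[str, Any]) -> dict[str, Any]:
    """Return a new dict where `override` wins, merging nested dicts key by key."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_layer(merged[key], value)
        else:
            merged[key] = value
    return merged


def resolve_style(*layers: dict[str, Any]) -> dict[str, Any]:
    """Fold layers in order, lowest priority first."""
    return reduce(merge_layer, layers, {})


defaults = {"axis": {"grid": True, "label_font_size": 11}}
theme = {"axis": {"label_font_size": 12}}
chart_style = {"axis": {"grid": False}}
resolved = resolve_style(defaults, theme, chart_style)
```

Each layer only states what it changes, which is what lets a chart with zero overrides still resolve to a complete style.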
Layered Charts (Mixed-Mark Composition)
When a chart needs multiple mark types (e.g. bars + lines), use type: layered
with a layers list. Parent-level channels supply shared defaults; each layer
specifies its own type and can override y, color, size, shape.
    charts:
      revenue_vs_target:
        query: monthly_data
        type: layered
        x: month
        layers:
          - type: bar
            y: revenue
          - type: line
            y: target
This maps directly to Vega-Lite's layer composition. The translation is
mechanical: parent-level channels become shared encoding, each layer becomes
a VL layer entry with its own mark and encoding overrides.
Design decisions:
- type stays as the primary authored field: type: layered rather than introducing a mark: field or composition: concept
- layers is a list (ordered) because layer order affects rendering (later layers draw on top)
- Each layer's type must be a primitive mark (bar, line, area, point, etc.), not layered (no nesting) or auto
- Pie/donut stay outside the layered path unless explicitly layered; they use theta encoding, not x/y
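The mechanical translation described in this section can be sketched as below. This is an illustration, not the real profile layer: layered_to_vl is a hypothetical name, encoding types (temporal, quantitative) and chart-type aliases are omitted, and each channel is assumed to be a plain column name.

```python
from typing import Any

CHANNELS = ("x", "y", "color", "size", "shape")
NON_PRIMITIVE = {"layered", "auto"}


def _encoding(spec: dict[str, Any]) -> dict[str, Any]:
    # Each authored channel names a column; VL calls that key "field".
    return {ch: {"field": spec[ch]} for ch in CHANNELS if ch in spec}


def layered_to_vl(chart: dict[str, Any]) -> dict[str, Any]:
    """Build a Vega-Lite layer spec from an authored type: layered chart."""
    layers = []
    for layer in chart["layers"]:
        if layer["type"] in NON_PRIMITIVE:
            raise ValueError(f"layer type must be a primitive mark, got {layer['type']!r}")
        layers.append({"mark": layer["type"], "encoding": _encoding(layer)})
    # Parent-level channels become the shared top-level encoding.
    return {"encoding": _encoding(chart), "layer": layers}


chart = {
    "query": "monthly_data",
    "type": "layered",
    "x": "month",
    "layers": [{"type": "bar", "y": "revenue"}, {"type": "line", "y": "target"}],
}
vl_spec = layered_to_vl(chart)
```

List order is preserved, so the line layer is drawn on top of the bars as the design decisions require.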
6. Data Transformations Belong to Queries, Not Charts
The query layer owns data meaning: grain, aggregation, filtering, ordering. The chart layer owns visual encoding: marks, axes, colors, layout.
New chart fields should NEVER:
- Aggregate, regroup, or bucket data
- Derive new semantic columns
- Perform analytical reordering that changes meaning
- Silently fix wrong-shaped data
If the chart needs different data, the right answer is "change the query." The YAML should make this obvious — chart config touches presentation, query config touches data.
7. Progressive Disclosure
The simplest useful form should require the fewest fields. Complexity is opt-in.
    # Level 1: bare minimum
    rows:
      - query: sales
        type: bar
        x: month
        y: revenue

    # Level 2: add polish
    rows:
      - query: sales
        type: bar
        x: month
        y: revenue
        title: "Monthly Revenue"
        color: region
        sort:
          by: revenue
          order: desc

    # Level 3: full control
    rows:
      - query: sales
        type: bar
        x: month
        y: revenue
        title: "Monthly Revenue"
        color: region
        sort:
          by: revenue
          order: desc
        style:
          orientation: horizontal
          bar:
            corner_radius: 4
          legend:
            orient: bottom
        encoding:
          y:
            axis:
              format: "$,.0f"
When adding new features: the zero-config version should do the right thing. Advanced configuration goes in nested keys that users can ignore until they need them.
8. Defaults Live in Config YAML, Not Code
All default values live in YAML config files:
| What | Where |
|---|---|
| Chart dimensions, axis limits | defaults/default_config.yml |
| Default theme name | defaults/default_config.yml > vega.default_theme |
| Visual styling, axis/legend/mark colors, scaffold defaults | defaults/themes/*.yaml (compiled to VL via style_to_vega_lite) |
| Color palettes (categorical, sequential, diverging, semantic, scaffold) | defaults/palettes/<family>/*.yml |
| Chart-type-specific settings | style.charts.<type> in theme YAML |
When introducing a new field:
- Don't hardcode its default in Python
- Do add the default to the appropriate YAML config file
- Do read it via get_config() at runtime
This ensures users can override any default via dataface.yml or board-level
style: without touching code.
The default config is the field catalog
default_config.yml should be readable as a complete reference of what
properties exist and how they relate to each other. Two conventions make
this work:
1. Comment inherited/cascaded properties at each level. When a section inherits properties from a parent, list them as commented-out placeholders so readers can see the full surface area:
    style:
      title:
        font_family: "'Source Serif 4', Georgia, serif"
        font_weight: 600
        overflow: wrap-two
        min_height: 40
      charts:
        title:
          # inherits from style.title: font_family, font_weight, min_height
          font_size: 18        # overrides parent (board titles are larger)
          overflow: wrap-two   # same as parent, explicit for clarity
When a section adds no overrides, leave a commented-out block showing what it inherits:
    kpi:
      # title:  # inherits all from style.charts.title
      #         # (font_family, font_size, font_weight, overflow)
      value_font_weight: "bold"
2. Add explanation comments for non-obvious properties. Any property whose purpose isn't clear from its name should have a brief comment explaining what it controls:
    # Pixel gap between cards when card_gap is enabled on a board.
    # Only takes effect when a face sets card_gap: true.
    card_gap: 24.0

    # Minimum floor for face title height in layout sizing.
    # The actual title height may be larger based on text wrapping.
    title_height: 40.0
These conventions make default_config.yml the single place to understand
what's available, what cascades, and what each property does — without
reading source code.
9. Recursive Composition
Boards nest arbitrarily. Any layout item can be a full board with its own variables, queries, charts, and layout.
New features should respect this. If a feature makes sense at the top-level board, ask whether it should also work at nested board scope. Usually the answer is yes — scoped variables, scoped queries, and scoped charts all follow this pattern.
9a. Property Inheritance (What Cascades)
Because boards nest arbitrarily (§9), every property needs a clear answer: does a child board inherit this value from its parent?
The Rule
"Would a child board typically want the same value as its parent?"
If yes, the property should cascade (child inherits unless it overrides). If no, each board gets its own value independently.
This is the same rule CSS uses, and it maps to a clean split:
- Text/content properties cascade. You set font_family on a parent board and expect all nested boards to use the same font. You set a color palette and expect nested charts to share it.
- Box/layout properties do NOT cascade. You set padding: 24px on a parent board and do NOT expect child boards to also have 24px padding. Each board has its own geometry.
- Context properties cascade. Theme name, data source, and variables establish a context that children operate within.
What cascades and what doesn't
| Property category | Cascades? | Examples | Why |
|---|---|---|---|
| Typography (font, size, weight) | Yes | style.font.family, style.title.font.family | Text should be consistent |
| Colors (palette, scheme) | Yes | color, chart palette | Visual coherence |
| Chart styling (axis, legend, marks) | Yes | style.charts.* | Charts in nested boards should match |
| Theme / style preset name | Yes | theme, style_preset | Establishes visual context |
| Data source | Yes | source | Children query the same database |
| Variables | Yes | variables | Template context flows down |
| Width, height, min_height | No | face.width, face.min_height | Box geometry, per-board |
| Padding, margin | No | style.board.margin, style.board.card_padding | Box spacing, per-board |
| Background | No | style.background | Each board has its own canvas |
| Border, border_radius | No | style.border | Box chrome, per-board |
| Gap, card_gap | No | face.card_gap | Layout spacing, per-board |
When designing new properties
Before adding a property, decide: does it cascade? Apply the rule:
- Text/content property (font, color, chart config) → put it in StyleCompiled so it cascades via _propagate_style_compiled()
- Box/layout property (width, height, padding, border, gap) → put it in FaceStyle or face config. No cascade.
- Context property (source, theme, variables) → cascade via parent_context in the normalizer
Document the cascade behavior in the field reference when adding the field.
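The cascade rule can be illustrated with a toy model. This is only a sketch: the real split lives in StyleCompiled, FaceStyle, and parent_context, and the flat property names and the effective_props helper below are hypothetical.

```python
from typing import Any

# Illustrative property names only: text/content and context properties cascade,
# box/layout properties (padding, width, ...) are deliberately absent.
CASCADING = {"font_family", "palette", "theme", "source"}


def effective_props(parent: dict[str, Any], child: dict[str, Any]) -> dict[str, Any]:
    """Child inherits cascading parent values unless it overrides them."""
    inherited = {k: v for k, v in parent.items() if k in CASCADING}
    return {**inherited, **child}


parent = {"font_family": "Inter", "padding": 24, "theme": "dark"}
child = {"theme": "light", "width": 400}
props = effective_props(parent, child)
```

Here the child keeps the parent's font, overrides the theme, and does not pick up the parent's padding.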
10. Lists for Ordered, Maps for Named
- Lists ([...]) for things where order matters: layout items, grid items, table columns, transform steps
- Maps ({...}) for things identified by name: queries, charts, variables, sources
Don't use a list of {name: ..., ...} objects when a map of
name: {...} would work. Maps are more readable and enable direct
reference by key.
    # YES: map for named queries
    queries:
      sales: { sql: "SELECT ..." }
      products: { sql: "SELECT ..." }

    # NO: list with explicit name fields
    queries:
      - name: sales
        sql: "SELECT ..."
      - name: products
        sql: "SELECT ..."
11. Cross-File References Use Dot Syntax
External references use filename.resource_id:
query: _shared_queries.sales
The filename (without extension) is the namespace. Partials use _ prefix
convention.
New features that reference named things should use the same dot syntax.
Don't introduce new reference mechanisms ($ref, imports, include
directives, etc.).
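Resolving the dot syntax starts with splitting a reference into its namespace and resource id. A minimal sketch, assuming a name without a dot refers to a resource in the same file (Ref and parse_ref are hypothetical names, not Dataface's resolver):

```python
from typing import NamedTuple, Optional


class Ref(NamedTuple):
    filename: Optional[str]  # None means "same file" (an assumption of this sketch)
    resource_id: str


def parse_ref(ref: str) -> Ref:
    """Split filename.resource_id at the first dot."""
    filename, dot, resource_id = ref.partition(".")
    if not dot:
        return Ref(None, ref)
    if not filename or not resource_id:
        raise ValueError(f"malformed reference: {ref!r}")
    return Ref(filename, resource_id)
```

Because every reference goes through one parser, a new feature that points at named things gets cross-file support without a new mechanism.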
12. Porting Config from Vega-Lite
When moving Vega-Lite configuration options into Dataface YAML:
Do
- Always convert camelCase → snake_case: cornerRadius → corner_radius. No camelCase property should ever appear in authored YAML. This is the most common source of inconsistency; catch it in review.
- Keep the same semantic grouping: if VL groups it under axis, keep it under axis (or axis_y, axis_x)
- Provide shorthand only for truly common patterns: if 80% of users will set the same thing, make it a top-level field
- Document what VL property it maps to: in the field reference and in code comments at the mapping site
Don't
- Rename for the sake of renaming: orient should stay orient, not become side or position
- Flatten VL's nesting without reason: VL groups axis.labelFontSize and axis.labelColor together because they're related. Keep them grouped.
- Create wrapper fields: don't add show_grid: true when the typed style surface already has axis.grid.hidden
- Bundle multiple VL properties into one Dataface field: don't create axis_style: "minimal" that secretly sets grid + domain + ticks. Use style presets for that.
Checklist for each new field ported from VL
- What is the Vega-Lite property name?
- What is the snake_case Dataface name? (should be obvious conversion)
- Where does it go in the Dataface YAML hierarchy?
- Does it need a shorthand form?
- What is the default? (goes in config YAML, not code)
- Is there an existing Dataface field that already covers this? (don't duplicate)
13. Adding New Chart Types
The user should not perceive a seam between VL-backed chart types and Dataface-native ones. All chart types live in one flat namespace, share the same top-level fields, and use the same style system.
When adding a chart type that Vega-Lite doesn't support natively (like KPI, table, spark_bar):
- Use the same top-level chart fields where they apply (query, title, type, style)
- Add type-specific fields at the chart level, not buried in nested config: value for KPI, style.columns for table
- Follow the same shorthand + full form pattern
- Participate in style/theme inheritance: chart-level style: should work the same way it does for VL chart types
- Document in the field reference alongside VL chart types, not in a separate "custom charts" section
When adding an alias for a VL chart type (like scatter → point,
heatmap → rect):
- Add it to the ChartType enum
- Map it mechanically in the profile layer
- Document it as an alias, not a new type
14. Adding New Variable Input Types
When adding a new input type:
- Prefer type-inference-by-structure if the new input has distinctive fields
- If not distinctive enough, use input: new_type
- Type-specific fields go under the variable definition, not in a separate config block
- Support sensible defaults: default: should work the same way
- Consider whether it needs query-driven options (like select does)
15. Error Messages Over Silent Behavior
When the YAML is wrong, the system should produce a clear error, not silently do something unexpected.
This applies to language design too: if a new field combination doesn't make
sense (e.g., theta on a bar chart), validate and error at compile time.
Don't silently ignore it.
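A compile-time check for the theta example might look like the sketch below. The rule table is an assumption (the guide only states that pie/donut use theta), validate_chart is a hypothetical name, and a plain ValueError stands in for whatever error type the compiler raises.

```python
from typing import Any

# Illustrative rule table: which chart types accept the theta channel.
THETA_TYPES = {"pie", "donut"}


def validate_chart(name: str, chart: dict[str, Any]) -> None:
    """Raise a clear error for a channel/type combination that makes no sense."""
    chart_type = chart.get("type")
    if "theta" in chart and chart_type not in THETA_TYPES:
        raise ValueError(
            f"chart '{name}': 'theta' is not valid for type '{chart_type}'; "
            f"use one of {sorted(THETA_TYPES)} or remove 'theta'"
        )
```

The message names the chart, the offending field, and the way out, so the author does not have to guess why the channel had no effect.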
16. Diagnostic Suppression
Dataface runs structural diagnostics on SQL queries at compile time
(fanout_risk, reaggregation, missing_join_predicate). When a flagged
pattern is intentional, authors can suppress the diagnostic at three levels.
Suppression is the union of all layers — any layer can silence a code.
Layer 1: SQL-inline (-- dft:ignore)

    queries:
      sales_by_customer:
        sql: |
          -- dft:ignore fanout_risk
          SELECT customer_id, SUM(o.amount), SUM(li.quantity)
          FROM orders o
          JOIN line_items li ON o.id = li.order_id
          GROUP BY customer_id
Multiple codes on one line: -- dft:ignore fanout_risk reaggregation.
Blanket suppress all (except parse_error): -- dft:ignore.
Follows the sqlfluff -- noqa: convention but uses dft:ignore to avoid
collision.
Layer 2: YAML ignore property

    queries:
      sales_by_customer:
        sql: SELECT ...
        ignore: [fanout_risk]
First-class field on the Query model. Discoverable, IDE-completable.
Layer 3: Project-wide via meta.yaml

    # meta.yaml
    lint:
      ignore:
        - fanout_risk
      ignore_queries:
        sales_by_customer:
          - reaggregation
lint.ignore suppresses everywhere. lint.ignore_queries suppresses per
query name. Cascades through the meta chain like other meta.yaml settings.
Design rules
- parse_error is never suppressible: if SQL can't be parsed, that's always an error.
- Suppressed diagnostics are recorded for audit (--show-suppressed in the CLI, suppressed_warnings on CompileResult). They are not silently discarded.
- When adding a new diagnostic code, decide whether it belongs in _UNSUPPRESSIBLE_CODES. Most codes should be suppressible.
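The union-of-layers rule can be sketched as below. This is an illustration of the semantics, not the real implementation: inline_ignores, is_suppressed, and their parameters are hypothetical, and recording suppressed diagnostics for audit is left out.

```python
import re
from typing import Iterable, Mapping, Optional

UNSUPPRESSIBLE = frozenset({"parse_error"})
_INLINE = re.compile(r"--\s*dft:ignore\b([^\n]*)")


def inline_ignores(sql: str) -> tuple[set[str], bool]:
    """Return (codes named in -- dft:ignore comments, whether any is a blanket ignore)."""
    codes: set[str] = set()
    blanket = False
    for match in _INLINE.finditer(sql):
        listed = match.group(1).split()
        if listed:
            codes.update(listed)
        else:
            blanket = True  # bare "-- dft:ignore" silences every suppressible code
    return codes, blanket


def is_suppressed(
    code: str,
    *,
    sql: str = "",
    query_ignore: Iterable[str] = (),
    project_ignore: Iterable[str] = (),
    project_ignore_queries: Optional[Mapping[str, Iterable[str]]] = None,
    query_name: str = "",
) -> bool:
    """A code is suppressed if any layer silences it, unless it is unsuppressible."""
    if code in UNSUPPRESSIBLE:
        return False
    codes, blanket = inline_ignores(sql)
    per_query = (project_ignore_queries or {}).get(query_name, ())
    return (
        blanket
        or code in codes
        or code in set(query_ignore)
        or code in set(project_ignore)
        or code in set(per_query)
    )
```

Checking the unsuppressible set first keeps that rule independent of how many suppression layers exist.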
17. Visibility Toggles
Three mechanisms exist for hiding things. Use the right one:
| Mechanism | When to use | Example |
|---|---|---|
| hidden: true | Dataface-authored field: renders nothing but the slot still exists | variables.hidden: true, style.spark_bar.labels_hidden: true |
| null / omit | Remove a property from the compiled spec entirely | title: null, legend: null |
| disable: true | Vega-Lite passthrough only; matches VL's own naming | axis.disable: true |
Rule: Any new visibility toggle on a Dataface-authored field uses hidden: bool = false
(positive-sense: false = visible). Never use show_* prefix or enabled for visibility.
pagination.enabled is not a visibility toggle — it activates a feature. It follows the
enabled pattern because it controls whether the feature exists, not whether it renders.
Checklist:
- [ ] New toggle hides something → hidden: false default, not show_*
- [ ] VL passthrough toggle → use VL's disable: name
- [ ] Removing from spec entirely → null, not hidden
Summary Checklist for New Syntax
Before adding or approving any new YAML field/feature:
- [ ] Uses snake_case naming
- [ ] Uses VL's name (converted to snake_case) if wrapping a VL concept, or divergence is documented
- [ ] Doesn't invent a parallel name for something VL already names (without good reason)
- [ ] No flat prefix_* repeated 3+ times; namespace them under prefix: instead
- [ ] Supports shorthand + full form where a clear 80% single-value case exists
- [ ] Type is inferred from structure where possible (no unnecessary type: fields)
- [ ] Default value is in YAML config, not hardcoded in Python
- [ ] Inherited properties are documented at each config level (commented-out placeholders or # inherits: notes)
- [ ] Works at nested board scope if it works at top level
- [ ] Cascade behavior is explicit: text/content properties cascade, box/layout properties don't (§9a)
- [ ] Chart fields are pure presentation, with no data transformation
- [ ] Simplest form requires fewest fields (progressive disclosure)
- [ ] Invalid combinations produce clear errors, not silent behavior
- [ ] Documented in the field reference
- [ ] Follows existing patterns in the same section of the YAML
- [ ] Visibility toggle uses hidden: false, not show_* or enabled (§17)