Sources¶
Sources define where your data comes from—databases, files, or APIs. They're the foundation for all queries in your dashboards.
Designed for dbt Users
If you're familiar with dbt's profiles.yml, you already know Dataface sources. We use the same field names (dbname, host, schema, etc.) and support the env_var() syntax for credentials.
Key difference: Dataface sources are flat (no targets/outputs nesting) because dashboards read data, they don't build models.
Quick Start¶
The simplest way to set up sources is with a default source at the dashboard level:
title: Sales Dashboard source: my_database # Default for all queries queries: orders: sql: SELECT * FROM orders # Uses my_database
Source Types¶
Dataface supports three categories of sources:
Database Sources¶
| Type | Description |
|---|---|
postgres |
PostgreSQL database |
snowflake |
Snowflake data warehouse |
bigquery |
Google BigQuery |
redshift |
Amazon Redshift |
mysql |
MySQL database |
duckdb |
DuckDB (file or in-memory) |
File Sources¶
| Type | Description |
|---|---|
csv |
CSV files (local or remote URL) |
parquet |
Parquet files |
json |
JSON files or endpoints |
API Sources¶
| Type | Description |
|---|---|
http |
REST API endpoints |
Source Configuration¶
Database Example (Postgres)¶
{% raw %}
# In _sources.yaml or dataface.yml sources: analytics: type: postgres host: "{{ env_var('ANALYTICS_HOST') }}" port: 5432 dbname: analytics schema: public user: "{{ env_var('ANALYTICS_USER') }}" password: "{{ env_var('ANALYTICS_PASSWORD') }}"
DuckDB Example¶
sources: local_db: type: duckdb path: ./data/analytics.duckdb
CSV Example¶
sources: sales_data: type: csv file: assets/data/sales.csv delimiter: "," encoding: utf-8
HTTP Example¶
{% raw %}
sources: model_api: type: http url: "https://api.example.com" headers: Authorization: "Bearer {{ env_var('API_KEY') }}"
The _sources.yaml Convention¶
For projects with multiple dashboards, define sources in a shared file:
{% raw %}
# _sources.yaml sources: default: analytics # Project-wide default analytics: type: postgres host: "{{ env_var('DB_HOST') }}" dbname: analytics user: "{{ env_var('DB_USER') }}" password: "{{ env_var('DB_PASSWORD') }}" sales_csv: type: csv file: assets/data/sales.csv
Then reference in your dashboards:
# dashboard.yml title: My Dashboard # Uses _sources.yaml definitions source: analytics queries: orders: sql: SELECT * FROM orders # Uses analytics
Source Defaults & Hierarchy¶
Sources are resolved in this order (most specific wins):
- Query-level:
source:on individual query - Board-level:
source:orsources.defaulton dashboard - Nested boards: Inherit from parent
- Project-level:
sources.defaultindataface.yml
Example: Overriding Defaults¶
title: Multi-Source Dashboard source: main_db # Default for all queries queries: # Uses main_db (inherits default) orders: sql: SELECT * FROM orders # Overrides default with specific source external_data: source: external_api path: /data method: GET
Environment Variables¶
Use env_var() to keep credentials out of your YAML files:
{% raw %}
sources: analytics: type: postgres user: "{{ env_var('DB_USER') }}" password: "{{ env_var('DB_PASSWORD') }}" # With default value host: "{{ env_var('DB_HOST', 'localhost') }}"
Store credentials in .env:
Security Note
env_var() is only available in source configurations, not in queries. This prevents accidental exposure of secrets in query results.
For dbt Users¶
If you're already using dbt, use source: with your dbt profile name:
queries: orders: sql: SELECT * FROM orders source: my_dbt_profile # Looks up in profiles.yml target: dev # Optional target
Or mix dbt sources with Dataface sources:
{% raw %}
source: main_postgres # Dataface source for most queries queries: orders: sql: SELECT * FROM orders # Uses main_postgres dbt_data: sql: SELECT * FROM {{ ref('fct_orders') }} source: my_dbt_profile # Uses dbt profile
Field Name Compatibility¶
Dataface uses the same field names as dbt:
| Database | Field | Notes |
|---|---|---|
| Postgres | dbname, host, port, user, password, schema |
Same as dbt |
| Snowflake | account, database, warehouse, user, password, schema |
Same as dbt |
| BigQuery | project, dataset, keyfile |
Same as dbt |
Complete Examples¶
Dashboard with Database Source¶
title: Analytics Dashboard source: analytics_db queries: monthly_revenue: sql: | SELECT DATE_TRUNC('month', order_date) AS month, SUM(amount) AS revenue FROM orders GROUP BY 1 ORDER BY 1 charts: revenue_trend: query: monthly_revenue type: line x: month y: revenue rows: - revenue_trend
Dashboard with CSV Source¶
title: CSV Data Dashboard queries: sales: source: type: csv file: assets/data/sales.csv charts: sales_table: query: sales type: table rows: - sales_table
Dashboard with Multiple Sources¶
title: Multi-Source Dashboard sources: default: main_db queries: # Uses main_db orders: sql: SELECT * FROM orders # Uses inline CSV source products: source: type: csv file: assets/data/products.csv # Uses HTTP API weather: source: type: http url: "https://api.weather.com" path: /forecast params: city: "{{ city }}" charts: orders_chart: query: orders type: bar products_table: query: products type: table rows: - cols: - orders_chart - products_table
Read-only posture¶
Dataface dashboards read data — they never intentionally write to your warehouse. But "intentionally" is not the same as "enforced at the connection level." The enforcement story differs by warehouse:
DuckDB is the only warehouse with in-process read-only enforcement. File-based DuckDB databases are opened with the driver's native read_only=True flag; the driver itself refuses all writes. External file access (read_csv, read_parquet, remote URLs) is also disabled by default.
Every other warehouse has no native read-only flag in its dbt-adapter. The connection-level read-only posture for those warehouses is entirely on the operator side: bind a credential that can only SELECT.
| Warehouse | In-process enforcement | Recommended operator posture |
|---|---|---|
| DuckDB | read_only=True (driver-enforced) |
Handled automatically by Dataface |
| Postgres | None | Bind a SELECT-only role to the dbt profile credential |
| MySQL / MariaDB | None | Bind a SELECT-only user to the dbt profile credential |
| Snowflake | None | Bind a SELECT-only role to the Snowflake user or key-pair |
| BigQuery | None | Bind credentials to roles/bigquery.dataViewer (or equivalent) |
| Databricks | None | Unity Catalog: bind service principal to a SELECT-only role |
| Redshift | None | IAM/grants: bind to a SELECT-only IAM role or database user |
| SQL Server | None | Login-level SELECT-only grants |
In addition to the credential layer, Dataface enforces a SQL allowlist (sql_guard) in-process for all warehouses. The two defenses are complementary: sql_guard blocks disallowed statement types before execution; SELECT-only credentials enforce the same intent at the warehouse connection level.
Non-DuckDB warehouses
If you bind a credential that can write to Postgres, MySQL, or any other non-DuckDB warehouse, Dataface's in-process guard is sql_guard, not a driver-level write lock. Use SELECT-only credentials as your second layer of defense.
See Also¶
- Queries - Writing queries against sources
- Variables - Dynamic values in queries
- Getting Started - Full setup guide