Materializations

Materializations define how LeapSQL creates and updates your model outputs in the database. Choose the right materialization strategy based on your data volume, query patterns, and freshness requirements.

Available Materializations

| Type | Description | Use Case |
| --- | --- | --- |
| table | Creates a physical table | Default; most analytics models |
| view | Creates a SQL view | Simple transformations; staging |
| incremental | Merges new data into an existing table | Large fact tables; append-only data |

Table Materialization

The default materialization. Creates a physical table by:

  1. Dropping the existing table (if it exists)
  2. Creating a new table with the query results
sql
/*---
name: customer_summary
materialized: table
---*/

SELECT
    customer_id,
    COUNT(*) as order_count,
    SUM(amount) as total_spent
FROM orders
GROUP BY customer_id

Generated SQL:

sql
DROP TABLE IF EXISTS customer_summary;
CREATE TABLE customer_summary AS
SELECT
    customer_id,
    COUNT(*) as order_count,
    SUM(amount) as total_spent
FROM orders
GROUP BY customer_id;
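The drop-and-recreate sequence above can be exercised against any SQL engine. A minimal sketch using Python's built-in sqlite3 as a stand-in for DuckDB (table and column names are illustrative, not part of LeapSQL):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
con.executemany("INSERT INTO orders VALUES (?, ?)",
                [(1, 10.0), (1, 5.0), (2, 7.5)])

# Rebuild the model: drop the old table, then materialize the query results.
con.execute("DROP TABLE IF EXISTS customer_summary")
con.execute("""
    CREATE TABLE customer_summary AS
    SELECT customer_id,
           COUNT(*) AS order_count,
           SUM(amount) AS total_spent
    FROM orders
    GROUP BY customer_id
""")

print(con.execute(
    "SELECT * FROM customer_summary ORDER BY customer_id").fetchall())
# → [(1, 2, 15.0), (2, 1, 7.5)]
```

Note the gap between the DROP and the CREATE: any query arriving in that window sees no table, which is the "downtime during rebuild" caveat below.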

When to Use Tables

  • Models that are queried frequently
  • Aggregations and complex transformations
  • Final "mart" models consumed by BI tools
  • When query performance is more important than storage

Considerations

  • Full table rebuild on each run
  • Requires enough storage for complete dataset
  • Downtime during rebuild (table briefly doesn't exist)

View Materialization

Creates a SQL view instead of a physical table. The query runs each time the view is accessed.

sql
/*---
name: stg_customers
materialized: view
---*/

SELECT
    id as customer_id,
    TRIM(name) as customer_name,
    LOWER(email) as email
FROM raw_customers

Generated SQL:

sql
CREATE OR REPLACE VIEW stg_customers AS
SELECT
    id as customer_id,
    TRIM(name) as customer_name,
    LOWER(email) as email
FROM raw_customers;
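Because a view stores only the query, new source rows appear in it with no rebuild step. A small sketch, again using sqlite3 (which lacks CREATE OR REPLACE VIEW, so a DROP VIEW IF EXISTS stands in for it; names are illustrative):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE raw_customers (id INTEGER, name TEXT, email TEXT)")
con.execute("INSERT INTO raw_customers VALUES (1, '  Ada  ', 'ADA@EXAMPLE.COM')")

# sqlite3 has no CREATE OR REPLACE VIEW, so emulate it with a drop first.
con.execute("DROP VIEW IF EXISTS stg_customers")
con.execute("""
    CREATE VIEW stg_customers AS
    SELECT id AS customer_id,
           TRIM(name) AS customer_name,
           LOWER(email) AS email
    FROM raw_customers
""")

print(con.execute("SELECT * FROM stg_customers").fetchall())
# → [(1, 'Ada', 'ada@example.com')]

# The view re-runs its query on every access, so a new source row
# is visible immediately.
con.execute("INSERT INTO raw_customers VALUES (2, 'Bob', 'Bob@Example.com')")
print(con.execute("SELECT COUNT(*) FROM stg_customers").fetchone()[0])
# → 2
```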

When to Use Views

  • Staging models (light transformations)
  • When you want changes to propagate immediately
  • Simple column renaming or type casting
  • When storage is a concern

Considerations

  • Query runs every time the view is accessed
  • No performance benefit from pre-computation
  • Can be slower for complex transformations
  • Excellent for development iteration

Incremental Materialization

Updates only new or changed records instead of rebuilding the entire table. This is essential for large datasets where full rebuilds are impractical.

sql
/*---
name: fct_orders
materialized: incremental
unique_key: order_id
---*/

SELECT
    order_id,
    customer_id,
    amount,
    status,
    created_at,
    updated_at
FROM raw_orders
WHERE updated_at > '{{ var("last_run_at", "1900-01-01") }}'

How Incremental Works

  1. First run: Creates the table with all data (like table materialization)
  2. Subsequent runs:
    • Executes your query (which should filter to new/changed rows)
    • Merges results into the existing table using the unique_key

Generated SQL (subsequent runs):

sql
-- Using DuckDB's INSERT OR REPLACE
INSERT OR REPLACE INTO fct_orders
SELECT
    order_id,
    customer_id,
    amount,
    status,
    created_at,
    updated_at
FROM raw_orders
WHERE updated_at > '2024-01-15 10:30:00';
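The merge semantics of INSERT OR REPLACE can be seen in isolation with sqlite3, which shares the syntax DuckDB borrowed (a sketch with illustrative data; note that REPLACE only deduplicates when the unique_key is backed by a uniqueness constraint):

```python
import sqlite3

con = sqlite3.connect(":memory:")
# The unique_key maps to a uniqueness constraint; without one,
# INSERT OR REPLACE degenerates into a plain append.
con.execute("""
    CREATE TABLE fct_orders (
        order_id INTEGER PRIMARY KEY,
        status TEXT,
        updated_at TEXT
    )
""")
# First run: full load.
con.executemany("INSERT INTO fct_orders VALUES (?, ?, ?)", [
    (1, 'pending', '2024-01-10'),
    (2, 'shipped', '2024-01-12'),
])

# Subsequent run: merge one changed row and one new row.
con.executemany("INSERT OR REPLACE INTO fct_orders VALUES (?, ?, ?)", [
    (1, 'shipped', '2024-01-16'),   # existing order_id: row is replaced
    (3, 'pending', '2024-01-16'),   # new order_id: row is inserted
])

print(con.execute(
    "SELECT order_id, status FROM fct_orders ORDER BY order_id").fetchall())
# → [(1, 'shipped'), (2, 'shipped'), (3, 'pending')]
```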

unique_key Configuration

The unique_key identifies which rows to update vs insert:

Single column:

sql
/*---
unique_key: order_id
---*/

Composite key:

sql
/*---
unique_key:
  - customer_id
  - date
---*/
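A composite unique_key corresponds to a multi-column uniqueness constraint: a row is replaced only when every key column matches. A brief sqlite3 sketch (illustrative table name):

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Composite unique_key → composite primary key.
con.execute("""
    CREATE TABLE daily_spend (
        customer_id INTEGER,
        date TEXT,
        amount REAL,
        PRIMARY KEY (customer_id, date)
    )
""")
con.execute("INSERT INTO daily_spend VALUES (1, '2024-01-15', 20.0)")

# A re-run for the same (customer_id, date) pair replaces the row in place.
con.execute("INSERT OR REPLACE INTO daily_spend VALUES (1, '2024-01-15', 35.0)")

print(con.execute("SELECT amount FROM daily_spend").fetchall())
# → [(35.0,)]
```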

Incremental Strategies

Append-Only Data

For data that's only inserted, never updated:

sql
/*---
name: fct_events
materialized: incremental
unique_key: event_id
---*/

SELECT *
FROM raw_events
WHERE event_timestamp > '{{ var("last_run_at", "1900-01-01") }}'

Data with Updates

For data that can be modified after initial insert:

sql
/*---
name: fct_orders
materialized: incremental
unique_key: order_id
---*/

SELECT *
FROM raw_orders
WHERE updated_at > '{{ var("last_run_at", "1900-01-01") }}'

When to Use Incremental

  • Large fact tables (millions+ rows)
  • Data with clear incremental patterns
  • When full rebuilds take too long
  • Append-only event data

Considerations

  • Requires a reliable unique_key
  • Needs a column that identifies new/changed rows (timestamp, ID)
  • First run still processes all data
  • More complex to debug than full rebuilds
  • Late-arriving data needs special handling
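One common way to tolerate modestly late-arriving rows (an illustrative pattern, not a built-in LeapSQL feature) is to re-scan a short lookback window behind the last run timestamp; the unique_key merge makes re-processing those rows idempotent. Using DuckDB interval arithmetic:

```sql
/*---
name: fct_orders
materialized: incremental
unique_key: order_id
---*/

SELECT *
FROM raw_orders
-- Re-read a 3-day window before the last run; the merge on order_id
-- deduplicates rows that were already loaded.
WHERE updated_at > TIMESTAMP '{{ var("last_run_at", "1900-01-01") }}' - INTERVAL 3 DAY
```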

Choosing the Right Materialization

                    ┌─────────────────┐
                    │  Is the model   │
                    │    complex?     │
                    └────────┬────────┘

                 ┌───────────┴───────────┐
                 │                       │
                 ▼                       ▼
            No (simple             Yes (joins,
            transforms)            aggregations)
                 │                       │
                 ▼                       │
            ┌────────┐                   │
            │  view  │                   │
            └────────┘                   │

                              ┌─────────────────┐
                              │  Is the table   │
                              │  very large?    │
                              └────────┬────────┘

                           ┌───────────┴───────────┐
                           │                       │
                           ▼                       ▼
                     No (< 1M rows)          Yes (> 1M rows)
                           │                       │
                           ▼                       ▼
                      ┌────────┐            ┌─────────────┐
                      │ table  │            │ incremental │
                      └────────┘            └─────────────┘

Performance Comparison

| Aspect | Table | View | Incremental |
| --- | --- | --- | --- |
| Build time | O(n) | O(1) | O(delta) |
| Query time | Fast | Varies | Fast |
| Storage | Full | None | Full |
| Complexity | Low | Low | Medium |

Best Practices

1. Start with Tables

Use table materialization by default. Only optimize to incremental when you have a proven performance need.

2. Use Views for Staging

Staging models rarely need to be materialized as tables:

sql
/*---
name: stg_customers
materialized: view  # light transformation; a view is sufficient
---*/

3. Design for Incremental

When building incremental models, ensure your source data supports it:

  • Has a reliable timestamp column
  • Has a stable unique identifier
  • Doesn't have backdated updates (or handle them explicitly)

4. Full Refresh Option

Sometimes you need to rebuild an incremental model completely:

bash
# Full refresh a specific model
leapsql run --select fct_orders --full-refresh
