Project Lint Rules
LeapSQL includes 13 project lint rules organized into 3 categories.
Modeling
Rules about model structure and DAG organization.
PM01 - root-models
Severity: warning
Models with no sources (broken DAG lineage)
Why This Matters
Non-staging models without upstream dependencies indicate broken lineage. These "root" models don't reference any tables, suggesting either a configuration error, a model that should be a seed, or missing FROM/JOIN clauses. Proper DAG lineage is essential for understanding data flow.
Bad
-- models/marts/fct_orders.sql
SELECT 1 AS id, 'test' AS name -- No FROM clause, no sources
Good
-- models/marts/fct_orders.sql
SELECT id, name
FROM {{ ref('stg_orders') }}
How to Fix
Add appropriate FROM/JOIN clauses to reference upstream models or sources, or convert to a seed if this is static data.
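The check described above can be sketched in a few lines. This is only an illustration, not LeapSQL's implementation: the function name, the `upstreams_by_model` mapping, and the `/staging/` path test are all assumptions made for the example.

```python
def find_root_models(upstreams_by_model):
    """Return non-staging models that reference no upstream model or source.

    `upstreams_by_model` is a hypothetical mapping of model path to the list
    of tables the model reads from.
    """
    return sorted(
        path
        for path, upstreams in upstreams_by_model.items()
        # Assumption: staging models are recognized by their directory.
        if "/staging/" not in path and not upstreams
    )


models = {
    "models/marts/fct_orders.sql": [],               # no FROM clause
    "models/marts/fct_revenue.sql": ["stg_orders"],
    "models/staging/stg_orders.sql": ["raw_orders"],
}
print(find_root_models(models))
```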
PM02 - source-fanout
Severity: warning
Source referenced by multiple non-staging models
Why This Matters
Each raw source should be referenced by exactly one staging model, which then provides a clean interface for downstream models. When multiple non-staging models reference the same source directly, transformation logic gets duplicated and changes to the source require updates in multiple places.
Bad
-- models/marts/fct_orders.sql
SELECT * FROM raw_orders -- Direct source reference
-- models/marts/fct_revenue.sql
SELECT * FROM raw_orders -- Same source, duplicated reference
Good
-- models/staging/stg_orders.sql
SELECT * FROM raw_orders -- Single staging model for source
-- models/marts/fct_orders.sql
SELECT * FROM {{ ref('stg_orders') }} -- Reference staging model
How to Fix
Create a staging model for the source and have all downstream models reference the staging model instead.
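As a rough sketch of how such a check could work, the snippet below groups direct source references by source and reports any source read by more than one non-staging model. The function name, the input mapping, and the directory-based staging test are hypothetical and chosen for illustration only.

```python
from collections import defaultdict


def find_source_fanout(source_refs_by_model):
    """Map each source to the non-staging models that read it directly,
    keeping only sources with more than one such model."""
    consumers = defaultdict(list)
    for path, sources in source_refs_by_model.items():
        if "/staging/" in path:
            continue  # staging models are expected to read raw sources
        for source in sources:
            consumers[source].append(path)
    return {s: sorted(paths) for s, paths in consumers.items() if len(paths) > 1}


refs = {
    "models/marts/fct_orders.sql": ["raw_orders"],
    "models/marts/fct_revenue.sql": ["raw_orders"],
    "models/staging/stg_customers.sql": ["raw_customers"],
}
print(find_source_fanout(refs))
```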
PM03 - staging-depends-staging
Severity: warning
Staging model references another staging model
Why This Matters
Staging models should only reference raw sources, not other staging models. When staging models depend on each other, it blurs the boundary between data cleaning (staging) and data transformation (intermediate/marts). If you need to combine staging models, create an intermediate model instead.
Bad
-- models/staging/stg_orders_enhanced.sql
SELECT o.*, c.name
FROM {{ ref('stg_orders') }} o -- Staging depending on staging
JOIN {{ ref('stg_customers') }} c ON o.customer_id = c.id
Good
-- models/intermediate/int_orders_with_customers.sql
SELECT o.*, c.name
FROM {{ ref('stg_orders') }} o
JOIN {{ ref('stg_customers') }} c ON o.customer_id = c.id
How to Fix
Move the model to the intermediate layer if it combines staging models, or reference raw sources directly if it's truly staging.
PM04 - model-fanout
Severity: warning
Model has too many direct downstream consumers
Why This Matters
Models with many downstream consumers become bottlenecks for changes. A "God Model" that many models depend on makes refactoring risky, because every change has to be checked against each consumer. Consider whether the model should be split into focused models or whether an abstraction layer is needed.
Bad
-- stg_orders is consumed by 10+ models directly
-- Any change to stg_orders requires checking all consumers
Good
-- Create focused intermediate models
-- int_order_metrics, int_order_dates, int_order_status
-- Each downstream model references only what it needs
How to Fix
Split the model into smaller, focused models, or create intermediate abstraction layers to reduce direct dependencies.
Configuration
This rule accepts the following configuration options: threshold
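A minimal sketch of the fanout count, assuming the project graph is available as a mapping from each model to its direct upstreams. The names are hypothetical, and treating the threshold as exclusive (flag only when the count is greater than `threshold`) is an assumption, not documented LeapSQL behavior.

```python
def find_model_fanout(upstreams_by_model, threshold):
    """Return {model: consumer_count} for models with more than
    `threshold` direct downstream consumers."""
    consumer_counts = {}
    for model, upstreams in upstreams_by_model.items():
        # set() so a model that references an upstream twice counts once
        for upstream in set(upstreams):
            consumer_counts[upstream] = consumer_counts.get(upstream, 0) + 1
    return {m: n for m, n in consumer_counts.items() if n > threshold}


dag = {
    "fct_orders": ["stg_orders"],
    "fct_revenue": ["stg_orders"],
    "fct_returns": ["stg_orders", "stg_customers"],
}
print(find_model_fanout(dag, threshold=2))
```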
PM05 - too-many-joins
Severity: warning
Model references too many upstream models
Why This Matters
High join counts often indicate a "God Model" that tries to do too much in a single query. These models are hard to understand, slow to execute, and difficult to maintain. Breaking complex queries into smaller intermediate models improves readability and allows for incremental processing.
Bad
-- fct_comprehensive_report.sql with 8+ JOINs
SELECT * FROM stg_orders
JOIN stg_customers ON ...
JOIN stg_products ON ...
JOIN stg_payments ON ...
JOIN stg_shipments ON ...
-- ... more joins
Good
-- Break into focused intermediate models
-- int_order_details.sql (orders + customers + products)
-- int_order_fulfillment.sql (orders + shipments + payments)
-- fct_report.sql (join intermediates)
How to Fix
Create intermediate models that pre-join related tables, then compose them in the final model.
Configuration
This rule accepts the following configuration options: threshold
PM06 - downstream-on-source
Severity: warning
Marts or intermediate model depends directly on source (not staging)
Why This Matters
The recommended transformation pattern is Sources → Staging → Intermediate → Marts. When marts or intermediate models reference sources directly, they bypass data cleaning in staging, leading to duplicated transformation logic and making lineage harder to understand.
Bad
-- models/marts/fct_orders.sql
SELECT * FROM raw.orders -- Direct source reference in marts
Good
-- models/staging/stg_orders.sql
SELECT * FROM raw.orders
-- models/marts/fct_orders.sql
SELECT * FROM {{ ref('stg_orders') }} -- Reference staging
How to Fix
Create a staging model for the source and reference it instead of the raw source.
PM07 - rejoining-upstream
Severity: warning
Unnecessary intermediate model in a fan-in pattern (A→B, A→C, B→C where B has no other consumers)
Why This Matters
When model B has exactly one consumer (C), and B's upstream (A) is also a direct upstream of C, model B serves no purpose as a reusable abstraction. The pattern A→B→C with A→C means B's logic could be inlined into C, eliminating an unnecessary model and simplifying the DAG.
Bad
-- stg_orders (A) → int_order_totals (B) → fct_report (C)
-- stg_orders (A) → fct_report (C)
-- int_order_totals only has one consumer and doesn't add reusable value
Good
-- Either inline B into C:
-- stg_orders → fct_report (with B's logic inlined)
-- Or give B more consumers to justify its existence:
-- stg_orders → int_order_totals → fct_report
-- stg_orders → int_order_totals → fct_dashboard
How to Fix
Either inline the intermediate model's logic into its single consumer, or add more consumers to justify it as a reusable abstraction.
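The A→B, A→C, B→C pattern can be found with a single pass over an inverted dependency graph. The sketch below is one possible way to express it; the function name and the `upstreams_by_model` input are hypothetical, and it is not taken from LeapSQL's source.

```python
def find_rejoined_upstreams(upstreams_by_model):
    """Return (A, B, C) triples where A->B, A->C, B->C and C is B's only consumer."""
    # Invert the graph: upstream model -> set of direct consumers.
    consumers = {}
    for model, upstreams in upstreams_by_model.items():
        for upstream in upstreams:
            consumers.setdefault(upstream, set()).add(model)

    findings = []
    for b, b_consumers in consumers.items():
        if len(b_consumers) != 1:
            continue  # B is reused elsewhere, so it is a real abstraction
        (c,) = b_consumers
        for a in upstreams_by_model.get(b, ()):
            if a in upstreams_by_model.get(c, ()):
                findings.append((a, b, c))
    return sorted(findings)


bad_dag = {
    "int_order_totals": ["stg_orders"],
    "fct_report": ["stg_orders", "int_order_totals"],
}
# Adding a second consumer of int_order_totals clears the finding.
good_dag = dict(bad_dag, fct_dashboard=["int_order_totals"])

print(find_rejoined_upstreams(bad_dag))
print(find_rejoined_upstreams(good_dag))
```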
Lineage
Rules about data lineage and column dependencies.
PL01 - passthrough-bloat
Severity: warning
Model has too many passthrough columns
Why This Matters
Models with many passthrough columns (direct copies without transformation) indicate a "SELECT *" style that doesn't add value and increases data movement. Explicit column selection ensures only necessary data is processed and makes dependencies clear.
Bad
SELECT
    id, name, email, phone, address, -- All passthrough
    created_at, updated_at, deleted_at,
    field1, field2, field3, field4, field5 -- 20+ columns just copied
FROM {{ ref('stg_customers') }}
Good
SELECT
    id,
    name,
    email,
    COALESCE(phone, 'N/A') AS phone, -- Actual transformation
    created_at
FROM {{ ref('stg_customers') }}
How to Fix
Remove unnecessary passthrough columns and only select the columns that are actually needed or transformed.
Configuration
This rule accepts the following configuration options: threshold
PL02 - orphaned-columns
Severity: info
Columns not used by any downstream model
Why This Matters
Columns that are computed but never used by downstream models represent wasted compute and storage. These "orphaned" columns often accumulate over time as requirements change but models aren't cleaned up. Removing them reduces costs and simplifies the data model.
Bad
-- int_orders.sql outputs: id, amount, tax, discount, shipping, notes
-- fct_revenue.sql only uses: id, amount, tax
-- 'discount', 'shipping', 'notes' are orphaned
Good
-- int_orders.sql outputs: id, amount, tax
-- fct_revenue.sql uses: id, amount, tax
-- All columns are consumed downstream
How to Fix
Remove columns that are not consumed by any downstream model, or document why they should be retained.
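Conceptually, the rule is a set difference between a model's output columns and the columns its consumers read. A small sketch under that assumption follows; the function and argument names are made up for the example.

```python
def find_orphaned_columns(output_columns, columns_used_downstream):
    """Return output columns that no downstream model reads.

    `columns_used_downstream` is a hypothetical mapping of
    downstream model name -> columns it reads from this model.
    """
    used = set()
    for columns in columns_used_downstream.values():
        used.update(columns)
    # Preserve the model's own column order in the report.
    return [c for c in output_columns if c not in used]


int_orders_outputs = ["id", "amount", "tax", "discount", "shipping", "notes"]
usage = {"fct_revenue": ["id", "amount", "tax"]}
print(find_orphaned_columns(int_orders_outputs, usage))
```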
PL04 - implicit-cross-join
Severity: warning
JOINs with no visible join keys in column lineage
Why This Matters
When a model references multiple tables but no column expression bridges them, a JOIN condition is probably missing and the query produces a Cartesian product. Cross joins are rarely intentional and can cause massive data explosion. This rule uses column lineage to detect potential cross-join scenarios.
Bad
SELECT
    o.id,
    o.amount,
    c.name -- No column references both 'o' and 'c' tables
FROM orders o, customers c -- Implicit cross join
Good
SELECT
    o.id,
    o.amount,
    c.name
FROM orders o
JOIN customers c ON o.customer_id = c.id -- Explicit join condition
How to Fix
Add explicit JOIN conditions between all referenced tables, or confirm that a cross join is intentional.
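One way to think about "no column expression bridges them" is as a connectivity check: treat each table as a node and connect the tables that appear together in any expression. If the tables fall into more than one group, nothing links them. The sketch below uses a small union-find for this. It assumes join predicates are included among the expressions, which may differ from how LeapSQL derives its column lineage, and all names are hypothetical.

```python
def has_implicit_cross_join(tables, expression_tables):
    """Return True if the referenced tables are not all linked by expressions.

    `tables` lists the table aliases a model references; `expression_tables`
    lists, per expression (select column, ON/WHERE predicate), the set of
    aliases that expression reads from.
    """
    parent = {t: t for t in tables}

    def find(t):
        # Walk up to the group representative, shortening the path as we go.
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    for used in expression_tables:
        known = [t for t in used if t in parent]
        for other in known[1:]:
            parent[find(other)] = find(known[0])

    return len({find(t) for t in tables}) > 1


# FROM orders o, customers c with no predicate linking them
bad = has_implicit_cross_join(["o", "c"], [{"o"}, {"o"}, {"c"}])
# Same columns plus ON o.customer_id = c.id
good = has_implicit_cross_join(["o", "c"], [{"o"}, {"o"}, {"c"}, {"o", "c"}])
print(bad, good)
```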
PL05 - schema-drift
Severity: warning
SELECT * from source with changed schema since last run
Why This Matters
When using SELECT *, upstream schema changes silently propagate to your model. Added columns may break downstream processes, while removed columns cause runtime errors. This rule compares current source schemas against snapshots from the last run to catch breaking changes early.
Bad
-- Model uses SELECT * and upstream added a breaking column
SELECT *
FROM {{ source('raw', 'orders') }}
-- raw.orders added 'internal_notes' column that shouldn't be exposed
Good
-- Explicit column selection protects against schema drift
SELECT
    id,
    customer_id,
    amount,
    created_at
FROM {{ source('raw', 'orders') }}
How to Fix
Replace SELECT * with explicit column selection, or review and accept the schema changes if they are expected.
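The snapshot comparison can be pictured as a diff of two column lists. The following sketch assumes the previous run's columns are available as a plain list; how LeapSQL actually stores and loads snapshots is not covered here, and the function name is hypothetical.

```python
def detect_schema_drift(uses_select_star, snapshot_columns, current_columns):
    """Return added/removed columns for a SELECT * model, or None if nothing drifted."""
    if not uses_select_star:
        return None  # explicit column lists are not affected by this rule
    snapshot, current = set(snapshot_columns), set(current_columns)
    added = [c for c in current_columns if c not in snapshot]
    removed = [c for c in snapshot_columns if c not in current]
    if not added and not removed:
        return None
    return {"added": added, "removed": removed}


last_run = ["id", "customer_id", "amount", "created_at"]
this_run = ["id", "customer_id", "amount", "created_at", "internal_notes"]
print(detect_schema_drift(True, last_run, this_run))
```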
Structure
Rules about project structure and naming conventions.
PS01 - model-naming
Severity: warning
Model naming convention mismatch
Why This Matters
Consistent naming conventions make it easy to identify model types at a glance. Models in specific directories should follow the expected prefix convention: staging models use 'stg_', intermediate models use 'int_', and marts models use 'fct_' or 'dim_'.
Bad
-- models/staging/orders.sql (missing stg_ prefix)
-- models/marts/order_metrics.sql (missing fct_ or dim_ prefix)
Good
-- models/staging/stg_orders.sql
-- models/marts/fct_order_metrics.sql
-- models/marts/dim_customers.sql
How to Fix
Rename the model to include the appropriate prefix for its directory location.
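The prefix convention above maps directly onto a lookup table. Here is a minimal sketch of the check; the directory-to-prefix mapping comes from the convention described in this rule, while the function name and the path handling are assumptions for illustration.

```python
from pathlib import PurePosixPath

# Directory name -> allowed model name prefixes (from the convention above).
EXPECTED_PREFIXES = {
    "staging": ("stg_",),
    "intermediate": ("int_",),
    "marts": ("fct_", "dim_"),
}


def follows_naming_convention(path):
    """Return False if a model sits in a known directory without its prefix."""
    model = PurePosixPath(path)
    directories = model.parts[:-1]
    for directory, prefixes in EXPECTED_PREFIXES.items():
        if directory in directories:
            # str.startswith accepts a tuple of alternatives
            return model.stem.startswith(prefixes)
    return True  # directories without a convention are not checked


for p in ("models/staging/orders.sql", "models/marts/dim_customers.sql"):
    print(p, follows_naming_convention(p))
```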
PS02 - model-directory
Severity: warning
Model directory mismatch
Why This Matters
A model's name prefix should match its directory location. When a model is named 'stg_orders' but placed in 'marts/', it creates confusion about the model's purpose and breaks organizational conventions that teams rely on for navigation.
Bad
-- models/marts/stg_orders.sql (stg_ model in marts directory)
-- models/staging/fct_revenue.sql (fct_ model in staging directory)
Good
-- models/staging/stg_orders.sql
-- models/marts/fct_revenue.sql
How to Fix
Move the model to the directory that matches its name prefix, or rename it to match its current location.
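This check is the inverse lookup of the naming rule: start from the prefix and ask which directory it implies. A small sketch, assuming the prefix-to-directory mapping described under PS01 (stg_ for staging, int_ for intermediate, fct_ and dim_ for marts); the function names are hypothetical.

```python
from pathlib import PurePosixPath

# Model name prefix -> directory it is expected to live in.
PREFIX_TO_DIRECTORY = {
    "stg_": "staging",
    "int_": "intermediate",
    "fct_": "marts",
    "dim_": "marts",
}


def expected_directory(path):
    """Return the directory implied by the model's prefix, or None if it has none."""
    stem = PurePosixPath(path).stem
    for prefix, directory in PREFIX_TO_DIRECTORY.items():
        if stem.startswith(prefix):
            return directory
    return None


def is_misplaced(path):
    """Return True if the model's prefix implies a directory it is not in."""
    directory = expected_directory(path)
    return directory is not None and directory not in PurePosixPath(path).parts[:-1]


for p in ("models/marts/stg_orders.sql", "models/staging/stg_orders.sql"):
    print(p, is_misplaced(p))
```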