Seeds
Seeds are CSV files that LeapSQL loads directly into your database as tables. They're useful for static reference data, lookup tables, and sample data for development.
What are Seeds?
Seeds are CSV files in your seeds/ directory that become database tables. Unlike models (which transform data), seeds provide source data.
my-project/
├── seeds/
│ ├── country_codes.csv
│ ├── status_mappings.csv
│ └── sample_customers.csv
└── models/
└── ...Creating Seeds
Basic CSV Format
Create a CSV file with headers in the first row:
code,name,region
US,United States,North America
CA,Canada,North America
UK,United Kingdom,Europe
DE,Germany,Europe
JP,Japan,AsiaRequirements
- First row must be headers: Column names come from the header row
- Valid CSV format: Proper quoting, escaping, delimiters
- Consistent columns: All rows must have the same number of columns
- File extension: Must be
.csv
Loading Seeds
Use the seed command to load all seeds:
leapsql seedThis:
- Scans the
seeds/directory for CSV files - Creates or replaces tables for each file
- Reports the number of rows loaded
Output
Loading seeds...
country_codes: 5 rows
status_mappings: 4 rows
sample_customers: 100 rows
Done! Loaded 3 seeds.Table Names
The table name matches the filename (without .csv):
| File | Table Name |
|---|---|
country_codes.csv | country_codes |
raw_customers.csv | raw_customers |
2024_q1_data.csv | 2024_q1_data |
Using Seeds in Models
Reference seed tables like any other table:
/*---
name: customer_regions
---*/
SELECT
c.customer_id,
c.name,
c.country_code,
cc.name as country_name,
cc.region
FROM stg_customers c
LEFT JOIN country_codes cc ON c.country_code = cc.codeLeapSQL automatically detects the dependency on country_codes and ensures seeds are loaded before models run.
Data Types
LeapSQL infers data types from CSV content:
| CSV Content | Inferred Type |
|---|---|
123, 456 | INTEGER |
123.45, 67.89 | DOUBLE |
true, false | BOOLEAN |
2024-01-15 | DATE |
2024-01-15 10:30:00 | TIMESTAMP |
| Everything else | VARCHAR |
Type Inference Examples
id,amount,is_active,created_at,name
1,99.99,true,2024-01-15,Alice
2,149.50,false,2024-01-16,BobResults in:
id: INTEGERamount: DOUBLEis_active: BOOLEANcreated_at: DATEname: VARCHAR
Common Use Cases
Reference Data
Country codes, currency codes, status mappings:
status_code,status_name,is_final
pending,Pending,false
processing,Processing,false
shipped,Shipped,false
delivered,Delivered,true
cancelled,Cancelled,trueLookup Tables
Mappings and translations:
category_id,category_name,parent_category
1,Electronics,
2,Computers,Electronics
3,Laptops,Computers
4,Clothing,
5,Men's Clothing,ClothingDevelopment Data
Sample data for testing:
id,name,email,created_at
1,Test User 1,test1@example.com,2024-01-01
2,Test User 2,test2@example.com,2024-01-02
3,Test User 3,test3@example.com,2024-01-03Configuration Data
Feature flags, thresholds, settings:
key,value,description
min_order_amount,10.00,Minimum order value
max_discount_percent,50,Maximum discount allowed
free_shipping_threshold,100.00,Order value for free shippingSeed vs External Data
When to use seeds vs external data loading:
| Use Seeds When | Use External Loading When |
|---|---|
| Data is small (< 10K rows) | Data is large |
| Data changes infrequently | Data updates frequently |
| Data is version-controlled | Data comes from external systems |
| Same data across environments | Data varies by environment |
Best Practices
1. Keep Seeds Small
Seeds are meant for reference data, not large datasets:
Good: country_codes.csv (200 rows)
Bad: all_transactions.csv (10M rows)2. Version Control Seeds
Seeds should be committed to git:
git add seeds/country_codes.csv
git commit -m "Add country codes reference data"3. Use Descriptive Names
Name seeds clearly:
Good: order_status_codes.csv
Bad: data.csv4. Document in the CSV
Add a comment row or README for complex seeds:
# Seeds
## country_codes.csv
ISO 3166-1 country codes with regions.
Source: https://www.iso.org/iso-3166-country-codes.html
## order_statuses.csv
Valid order status values and their properties.5. Handle Special Characters
Properly escape commas and quotes:
id,name,description
1,Widget,"A simple, basic widget"
2,Gadget,"The ""best"" gadget"Troubleshooting
Type Mismatch Errors
If a seed column has mixed types:
id,value
1,100
2,abc # String in numeric column!Solution: Ensure consistent data types in each column, or cast in your model:
SELECT
id,
TRY_CAST(value AS INTEGER) as value
FROM my_seedEncoding Issues
If you see garbled characters:
- Ensure CSV is UTF-8 encoded
- Check for BOM (Byte Order Mark) characters
- Use a text editor to verify encoding
Missing Headers
If the first row is data, not headers:
1,Alice,alice@example.com # Missing header row!
2,Bob,bob@example.comSolution: Add a header row:
id,name,email
1,Alice,alice@example.com
2,Bob,bob@example.comWhitespace Issues
Extra whitespace can cause problems:
id, name ,email # Spaces around headers
1 , Alice , alice@example.com # Spaces in dataSolution: Trim whitespace in your model:
SELECT
TRIM(id) as id,
TRIM(name) as name,
TRIM(email) as email
FROM my_seedNext Steps
- Project Structure - Directory organization
- Models - Using seeds in models
- CLI Seed Command - Seed command options