Appendix: Sample Project
Census Explorer is a complete working example that demonstrates Invariant in action.
Overview
The sample project demonstrates:
- Implementing `CatalogStore`: loading metadata from JSON
- Implementing `QueryEngine`: executing queries with DuckDB
- Building a CLI: validating and querying with semantic rules
- Validation in practice: seeing queries allowed, warned, or blocked
Location
The sample project lives in `examples/sample-project/`.
Installation
```shell
cd examples/sample-project
pip install -e ../../   # Install invariant
pip install -e .        # Install census-explorer
```
Quick commands
```shell
# List available data products
census-explorer list-data-products

# Validate a query (allowed)
census-explorer validate aa0e8400-e29b-41d4-a716-446655440001 \
  -m population:SUM -d geography_code

# Validate a query (blocked: an indicator cannot be summed)
census-explorer validate aa0e8400-e29b-41d4-a716-446655440002 \
  -m unemployment_rate:SUM -d geography_code

# Execute a query
census-explorer query aa0e8400-e29b-41d4-a716-446655440001 \
  -m population:SUM -d geography_code -d sex
```
File layout
```text
examples/sample-project/
├── pyproject.toml                    # Package configuration
├── README.md                         # Detailed usage guide
├── data/
│   ├── catalog.json                  # Metadata definitions
│   ├── census_demographics.parquet   # Population data (306 rows)
│   └── labour_force.parquet          # Employment data (9 rows)
└── src/census_explorer/
    ├── cli.py                        # CLI commands (Typer)
    └── infrastructure/
        ├── json_catalog.py           # CatalogStore implementation
        └── duckdb_engine.py          # QueryEngine implementation
```
Data products
| ID | Name | Kind | Description |
|---|---|---|---|
| `aa0e8400-e29b-41d4-a716-446655440001` | Population by Geography and Demographics | FACT | Census population counts (can SUM) |
| `aa0e8400-e29b-41d4-a716-446655440002` | Labour Force Indicators | INDICATOR | Employment rates (cannot SUM) |
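The Kind column determines how the two `validate` commands under Quick commands turn out. The rule can be sketched roughly as follows. This is an illustration only: `Kind`, `Outcome`, `PRODUCT_KINDS`, and `check_aggregation` are hypothetical names, not Invariant's actual API, and the real validator also has a "warned" outcome that is not modelled here.

```python
from enum import Enum


class Kind(Enum):
    FACT = "FACT"
    INDICATOR = "INDICATOR"


class Outcome(Enum):
    ALLOWED = "allowed"
    BLOCKED = "blocked"


# Hypothetical lookup mirroring the table: data product ID -> kind.
PRODUCT_KINDS = {
    "aa0e8400-e29b-41d4-a716-446655440001": Kind.FACT,
    "aa0e8400-e29b-41d4-a716-446655440002": Kind.INDICATOR,
}


def check_aggregation(product_id: str, aggregation: str) -> Outcome:
    """Block SUM on indicator products; allow everything else in this sketch."""
    kind = PRODUCT_KINDS[product_id]
    if kind is Kind.INDICATOR and aggregation.upper() == "SUM":
        return Outcome.BLOCKED
    return Outcome.ALLOWED
```

Summing population counts across geographies gives a meaningful total, whereas summing unemployment rates does not, which is why only the second product is blocked.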
What to inspect
data/catalog.json
The metadata file defines:
- 2 studies (Census 2021, Labour Force Survey 2023)
- 2 universes (population definitions)
- 2 datasets
- 2 data products with variables
- 1 indicator definition (`unemployment_rate` with `NOT_AGGREGATABLE` policy)
infrastructure/json_catalog.py
Shows how to implement the `CatalogStore` port:
- Load entities from JSON
- Build a `CatalogSnapshot` for validation
- Provide list and retrieve operations
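The three responsibilities above might look roughly like the sketch below. Everything in it is an assumption made for illustration: the method names on `CatalogStore`, the fields of `DataProduct` and `CatalogSnapshot`, and the JSON layout are hypothetical, and the real port and catalog file may differ. The store takes a JSON string rather than a file path so that the example stays self-contained.

```python
import json
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class DataProduct:
    id: str
    name: str
    kind: str


@dataclass(frozen=True)
class CatalogSnapshot:
    """Immutable view of the catalog handed to a validator (hypothetical shape)."""
    data_products: tuple[DataProduct, ...]


class CatalogStore(Protocol):
    """Stand-in for the port; the real interface may declare other methods."""

    def list_data_products(self) -> list[DataProduct]: ...
    def get_data_product(self, product_id: str) -> DataProduct: ...
    def snapshot(self) -> CatalogSnapshot: ...


class JsonCatalogStore:
    def __init__(self, raw_json: str) -> None:
        payload = json.loads(raw_json)
        # Assumed layout: {"data_products": [{"id": ..., "name": ..., "kind": ...}]}
        self._products = {
            item["id"]: DataProduct(item["id"], item["name"], item["kind"])
            for item in payload["data_products"]
        }

    def list_data_products(self) -> list[DataProduct]:
        return list(self._products.values())

    def get_data_product(self, product_id: str) -> DataProduct:
        return self._products[product_id]

    def snapshot(self) -> CatalogSnapshot:
        return CatalogSnapshot(tuple(self._products.values()))


# Usage with an inline document instead of data/catalog.json.
SAMPLE = '{"data_products": [{"id": "p1", "name": "Population", "kind": "FACT"}]}'
store = JsonCatalogStore(SAMPLE)
print(store.get_data_product("p1").name)
```

Using a structural `Protocol` means the JSON-backed class satisfies the port without inheriting from it, which keeps the infrastructure layer decoupled from the core.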
infrastructure/duckdb_engine.py
Shows how to implement a query engine:
- Translate `QueryPlan` to SQL
- Execute against parquet files
- Return typed results
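As a rough illustration of the translation step, the sketch below turns a plan into a SQL string that uses DuckDB's `read_parquet` table function. The `QueryPlan` and `Measure` shapes, `ALLOWED_AGGREGATIONS`, and `plan_to_sql` are assumptions for this example; the real `QueryPlan` may carry more fields. Execution is left out so the block runs without DuckDB installed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measure:
    column: str
    aggregation: str  # e.g. "SUM"


@dataclass(frozen=True)
class QueryPlan:
    """Hypothetical plan shape, not Invariant's actual class."""
    parquet_path: str
    measures: tuple[Measure, ...]
    dimensions: tuple[str, ...]


# Whitelist so that only known aggregate functions reach the SQL text.
ALLOWED_AGGREGATIONS = {"SUM", "AVG", "MIN", "MAX", "COUNT"}


def quote_ident(name: str) -> str:
    """Double-quote an identifier, escaping embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def plan_to_sql(plan: QueryPlan) -> str:
    """Build a SELECT ... GROUP BY statement over a parquet file."""
    select_parts = [quote_ident(d) for d in plan.dimensions]
    for m in plan.measures:
        agg = m.aggregation.upper()
        if agg not in ALLOWED_AGGREGATIONS:
            raise ValueError(f"unsupported aggregation: {m.aggregation}")
        select_parts.append(f"{agg}({quote_ident(m.column)}) AS {quote_ident(m.column)}")
    path = plan.parquet_path.replace("'", "''")  # escape single quotes in the path
    sql = f"SELECT {', '.join(select_parts)} FROM read_parquet('{path}')"
    if plan.dimensions:
        group_by = ", ".join(quote_ident(d) for d in plan.dimensions)
        sql += f" GROUP BY {group_by}"
    return sql


plan = QueryPlan(
    parquet_path="data/census_demographics.parquet",
    measures=(Measure("population", "SUM"),),
    dimensions=("geography_code", "sex"),
)
print(plan_to_sql(plan))
```

The resulting string could then be run with something like `duckdb.connect().execute(sql).fetchall()`; how the sample project maps rows to typed results is not shown here.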
cli.py
Shows the integration pattern:
- Wire up catalog store and validator
- Build `QueryRequest` from CLI args
- Call `ValidateQueryUseCase`
- Display results with Rich tables
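The argument-handling half of that pattern can be sketched as below. Note the substitutions: the sample project uses Typer, while this sketch uses the standard library's `argparse` so that it runs without extra dependencies. The `QueryRequest` fields, `parse_measure`, `build_parser`, and `build_request` are hypothetical names. The call into `ValidateQueryUseCase` is omitted because its interface is not described here.

```python
import argparse
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryRequest:
    """Hypothetical request shape built from CLI arguments."""
    data_product_id: str
    measures: tuple[tuple[str, str], ...]  # (variable, aggregation) pairs
    dimensions: tuple[str, ...]


def parse_measure(spec: str) -> tuple[str, str]:
    """Split 'population:SUM' into ('population', 'SUM')."""
    name, sep, aggregation = spec.partition(":")
    if not sep or not name or not aggregation:
        raise argparse.ArgumentTypeError(f"expected NAME:AGG, got {spec!r}")
    return name, aggregation.upper()


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="census-explorer")
    sub = parser.add_subparsers(dest="command", required=True)
    validate = sub.add_parser("validate")
    validate.add_argument("data_product_id")
    # -m and -d may be repeated; each occurrence is appended to a list.
    validate.add_argument("-m", dest="measures", action="append", type=parse_measure)
    validate.add_argument("-d", dest="dimensions", action="append")
    return parser


def build_request(argv: list[str]) -> QueryRequest:
    args = build_parser().parse_args(argv)
    return QueryRequest(
        data_product_id=args.data_product_id,
        measures=tuple(args.measures or ()),
        dimensions=tuple(args.dimensions or ()),
    )


request = build_request([
    "validate", "aa0e8400-e29b-41d4-a716-446655440002",
    "-m", "unemployment_rate:SUM", "-d", "geography_code",
])
print(request)
```

Keeping request construction separate from validation makes it possible to test the parsing on its own and to reuse the same request object for both the `validate` and `query` commands.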
How it maps to concepts
| Sample file | Concept |
|---|---|
| `catalog.json` (studies) | Data Products |
| `catalog.json` (variables) | Variables |
| `catalog.json` (universes) | Universe |
| Blocked SUM on indicator | Indicator Aggregation |
| `cli.py` (validation flow) | Query Lifecycle |