Where Your Data
Finally Comes Home
339 tools. 47 modules. 21 Oracle phases. DataBridge AI transforms legacy financial chaos into production-ready data marts with automated trust.
Six engines. One destination.
Oracle Engine
Ingests legacy SQL, Python, and Excel to extract and operationalize business logic into production Snowflake DDL.
Argos Pipelines
The master ship-builder. Constructs high-performance Snowflake Dynamic Table pipelines from complex hierarchies.
Athena Intelligence
Guided wisdom. AI-powered planning and GraphRAG grounding to ensure zero-hallucination data discovery.
Aegis Trust
Divine protection. Deterministic PII masking, trust attestations, and audit-ready lineage for absolute security.
Olympus
The highest order. Manage financial hierarchies, formula groups, and templates as the spine of your mart.
Penelope
Precise weaving. Hash-compare sources and resolve discrepancies with meticulous, audit-ready precision.
From ERP chaos to clean data in 4 weeks.
1. Assess
Connect to ERP. Run the E2E Assessment Pipeline. Catalog tables and mask PII.
2. Design
Deploy financial templates. Run Oracle to parse logic. Generate Kimball star schemas.
3. Build
Generate Argos pipelines. Deploy Dynamic Tables to Snowflake. Go live.
4. Optimize
Activate GraphRAG. Build the data catalog. Propagate the hardened Knowledge Base.
Battle-tested benchmark results.
📊 Dashboard
Welcome to Ithaca
Complete these steps to get started:
Recent Activity
| Timestamp | Tool | Status |
|---|---|---|
Sample Data Files
Available in data/ — click a file to load it in the Tool Workbench.
Quick Start
Get started with Ithaca:
🔌 Connections
Configure and test your data source connections. Connections are used by all pipeline, profiling, and AI tools.
Connection Settings
Saved Connections
No connections configured yet. Add one to get started.
Connection Health
🎯 Live Demos
🔧 Tool Workbench
Available Tools
Select a Tool
Choose a tool from the list to configure and run it.
Output
⚡ Workflow Editor
Tool Palette
Workflow Steps
Click tools to add steps to your workflow.
✈️ Wright Pipeline
Build hierarchy-driven data marts with the 4-object pipeline pattern. Configure each step and preview generated SQL.
Pipeline Configuration
VW_1: Translation View
Translates ID_SOURCE column values to physical database columns using CASE statements.
-- Click "Generate" to create VW_1 Translation View SQL
DT_2: Granularity Table
UNPIVOT operation to normalize data and apply exclusion filters.
-- Click "Generate" to create DT_2 Granularity Table SQL
DT_3A: Pre-Aggregation Fact
UNION ALL branches for different join patterns. Each branch handles different dimension combinations.
-- Click "Generate" to create DT_3A Pre-Aggregation SQL
DT_3: Final Data Mart
Final data mart with formula precedence cascade and surrogate key generation.
-- Click "Generate" to create DT_3 Data Mart SQL
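The four objects above start from a translation view that maps `ID_SOURCE` codes onto physical columns. As a rough sketch of what such a generator might emit (the table, column, and function names here are illustrative, not the DataBridge API):

```python
# Hypothetical sketch: emit a VW_1-style translation view that maps
# ID_SOURCE codes to physical columns via CASE branches.
# All identifiers below are illustrative.

def generate_translation_view(view_name, source_table, id_column, mappings):
    """Build CREATE VIEW SQL with one CASE branch per source mapping."""
    branches = "\n".join(
        f"        WHEN '{code}' THEN {column}"
        for code, column in mappings.items()
    )
    return (
        f"CREATE OR REPLACE VIEW {view_name} AS\n"
        f"SELECT\n"
        f"    CASE {id_column}\n{branches}\n"
        f"    END AS translated_value,\n"
        f"    *\n"
        f"FROM {source_table};"
    )

sql = generate_translation_view(
    "VW_1_TRANSLATION", "RAW_GL", "ID_SOURCE",
    {"REV": "REVENUE_AMT", "EXP": "EXPENSE_AMT"},
)
```

The real generator also handles the downstream UNPIVOT, UNION ALL, and formula-cascade objects; this only illustrates the CASE-statement pattern named in the VW_1 description.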
🔬 Data Lab
Run live demos against sample data using real MCP tools. Explore data quality, reconciliation, and schema analysis.
How the Data Lab Works
The Data Lab validates source data, compares datasets, and profiles quality — all from sample CSV files included with Ithaca.
load_csv --> profile_data (stats)
load_csv --> compare_hashes (diffs)
load_csv --> fuzzy_match (matches)
Two CSVs --> detect_schema_drift (changes)
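The compare-hashes step above can be approximated in a few lines: hash each row by key, then diff the two sides. This is an illustrative stand-in, not the `compare_hashes` tool itself:

```python
# Illustrative hash-compare diff: hash each row by its key column,
# then report rows that were added, removed, or changed.
import hashlib

def row_hash(row):
    return hashlib.sha256("|".join(map(str, row)).encode()).hexdigest()

def compare(left, right, key=0):
    lh = {r[key]: row_hash(r) for r in left}
    rh = {r[key]: row_hash(r) for r in right}
    return {
        "added":   sorted(set(rh) - set(lh)),
        "removed": sorted(set(lh) - set(rh)),
        "changed": sorted(k for k in set(lh) & set(rh) if lh[k] != rh[k]),
    }

diff = compare(
    [("A1", 100), ("A2", 200)],
    [("A1", 100), ("A2", 250), ("A3", 50)],
)
# diff == {"added": ["A3"], "removed": [], "changed": ["A2"]}
```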
Live Demos
Pro Data Lab Tools
Requires Pro License
analyze_book_with_researcher
Analyze a Book's data sources against a database connection
compare_book_to_database
Compare Book hierarchy against live database schema
profile_book_sources
Profile all data sources referenced by a Book
⚙️ Administration
Configuration
License & Tier
Tenant Information
Cost / Credit Tracker
Track LLM token usage and Snowflake credit consumption per workflow run.
| Run ID | LLM Calls | Tokens (in/out) | LLM $ | SF Credits | SF $ | Total $ |
|---|---|---|---|---|---|---|
| No cost data yet — run a workflow with CostTracker enabled. | ||||||
Token Usage Calculator
📚 Documentation
Getting Started with Ithaca
New here? Follow these steps to be productive in under 10 minutes.
Step 1: Connect Your Data
Configure your Snowflake or database connection from the Connections page. Every pipeline and AI tool needs a working connection.
Step 2: Create Your First Hierarchy
Use a template (P&L, Balance Sheet, Oil & Gas LOS) or build from scratch. Hierarchies are the backbone of every DataBridge pipeline. Expand the sample demo to see how they work.
Step 3: Generate a Pipeline
Once your hierarchy is ready, use the Wright Pipeline page to generate a full 4-object Snowflake pipeline (Translation View, Granularity Table, Pre-Aggregation Fact, Data Mart).
Step 4: Validate Your Data
Use the Data Lab to profile data quality, reconcile sources, detect schema drift, and run fuzzy matching against your datasets.
Step 5: Explore with AI
Ask the AI Planner to analyze your data and generate multi-step workflows, or chat with the Agent Console for autonomous demos.
Take the Guided Tour
Want a hands-on walkthrough of every major page? Start the interactive guided tour.
DataBridge AI v0.49.4
A headless, MCP-native data and implementation engine with 339 tools across 47 modules. Tool availability is license-dependent (Community/Pro/Enterprise).
Core Capabilities
| Capability | Description |
|---|---|
| 🔄 Data Reconciliation | Compare and validate data from CSV, SQL, PDF, JSON sources (38 tools) |
| 🏗️ Hierarchy Builder | Create and manage multi-level hierarchy projects with formulas (49 tools) |
| 🧬 BLCE Engine | Business logic extraction, Kimball modeling, DDL generation, deployment (84 tools, 21 phases) |
| 🧠 Cortex AI | Snowflake Cortex integration with natural language to SQL (26 tools) |
| 📊 Wright Module | Hierarchy-driven data mart generation with 4-object pipeline (31 tools) |
| 📚 Data Catalog | Centralized metadata registry with business glossary (19 tools) |
| 🔗 GraphRAG | Knowledge graph + vector search for explainable AI grounding (10 tools) |
| 📈 Observability | Metric recording, anomaly detection, asset health monitoring (15 tools) |
| 📦 Data Versioning | Dataset snapshots, diffs, and rollback (12 tools) |
| 🔍 Lineage Tracking | Column-level lineage and impact analysis (11 tools) |
| ✅ Data Quality | Expectation suites and data contracts (7 tools) |
| 🛡️ DataShield | Offline data masking before AI processing |
| 🔧 dbt Integration | Generate dbt projects from hierarchies (8 tools) |
Quick Start
Architecture
Flow: the MCP core (267 tools) fans out to Hierarchy Builder (49 tools), Data Reconciliation (38 tools), BLCE Engine (84 tools), Wright Module (31 tools), Cortex AI (26 tools), Data Catalog (19 tools), Observability (15 tools), and the other modules. BLCE, Wright, and Cortex AI write to Snowflake; Hierarchy Builder, BLCE, and the Data Catalog feed the GraphRAG store.
All 28 Tool Categories (267 Total)
Tool availability depends on your license tier: CE (Community), Pro, or Enterprise.
| Module | Tools | Tier | Key Tools |
|---|---|---|---|
| File Discovery | 3 | CE | find_files, stage_file |
| Data Reconciliation | 38 | CE | load_csv, profile_data, fuzzy_match_columns |
| Hierarchy Builder | 49 | CE | create_hierarchy, import_flexible_hierarchy, export_hierarchy_csv |
| Hierarchy-Graph Bridge | 5 | CE | hierarchy_graph_status, hierarchy_rag_search |
| Templates / Skills / KB | 16 | CE | list_financial_templates, get_skill_prompt |
| Git Automation | 4 | CE | commit_dbt_project, create_deployment_pr |
| SQL Discovery | 2 | CE | sql_to_hierarchy, smart_analyze_sql |
| Mapping Enrichment | 5 | CE | configure_mapping_enrichment, enrich_mapping_file |
| BLCE Engine | 84 | CE | blce_parse_sql, blce_generate_ddl, blce_execute_ddl, model_ask |
| AI Orchestrator | 16 | Pro | submit_orchestrated_task, register_agent |
| Planner Agent | 11 | Pro | plan_workflow, suggest_agents |
| Smart Recommendations | 5 | Pro | get_smart_recommendations, smart_import_csv |
| Diff Utilities | 6 | CE | diff_text, diff_dicts, explain_diff |
| Unified AI Agent | 10 | Pro | checkout_librarian_to_book, sync_book_and_librarian |
| Cortex Agent | 12 | Pro | cortex_complete, cortex_reason |
| Cortex Analyst | 14 | Pro | analyst_ask, create_semantic_model |
| Console Dashboard | 5 | CE | start_console_server, broadcast_console_message |
| dbt Integration | 8 | CE | create_dbt_project, generate_dbt_model |
| Data Quality | 7 | CE | generate_expectation_suite, run_validation |
| Wright Module | 31 | Pro | create_mart_config, generate_mart_pipeline, wright_from_hierarchy |
| Lineage & Impact | 11 | Pro | track_column_lineage, analyze_change_impact |
| Git / CI-CD | 12 | Pro | git_commit, github_create_pr |
| Data Catalog | 19 | Pro | catalog_scan_connection, catalog_search |
| Data Versioning | 12 | Pro | version_create, version_diff, version_rollback |
| GraphRAG Engine | 10 | Pro | rag_search, rag_validate_output, rag_entity_extract |
| Data Observability | 15 | Pro | obs_record_metric, obs_create_alert_rule |
| Cortex Table Understanding | 5 | Pro | generate_table_understanding, batch_table_understanding |
| AI Relationship Discovery | 8 | Pro | ai_analyze_schema, ai_detect_relationships |
| Mart Factory | 10 | Pro | create_mart_config, generate_mart_pipeline, discover_hierarchy_pattern |
| DataShield | — | CE | PII classification, trust attestations, data masking (integrated into pipeline phases) |
| Total | 267 | | |
Available Templates
Accounting Domain (10 templates)
| Template ID | Name | Industry |
|---|---|---|
| standard_pl | Standard P&L (Income Statement) | General |
| standard_bs | Standard Balance Sheet | General |
| oil_gas_los | Oil & Gas Lease Operating Statement | Oil & Gas |
| upstream_oil_gas_pl | Upstream Oil & Gas P&L | Oil & Gas - E&P |
| midstream_oil_gas_pl | Midstream Oil & Gas P&L | Oil & Gas - Midstream |
| oilfield_services_pl | Oilfield Services Company P&L | Oil & Gas - Services |
| manufacturing_pl | Industrial Manufacturing P&L | Manufacturing |
| industrial_services_pl | Industrial Services Company P&L | Industrial Services |
| saas_pl | SaaS Company P&L | SaaS |
| transportation_pl | Transportation & Logistics P&L | Transportation |
Finance Domain (2 templates)
| Template ID | Name | Industry |
|---|---|---|
| cost_center_hierarchy | Cost Center Hierarchy | General |
| profit_center_hierarchy | Profit Center Hierarchy | General |
Operations Domain (8 templates)
| Template ID | Name | Industry |
|---|---|---|
| geographic_hierarchy | Geographic Hierarchy | General |
| department_hierarchy | Organizational Department Hierarchy | General |
| asset_hierarchy | Asset Class Hierarchy | General |
| legal_entity_hierarchy | Legal Entity Hierarchy | General |
| upstream_field_hierarchy | Upstream Oil & Gas Field Hierarchy | Oil & Gas - E&P |
| midstream_asset_hierarchy | Midstream Oil & Gas Asset Hierarchy | Oil & Gas - Midstream |
| manufacturing_plant_hierarchy | Manufacturing Plant Hierarchy | Manufacturing |
| fleet_hierarchy | Fleet & Route Hierarchy | Transportation |
ERP Data Model Templates (BLCE)
Pre-built Kimball data model specs for common ERP systems. Used by the BLCE engine to generate dimension and fact tables automatically.
| ERP System | Config File | Pre-Built Dims | Pre-Built Facts |
|---|---|---|---|
| Enertia | dm_specs/enertia.json | 12 | 5 |
| WolfePak | dm_specs/wolfepak.json | 10 | 5 |
| SAP | dm_specs/sap.json | 10 | 5 |
| NetSuite | dm_specs/netsuite.json | 9 | 5 |
| QuickBooks | dm_specs/quickbooks.json | 7 | 4 |
| ProCount | dm_specs/procount.json | 12 | 7 |
Built-in Skills
| Skill ID | Name | Industries | Capabilities |
|---|---|---|---|
| financial-analyst | Financial Analyst | General | GL reconciliation, trial balance, bank rec, COA design |
| fpa-oil-gas-analyst | FP&A Oil & Gas Analyst | Oil & Gas | LOS analysis, JIB, reserves, hedge accounting |
| manufacturing-analyst | Manufacturing Analyst | Manufacturing | Standard costing, COGS, variances, inventory |
| saas-metrics-analyst | SaaS Metrics Analyst | SaaS | ARR/MRR, cohorts, CAC/LTV, unit economics |
| transportation-analyst | Transportation & Logistics Analyst | Transportation | Operating ratio, fleet, lanes, driver metrics |
| operations-analyst | Operations Analyst | General, Manufacturing, Logistics | Operational KPIs, throughput, utilization, capacity planning |
| fpa-cost-analyst | FP&A Cost Analyst | General, Manufacturing, Technology | Cost allocation, variance analysis, budget vs actual, cost centers |
| platform-workflow | Platform Workflow Orchestrator | General | E2E assessment pipeline, 15-phase orchestration, data modeling workflows |
BLCE Auto-Generated Skills
The BLCE engine automatically generates domain-specific skill prompts from each analysis run. Skills are reusable and shareable across projects.
| Skill Type | Generated From | Example |
|---|---|---|
| Domain Expert | Normalized measures + governance metadata | "Revenue analysis for Enertia upstream O&G" |
| Query Assistant | Bus matrix + model metadata | "Query the well production fact table" |
| Report Builder | Report suggestions + templates | "Build a lease operating statement" |
API Reference
MCP Configuration (Claude Desktop)
MCP Configuration (SSE Transport)
For remote/deployed servers, use the SSE transport configuration:
Deployed Endpoints
| Service | URL | Description |
|---|---|---|
| Dashboard | https://databridge.dataamplifier.io | Web UI (this dashboard) |
| MCP SSE | https://mcp.databridge.dataamplifier.io/sse | MCP server endpoint for Claude Desktop / AI clients |
Programmatic Usage
License Key System
DataBridge uses a tiered license system. Community Edition is free; Pro and Enterprise require a license key.
Environment Variables
| Variable | Description | Default |
|---|---|---|
| DATABRIDGE_LICENSE_KEY | License key for Pro/Enterprise features | - (CE mode) |
| DATABRIDGE_LICENSE_SECRET | License signing secret (admin only) | - |
| DATA_DIR | Data directory for projects | ./data |
| NESTJS_BACKEND_URL | NestJS backend URL | http://localhost:8001 |
| NESTJS_API_KEY | API key for backend | - |
| SNOWFLAKE_ACCOUNT | Snowflake account identifier | - |
| SNOWFLAKE_USER | Snowflake authentication user | - |
| DATABRIDGE_FUZZY_THRESHOLD | Fuzzy match score threshold (0–100) | 80 |
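A minimal sketch of how these variables might be consumed, using the documented defaults from the table above (the `load_config` helper is hypothetical):

```python
# Hypothetical config loader using the documented environment
# variables and defaults; missing DATABRIDGE_LICENSE_KEY means CE mode.
import os

def load_config(env=os.environ):
    return {
        "license_key": env.get("DATABRIDGE_LICENSE_KEY"),  # None -> CE mode
        "data_dir": env.get("DATA_DIR", "./data"),
        "backend_url": env.get("NESTJS_BACKEND_URL", "http://localhost:8001"),
        "fuzzy_threshold": int(env.get("DATABRIDGE_FUZZY_THRESHOLD", "80")),
    }

cfg = load_config(env={})  # no overrides: documented defaults apply
```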
Platform Architecture Diagrams
BLCE 21-Phase Pipeline
The Business Logic Comprehension Engine processes ERP data through 21 sequential phases, from intake to deployment.
Wright Pipeline Flow
The Wright module generates a 4-object Snowflake Dynamic Table pipeline from hierarchy projects.
Flow: VW_1 Translation View → DT_2 Granularity Table → DT_3A Pre-Aggregation → DT_3 Final Data Mart → Snowflake.
Cortex AI Pipeline
Snowflake Cortex integration for AI-powered analytics with natural language queries.
Data Catalog & Observability
Centralized metadata, lineage tracking, and real-time health monitoring.
Flow: the Data Catalog (19 tools) links to the Lineage Graph (11 tools) and the Business Glossary; Observability (15 tools) drives the Metrics Store, Alert Rules, and Asset Health; the Lineage Graph and the Catalog both feed GraphRAG (10 tools).
E2E Assessment Pipeline
The 15-phase orchestrated workflow for end-to-end ERP data assessment, from connection to final report.
Flow: DataShield masking → Discover Relationships → Detect Dimensions → Generate Bus Matrix → Quality Validation → Model Load (Dims + Facts) → Persist to Snowflake → Generate Report → Bundle Artifacts → Create ShieldProject → Assessment Complete.
Hierarchy-Graph Bridge
Auto-populates the GraphRAG vector store and lineage graph whenever hierarchies change. Event-driven with rich semantic embeddings.
Flow: hierarchy create/update/delete events → AutoSyncManager (event callbacks) → HierarchyGraphBridge, which writes rich embeddings (levels, mappings, properties, formulas) to the VectorStore and source-mapping edges to the LineageGraph. Both feed GraphRAG Search, which serves the PlannerAgent and the RecommendationEngine.
Gateway Mode — Dynamic Tool Exposure
Cross-LLM compatibility layer. Only ~18 gateway tools are visible; the remaining 239 are discoverable and executable via discover_tools() and run_tool(). Enable with DATABRIDGE_TOOL_MODE=dynamic (default: full).
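The discover/run pattern can be sketched as a small registry with two entry points. This is an illustrative model of the gateway idea, not the DataBridge implementation; the registered tools here are stubs:

```python
# Hypothetical gateway sketch: a small visible surface
# (discover_tools / run_tool) fronting a larger registry.

REGISTRY = {
    "load_csv":     lambda path: f"loaded {path}",
    "profile_data": lambda name: f"profiled {name}",
}

def discover_tools(prefix=""):
    """Return names of registered tools, optionally filtered by prefix."""
    return sorted(t for t in REGISTRY if t.startswith(prefix))

def run_tool(name, *args, **kwargs):
    """Dispatch a call to a registered tool by name."""
    if name not in REGISTRY:
        raise KeyError(f"unknown tool: {name}")
    return REGISTRY[name](*args, **kwargs)
```

Clients that cannot hold hundreds of tool schemas in context only ever see the two gateway functions; everything else is resolved by name at call time.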
Turbo Engine — Local Acceleration
Optional Polars + DuckDB acceleration layer. Data loads 10-100x faster locally, then persists to Snowflake via the existing bulk loader. Falls back to Pandas if not installed.
Flow: source files (CSV / Parquet / JSON) are read by Polars (fast read + profile) or queried via DuckDB (local SQL engine); both converge on a pd.DataFrame for tool compatibility, which persists to Snowflake through sf_pool and the bulk loader. If Polars is unavailable, the load falls back to Pandas (pd.read_csv).
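The try-fast-engine-then-fall-back pattern looks roughly like this. It is a sketch: the stdlib `csv` reader here stands in for the Pandas fallback path so the example has no required dependencies:

```python
# Sketch of accelerated load with graceful fallback: try Polars for a
# fast CSV read; fall back to a slower engine if it is not installed.
import csv

def load_table(path):
    try:
        import polars as pl  # fast path (optional dependency)
        return pl.read_csv(path).to_dicts()
    except ImportError:
        # Fallback path, standing in for the Pandas route.
        with open(path, newline="") as f:
            return list(csv.DictReader(f))
```

Either branch yields row dictionaries, so downstream tools see the same shape regardless of which engine did the read.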
Vanna RAG Text-to-SQL Pipeline
RAG-powered SQL generation from natural language. Trains on DDL, documentation, and query history. Falls back to deterministic QueryBuilder when confidence is low.
Flow: the Vanna RAG store (trained on DDL + docs + Q&A) grounds the Claude LLM, which generates SQL. If confidence >= 0.7 the generated query is used; otherwise the deterministic QueryBuilder produces it. The resulting query executes on DuckDB (local) or Snowflake (remote).
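The confidence gate reduces to a simple comparison. A minimal sketch, assuming the 0.7 threshold described above; the generator functions are stubs, not the Vanna or QueryBuilder APIs:

```python
# Illustrative confidence gate for text-to-SQL: accept LLM output when
# confident, otherwise fall back to a deterministic builder.

CONFIDENCE_THRESHOLD = 0.7

def deterministic_sql(table, columns):
    """Stub for the deterministic QueryBuilder path."""
    return f"SELECT {', '.join(columns)} FROM {table}"

def choose_sql(llm_sql, confidence, table, columns):
    if confidence >= CONFIDENCE_THRESHOLD:
        return llm_sql, "llm"
    return deterministic_sql(table, columns), "deterministic"

sql, source = choose_sql("SELECT sum(amt) FROM gl", 0.55, "gl", ["amt"])
```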
PydanticAI Planning Loop
Multi-step reasoning agent for workflow planning. Iteratively validates plans using tool calls before returning a type-safe result.
Flow: the planner agent (multi-turn reasoning) makes tool calls (list_available_agents, check_agent_capability, validate_step_dependency), then iteratively refines until it emits a validated PlanOutput (Pydantic model). That becomes a WorkflowPlan handed to the PlatformOrchestrator.
Deployment Architecture
Production deployment on GCE with Nginx SSL termination and systemd service management.
Flow: Nginx terminates SSL (Let's Encrypt) and routes databridge.dataamplifier.io to the Dashboard service (systemd: databridge-dashboard, port 5050, Flask UI via run_ui.py) and mcp.databridge.dataamplifier.io to the MCP SSE service (systemd: databridge-mcp, port 786, run_server.py --sse). The MCP server talks to Snowflake and the local filesystem (data/).
Commercialization Tiers
Three-tier licensing model with increasing tool counts and capabilities.
Tiers: Community Edition (~106 tools, free on PyPI) → Pro Edition (~247 tools, licensed via GitHub Packages) → Enterprise (267+ tools, custom deploy). Community Edition also ships the Pro Examples set (47 tests + 29 use cases).
Changelog
v0.49.4 - March 1, 2026
- Enterprise Intelligence Layer: Builds 1–6 complete
- Decision-making loop: VOI, Thompson Sampling, Monte Carlo rollout
- Cost optimizer, governance dashboard, rule auto-tuner
- Active learning, calibration, self-learning feedback loop
- Distributed architecture: CE / Pro / Enterprise tiers
- 5-layer IP protection: license server, source stripping, Cython, API auth, data moat
- GraphRAG: 4,571 nodes, 408K edges across all domains
- 4,363 tests passing
- Total tool count: 339 CE (393 Enterprise)
v0.45.0 - February 24, 2026
- Financial Validation Framework: ERP detection, TB validation, GL-TB reconciliation
- Evaluation & Metrics Framework: 15 CE tools, Nelder-Mead tuner
- Pattern Abstraction with federated privacy (k-anonymity, differential privacy)
- Distributed architecture groundwork: Redis, Celery, PostgreSQL, S3
- Total tool count: 339
v0.43.0 - February 20, 2026
- Wright Integration: hierarchy-driven 4-object pipeline generation
- Hierarchy-Graph Bridge: auto-sync GraphRAG on hierarchy changes
- Lineage graph with full provenance tracking
- Detection grounding: knowledge-backed anomaly rules
- Total tool count: 290
v0.42 - February 18, 2026
- BLCE P5: DDL executor + deployment phase (phase 21)
- 22 new tools added (tools 51-72), 5 new phases (17-21)
- Auto-build pipeline: schema creation, DDL execution, validation
- Swarm orchestration for parallel AI enrichment
- Artifact bundle generation with rich HTML reports
- Dashboard UI refresh with Architecture/Changelog tabs, BLCE Engine page
- Total tool count: 267
v0.41.0 - February 16, 2026
- BLCE Engine launch: Business Logic Comprehension Engine
- 50 initial tools across 16 phases
- SQL parsing, measure normalization, cross-referencing
- Evidence collection, governance metadata, model generation
- Bus matrix generation, quality validation
- 601 tests passing
v0.40.0 - January 15, 2026
- E2E Assessment Pipeline: 15-phase orchestrated workflow
- DataShield UI: offline data masking before AI processing
- Snowflake Connection Pool: singleton SSO auth for pipelines
- Bulk VARIANT loader for Snowflake persistence
- ERP config registry with auto-detect + Enertia preset
- Report generator with KPI tiles, bus matrix, timeline
v0.39.0 - December 2025
- Data Observability: metric recording, anomaly detection, asset health
- GraphRAG Engine: knowledge graph + vector search
- Data Versioning: snapshots, diffs, and rollback
- AI Relationship Discovery: schema analysis, naming patterns, FK detection
- Cortex Table Understanding: AI-generated table summaries
🧬 BLCE Engine
The Business Logic Comprehension Engine (BLCE) is Ithaca's core analytical engine. It ingests raw ERP SQL views and tables, extracts business logic, normalizes measures, discovers hierarchies, and generates a complete Kimball-style data warehouse — all through a 21-phase automated pipeline.
21-Phase Pipeline
How It Works
| Phase Group | Phases | Purpose |
|---|---|---|
| Intake & Discovery | 1-6 | Connect to ERP, catalog tables, parse SQL, identify reports |
| Analysis & Normalization | 7-9 | Normalize measures, detect hierarchies, cross-reference |
| Governance & Modeling | 10-14 | Collect evidence, apply governance, generate Kimball model, bus matrix |
| Quality & Skills | 15-16 | Validate data quality, generate domain-specific AI skills |
| Enrichment & Build | 17-21 | AI enrichment, swarm orchestration, auto-build DDL, deploy |
84 BLCE Tools by Function
Parsing & Extraction (8 tools)
Normalization & Grain (2 tools)
Cross-Reference & Comparison (4 tools)
Evidence & Governance (5 tools)
AI & Semantic (4 tools)
Skills & Generation (2 tools)
Pipeline & Orchestration (5 tools)
Parallel Engine & Agents (7 tools)
Client Interaction (6 tools)
E2E Handoff (2 tools)
Model Operations (9 tools)
Proposal & Code Generation (9 tools)
Analysis & Mapping (3 tools)
Review & Deployment (6 tools)
Graph Copilot & Excel Reconciliation (3 tools)
DataShield Classification (9 tools)
17 Pydantic Contracts
BLCE uses strongly-typed Pydantic models at every phase boundary. Each contract validates data flowing between phases.
| Contract | Prefix | Purpose |
|---|---|---|
| ParsedSQL | PSQL_ | Validated SQL parse tree with CTEs, joins, measures |
| NormalizedMeasure | NM_ | Canonical measure with aggregation type, grain, units |
| DetectedHierarchy | DH_ | Discovered hierarchy levels with parent-child links |
| CrossReference | XR_ | Cross-table relationships with confidence scores |
| EvidenceRecord | ER_ | Source evidence for each analytical decision |
| GovernanceTag | GT_ | PII/sensitivity classification, retention policy |
| DimensionSpec | DS_ | Kimball dimension definition with SCD type |
| FactSpec | FS_ | Kimball fact table with grain, measures, FK links |
| BusMatrixEntry | BM_ | Fact-dimension intersection for bus matrix |
| QualityRule | QR_ | Data quality expectation with threshold |
| SkillPrompt | SP_ | Generated AI skill with domain context |
| EnrichmentResult | ENR_ | AI-enriched metadata and descriptions |
| SwarmTask | ST_ | Parallel task definition for swarm orchestration |
| DDLStatement | DDL_ | Generated CREATE TABLE/VIEW statement |
| DeploymentPlan | DP_ | Ordered DDL execution plan with rollback |
| ArtifactBundle | AB_ | HTML report, JSON metadata, diagram outputs |
| PipelineState | PS_ | Checkpoint state for pipeline resume/rollback |
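A phase-boundary contract of this kind can be sketched with stdlib dataclasses so it runs without dependencies (the real contracts are Pydantic models). Field names follow the NormalizedMeasure row above; the validation rules here are illustrative:

```python
# Pydantic-style phase-boundary contract, sketched with stdlib
# dataclasses. Validation rules are illustrative, not BLCE's.
from dataclasses import dataclass

ALLOWED_AGGREGATIONS = {"SUM", "AVG", "MIN", "MAX", "COUNT"}

@dataclass(frozen=True)
class NormalizedMeasure:
    name: str
    aggregation: str  # e.g. SUM, AVG
    grain: str
    units: str

    def __post_init__(self):
        # Reject malformed records at the phase boundary.
        if self.aggregation not in ALLOWED_AGGREGATIONS:
            raise ValueError(f"bad aggregation: {self.aggregation}")
        if not self.name:
            raise ValueError("measure name required")

m = NormalizedMeasure("revenue", "SUM", "invoice_line", "USD")
```

The point of validating at every boundary is that a phase never has to defend against malformed input from the phase before it.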
Mart Factory (Phase 26)
The Mart Factory generates complete 4-object Snowflake Dynamic Table pipelines from hierarchy configurations. It uses heuristic discovery to auto-detect hierarchy patterns and suggest optimal mart configurations.
10 MCP Tools
| Category | Tools |
|---|---|
| Config (3) | create_mart_config, add_mart_join_pattern, export_mart_config |
| Pipeline (3) | generate_mart_pipeline, generate_mart_object, generate_mart_dbt_project |
| Discovery (2) | discover_hierarchy_pattern, suggest_mart_config |
| Validation (2) | validate_mart_config, validate_mart_pipeline |
4-Object Pipeline
Formula Engine — 5-Level Precedence Cascade
| Level | Operations | Example |
|---|---|---|
| P1 | SUM | Revenue = Sum of all revenue line items |
| P2 | SUBTRACT, ADD | Net Revenue = Revenue - Discounts |
| P3 | DIVIDE, RATIO | Gross Margin % = Gross Profit / Revenue |
| P4 | VARIANCE | Variance = Actual - Budget |
| P5 | Complex | Custom multi-step calculations |
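The cascade in the table above boils down to evaluating formulas level by level, so later levels can reference results computed at earlier ones. A minimal sketch with illustrative formulas (not the Wright formula engine):

```python
# Sketch of a precedence-cascade evaluator: formulas carry a level
# P1..P5 and are evaluated in level order, so later formulas can use
# earlier results.

def evaluate_cascade(base, formulas):
    """base: leaf values; formulas: (level, name, fn) entries."""
    values = dict(base)
    for level, name, fn in sorted(formulas, key=lambda f: f[0]):
        values[name] = fn(values)
    return values

result = evaluate_cascade(
    {"Revenue": 1000.0, "Discounts": 50.0, "COGS": 600.0},
    [
        (2, "Net Revenue",    lambda v: v["Revenue"] - v["Discounts"]),
        (2, "Gross Profit",   lambda v: v["Net Revenue"] - v["COGS"]),
        (3, "Gross Margin %", lambda v: v["Gross Profit"] / v["Net Revenue"]),
    ],
)
```

Because `sorted` is stable, formulas within a level run in declaration order, which is why "Gross Profit" can safely reference "Net Revenue" at the same level here.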
DataShield — Trust & Data Classification
DataShield provides offline data masking, PII/sensitivity classification, and trust attestation enforcement for AI-safe data processing.
Key Capabilities
| Feature | Description |
|---|---|
| PII Classification | Column-level sensitivity detection (PII, PHI, financial, confidential) using heuristic + AI models |
| Trust Attestations | Every AI phase records pass/fail attestations. Configurable enforcement: hard_fail, warn, or off |
| Data Masking | Offline masking of sensitive columns before AI/LLM processing — no PII leaves your environment |
| Audit Trail | All attestations persisted as JSON files with timestamps, event IDs, and phase metadata |
Trust Enforcement Modes
| Mode | Behavior | Use Case |
|---|---|---|
| hard_fail | Blocks phases when attestation is missing | Production deployments |
| warn | Logs warning but continues execution | Development & E2E testing |
| off | No enforcement | Local testing |
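The three modes reduce to a small dispatch. This is an illustrative sketch of the behavior described above; the function name and signature are hypothetical:

```python
# Illustrative enforcement of the three trust modes:
# hard_fail blocks, warn logs and continues, off skips the check.
import logging

def enforce_attestation(mode, attestation_present, phase):
    if attestation_present or mode == "off":
        return True
    if mode == "warn":
        logging.warning("missing attestation for phase %s", phase)
        return True
    if mode == "hard_fail":
        raise RuntimeError(f"phase {phase} blocked: attestation missing")
    raise ValueError(f"unknown mode: {mode}")
```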
Classification Output
DataShield classifies every column in your schema and outputs structured reports:
⬡ Hierarchy Builder
Sample: Investment Property Financial Analysis DEMO
Commercial real estate investment property model with income statement, balance sheet, and financial analysis hierarchies.
Click a node in the hierarchy tree to view details.
Tree: the Financial Analysis root branches into Income Statement, Balance Sheet, and a Financial Analysis Report. Income Statement → Revenue (Rental Income, CAM Reimbursements, Other Income), Operating Expenses, Net Operating Income. Balance Sheet → Assets, Liabilities, Owner Equity. Financial Analysis → NOI Analysis, Cap Rate Analysis, DCF Valuation.
Hierarchy Tree
Select a project to view its hierarchy tree.
Select a node from the tree to edit its details.
Demo Showcase
The 60-Second Wow
Upload your Chart of Accounts. Get a production-ready financial hierarchy and dbt models. Zero config.
5-phase pipeline from raw CSV to queryable Snowflake mart
CSV
Classify
Hierarchy
Mart
Snowflake
Financial Intelligence Demos
CFO-grade analytics, forensic auditing, executive dashboards, and portfolio risk assessment.
CFO Gross Margin Mismatch
Revenue = $12.4M but COGS shows $8.1M creating a 34.7% margin vs expected 42%. FixGenerator runs the full closure loop.
Month-End GL Reconciliation
GL trial balance vs sub-ledger totals with 47 mismatched accounts. GraphRAG recommends a workflow then reviews findings.
Audit Evidence Trail
External auditors need SOX compliance evidence. CFO-strict search for audit trail data plus closure metrics KPI dashboard.
Ghost Vendor Fraud
Forensic ledger-to-invoice matching detects ghost vendors with zero POs. Flags suspicious invoices and quantifies total exposure.
M&A Integration Conflict
Compare account ID formats across merging entities. Detects 847 overlapping IDs and generates unified mapping recommendations.
Portfolio Synergy Capture
Cluster redundant cost centers across PE portfolio companies. Quantifies $3.36M/yr savings pipeline across 3 opportunity clusters.
Executive Dashboard
C-suite portfolio overview: 1,945 companies tracked, trust store health, $18.2M synergy pipeline, and risk breakdown by category.
Portfolio Risk Heatmap
PE firm risk density grid: 5 firms x 4 issue types with color-coded severity. Red zones = immediate manual audit required.
Data Engineering Demos
ERP integration, grain analysis, fact harmonization, self-healing pipelines, and legacy modernization.
ERP Integration Quality Check
Migrating from legacy ERP with 200+ tables. Search for relevant tools then get a workflow recommendation.
Enterprise Grain Analysis
Two ERP systems at different granularity — daily vs monthly. Detect, compare, and recommend alignment.
Fact Harmonization
Match columns across two fact tables using exact, prefix-stripped, and fuzzy matching. Generate UNION ALL SQL.
Progressive Parity Validation
5-gate state machine: LINE_ITEM/MONTH → FULL_REPORT/YEAR. Watch gates pass, fail, fix, and certify.
Parity Certificate
Full validation cycle: compile spec, run progressive validation, classify discrepancies, generate signed certificate.
Self-Healing Pipeline
4-stage autonomous loop: Detect issues, patch in sandbox, memorize the fix, replay on new companies with zero human input.
Architect Modernizer
Legacy COBOL → star schema: AI proposal, architect correction (SK + SCD2), then self-improved replay on new files.
Advanced: Implementation Showcase (6 demos) & Platform Internals
ERP Template Auto-Select
Shows how the system resolves ERP template strategy (explicit/alias/detected).
Proposal Coverage Impact
Compares no-template baseline vs template-enriched proposal coverage.
Trust Metrics Live
Pulls real trust-policy metrics from GraphRAG runtime memory.
Policy Explainability
Drills into one discrepancy event and explains trust factors and reasoning codes.
Retention Tier Status
Shows hot/warm/cold memory tier distribution.
WolfePak Quick Start
Fast bootstrapping to at least 5 dimensions and 3 facts.
DataBridge AI v0.49.4 — Platform Recommendation Guide
Comprehensive guide covering E2E Assessment Pipeline, BLCE Business Logic Engine, and Hierarchy Financial Reporting.
| # | Phase | What It Does | Business Value | Duration |
|---|---|---|---|---|
| 1 | load_metadata | Connects to source, samples every table, detects column types, infers basic FK patterns | Baseline understanding of source schema | 5–15 min |
| 2 | ai_relationship_discovery | 6-sub-phase AI pipeline: schema inventory, naming patterns, deterministic FK scan, value overlap, Cortex semantic matching, confidence scoring | Uncovers hidden FK relationships invisible to naming heuristics | 5–10 min |
| 3 | group_tables | Clusters tables by ERP domain prefix (GL, RV, AR, etc.) | Organizes 100+ raw tables into business domains | <1 min |
| 4 | hierarchy | Builds dimension hierarchies (optional) | Pre-built rollup structures | Skipped |
| 5 | ai_classify | DataShield column classification — identifies PII (SSN, email, phone), classifies identifiers, measures, dates, codes | Security: all PII identified and masked before data leaves client environment | 10–20 min |
| 6 | detect_dimensions | Kimball heuristic classifier: dimension (referenced by 3+ tables), fact (references 2+ dims), bridge (M:M resolver) | Foundational DW design: which tables are facts vs. dimensions | 2–5 min |
| 7 | wright | Generate dbt pipelines from hierarchy projects | Automated data mart generation | Skipped |
| 8 | dbt | Deploy dbt transformations | Continuous transformation pipeline | Skipped |
| 9 | quality | Generates Great Expectations suites from dim/fact classification (NOT NULL, UNIQUE, NUMERIC) | Automated data quality monitoring — detect drift before production | 2–5 min |
| 10 | observability | Record SLA metrics and asset health | Ongoing monitoring | Skipped |
| 11 | artifact_bundle | Rich HTML report with KPI tiles, phase timeline, bus matrix, classification breakdown | Client-facing deliverable — one page that tells the whole story | 2–5 min |
| 12 | bus_matrix | Kimball fact×dimension conformance grid | Architecture scorecard: 75%+ = enterprise-ready, <40% = design gaps | <1 min |
| 13 | dm_spec_generate | Auto-generates dimensional model table specs (DIM_*, FCT_*) with SCD2 boilerplate | Automated DW design — production-ready table specs in seconds | 1–2 min |
| 14 | dm_load | Loads dimensional tables from source to warehouse (chunked INSERT, 500-row batches) | Analytics warehouse live, ready for Tableau/Power BI | 50–80 min |
| 15 | quality_from_classification | Maps DataShield per-column classifications to advanced quality expectations | Tier-2 quality rules: SSN format regex, date range validation, code set membership | 2–5 min |
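The phase table above can be sketched as a simple runner. This is an illustrative sketch only — the phase names come from the table, but the runner, callables, and context dictionary are hypothetical, not the actual DataBridge API. It shows the two behaviors the table implies: optional phases are recorded as "Skipped", and each executed phase gets a duration.

```python
import time

# Hypothetical phase implementations; each mutates a shared context dict.
def load_metadata(ctx): ctx["tables"] = 111
def ai_classify(ctx): ctx["pii_masked"] = True

# (name, callable, optional) — optional phases with no callable are skipped,
# mirroring the "Skipped" rows in the phase table.
PHASES = [
    ("load_metadata", load_metadata, False),
    ("hierarchy", None, True),
    ("ai_classify", ai_classify, False),
]

def run_pipeline(skip=()):
    ctx, results = {}, []
    for name, fn, optional in PHASES:
        if name in skip or (optional and fn is None):
            results.append((name, "skipped", 0.0))
            continue
        t0 = time.perf_counter()
        fn(ctx)
        results.append((name, "ok", time.perf_counter() - t0))
    return ctx, results

ctx, results = run_pipeline()
```

The same shape extends naturally to all 15 phases: the `skip` argument is how a run would disable `wright`, `dbt`, or `observability` for an assessment-only engagement.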
E2E Assessment: 111 Source Tables to Kimball Star Schema
The client runs Enertia ERP with 111 tables across Revenue, Production, Land, Joint Interest Billing, and GL. Table names like RVMASHDR, JIBDTLRC are cryptic — no FK metadata in the schema.
| Phase | Result | Business Impact |
|---|---|---|
| load_metadata | 111 tables sampled, 247 relationships inferred | First-ever complete schema map |
| ai_relationship_discovery | 21 naming patterns (GL, RV, AB, JIB), Cortex semantic matching | Discovered FK patterns invisible to heuristics |
| ai_classify | 2,143 columns classified, masked samples generated | Security team verified: zero PII leaked to analytics layer |
| detect_dimensions | 36 dimensions, 9 facts identified | Kimball design: GLCHART, PROPERTY, CUSTOMER as conformed dims |
| dm_load | 14 tables loaded, 5.7M rows | Production DW ready for Power BI — Revenue, Production, GL analytics |
E2E Assessment: Stripe + Salesforce + Snowflake Native
Consolidate subscription data from Stripe (payments), Salesforce (CRM), and product usage database into a unified analytics warehouse.
| Phase | Expected Result | Business Impact |
|---|---|---|
| load_metadata | ~45 tables: 15 Stripe, 20 Salesforce, 10 product | First unified view of all subscription data |
| ai_relationship_discovery | Cross-system FK: customer_id, subscription_id, invoice_id | Links Stripe charges to Salesforce opportunities to product usage |
| detect_dimensions | DIM_CUSTOMER, DIM_PLAN, DIM_DATE; FCT_SUBSCRIPTION, FCT_INVOICE | Kimball design for MRR/ARR, churn, LTV analytics |
| bus_matrix | Conformance grid showing shared CUSTOMER and DATE dims | Validates star schema supports cross-domain cohort analysis |
1. Dimensional Data Warehouse
- 14–17 production-ready tables (DIM_* + FCT_*)
- 5–10M rows of clean, conformed data
- SCD2 change tracking for dimension history
- Ready for Tableau / Power BI / Looker
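The SCD2 change tracking listed above follows the standard Type 2 pattern: when a tracked attribute changes, the current row is closed out and a new version is appended. A toy illustration — column names (`customer_id`, `region`, `from`, `to`) are hypothetical, not the generated DIM_* spec:

```python
from datetime import date

# Current dimension rows; "to" is None for the active version.
dim = [
    {"customer_id": 1, "region": "TX", "from": date(2024, 1, 1), "to": None},
]

def scd2_apply(dim, incoming, today):
    """Type 2: expire changed rows and append new versions; insert unseen keys."""
    current = {r["customer_id"]: r for r in dim if r["to"] is None}
    for row in incoming:
        cur = current.get(row["customer_id"])
        if cur is None:
            dim.append({**row, "from": today, "to": None})
        elif cur["region"] != row["region"]:
            cur["to"] = today  # close the old version
            dim.append({**row, "from": today, "to": None})
    return dim

scd2_apply(dim, [{"customer_id": 1, "region": "OK"}], date(2025, 1, 1))
# dim now holds two versions of customer 1: TX (closed) and OK (active)
```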
2. Metadata Audit Trail (6 tables)
- RUN_SUMMARY — pipeline execution proof
- TABLE_PROFILES — per-table profiling stats
- RELATIONSHIPS — FK discovery + confidence
- CLASSIFICATIONS — per-column data types
- TABLE_SUMMARY — business purpose per table
- MASKED_SAMPLES — de-identified sample data
3. Reports & Quality
- Rich HTML report (KPI tiles, phase timeline)
- Bus matrix conformance grid
- Great Expectations quality suites
- GraphRAG knowledge base entries
| Format | What Gets Extracted | Example |
|---|---|---|
| SQL | SELECT measures, WHERE filters, GROUP BY grain, FROM/JOIN dependencies | SELECT SUM(amount) AS total_revenue FROM sales WHERE is_active = true |
| Python | pandas aggregation patterns, DataFrame transformations | df.groupby('region')['amount'].sum() |
| Excel | Named ranges, SUMIF/VLOOKUP formulas, pivot table definitions | =SUMIF(A:A,"Revenue",B:B) |
| DAX | Power BI measure definitions, CALCULATE contexts | CALCULATE(SUM(Sales[Amount]), YEAR(Sales[Date])=2025) |
| MDX | SSAS cube queries, dimension hierarchies | SELECT [Measures].[Revenue] ON 0 FROM [Cube] |
| PDF | OCR-extracted tables, report structure, KPI definitions | Board report with "Net Revenue: $12.5M" |
| CSV | Column headers as grain, numeric columns as measures | Monthly budget CSV with department, amount, period |
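For the SQL row above, extraction means recognizing aggregate expressions and their aliases as measures. A deliberately minimal regex sketch — the real Oracle Engine would use a proper SQL parser, and the `extract_measures` helper is illustrative, not a DataBridge function:

```python
import re

def extract_measures(sql: str):
    """Toy extraction of aggregate measures of the form AGG(expr) AS alias."""
    pattern = re.compile(r"\b(SUM|COUNT|AVG|MIN|MAX)\s*\(([^)]*)\)\s+AS\s+(\w+)", re.I)
    return [
        {"agg": a.upper(), "expr": e.strip(), "name": n}
        for a, e, n in pattern.findall(sql)
    ]

measures = extract_measures(
    "SELECT SUM(amount) AS total_revenue FROM sales WHERE is_active = true"
)
# measures[0] -> {"agg": "SUM", "expr": "amount", "name": "total_revenue"}
```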
Phase 1: Parse Sources
Extracts LogicArtifacts from any combination of source files — each artifact captures measures, filters, joins, grain columns, and source dependencies.
Phase 2: Normalize
Deduplicates and canonicalizes measures across all sources; each additional source that confirms a measure boosts its confidence by +0.1.
Phase 3: Cross-Reference
Discovers relationships between artifacts using 3 strategies: Column Name Similarity (0.75), Grain Matching (0.85), and Measure Expression Matching.
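A sketch of the first two strategies, treating the numbers in the text as confidence scores. The Jaccard similarity metric and the `related` helper are assumptions for illustration; the actual strategy implementations are not specified here:

```python
def column_similarity(cols_a, cols_b):
    """Jaccard overlap of column-name sets — one plausible similarity metric."""
    a, b = {c.lower() for c in cols_a}, {c.lower() for c in cols_b}
    return len(a & b) / len(a | b) if a | b else 0.0

def related(artifact_a, artifact_b):
    # Thresholds/confidences taken from the text: 0.75 and 0.85.
    if column_similarity(artifact_a["columns"], artifact_b["columns"]) >= 0.75:
        return ("column_name_similarity", 0.75)
    if set(artifact_a["grain"]) == set(artifact_b["grain"]):
        return ("grain_match", 0.85)
    return None

related(
    {"columns": ["cust_id", "amount"], "grain": ["month"]},
    {"columns": ["cust_id", "amount"], "grain": ["month"]},
)  # -> ("column_name_similarity", 0.75)
```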
Phase 4: Evidence Sampling
Builds validation queries that test extracted logic against actual data — up to 5,000 rows with a 12-month lookback window; sampled results are SHA-256 hashed for integrity.
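The integrity hash can be computed over a canonical serialization of the sampled rows, so two runs producing the same data produce the same fingerprint regardless of key order. A minimal sketch (the serialization scheme is an assumption, not the documented format):

```python
import hashlib
import json

def evidence_hash(rows):
    """Stable SHA-256 fingerprint of a sampled result set (illustrative)."""
    canonical = json.dumps(rows, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()

h1 = evidence_hash([{"month": "2025-01", "total_revenue": 12500}])
h2 = evidence_hash([{"total_revenue": 12500, "month": "2025-01"}])
assert h1 == h2  # key order does not affect the fingerprint
```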
Phase 5: Governance Classification
Scores each artifact on a [-1.0, 1.0] scale using five evidence-based rules, classifying it as CORE, CANDIDATE, or CUSTOM.
Phase 6: Skill & DDL Generation
Only CORE-classified artifacts generate reusable AI skill prompts and production Snowflake DDL.
| Rule | Boost | Penalty | What It Measures |
|---|---|---|---|
| Standard Aggregations (SUM, COUNT, AVG) | +0.20 | — | Uses widely-recognized patterns |
| Core Naming (total_, count_, sum_, net_) | +0.15 | — | Domain-standard naming conventions |
| Custom Naming (custom_, client_, _temp) | — | -0.30 | Client-specific, non-reusable |
| Used by 3+ clients | +0.50 | — | Proven reusability |
| Used by 1 client only | — | -0.10 | Low reusability |
CORE (score ≥ 0.4)
Standardizable, reusable across clients. Gets skill prompt + DDL.
CANDIDATE (-0.1 < score < 0.4)
Near-CORE. Needs more client evidence.
CUSTOM (score ≤ -0.1)
Client-specific. Documented but not operationalized.
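The five rules and three thresholds above combine mechanically. A sketch under the assumption that an artifact carries its name, aggregation type, and client usage count (field names hypothetical):

```python
def governance_score(artifact):
    """Score an artifact on [-1.0, 1.0] using the five evidence rules above."""
    s = 0.0
    name = artifact["name"].lower()
    if artifact["agg"] in {"SUM", "COUNT", "AVG"}:
        s += 0.20  # standard aggregation
    if name.startswith(("total_", "count_", "sum_", "net_")):
        s += 0.15  # core naming convention
    if name.startswith(("custom_", "client_")) or name.endswith("_temp"):
        s -= 0.30  # custom naming penalty
    if artifact["client_count"] >= 3:
        s += 0.50  # proven reusability
    elif artifact["client_count"] == 1:
        s -= 0.10  # single-client penalty
    return max(-1.0, min(1.0, s))

def classify(score):
    if score >= 0.4:
        return "CORE"
    if score <= -0.1:
        return "CUSTOM"
    return "CANDIDATE"

a = {"name": "total_revenue", "agg": "SUM", "client_count": 3}
classify(governance_score(a))  # 0.20 + 0.15 + 0.50 = 0.85 -> "CORE"
```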
| Template | Industry | Type | Levels | Key Features |
|---|---|---|---|---|
| Standard P&L | General | Income Statement | 3 | Revenue, COGS, Gross Profit, OpEx, Net Income |
| Standard Balance Sheet | General | Balance Sheet | 3 | Assets (Current/Non-Current), Liabilities, Equity |
| Upstream O&G P&L | E&P | Income Statement | 4 | Oil/Gas/NGL revenue, LOE breakdown, DD&A, Netback per BOE |
| Midstream O&G P&L | Midstream | Income Statement | 4 | Gathering, Processing, Transportation, Storage revenue |
| Oilfield Services P&L | OFS | Income Statement | 3 | Well services, Completion, Workover, Rig revenue |
| Oil & Gas LOS | E&P | Lease Operating | 5 | Per-property LOE: labor, chemicals, utilities, workover |
| Manufacturing P&L | Manufacturing | Income Statement | 4 | Product lines, COGS by material/labor/overhead |
| SaaS P&L | SaaS | Income Statement | 3 | Subscription/Professional/Usage revenue, CAC, LTV |
| Transportation P&L | Logistics | Income Statement | 3 | Freight revenue, fuel costs, maintenance |
| Cost Center Hierarchy | General | Cost Center | 4 | Revenue-generating, Production, Support, R&D |
| Profit Center Hierarchy | General | Profit Center | 4 | Business units, product lines, geographic regions |
VW_1 Translation View
- Dynamic column mapping via CASE
- Joins to dimension tables
- Multi-currency conversion
DT_2 Granularity Table
- UNPIVOT filter groups
- Multi-round filtering
- Exclusion logic via NOT IN
DT_3A Pre-Aggregation
- UNION ALL per join pattern
- Account segment filtering
- Sign change flag handling
DT_3 Data Mart
- 5-level formula cascade (P1–P5)
- Calculated rows via formula engine
- DENSE_RANK surrogate keys
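The DENSE_RANK surrogate-key technique assigns the same key to rows sharing a natural key, with keys dense (1, 2, 3, ...) in natural-key order. A Python equivalent of what the generated SQL would do, with hypothetical column names:

```python
rows = [
    {"account": "4000", "entity": "A"},
    {"account": "4000", "entity": "B"},
    {"account": "5000", "entity": "A"},
]

def dense_rank_keys(rows, key_cols):
    """Assign dense surrogate keys over the distinct natural-key tuples."""
    distinct = sorted({tuple(r[c] for c in key_cols) for r in rows})
    rank = {k: i + 1 for i, k in enumerate(distinct)}
    for r in rows:
        r["sk"] = rank[tuple(r[c] for c in key_cols)]
    return rows

dense_rank_keys(rows, ["account"])
# both account-4000 rows share sk=1; account 5000 gets sk=2
```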
From 10 Days to 3 Days — Automated P&L Rollup
Problem: Month-end close takes 10 days because accountants manually build P&L rollups in Excel across 5 entities.
Solution: Deploy Standard P&L hierarchy with GL account source mappings. Wright auto-generates VW_1 → DT_2 → DT_3A → DT_3.
Per-Property Cost Analysis with Netback Calculation
Problem: Operations managers need per-property LOE per BOE analysis.
Solution: Deploy O&G LOS hierarchy (5 levels). GL accounts 6100–6900 mapped to LOE categories. Wright DT_3 calculates Netback.
3-Entity GL Consolidation with GAAP-Compliant Eliminations
Problem: Holding company needs consolidated statements across 3 subsidiaries with IC elimination.
Solution: Consolidated P&L with entity-level rollup. 3 UNION ALL branches in DT_3A; elimination in DT_3 formula cascade.
| Stage | Capabilities | Metrics | Timeline |
|---|---|---|---|
| Stage 1: Assess | E2E Pipeline (15 phases) + BLCE Parse & Classify | Schema cataloged, dims identified, PII masked | Week 1 |
| Stage 2: Design | + Hierarchy templates + Bus matrix + BLCE governance | P&L hierarchy live, star schema designed | Week 2–3 |
| Stage 3: Build | + Wright pipelines + DM load + Quality expectations | Data marts live, BI connected | Week 3–4 |
| Stage 4: Optimize | + Hierarchy Intelligence + GraphRAG + BLCE skills | AI governance, automated skill library | Month 2+ |
🤖 AI Workflow Planner
📊 Workbook Analysis
What does Workbook Analysis do?
A 6-stage AI pipeline that scans Excel workbooks, classifies their purpose, extracts formula logic, links entities across sheets, and proposes fixes.
Supported Formats
Best Suited For
Multi-sheet financial workbooks with formulas — P&L, balance sheets, consolidation packs, budgets, forecasts, and financial models.
Archetype Classifications
| Archetype | Signals |
|---|---|
| Financial Report | Sheet names with P&L, balance, income, cashflow; functions like SUMIFS, VLOOKUP, IRR, NPV; currency formats |
| Data Extract | Keywords like export, dump, raw; high row-to-formula ratio; single-sheet flat tables |
| Model / Template | Template, form, forecast, scenario in filename; named ranges; data validation; complex formula chains |
| Consolidation | Cross-sheet references; multiple structurally similar sheets; intercompany, elimination keywords |
| Unknown | No strong archetype signal detected |
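The archetype table reduces to signal counting: score each archetype from sheet names, formula functions, and the filename, then fall back to Unknown when nothing fires. A toy classifier built only from the signals listed above (the function and its inputs are illustrative, not the actual pipeline API):

```python
def classify_archetype(sheet_names, functions, filename):
    """Toy signal-counting classifier mirroring the archetype table."""
    signals = {"Financial Report": 0, "Data Extract": 0,
               "Model / Template": 0, "Consolidation": 0}
    text = " ".join(sheet_names).lower()
    if any(k in text for k in ("p&l", "balance", "income", "cashflow")):
        signals["Financial Report"] += 1
    if {"SUMIFS", "VLOOKUP", "IRR", "NPV"} & set(functions):
        signals["Financial Report"] += 1
    if any(k in text for k in ("export", "dump", "raw")):
        signals["Data Extract"] += 1
    if any(k in filename.lower() for k in ("template", "form", "forecast", "scenario")):
        signals["Model / Template"] += 1
    if any(k in text for k in ("intercompany", "elimination")):
        signals["Consolidation"] += 1
    best = max(signals, key=signals.get)
    return best if signals[best] > 0 else "Unknown"

classify_archetype(["P&L 2025"], ["SUMIFS"], "model.xlsx")  # -> "Financial Report"
```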
Pipeline Stages
Each stage is fail-forward — if one fails, independent stages still run.
Select Workbook
Sample Workbooks
Pick a sample and click Analyze to see the full pipeline in action.
Options
Advanced — skip stages
Instant Workbook Intelligence
Upload any Excel workbook and get a full analysis in seconds.
Try a sample workbook from the left panel, or upload your own.