Build a Digital Twin of Your SaaS Business
A digital twin is a governed model of the metrics that matter. Approved definitions. Named owners. Explicit relationships. It separates the golden metrics from the noise so AI agents can answer questions that currently take days of analyst time and three Slack threads to resolve.
The Cost of Flying Blind
Without a twin:
- "Which cohort is driving the NRR decline?": An analyst pulls data manually, cross-references CRM and billing, and builds a one-off notebook. The answer arrives in 3-5 days.
- New analyst joins: Spends months building tribal knowledge about which tables matter and how metrics connect.
- Board prep: Finance reports NRR from one query. CS reports it from another. Two numbers. One awkward conversation.
- Source system migration: Moving from Chargebee to Stripe breaks every dashboard and every query. Months of rework.
With a twin:
- Same question: An agent traces the relationship graph from NRR to Health Score to Feature Adoption to the exact warehouse tables. SQL generated in seconds.
- New analyst: Explores the graph, sees how every metric connects, and uses the adapter mapping to find exactly which table.column computes what.
- Board prep: One canonical definition. One SQL source. One number.
- Source migration: Remap the affected fields in the adapter. Definitions stay the same. Dashboards keep working.
Three Layers, Zero Migration
The twin stacks on top of your existing infrastructure. Your warehouse, your dbt models and your BI dashboards all stay exactly as they are, until you're ready to update them or make new ones. The hardest part is the human work: agreeing on which metrics matter, who owns each definition and how they connect. GASP provides the framework and tools to assist in this work. Your team curates it into the model that reflects your business.
AI Agents
Read-only access to the twin
MCP tools and custom integrations. Agents read the graph and your adapter to answer questions, generate SQL and trace relationships. They query the twin, not your source systems. With protocols like A2A and OSSA a coordinator agent can spin up specialists on demand.
Your Adapter
Maps GASP fields to your warehouse
A JSON mapping file. For each GASP field you provide your schema.table.column reference plus the team that owns the definition and the team that owns the pipeline.
This is the only file you write.
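A minimal sketch of what adapter entries could look like, built as a Python dict and serialized to JSON. The key names (`column`, `data_owner`, `metric_owners`), the field name `subscription_mrr` and the warehouse path are illustrative assumptions, not the published GASP adapter schema; the adapter template is the authority on the real format.

```python
import json

# Hypothetical adapter layout. Key names and values are assumptions
# for illustration, not the actual GASP adapter schema.
adapter = {
    "fields": {
        "subscription_mrr": {
            "column": "analytics.billing_subscriptions.mrr_amount",
            "data_owner": "Data Engineering",  # team that owns the pipeline
        },
    },
    # Team that owns each metric definition.
    "metric_owners": {"nrr": "Finance"},
}


def malformed_refs(adapter):
    """Return field names whose reference is not schema.table.column."""
    return [
        name
        for name, mapping in adapter["fields"].items()
        if len(mapping["column"].split(".")) != 3
    ]


print(json.dumps(adapter, indent=2))
print("malformed references:", malformed_refs(adapter))
```

A check like `malformed_refs` is only a convenience for catching typos before validation; it says nothing about whether the column actually exists in the warehouse.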
GASP Standard
300 definitions, formulas, relationships
The knowledge graph (60 metrics, 15 business entities, 7 concepts, 164 relationships), the ontology (198 canonical fields, 22 source categories) and 73 parameterized SQL templates. This is the scaffolding. It comes ready-made.
Your Warehouse
Unchanged
Snowflake, BigQuery, Postgres, Redshift, Databricks. Your existing pipelines keep running. The twin reads from your warehouse. It writes nothing.
Why It Matters
Metric Accountability
Every metric in the adapter has a named owner: the team responsible for how it is calculated. Every field mapping has an optional data owner: the team responsible for the pipeline. When NRR looks wrong you know exactly who to ask and exactly which table.column to inspect.
Source System Independence
When you migrate from Chargebee to Stripe or Salesforce to HubSpot the twin absorbs the change. Remap the affected fields in the adapter. Every metric definition and every agent workflow continues to work against the new data. The cost of switching drops from months to an afternoon.
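To illustrate why a remap leaves definitions untouched, here is a small sketch in which a metric query is written only against a canonical field name and the adapter supplies the physical column. The field name, both warehouse paths and the `render_mrr_sql` helper are hypothetical; GASP's own SQL templates may be structured differently.

```python
def render_mrr_sql(adapter):
    """Render a query from a canonical field name, never a physical one."""
    # Split "schema.table.column" into the table path and the column.
    table, column = adapter["subscription_mrr"].rsplit(".", 1)
    return f"SELECT SUM({column}) AS mrr FROM {table}"


# Hypothetical mappings before and after a billing migration.
chargebee_adapter = {"subscription_mrr": "raw.chargebee_subscriptions.mrr"}
stripe_adapter = {"subscription_mrr": "raw.stripe_subscriptions.plan_amount"}

print(render_mrr_sql(chargebee_adapter))
print(render_mrr_sql(stripe_adapter))
```

Only the mapping changes between the two calls. The function that encodes the definition is the same, which is the property the twin relies on.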
Curated by Humans, Used by Agents
The twin is built from a one-time human curation of which metrics to track, how they connect and who owns them. Agents read from this curated model. They get the approved definitions and the governed relationships, not raw access to every table in your warehouse.
Agentic Operating Layer
Any number of agents can read from the twin in parallel. Spin up a retention agent for a board meeting. Spin up a pipeline agent for a forecast call. Tear them down when the meeting ends. With protocols like A2A and OSSA a coordinator can discover and orchestrate specialist agents on demand.
What You Can Ask
These are questions that are hard to answer today because the data lives in multiple systems and the context is spread across teams. With a digital twin an AI agent handles them in seconds.
"Which customer cohort is driving the NRR decline and what did their product usage look like in the 6 months before they churned?"
Agent calls infer_drivers("nrr", "down") to rank upstream candidates, then explain_path to trace NRR → Churned MRR → Logo Churn → Health Score → Feature Adoption with edge classifications. Generates cohort SQL from the adapter. Cross-references product analytics data. Returns a segmented analysis.
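The path tracing described above can be pictured as a breadth-first search over the relationship graph. The sketch below is not the implementation behind explain_path. The graph is a hand-written toy that contains the chain from the example plus one extra hypothetical edge (`expansion_mrr`), and `trace_path` is an illustrative name.

```python
from collections import deque

# Toy upstream graph: each metric points to the metrics that drive it.
# "expansion_mrr" is a hypothetical extra node added for illustration.
UPSTREAM = {
    "nrr": ["churned_mrr", "expansion_mrr"],
    "churned_mrr": ["logo_churn"],
    "logo_churn": ["health_score"],
    "health_score": ["feature_adoption"],
}


def trace_path(graph, start, goal):
    """Breadth-first search; returns the shortest path or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbor in graph.get(path[-1], []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


path = trace_path(UPSTREAM, "nrr", "feature_adoption")
print(" -> ".join(path))
```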
"Our CAC Payback went from 14 to 19 months this quarter. Break down exactly what changed."
Agent calls get_formula("cac_payback") to decompose CAC Payback = CAC / (ARPA × Gross Margin). Generates SQL for each component from the adapter via generate_query. Compares quarter-over-quarter. Identifies the driver.
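As a worked example of the formula CAC Payback = CAC / (ARPA × Gross Margin), the sketch below uses made-up quarterly inputs chosen so that payback moves from 14 to 19 months, then swaps in one input at a time to see which component moved the result most. All numbers and function names are hypothetical, and ARPA is assumed to be monthly so the result is in months.

```python
def cac_payback_months(cac, arpa, gross_margin):
    """CAC Payback = CAC / (ARPA x Gross Margin), with monthly ARPA."""
    return cac / (arpa * gross_margin)


# Hypothetical inputs for two consecutive quarters.
prev = {"cac": 8400.0, "arpa": 750.0, "gross_margin": 0.80}
curr = {"cac": 9120.0, "arpa": 600.0, "gross_margin": 0.80}


def one_at_a_time(prev, curr):
    """Change in payback when a single input moves to its new value."""
    base = cac_payback_months(**prev)
    effects = {}
    for name in prev:
        swapped = {**prev, name: curr[name]}
        effects[name] = cac_payback_months(**swapped) - base
    return effects


print(round(cac_payback_months(**prev), 1), "->",
      round(cac_payback_months(**curr), 1))
for name, delta in one_at_a_time(prev, curr).items():
    print(f"{name}: {delta:+.1f} months")
```

With these inputs the ARPA drop accounts for most of the change. The individual effects do not sum exactly to the total because the inputs interact multiplicatively.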
"We are switching from Chargebee to Stripe next quarter. Which metrics and data pipelines are affected?"
Agent reads the adapter. Finds all fields sourced from Billing. Lists the 26 metrics that depend on them and the field owners responsible for remapping.
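The impact analysis above amounts to a lookup from source category to fields to metrics. A minimal sketch, assuming a simplified in-memory view of the adapter; the field names, metric names and owners below are invented for illustration and are not taken from GASP.

```python
# Hypothetical slices of an adapter: where each field comes from,
# which fields each metric needs, and who owns each field mapping.
FIELD_SOURCE = {
    "subscription_mrr": "Billing",
    "invoice_amount": "Billing",
    "opportunity_amount": "CRM",
}
METRIC_FIELDS = {
    "nrr": {"subscription_mrr"},
    "arpa": {"invoice_amount"},
    "pipeline_coverage": {"opportunity_amount"},
}
FIELD_OWNER = {
    "subscription_mrr": "Data Engineering",
    "invoice_amount": "Finance Systems",
    "opportunity_amount": "RevOps",
}


def affected_by(source):
    """Map each affected metric to the fields that need remapping."""
    fields = {f for f, s in FIELD_SOURCE.items() if s == source}
    return {
        metric: sorted(needed & fields)
        for metric, needed in METRIC_FIELDS.items()
        if needed & fields
    }


for metric, fields in affected_by("Billing").items():
    owners = sorted({FIELD_OWNER[f] for f in fields})
    print(metric, fields, owners)
```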
"Show me every metric that goes into our board deck and flag any that are calculated differently than the GASP standard. What other metrics should we be tracking?"
Agent calls lookup_metric on each board metric and compares the canonical formula against the adapter SQL. Flags deviations with the metric owner. Then coverage_gap(board_metrics) surfaces direct upstream and downstream metrics not on the deck.
"If our three largest accounts by ARR churned tomorrow what would our NRR, GRR and Rule of 40 look like?"
Counterfactual "what if" simulation lives on the TWIN Modifier, not the MCP. Pull current ARR by account, set the lever, watch the propagation through the relationship graph in real time.
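For a sense of the arithmetic behind such a scenario, here is a back-of-the-envelope sketch. It does not reproduce the TWIN Modifier or its propagation logic. It uses commonly cited definitions (NRR and GRR as fractions of starting ARR, Rule of 40 as growth rate plus profit margin), which may differ from the GASP formulas, and every figure is invented.

```python
def retention(start_arr, expansion, contraction, churn):
    """Return (NRR, GRR) as fractions of starting ARR."""
    nrr = (start_arr + expansion - contraction - churn) / start_arr
    grr = (start_arr - contraction - churn) / start_arr
    return nrr, grr


def rule_of_40(current_arr, prior_year_arr, profit_margin_pct):
    """Growth rate (%) plus profit margin (%)."""
    growth_pct = (current_arr - prior_year_arr) / prior_year_arr * 100
    return growth_pct + profit_margin_pct


# Hypothetical baseline for the existing customer cohort.
base = dict(start_arr=10_000_000, expansion=1_500_000,
            contraction=200_000, churn=500_000)
top_three_arr = [400_000, 350_000, 250_000]  # three largest accounts
lost = sum(top_three_arr)

# Scenario: the three accounts churn on top of existing churn.
scenario = {**base, "churn": base["churn"] + lost}

print("NRR, GRR before:", retention(**base))
print("NRR, GRR after: ", retention(**scenario))
# Margin is assumed unchanged in the scenario, a deliberate simplification.
print("Rule of 40 before:", rule_of_40(11_000_000, 8_000_000, 5.0))
print("Rule of 40 after: ", rule_of_40(11_000_000 - lost, 8_000_000, 5.0))
```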
How to Build It
Install the GASP MCP server
One command. 11 tools. 300 definitions. Instant access.
claude mcp add gasp-standard -- npx -y gasp-standard-mcp
Download the adapter template
194 fields grouped by source system. Each annotated with which metrics use it.
Map your warehouse
Fill in schema.table.column for the fields your warehouse provides.
Start with Billing fields. They cover the most metrics.
Assign metric owners
Declare which team owns each metric definition. This is required. See the adapter guide for the format.
Validate
Run validate_adapter to see your coverage.
Fully covered metrics get SQL generation. Partial coverage shows exactly which fields to add next.
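The coverage idea can be sketched as a set comparison between the fields each metric needs and the fields you have mapped so far. This is an illustration of the concept, not the logic or output format of validate_adapter; the metric and field names are hypothetical.

```python
# Hypothetical requirements: fields each metric needs.
REQUIRED_FIELDS = {
    "mrr": {"subscription_mrr"},
    "nrr": {"subscription_mrr", "expansion_mrr", "churned_mrr"},
    "cac": {"sales_spend", "marketing_spend", "new_customers"},
}


def coverage_report(required, mapped):
    """Classify each metric as full, partial or none, with missing fields."""
    report = {}
    for metric, needed in required.items():
        missing = sorted(needed - mapped)
        if not missing:
            status = "full"
        elif len(missing) < len(needed):
            status = "partial"
        else:
            status = "none"
        report[metric] = {"status": status, "missing": missing}
    return report


# Fields mapped so far, starting with Billing.
mapped_fields = {"subscription_mrr", "churned_mrr"}
for metric, result in coverage_report(REQUIRED_FIELDS, mapped_fields).items():
    print(metric, result["status"], result["missing"])
```

The `missing` list is the useful part in practice: it tells you which field to map next to move a metric from partial to full coverage.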
Query
Ask your AI assistant complex questions. The agent uses the relationship graph to trace dependencies, the adapter to find the right columns and the query generator to produce warehouse-specific SQL.
Architecture
How the pieces connect, for data engineers who want to understand the full stack.
Optional: Deploy the Knowledge Graph
The MCP server works standalone. For teams that want to explore the graph visually or run custom Cypher queries you can deploy to Neo4j:
cat gasp-knowledge-graph.cypher | cypher-shell -u neo4j -p password
The Cypher file includes 60 metric nodes, 11 entity nodes, 197 field nodes and 22 source category nodes with full relationship edges and indexes. See the full schema.
Govern the Agents
A digital twin defines what your agents measure. GASP: AICF defines how those agents are governed. 168 canonical controls across SOC 2, NIST AI RMF, EU AI Act, ISO 42001 and four other frameworks. Each with evidence requirements, risk tiers and a full question bank for intake review.
Classify any AI tool against 8 frameworks in one step. Run a gap report, generate a questionnaire, track evidence and issue a decision. All from Claude Code or Claude Desktop via MCP.
- Controls: 168
- Frameworks: 8
- AI-specific: 35
- Mappings: 807
Frequently Asked Questions
Does this require changes to our existing infrastructure?
No. The twin layers on top of what you already have. Your warehouse, pipelines and dashboards stay exactly as they are. The adapter is a JSON file that describes your existing schema. Installing the MCP server is a single command.
What is the real effort involved?
The technical setup is fast. Installing the MCP server takes one command. Mapping warehouse fields to the adapter template takes a data engineer an afternoon if your dbt project is well structured.
The real effort is organizational: getting cross-functional agreement on which metrics matter, how each one is calculated and who owns the definition. This is the conversation most companies avoid because it surfaces years of accumulated inconsistency. It is also the most valuable part of the process. GASP provides the starting vocabulary so your team is editing a draft rather than writing from scratch. And this work only needs to happen once. After that initial curation the twin is a living reference that agents, analysts and executives all share.
Do agents write to our warehouse or source systems?
No. Agents read from the twin, not from your CRM or billing system. They generate SQL that a human or CI pipeline can review and execute. The twin is a read-only semantic layer.
Can multiple agents use the twin simultaneously?
Yes. The twin is a shared model. Any number of agents can read from it in parallel. One for retention analysis. Another for pipeline diagnostics. Another for board prep. They share the same canonical definitions and the same adapter mapping so their answers are always consistent.
What if our definitions differ from the GASP standard?
That is expected. GASP is a starting point, not a mandate. Your team reviews each definition, adjusts where your business model requires it and documents the deviation in the adapter. The value is in having one agreed definition per metric, regardless of whether it matches GASP exactly. The standard gives you 300 definitions to react to rather than 300 blank pages to fill.
Who should own the curation process?
Typically RevOps or Finance leads the initial curation because they sit at the intersection of Sales, CS and Engineering data. The adapter makes ownership explicit: each metric has a named team, each field has an optional pipeline owner. After the initial pass, ownership is distributed. The teams closest to the data maintain their own sections.
How does this relate to a semantic layer like dbt metrics or Cube?
Complementary. A semantic layer defines how to query your warehouse. The twin defines what those queries mean in business terms: which metrics connect to which, who owns them and what the canonical formula should be. If you already have a dbt metrics layer the adapter mapping is even simpler because your field references are already well structured.