A white-label, self-hosted catalog with an AI assistant that only sees what you expose, using the model you choose. Built for teams that take privacy seriously.
Connect anything → govern everything
What's inside
A full governance toolkit, in one private deployment. No SaaS. No data leaving your perimeter.
One searchable inventory of every table, column, measure and report across your entire stack.
Definitions, owners, stewards, tags and SLAs on every asset — finally in one place.
Cross-tool dependency map. Click any node to see what breaks downstream if it changes.
Surface unused columns and measures, plus the highest-impact assets your team should care about most.
Chat over your catalog, lineage and SQL — using your model, with strict guardrails. Zero hallucination, tool-grounded answers.
Slack notifications on every sync — run metrics, durations and failure traces, so engineers catch issues fast.
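As an illustration of the downstream-impact lookup described for the dependency map above, here is a minimal sketch that assumes lineage is stored as a simple mapping from each asset to the assets that read from it. The `LINEAGE` data, the `downstream` function, and most asset names are hypothetical and do not reflect datagov's actual storage or API.

```python
from collections import deque

# Hypothetical lineage edges: each asset maps to the assets that read from it.
# Names other than mart_finance.bookings and revenue_per_booking are made up.
LINEAGE = {
    "raw.bookings": ["mart_finance.bookings"],
    "mart_finance.bookings": ["revenue_per_booking", "bookings_dashboard"],
    "revenue_per_booking": ["finance_weekly_report"],
}


def downstream(asset: str, edges: dict[str, list[str]]) -> list[str]:
    """Return every asset reachable from `asset`, in breadth-first order."""
    seen = {asset}            # guards against cycles and excludes the start node
    order: list[str] = []
    queue = deque([asset])
    while queue:
        current = queue.popleft()
        for child in edges.get(current, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


impacted = downstream("raw.bookings", LINEAGE)
```

Breadth-first order lists the closest dependents first, which is usually the order in which a change would be noticed.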
Privacy & security
datagov is delivered as a fully white-label, in-house deployment. There is no shared SaaS tenant, no third-party data plane, no telemetry phoning home.
You own the infrastructure. You own the database. You own the model. We just build the software that runs on top.
Your cloud account, your network, your auth. Zero external dependencies for the core product.
Point the assistant at OpenAI, Azure OpenAI, Anthropic, or a fully local LLM. The model only ever sees the metadata you whitelist.
Live query tools are SELECT-only, dry-run first, byte-capped, and row-capped.
Toggle live tools per source. Your admins decide exactly what the assistant is allowed to touch.
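To make the live-query guardrails above concrete, here is a rough sketch of how a SELECT-only check, a dry-run byte cap, and a row cap could fit together. Everything in it is an assumption for illustration: the `Limits` defaults, the `check_statement` and `guard` functions, and the keyword list are hypothetical, the dry-run estimate is passed in rather than fetched from a warehouse, and the subquery wrapping only works in SQL dialects that allow it.

```python
import re
from dataclasses import dataclass

# Keywords that should never appear in a read-only statement (illustrative list).
FORBIDDEN = re.compile(
    r"\b(insert|update|delete|merge|drop|alter|create|truncate|grant|revoke)\b",
    re.IGNORECASE,
)


@dataclass(frozen=True)
class Limits:
    max_bytes: int = 1_000_000_000  # hypothetical byte cap
    max_rows: int = 1_000           # hypothetical row cap


def check_statement(sql: str) -> str:
    """Accept a single SELECT/WITH statement; raise ValueError otherwise."""
    stmt = sql.strip().rstrip(";").strip()
    if ";" in stmt:
        raise ValueError("multiple statements are not allowed")
    if not re.match(r"(select|with)\b", stmt, re.IGNORECASE):
        raise ValueError("only SELECT/WITH statements are allowed")
    if FORBIDDEN.search(stmt):
        raise ValueError("statement contains a write or DDL keyword")
    return stmt


def guard(sql: str, estimated_bytes: int, limits: Limits = Limits()) -> str:
    """Validate, enforce the byte cap from a dry run, then apply the row cap."""
    stmt = check_statement(sql)
    if estimated_bytes > limits.max_bytes:
        raise ValueError("dry-run estimate exceeds the byte cap")
    # Wrapping in a subquery caps rows without parsing the inner statement.
    return f"SELECT * FROM ({stmt}) AS capped LIMIT {limits.max_rows}"
```

The keyword check is deliberately conservative: it also rejects a harmless query whose string literal contains a word such as `delete`. A production guard would more likely rely on a real SQL parser plus read-only database credentials.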
AI assistant
Grounded in your catalog. Restricted to your guardrails. Speaks your data definitions, not generic ones.
Connected to your catalog
Where is revenue_per_booking defined?
| Name | Owner | Used by |
|---|---|---|
| revenue_per_booking | finance-team | 14 reports |
Defined in mart_finance.bookings. Want me to show the SQL?
Zero hallucination
Always grounded in catalog tools. If the answer isn't in your metadata, the assistant says so.
Read-only by default
Live SQL is restricted to SELECT/WITH statements, runs as a dry run first, and is capped by bytes and rows.
Asks before guessing
Ambiguous prompts trigger a clarifying question — never a fabricated answer.
Per-org feature flags
Admins toggle which live tools the assistant can use, per source.
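One way to picture the per-source toggles above is a deny-by-default lookup: a live tool runs against a source only if an admin has switched it on there. This is a sketch under assumptions, not datagov's configuration format. `SourcePolicy`, `tool_allowed`, and the source and tool names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SourcePolicy:
    """Admin-controlled switches for one connected source (hypothetical shape)."""
    live_tools: set[str] = field(default_factory=set)


def tool_allowed(policies: dict[str, SourcePolicy], source: str, tool: str) -> bool:
    """A tool is usable only if an admin enabled it for that source."""
    policy = policies.get(source)
    return policy is not None and tool in policy.live_tools


# Example: live SELECTs are enabled on the warehouse but not on the BI tool.
policies = {
    "warehouse": SourcePolicy(live_tools={"run_select"}),
    "bi": SourcePolicy(),
}
```

Unknown sources and unlisted tools both resolve to "not allowed", so connecting a new source never exposes it to live queries until someone opts in.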
How it works
Add your sources with credentials you control. Your data never leaves your network.
A scheduled background pipeline extracts and transforms metadata automatically.
Use the dictionary, lineage and insights to clean up, document and track impact.
Ask the assistant anything about your data — it answers from your catalog, not the internet.
Get a live walkthrough of datagov on your own stack. No SaaS sign-ups, no data leaving your perimeter.