DG
datagov
Data Governance · 100% In-house

Data governance, fully in-house.

A white-label, self-hosted catalog with an AI assistant that only sees what you expose, using the model you choose. Built for teams that take privacy seriously.

Self-hosted · White-label · Bring your own LLM · Zero data egress
app.yourcompany.com / datagov
Total Reports: 1,284 · Total Measures: 3,792 · Cleanup: 186
Lineage: src · stg · stg · mart · BI

Connect anything → govern everything

BI platforms · Transformation layers · Cloud warehouses · Relational DBs (soon) · & more

What's inside

Everything your data team has always wished for.

A full governance toolkit, in one private deployment. No SaaS. No data leaving your perimeter.

Unified Catalog

One searchable inventory of every table, column, measure and report across your entire stack.

Data Dictionary

Definitions, owners, stewards, tags and SLAs on every asset — finally in one place.

Lineage Graph

Cross-tool dependency map. Click any node to see what breaks downstream if it changes.
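
To make the downstream-impact idea concrete, here is a minimal Python sketch of that kind of lookup. It assumes lineage is stored as a plain adjacency map from each asset to the assets that read from it. The asset names and the `downstream` helper are hypothetical illustrations, not datagov's actual data model or API.

```python
from collections import deque

# Hypothetical lineage edges: each asset maps to the assets that depend on it.
LINEAGE = {
    "src.bookings": ["stg.bookings"],
    "stg.bookings": ["mart_finance.bookings"],
    "mart_finance.bookings": ["report.revenue_overview", "report.booking_trends"],
}

def downstream(graph, start):
    """Return every asset reachable from `start`, in breadth-first order."""
    seen = set()
    order = []
    queue = deque(graph.get(start, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue  # already visited via another path
        seen.add(node)
        order.append(node)
        queue.extend(graph.get(node, []))
    return order

impacted = downstream(LINEAGE, "stg.bookings")
print(impacted)
```

Everything in the returned list is a candidate for breakage if the starting asset is renamed or dropped.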

Cleanup & Impact Insights

Surface unused columns and measures, plus the highest-impact assets your team should care about most.

AI Assistant

BYO LLM

Chat over your catalog, lineage and SQL — using your model, with strict guardrails. Zero hallucination, tool-grounded answers.

Pipeline Alerts

Slack notifications on every sync — run metrics, durations and failure traces, so engineers catch issues fast.
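
As a rough illustration, a sync summary could be turned into a Slack message along these lines. The sketch assumes a Slack incoming webhook, which accepts a JSON body with a `text` field. The `run` dictionary keys and the `build_sync_alert` helper are hypothetical, and the sketch only builds the payload rather than sending it.

```python
import json

def build_sync_alert(run):
    """Format a sync run summary as a Slack incoming-webhook payload."""
    status = "failed" if run.get("error") else "succeeded"
    lines = [
        f"Sync {status}: {run['source']}",
        f"Assets processed: {run['assets']}",
        f"Duration: {run['duration_s']:.1f}s",
    ]
    if run.get("error"):
        # Include the failure trace so engineers can triage from the channel.
        lines.append(f"Failure trace: {run['error']}")
    return json.dumps({"text": "\n".join(lines)})

payload = build_sync_alert(
    {"source": "warehouse", "assets": 3792, "duration_s": 41.27, "error": None}
)
print(payload)
```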

Privacy & security

Your data never leaves your perimeter.

datagov is delivered as a fully white-label, in-house deployment. There is no shared SaaS tenant, no third-party data plane, no telemetry phoning home.

You own the infrastructure. You own the database. You own the model. We just build the software that runs on top.

Dockerized · Postgres · Nginx · runs anywhere

Deployed in your VPC / on-prem

Your cloud account, your network, your auth. Zero external dependencies for the core product.

Bring your own model

Point the assistant at OpenAI, Azure OpenAI, Anthropic, or a fully local LLM. The model only ever sees the metadata you whitelist.
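
One way to picture metadata allowlisting is as a filter applied before anything reaches the model. The sketch below is illustrative only: the field names, the `ALLOWED_FIELDS` set and the `redact_for_model` helper are assumptions, not datagov's actual configuration.

```python
# Hypothetical allowlist: the only metadata fields the model may see.
ALLOWED_FIELDS = {"name", "description", "owner", "tags"}

def redact_for_model(asset):
    """Keep only allowlisted metadata fields before building a prompt."""
    return {key: value for key, value in asset.items() if key in ALLOWED_FIELDS}

asset = {
    "name": "revenue_per_booking",
    "owner": "finance-team",
    "description": "Revenue divided by completed bookings",
    "sample_rows": [[101, 42.5]],         # not allowlisted, so dropped
    "connection_string": "placeholder",   # not allowlisted, so dropped
}
print(redact_for_model(asset))
```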

Read-only by default

Live query tools are SELECT-only, dry-run first, byte-capped, and row-capped.
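
A simplified sketch of what a read-only guard with a row cap might look like is shown below. It is an assumption-laden illustration: the `guard_query` helper, the keyword list and the `MAX_ROWS` value are hypothetical, the keyword check is deliberately conservative and may reject harmless queries, and a production guard would more likely rely on a SQL parser and read-only database credentials. Dry runs and byte caps depend on the warehouse and are not shown.

```python
import re

MAX_ROWS = 1000  # hypothetical row cap

# Keywords that indicate a write or DDL statement anywhere in the query.
FORBIDDEN = re.compile(
    r"\b(insert|update|delete|merge|drop|alter|create|truncate|grant|revoke)\b",
    re.IGNORECASE,
)

def guard_query(sql, max_rows=MAX_ROWS):
    """Reject anything that is not a single read-only statement, then cap rows."""
    statement = sql.strip().rstrip(";").strip()
    if ";" in statement:
        raise ValueError("only a single statement is allowed")
    if not re.match(r"(select|with)\b", statement, re.IGNORECASE):
        raise ValueError("only SELECT/WITH statements are allowed")
    if FORBIDDEN.search(statement):
        raise ValueError("write or DDL keyword found")
    # Wrap the query so the cap applies even if it has its own LIMIT.
    return f"SELECT * FROM ({statement}) AS guarded LIMIT {max_rows}"

print(guard_query("SELECT name, owner FROM catalog.measures;"))
```

Wrapping the statement in an outer query keeps the cap in force even when the inner query sets a larger limit of its own.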

Per-org feature flags

Toggle live tools per source. Your admins decide exactly what the assistant is allowed to touch.
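
Per-source toggles like these can be thought of as a default-deny lookup. The following sketch assumes a nested dictionary of flags. The organization, source and tool names, and the `tool_allowed` helper, are all hypothetical.

```python
# Hypothetical per-org configuration: live tools enabled for each source.
ORG_FLAGS = {
    "acme": {
        "warehouse": {"run_sql": True, "preview_rows": False},
        "bi": {"run_sql": False, "preview_rows": False},
    },
}

def tool_allowed(org, source, tool, flags=ORG_FLAGS):
    """Return True only if an admin explicitly enabled the tool (default deny)."""
    return flags.get(org, {}).get(source, {}).get(tool, False)

print(tool_allowed("acme", "warehouse", "run_sql"))
print(tool_allowed("acme", "bi", "run_sql"))
```

Unknown organizations, sources or tools fall through to `False`, so nothing is enabled by accident.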

AI assistant

An assistant that actually knows your data.

Grounded in your catalog. Restricted to your guardrails. Speaks your data definitions, not generic ones.

DG

Governance Assistant

Connected to your catalog

Online
Where is revenue_per_booking defined?
DG
Found 1 measure in semantic model finance_core:
Name                 Owner         Used by
revenue_per_booking  finance-team  14 reports

Defined in mart_finance.bookings. Want me to show the SQL?

Yes, and tell me which reports break if I rename it.
DG
Tracing lineage…

Built-in guardrails

  • Zero hallucination

    Always grounded in catalog tools. If the answer isn't in your metadata, the assistant says so.

  • Read-only by default

    Live SQL is restricted to SELECT/WITH statements, dry-run first, byte-capped and row-capped.

  • Asks before guessing

    Ambiguous prompts trigger a clarifying question — never a fabricated answer.

  • Per-org feature flags

    Admins toggle which live tools the assistant can use, per source.

How it works

From zero to governed in four steps.

1. Connect

Add your sources with credentials you control. Your data never leaves your network.

2. Sync

A scheduled background pipeline extracts and transforms metadata automatically.

3. Govern

Use the dictionary, lineage and insights to clean up, document and track impact.

4. Discover

Ask the assistant anything about your data — it answers from your catalog, not the internet.

Ready to bring data governance in-house?

Get a live walkthrough of datagov on your own stack. No SaaS sign-ups, no data leaving your perimeter.