CTO & Co-founder of Visivo

A Practical Git Workflow for Data Teams (Branches, PRs, and Reviews)

Software teams settled the version-control debate years ago. Here is a concrete branch-and-PR workflow for analysts who want the same safety net for dashboards.

A practical Git workflow for data teams is the same one software teams adopted years ago: create a branch for each change, open a pull request, get a review, and merge once checks pass. Map those four steps onto analytics and you get a real safety net for dashboards and metrics, with every change tracked, reviewed, and reversible. This guide is written for analysts who are new to Git and want the minimal version that actually sticks.

The version-control debate is over in software engineering. Nobody edits production code by SSHing into a server and saving a file in place. Yet that is still how a lot of analytics work happens: someone opens a BI tool, edits a live calculation, hits save, and hopes for the best. The result is the genre of bug every data team knows by heart, the metric that quietly changed and nobody can say when or why. Git fixes this, and it does not require you to become a software engineer to benefit.

Why analysts resisted Git for so long

Git earned a reputation for being hostile. The command line is unforgiving, the mental model of commits and branches and remotes is not obvious, and the failure mode when you get it wrong is a wall of red text about detached HEADs and merge conflicts. For an analyst whose job is answering business questions, that learning curve felt like a tax with no obvious payoff.

There was also a tooling gap. For most of BI's history, dashboards were not text. They lived inside a proprietary tool as binary blobs or rows in a vendor database. Git is built to track changes in text files, so when your dashboard is not text, Git has nothing meaningful to diff. You could check in an export, but a one-line change to a chart could produce an unreadable diff thousands of lines long. Version control only pays off when the thing you are versioning is legible, and legacy BI artifacts were not.

The third reason is cultural. Analysts were trained in SQL, spreadsheets, and BI tools, not in software process. Branches and pull requests were someone else's vocabulary. So the resistance was rational. The payoff was unclear, the tooling did not fit, and the workflow belonged to a different discipline.

What changed is that analytics moved toward code. When your metrics, models, and dashboards are defined in plain-text files, the original objection disappears. The diff is readable, the review is meaningful, and the same workflow that protects application code protects your numbers. This shift is the foundation of BI-as-code, and Git is the substrate it runs on.

The minimal Git workflow that actually sticks

You do not need the full Git surface area. You need a loop you can repeat without thinking, made of five moves.

  1. Pull the latest version of the main branch so you start from current truth.
  2. Branch to create an isolated workspace for your change.
  3. Commit as you make discrete, meaningful edits.
  4. Push your branch and open a pull request so others can review.
  5. Merge once the review is approved and automated checks are green.

In commands, the everyday loop looks like this:

# Start from the latest main
git checkout main
git pull

# Branch for the specific change
git checkout -b fix/net-revenue-excludes-refunds

# ... edit your project files ...

git add project.visivo.yml
git commit -m "Subtract refunds from net_revenue per finance definition"

# -u links the local branch to its remote copy, so later pushes are just "git push"
git push -u origin fix/net-revenue-excludes-refunds
# then open a PR in GitHub/GitLab

The mental model that makes this click: a commit is one discrete analytics change with a message explaining why; a branch is a parallel line of work that cannot affect anyone else until you merge; a pull request is the formal moment where a change is proposed and reviewed; and a merge is integration into the shared truth. Four concepts, one loop. Everything else in Git you learn the day you actually need it.

A good commit message earns its keep months later. "Update dashboard" tells future-you nothing. "Subtract refunds from net_revenue per finance definition" tells you what changed and why, exactly what you want when you run git blame on a metric and need to reconstruct a decision.
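
To make that concrete, here is a minimal sketch of what git blame gives back. It builds a throwaway repository in a temp directory; the file name metrics.yml, the analyst identity, and the first commit message are illustrative assumptions, not part of any real project.

```shell
set -euo pipefail

# Throwaway repository so the demo never touches a real project.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Demo Analyst"            # placeholder identity
git config user.email "analyst@example.com"

# First commit: the original (hypothetical) metric definition.
printf 'net_revenue: SUM(amount)\n' > metrics.yml
git add metrics.yml
git commit -q -m "Define net_revenue as gross order amount"

# Second commit: the change, with a message that records the reason.
printf 'net_revenue: SUM(amount) - SUM(refund_amount)\n' > metrics.yml
git commit -q -am "Subtract refunds from net_revenue per finance definition"

# Which commit last touched each line of the file?
git blame -s metrics.yml

# What did that commit say about why?
last_message="$(git log -1 --format=%s -- metrics.yml)"
echo "$last_message"
```

The blame output points at a commit hash for the line, and the log line attached to that hash carries the reason. With a message like "Update dashboard", the same lookup would lead nowhere.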

Branches: parallel work without conflicts

A branch is the single most underrated tool for a data team, because analytics work is inherently parallel and inherently interruptible. You are halfway through redesigning the marketing dashboard when finance asks for an urgent fix to the revenue model. Without branches, those two changes pile up in the same working copy and you ship them tangled together, or you ship half of one. With branches, each lives in its own line of work.

# Analyst A is rebuilding the marketing dashboard
git checkout -b feature/marketing-dashboard-redesign

# Analyst B fixes the revenue model in parallel, untouched by A's work
git checkout main
git checkout -b fix/revenue-completed-orders-only

Because the two branches touch different files (or different parts of the same file), they merge cleanly and independently. Analyst A's half-finished redesign never leaks into production while Analyst B's fix ships immediately. The urgent fix does not have to wait for the redesign to finish, and the redesign does not have to be rushed to unblock the fix.
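
The independence is easy to verify in a scratch repository. The sketch below is an illustration under assumptions: the two file names and their contents are made up, and merging locally stands in for what a merged pull request would do on GitHub or GitLab.

```shell
set -euo pipefail

repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Demo Analyst"            # placeholder identity
git config user.email "analyst@example.com"

# Shared starting point on main (hypothetical files).
printf 'title: Marketing\n' > marketing.yml
printf "filter: status IN ('completed', 'pending')\n" > revenue.yml
git add .
git commit -q -m "Initial dashboards"
git branch -M main

# Analyst A: long-running redesign, not ready to ship.
git checkout -q -b feature/marketing-dashboard-redesign
printf 'title: Marketing (redesign in progress)\n' > marketing.yml
git commit -q -am "Start marketing dashboard redesign"

# Analyst B: urgent fix, branched from main, unaware of A's work.
git checkout -q main
git checkout -q -b fix/revenue-completed-orders-only
printf "filter: status = 'completed'\n" > revenue.yml
git commit -q -am "Count only completed orders in revenue"

# Ship B's fix now. A's redesign stays on its own branch.
git checkout -q main
git merge -q --no-ff -m "Merge fix/revenue-completed-orders-only" \
  fix/revenue-completed-orders-only
marketing_when_fix_shipped="$(cat marketing.yml)"
echo "$marketing_when_fix_shipped"

# Later, when the redesign is done, it merges without conflict.
git merge -q --no-ff -m "Merge feature/marketing-dashboard-redesign" \
  feature/marketing-dashboard-redesign
cat marketing.yml revenue.yml
```

At the moment the fix ships, main still has the original marketing title. The redesign arrives only when its own merge happens.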

Branches also make experimentation cheap and safe. Want to try a completely different cut of the funnel metric? Branch, build it, and if it does not pan out, delete the branch. Nothing in main was ever touched, so the cost of a bad idea drops to nearly zero, which is the condition under which people try good ones.
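
A throwaway experiment can be sketched like this, again in a scratch repository. The metric file and the branch name experiment/funnel-by-user are hypothetical.

```shell
set -euo pipefail

repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Demo Analyst"            # placeholder identity
git config user.email "analyst@example.com"

printf 'funnel_metric: COUNT(DISTINCT session_id)\n' > metrics.yml
git add metrics.yml
git commit -q -m "Define funnel metric by sessions"
git branch -M main
main_before="$(git rev-parse main)"

# Try a different cut of the metric on its own branch.
git checkout -q -b experiment/funnel-by-user
printf 'funnel_metric: COUNT(DISTINCT user_id)\n' > metrics.yml
git commit -q -am "Try counting the funnel by user instead of session"

# It did not pan out: go back to main and throw the branch away.
# -D is needed because the branch was never merged; -d would refuse.
git checkout -q main
git branch -D experiment/funnel-by-user

main_after="$(git rev-parse main)"
cat metrics.yml
```

The commit that main points to is identical before and after, so the experiment left no trace in the shared truth.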

Keep branches short-lived and focused on one change. A branch that lives for three weeks and touches forty files is a merge conflict waiting to happen and a review nobody can do well. One change, one branch, one PR is the default that keeps the workflow frictionless.

Pull requests as a metric review gate

If branches are the most underrated tool, the pull request review is the most valuable part of the whole workflow, because it is where bad metric logic gets caught before a stakeholder ever sees a wrong number.

A code review on a dashboard change is not bureaucracy. It is a second pair of eyes asking the questions that matter:

  • Does this SQL join on the right key, or will it fan out and double-count revenue?
  • Does the date logic handle month boundaries and time zones correctly?
  • Does this metric definition match what finance actually means by "net revenue"?
  • Will any dashboard that depends on this model break?

These are precisely the errors that are invisible in a live-edit BI tool and obvious in a diff. When a reviewer sees WHERE status = 'completed' disappear from a revenue query, they catch the bug in thirty seconds. When that same change is made directly in a production dashboard with no review, it surfaces as an unexplained revenue spike in the CFO's Monday meeting.
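
Here is roughly what that reviewer sees. This sketch uses plain diff -u on two temp files to imitate the unified diff a pull request displays; the query itself is a made-up example.

```shell
set -euo pipefail

workdir="$(mktemp -d)"

# The revenue query as it exists on main (hypothetical).
cat > "$workdir/before.sql" <<'SQL'
SELECT SUM(amount) AS revenue
FROM orders
WHERE status = 'completed'
SQL

# The same query after the proposed change.
cat > "$workdir/after.sql" <<'SQL'
SELECT SUM(amount) AS revenue
FROM orders
SQL

# diff exits with status 1 when the files differ, so guard it under `set -e`.
review_diff="$(diff -u "$workdir/before.sql" "$workdir/after.sql" || true)"
printf '%s\n' "$review_diff"
```

The removed filter shows up as a single line starting with a minus sign. In a live-edit tool there is no equivalent: the old query is simply gone.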

The PR is also where context gets written down. The description explains the business reason, the testing performed, and the expected impact. That record outlives the people who wrote it. A new analyst can scroll the PR history and learn how every number on the dashboard came to be defined, which is the cheapest onboarding a team will ever get. This is the same discipline that gives you a single source of truth for metrics: the definition and the reasoning behind it live together, in the open.

A practical norm: require at least one approving review before anything merges into main, and route metric-sensitive changes to whoever owns the business definition. You are not adding ceremony. You are making sure the person who knows what "active customer" means signs off before the definition changes under them.
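
On GitHub, one way to do that routing is a CODEOWNERS file, which requests review from the listed owners when matching files change. The sketch below writes one into a scratch directory. The team handles (@example-org/...) and the models/finance/ path are hypothetical, and the file only blocks merges if the branch protection rule for main has "Require review from Code Owners" enabled.

```shell
set -euo pipefail

# Stand-in for the root of your repository.
project="$(mktemp -d)"
mkdir -p "$project/.github"

cat > "$project/.github/CODEOWNERS" <<'EOF'
# Later matching patterns take precedence over earlier ones.

# Default: any Visivo project file needs an analytics reviewer (hypothetical team).
*.visivo.yml        @example-org/analytics

# Finance-owned definitions need a finance reviewer (hypothetical path and team).
/models/finance/    @example-org/finance-data
EOF

cat "$project/.github/CODEOWNERS"
```

Once it is committed on main, a PR that touches a matching file automatically requests the owning team, so nobody has to remember who signs off on which metric.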

Wiring CI checks into the merge

Human review catches logic errors. Automated checks catch the mechanical ones, and they do it on every PR without anyone remembering to. This is the role of continuous integration in an analytics workflow.

When a pull request opens, a CI pipeline can automatically validate the change before a human even looks at it: confirm the configuration is syntactically valid, run data-quality assertions against the affected models, and even spin up a preview build of the dashboard so reviewers can click through the actual result. Pair that with a protected main branch, configured so nothing merges until the checks pass, and you have closed the loop. A change that breaks validation simply cannot reach production.

# .github/workflows/validate-analytics.yml
name: Validate Analytics
on:
  pull_request:
    paths: ["**.visivo.yml"]
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: pip install visivo
      - run: visivo compile      # config valid? references resolve?
      - run: visivo test         # data-quality assertions pass?

The combination of protected branches plus CI is what turns Git from a filing cabinet into a quality gate. The branch protection enforces that review and checks happened; the CI does the checking. Neither relies on a person being diligent at the exact moment it matters, which is the whole point of automation.

Doing this with Visivo dashboards

Everything above assumes your analytics live as plain text, and that is exactly what Visivo provides. In Visivo, your data sources, models, metrics, dimensions, relations, and dashboards are all defined in YAML files that sit in a Git repository. There is no separate "export to Git" step, because the YAML is the dashboard. That makes this entire workflow native rather than bolted on.

A metric definition is a few readable lines, which means a change to it produces a clean, reviewable diff:

models:
  - name: orders
    sql: SELECT * FROM orders_table
    metrics:
      - name: net_revenue
        expression: "SUM(amount) - SUM(refund_amount)"
        description: "Gross revenue minus refunds, per finance definition"

When an analyst changes that expression on a branch and opens a PR, the reviewer sees exactly which line moved and why. CI runs visivo compile to confirm every ${ref(...)} resolves and visivo test to run quality assertions, and a preview build renders the affected dashboard so reviewers see the real chart. The branch, the review, the checks, the merge: the same loop software teams have trusted for years, now applied to the numbers your business runs on.

To see the file-based model in action, "From YAML to dashboard" walks through a full project, and "Get started" gets a local project running in a few minutes. The point is the same throughout: dashboards are too important to edit in place and hope. Branch, review, merge, and let the safety net do its job.

Previously in Visivo

This post continues a series on bringing software discipline to analytics. Previously we covered building a single source of truth for metrics, the foundation that makes a metric review gate meaningful in the first place. Up next, we look at how the same code-first stack powers headless BI, one metrics layer serving every consumer.
