I’ve worked as an infrastructure consultant in Norway for about two and a half years. That’s not long. But in that time, every single client I’ve worked with has had a version of the same problem.
Someone pushes a change to a shared Terraform module. An hour later, Slack lights up. Three teams are broken. Nobody knew they depended on that module. Or rather, one person knew. And they left six months ago.
The dependency chain was always there. It just wasn’t visible to anyone.
This post is about that problem: the complete absence of cross-repo dependency visibility in modern infrastructure teams. It’s a problem I kept running into, kept hearing about, and eventually decided to build something to fix. But before I talk about the fix, I want to talk about why the problem exists, why it’s getting worse, and why the “obvious” solutions don’t actually work.
What is cross-repo dependency visibility?
Most engineering teams today operate in a polyrepo setup. Not because they chose it carefully — usually because the org grew, teams split, acquisitions happened, and now there are a hundred or more repos across GitLab or GitHub.
Inside those repos, there are dependencies everywhere. A Terraform module in one repo is sourced by twelve others. A Docker base image built in one pipeline is pulled by thirty Dockerfiles across the org. A GitLab CI template is included by eighty pipelines. A Helm chart references another internal chart. An Ansible role depends on a shared collection.
Cross-repo dependency visibility means being able to answer one question: if I change this thing, what else is affected?
Not “what might be affected.” Not “let me grep around and hope I find everything.” But a clear, always-current, automatically maintained answer.
Almost nobody has this.
Why this problem shows up at every growing org
The pattern is remarkably consistent. Here’s how it plays out:
At 10–30 repos, you probably don’t need tooling. A senior engineer holds the dependency map in their head. They can tell you “oh, if you touch the VPC module, repos A, B, and C use it.” This works fine. It’s fast. It’s accurate. It requires zero infrastructure.
At 50–100 repos, the mental model starts cracking. Nobody can hold all the relationships in their head anymore. People start grepping across repos: cloning everything locally, running ripgrep, and assembling a picture by hand. It takes hours. The results are point-in-time snapshots that are stale by the next morning.
At 100–500+ repos, grep breaks down entirely. The dependency graph is too large, too dynamic, and spans too many ecosystems to track manually. The “right” answer to “who uses this module?” becomes “ask in the #platform channel and wait for replies.”
The breaking point is almost always the same: someone pushes a breaking change and multiple downstream consumers fail. Nobody knew who to notify. Or a senior engineer leaves and their mental map of the org leaves with them. Or a new hire joins the platform team and literally cannot understand what depends on what.
It’s not a Terraform problem — it’s an org-wide visibility problem
When I first encountered this, I thought it was a Terraform thing. Terraform modules get sourced across repos, and there’s no built-in way to know who’s consuming a given module at which version.
But it’s much bigger than Terraform.
The real-world dependency graph for a platform team includes Terraform module sources (via git URL or registry), Docker base images (often pinned to :latest everywhere), GitLab CI templates included by dozens of pipelines, GitHub Actions reusable workflows, Helm charts referencing other internal charts, Ansible roles and collections, Python packages installed from internal git URLs, Go modules, and npm packages published to internal registries.
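For concreteness, here is roughly what a few of those cross-repo references look like in actual source files. The repo paths, registry hostnames, and versions below are invented for illustration:

```
# Terraform: a module sourced from another repo
module "vpc" {
  source = "git::https://gitlab.example.com/platform/terraform-vpc.git?ref=v1.4.0"
}

# Dockerfile: a base image built by a pipeline in a different repo
FROM registry.example.com/platform/base-python:3.12

# GitLab CI: a shared template included from a central repo
include:
  - project: platform/ci-templates
    file: /templates/deploy.yml
```

Each of these lines is a real dependency edge between two repos, and none of them is recorded anywhere except in the file itself.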
These dependencies cross ecosystem boundaries. A Terraform module might reference a Docker image that’s built by a CI pipeline that uses a shared template. The graph doesn’t respect neat tool categories, and no tool today shows you the full picture across all of them.
The tribal knowledge problem
The part of this that resonates most with people — I’ve seen it in conversations, in Reddit threads, in every team I’ve worked with — isn’t the technical complexity. It’s the human cost.
Infrastructure dependency knowledge is tribal knowledge. It lives in the heads of whoever has been around the longest. There’s no shared, always-current view. There’s no way for a new engineer to sit down and understand the full dependency picture without weeks of archaeology.
When that senior engineer goes on holiday, decisions slow down. When they leave the company, that knowledge is gone permanently. The team doesn’t even know what they’ve lost until something breaks and nobody can explain why.
This is the real cost: not that the dependency graph is complex, but that the only copy of it exists in human memory, and human memory doesn’t scale, doesn’t transfer, and doesn’t survive attrition.
What people try — and where each approach falls short
I’ve seen teams try every reasonable approach. Each one solves part of the problem and fails at the rest.
Grep / ripgrep across repos
The first tool everyone reaches for. Clone the repos, run a search, assemble the picture manually.
It works, for a while. The results are a point-in-time snapshot with no history, no dashboard, and no way to handle transitive dependencies. At a hundred repos, it takes hours. It doesn’t catch renamed references or implicit dependencies. And the results are stale the moment you close the terminal.
Backstage or a service catalog
Backstage looks like the right answer on paper. It’s a developer portal with a service catalog. You define your dependencies in a catalog-info.yaml file per repo, and Backstage renders a graph.
The problem: it requires humans to declare and maintain those YAML files. In practice, the catalog goes stale within weeks. As one engineer put it to me, it’s “documentation with extra steps.” If the dependency is declared in a YAML file that nobody updates, it’s just a different kind of tribal knowledge.
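To make the failure mode concrete, here is a sketch of what such a descriptor looks like (entity names here are hypothetical; the `dependsOn` relations follow Backstage's catalog format):

```yaml
# catalog-info.yaml — every relation below is hand-declared and
# hand-maintained, which is exactly where the drift creeps in
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: payments-service
spec:
  type: service
  owner: team-payments
  dependsOn:
    - component:terraform-vpc-module   # stale the day someone swaps modules
    - resource:payments-db
```

Nothing checks that these declarations match what the code actually imports, sources, or pulls.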
Renovate or Dependabot
These tools are excellent at what they do: they automatically open pull requests when a dependency has a new version available. They solve the upgrade lag problem.
But they don’t solve the visibility problem. Renovate doesn’t tell you “who is consuming module X at which version right now.” It reacts after a new version is published. It can’t answer the question you need before you make a breaking change: “if I push v2.0 of this module, which 40 repos need to coordinate?”
HCP Terraform Explorer
HashiCorp’s cloud platform has a module explorer that shows workspace-to-module relationships. It’s the closest native solution.
But it only works for Terraform. It only works if your plan and apply runs through HCP Terraform. If your org has a mix of Terraform Cloud, self-hosted runners, Docker, CI templates, and Helm — and most do — you get a partial view with large blind spots.
“Just use a monorepo”
This is the most commonly suggested solution, and it’s the least helpful for the teams who actually have this problem.
Monorepos are a valid architecture. But suggesting a monorepo to an org with three hundred existing repos, multiple teams with separate access controls, compliance boundaries, and maybe some repos inherited through an acquisition — that’s like suggesting “just rewrite it in Rust.” Technically valid. Not actionable.
And a monorepo still doesn’t give you blast radius analysis. It just makes the grep slightly easier.
Building it yourself
This is the one that surprised me. Multiple teams I’ve talked to — and many more on Reddit — have independently built their own bespoke solutions. Nightly cron jobs that shallow-clone every repo, grep Dockerfiles and Terraform source blocks, dump the results to SQLite or a spreadsheet.
These solutions work. They prove the approach is sound. But they’re brittle, org-specific, hard to maintain, and they go stale between runs. Nobody has the bandwidth to turn their weekend hack into a real product. They want a product to exist. It just doesn’t.
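The shape of those DIY scanners is remarkably consistent. A minimal sketch, assuming the repos have already been shallow-cloned into one directory (the regexes and schema here are illustrative, not exhaustive — they miss registry-style module sources, multi-stage build aliases, and much else):

```python
# Sketch of the typical DIY dependency scanner: walk pre-cloned repo
# checkouts, regex out Terraform module sources and Dockerfile base
# images, and dump the edges to SQLite for ad-hoc querying.
import re
import sqlite3
from pathlib import Path

TF_SOURCE = re.compile(r'source\s*=\s*"([^"]+)"')
DOCKER_FROM = re.compile(r'^FROM\s+(\S+)', re.MULTILINE)

def scan_repo(repo_dir: Path) -> list[tuple[str, str, str]]:
    """Return (repo, kind, target) edges found in one repo checkout."""
    edges = []
    for tf in repo_dir.rglob("*.tf"):
        for src in TF_SOURCE.findall(tf.read_text(errors="ignore")):
            edges.append((repo_dir.name, "terraform-module", src))
    for df in repo_dir.rglob("Dockerfile*"):
        for img in DOCKER_FROM.findall(df.read_text(errors="ignore")):
            edges.append((repo_dir.name, "docker-image", img))
    return edges

def dump(edges: list[tuple[str, str, str]], db_path: str = "deps.sqlite") -> None:
    """Persist the edges so 'who uses X?' becomes a SQL query."""
    con = sqlite3.connect(db_path)
    con.execute("CREATE TABLE IF NOT EXISTS deps (repo TEXT, kind TEXT, target TEXT)")
    con.executemany("INSERT INTO deps VALUES (?, ?, ?)", edges)
    con.commit()
    con.close()
```

A nightly cron job wraps this in a clone loop and maybe a Slack report. It works, which is the point — and it is also exactly the kind of script that rots the first time the author changes teams.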
Why the problem is getting worse, not better
Two trends are making this more painful every year.
Tool diversity is increasing. Five years ago, a platform team might have only been dealing with Terraform. Today the same team manages Terraform modules, Docker images, CI pipelines, Helm charts, Kubernetes manifests, and maybe Ansible. Every new tool in the stack adds a new class of cross-repo dependency that no existing tool tracks.
AI-assisted development is accelerating the sprawl. Teams ship code faster than ever with AI coding tools. But the AI writes the module — it doesn’t know that three repos in another part of the org reference the interface it just changed. AI accelerates the creation of infrastructure code without any awareness of the existing dependency graph. That gap compounds every sprint.
The irony is that the tools keep getting better at everything except showing you how it all connects.
The pattern that works: auto-discovery from code
Across every DIY solution I’ve seen — and across the engineering community discussions I’ve been part of — the approach that works is always the same:
- Scan the org — enumerate every repo via the GitLab or GitHub API.
- Parse the actual files — Terraform source blocks, Dockerfile FROM statements, CI include directives, Helm chart dependencies, Python requirements, Go module files, npm manifests. Not metadata. Not YAML someone fills in. The real source files.
- Build a directed graph — repo A depends on artifact X, which is produced by repo B. Store this as a queryable graph.
- Keep it fresh — webhook-triggered or scheduled rescans so the graph is never stale.
- Make it queryable — “show me everyone who consumes this Docker base image” should be a one-click answer, not a two-day grep expedition.
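The graph-and-query steps are simpler than they sound. A minimal sketch, with invented repo and artifact names, of a reverse-edge index and a transitive blast radius query (the one wrinkle is that a repo's own published artifacts have to propagate the breakage onward):

```python
# Reverse dependency graph plus transitive "blast radius" query.
# edges: (consumer repo, artifact it depends on)
# produces: repo -> artifacts that repo publishes
from collections import defaultdict, deque

def build_reverse_graph(edges):
    """Map each artifact to the set of repos that directly consume it."""
    consumers = defaultdict(set)
    for consumer, artifact in edges:
        consumers[artifact].add(consumer)
    return consumers

def blast_radius(consumers, artifact, produces):
    """Every repo transitively affected by a change to `artifact`.
    Breadth-first: a broken consumer breaks whatever consumes
    the artifacts that consumer itself publishes."""
    affected, queue = set(), deque([artifact])
    while queue:
        current = queue.popleft()
        for repo in consumers.get(current, ()):
            if repo not in affected:
                affected.add(repo)
                queue.extend(produces.get(repo, ()))
    return affected
```

So if repo-a consumes the VPC module and publishes a base image that repo-b builds on, changing the VPC module flags both repos — the answer grep can only give you one hop of.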
The fact that independent practitioners keep arriving at this same architecture tells you something important: the approach is sound. What’s missing is a product that does it well, keeps it current, and works across every ecosystem a team actually uses.
What I’m building
This is the problem that led me to build Riftmap. It scans a GitLab or GitHub organisation, automatically discovers cross-repo dependencies across Terraform, Docker, CI templates, Python, Go, npm, Ansible, Helm, Kubernetes, Kustomize (and more coming) — and builds a queryable dependency graph with visual blast radius analysis.
No per-repo configuration. No YAML to maintain. One read-only token at the org level, and it discovers everything from the actual code.
It’s currently in early access. If this problem sounds familiar, I’d genuinely like to hear how it shows up at your org — whether or not Riftmap is the right solution for you. The more I understand how different teams experience this, the better the tooling gets for everyone.
You can see more at riftmap.dev, or reach me directly at [email protected].