
Tracing a Legacy System with Broken History — Designing Code Sonar

A design note on building an internal DeepWiki-like system for company legacy environments where public GitHub repo-oriented tools could not be applied as-is.

Tags: AI Agent, Legacy Analysis, MCP, Filesystem MCP, GitHub MCP, Confluence MCP, MSSQL, Kafka, Mermaid, DeepWiki

Note: This post reconstructs the design concepts from a real project. No internal code or data is included.

The Problem: Why Legacy Onboarding Takes So Long

When I first joined the team, understanding the system's history was the biggest wall I hit.

  • The code exists: years of accumulated production code, but no comments explaining why it was written this way.
  • The docs are stale: there is a Confluence space, but it reflects the system as it was 3 years ago.
  • The engineers are gone: the person who originally designed this system left long ago.
  • The history is broken: Jira tickets exist but aren't connected to code, and PR descriptions are sparse.

The result: a new engineer needs a long ramp-up period before they can work independently on the system, and throughout that time they rely heavily on explanations and reviews from veteran engineers.

This isn't simply a "no documentation" problem. The code, docs, history, and business context are all fragmented across different systems — and connecting them into a coherent picture is nearly impossible manually.


The Goal Was an Internal DeepWiki

DeepWiki analyzes code repositories and auto-generates documentation. It is a well-built tool. The problem I needed to solve, however, lived inside a company environment where private repositories, legacy databases, and Jira/Wiki history were all part of the same context. A public GitHub repo-oriented tool could not be applied as-is, so the real target was closer to a custom internal DeepWiki.

The idea was to keep the DeepWiki-like goal of reading a codebase and producing documentation, but adapt it to company constraints by adding GitHub, MSSQL, and Jira/Wiki MCP sources with cross-validation.

The limits of code-only analysis:

What code analysis can tell you:
- This API accepts these parameters
- This function calls these classes
- The DB table has this schema

What code analysis cannot tell you:
- What data gets published to this Kafka topic and why
- Why this SP has this specific branching logic
- Why this code was changed 3 years ago
- What business policy this exception handler reflects

Understanding business logic requires context beyond the code. That context is scattered across Jira issues, Confluence policy docs, GitHub PR descriptions, and DB stored procedures.


Design Principle: Cross-Validation with Traceable Evidence

Code Sonar has one core design principle:

Every claim must have a traceable source.

If an AI writes "this API handles order processing," it must be clear whether that claim comes from code, a Confluence doc, a Jira issue, or AI inference. Without this distinction, an AI will confidently produce documentation that sounds authoritative but is wrong.

So evidence is classified into four types:

Type         | Source                          | Trust Level
code/config  | Actual code & config files      | Highest (implementation fact)
wiki         | Confluence design & policy docs | High (design intent)
github       | PRs, commits, issues            | Medium (change context)
inferred     | AI reasoning                    | Low (must not use definitive language)
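
As a minimal sketch of how the four evidence types could be represented and checked, the snippet below models a claim and audits it. The `Claim` class, the `audit` function, and the hedge word list are all hypothetical names invented for illustration, not Code Sonar's actual code.

```python
import re
from dataclasses import dataclass
from enum import IntEnum


class EvidenceType(IntEnum):
    """Evidence types from the table above; a higher value means higher trust."""
    INFERRED = 1  # AI reasoning
    GITHUB = 2    # PRs, commits, issues
    WIKI = 3      # Confluence design and policy docs
    CODE = 4      # actual code and config files


@dataclass(frozen=True)
class Claim:
    text: str
    evidence: EvidenceType
    source: str  # e.g. a file path or page id; empty for pure inference


# Hypothetical hedge words; a real auditor would need a richer check.
HEDGES = {"likely", "appears", "probably", "may", "might", "seems"}


def audit(claim: Claim) -> list[str]:
    """Return the problems found in a single claim (empty list means it passes)."""
    problems = []
    if claim.evidence is not EvidenceType.INFERRED and not claim.source:
        problems.append("missing source")
    if claim.evidence is EvidenceType.INFERRED:
        words = set(re.findall(r"[a-z]+", claim.text.lower()))
        if not words & HEDGES:
            problems.append("inferred claim uses definitive language")
    return problems
```

Using an ordered enum makes "prefer the higher-trust source when two disagree" a simple comparison, which is one plausible way to express the trust column.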

The Stepwise MCP Architecture

The most important design choice was simple: do not start by querying the database.

Legacy databases have many tables and stored procedures, and broad discovery queries can be risky in an operational environment. Code Sonar first narrows candidates from code and history, obtains specific table and SP names, and only then queries MSSQL.

  • 🔎 Filesystem MCP: Local Search (API · Consumer · SQL Call Sites)
  • 🐙 GitHub MCP: PR · Commit · CODEOWNERS (Change Context)
  • 🎯 Target Selection: Table · SP Candidates (Full-scan Prevention)
  • 🗄️ MSSQL MCP: Targeted Query (SP Definition · Dependencies)
  • 📋 Jira & Confluence MCP: Policy · Issue · Ops Docs (Business Background)
  • 🤖 Code Sonar Agent: Evidence Cross-Validation, Document Generation

Filesystem MCP: Local Search as the Starting Point

The first step is local workspace search. Code Sonar looks for API paths, Kafka topic names, consumer classes, repository calls, SQL mappers, and configuration keys.

This step is not meant to produce the final answer. It narrows the search scope. For example, if RankingConsumer reveals a topic name and an SP call, those names become the next search keys.
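
The narrowing step can be sketched with two regular expressions. The patterns, the `extract_search_keys` helper, and the Java sample are assumptions for illustration; real call sites vary by framework and coding style, so a production scanner would need more patterns than these.

```python
import re

# Hypothetical patterns for one common annotation style and one SQL call style.
TOPIC_RE = re.compile(r'topics\s*=\s*"([A-Za-z0-9_.\-]+)"')
SP_RE = re.compile(
    r"\b(?:EXEC|EXECUTE|CALL)\s+(?:dbo\.)?([A-Za-z_][A-Za-z0-9_]*)",
    re.IGNORECASE,
)


def extract_search_keys(source: str) -> dict[str, list[str]]:
    """Collect Kafka topic names and SP names found in one source file."""
    return {
        "topics": sorted(set(TOPIC_RE.findall(source))),
        "stored_procedures": sorted(set(SP_RE.findall(source))),
    }


# Illustrative consumer code, not taken from any real repository.
SAMPLE = '''
@KafkaListener(topics = "RANKING_UPDATE")
public void onMessage(String payload) {
    jdbc.update("EXEC dbo.CalculateAdBid ?, ?", adId, slotId);
}
'''

print(extract_search_keys(SAMPLE))
```

The returned names are only candidates: they become search keys for the GitHub and MSSQL steps rather than facts in the final document.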

GitHub MCP: Context of Change

PR titles, bodies, review comments, commit messages, CODEOWNERS. This is the most important source for tracing when and why code changed.

For teams that write good PR descriptions, the business context of a code change is most clearly captured here. Conversely, sparse PR descriptions significantly reduce this source's value.
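
One cheap way to recover some change context, sketched below, is to pull Jira-style issue keys out of PR titles and bodies. The `link_pr_to_issues` helper and the sample keys are hypothetical, and the pattern will also match look-alikes such as "UTF-8", so the results need filtering against real project keys.

```python
import re

# Jira issue keys look like PROJECT-123: uppercase project key, hyphen, number.
JIRA_KEY_RE = re.compile(r"\b([A-Z][A-Z0-9]+-\d+)\b")


def link_pr_to_issues(title: str, body: str) -> list[str]:
    """Return unique issue keys mentioned in a PR, in first-seen order."""
    seen: dict[str, None] = {}
    for key in JIRA_KEY_RE.findall(f"{title}\n{body}"):
        seen.setdefault(key, None)
    return list(seen)
```

A PR that yields no keys and has a near-empty body is a signal that this source will contribute little for that change.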

Jira & Confluence MCP: Business Context

Planning docs, policy documents, QA issues, incident reports. This is what explains why a given logic is necessary.

A 3-year-old Confluence doc may no longer match the current code. So Confluence is only used as evidence for design intent — implementation facts must always be verified against code.

MSSQL MCP: Business Logic Inside SPs

This is the most distinctive aspect of this project.

In legacy systems, core business logic is often locked inside Stored Procedures. You can see the code calling an SP, but you can't know what happens inside it without direct DB access.

MSSQL MCP is closer to the final verification step. It reads SP definitions only for the table and SP names identified by Filesystem MCP and GitHub MCP, then analyzes branching logic and data transformation rules.

That sequence matters for two reasons:

  • It avoids unnecessary table exploration or broad full-scan style access.
  • It keeps every DB finding connected to a concrete code path.

-- If there's an SP like this:
CREATE PROCEDURE CalculateAdBid
    @adId INT, @slotId INT
AS BEGIN
    -- Core business logic lives in here
    -- Invisible from application code alone
    RETURN;  -- placeholder: T-SQL rejects an empty BEGIN...END block
END
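
A targeted lookup can be sketched as a query builder that refuses to run without concrete SP names. The `build_sp_lookup` function and the `MAX_TARGETS` cap are hypothetical; the query reads the standard SQL Server catalog views `sys.sql_modules` and `sys.objects`, and the `?` placeholders assume an ODBC-style driver.

```python
import re

IDENTIFIER_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")
MAX_TARGETS = 20  # hypothetical cap that keeps lookups narrow


def build_sp_lookup(sp_names: list[str]) -> tuple[str, list[str]]:
    """Build a parameterized query for the definitions of specific SPs only."""
    names = sorted(set(sp_names))
    if not names:
        raise ValueError("no SP candidates: narrow the scope from code first")
    if len(names) > MAX_TARGETS:
        raise ValueError("too many SP candidates for a targeted lookup")
    for name in names:
        if not IDENTIFIER_RE.match(name):
            raise ValueError(f"unexpected SP name: {name!r}")
    placeholders = ", ".join("?" for _ in names)
    sql = (
        "SELECT o.name, m.definition "
        "FROM sys.sql_modules AS m "
        "JOIN sys.objects AS o ON o.object_id = m.object_id "
        f"WHERE o.type = 'P' AND o.name IN ({placeholders})"
    )
    return sql, names
```

Raising on an empty list is the point of the design: the database step cannot be the first step, because it has nothing to ask about until the code and history steps have produced names.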

atls: Filling the MCP Ecosystem Gap

Jira & Confluence MCP works well for reads (search, retrieval), but falls short for features like page creation/update and recursive Wiki scraping.

So I built a Python CLI package: atls — a wrapper around the Atlassian REST API.

# Recursive Wiki collection
atls wiki fetch https://wiki.example.com/spaces/COM/pages/1000 --recursive --max-depth 3

# Auto-publish analysis results to Confluence
atls wiki create "System Index" --markdown-file "Index.md" --space "~user" --parent 1000
atls wiki update 1001 "commerce-api - Data Flow" --markdown-file "Data Flow.md"
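
The depth-limited recursion behind a flag like `--max-depth` can be sketched as a breadth-first walk. This is not the atls source: `collect_pages` and the injected `fetch_children` callable are hypothetical, and whether atls counts depth in exactly this way is an assumption.

```python
from collections import deque
from typing import Callable


def collect_pages(
    root_id: str,
    fetch_children: Callable[[str], list[str]],
    max_depth: int,
) -> list[str]:
    """Breadth-first walk of a page tree, stopping max_depth levels below the root."""
    visited = [root_id]
    seen = {root_id}
    queue = deque([(root_id, 0)])
    while queue:
        page_id, depth = queue.popleft()
        if depth >= max_depth:
            continue  # do not expand pages that sit at the depth limit
        for child in fetch_children(page_id):
            if child not in seen:  # guards against duplicate or cyclic links
                seen.add(child)
                visited.append(child)
                queue.append((child, depth + 1))
    return visited
```

Passing the child lookup in as a function keeps the traversal testable without network access; in practice that callable would wrap a REST request for a page's children.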

Designing 17 Specialized Agents

Assigning full analysis to a single Agent creates two problems:

  1. Context window overflow
  2. Blurred responsibility → difficult quality validation

The solution: a chain of 17 specialized Agents with clearly separated roles.

  • 📥 Evidence Collection: filesystem-source-scanner, wiki-source-scanner, github-source-scanner
  • 🔍 Deep Analysis: bridge-analyzer, db-schema-analyst, integration-flow-analyst, entity-lifecycle-analyst
  • 📝 Document Writing: deep-analysis-writer, business-workflow-analyst, cross-repo-tracer, env-matrix-analyst
  • ✅ Validation: evidence-auditor, qa-reviewer, analyst-backend
  • 📤 Publishing: wiki-publisher, atlassian-adapter

Key agents:

  • bridge-analyzer: The most critical agent. Traces the complete business flow from client to Kafka topic, consumer, and SP call.
  • db-schema-analyst: Analyzes DB schema and SP logic around the table and SP candidates identified in earlier steps.
  • evidence-auditor: Audits every document claim for traceable sources. Separates inference (inferred) from fact (code/config).
  • qa-reviewer: Final check on Mermaid syntax, evidence quality, and Confluence publishing policy compliance.
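
The chain idea can be sketched as stages that run in a fixed order, each agent receiving and returning a small shared context. The `run_chain` function, the stage names, and the two toy agents are hypothetical stand-ins; real agents would call an LLM and MCP tools rather than edit a dictionary.

```python
from typing import Callable

Context = dict[str, list[str]]
Agent = Callable[[Context], Context]

# Stage order mirrors the five groups listed above.
STAGES = ["collect", "analyze", "write", "validate", "publish"]


def run_chain(agents: dict[str, list[Agent]], context: Context) -> Context:
    """Run every agent stage by stage, threading the context through."""
    for stage in STAGES:
        for agent in agents.get(stage, []):
            context = agent(context)
    return context


def filesystem_source_scanner(ctx: Context) -> Context:
    # Toy stand-in: records one piece of code evidence.
    return {**ctx, "evidence": ctx.get("evidence", []) + ["code/RankingConsumer.java"]}


def evidence_auditor(ctx: Context) -> Context:
    # Toy stand-in: blocks the chain when nothing was collected.
    if not ctx.get("evidence"):
        raise RuntimeError("no evidence collected; refusing to validate")
    return {**ctx, "audited": ["ok"]}
```

Because the stage order lives in one list, a validation agent can never run before collection, and a failure points at a single agent with a single responsibility.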

The Mermaid Diagram Strategy

What happens if you ask an AI to "draw the data flow of this system in Mermaid"?

Almost always: spaghetti. Too many nodes, crossed arrows, parse errors.

The solution is a 2-stage generation strategy.

Stage 1: Pre-refine relationships

First ask the AI:
"What is the relationship between the Order API and the Payment Consumer?
Does the Order API emit an event to the Payment Consumer, or is the flow reversed?
Under what conditions does this flow occur?"

Stage 2: Generate diagram from refined relationships

Generate the diagram based on the refined relationship list.
Explicitly specify Mermaid syntax constraints:
- Quote API paths and URLs in node labels
- No fanout shorthand syntax
- Use <br/> for line breaks in labels

Separating these two stages alone produced a dramatic improvement in diagram quality.
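
Stage 2 can also be done deterministically once the relationship list exists. The sketch below renders (source, label, target) triples as a Mermaid flowchart under the constraints listed above; `to_mermaid` and the sample relation are hypothetical, and this is one possible renderer rather than the one Code Sonar uses.

```python
def to_mermaid(relations: list[tuple[str, str, str]]) -> str:
    """Render (source, label, target) triples as a Mermaid flowchart."""
    ids: dict[str, str] = {}

    def node_id(name: str) -> str:
        # Stable short ids keep paths and URLs out of the id position.
        return ids.setdefault(name, f"n{len(ids)}")

    def quote(text: str) -> str:
        # Escape double quotes with Mermaid's entity code; use <br/> for breaks.
        return text.replace('"', "#quot;").replace("\n", "<br/>")

    edges = []
    for source, label, target in relations:
        s, t = node_id(source), node_id(target)
        # One edge per line: no fanout shorthand.
        edges.append(f'    {s} -->|"{quote(label)}"| {t}')

    lines = ["flowchart LR"]
    lines += [f'    {nid}["{quote(name)}"]' for name, nid in ids.items()]
    lines += edges
    return "\n".join(lines)


print(to_mermaid([("POST /orders", "emits OrderCreated", "Payment Consumer")]))
```

With this split the model only has to get the relationships right, and the syntax rules are enforced by code instead of by prompt.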


Current State and Honest Limitations

What Code Sonar does well today:

  • ✅ Automated business flow documentation (client → Kafka → consumer → SP)
  • ✅ Data flow Mermaid diagram generation
  • ✅ Evidence Ledger (claim-to-source mapping)
  • ✅ Automated Confluence publishing

Where it's still weak:

  • ⚠️ Method/function-level code-granularity analysis
  • ⚠️ Full semantic interpretation of complex SP branching logic

The Ideal End State: A 5-Level Design

With the current implementation as Level 1, the ideal complete system looks like this.

Level 2 — Code-Level Precision

  • AST (Abstract Syntax Tree) based Call Graph generation
  • Method-level business rule tracing
  • ETL pattern reverse engineering
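
To make "AST-based call graph" concrete, here is a minimal sketch using Python's standard `ast` module on Python source. It only illustrates the technique: the legacy code in question may be in another language and would need that language's parser, and `call_graph` and the sample functions are hypothetical.

```python
import ast


def call_graph(source: str) -> dict[str, set[str]]:
    """Map each top-level function name to the plain names it calls."""
    graph: dict[str, set[str]] = {}
    for func in ast.parse(source).body:
        if not isinstance(func, ast.FunctionDef):
            continue
        callees = graph.setdefault(func.name, set())
        for node in ast.walk(func):
            # Only direct calls by name; method calls like obj.f() are skipped.
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                callees.add(node.func.id)
    return graph


SAMPLE = """
def refresh_ranking(ad_id):
    bid = calculate_bid(ad_id)
    save_snapshot(bid)

def calculate_bid(ad_id):
    return ad_id * 2

def save_snapshot(bid):
    print(bid)
"""

print(call_graph(SAMPLE))
```

Even this small version shows why method-level tracing is a separate level: resolving method calls, overloads, and injected dependencies takes type information that a plain syntax walk does not have.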

Level 3 — Bidirectional Traceability

"Which Jira issue changed which code, which Kafka topic did that change affect, which consumer handles it, which DB tables does it touch, and which SP does it invoke?"

If this is fully automated, impact analysis on a legacy system becomes a matter of minutes.
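
Once the links exist, the question above is a graph traversal in either direction. The sketch below assumes a hypothetical edge map with made-up node ids such as `jira:AD-101`; building that map from real evidence is the hard part and is not shown.

```python
from collections import deque


def downstream(edges: dict[str, list[str]], start: str) -> list[str]:
    """Everything reachable from start, in breadth-first order."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


def invert(edges: dict[str, list[str]]) -> dict[str, list[str]]:
    """Reverse every edge so the same walk answers 'what leads here?'."""
    inverted: dict[str, list[str]] = {}
    for source, targets in edges.items():
        for target in targets:
            inverted.setdefault(target, []).append(source)
    return inverted


def upstream(edges: dict[str, list[str]], target: str) -> list[str]:
    return downstream(invert(edges), target)


# Hypothetical links between an issue, code, a topic, and an SP.
EDGES = {
    "jira:AD-101": ["code:BidService.java"],
    "topic:RANKING_UPDATE": ["code:RankingConsumer.java"],
    "code:RankingConsumer.java": ["code:BidService.java"],
    "code:BidService.java": ["sp:CalculateAdBid"],
}
```

`downstream(EDGES, "jira:AD-101")` answers "what did this issue touch", and `upstream(EDGES, "sp:CalculateAdBid")` answers "what can reach this SP", which is the bidirectional part.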

Level 4 — Documentation Drift Detection

On every PR merge, automatically compare changed code against existing documentation. If a mismatch is detected, flag the doc as "stale" and trigger an update notification.
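
A first approximation of drift detection, sketched below, flags any document whose cited code sources overlap the files changed in a PR. The `stale_docs` function and the sample paths are hypothetical, and file overlap is only a proxy: it says a document might be stale, not that it actually disagrees with the code.

```python
def stale_docs(changed_files: set[str], doc_sources: dict[str, set[str]]) -> list[str]:
    """Return docs whose cited code sources overlap the files changed in a PR."""
    return sorted(
        doc for doc, sources in doc_sources.items() if sources & changed_files
    )


# Hypothetical ledger: document title -> code files it cites as evidence.
LEDGER = {
    "commerce-api - Data Flow": {"BidService.java", "RankingConsumer.java"},
    "System Index": {"README.md"},
}

print(stale_docs({"BidService.java"}, LEDGER))
```

This only works if every claim already carries a source, which is another reason the evidence mapping comes first.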

Level 5 — Interactive Query Interface

User: "Show me the full business flow when CPC ad ranking refreshes"

Code Sonar: [Generates sequence diagram with Evidence citations]
- RankingConsumer subscribes to RANKING_UPDATE topic
  [source: code/RankingConsumer.java:L45]
- Calls CalculateAdBid SP to compute bid price
  [source: code/BidService.java:L112 + SP definition]
- Saves ranking snapshot to MongoDB
  [source: code/RankingRepository.java:L67]
- Publishes AD_RANK_REFRESHED event on completion
  [source: wiki/Confluence#1234]

Natural language queries with Evidence-cited answers. That's the final target.
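
For answers like the one above to be checkable, the citations need to be machine-readable. The sketch below parses the `[source: ...]` markers from the example; the regular expression, the `Citation` tuple, and `parse_citations` are hypothetical, and any text after the first reference (such as "+ SP definition") is ignored.

```python
import re
from typing import NamedTuple, Optional

CITATION_RE = re.compile(
    r"\[source:\s*(code|wiki|github|inferred)/([^\]\s:]+)(?::L(\d+))?"
)


class Citation(NamedTuple):
    kind: str             # evidence type prefix, e.g. "code" or "wiki"
    ref: str              # file name or page reference
    line: Optional[int]   # line number when one is given


def parse_citations(answer: str) -> list[Citation]:
    """Extract every [source: ...] marker from a generated answer."""
    return [
        Citation(kind, ref, int(line) if line else None)
        for kind, ref, line in CITATION_RE.findall(answer)
    ]


ANSWER = """
- RankingConsumer subscribes to RANKING_UPDATE topic
  [source: code/RankingConsumer.java:L45]
- Calls CalculateAdBid SP to compute bid price
  [source: code/BidService.java:L112 + SP definition]
- Publishes AD_RANK_REFRESHED event on completion
  [source: wiki/Confluence#1234]
"""

print(parse_citations(ANSWER))
```

A validator could then reject any bullet with no parsed citation, or confirm that a cited file and line still exist before the answer is shown.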


Reflections

The biggest lesson from building Code Sonar: legacy system problems aren't about missing code — they're about missing context.

The code exists. What's missing is why it exists, what business decision created it, and what incident that 3-year-old change was designed to prevent.

AI can't fully reconstruct that lost context. But it can follow scattered clues — local code, GitHub PRs, Jira issues, Confluence docs, and DB SPs — in a controlled order and produce a more reliable explanation.

Attaching sources to every claim. Separating inference from verified fact. That's the core of Code Sonar.