The third path: why YAML catalogues rot and runtime tracing misses the point.
Two well-trodden roads. Both fail at scale, for opposite reasons. The way out isn't picking one — it's noticing the architecture is already in your code.
How we built cross-repo edge resolution with sandboxed WASM plugins, why we abandoned static type-flow analysis halfway through, and the trick that finally made routing-key resolution deterministic.
When we tell people ArchMellon detects communication edges from source code alone, the most common response is a polite, slightly skeptical nod. The skepticism is fair. Of course you can find http.Post calls. The hard part isn't the calls — it's resolving them: which repository owns the handler, which routing key they actually publish to, whether the consumer over in payment-svc still listens to that key after last quarter's refactor.
This post walks through how we solved that problem. It is not a victory lap. We rewrote the resolver three times, abandoned a perfectly nice type-flow analysis halfway through, and only got it deterministic after a lunch conversation with someone who'd built a similar thing for Erlang in 2015. None of this was on the original Linear ticket.
Take the most common pattern we see: a producer calls mesh.SendCommand("process_payment", ...) in order-svc, and somewhere across 47 repositories there's a consumer wired to that string. From the call site alone, you have a literal — a string — and a function name. You don't have:

- the repository that owns the consumer,
- the routing key the call actually resolves to at runtime, or
- any guarantee the consumer still listens to that key after a refactor.
Runtime tracing solves this beautifully — until you realize it can't see cold paths. If process_payment only fires on a Tuesday, and you sample on a Wednesday, the edge is invisible. We needed a static answer.
The first instinct of any compiler person is type-flow. Trace the literal "process_payment" through the AST, follow it into SendCommand's parameters, and reach the AMQP client's internal routing table. It's a beautiful problem on a whiteboard.
It's a nightmare in practice. Half the routing keys are constructed via fmt.Sprintf. Another quarter come from environment variables. The remaining quarter come from a constants file that's auto-generated from a YAML schema in a sibling repo. Type-flow analysis can chase the first half and gets nothing useful for the rest.
We spent four weeks on this. The tipping point was a Slack thread where Dmitri pointed out that we had the wrong abstraction. We weren't trying to analyze the program. We were trying to describe it — at the level of conventions teams already use. That meant the right primitive wasn't a static analyzer. It was a plugin.
The shape we landed on: ArchMellon parses the AST and offers it to a sandboxed WebAssembly plugin via a small set of host functions. The plugin gets to visit nodes — function calls, struct literals, constant declarations — and emit edges. We ship system plugins for AMQP, gRPC, Kafka, NATS, and HTTP. Customers can write their own for in-house conventions.
The contract is intentionally narrow:
```rust
pub trait CommDetector {
    // Called for every function-like node in the AST.
    fn visit_call(&mut self, ctx: &CallContext) -> Vec<Edge>;

    // Called once per repository — for cross-call state.
    fn visit_module(&mut self, ctx: &ModuleContext) -> Vec<Edge>;

    // Confidence — used by the resolver downstream.
    fn confidence(&self, edge: &Edge) -> f32;
}
```
That's it. Three methods. The plugin author writes whatever pattern-matching they want against the AST cursor, emits edges with a confidence score, and we handle the rest — sandboxing, signing, indexing, cross-repo resolution.
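To make the contract concrete, here's a hedged sketch of what a customer-written detector might look like. The types below are stand-ins we wrote for illustration (the real PDK's CallContext and Edge are richer, and visit_module is omitted for brevity); the in-house bus.Emit convention is hypothetical.

```rust
// Stand-in types, just enough to show the shape of a custom detector.
// These are illustrative, not the real comm-detector-pdk types.
#[derive(Debug, Clone, PartialEq)]
struct Edge {
    from: String,
    to: String,
    protocol: String,
    routing_key: String,
    confidence: f32,
}

struct CallContext {
    method: Option<String>,          // name of the called method, if any
    first_arg_literal: Option<String>, // first argument, if it was a string literal
    repository: String,              // repo the call site lives in
}

// Simplified contract: visit_module is dropped for this sketch.
trait CommDetector {
    fn visit_call(&mut self, ctx: &CallContext) -> Vec<Edge>;
    fn confidence(&self, edge: &Edge) -> f32;
}

/// A toy detector for a hypothetical in-house convention:
/// `bus.Emit("<routing-key>", ...)` publishes to an internal bus.
struct InHouseBusDetector;

impl CommDetector for InHouseBusDetector {
    fn visit_call(&mut self, ctx: &CallContext) -> Vec<Edge> {
        if ctx.method.as_deref() != Some("Emit") {
            return vec![]; // not our convention, emit nothing
        }
        let Some(key) = ctx.first_arg_literal.clone() else {
            return vec![]; // nothing deterministic to anchor an edge on
        };
        vec![Edge {
            from: ctx.repository.clone(),
            to: "<unresolved>".into(), // the cross-repo resolver fills this in
            protocol: "in-house-bus".into(),
            routing_key: key,
            confidence: 0.9, // literal-resolved, so high confidence
        }]
    }

    fn confidence(&self, edge: &Edge) -> f32 {
        edge.confidence
    }
}

fn main() {
    let mut det = InHouseBusDetector;
    let ctx = CallContext {
        method: Some("Emit".into()),
        first_arg_literal: Some("invoice.settled".into()),
        repository: "billing-svc".into(),
    };
    println!("{:?}", det.visit_call(&ctx));
}
```

The point of the narrow surface: a detector like this is pure pattern-matching over call sites, so it has no way to escape the sandbox or touch the network.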
Here's roughly what our system AMQP plugin does when it visits a call node:
```rust
fn visit_call(&mut self, ctx: &CallContext) -> Vec<Edge> {
    let Some(method) = ctx.method_name() else {
        return vec![];
    };
    if !AMQP_PUBLISH_METHODS.contains(method) {
        return vec![];
    }

    let routing_key = ctx.first_arg().resolve_string()
        .unwrap_or_else(|| ctx.heuristic_key());

    let target = self.topology
        .lookup(&routing_key)
        .unwrap_or("<unresolved>");

    vec![Edge {
        from: ctx.repository().to_owned(),
        to: target.into(),
        protocol: "AMQP".into(),
        routing_key,
        confidence: if ctx.first_arg().is_string_literal() {
            0.97
        } else {
            0.65
        },
    }]
}
```
Two things to notice. First, the confidence score isn't a vibe — it's a function of how deterministically we resolved the routing key. Literals get 0.97. Heuristics get 0.65. Below 0.4, we surface the edge as "suspected" in the UI rather than asserting it. Second, the to field can come back as "<unresolved>": the topology lookup is where cross-repo resolution happens, and that turned out to be the genuinely hard part.
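The confidence rule can be written down as a tiny mapping. This is our own sketch (the enum and function names are illustrative, and the 0.2 value for unresolved keys is an example below the 0.4 threshold; only 0.97, 0.65, and the 0.4 cutoff come from the text above):

```rust
/// How a routing key was recovered at the call site.
/// Hypothetical enum for illustration.
#[derive(Debug, Clone, Copy)]
enum Resolution {
    StringLiteral, // resolved directly from a literal argument
    Heuristic,     // recovered by pattern heuristics (Sprintf, env vars)
    Unresolved,    // nothing deterministic to anchor on
}

/// Confidence is a function of the resolution method, not a vibe.
fn confidence(res: Resolution) -> f32 {
    match res {
        Resolution::StringLiteral => 0.97,
        Resolution::Heuristic => 0.65,
        Resolution::Unresolved => 0.2, // example value; anything below 0.4
    }
}

/// Below 0.4 the edge is surfaced as "suspected" rather than asserted.
fn ui_status(conf: f32) -> &'static str {
    if conf < 0.4 { "suspected" } else { "asserted" }
}

fn main() {
    for res in [
        Resolution::StringLiteral,
        Resolution::Heuristic,
        Resolution::Unresolved,
    ] {
        let c = confidence(res);
        println!("{res:?}: {c} -> {}", ui_status(c));
    }
}
```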
The naive approach is to scan every repository for AMQP consumer registrations, build a global table from routing key to service, and look up producers against it. This works fine on toy datasets. On 47 real repositories with overlapping naming conventions, it falls apart. Three different teams had the routing key "order.created" — one was a domain event, one was a metric, one was a Kafka topic name accidentally reused.
The fix: don't resolve at lookup time. Resolve at index time, but keep the ambiguity as a graph property. We promote the routing key from a string to a scoped identifier — namespace + key + protocol — and let the resolver fan out.
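A minimal sketch of the scoped-identifier idea, using our own names rather than ArchMellon's actual schema: the index keys on namespace + key + protocol instead of the bare string, so three teams reusing "order.created" produce three distinct entries, and the resolver returns every candidate instead of silently picking one.

```rust
use std::collections::HashMap;

/// A routing key promoted from a string to a scoped identifier.
/// Illustrative type, not the real ArchMellon index schema.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct ScopedKey {
    namespace: String, // owning team or domain
    key: String,       // the raw routing key
    protocol: String,  // "AMQP", "Kafka", ...
}

#[derive(Default)]
struct Topology {
    index: HashMap<ScopedKey, String>, // scoped key -> owning service
}

impl Topology {
    /// Index-time registration of a consumer under its full scope.
    fn register(&mut self, ns: &str, key: &str, proto: &str, service: &str) {
        self.index.insert(
            ScopedKey {
                namespace: ns.into(),
                key: key.into(),
                protocol: proto.into(),
            },
            service.into(),
        );
    }

    /// Lookup-time resolution fans out across namespaces and returns
    /// every candidate: ambiguity is kept as data, not collapsed.
    fn resolve(&self, key: &str, proto: &str) -> Vec<&str> {
        self.index
            .iter()
            .filter(|(k, _)| k.key == key && k.protocol == proto)
            .map(|(_, svc)| svc.as_str())
            .collect()
    }
}

fn main() {
    let mut topo = Topology::default();
    // The "order.created" collision from the text: same string, three scopes.
    topo.register("orders", "order.created", "AMQP", "order-svc");
    topo.register("metrics", "order.created", "AMQP", "metrics-svc");
    topo.register("analytics", "order.created", "Kafka", "analytics-svc");

    // An AMQP producer fans out to two candidates; the Kafka reuse is excluded.
    println!("{:?}", topo.resolve("order.created", "AMQP"));
}
```

Scoping by protocol alone already removes the accidental Kafka reuse; the namespace handles the remaining collisions, and anything still ambiguous is surfaced to the user rather than guessed.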
Three takeaways for anyone building static comm detection:

- Don't analyze the program; describe it, at the level of conventions teams already use. Type-flow analysis chases literals and little else.
- Make confidence a function of how the routing key was resolved, and surface low-confidence edges as suspected instead of asserting them.
- Resolve at index time and keep ambiguity as a graph property: a bare routing key isn't an identifier until it's scoped by namespace and protocol.

The plugin SDK ships as comm-detector-pdk and works in any language with extism-pdk support.

We're working on extending the plugin contract to support stateful detectors (patterns that need to maintain state across calls), plus Java and C# language support.
If you've built something similar and want to compare notes — or have a messaging convention we don't handle yet — drop us a line. We're listening.
Beta is open. Five-minute setup. Source never leaves your network.