Inside the Reconciliation Agent: when to escalate vs. when to auto-fix
The Reconciliation Agent is the painkiller for the class of migration bugs that destroyed TSB UK, SAAQ Quebec, and Revlon SAP. Here's how it decides whether to fix an anomaly or escalate it to a human.
Of the five agents in Migratio's fleet, the Reconciliation Agent is the one we obsessed over the most. It's the layer that catches the bug class that destroys cutovers. It's also the layer that, if it makes a bad call, can be the bug that destroys a cutover.
The agent's job: take a record from the source, take its mapped record(s) in the target, and produce one of four verdicts.
- Match: source and target agree exactly. No action.
- Reconcilable: source and target differ in a known, recoverable way. The agent proposes a fix.
- Anomaly (human review): the agent has low confidence. Escalate to Exception Triage.
- Hard failure: the agent has identified an integrity violation that requires a stop-the-line decision. Escalate to the cutover war-room channel.
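As a rough sketch, the four verdicts and where each one goes next could be modeled as an enum plus a lookup table. The destination labels (`no_action`, `propose_fix`, `exception_triage`, `war_room`) are hypothetical names chosen for illustration, not identifiers from the platform.

```python
from enum import Enum


class Verdict(Enum):
    """The four possible outcomes of comparing a source record to its target."""
    MATCH = "match"
    RECONCILABLE = "reconcilable"
    ANOMALY = "anomaly"
    HARD_FAILURE = "hard_failure"


# Hypothetical destination labels; the real channel names are not given in the text.
NEXT_STEP = {
    Verdict.MATCH: "no_action",
    Verdict.RECONCILABLE: "propose_fix",
    Verdict.ANOMALY: "exception_triage",
    Verdict.HARD_FAILURE: "war_room",
}


def next_step(verdict: Verdict) -> str:
    """Return where a verdict is sent next."""
    return NEXT_STEP[verdict]
```

Keeping the mapping in a table rather than in branching logic makes it easy to confirm that every verdict has exactly one destination.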
The confidence model
Every Reconcilable verdict comes with a confidence score in [0, 1]. The score is the agent's belief that its proposed fix is correct, given the rule library and the historical decisions made against this customer's data.
Three thresholds, configurable per project:
- Auto-resolve threshold (default 0.92 for banking, 0.85 for tax): at or above it, the agent applies the fix and logs the decision.
- Triage threshold (default 0.50): above it but below the auto-resolve threshold, the agent stages a proposal and waits for human approval.
- Hard-fail threshold (any verdict at or below 0.50, or any integrity-violating anomaly): escalate immediately to the war room.
Why three? Because the cost of a wrong auto-resolve is different from the cost of a wrong escalation. A wrong auto-resolve corrupts data silently. A wrong escalation just consumes five minutes of a human reviewer's time. We tune the auto-resolve threshold high enough that its false-positive rate (wrong fixes applied automatically) is of the same order of magnitude as the human reviewer's own error rate.
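The routing rules above can be sketched as a small function. This is a minimal illustration that assumes the boundaries exactly as stated (auto-resolve is inclusive at the top, hard-fail is inclusive at 0.50). The names `Thresholds`, `route`, and the returned labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Per-project thresholds; defaults are the banking values from the text."""
    auto_resolve: float = 0.92  # the text gives 0.85 as the tax default
    triage: float = 0.50


def route(confidence: float,
          thresholds: Thresholds = Thresholds(),
          integrity_violation: bool = False) -> str:
    """Map a confidence score to one of three hypothetical routing labels."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence must be in [0, 1], got {confidence}")
    # Integrity violations escalate regardless of confidence.
    if integrity_violation or confidence <= thresholds.triage:
        return "war_room"
    if confidence >= thresholds.auto_resolve:
        return "auto_resolve"
    # Strictly between the two thresholds: stage a proposal for a human.
    return "stage_for_approval"
```

For example, `route(0.91)` with the banking defaults stages a proposal, while `route(0.91, Thresholds(auto_resolve=0.85))` auto-resolves.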
How we set the auto-resolve threshold
Per project, we calibrate the threshold using two inputs:
- Historical reconciliation decisions from the customer's prior migrations (if they exist), which give us a target false-positive rate matched to their tolerance
- Sample-pass labeled anomalies from the first 2 weeks of the engagement, which give us the agent's empirical precision/recall curve on this customer's data shape
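One way to combine those two inputs, sketched under assumptions: treat the labeled sample as `(confidence, was_correct)` pairs and pick the lowest cutoff whose auto-resolved set still meets the target precision. The function name and the data shape are hypothetical, and the actual calibration procedure may well differ.

```python
from typing import Iterable, Optional, Tuple


def calibrate_threshold(labeled: Iterable[Tuple[float, bool]],
                        target_precision: float) -> Optional[float]:
    """Return the lowest confidence cutoff whose 'at or above' set meets
    the target precision, or None if no cutoff does."""
    ranked = sorted(labeled, key=lambda pair: pair[0], reverse=True)
    best = None
    correct = 0
    for i, (confidence, was_correct) in enumerate(ranked, start=1):
        correct += was_correct
        # A cutoff admits every record tied at that confidence, so only
        # evaluate precision once the whole tie group has been counted.
        end_of_tie_group = i == len(ranked) or ranked[i][0] < confidence
        if end_of_tie_group and correct / i >= target_precision:
            best = confidence
    return best


# Toy labeled sample: (agent confidence, reviewer agreed with the fix).
sample = [(0.99, True), (0.95, True), (0.93, True), (0.90, False), (0.80, True)]
strict_cutoff = calibrate_threshold(sample, target_precision=1.0)
```

On the toy sample, a target precision of 1.0 yields a cutoff of 0.93, because admitting the 0.90 record would let one wrong fix through.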
For NRS's recent TaxPro → Rev360 cutover, we set the auto-resolve threshold at 0.91 (slightly under the banking default, above the tax default). Of 47,238 surfaced anomalies, 3,617 cleared the threshold and auto-resolved. The agent's empirical precision on those 3,617, verified later by NRS's compliance team, was 99.83%. Six records out of 3,617 were classified as "agent was wrong; reviewer would have escalated", and all six were corrected before cutover.
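The reported precision follows directly from the counts in the paragraph above, as a quick check shows. The variable names are ours; the numbers are the ones quoted.

```python
surfaced = 47_238       # anomalies surfaced during the cutover
auto_resolved = 3_617   # anomalies that cleared the 0.91 threshold
wrong = 6               # auto-resolved records the reviewers would have escalated

precision = (auto_resolved - wrong) / auto_resolved
auto_share = auto_resolved / surfaced

print(f"precision: {precision:.2%}")             # precision: 99.83%
print(f"auto-resolved share: {auto_share:.2%}")  # auto-resolved share: 7.66%
```

So roughly one anomaly in thirteen was resolved without a human, and about one in six hundred of those automatic fixes was wrong.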
The lesson: agents make mistakes. The question isn't whether they will. It's whether your audit trail surfaces them quickly enough to correct, and whether your confidence model keeps the error rate inside acceptable bounds. Both are operational properties of the platform, not theoretical properties of the model.
Where the agent doesn't auto-resolve, even at high confidence
Some anomaly classes are policy-locked to require human review, regardless of confidence:
- Anything involving a change of legal entity (subsidiary restructuring, parent/child reassignment, ownership change)
- Anything that would alter a regulatory identifier (TIN, account number, license number)
- Anything that would change a customer's tax residency or jurisdiction
- Anything involving deceased or dissolved entities (legal review required)
These aren't soft thresholds the agent can override. The agent's prompt explicitly enumerates them as policy-locked categories — even a 0.99 confidence score routes to human review for these. That's by design and not negotiable per project.
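The text describes this lock as enforced through the agent's prompt. Purely as an illustration of the rule itself, the same behavior can be written as a deterministic check in which the category test runs before the confidence test. The category labels and the function name below are hypothetical.

```python
# Hypothetical labels for the four policy-locked classes listed above.
POLICY_LOCKED = frozenset({
    "legal_entity_change",
    "regulatory_identifier_change",
    "tax_residency_change",
    "deceased_or_dissolved_entity",
})


def may_auto_resolve(category: str, confidence: float,
                     auto_resolve_threshold: float) -> bool:
    """Return True only if the anomaly is not policy-locked AND the
    confidence is at or above the auto-resolve threshold."""
    if category in POLICY_LOCKED:
        # Locked categories always go to a human, whatever the score.
        return False
    return confidence >= auto_resolve_threshold
```

With this ordering, `may_auto_resolve("regulatory_identifier_change", 0.99, 0.91)` is `False`, while an unlocked category (say, a hypothetical `"suffix_expansion"`) at 0.94 passes.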
What the audit record looks like
{
  "decision_id": "dec_8a4f9c",
  "agent": "reconciliation_v3.2",
  "rule_library_version": "naming.v1.4 / equivalence.v2.0",
  "subject_id": "TIN:14729583012",
  "verdict": "reconcilable",
  "confidence": 0.94,
  "action": "merge_records",
  "reasoning": "naming.v1.4 SUFFIX_EXPANSION rule: ENT to Enterprises",
  "auto_resolved": true,
  "threshold_crossed": 0.91,
  "applied_at": "2026-04-12T03:47:18Z"
}
Every field is required. Every record links back to its rolling_back_action, so any decision can be reversed before cutover. The rule_library_version on a record is immutable: if we update the naming library in May, decisions made under the May version are new decisions with their own identifiers, and we don't retroactively change historical ones.
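A minimal sketch of how the "every field is required" rule might be checked before a record is written. The field list is taken from the sample record above. The function name and the two consistency checks are assumptions added for illustration, not the platform's actual validation.

```python
import json

# Field names copied from the sample audit record above.
REQUIRED_FIELDS = (
    "decision_id", "agent", "rule_library_version", "subject_id", "verdict",
    "confidence", "action", "reasoning", "auto_resolved",
    "threshold_crossed", "applied_at",
)


def validate_audit_record(raw: str) -> dict:
    """Parse a JSON audit record and reject it if any field is missing."""
    record = json.loads(raw)
    missing = [f for f in REQUIRED_FIELDS if record.get(f) is None]
    if missing:
        raise ValueError(f"audit record missing required fields: {missing}")
    # Assumed consistency checks, not stated in the text.
    if not 0.0 <= record["confidence"] <= 1.0:
        raise ValueError("confidence must be in [0, 1]")
    if record["auto_resolved"] and record["confidence"] < record["threshold_crossed"]:
        raise ValueError("auto-resolved record is below its threshold")
    return record


EXAMPLE = {
    "decision_id": "dec_8a4f9c",
    "agent": "reconciliation_v3.2",
    "rule_library_version": "naming.v1.4 / equivalence.v2.0",
    "subject_id": "TIN:14729583012",
    "verdict": "reconcilable",
    "confidence": 0.94,
    "action": "merge_records",
    "reasoning": "naming.v1.4 SUFFIX_EXPANSION rule: ENT to Enterprises",
    "auto_resolved": True,
    "threshold_crossed": 0.91,
    "applied_at": "2026-04-12T03:47:18Z",
}
validated = validate_audit_record(json.dumps(EXAMPLE))
```

Rejecting incomplete records at write time keeps the audit trail usable for the reversal described above, since a record with a missing field cannot be reliably traced or undone.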