ExVerus: Verus Proof Repair via Counterexample Reasoning

Jun Yang1, Yuechun Sun1, Yi Wu1, Rodrigo Caridad1, Yongwei Yuan2, Jianan Yao3, Shan Lu1,4, Kexin Pei1

1The University of Chicago    2Purdue University    3University of Toronto    4Microsoft Research


Paper Code ExVerus Examples Results BibTeX Contact

TLDR: ExVerus is a counterexample-guided framework that lets LLMs use verifier feedback to refine proofs. Instead of static proof generation, it learns from failed attempts by generating and generalizing counterexamples into inductive invariants—yielding higher proof accuracy, robustness, and efficiency than prior LLM-based Verus systems.

Why Counterexample-Guided Formal Proof Generation?

Current LLM-based proof generation framework often treats verification as a static, end-to-end prediction over source code, relies on sparse verifier feedback, and lacks visibility into concrete program behavior. As a result, such systems struggle to reason about fine-grained semantics,especially when constructing inductive loop invariants that demand deeper behavioral understanding, and counterexample guided proof generation can help address these challenges.

Overview of ExVerus

ExVerus Framework Diagram


ExVerus is a counterexample-guided proof-repair framework for Verus that transforms static proof generation into an iterative reasoning process. By automatically generating, validating, and generalizing counterexamples, it enables LLMs to refine loop invariants based on concrete program behaviors. It operates in mainly three phases:

  • Initial Generation: Produce an initial proof using a baseline LLM prompt; failed proofs trigger repair.
  • Counterexample Generation: Generate and validate SMT-based counterexamples that reveal real invariant failures.
  • Generalization & Repair: Classify errors, mutate invariants to block counterexamples, and select the best repair via blocking validation.

Motivating Examples

On how ExVerus iteratively repairs proofs through counterexample feedback.

How ExVerus generates, validates and blocks counterexamples. For details of step 3:CEX-guided proof repair, check Generalization Details.

Counterexample generation and validation workflow.

How ExVerus' CEX-guided proof repair works through Error Triangle, Counterexample-based mutation, and validation.

Counterexample generalization for proof repair visual.

Key Highlights

  • Counterexample‑Guided: Makes proof search iterative by generating, validating, and generalizing counterexamples.
  • Three‑Phase Repair: Initial proof → SMT counterexamples → invariant mutation.
  • Inductive Invariants: Generalizes counterexamples into invariants that block failures.
  • ObfsBench Robustness: 266 obfuscated tasks covering layout, data, and control‑flow changes.
  • Results: +45% over SOTA; 90% robust under obfuscation (vs. 33–50%); 70.2% counterexample‑blocking; ~75% lower tokens/time.

Performance Highlights

State-of-the-Art Results

ExVerus performs impressively across all evaluated benchmarks and base models, substantially surpassing previous SOTA LLM-based provers and strong iterative refinement baselines. It attains the highest verification success rates on VerusBench and DafnyBench (up to 95.5% with o4-mini) and solves significantly more tasks on HumanEval and LCBench, underscoring its superior generality and adaptability.

Performance Chart

Exceptional Robustness

ExVerus demonstrates strong semantic resilience under deliberate code obfuscation. In the ObfsBench benchmark—where 266 verified Verus proofs were systematically transformed using layout, data, and control-flow obfuscations, 90% of ExVerus-verified proofs remain verifiable, compared to only 33% - 50% for AutoVerus.

Exceptional Robustness

Superior Efficiency

ExVerus significantly reduces token consumption compared to prior methods: around 75% fewer tokens than AutoVerus on DeepSeek and around 25% fewer tokens than AutoVerus on GPT-4o.

Superior Efficiency

BibTeX

@inproceedings{
yang2026exverus,
title={ExVerus: Verus Proof Repair via Counterexample Reasoning},
author={Jun Yang and Yuechun Sun and Yi Wu and Rodrigo Caridad and Yongwei Yuan and Jianan Yao and Shan Lu and Kexin Pei},
booktitle={Forty-third International Conference on Machine Learning},
year={2026},
url={https://openreview.net/forum?id=FNDkXA0OUJ}
}