Systematic Testing of C++ Abstraction Recovery Systems by Iterative Compiler Model Refinement

Download: Paper.

“Systematic Testing of C++ Abstraction Recovery Systems by Iterative Compiler Model Refinement” by Edward J. Schwartz, Cory F. Cohen, and Stephanie Schwartz. In Proceedings of the 2025 Workshop on Software Understanding and Reverse Engineering, 2025.

Abstract

C++ source code abstractions such as classes and methods greatly assist human analysts and automated algorithms alike when analyzing C++ programs. These abstractions are lost during the compilation process, but researchers have been developing tools to recover them using program analysis. Despite promising advances, this difficult problem remains unsolved, with state-of-the-art solutions self-reporting accuracies of 78% [32] and 77.5% [10] for different types of abstractions.

In this paper, we address this problem by proposing a new model-based approach for systematically testing C++ abstraction recovery systems. Our high-level approach is to both jointly and iteratively refine the abstraction recovery system and a compiler model that introspects the compilation process. We built EmCee, a model of Microsoft's Visual C++ compiler, to apply our technique to the popular C++ abstraction recovery systems VirtAnalyzer [10] and OOAnalyzer [32]. EmCee “parses” input files by interpreting them as answers to a series of multiple choice questions (inspired by the game “twenty questions”), which makes it very amenable to fuzzing. We use an off-the-shelf grey-box fuzzer to automatically generate test cases for EmCee that represent a variety of program structures and optimizations. We then use these test cases to evaluate the reasoning in VirtAnalyzer and OOAnalyzer for soundness problems, and correct any violations. Using our approach, we identified 27 soundness problems in OOAnalyzer and three in VirtAnalyzer.
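
The "multiple choice questions" idea can be illustrated with a minimal sketch. This is not EmCee's implementation: the names Choices, ClassModel, and decode, the three questions asked, and their option lists are all hypothetical, chosen only to show why such a format suits fuzzing. Each input byte selects one option, so every byte string decodes to a structurally valid program description and the fuzzer never wastes effort on inputs that fail to parse.

```python
from dataclasses import dataclass
from typing import Optional


class Choices:
    """Treats raw bytes as answers to a sequence of multiple-choice questions."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0

    def choose(self, options):
        # Once the input runs out, every remaining question gets option 0,
        # so even an empty input decodes to a valid (minimal) model.
        byte = self.data[self.pos] if self.pos < len(self.data) else 0
        self.pos += 1
        return options[byte % len(options)]


@dataclass
class ClassModel:
    # Hypothetical, heavily simplified description of one C++ class.
    name: str
    parent: Optional[str]
    virtual_methods: int


def decode(data: bytes) -> list:
    """Decodes bytes into a list of ClassModel records (illustrative only)."""
    q = Choices(data)
    classes = []
    for i in range(q.choose([1, 2, 3])):           # Q: how many classes?
        earlier = [c.name for c in classes]
        parent = q.choose([None] + earlier)        # Q: which parent, if any?
        nvirt = q.choose([0, 1, 2])                # Q: how many virtual methods?
        classes.append(ClassModel(f"C{i}", parent, nvirt))
    return classes
```

For example, decode(bytes([1, 0, 2, 1, 1])) describes two classes in this sketch: C0 with no parent and two virtual methods, and C1 deriving from C0 with one virtual method. A grey-box fuzzer mutating those bytes would move between different hierarchies without ever producing a malformed input.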

BibTeX entry:

@inproceedings{schwartz:2025,
   author = {Edward J. Schwartz and Cory F. Cohen and Stephanie Schwartz},
   title = {Systematic Testing of {C++} Abstraction Recovery Systems by
	Iterative Compiler Model Refinement},
   booktitle = {Proceedings of the 2025 Workshop on Software Understanding
	and Reverse Engineering},
   year = {2025}
}
