Edward J. Schwartz, Computer Security Researcher (4 min. read)

I'm happy to announce that my student Luke Dramko's paper "Idioms: A Simple and Effective Framework for Turbo-Charging Local Neural Decompilation with Well-Defined Types" has been accepted to NDSS 2026! Put simply, the paper shows that neural decompilers benefit greatly from explicitly predicting and recovering user-defined types (structs, unions, etc.) referenced in decompiled code.

Paper & Code

Paper: https://edmcman.github.io/papers/ndss26.pdf
Code & models: https://github.com/squareslab/idioms

The paper has a great motivating example that I'll borrow here. The example starts with a C function that uses a struct type:

struct hash {
  int hash_size;
  int item_cnt;
  struct gap_array *data;
  int (*hash_make_key)(void *item);
  int (*cmp_item)(void *item1, void *item2);
};
struct gap_array {
  int len;
  void **array;
};
  
int hash_find_index(struct hash *h, void *item) {
    void *cnx;
    int index = hash_make_key(h, item);
    int cnt = 0;
    cnx = gap_get(h->data, index);
    while (cnx != NULL) {
        if (cnt++ > h->hash_size) return -1;
        if (!h->cmp_item(cnx, item)) break;
        index = hash_next_index(h, index);
        cnx = gap_get(h->data, index);
    }
    if (cnx == NULL) return -1;
    return index;
}

If you've read this blog, it will probably not surprise you that even state-of-the-art decompilers like Hex-Rays struggle to recover this code in a human-readable form. Here is the output from Hex-Rays:

__int64 __fastcall func4(__int64 a1, __int64 a2) {
  int v2;          // eax
  __int64 result;  // rax
  int v4;          // [rsp+10h] [rbp-10h]
  unsigned int v5; // [rsp+14h] [rbp-Ch]
  __int64 i;       // [rsp+18h] [rbp-8h]
  v5 = func2(a1, a2);
  v4 = 0;
  for (i = func1(*(_QWORD *)(a1 + 8), v5); i;
       i = func1(*(_QWORD *)(a1 + 8), v5)) {
    v2 = v4++;
    if (v2 > *(_DWORD *)a1) return 0xFFFFFFFFLL;
    if (!(*(unsigned int(__fastcall **)(__int64, __int64))(a1 + 24))(i, a2))
      break;
    v5 = func3((_DWORD *)a1, v5);
  }
  if (i)
    result = v5;
  else
    result = 0xFFFFFFFFLL;
  return result;
}

There are a lot of problems with this output, but the most serious is that the information about the hash struct has been completely lost.

A very exciting line of decompilation research is neural decompilation, which leverages neural models to either (1) directly decompile code, or (2) improve the decompiled code from traditional (non-neural) decompilers. I am personally extremely excited about the latter approach, which uses neural models to post-process the output of traditional decompilers such as Hex-Rays. Traditional decompilers have been studied for decades, so why not leverage their strengths while using neural models to fix their weaknesses? One popular example is the LLM4Decompile family of models. Here is an example of LLM4Decompile's output for this function:

int FUN_00100155(struct FUN_0009ff84 *VAR_0,void *VAR_1){
  int VAR_2;
  int VAR_3;
  void *VAR_4;
  VAR_2 = FUN_0009ff86(VAR_0, VAR_1);
  VAR_3 = 0;
  VAR_4 = FUN_0009ff88(VAR_0->VAR_5, VAR_2);
  while (VAR_4) {
    if (VAR_0->VAR_6 < VAR_3) { return -1; }
    if (!VAR_0->VAR_7(VAR_4, VAR_1)) { break; }
    VAR_2 = FUN_0009ff89(VAR_0, VAR_2);
    VAR_4 = FUN_0009ff88(VAR_0->VAR_5, VAR_2);
    VAR_3++;
  }
  if (VAR_4) { return VAR_2; }
  return -1;
}

Unlike Hex-Rays, LLM4Decompile correctly identifies that the function's arguments are pointers and that the first argument is a pointer to a struct. But what is struct FUN_0009ff84 and what are its fields, VAR_5, VAR_6, and VAR_7? And perhaps most importantly for reverse engineers, what offsets are those fields at? This information is crucial for understanding the code, but it has been omitted by the model.

One of Idioms' main contributions is to modify the training process of neural models so that they produce well-defined types such as structs with named fields. Unsurprisingly, this makes decompiled code much easier to understand. Here is the output of Idioms on this function:

struct hash_t {
  int size;
  int count;
  struct hash_table_t *table;
  int (*hash)(void *key);
  int (*cmp)(void *key1, void *key2);
};
struct hash_table_t {
  int size;
  void **items;
};
int hash_find(struct hash_t *hash, void *key) {
  int index = hash_index(hash, key);
  int i = 0;
  void *item = hash_get(hash->table, index);
  while (item != ((void *)0)) {
    if (i++ > hash->size) { return -1; }
    if (hash->cmp(item, key) == 0) { break; }
    index = hash_next(hash, index);
    item = hash_get(hash->table, index);
  }
  return (item == ((void *)0)) ? -1 : index;
}

An Unexpected Advantage of Joint Predictions

Perhaps surprisingly, in addition to improving the readability of the code, jointly predicting both code and types also significantly improves the accuracy of the decompiled code!

Across multiple models and evaluation metrics, Idioms consistently outperforms prior neural decompilers:

On ExeBench: Idioms achieves 54.4% test-pass accuracy (vs. 46.3% for LLM4Decompile and 37.5% for Nova).
On RealType: On this dataset, which we introduce and which contains substantially more, and more realistic, user-defined types (UDTs), Idioms improves correctness metrics by 95–205% over prior work.
Context helps: Adding neighboring-function context improves UDT recovery—up to 63% improvement in structural accuracy—with little downside for larger models.

Beyond Decompilation

Surprisingly (to me), Idioms also outperforms standalone type recovery tools such as Retypd, BinSub, TRex, and TypeForge, by at least 73%. This suggests that generative, context-aware approaches may be better suited to resolving the inherent ambiguity of type recovery than prior approaches, even though this was not the original motivation of Idioms.

Edward J. Schwartz, Computer Security Researcher (3 min. read)

Tags: research, papers

Systematic Testing of C++ Abstraction Recovery Systems by Iterative Compiler Model Refinement

I'm excited to announce that our paper, "Systematic Testing of C++ Abstraction Recovery Systems by Iterative Compiler Model Refinement", has been published at the 2025 Workshop on Software Understanding and Reverse Engineering (SURE)! SURE is a new workshop—more details below.

When I joined SEI, one of my first research projects was improving C++ abstraction recovery from binaries. This work led to the development of OOAnalyzer, a tool that combines practical reverse engineering with academic techniques like a Prolog-based reasoning engine.

Over time, users increasingly adopted OOAnalyzer and ran it on video game executables that were much larger and more complex than the malware samples we had originally targeted. Perhaps unsurprisingly, running OOAnalyzer on these larger, more complex binaries often revealed problematic corner cases in OOAnalyzer's rules. As we learned about such problems, we would attempt to improve the rules to handle the new cases. Sometimes this was pretty easy; some of the problems were obvious in hindsight. But not always.

Eventually, I found that some rules were becoming so nuanced and complex that I was having trouble reasoning about them. I realized that I needed a more systematic way to test and refine OOAnalyzer's rules. This realization led to the work presented in this paper.

On one hand, I'm incredibly proud of this paper. To greatly simplify, we developed a way to model check the rules in OOAnalyzer. It's been incredibly effective at finding problems in OOAnalyzer's rules. We found 27 soundness issues in OOAnalyzer, and two in VirtAnalyzer, a competing C++ abstraction recovery system.

Unfortunately, the journey to publication for this paper has been long and bumpy. It is, admittedly, a fairly niche topic. But I also think it has some interesting ideas that could be applied more generally to other reverse engineering systems. In particular, I think that much of what we call "binary analysis" is actually "compiled executable analysis" in disguise. In other words, our binary analyses are often making (substantial!) assumptions about how compilers work, such as the calling conventions that are used, and without these assumptions they do not work. Our paper provides a systematic way to encode and refine these assumptions, and validate whether underlying analyses are correct under the assumptions. My hope is that in future work, we are able to demonstrate how to apply this overall approach to more traditional binary analyses. I personally think that static binary analysis without such reasonable simplifying assumptions is impractical, so we need more techniques like this to help analyze compiled executables.

Although I'm proud of the problems we discovered in OOAnalyzer, I'm also a bit disappointed in the practical impact. My hope was that when we found and fixed soundness problems in OOAnalyzer's rules, it would generally improve the performance of OOAnalyzer on real-world binaries. However, we found that in general this was not true. In fact, in some cases, fixing soundness problems actually made OOAnalyzer perform worse on real-world binaries! I think this highlights that there is a trade-off between soundness and accuracy in binary analysis. In order to be sound, the system can never make mistakes. We may need to make a rule more conservative, even if it almost never poses a problem in practice. However, in order to maximize accuracy, the system must optimize for the most common cases, even if it means that it will occasionally make mistakes.

SURE Workshop

SURE, the Workshop on Software Understanding and Reverse Engineering, is a new workshop colocated with ACM CCS this year. In addition to presenting our paper, I had the honor of participating in a panel on "Modern Software Understanding" alongside Fish Wang and Steffen Enders. The first SURE was a resounding success, and I'm eager to see how it evolves in the coming years!

Ed at Panel on Modern Software Understanding

Edward J. Schwartz, Computer Security Researcher (1 min. read)

🎉 New Research Published at DIMVA 2025

I'm excited to announce that "Quantifying and Mitigating the Impact of Obfuscations on Machine-Learning-Based Decompilation Improvement" has been published at the 2025 Conference on Detection of Intrusions and Malware & Vulnerability Assessment (DIMVA 2025)!

The Research Team

This work was primarily conducted by Deniz Bölöni-Turgut—a bright undergraduate at Cornell University—as part of the REU in Software Engineering (REUSE) program at CMU. She was supervised by Luke Dramko from our research group.

What We Investigated

This paper tackles an important question in the evolving landscape of AI-powered reverse engineering: How do code obfuscations impact the effectiveness of these ML-based approaches? In the real world, adversaries often employ obfuscation techniques to make their code harder to analyze by reverse engineers. Although these obfuscation techniques were not designed with machine learning in mind, they can significantly modify the code, which raises the question of whether they could hinder the performance of ML models, which are currently trained on unobfuscated code.

Key Findings

Our research provides important quantitative insights into how obfuscations affect ML-based decompilation:

Obfuscations do negatively impact ML models: We demonstrated that semantics-preserving transformations that obscure program functionality significantly reduce the accuracy of machine learning-based decompilation tools.
Training on obfuscated code helps: Our experiments show that training models on obfuscated code can partially recover the lost accuracy, making the tools more resilient to obfuscation techniques.
Consistent results across multiple models: We validated our findings across three different state-of-the-art models from the literature—DIRTY, HexT5, and VarBERT—suggesting that our findings generalize.
Practical implications for malware analysis: Since obfuscations are commonly used in malware, these findings are directly applicable to improving real-world binary analysis scenarios.

This work represents an important step forward in making ML-based decompilation tools more resilient against the obfuscation techniques commonly encountered in real-world binary analysis scenarios. As the field continues to evolve, understanding these vulnerabilities and developing robust solutions will be crucial for maintaining the effectiveness of AI-powered security tools.

Want to know more? Download the complete paper.

Edward J. Schwartz, Computer Security Researcher (1 min. read)

🎉 New Research Published at DSN 2025

I'm excited to announce that "A Human Study of Automatically Generated Decompiler Annotations" has been published at the 2025 IEEE/IFIP International Conference on Dependable Systems and Networks (DSN 2025)!

The Research Team

This work represents the culmination of Jeremy Lacomis's Ph.D. research, alongside our fantastic collaborators:

Vanderbilt University: Yuwei Yang, Skyler Grandel, and Kevin Leach
Carnegie Mellon University: Bogdan Vasilescu and Claire Le Goues

What We Studied

This paper investigates a critical question in reverse engineering: Do automatically generated variable names and type annotations actually help human analysts understand decompiled code?

Our study built upon DIRTY, our machine learning system that automatically generates meaningful variable names and type information for decompiled binaries. While DIRTY showed promising technical results, we wanted to understand its real-world impact on human reverse engineers.

Key Findings

Surprisingly, the annotations did not significantly improve participants' task completion speed or accuracy
This challenges assumptions about the direct correlation between code readability and task performance
Participants preferred code with annotations over plain decompiled output

Interested in the full methodology and detailed results? Download the complete paper to dive deeper into our human study design, statistical analysis, and implications for future decompilation tools.

Edward J. Schwartz, Computer Security Researcher (1 min. read)

Tags: papers, publications, fuzzing

Right before the holidays, I, along with my co-authors of the journal article The Art, Science, and Engineering of Fuzzing: A Survey, received an early holiday present!

Congratulations!

On behalf of Vice President for Publications, David Ebert, I am writing to inform you that your paper, "The Art, Science, and Engineering of Fuzzing: A Survey," has been awarded the 2021 Best Paper Award from IEEE Transactions on Software Engineering by the IEEE Computer Society Publications Board.

This was quite unexpected, as our article was accepted back in 2019 -- four years ago! But it only "appeared" in the November 2021 edition of the journal.

You can access this article here or, as always, on my publications page.

Edward J. Schwartz, Computer Security Researcher (2 min. read)

Tags: papers, publications

It's been an exciting year so far. I'm happy to announce that two papers I co-authored received awards. Congratulations to the students who did all the heavy lifting -- Jeremy, Qibin, and Alex!

Distinguished Paper Award: Augmenting Decompiler Output with Learned Variable Names and Types

Qibin Chen, Jeremy Lacomis, Edward J. Schwartz, Claire Le Goues, Graham Neubig, and Bogdan Vasilescu. Augmenting Decompiler Output with Learned Variable Names and Types, (PDF) Proceedings of the 2022 USENIX Security Symposium. Received distinguished paper award.

This paper follows up on some of our earlier work in which we show how to improve decompiler output. Decompiler output is often substantially more readable than the lower-level alternative of reading disassembly code. But decompiler output still has a lot of shortcomings when it comes to information that is removed during the compilation process, such as variable names and type information. In our previous work, we showed that it is possible to recover meaningful variable names by learning appropriate variable names based on the context of the surrounding code.

In the new paper, Jeremy, Qibin, and our coauthors explored whether it is also possible to recover high-level types via learning. There is a rich history of binary analysis work in the area of type inference, but this work generally focuses on syntactic types, such as struct {float; float}. These type inference algorithms are generally already built into decompilers. In our paper, we try to recover semantic types, such as struct {float x; float y} point, which include the type and field names and are more valuable to a reverse engineer. It turns out that we can recover semantic types even more accurately than variable names. This is in part because types are constrained by the way in which they are used. For example, an int can't be confused with a char because they are different sizes.

Best Paper Award: Learning to Superoptimize Real-world Programs

Alex Shypula, Pengcheng Yin, Jeremy Lacomis, Claire Le Goues, Edward Schwartz, and Graham Neubig. Learning to Superoptimize Real-world Programs, (Arxiv) (PDF) Proceedings of the 2022 Deep Learning for Code Workshop at the International Conference on Learning Representations. Received best paper award.

In this paper, Alex and our co-authors investigate whether neural models are able to learn and improve on optimizations at the assembly code level by looking at unoptimized and optimized code pairings that are generated from an optimizing compiler. The short answer is that they can, and by employing reinforcement learning on top, they can learn to outperform an optimizing compiler in some cases! Superoptimization is an interesting problem in its own right, but what really excites me about this paper is that it demonstrates that neural models can learn very complex optimizations such as register allocation just by looking at the textual representation of assembly code. The optimizations the model can perform clearly indicate that the model is learning a substantial portion of x86 assembly code semantics merely by looking at examples. To me, this clearly signals that, with the right data, neural models are likely able to solve many binary analysis problems. I look forward to future work in which we combine traditional binary analysis techniques, such as explicit semantic decodings of instructions, with neural learning.

Edward J. Schwartz, Computer Security Researcher (1 min. read)

Tags: papers, publications, research

I'm happy to announce that a paper written with my colleagues, A Generic Technique for Automatically Finding Defense-Aware Code Reuse Attacks, will be published at ACM CCS 2020. This paper is based on some ideas I had while I was finishing my degree but did not have time to pursue. Fortunately, I was able to find time to work on it again at SEI, and this paper is the result. A pre-publication copy is available from the publications page.

Edward J. Schwartz, Computer Security Researcher (1 min. read)

Tags: research, papers, fuzzing

My colleagues and I wrote a survey paper on fuzzing entitled The Art, Science, and Engineering of Fuzzing: A Survey. Late last year this was accepted to appear in the IEEE Transactions on Software Engineering journal, but has not been officially published to date. We also recently learned that it will appear at ICSE 2020. As usual, you can download it from the publications page.

Edward J. Schwartz, Computer Security Researcher (1 min. read)

Tags: research, reverse engineering, papers

My colleagues and I finished the camera-ready version of our DIRE paper on variable name recovery. Although there's no code released at this time, I'm hoping to release a proof-of-concept Hex-Rays plugin.
