Edward J. Schwartz · Computer Security Researcher · 2 min. read

Can the publicly available artifacts for existing neural decompilers be run on a new example? Here are some notes on the current state of the art. I give each decompiler a score from 0 to 10 based on how easy it is to run its public artifacts on a new example.

SLaDe: 2/10

SLaDe has a publicly released replication artifact, but several problems prevent it from being used on new examples:

  1. The models are trained on assembly code produced by compilers rather than by disassemblers. This is probably minor.
  2. More problematically, SLaDe uses IO testcases during beam search to help detect the best candidate. It can be used without these, but the results will be worse. SLaDe does not contain a mechanism for producing testcases for new examples.
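To make the role of the IO testcases concrete, here is a minimal sketch of test-based candidate selection. This is not SLaDe's code: the candidates are toy Python callables standing in for compiled decompilation candidates, and `passes` and `pick_best` are hypothetical helper names I made up for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Candidate = Callable[[int], int]   # toy stand-in for a compiled candidate
IOTest = Tuple[int, int]           # (input, expected output)


def passes(candidate: Candidate, tests: Sequence[IOTest]) -> int:
    """Count how many IO tests a candidate satisfies; a crash counts as a failure."""
    passed = 0
    for arg, expected in tests:
        try:
            if candidate(arg) == expected:
                passed += 1
        except Exception:
            pass
    return passed


def pick_best(candidates: Sequence[Candidate], tests: Sequence[IOTest]) -> int:
    """Return the index of the candidate that passes the most tests.

    Ties keep the original (beam) order, so with no tests at all this
    simply falls back to the first candidate.
    """
    scores: List[int] = [passes(c, tests) for c in candidates]
    return max(range(len(candidates)), key=lambda i: scores[i])


# Three made-up "beam" candidates for a function that should double its input.
candidates = [lambda x: x + 2, lambda x: x * 2, lambda x: 1 // (x - 2)]
tests = [(1, 2), (2, 4), (3, 6)]

best_with_tests = pick_best(candidates, tests)   # picks the doubling candidate
best_without_tests = pick_best(candidates, [])   # falls back to the top beam
```

Without tests there is nothing to tell the candidates apart, which is why results get worse when no testcases are available for a new example.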

Below is a quote from a private conversation with the author:

You are right that IO are somehow used to select in the beam search, in the sense that we report pass@5. They are not strictly required to get the outputs though.

The link you sent is for the program synthesis dataset. In this one, IO generation was programmatic but still kind of manual, I don't think it would be feasible to automatically generate the props file in the general case. For the Github functions, we have a separate repo that automatically generates IO tests, but those are randomly generated and the quality depends on each case. If I had to redo now, I would ask an LLM to generate unit tests! I can give you access to the private repo we used to automatically generate the IO examples for the general case if you wish, but now I'd do it with LLMs rather than randomly.

LLM4Decompile: 9/10

LLM4Decompile has published model files on HuggingFace that can easily be used to run on new examples. I created a few HuggingFace Spaces for testing.
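As a rough sketch of what running on a new example looks like with the Hugging Face `transformers` library: the model ID and the prompt template below are my assumptions based on the project's published instructions, so check the model card for the version you actually use. The input assembly is also assumed to be preprocessed into whatever form that model expects.

```python
def build_prompt(asm: str) -> str:
    """Wrap assembly text in the prompt template (assumed; verify on the model card)."""
    return "# This is the assembly code:\n" + asm.strip() + "\n# What is the source code?\n"


def decompile(
    asm: str,
    model_id: str = "LLM4Binary/llm4decompile-1.3b-v1.5",  # assumed model ID
    max_new_tokens: int = 512,
) -> str:
    """Generate C source for one function's assembly. Downloads the model on first use."""
    # Imported lazily so build_prompt works without the heavy dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    inputs = tokenizer(build_prompt(asm), return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Drop the prompt tokens so only the generated source remains.
    generated = output[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(generated, skip_special_tokens=True)
```

Calling `decompile(asm_text)` on a machine with enough memory should return the model's guess at the source code.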

resym: 2/10

resym has a publicly released replication artifact. Unfortunately, as of February 2025, the artifact is missing the "prolog-based inference system for struct layout recovery" which is the key contribution of the paper. Thus it is not possible to run resym on new examples.

DeGPT: 8/10

DeGPT has a publicly released GitHub repository. I'm largely going from memory, but I used it previously on new examples and it was relatively easy to use. I did have to file a few PRs though.

Edward J. Schwartz · Computer Security Researcher · 1 min. read

This page documents my experience with "pressure washing" my vinyl fence and siding. I put pressure washing in quotes because it's sodium hypochlorite (SH), the active ingredient in bleach, that does the bulk of the work. Pros often call this "soft washing".

For vinyl fence soft washing, you want around 1-2% SH. Most household bleach is 6% SH, so if you mix 1 part bleach with 5 parts water, you'll get around 1% SH.
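The dilution arithmetic can be sketched as a small helper. The function name is my own, and it assumes the percentages mix linearly by volume, which is a simplification.

```python
def water_parts_per_bleach_part(stock_pct: float, target_pct: float) -> float:
    """Parts of water to add to one part of bleach to reach the target SH percentage.

    Assumes concentration scales linearly with volume.
    """
    if not 0 < target_pct <= stock_pct:
        raise ValueError("target must be positive and no stronger than the stock")
    # One part of stock diluted to (stock / target) total parts.
    return stock_pct / target_pct - 1


one_percent = water_parts_per_bleach_part(6.0, 1.0)  # 5 parts water, as above
two_percent = water_parts_per_bleach_part(6.0, 2.0)  # 2 parts water
```

So with 6% bleach, anywhere from 2 to 5 parts water per part bleach lands in the 1-2% range.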

You also want to use a surfactant to help the mixture stick to the fence. I used Dawn Ultra. Some people claim that some dish soaps will cause a bad reaction with the bleach, ranging from "mustard gas" to neutralizing the bleach.

I personally found that at 1-1.5% SH, the mixture was safe to use around grass. I wet the grass before and after applying the mixture, and I didn't see any damage.

Supplies

Recipe

  1. Add 3 cups of 6% SH bleach
  2. Add 0.8 gallons of water
  3. Add 2 fl. oz. of Dawn Ultra

Make sure to put the soap in last, or your mixture will foam up and overflow the sprayer when you try to close it.
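As a sanity check on the recipe, here is a short sketch that converts everything to cups and computes the resulting SH percentage. It assumes US customary units (16 cups per gallon, 8 fl. oz. per cup) and that volumes simply add; the function name is made up.

```python
CUPS_PER_GALLON = 16  # US customary
FL_OZ_PER_CUP = 8


def mix_concentration(bleach_cups: float, water_gallons: float,
                      soap_fl_oz: float = 0.0, stock_pct: float = 6.0) -> float:
    """Approximate SH percentage of the final mixture, assuming volumes add."""
    water_cups = water_gallons * CUPS_PER_GALLON
    soap_cups = soap_fl_oz / FL_OZ_PER_CUP
    total_cups = bleach_cups + water_cups + soap_cups
    return stock_pct * bleach_cups / total_cups


recipe_pct = mix_concentration(bleach_cups=3, water_gallons=0.8, soap_fl_oz=2)
print(f"{recipe_pct:.2f}% SH")  # prints "1.12% SH"
```

That puts the recipe at the low end of the 1-2% range.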

Spray the mixture on the fence, let it sit for about five minutes, and then rinse it off. You can use a garden hose, but I personally found that using a Ryobi One+ EZ-Clean worked better. I'm sure a pressure washer would have been even faster, but it is less convenient to use.

That's about it. This removed most of the staining.

For some areas that had large amounts of growth, I used a Ryobi Scrubber to physically remove it before spraying.

The bleach was not able to remove all stain spots. For those remaining spots that were in conspicuous places, I used a magic eraser / melamine sponge.

[Photos: before and after]
Edward J. Schwartz · Computer Security Researcher · 1 min. read

At some point, I hope to create a Notes section on my website that will turn Markdown files into a list of notes. This is basically how the blog works. But I'm kind of busy, and since Gatsby seems like it's dead, I'm not sure that I want to invest a whole lot of time into it. (Although putting the notes in Markdown seems like a good idea for compatibility.)

Anyway, here is my first very short note on Profiling.

SpeedScope

SpeedScope is an awesome tool for visualizing profiler output. It has a flame graph view that is wonderful. I also like to use the Sandwich view, sorting by total time and simply looking for the first function that I recognize. This is often the culprit.

The documentation is pretty good. It also shows how to record profiles in compatible formats for most platforms. I mostly use py-spy and perf.

Java

The one notably missing platform is Java! Luckily, it's not too hard to convert Java's async-profiler output to a format that SpeedScope can read. Here's how I do it:

  • Download async-profiler
  • Create an output file in collapsed format. You can do this in several ways, such as:
      • ./asprof start -i 1s Ghidra followed by ./asprof stop -o collapsed -f /tmp/out.prof.collapsed Ghidra
      • ./asprof collect -d 60 -o collapsed -f /tmp/out.prof.collapsed Ghidra
  • Then open out.prof.collapsed in SpeedScope.
Screenshot of profiling Ghidra

The collapsed format takes a while to parse, so it might be worth it to export the native SpeedScope format.
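For a quick look at a collapsed file without loading it into SpeedScope, here is a minimal sketch that computes total samples per function, roughly what the Sandwich view sorted by total time shows. It assumes the usual collapsed layout of one stack per line, with frames separated by `;` and a sample count after the last space. The function name and the sample lines are made up.

```python
from collections import Counter
from typing import Iterable


def total_samples_by_frame(lines: Iterable[str]) -> Counter:
    """Sum sample counts for every frame that appears anywhere in a stack."""
    totals: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # The count follows the last space; frame names may contain spaces.
        stack, _, count = line.rpartition(" ")
        if not stack or not count.isdigit():
            continue  # skip malformed lines
        # Use a set so a recursive frame is counted once per stack.
        for frame in set(stack.split(";")):
            totals[frame] += int(count)
    return totals


# Made-up sample data in collapsed format.
sample = [
    "main;parse;read 3",
    "main;parse 2",
    "main;render 5",
]
totals = total_samples_by_frame(sample)
top = totals.most_common(1)  # [("main", 10)]
```

To run it on real output, pass an open file instead of the list, for example `total_samples_by_frame(open("/tmp/out.prof.collapsed"))`.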
