Edward J. SchwartzComputer Security Researcher7 min. read

I've been feeling left behind by all the exciting news surrounding AI, so I've been quietly working on a project to get myself up to speed with some modern Transformers models for NLP. This project is mostly a learning exercise, but I'm sharing it in the hopes that it is still interesting!

Background: Functions, Methods and OOAnalyzer

One of my favorite projects is OOAnalyzer, which is part of SEI's Pharos static binary analysis framework. As the name suggests, it is a binary analysis tool for analyzing object-oriented (OO) executables to learn information about their high level structure. This structure includes classes, the methods assigned to each class, and the relationships between classes (such as inheritance). What's really cool about OOAnalyzer is that it is built on rules written in Prolog -- yes, Prolog! So it's both interesting academically and practically; people do use OOAnalyzer in practice. For more information, check out the original paper or a video of my talk.

Along the way to understanding a program's class hierarchy, OOAnalyzer needs to solve many smaller problems. One of these problems is: Given a function in an executable, does this function correspond to an object-oriented method in the source code? In my humble opinion, OOAnalyzer is pretty awesome, but one of the yucky things about it is that it contains many hard-coded assumptions that are only valid for Microsoft Visual C++ on 32-bit x86 (or just x86 MSVC to save some space!)

It just so happens that on this compiler and platform, most OO methods use the thiscall calling convention, which passes a pointer to the this object in the ecx register. Below is an interactive Godbolt example that shows a simple method mymethod being compiled by x86 MSVC. You can see on line 7 that the thisptr is copied from ecx onto the stack at offset -4. On line 9, you can see that arg is passed on the stack. In contrast, on line 20, myfunc only receives its argument on the stack, and does not access ecx.

(Note that this assembly code comes from the compiler and includes the name of functions, which makes it obvious whether a function corresponds to an OO method. Unfortunately this is not the case when reverse engineering without access to source code!)

Because most non-thisptr arguments are passed on the stack in x86 MSVC, but the thisptr is passed in ecx, seeing an argument in the ecx register is highly suggestive (but not definitive) that the function corresponds to an OO method. OOAnalyzer has a variety of heuristics based on this notion that tries to determine whether a function corresponds to an OO method. These work well, but they're specific to x86 MSVC. What if we wanted to generalize to other compilers? Maybe we could learn to do that. But first, let's see if we can learn to do this for x86 MSVC.

Learning to the Rescue

Let's play with some machine learning!

Step 1: Create a Dataset

You can't learn without data, so the first thing I had to do was create a dataset. Fortunately, I already had a lot of useful tools and projects for generating ground truth about OO programs that I could reuse.

BuildExes

The first project I used was BuildExes. This is a project that takes several test programs that are distributed as part of OOAnalyzer and builds them with many versions of MSVC and a variety of compiler options. The cute thing about BuildExes is that it uses Azure pipelines to install different versions of MSVC using the Chocolatey package manager and perform the compilations. Otherwise we'd have to install eight MSVC versions, which sounds like a pain to me. BuildExes uses a mix of publicly available Chocolatey packages and some that I created for older versions of MSVC that no one else cares about ๐Ÿคฃ.

When BuildExes runs on Azure pipelines, it produces an artifact consisting of a large number of executables that I can use as my dataset.

Ground Truth

As part of our evaluations for the OOAnalyzer paper, we wrote a variety of scripts that extracted ground truth information out of PDB debugging symbols files (which, conveniently, are also included in the BuildExes artifact!) These scripts aren't publicly available, but they aren't top secret and we've shared them with other researchers. They essentially call a tool to decode PDB files into a textual representation and then parse the results.

Putting it Together

Here is the script that produces the ground truth dataset. It's a bit obscure, but it's not very complicated. Basically, for each executable in the BuildExes project, it reads the ground truth file, and also uses the bat-dis tool from the ROSE Binary Analysis Framework to disassemble the program.

The initial dataset is available on ๐Ÿค— HuggingFace. Aside: I love that ๐Ÿค— HuggingFace lets you browse datasets to see what they look like.

Step 2: Split the Data

The next step is to split the data into training and test sets. In ML, the best practice is generally to ensure that there is no overlap between the training and test sets. This is so that the performance of a model on the test set represents performance on "unseen examples" that the model has not been trained on. But this is tricky for a few reasons.

First, software is natural, which means that we can expect that distinct programmers will naturally write the same code over and over again. If a function is independently written multiple times, should it really count as an "previously seen example"? Unfortunately, we can't distinguish when a function is independently written multiple times or simply copied. So when we encounter a duplicate function, what should we do? Discard it entirely? Allow it to be in both the training and test sets? There is a related question for functions that are very similar. If two functions only differ by an identifier name, would knowledge of one constitute knowledge of the other?

Second, compilers introduce a lot of functions that are not written by the programmer, and thus are very, very common. If you look closely at our dataset, it is actually dominated by compiler utility functions and functions from the standard library. In a real-world setting, it is probably reasonable to assume that an analyst has seen these before. Should we discard these, or allow them to be in both the training and test sets, with the understanding that they are special in some way?

Constructing a training and test split has to take into account these questions. I actually created an additional dataset that splits the data via different mechanisms.

I think the most intuitively correct one is what I call splitting by library functions. The idea is quite simple. If a function name appears in the compilation of more than one test program, we call it a library function. For example, if we compile oo2.cpp and oo3.cpp to oo2.exe and oo3.exe respectively, and both executables contain a function called msvc_library_function, then that function is probably introduced by the compiler or standard library, and we call it a library function. If function oo2_special_fun only appears in oo2.exe and no other executables, then we call it a non-library function. We then split the data into training and test sets such that the training set consists only of library functions, and the test set consists only of non-library functions. In essence, we train on the commonly available functions, and test on the rare ones.

This idea isn't perfect, but it works pretty well, and it's easy to understand and justify. You can view this split here.

Step 3: Fine-tune a Model

Now that we have a dataset, we can fine-tune a model. I used the huggingface/CodeBERTa-small-v1 model on ๐Ÿค— HuggingFace as a base model. This is a model that was trained on a large corpus of code, but that is not trained on assembly code (and we will see some evidence of this).

My model, which I fine-tuned with the training split using the "by library" method described above, is available as ejschwartz/oo-method-test-model-bylibrary. It attains 94% accuracy on the test set, which I thought was pretty good. I suspect that OOAnalyzer performs similarly well.

If you are familiar with the ๐Ÿค— Transformers library, it's actually quite easy to use my model (or someone else's). You can actually click on the "Use in Transformers" button on the model page, and it will show you the Python code to use. But if you're a mere mortal, never fear, as I've created a Space for you to play with, which is embedded here:

If you want, you can upload your own x86 MSVC executable. But if you don't have one of those lying around, you can just click on one of the built-in example executables at the bottom of the space (ooex8.exe and ooex9.exe). From there, you can select a function from the dropdown menu to see its disassembly, and the model's opinion of whether it is a class method or not. If the assembly code is too long for the model to process, you'll currently encounter an error.

Here's a recording of what it looks like:

If you are very patient, you can also click on "Interpret" to run Shapley interpretation, which will show which tokens are contributing the most. But it is slow. Very slow. Like five minutes slow. It also won't give you any feedback or progress (sigh -- blame gradio, not me).

Here's an example of an interpretation. The dark red tokens contribute more to method-ness and dark blue tokens contribute to function-ness. You can also see that registers names are not tokens in the original CodeBERT model. For instance, ecx and is split into ec and x, which the model learns to treat as a pair. It's not a coincidence that the model learned that ecx is indicative of a method, as this is a rule used in OOAnalyzer as well. It's also somewhat interesting that the model views sequences of zero bytes in the binary code as being indicative of function-ness.

Shapley interpretation example
Shapley interpretation example

Looking Forward

Now that we've shown we can relearn a heuristic that is already included in OOAnalyzer, this raises a few other questions:

  • One of the "false positives" of the ecx heuristics is the fastcall calling convention, which also uses the ecx register. We should ensure that the dataset contains functions with this calling convention (and others), since they do occur in practice.

  • What other parts of OOAnalyzer can we learn from examples? How does the accuracy of learned properties compare to the current implementation?

  • Can we learn to distinguish functions and methods for other architectures and compilers? Unfortunately, one bottleneck here is that our ground truth scripts only work for MSVC. But this certainly isn't an insurmountable problem.

blog image
Edward J. SchwartzComputer Security Researcher1 min. read

I've mentioned a few times that I'm an avid skier. I live in central Pennsylvania though, and so the skiing is not always the best, to say the least!

I've been going to the BLISTER Summit in Crested Butte for the past two years now, and it's been a blast. The summit defies categorization; it's a cross between a huge ski demo, meet-up, and panel sessions. It's a great way to try out new gear, ski on an incredible mountain, and ski with some really awesome people, including Wendy Fisher, Chris Davenport, Drew Petersen and the BLISTER team.

Anyway, I mostly wanted to share a few fun pictures from the event. Somehow I made it into the promo video for next year's event! Sadly, it was not because of my epic skiing:

Pictures

I've been looking for a good excuse to post some pictures. So, here you go!

Me skiing in powder and trees
Me skiing in powder and trees

Me skiing in powder
Me skiing in powder

My friend Doug and I riding the T-Bar
My friend Doug and I riding the T-Bar

A group of Blister attendees in Headwall Glades
A group of Blister attendees in Headwall Glades

So much fresh powder!
So much fresh powder!

Me kicking up some snow
Me kicking up some snow

Me skiing some chop
Me skiing some chop

blog image
Edward J. SchwartzComputer Security Researcher1 min. read

My family and I had quite the day attending the 2023 Central Pennsylvania Open Source Conference (CPOSC) yesterday at the Ware Center in Lancaster, PA. In addition to attending, we spoke in three different talks!

My wife, Dr. Stephanie Schwartz, kicked things off.

Stephanie
presenting
Stephanie presenting
She gave a short introduction on Random Forests as an introduction for her student Samantha Noggle, in "Machine Learning Techniques to Improve Users' Music Listening Experiences". I thought Samantha did a great job on her presentation. It was an interesting topic that she presented at just the right level of detail.

Next, my step son Nick Elzer, along with his coworker from Quub, Nathaniel Every, gave his first ever public conference presentation, and did a great job!

Nick and Nathaniel
presenting
Nick and Nathaniel presenting
Their presentation was "SpaceHeX Beta1.0 Release - What If, For Space Hardware Development, We Put Each Egg In Its Own Basket?", and was about SpaceHex, a prototyping system they open sourced to make it easier for people to break into satellite development. They also demoed a pretty sweet omnidirectional rover that they built (at 2am the night before, naturally).

Finally, I gave a tutorial presentation called "Introduction to Exploiting Stack Buffer Overflow Vulnerabilities". If you want, you can follow along here and the videos below.

Me speaking
Me speaking

After a few years of hiatus, it was great to have CPOSC 2023 be back in person again. I saw a number of great presentations, and met a lot of interesting and smart people. It's always surprising how many technical people work in an area that is known for its rural farming!

Videos

Edward J. SchwartzComputer Security Researcher4 min. read

Last December, I did most of Advent of Code in Rust, which I had never used before. You can find my solutions here.

The Good

Modern syntax with LLVM codegen

I tend to program functionally, perhaps even excessively so. I try to express most concepts through map, filter, and fold. I tend to enjoy languages that make this easy. Fortunately, this is becoming the norm, even in non-functional languages such as Python, Java and C++.

Perhaps it is not too surprising then that Rust, as a new language, supports this style of programming as well:

let x: i32 = (1..42).map(|x| x+1).sum();
println!("x: {x}");

What is truly amazing about Rust though is how this function code is compiled to x86-64. At optimiation level 1, the computation of x evaluates to

        mov     dword ptr [rsp + 4], 902
        lea     rax, [rsp + 4]

Yes, the compiler is able to unfold and simplify the entire computation, which is pretty neat. But let's look at the code at optimization level 0:

        mov     ecx, 1
        xor     eax, eax
        mov     dl, 1
.LBB5_1:
.Ltmp27:
        movzx   edx, dl
        and     edx, 1
        add     edx, ecx
.Ltmp28:
        add     eax, ecx
        inc     eax
        mov     ecx, edx
.Ltmp29:
        cmp     edx, 42
        setb    dl
.Ltmp30:
        jb      .LBB5_1
.Ltmp31:
        sub     rsp, 72
        mov     dword ptr [rsp + 4], eax
        lea     rax, [rsp + 4]

So our functional computation of a range, a map, and a sum (which is a reduce) is compiled into a pretty simple loop. And keep in mind this is at optimization level 0.

By contrast, let's take a look at how OCaml handles this. First, the included OCaml standard library is not so great, so writing the program is more awkward:

let r = List.init 42 (fun x -> x + 1) in
let x = List.map (fun x -> x+1) r in
let x = List.fold_left (+) 0 x in
Printf.printf "x: %x\n" x

But let's look at the assembly with aggressive optimizations:

camlExample__entry:
        leaq    -328(%rsp), %r10
        cmpq    32(%r14), %r10
        jb      .L122
.L123:
        subq    $8, %rsp
.L121:
        movl    $85, %ebx
        movl    $5, %eax
        call    camlExample__init_aux_432@PLT
.L124:
        call    caml_alloc2@PLT
.L125:
        leaq    8(%r15), %rsi
        movq    $2048, -8(%rsi)
        movq    $5, (%rsi)
        movq    %rax, 8(%rsi)
        movq    camlExample__Pmakeblock_arg_247@GOTPCREL(%rip), %rdi
        movq    %rsp, %rbp
        movq    56(%r14), %rsp
        call    caml_initialize@PLT
        movq    %rbp, %rsp
        movq    camlExample__Pmakeblock_arg_247@GOTPCREL(%rip), %rax
        movq    (%rax), %rax
        call    camlExample__map_503@PLT
.L126:
        call    caml_alloc2@PLT
.L127:
        leaq    8(%r15), %rsi
        movq    $2048, -8(%rsi)
        movq    $5, (%rsi)
        movq    %rax, 8(%rsi)
        movq    camlExample__x_77@GOTPCREL(%rip), %rdi
        movq    %rsp, %rbp
        movq    56(%r14), %rsp
        call    caml_initialize@PLT
        movq    %rbp, %rsp
        movq    camlExample__x_77@GOTPCREL(%rip), %rax
        movq    (%rax), %rax
        movq    8(%rax), %rbx
        movl    $5, %eax
        call    camlExample__fold_left_558@PLT
.L128:
        movq    camlExample__x_75@GOTPCREL(%rip), %rdi
        movq    %rax, %rsi
        movq    %rsp, %rbp
        movq    56(%r14), %rsp
        call    caml_initialize@PLT
        movq    %rbp, %rsp
        movq    camlExample__const_block_49@GOTPCREL(%rip), %rdi
        movq    camlExample__Pmakeblock_637@GOTPCREL(%rip), %rbx
        movq    camlStdlib__Printf__anon_fn$5bprintf$2eml$3a20$2c14$2d$2d48$5d_409_closure@GOTPCREL(%rip), %rax
        call    camlCamlinternalFormat__make_printf_4967@PLT
.L129:
        movq    camlExample__full_apply_240@GOTPCREL(%rip), %rdi
        movq    %rax, %rsi
        movq    %rsp, %rbp
        movq    56(%r14), %rsp
        call    caml_initialize@PLT
        movq    %rbp, %rsp
        movq    camlExample__full_apply_240@GOTPCREL(%rip), %rax
        movq    (%rax), %rbx
        movq    camlExample__x_75@GOTPCREL(%rip), %rax
        movq    (%rax), %rax
        movq    (%rbx), %rdi
        call    *%rdi
.L130:
        movl    $1, %eax
        addq    $8, %rsp
        ret
.L122:
        push    $34
        call    caml_call_realloc_stack@PLT
        popq    %r10
        jmp     .L123

The Bad

Lambda problem

In general, the Rust compiler's error messages are quite helpful. This is important, because dealing (fighting) with the borrow checker is a frequence occurrence. Unfortunately, there are some cases that, despite a lot of effort, I still don't really understand.

Here is a problem that I posted on stack overflow. It's a bit contrived, but it happened because I had a very functional solution to part 1 of an Advent of Code problem. The easiest way to solve the second part was to add a mutation.

Here is the program:

fn main(){
  let v1=vec![1];
  let v2=vec![3];
  let mut v3=vec![];
  v1.iter().map(|x|{
    v2.iter().map(|y|{
      v3.push(*y);
    })
  });
}

And here is the error:

error: captured variable cannot escape `FnMut` closure body
 --> src/main.rs:6:5
  |
4 |     let mut v3=vec![];
  |         ------ variable defined here
5 |     v1.iter().map(|x|{
  |                     - inferred to be a `FnMut` closure
6 | /     v2.iter().map(|y|{
7 | |       v3.push(*y);
  | |       -- variable captured here
8 | |     })
  | |______^ returns a reference to a captured variable which escapes the closure body

The suggestions I received on stack overflow were basically "use loops". This was very disappointing for an example where the closures' scopes are clearly limited.

Anyway, it's still early days for Rust, so I hope that problems like this will be improved over time. Overall, it seems like a great language for doing systems development, but I still think a garbage collected language is better for daily driving.

Right before the holidays, I, along with my co-authors of the journal article The Art, Science, and Engineering of Fuzzing: A Survey, received an early holiday present!

Congratulations!

On behalf of Vice President for Publications, David Ebert, I am writing to inform you that your paper, "The Art, Science, and Engineering of Fuzzing: A Survey," has been awarded the 2021 Best Paper Award from IEEE Transactions on Software Engineering by the IEEE Computer Society Publications Board.

This was quite unexpected, as our article was accepted back in 2019 -- four years ago! But it only "appeared" in the November 2021 editions of the journal.

You can access this article here or, as always, on my publications page.

blog image
Edward J. SchwartzComputer Security Researcher2 min. read

It's been an exciting year so far. I'm happy to announce that two papers I co-authored received awards. Congratulations to the students who did all the heavy lifting -- Jeremy, Qibin, and Alex!

Distinguished Paper Award: Augmenting Decompiler Output with Learned Variable Names and Types

Qibin Chen, Jeremy Lacomis, Edward J. Schwartz, Claire Le Goues, Graham Neubig, and Bogdan Vasilescu. Augmenting Decompiler Output with Learned Variable Names and Types, (PDF) Proceedings of the 2022 USENIX Security Symposium. Received distinguished paper award.

This paper follows up on some of our earlier work in which we show how to improve decompiler. Decompiler output is often substantially more readable compared to the lower-level alternative of reading disassembly code. But decompiler output still has a lot of shortcomings when it comes to information that is removed during the compilation process, such as variable names and type information. In our previous work, we showed that it is possible to recover meaningful variable names by learning appropriate variable names based on the context of the surrounding code.

In the new paper, Jeremy, Qibin and my coauthors explored whether it is also possible to recover high-level types via learning. There is a rich history of binary analysis work in the area of type inference, but this work generally focuses on syntactic types, such as struct {float; float}. These type inference algorithms are generally already built into decompilers. In our paper, we try to recover semantic types, such as struct {float x; float y} point which includes the type and field names, which are more valuable to a reverse engineer. It turns out that we can recover semantic types even more accurately than variable names. This is in part because types are constrained by the way in which they are used. For example, an int can't be confused with a char because they are different sizes.

Best Paper Award: Learning to Superoptimize Real-world Programs

Alex Shypula, Pengcheng Yin, Jeremy Lacomis, Claire Le Goues, Edward Schwartz, and Graham Neubig. Learning to Superoptimize Real-world Programs, (Arxiv) (PDF) Proceedings of the 2022 Deep Learning for Code Workshop at the International Conference on Learning Representations. Received best paper award.

In this paper, Alex and our co-authors investigate whether neural models are able to learn and improve on optimizations at the assembly code level by looking at unoptimized and optimized code pairings that are generated from an optimizing compiler. The short answer is that they can, and by employing reinforcement learning on top, can learn to outperform an optimizing compiler in some cases! Superoptimization is an interesting problem in its own right, but what really excites me about this paper is it demonstrates that neural models can learn very complex optimizations such as register allocation just by looking at the textual representation of assembly code. The optimizations the model can perform clearly indicate that the model is learning a substantial portion of x86 assembly code semantics merely by looking at examples. To me, this clearly signals that, with the right data, neural models are likely able to solve many binary analysis problems. I look forward to future work in which we combine traditional binary analysis techniques, such as explicit semantic decodings of instructions, with neural learning.

I'm happy to announce that a paper written with my colleagues, A Generic Technique for Automatically Finding Defense-Aware Code Reuse Attacks, will be published at ACM CCS 2020. This paper is based on some ideas I had while I was finishing my degree that I did not have time to finish. Fortunately, I was able to find time to work on it again at SEI, and this paper is the result. A pre-publication copy is available from the publications page.

Edward J. SchwartzComputer Security Researcher3 min. read

One of the aspects I like about my job as a researcher at SEI is that I split my time between being an academic researcher and a software engineer. For example, a few years ago, my colleagues and I published a paper on how to employ logic programming to recover object-oriented abstractions from an executable. I was very proud of that paper (and still am!) because it introduces some interesting new approaches to performing binary analysis, and yet is also a usable tool (OOAnalyzer) that people are actually using to reverse engineer real software.

Although basing our system on logic programming made for an interesting paper, it also made it challenging to scale OOAnalyzer to very large and complex executables. We strugged from the beginning to achieve reasonable performance, and eventually concluded that we had to aggressively use tabling in order to achieve reasonable performance. At a high level, tabling is memoization for Prolog. Prolog does have side-effects, though, and part of the challenge for tabling implementations is to make sure that the tables are consistent.

One of the leading Prolog implementations with tabling support is XSB Prolog, which OOAnalyzer has been using since the beginning. Teri Swift, one of the XSB developers, and the leading expert on Prolog tabling, really helped us get enough performance to be able to write our paper. We were able to analyze some large and interesting programs, including mysql.exe, the MySQL client program. The largest program we were able to test in our paper was over 5 MiB.

Unfortunately, there were many programs that we weren't able to test successfully as well. One of these that really stuck in my mind was mysqld.exe. In our paper, we were able to run successfully on mysql.exe, the MySQL client program, but mysqld.exe, the daemon was much larger (22 MiB vs 5 MiB), and remained elusive.

A while after we published our paper, we started working again with Teri Swift, who was helping Jan Wielemaker of SWI Prolog add support for tabling to SWI Prolog. Teri and Jan wanted to use OOAnalyzer as a test case to make sure that SWI Prolog's tabling implementation was working correctly. Since this would mean that OOAnalyzer would have another supported Prolog engine, this was a win-win situation for us! Jan graciously helped us get OOAnalyzer running on SWI, which was useful in its own right. He also developed some invaluable debugging tools to help us understand what tabling is doing behind the scenes.

These tools, combined with a lot of profiling and rewriting iterations, eventually allowed us to improve performance bottlenecks to the point that we could successfully reason through mysqld.exe and other similarly large programs! Along the way, we battled performance bottlenecks, memory that would not be reclaimed, and other problems. But now we can analyze mysqld.exe in about 18 hours, and only using about 12 GiB of memory. And what is encouraging is that even near the end of reasoning about such a large program, the tool is still making new inferences fairly quickly.

These updates are available in the latest release of the Pharos repository. My colleague Cory Cohen recently wrote a tutorial on how to run OOAnalyzer on large executables, with all of the tricks we have learned over the years.

While working with Teri and Jan, I very slowly began to understand some of the underlying problems that would cause performance to degrade in large programs. One of the problems was that, to maintain consistency, the tabling implementation would re-compute tables from scratch. For most tables, this could be done quickly. But for others, it would take up huge amounts of time. I also realized that because of the structure of our problem, which incrementally learns new conclusions over time, the recomputations were not really needed.

I created a work-around in Prolog that I call "trigger rules". I rewrote the most expensive OOAnalyzer rules to be triggered when dependent facts were introduced, which avoided the costly (and unnecessary) recomputation of the table for all values. The conversion process is fragile and manual, but it proved that if we could avoid the costly recomputations, OOAnalyzer's performance would scale pretty well.

More recently, we've been working with Jan on some a tabling method that avoids the costly recomputations (and also avoids the fragile, error-prone manual transformations that I needed to make for trigger rules). The new type of tabling is called monotonic tabling. There are still some kinks to work out, but we are optimistic that this will further improve the performance of OOAnalyzer and make it easier to maintain.

And of course, monotonic tabling will probably make for an interesting subject of a new paper. So we've now come full circle. We wrote the research paper on OOAnalyzer. This led us to engineering challenges. And to solve the engineering challenges, it brought us back to a new research problem.

Edward J. SchwartzComputer Security Researcher7 min. read

I'm a long time user of weechat. I don't really use IRC anymore, but I use it to connect to bitlbee for IMs, slack, and so on.

I really like weechat. But for a long time, I've had strange screen corruption problems, such as in the second to last line of:

weechat bug
weechat bug

That's a mild example. Sometimes weechat is all but unusable. Now, I can press Ctrl+L to redraw the screen and make the corruption go away. But it really bugged me one day, so I started investigating.

It took me a while, but eventually I realized that running weechat with a fresh configuration did not exhibit the problem. From there, it was straight-forward (but annoying) to isolate the problematic configuration option. It turned out to be an option called eat_newline_glitch.

According to the weechat user manual:

if set, the eat_newline_glitch will be set to 0; this is used to not add new line char at end of each line, and then not break text when you copy/paste text from WeeChat to another application (this option is disabled by default because it can cause serious display bugs)

If eat_newline_glitch is not turned on, then links to wrap from one line to another are not properly detected by gnome-terminal. It will detect the portion on the first line as a link only. In the modern days of long, automatically generated URLs, this is really annoying to say the least.

Anyway, at some point I must have turned eat_newline_glitch on and forgot about it, and never associated it with the screen corruption. I could have just turned the option off, but I obsess over things like this like to know why things don't work. ๐Ÿ˜€

The first problem was being able to replicate the problem on demand. I eventually realized that the problem seemed to be related to lines that are exactly the width of the terminal. After some fighting with weechat scripting, I managed to write the following script which prints a command-line that reliably triggers the problem for me:

#!/usr/bin/env python

import subprocess
import time

prefixlen = len("13:52:08 | ")

collen = int(subprocess.check_output("tput cols", shell=True))
n = collen - prefixlen

s = "A" * n

commands = []
commands.append("/set weechat.look.eat_newline_glitch on")
commands.append("/bar hide buflist")
commands.append("/print %s" % s)
commands.append("/print %s" % s)
for _ in range(20):
    commands.append("/exec -buffer new echo This text should not be visible in buffer 1")

commands.append("/buffer exec.new")
commands.append("/wait 1 /buffer 1")

print ("weechat -d /tmp -r '%s'" % "; ".join(commands))

The problem is pretty easy to spot. If you're in buffer 1 and you see "This text should not be visible in buffer 1", the bug was present! Here's an example:

screenshot of reproduced bug
screenshot of reproduced bug

Now that I could reliably replicate the problem, how to figure out what was going on? Weechat uses the ncurses library to write the UI to the screen. And interestingly, eat_newline_glitch corresponds to a capability name in terminfo, which is used by ncurses to draw the screen. The official description is not very helpful:

Newline ignored after 80 columns (Concept)

After perusing the ncurses source code and documentation, I found that it has a trace mode. Since ncurses maintains an internal representation of the terminal, it can print out what it thinks the screen should look like.

newscr[ 0]  -1 -1 ='WeeChat 2.9-dev (C) 2003-2020 - https://weechat.org/                                                                                                                                         '
colors[ 0]        ='222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222'
newscr[ 1]  -1 -1 ='11:11:30 |   ___       __         ______________        _____                                                                                                                                '
colors[ 1]        ='11611611177kkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkk                                                                                                                               '
newscr[ 2]  -1 -1 ='11:11:30 |   __ |     / /___________  ____/__  /_______ __  /_                                                                                                                               '
colors[ 2]        ='11611611177kkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkk                                                                                                                               '
newscr[ 3]  -1 -1 ='11:11:30 |   __ | /| / /_  _ \  _ \  /    __  __ \  __ `/  __/                                                                                                                               '
colors[ 3]        ='11611611177kkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkk                                                                                                                               '
newscr[ 4]  -1 -1 ='11:11:30 |   __ |/ |/ / /  __/  __/ /___  _  / / / /_/ // /_                                                                                                                                 '
colors[ 4]        ='11611611177kkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkk                                                                                                                               '
newscr[ 5]  -1 -1 ='11:11:30 |   ____/|__/  \___/\___/\____/  /_/ /_/\__,_/ \__/                                                                                                                                 '
colors[ 5]        ='11611611177kkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkk                                                                                                                               '
newscr[ 6]  -1 -1 ='11:11:30 | WeeChat 2.9-dev [compiled on Apr  1 2020 10:32:22]                                                                                                                                '
colors[ 6]        ='1161161117799999999999999997888888888888888888888888888888887                                                                                                                                '
newscr[ 7]  -1 -1 ='11:11:30 | - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -                                                                                                                   '
colors[ 7]        ='11611611177111111111111111111111111111111111111111111111111111111111111111                                                                                                                   '
newscr[ 8]  -1 -1 ='11:11:30 | Plugins loaded: alias, buflist, charset, exec, fifo, fset, irc, logger, python, relay, script, trigger, xfer                                                                      '
colors[ 8]        ='11611611177111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111                                                                      '
newscr[ 9]  -1 -1 ='11:11:30 | WARNING: this option can cause serious display bugs, if you have such problems, you must turn off this option.                                                                    '
colors[ 9]        ='1161161117711111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111                                                                    '
newscr[10]  -1 -1 ='11:11:30 | Option changed: weechat.look.eat_newline_glitch = on  (default: off)                                                                                                              '
colors[10]        ='1161161117711111111111111111111111111111111111111111111111777887771111111118887                                                                                                              '
newscr[11]  -1 -1 ='11:11:30 | AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'
colors[11]        ='116116111771111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111'
newscr[12]  -1 -1 ='11:11:30 | AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'
colors[12]        ='116116111771111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111'
newscr[13]  -1 -1 ='                                                                                                                                                                                             '
newscr[14]  -1 -1 ='                                                                                                                                                                                             '
colors[14]        ='111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111'
newscr[15]  -1 -1 ='                                                                                                                                                                                             '
colors[15]        ='111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111'
newscr[16]  -1 -1 ='                                                                                                                                                                                             '
colors[16]        ='111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111'
newscr[17]  -1 -1 ='                                                                                                                                                                                             '
colors[17]        ='111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111'
newscr[18]  -1 -1 ='                                                                                                                                                                                             '
colors[18]        ='111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111'
newscr[19]  -1 -1 ='                                                                                                                                                                                             '
colors[19]        ='111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111'
newscr[20]  -1 -1 ='                                                                                                                                                                                             '
colors[20]        ='111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111'
newscr[21]  -1 -1 ='                                                                                                                                                                                             '
colors[21]        ='111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111'
newscr[22]  -1 -1 ='                                                                                                                                                                                             '
colors[22]        ='111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111'
newscr[23]  -1 -1 ='                                                                                                                                                                                             '
colors[23]        ='111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111'
newscr[24]  -1 -1 ='                                                                                                                                                                                             '
colors[24]        ='111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111'
newscr[25]  -1 -1 ='                                                                                                                                                                                             '
colors[25]        ='111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111'
newscr[26]  -1 -1 ='                                                                                                                                                                                             '
colors[26]        ='111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111'
newscr[27]  -1 -1 ='                                                                                                                                                                                             '
colors[27]        ='111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111'
newscr[28]  -1 -1 ='                                                                                                                                                                                             '
colors[28]        ='111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111'
newscr[29]  -1 -1 ='                                                                                                                                                                                             '
colors[29]        ='111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111'
newscr[30]  -1 -1 ='                                                                                                                                                                                             '
colors[30]        ='111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111'
newscr[31]  -1 -1 ='                                                                                                                                                                                             '
colors[31]        ='111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111'
newscr[32]  -1 -1 ='                                                                                                                                                                                             '
colors[32]        ='111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111'
newscr[33]   5  5 ='[11:14] [2] [core] 1:weechat                                                                                                                                                                 '
colors[33]        ='322222323232322223243555555522222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222'
newscr[34]  -1 -1 ='                                                                                                                                                                                             '
colors[34]        ='111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111'

Ncurses' internal representation of the screen does not show the "This text should not be visible" messages, which implies that something ncurses is doing to update the screen is not having the desired effect.

After digging through the ncurses code some more, I found a very useful define called POSITION_DEBUG:

Enable checking to see if doupdate and friends are tracking the true cursor position correctly. NOTE: this is a debugging hack which will work ONLY on ANSI-compatible terminals!

Since eat_newline_glitch was related to the cursor position, this sounded promising! And indeed, with this option turned on, the trace log contained the following messages:

position seen (10, 188) doesn't match expected one (11, 0) in wrap_cursor
position seen (11, 188) doesn't match expected one (12, 0) in wrap_cursor

Not only does it tell us that something is wrong, but also the function it is in, wrap_cursor (with some debug code removed):

/*
 * Wrap the cursor position, i.e., advance to the beginning of the next line.
 */
static void
wrap_cursor(NCURSES_SP_DCL0)
{
    if (eat_newline_glitch) {
	/*
	 * xenl can manifest two different ways.  The vt100 way is that, when
	 * you'd expect the cursor to wrap, it stays hung at the right margin
	 * (on top of the character just emitted) and doesn't wrap until the
	 * *next* graphic char is emitted.  The c100 way is to ignore LF
	 * received just after an am wrap.
	 *
	 * An aggressive way to handle this would be to emit CR/LF after the
	 * char and then assume the wrap is done, you're on the first position
	 * of the next line, and the terminal out of its weird state.  Here
	 * it's safe to just tell the code that the cursor is in hyperspace and
	 * let the next mvcur() call straighten things out.
	 */
	SP_PARM->_curscol = -1;
	SP_PARM->_cursrow = -1;
    } else if (auto_right_margin) {
	SP_PARM->_curscol = 0;
	SP_PARM->_cursrow++;
	/*
	 * We've actually moved - but may have to work around problems with
	 * video attributes not working.
	 */
	if (!move_standout_mode && AttrOf(SCREEN_ATTRS(SP_PARM))) {
	    VIDPUTS(SP_PARM, A_NORMAL, 0);
    } else {
    	SP_PARM->_curscol--;
    }
    position_check(NCURSES_SP_ARGx
		   SP_PARM->_cursrow,
		   SP_PARM->_curscol,
		   "wrap_cursor");
}

The explanation of xenl (the capability name for eat_newline_glitch) is the best I've ever seen. Basically, it means that the cursor's location is unknown as soon as the right-most column is output. Ncurses handles this by marking the cursor location as unknown, which then forces a conservative mechanism to reset the cursor location the next time it is moved. OK, that makes sense, but then why are we getting screen corruption when we set eat_newline_glitch in weechat?

Because setting weechat's eat_newline_glitch to x sets ncurses' eat_newline_glitch to !x. I'm embarrassed to admit this took me quite a long time to notice.

Ok, so to refresh, I set eat_newline_glitch to true in weechat, which sets it to false in ncurses. So when we execute wrap_cursor, we take the branch for else if (auto_right_margin), which assumes that the terminal moves the cursor to the left-most column of the next row. Since this is not what gnome-terminal does, screen corruption results!

So where is the bug? Weechat surely has a bug, in that screen corruption can occur when you set eat_newline_glitch. But they do warn you about this in a very vague way. In my opinion, ncurses does not have a bug. My terminal has weird behavior when reaching the right-most column, and thus it is no surprise that when we effectively tell ncurses the glitch does not exist, it results in screen corruption.

That being said, I think there is some room for improvement. The problem with eat_newline_glitch is that the behavior is ambiguous by definition; it describes two types of behaviors. If ncurses knew the actual behavior of the terminal, it wouldn't need to manually move the cursor or insert a line break, which wouldn't interfere with the terminal's ability to detect a wrapped URL. So one solution would be to create two capabilities based on the two different behaviors that comprise eat_newline_glitch.

Another option would be to query for the cursor's location the first time wrap_cursor is invoked, and use the result to predict the cursor location in the future.

Anyway, that is my long trip down the rabbit hole of multi-line URLs. Unlike most trips of this nature, this one did not have a very satisfying ending. Hopefully I can save at least one other person from going down the same rabbit hole.

My colleagues and I wrote a survey paper on fuzzing entitled The Art, Science, and Engineering of Fuzzing: A Survey. Late last year this was accepted to appear in the IEEE Transactions on Software Engineering journal, but has not been officially published to date. We also recently learned that it will appear at ICSE 2020. As usual, you can download it from the publications page.

Powered with by Gatsby 5.0