Last December, I did most of Advent of Code in Rust, which I had never used before. You can find my solutions here.
I tend to program functionally, perhaps even excessively so. I try to express
most concepts through map
, filter
, and fold
. I tend to enjoy languages
that make this easy. Fortunately, this is becoming the norm, even in
non-functional languages such as Python, Java and C++.
Perhaps it is not too surprising then that Rust, as a new language, supports this style of programming as well:
let x: i32 = (1..42).map(|x| x+1).sum();
println!("x: {x}");
What is truly amazing about Rust though is how this function code is
compiled to x86-64. At optimiation level 1, the computation of x
evaluates to
mov dword ptr [rsp + 4], 902
lea rax, [rsp + 4]
Yes, the compiler is able to unfold and simplify the entire computation, which is pretty neat. But let's look at the code at optimization level 0:
mov ecx, 1
xor eax, eax
mov dl, 1
.LBB5_1:
.Ltmp27:
movzx edx, dl
and edx, 1
add edx, ecx
.Ltmp28:
add eax, ecx
inc eax
mov ecx, edx
.Ltmp29:
cmp edx, 42
setb dl
.Ltmp30:
jb .LBB5_1
.Ltmp31:
sub rsp, 72
mov dword ptr [rsp + 4], eax
lea rax, [rsp + 4]
So our functional computation of a range, a map, and a sum (which is a reduce
)
is compiled into a pretty simple loop. And keep in mind this is at optimization
level 0.
By contrast, let's take a look at how OCaml handles this. First, the included OCaml standard library is not so great, so writing the program is more awkward:
let r = List.init 42 (fun x -> x + 1) in
let x = List.map (fun x -> x+1) r in
let x = List.fold_left (+) 0 x in
Printf.printf "x: %x\n" x
But let's look at the assembly with aggressive optimizations:
camlExample__entry:
leaq -328(%rsp), %r10
cmpq 32(%r14), %r10
jb .L122
.L123:
subq $8, %rsp
.L121:
movl $85, %ebx
movl $5, %eax
call camlExample__init_aux_432@PLT
.L124:
call caml_alloc2@PLT
.L125:
leaq 8(%r15), %rsi
movq $2048, -8(%rsi)
movq $5, (%rsi)
movq %rax, 8(%rsi)
movq camlExample__Pmakeblock_arg_247@GOTPCREL(%rip), %rdi
movq %rsp, %rbp
movq 56(%r14), %rsp
call caml_initialize@PLT
movq %rbp, %rsp
movq camlExample__Pmakeblock_arg_247@GOTPCREL(%rip), %rax
movq (%rax), %rax
call camlExample__map_503@PLT
.L126:
call caml_alloc2@PLT
.L127:
leaq 8(%r15), %rsi
movq $2048, -8(%rsi)
movq $5, (%rsi)
movq %rax, 8(%rsi)
movq camlExample__x_77@GOTPCREL(%rip), %rdi
movq %rsp, %rbp
movq 56(%r14), %rsp
call caml_initialize@PLT
movq %rbp, %rsp
movq camlExample__x_77@GOTPCREL(%rip), %rax
movq (%rax), %rax
movq 8(%rax), %rbx
movl $5, %eax
call camlExample__fold_left_558@PLT
.L128:
movq camlExample__x_75@GOTPCREL(%rip), %rdi
movq %rax, %rsi
movq %rsp, %rbp
movq 56(%r14), %rsp
call caml_initialize@PLT
movq %rbp, %rsp
movq camlExample__const_block_49@GOTPCREL(%rip), %rdi
movq camlExample__Pmakeblock_637@GOTPCREL(%rip), %rbx
movq camlStdlib__Printf__anon_fn$5bprintf$2eml$3a20$2c14$2d$2d48$5d_409_closure@GOTPCREL(%rip), %rax
call camlCamlinternalFormat__make_printf_4967@PLT
.L129:
movq camlExample__full_apply_240@GOTPCREL(%rip), %rdi
movq %rax, %rsi
movq %rsp, %rbp
movq 56(%r14), %rsp
call caml_initialize@PLT
movq %rbp, %rsp
movq camlExample__full_apply_240@GOTPCREL(%rip), %rax
movq (%rax), %rbx
movq camlExample__x_75@GOTPCREL(%rip), %rax
movq (%rax), %rax
movq (%rbx), %rdi
call *%rdi
.L130:
movl $1, %eax
addq $8, %rsp
ret
.L122:
push $34
call caml_call_realloc_stack@PLT
popq %r10
jmp .L123
In general, the Rust compiler's error messages are quite helpful. This is important, because dealing (fighting) with the borrow checker is a frequence occurrence. Unfortunately, there are some cases that, despite a lot of effort, I still don't really understand.
Here is a problem that I posted on stack overflow. It's a bit contrived, but it happened because I had a very functional solution to part 1 of an Advent of Code problem. The easiest way to solve the second part was to add a mutation.
Here is the program:
fn main(){
let v1=vec![1];
let v2=vec![3];
let mut v3=vec![];
v1.iter().map(|x|{
v2.iter().map(|y|{
v3.push(*y);
})
});
}
And here is the error:
error: captured variable cannot escape `FnMut` closure body
--> src/main.rs:6:5
|
4 | let mut v3=vec![];
| ------ variable defined here
5 | v1.iter().map(|x|{
| - inferred to be a `FnMut` closure
6 | / v2.iter().map(|y|{
7 | | v3.push(*y);
| | -- variable captured here
8 | | })
| |______^ returns a reference to a captured variable which escapes the closure body
The suggestions I received on stack overflow were basically "use loops". This was very disappointing for an example where the closures' scopes are clearly limited.
Anyway, it's still early days for Rust, so I hope that problems like this will be improved over time. Overall, it seems like a great language for doing systems development, but I still think a garbage collected language is better for daily driving.
Right before the holidays, I, along with my co-authors of the journal article The Art, Science, and Engineering of Fuzzing: A Survey, received an early holiday present!
Congratulations!
On behalf of Vice President for Publications, David Ebert, I am writing to inform you that your paper, "The Art, Science, and Engineering of Fuzzing: A Survey," has been awarded the 2021 Best Paper Award from IEEE Transactions on Software Engineering by the IEEE Computer Society Publications Board.
This was quite unexpected, as our article was accepted back in 2019 -- four years ago! But it only "appeared" in the November 2021 editions of the journal.
You can access this article here or, as always, on my publications page.
It's been an exciting year so far. I'm happy to announce that two papers I co-authored received awards. Congratulations to the students who did all the heavy lifting -- Jeremy, Qibin, and Alex!
Qibin Chen, Jeremy Lacomis, Edward J. Schwartz, Claire Le Goues, Graham Neubig, and Bogdan Vasilescu. Augmenting Decompiler Output with Learned Variable Names and Types, (PDF) Proceedings of the 2022 USENIX Security Symposium. Received distinguished paper award.
This paper follows up on some of our earlier work in which we show how to improve decompiler. Decompiler output is often substantially more readable compared to the lower-level alternative of reading disassembly code. But decompiler output still has a lot of shortcomings when it comes to information that is removed during the compilation process, such as variable names and type information. In our previous work, we showed that it is possible to recover meaningful variable names by learning appropriate variable names based on the context of the surrounding code.
In the new paper, Jeremy, Qibin and my coauthors explored whether it
is also possible to recover high-level types via learning. There is a
rich history of binary analysis work in the area of type inference,
but this work generally focuses on syntactic types, such as struct
{float; float}
. These type inference algorithms are generally already
built into decompilers. In our paper, we try to recover semantic
types, such as struct {float x; float y} point
which includes the
type and field names, which are more valuable to a reverse engineer.
It turns out that we can recover semantic types even more accurately
than variable names. This is in part because types are constrained by
the way in which they are used. For example, an int
can't be
confused with a char
because they are different sizes.
Alex Shypula, Pengcheng Yin, Jeremy Lacomis, Claire Le Goues, Edward Schwartz, and Graham Neubig. Learning to Superoptimize Real-world Programs, (Arxiv) (PDF) Proceedings of the 2022 Deep Learning for Code Workshop at the International Conference on Learning Representations. Received best paper award.
In this paper, Alex and our co-authors investigate whether neural models are able to learn and improve on optimizations at the assembly code level by looking at unoptimized and optimized code pairings that are generated from an optimizing compiler. The short answer is that they can, and by employing reinforcement learning on top, can learn to outperform an optimizing compiler in some cases! Superoptimization is an interesting problem in its own right, but what really excites me about this paper is it demonstrates that neural models can learn very complex optimizations such as register allocation just by looking at the textual representation of assembly code. The optimizations the model can perform clearly indicate that the model is learning a substantial portion of x86 assembly code semantics merely by looking at examples. To me, this clearly signals that, with the right data, neural models are likely able to solve many binary analysis problems. I look forward to future work in which we combine traditional binary analysis techniques, such as explicit semantic decodings of instructions, with neural learning.
I'm happy to announce that a paper written with my colleagues, A Generic Technique for Automatically Finding Defense-Aware Code Reuse Attacks, will be published at ACM CCS 2020. This paper is based on some ideas I had while I was finishing my degree that I did not have time to finish. Fortunately, I was able to find time to work on it again at SEI, and this paper is the result. A pre-publication copy is available from the publications page.
One of the aspects I like about my job as a researcher at SEI is that I split my time between being an academic researcher and a software engineer. For example, a few years ago, my colleagues and I published a paper on how to employ logic programming to recover object-oriented abstractions from an executable. I was very proud of that paper (and still am!) because it introduces some interesting new approaches to performing binary analysis, and yet is also a usable tool (OOAnalyzer) that people are actually using to reverse engineer real software.
Although basing our system on logic programming made for an interesting paper, it also made it challenging to scale OOAnalyzer to very large and complex executables. We strugged from the beginning to achieve reasonable performance, and eventually concluded that we had to aggressively use tabling in order to achieve reasonable performance. At a high level, tabling is memoization for Prolog. Prolog does have side-effects, though, and part of the challenge for tabling implementations is to make sure that the tables are consistent.
One of the leading Prolog implementations with tabling support is XSB Prolog, which OOAnalyzer has been using since the beginning. Teri Swift, one of the XSB developers, and the leading expert on Prolog tabling, really helped us get enough performance to be able to write our paper. We were able to analyze some large and interesting programs, including mysql.exe, the MySQL client program. The largest program we were able to test in our paper was over 5 MiB.
Unfortunately, there were many programs that we weren't able to test successfully as well. One of these that really stuck in my mind was mysqld.exe. In our paper, we were able to run successfully on mysql.exe, the MySQL client program, but mysqld.exe, the daemon was much larger (22 MiB vs 5 MiB), and remained elusive.
A while after we published our paper, we started working again with Teri Swift, who was helping Jan Wielemaker of SWI Prolog add support for tabling to SWI Prolog. Teri and Jan wanted to use OOAnalyzer as a test case to make sure that SWI Prolog's tabling implementation was working correctly. Since this would mean that OOAnalyzer would have another supported Prolog engine, this was a win-win situation for us! Jan graciously helped us get OOAnalyzer running on SWI, which was useful in its own right. He also developed some invaluable debugging tools to help us understand what tabling is doing behind the scenes.
These tools, combined with a lot of profiling and rewriting iterations, eventually allowed us to improve performance bottlenecks to the point that we could successfully reason through mysqld.exe and other similarly large programs! Along the way, we battled performance bottlenecks, memory that would not be reclaimed, and other problems. But now we can analyze mysqld.exe in about 18 hours, and only using about 12 GiB of memory. And what is encouraging is that even near the end of reasoning about such a large program, the tool is still making new inferences fairly quickly.
These updates are available in the latest release of the Pharos repository. My colleague Cory Cohen recently wrote a tutorial on how to run OOAnalyzer on large executables, with all of the tricks we have learned over the years.
While working with Teri and Jan, I very slowly began to understand some of the underlying problems that would cause performance to degrade in large programs. One of the problems was that, to maintain consistency, the tabling implementation would re-compute tables from scratch. For most tables, this could be done quickly. But for others, it would take up huge amounts of time. I also realized that because of the structure of our problem, which incrementally learns new conclusions over time, the recomputations were not really needed.
I created a work-around in Prolog that I call "trigger rules". I rewrote the most expensive OOAnalyzer rules to be triggered when dependent facts were introduced, which avoided the costly (and unnecessary) recomputation of the table for all values. The conversion process is fragile and manual, but it proved that if we could avoid the costly recomputations, OOAnalyzer's performance would scale pretty well.
More recently, we've been working with Jan on some a tabling method that avoids the costly recomputations (and also avoids the fragile, error-prone manual transformations that I needed to make for trigger rules). The new type of tabling is called monotonic tabling. There are still some kinks to work out, but we are optimistic that this will further improve the performance of OOAnalyzer and make it easier to maintain.
And of course, monotonic tabling will probably make for an interesting subject of a new paper. So we've now come full circle. We wrote the research paper on OOAnalyzer. This led us to engineering challenges. And to solve the engineering challenges, it brought us back to a new research problem.
I'm a long time user of weechat. I don't really use IRC anymore, but I use it to connect to bitlbee for IMs, slack, and so on.
I really like weechat. But for a long time, I've had strange screen corruption problems, such as in the second to last line of:
That's a mild example. Sometimes weechat is all but unusable. Now, I can press Ctrl+L to redraw the screen and make the corruption go away. But it really bugged me one day, so I started investigating.
It took me a while, but eventually I realized that running weechat with a fresh configuration did not exhibit the problem. From there, it was straight-forward (but annoying) to isolate the problematic configuration option. It turned out to be an option called eat_newline_glitch.
According to the weechat user manual:
if set, the eat_newline_glitch will be set to 0; this is used to not add new line char at end of each line, and then not break text when you copy/paste text from WeeChat to another application (this option is disabled by default because it can cause serious display bugs)
If eat_newline_glitch is not turned on, then links to wrap from one line to another are not properly detected by gnome-terminal. It will detect the portion on the first line as a link only. In the modern days of long, automatically generated URLs, this is really annoying to say the least.
Anyway, at some point I must have turned eat_newline_glitch on and
forgot about it, and never associated it with the screen corruption.
I could have just turned the option off, but I obsess over things
like this like to know why things don't work. 😀
The first problem was being able to replicate the problem on demand. I eventually realized that the problem seemed to be related to lines that are exactly the width of the terminal. After some fighting with weechat scripting, I managed to write the following script which prints a command-line that reliably triggers the problem for me:
#!/usr/bin/env python
import subprocess
import time
prefixlen = len("13:52:08 | ")
collen = int(subprocess.check_output("tput cols", shell=True))
n = collen - prefixlen
s = "A" * n
commands = []
commands.append("/set weechat.look.eat_newline_glitch on")
commands.append("/bar hide buflist")
commands.append("/print %s" % s)
commands.append("/print %s" % s)
for _ in range(20):
commands.append("/exec -buffer new echo This text should not be visible in buffer 1")
commands.append("/buffer exec.new")
commands.append("/wait 1 /buffer 1")
print ("weechat -d /tmp -r '%s'" % "; ".join(commands))
The problem is pretty easy to spot. If you're in buffer 1 and you see "This text should not be visible in buffer 1", the bug was present! Here's an example:
Now that I could reliably replicate the problem, how to figure out what was going on? Weechat uses the ncurses library to write the UI to the screen. And interestingly, eat_newline_glitch corresponds to a capability name in terminfo, which is used by ncurses to draw the screen. The official description is not very helpful:
Newline ignored after 80 columns (Concept)
After perusing the ncurses source code and documentation, I found that it has a trace mode. Since ncurses maintains an internal representation of the terminal, it can print out what it thinks the screen should look like.
newscr[ 0] -1 -1 ='WeeChat 2.9-dev (C) 2003-2020 - https://weechat.org/ '
colors[ 0] ='222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222'
newscr[ 1] -1 -1 ='11:11:30 | ___ __ ______________ _____ '
colors[ 1] ='11611611177kkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkk '
newscr[ 2] -1 -1 ='11:11:30 | __ | / /___________ ____/__ /_______ __ /_ '
colors[ 2] ='11611611177kkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkk '
newscr[ 3] -1 -1 ='11:11:30 | __ | /| / /_ _ \ _ \ / __ __ \ __ `/ __/ '
colors[ 3] ='11611611177kkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkk '
newscr[ 4] -1 -1 ='11:11:30 | __ |/ |/ / / __/ __/ /___ _ / / / /_/ // /_ '
colors[ 4] ='11611611177kkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkk '
newscr[ 5] -1 -1 ='11:11:30 | ____/|__/ \___/\___/\____/ /_/ /_/\__,_/ \__/ '
colors[ 5] ='11611611177kkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkk '
newscr[ 6] -1 -1 ='11:11:30 | WeeChat 2.9-dev [compiled on Apr 1 2020 10:32:22] '
colors[ 6] ='1161161117799999999999999997888888888888888888888888888888887 '
newscr[ 7] -1 -1 ='11:11:30 | - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - '
colors[ 7] ='11611611177111111111111111111111111111111111111111111111111111111111111111 '
newscr[ 8] -1 -1 ='11:11:30 | Plugins loaded: alias, buflist, charset, exec, fifo, fset, irc, logger, python, relay, script, trigger, xfer '
colors[ 8] ='11611611177111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111 '
newscr[ 9] -1 -1 ='11:11:30 | WARNING: this option can cause serious display bugs, if you have such problems, you must turn off this option. '
colors[ 9] ='1161161117711111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111 '
newscr[10] -1 -1 ='11:11:30 | Option changed: weechat.look.eat_newline_glitch = on (default: off) '
colors[10] ='1161161117711111111111111111111111111111111111111111111111777887771111111118887 '
newscr[11] -1 -1 ='11:11:30 | AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'
colors[11] ='116116111771111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111'
newscr[12] -1 -1 ='11:11:30 | AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'
colors[12] ='116116111771111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111'
newscr[13] -1 -1 =' '
newscr[14] -1 -1 =' '
colors[14] ='111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111'
newscr[15] -1 -1 =' '
colors[15] ='111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111'
newscr[16] -1 -1 =' '
colors[16] ='111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111'
newscr[17] -1 -1 =' '
colors[17] ='111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111'
newscr[18] -1 -1 =' '
colors[18] ='111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111'
newscr[19] -1 -1 =' '
colors[19] ='111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111'
newscr[20] -1 -1 =' '
colors[20] ='111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111'
newscr[21] -1 -1 =' '
colors[21] ='111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111'
newscr[22] -1 -1 =' '
colors[22] ='111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111'
newscr[23] -1 -1 =' '
colors[23] ='111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111'
newscr[24] -1 -1 =' '
colors[24] ='111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111'
newscr[25] -1 -1 =' '
colors[25] ='111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111'
newscr[26] -1 -1 =' '
colors[26] ='111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111'
newscr[27] -1 -1 =' '
colors[27] ='111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111'
newscr[28] -1 -1 =' '
colors[28] ='111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111'
newscr[29] -1 -1 =' '
colors[29] ='111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111'
newscr[30] -1 -1 =' '
colors[30] ='111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111'
newscr[31] -1 -1 =' '
colors[31] ='111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111'
newscr[32] -1 -1 =' '
colors[32] ='111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111'
newscr[33] 5 5 ='[11:14] [2] [core] 1:weechat '
colors[33] ='322222323232322223243555555522222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222'
newscr[34] -1 -1 =' '
colors[34] ='111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111'
Ncurses' internal representation of the screen does not show the "This text should not be visible" messages, which implies that something ncurses is doing to update the screen is not having the desired effect.
After digging through the ncurses code some more, I found a very useful define called POSITION_DEBUG:
Enable checking to see if doupdate and friends are tracking the true cursor position correctly. NOTE: this is a debugging hack which will work ONLY on ANSI-compatible terminals!
Since eat_newline_glitch was related to the cursor position, this sounded promising! And indeed, with this option turned on, the trace log contained the following messages:
position seen (10, 188) doesn't match expected one (11, 0) in wrap_cursor
position seen (11, 188) doesn't match expected one (12, 0) in wrap_cursor
Not only does it tell us that something is wrong, but also the function it is in, wrap_cursor (with some debug code removed):
/*
* Wrap the cursor position, i.e., advance to the beginning of the next line.
*/
static void
wrap_cursor(NCURSES_SP_DCL0)
{
if (eat_newline_glitch) {
/*
* xenl can manifest two different ways. The vt100 way is that, when
* you'd expect the cursor to wrap, it stays hung at the right margin
* (on top of the character just emitted) and doesn't wrap until the
* *next* graphic char is emitted. The c100 way is to ignore LF
* received just after an am wrap.
*
* An aggressive way to handle this would be to emit CR/LF after the
* char and then assume the wrap is done, you're on the first position
* of the next line, and the terminal out of its weird state. Here
* it's safe to just tell the code that the cursor is in hyperspace and
* let the next mvcur() call straighten things out.
*/
SP_PARM->_curscol = -1;
SP_PARM->_cursrow = -1;
} else if (auto_right_margin) {
SP_PARM->_curscol = 0;
SP_PARM->_cursrow++;
/*
* We've actually moved - but may have to work around problems with
* video attributes not working.
*/
if (!move_standout_mode && AttrOf(SCREEN_ATTRS(SP_PARM))) {
VIDPUTS(SP_PARM, A_NORMAL, 0);
} else {
SP_PARM->_curscol--;
}
position_check(NCURSES_SP_ARGx
SP_PARM->_cursrow,
SP_PARM->_curscol,
"wrap_cursor");
}
The explanation of xenl (the capability name for eat_newline_glitch) is the best I've ever seen. Basically, it means that the cursor's location is unknown as soon as the right-most column is output. Ncurses handles this by marking the cursor location as unknown, which then forces a conservative mechanism to reset the cursor location the next time it is moved. OK, that makes sense, but then why are we getting screen corruption when we set eat_newline_glitch in weechat?
Because setting weechat's eat_newline_glitch to x
sets ncurses'
eat_newline_glitch to !x
. I'm embarrassed to admit this took me
quite a long time to notice.
Ok, so to refresh, I set eat_newline_glitch to true
in weechat,
which sets it to false
in ncurses. So when we execute wrap_cursor,
we take the branch for else if (auto_right_margin)
, which assumes
that the terminal moves the cursor to the left-most column of the next
row. Since this is not what gnome-terminal does, screen corruption
results!
So where is the bug? Weechat surely has a bug, in that screen corruption can occur when you set eat_newline_glitch. But they do warn you about this in a very vague way. In my opinion, ncurses does not have a bug. My terminal has weird behavior when reaching the right-most column, and thus it is no surprise that when we effectively tell ncurses the glitch does not exist, it results in screen corruption.
That being said, I think there is some room for improvement. The problem with eat_newline_glitch is that the behavior is ambiguous by definition; it describes two types of behaviors. If ncurses knew the actual behavior of the terminal, it wouldn't need to manually move the cursor or insert a line break, which wouldn't interfere with the terminal's ability to detect a wrapped URL. So one solution would be to create two capabilities based on the two different behaviors that comprise eat_newline_glitch.
Another option would be to query for the cursor's location the first time wrap_cursor is invoked, and use the result to predict the cursor location in the future.
Anyway, that is my long trip down the rabbit hole of multi-line URLs. Unlike most trips of this nature, this one did not have a very satisfying ending. Hopefully I can save at least one other person from going down the same rabbit hole.
My colleagues and I wrote a survey paper on fuzzing entitled The Art, Science, and Engineering of Fuzzing: A Survey. Late last year this was accepted to appear in the IEEE Transactions on Software Engineering journal, but has not been officially published to date. We also recently learned that it will appear at ICSE 2020. As usual, you can download it from the publications page.
My colleagues and I finished the camera ready version of our DIRE paper on variable name recovery. Although there's no code released at this time, I'm hoping to release a proof of concept Hex-Rays plugin.
As far back as I can remember, one of the accepted dogmas of reverse engineering is that when a program is compiled, some information about the program is lost, and there is no way to recover it. Variable names are one of the most frequently cited casualties of this idea. In fact, this argument is used by countless authors in the introductions of their papers to explain some of the unique challenges that binary analysis has compared to source analysis.
You can imagine how shocking (and cool!) it was then when my colleagues and I found that it's possible to recover a large percentage of variable names. Our key insight is that the semantics of the binary code that accesses a variable is actually a surprisingly good signal for what the variable was named. In other words, we took a huge amount of source code from github, compiled it, and then decompiled it. We then trained a model to predict the variable names, which we can't see in an executable, based on the way that the code accesses those variables, which we can see in an executable. In Meaningful Variable Names for Decompiled Code: A Machine Translation Approach, we showed that this was possible using Statistical Machine Translation, which is one of the techniques used to translate between natural languages.
I'm happy to announce that our latest paper on this subject, DIRE: A Neural Approach to Decompiled Identifier Renaming, was accepted to ASE 2019. In this paper, we found that neural network models work really well for recovering variable names in decompiled code. I'll post our camera ready as soon as its finished.
I'll be speaking at the 2019 Central Pennsylvania Open Source Conference (CPOSC) in September. I attended CPOSC for the first time last year, and was very impressed by the quality of the talks. The name is a little misleading; talks are not necessarily related to open source software. I'll actually be giving a primer on code reuse attacks.
Many people don't know that in addition to my work with computers, I'm very passionate about exercise and fitness. I enjoy circuit training, running, soccer, and, the focus of this post, skiing. I've actually only been skiing seriously for a few years, but I like to ski challenging terrain. Sadly, the "mountains" where I live, in central Pennsylvania, leave much to be desired. So usually we go on at least one substantial ski trip per year. This year, for various reasons, my wife gave me the green light to go anywhere I wanted. After a fair amount of research, I chose Zermatt, Switzerland. We visited in early January 2019, for 6 days. (We also spent several days in Lucerne, which was quite nice too!) The point of this post is to share some of my thoughts and experiences. I find that ski resorts vary greatly (and especially so between continents!), so I find it helpful when people describe their experiences at different resorts.
The ski resort in Zermatt is called Matterhorn Paradise. Matterhorn Paradise is huge, and is split into three different regions: Sunnega/Rothorn, Gornergrat, and Schwarzee/Klein Matterhorn. Each region is named after the major mountains it comprises. For instance, the main peaks in the Sunnega/Rothorn are Sunnega and Rothorn. Each region also has its own base station in Zermatt. Sunnega/Rothorn is accessed by a funicular train; Gornergrat is accessed via a cog train; and Schwarzee/Klein Matterhorn is accessed by a combination of gondolas and cable cars.
Matterhorn Paradise is quite large. When fully open, they have 53 lifts and 143 marked runs.
Before I go further, it's time for a brief aside. Ski runs are classified as on piste or off piste, which is usually synonymous with groomed and ungroomed runs. Groomed runs are usually flattened each day by grooming machines and so they are relatively flat and easy to ski. Off piste runs generally "bump up" with large moguls, and can have a wide variety of snow conditions. In Zermatt, almost all runs are on piste, and are well groomed. The off piste runs are called itineraries. Although there is a lot of great skiable areas in the resort besides the itineraries, these are usually accompanied by signs saying there is no marked run, and the snow is not avalance controlled. Since rescue services are only available on marked runs, and I was skiing by myself, I didn't tempt fate! In large US resorts, off piste areas are quite different. They are truly areas. There are typically measured in acres, and are all controlled by the ski patrol. This is great for me, since I typically ski alone.
I really enjoyed the itineraries in Zermatt. The problem was that there weren't enough of them. On the best day, there were 11 itineraries open. And some of them shouldn't have been open -- you know the snow is bad when you find yourself heading towards the ice because you know how it's going to behave. But most of the itineraries were a lot of fun. For the most part, they were empty. I would get off a crowded cable car of skiers, take a turn at the entrance to an itinerary, and would watch as 20 skiers went past without even slowing down. More snow for me! I also had some close encounters with some type of mountain goats, too, which was neat. On other days, though, due to wind and lift closures, there were sometimes only one or two itineraries open. And it's pretty boring to ski the same thing over and over again.
Zermatt is well known for its free skiing, and I tried to book a guide while I was there. Unfortunately, I couldn't find one because the snow conditions were so bad. Bummer!
Finally, there's the on piste runs. There were a few steep, wide runs that I found enjoyable. But they didn't have very much sustained pitch, which is weird considering the resort is in the alps! Also, a lot of the runs are cat tracks that really aren't that enjoyable at all.
One of the cool things about Zermatt is that it's linked to Cervinia (when it's not too windy). Cervinia's terrain was (almost?) all on piste, but had softer snow and more sustained steep runs, so I actually preferred the on piste skiing there to Zermatt's. Also, it's cool to ski to Italy for lunch.
Although this isn't really the terrain, the view of the mountains adds a lot to the overall experience of Zermatt (and Cervinia). I kept stopping to appreciate the view at various points on runs. It's really a magical setting. Even though it wasn't the best skiing I've experienced, it was still quite an experience.
When the resort was mostly open, the quality of the lift infrastructure really shined. There were never any lines, and you could usually get a lift or gondola to yourself if you were willing to wait a minute or two. All lifts that I saw were high speed detachables. Going straight up or down the mountain is fairly easily. However, navigating laterally is a lot more challenging. I found the signage to be lacking, and even when you knew where to go, it took quite a long time. If you want to go from opposite ends of the resort, such as from Sunnega/Rothorn to Schwarzee/Klein Matterhorn, it can easily take an hour. In some cases, it’s easier to descend to Zermatt, take a (free) bus to the other valley station, and take the lift up.
However, this is a tale of two resorts. When Zermatt was mostly open, the lift infrastructure was outstanding. The majority of the days I was there, at least one major lift was not functioning. The most common reason for this is high winds, which shut down lifts to Rothorn, Schwarzee, and Klein Matterhorn. Especially when the Schwarzee lift is shut down, which is the primary way to access the majority of the Klein Matterhorn terrain, the resort shrinks dramatically, and the lifts on the other part of the mountain develop significant lift lines. In addition to longer lift lines, some terrain becomes inaccessible, which can reduce the variety of the terrain available.
Another frustrating aspect about Matterhorn Paradise is that it's fairly hard to tell what terrain is actually going to be open that day. On two days, parts of the mountain initially showed as open, but when I got there I discovered that they were closed. Since my second choice of terrain was all the way on the other side of the resort, this wasted more than an hour of ski time!
Don't start first thing in the morning. The mountains are so tall that visibility isn't good until 9:30 or 10AM anyway, so there's no point being on first chair unless you like skiing by feel only. Plus the runs won't be accurately marked as open until around that time anyway.
Food in Switzerland is expensive. Really, really expensive. Burgers cost about 20. Get to know the grocery stores, which are expensive but at least affordable! Or, you know, ski to Italy!
It's probably evident that I was a little disappointed in Zermatt. For me, and the conditions that were present, it just wasn't the best fit. That being said, if you're an intermediate skier and really like skiing on piste, it might be the place for you. Or if you're a more advanced skier and are present during conditions that permit more backcountry skiing, you also might have a better time than I did. In either case, the atmosphere of being in the Alps is great, even if the skiing wasn't the greatest (for me). But for my next trip, I'm probably going to stay closer to home.. maybe Whistler or Alta. We'll see!
Powered with by Gatsby 5.0