### **BAP: A Binary Analysis Platform**

David Brumley Ivan Jager Thanassis Avgerinos Edward J. Schwartz

**Carnegie Mellon University** 

## 3 Simple Lines

| add %eax , %ebx | ; ebx = eax+ebx ( sets OF, SF, ZF, AF, CF, PF)   |
|-----------------|--------------------------------------------------|
| shl %cl , %ebx  | ; ebx = ebx << cl ( sets OF, SF, ZF, AF, CF, PF) |
| jc error        | ; jump to error if carry flag is set             |

### Can you reach the error?

### **SHL** Specification

SAUSAR/SHUSHR-Shift

### INSTRUCTION SET REFERENCE, N-Z

INSTRUCTION SET REFERENCE, N-Z

### SAL/SAR/SHL/SHR-Shift

| Opcode***           | Instruction      | Op/<br>En | 64-Bit<br>Mode | Compat/<br>Leg Mode | Description                              |
|---------------------|------------------|-----------|----------------|---------------------|------------------------------------------|
| D0 /4               | SAL r/m8, 1      | А         | Valid          | Valid               | Multiply r/m8 by 2, once.                |
| REX + D0 /4         | SAL r/m8**, 1    | Α         | Valid          | N.E.                | Multiply r/m8 by 2, once.                |
| D2 /4               | SAL r/m8, CL     | в         | Valid          | Valid               | Multiply r/m8 by 2, CL times             |
| REX + D2 /4         | SAL r/m8**, CL   | в         | Valid          | N.E.                | Multiply r/m8 by 2, CL times             |
| C0 /4 ib            | SAL r/m8, imm8   | с         | Valid          | Valid               | Multiply r/m8 by 2, imm8 times.          |
| REX + CO /4 ib      | SAL r/m8**, imm8 | С         | Valid          | N.E.                | Multiply r/m8 by 2, imm8 times.          |
| D1 /4               | SAL r/m16, 1     | Α         | Valid          | Valid               | Multiply r/m16 by 2, once.               |
| D3 /4               | SAL r/m16, CL    | в         | Valid          | Valid               | Multiply r/m16 by 2, CL times.           |
| C1 /4 ib            | SAL r/m16, imm8  | с         | Valid          | Valid               | Multiply r/m16 by 2, imm8 times.         |
| D1 /4               | SAL r/m32, 1     | Α         | Valid          | Valid               | Multiply r/m32 by 2, once.               |
| REX.W + D1 /4       | SAL r/m64, 1     | Α         | Valid          | N.E.                | Multiply r/m64 by 2, once.               |
| D3 /4               | SAL r/m32, CL    | В         | Valid          | Valid               | Multiply r/m32 by 2, CL<br>times.        |
| REX.W + D3 /4       | SAL r/m64, CL    | в         | Valid          | N.E.                | Multiply r/m64 by 2, CL times.           |
| C1 /4 <i>ib</i>     | SAL r/m32, imm8  | с         | Valid          | Valid               | Multiply r/m32 by 2, imm8 times.         |
| REX.W + C1 /4<br>/b | SAL r/m64, imm8  | С         | Valid          | N.E.                | Multiply r/m64 by 2, imm8 times.         |
| DO /7               | SAR r/m8, 1      | A         | Valid          | Valid               | Signed divide* r/m8 by 2, once.          |
| REX * D0 /7         | SAR r/m8**, 1    | A         | Valid          | N.E.                | Signed divide* r/m8 by 2, once.          |
| D2 /7               | SAR r/m8, CL     | в         | Valid          | Valid               | Signed divide* r/m8 by 2, Cl<br>times.   |
| REX + D2 /7         | SAR r/m8**, CL   | в         | Valid          | N.E.                | Signed divide* r/m8 by 2, Cl<br>times.   |
| CO /7 <i>lb</i>     | SAR r/m8, imm8   | с         | Valid          | Valid               | Signed divide* r/m8 by 2,<br>imm8 time.  |
| REX * CO /7 ib      | SAR r/m8**, imm8 | с         | Valid          | N.E.                | Signed divide* r/m8 by 2,<br>imm8 times. |

| Opcode                | Instruction           | Op/<br>En | 64-Bit<br>Mode | Compat/<br>Leg Mode | Description                              |
|-----------------------|-----------------------|-----------|----------------|---------------------|------------------------------------------|
| D1 /7                 | SAR r/m16,1           | A         | Valid          | Valid               | Signed divide* r/m16 by 2<br>once.       |
| D3 /7                 | SAR r/m16, CL         | в         | Valid          | Valid               | Signed divide* r/m16 by 2<br>CL times.   |
| C1 /7 ib              | SAR r/m16, imm8       | с         | Valid          | Valid               | Signed divide* r/m16 by 2<br>imm8 times. |
| D1 /7                 | SAR r/m32, 1          | A         | Valid          | Valid               | Signed divide* r/m32 by 2<br>once.       |
| REX.W + D1 /7         | SAR r/m64, 1          | A         | Valid          | N.E.                | Signed divide* r/m64 by 2<br>once.       |
| D3 /7                 | SAR r/m32, CL         | в         | Valid          | Valid               | Signed divide* r/m32 by 2<br>CL times.   |
| REX.W + D3 /7         | SAR r/m64, CL         | В         | Valid          | N.E.                | Signed divide* r/m64 by 2<br>CL times.   |
| C1 /7 ib              | SAR r/m32, imm8       | с         | Valid          | Valid               | Signed divide* r/m32 by 2<br>imm8 times. |
| REX.W + C1 /7<br>ID   | SAR r/m64, imm8       | с         | Valid          | N.E.                | Signed divide* r/m64 by 2<br>imm8 times  |
| D0 /4                 | SHL r/m8, 1           | Α         | Valid          | Valid               | Multiply r/m8 by 2, once.                |
| REX + D0 /4           | SHL r/m8**, 1         | Α         | Valid          | N.E.                | Multiply r/m8 by 2, once.                |
| D2 /4                 | SHL r/m8, CL          | В         | Valid          | Valid               | Multiply r/m8 by 2, CL tim               |
| REX + D2 /4           | SHL r/m8**, CL        | в         | Valid          | N.E.                | Multiply r/m8 by 2, CL tim               |
| CO /4 <i>ib</i>       | SHL r/m8, imm8        | с         | Valid          | Valid               | Multiply r/m8 by 2, imm8 times.          |
| REX + CO /4 <i>ib</i> | SHL r/m8**, imm8      | с         | Valid          | N.E.                | Multiply r/m8 by 2, imm8 times.          |
| D1 /4                 | SHL r/m16,1           | Α         | Valid          | Valid               | Multiply r/m16 by 2, once                |
| D3 /4                 | SHL r/m16, CL         | В         | Valid          | Valid               | Multiply r/m16 by 2, CL times.           |
| C1 /4 ib              | SHL r/m16, imm8       | С         | Valid          | Valid               | Multiply r/m16 by 2, imml times.         |
| D1 /4                 | SHL r/m32,1           | Α         | Valid          | Valid               | Multiply r/m32 by 2, once                |
| REX.W + D1 /4         | SHL r/m64,1           | Α         | Valid          | N.E.                | Multiply r/m64 by 2, once                |
| D3 /4                 | SHL <i>r/m32</i> , CL | в         | Valid          | Valid               | Multiply r/m32 by 2, CL times.           |
| REX.W + D3 /4         | SHL r/m64, CL         | В         | Valid          | N.E.                | Multiply r/m64 by 2, CL times.           |

### SAL/SAR/SHL/SHR-Shift

Vol. 2B 4-353

### INSTRUCTION SET REFERENCE, N-Z

Vol. 28 4-357

The SAR and SHR instructions can be used to perform signed or unsigned division, respectively, of the destination operand by powers of 2. For example, using the SAR instruction to shift a signed integer 1 bit to the right divides the value by 2.

instruction to shift a signed integer 1 bit to the right divides the value by 2. Using the SAR instruction to perform a division operation deep not produce the same result as the IDV instruction. The quotient from the IDV instruction is rounded taward zero, whereas the "valued" if the SAR instruction is rounded taward regis-when the IDV instruction is used to divide -2 by 4, the result is -2 with a menunder of -1.1 If the SAR instruction is used to divide -2 by 4, the result is -2 with a menunder the "the result" is 1, however, the SAR instruction stores only the most significant bit of the remainder (in the CT Bag).

Do of the CP flag is a state of the CP has (-1 task). The CP flag is set to 0 if the most-significant bit of the result is the same as the CP flag (that is, the top two bits of the original generative result is is the same as the CP flag (that is, the top two bits of the original generative result is is the same set. The SR is not too, the OP flag is started for all 1-bit shifts. For the SRR instruction, the OP flag is started for all the significant bit of the original generative results are shown in the SRR instruction, the OP flag is started for all 1-bit shifts. For the SRR instruction, the OP flag is started for all the original generative results are shown in the original ge

In 64-bit mode, the instruction's default operation size is 32 bits and the mask width for CL is 5 bits, Using a REX prefix in the form of REX.R permits access to additional registers (R8-A1). Using a REX prefix in the form of REX.R permits access to additional 64-bits and sets the mask width for CL to 6 bits. See the summary chart at the begin-ring of this section for encoding data and limits.

### IA-32 Architecture Compatibility

The 8086 does not mask the shift count. However, all other 1/-32 processors (starting with the Intel 286 processor) do mask the shift count to 5 bits, resulting in a maximum count of 31. This masking is done in all operating modes (including the virtual-8086 mode) to reduce the maximum execution time of the instructions.

IF 64-Bit Mode and using REX.W THEN ountMASK ← 3FH; ELSE countMASK ← 1FH; FI  $\begin{array}{l} tempEOUNT \leftarrow (\text{EOUNT AND countMASK}); \\ tempDEST \leftarrow \text{DEST}; \end{array}$ WHILE (tempCDUNT ≠ 0)

IF instruction is SAL or SHL THEN DE - MSBIDESTE

SAUSAR/SHUSHR-Shit

### INSTRUCTION SET REFERENCE, N-Z ELSE (\* Instruction is SAR or SHR \*)

 $CF \leftarrow LSB(DEST)$ 

4-354 Vol. 28

IF instruction is SAL or SHL THEN DEST ← DEST = 2; ELSE. IF instruction is SAR THEN DEST ← DEST / 2; (\* Signed divide, rounding toward negative infinity \*). ELSE (\* Instruction is SHR \*) DEST ← DEST / 2 ; (\* Unsigned divide \*) FI; tempCOUNT ← tempCOUNT - 1; OD; (\* Determine overflow for the various instructions \*) IF (COUNT and countMASK) = 1 THEN IF instruction is SAL or SHL THEN OF ← MSB(DEST) XOR CF; FLSE IF instruction is SAR THEN 06 / 01 ELSE (\* Instruction is SHR \*) OF ← MSB(tempDEST) FI; ELSE IF (COUNT AND count/MASK) = 0 THEN All flags unchanged; ELSE (\* COUNT not 1 or 0 \*) OF ← undefined: FI; FI;

### Flags Affected

The CF flag contains the value of the last bit shifted out of the destination operand; it is undefined for SHL and SHR instructions where the count is greater than or equal to the size (in bits) of the destination operand. The OF flag is affected only for 1-bit

### INSTRUCTION SET REFERENCE N/Z

| Opcode              | Instruction      | Op/<br>En | 64-Bit<br>Mode | Compat/<br>Leg Mode | Description                                |
|---------------------|------------------|-----------|----------------|---------------------|--------------------------------------------|
| C1 /4 ib            | SHL r/m32, imm8  | с         | Valid          | Valid               | Multiply r/m32 by 2, imm8 times.           |
| REX.W + C1 /4<br>ib | SHL r/m64, imm8  | С         | Valid          | N.E.                | Multiply r/m64 by 2, imm8 times.           |
| DO /5               | SHR r/m8,1       | A         | Valid          | Valid               | Unsigned divide r/m8 by 2,<br>once.        |
| REX + DO /5         | SHR r/m8**, 1    | A         | Valid          | N.E.                | Unsigned divide r/m8 by 2,<br>once.        |
| D2 /5               | SHR r/m8, CL     | В         | Valid          | Valid               | Unsigned divide r/m8 by 2,<br>CL times.    |
| REX + D2 /5         | SHR r/m8**, CL   | В         | Valid          | N.E.                | Unsigned divide r/m8 by 2,<br>CL times.    |
| CO /5 <i>ib</i>     | SHR r/m8, imm8   | с         | Valid          | Valid               | Unsigned divide r/m8 by 2,<br>imm8 times.  |
| REX + CO /5 ib      | SHR r/m8**, imm8 | С         | Valid          | N.E.                | Unsigned divide r/m8 by 2,<br>imm8 times.  |
| D1 /5               | SHR r/m16, 1     | A         | Valid          | Valid               | Unsigned divide r/m16 by 2<br>once.        |
| D3 /5               | SHR r/m16, CL    | В         | Valid          | Valid               | Unsigned divide r/m16 by 2<br>CL times     |
| C1 /5 ib            | SHR r/m16, imm8  | С         | Valid          | Valid               | Unsigned divide r/m16 by 2<br>imm8 times.  |
| D1 /5               | SHR r/m32, 1     | A         | Valid          | Valid               | Unsigned divide r/m32 by 2,<br>once.       |
| REX.W + D1 /5       | SHR r/m64, 1     | A         | Valid          | N.E.                | Unsigned divide r/m64 by 2 once.           |
| D3 /5               | SHR r/m32, CL    | В         | Valid          | Valid               | Unsigned divide r/m32 by 2<br>CL times.    |
| REX.W + D3 /5       | SHR r/m64, CL    | В         | Valid          | N.E.                | Unsigned divide r/m64 by 2<br>CL times.    |
| C1 /5 <i>lb</i>     | SHR r/m32, imm8  | С         | Valid          | Valid               | Unsigned divide r/m32 by 2,<br>imm8 times. |
| REX.W + C1 /5<br>ib | SHR r/m64, imm8  | C         | Valid          | N.E.                | Unsigned divide r/m64 by 2,<br>imm8 times. |

SAL/SAR/SHL/SHR-Shift

### INSTRUCTION SET REFERENCE, N-Z

Vol. 28 4-355

shifts (see "Description" above); otherwise, it is undefined. The SF, ZF, and PF flags are set according to the result. If the count is 0, the flags are not affected. For a non-zero count, the AF flag is undefined.

### Protected Mode Exceptions

| #GP(0)          | If the destination is located in a non-writable segment.                                     |
|-----------------|----------------------------------------------------------------------------------------------|
|                 | If a memory operand effective address is outside the CS, DS,<br>ES, FS, or GS segment limit. |
|                 | If the DS, ES, FS, or GS register contains a NULL segment<br>selector.                       |
| #SS(0)          | If a memory operand effective address is outside the SS<br>segment limit.                    |
| #PF(fault-code) | If a page fault occurs.                                                                      |
|                 |                                                                                              |

#AC(0) If alignment checking is enabled and an unaligned memory erence is made while the current privilege level is 3. #UD If the LOCK prefix is used.

### Real-Address Mode Exceptions

| #GP | If a memory operand effective address is outside the CS, DS,<br>ES, FS, or GS segment limit. |
|-----|----------------------------------------------------------------------------------------------|
| #SS | If a memory operand effective address is outside the SS<br>segment limit.                    |
| #UD | If the LOCK prefix is used.                                                                  |

### Virtual-8086 Mode Exceptions

| #GP(0)          | If a memory operand effective address is outside the CS, DS,<br>ES, FS, or GS segment limit. |
|-----------------|----------------------------------------------------------------------------------------------|
| #SS(0)          | If a memory operand effective address is outside the SS<br>segment limit.                    |
| #PF(fault-code) | If a page fault occurs.                                                                      |
| #AC(0)          | If alignment checking is enabled and an unaligned memory                                     |

| #AC(0) | If alignment checking is enabled and an unaligned memory<br>reference is made. |
|--------|--------------------------------------------------------------------------------|
| #UD    | If the LOCK prefix is used.                                                    |

### Compatibility Mode Exceptions Same exceptions as in protected mode.

### 64-Bit Mode Exceptions

SALISAD/SULISUD\_SHH

If a memory address referencing the SS segment is in a non-#SS(0) canonical form.

#GP(0) If the memory address is in a non-canonical form.

Vol.28 4-359

| Instruction Operand Encoding |                  |           |           |           |
|------------------------------|------------------|-----------|-----------|-----------|
| Op/En                        | Operand 1        | Operand 2 | Operand 3 | Operand 4 |
| Α                            | ModRM:r/m (r, w) | 1         | NA        | NA        |
| в                            | ModRM:r/m (r, w) | CL (r)    | NA        | NA        |
| С                            | ModRM:r/m (r, w) | imm8      | NA        | NA        |

\*\* In 64-bit mode, r/m8 can not be encoded to access the following byte registers if a REX prefix is

\* Not the same form of division as IDIV; rounding is toward negative infinity.

\*\*\*See IA-32 Architecture Compatibility section below

### Description

NOTES:

INSTRUCTION SET REFERENCE N-Z

used AH BH CH DH

Shifts the bits in the first operand (destination operand) to the left or right by the number of bits specified in the second operand (count operand). Bits shifted beyond the destination operand boundary are first shifted into the CF flag, then discarded. At the end of the shift operation, the CF flag contains the last bit shifted out of the destination operand.

The destination operand can be a register or a memory location. The count operand The becoming of physical can be a regarded or internetly reaction. The count of the first optimized in the count of the CL register. The count is masked to 5 bits (or 6 bits if in 64-bit mode and REX.W is used). The count range is limited to 0 to 31 (or 63 if 64-bit mode and REX.W is used). of 1.

The shift arithmetic left (SAL) and shift logical left (SHL) instructions perform the same operation; they shift the bits in the destination operand to the left (toward more significant bit locations). For each shift count, the most significant bit of the destination operand is shifted into the CF flag, and the least significant bit is cleared (see Figure 7-7 in the Intel® 64 and IA-32 Architectures Software Developer's Manual, Volume 1).

The shift arithmetic right (SAR) and shift logical right (SHR) instructions shift the bits of the destination operand to the right (toward less significant bit locations). For each of the obstantion of the range (other has a significant of the constraints) in the CF flag, and the most significant bit of the destination operand is shifted into the CF flag, and the most significant bit is either set or cleared depending on the instruction type. The SHR instruction clears the most significant bit (see Figure 7-8 in the Intel% e G and IA-32 Architectures Software Developer's Manual, Volume 1); the SAR instruction sets or clears the most significant bit to correspond to the sign (most significant bit) of the original value in the destination operand. In effect, the SAR instruction fills the empty bit position's shifted value with the sign of the unshifted value (see Figure 7-9 in the Intel® 64 and IA-32 Architectures Software Developer's Manual, Volume 1).

4-355 Vol.2B

### SAL/SAR/SHL/SHR-Shift

### INSTRUCTION SET REFERENCE, N-Z

| #PF(fault-code) | If a page fault occurs.                                                                                               |
|-----------------|-----------------------------------------------------------------------------------------------------------------------|
| #AC(0)          | If alignment checking is enabled and an unaligned memory<br>reference is made while the current privilege level is 3. |
| #UD             | If the LOCK prefix is used.                                                                                           |

### (taken from Intel Manual)

4-360 1/01 28

# Dynamic Symbolic Execution on Binaries [Schwartz et al, USENIX'11]

 Finding inputs that explore different paths after executing 100,000s of assembly instructions



### **Compiler-like Design**





## A Simple Intermediate Language

- Consists of 17 language constructs
  - 7 statements and 10 expressions

program ::= stmt\* stmt ::= var := exp | j mp exp | cj mp exp,exp,exp | assert exp | l abel label\_kind | addr address | speci al string

 Our binary symbolic executor consists of ~250 lines of OCaml code



## **Verification Condition Generation**

- BAP provides support for the following VC generation algorithms:
  - Dijkstra's WP
  - Flanagan & Saxe's WP
  - Directionless WP
  - Forward Symbolic Execution
- Interfaces to SMT solvers

   Support for SMTLIB1 & SMTLIB2 formats

# Static Checking of Safety Properties

### [Jager et al, TR'10]

- GNU coreutils leaf functions
  - Integer overflows
  - Memory overwrites
- Formula optimizations improved performance up to 8x



### **BAP** in research

### **Q:** Return-Oriented Programming [Schwartz et al, USENIX'11]

 Finding byte-sequences (gadgets) that perform certain actions



## Many Applications in Security

### Don't redo the engineering. Do the science.





Tucek et al, Sweeper: Defending against fast worms Kang et al, Renovo: Hidden Code Extraction from Packed Executables

MALWARE ANALYSIS

Yin et al, Panorama: Information Flow Analysis to Uncover Malware Brumley et al, Automatic Deviation **REVECTIORSE** 

### ENGINEERING

Automatic Extraction of Protocol Message Format

BEXPLOIT GENERATION

### Are we alone?



## Conclusion

- BAP is the newest incarnation of our framework for binary analysis
- BAP comes with a variety of algorithms and features to make analysis easier
- You can download it for free at:

http://bap.ece.cmu.edu/

### BAP 0.3 just came out!

# Thank you!

<u>thanassis@cmu.edu</u> <u>http://www.ece.cmu.edu/~aavgerin</u> <u>http://bap.ece.cmu.edu/</u>

**Questions**?