Introduction to Computer Security

Reverse Engineering

Worth 60 points

Description

This project focuses on cryptography and reverse engineering. It is divided into 4 parts. For each part you're given a file containing a unique string that you need to submit. You can generate your files using the gen-reveng binary in the CS-354 repo (under projects). Note that you will supply your netid as the only command line argument, and it must be entered precisely; everyone's solution is uniquely generated based on their netid, so if you enter it incorrectly, the autograder will not give you credit when you submit your solutions.

To get started, open up your student container (clone the repo if needed) and navigate to the project directory in /mnt.

git clone https://github.com/cs354/CS-354.git
bash CS-354/student_environment.bash
cd /mnt/projects/reverse-engineering     # This is your working directory
exit

IMPORTANT NOTICE

During a kernel update on the vicious machine on 06/27/2025, the part2 file broke and can no longer be executed as generated. To address this, we provide a repair tool that fixes the part2 file automatically. Run the gen-reveng program on the vicious machine itself, not in the student environment, so that modifying the part2 file does not require root permission. Then run the /fix_part2 program to patch your part2 file. Before the patch, running the part2 binary produces a segmentation fault; after the patch, it prints a "Generating the string..." message.

# on vicious machine, outside the student container
./gen-reveng yournetidlowercase
/fix_part2 /home/yournetidlowercase/CS-354/mnt/projects/reverse-engineering/part2

Part 1: The goal is simple. Crack the password. It is an MD5 hash, known to John the Ripper as format=raw-md5. Feel free to use your own password list and whatever strategy you're most comfortable with. The password is easy, but it probably won't appear verbatim in a wordlist, so a plain wordlist run without any rules is unlikely to find it. Note: since the password is generated randomly from a wordlist based on real passwords, there is a small chance that your solution contains an inappropriate word or phrase. This is in no case a reflection of the opinions of the course designers or staff, and is merely a consequence of real-world usage.
Part 2: Find the string generated by the binary, probably using gdb. It is generated as the program runs. The correct string is in ASCII-encoded hexadecimal. The binary may not run on your local machine. If you get a segfault immediately, try running on vicious.
Part 3: This is an x86-64 ELF executable that has been run through a 16-byte XOR cipher, as if it were a packed payload. The string for this part of the project is the base-10 exit code of the original program. To find it, you must determine the 16-byte key, decipher the program, run it, and view the exit code. The UNIX command echo $? prints the exit code of the last program run.

Where to start: Running file on the original program (before it was ciphered) yielded the following: part3: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, no section header. Possible values for these header fields are listed in the official specification: https://refspecs.linuxbase.org/elf/TIS1.1.pdf. This should give you an idea of what the ELF header looked like before it was run through the cipher. If you don't like reading manuals, you can also try to reconstruct the file by comparing it to other ELF binaries.

This part is best solved using the command-line tool xxd and a custom C program. However, you are free to solve the problem in any way you wish.
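To make the mechanics of a repeating-key XOR cipher concrete, here is a minimal sketch in Python (an alternative to the suggested C program). It uses made-up toy data, not the part3 file, and the function names xor_repeating and recover_key are hypothetical. It only illustrates two facts: XOR with the same key both enciphers and deciphers, and knowing the first 16 plaintext bytes reveals a 16-byte key. Working out what those 16 bytes are for your ELF file is still up to you.

```python
from itertools import cycle


def xor_repeating(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with the key, repeating the key as needed."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


def recover_key(ciphertext: bytes, known_prefix: bytes) -> bytes:
    """Recover key bytes from ciphertext and the plaintext known to start it."""
    return bytes(c ^ p for c, p in zip(ciphertext, known_prefix))


# Toy data for illustration only (not taken from the project files).
toy_key = bytes(range(0x10, 0x20))                      # 16-byte key
toy_plain = b"KNOWN-HEADER-16B" + b" followed by the rest of the payload"
toy_cipher = xor_repeating(toy_plain, toy_key)

# If the first 16 plaintext bytes are known, the key falls out directly.
guessed_key = recover_key(toy_cipher[:16], toy_plain[:16])
assert guessed_key == toy_key

# Applying the same XOR again restores the original bytes.
assert xor_repeating(toy_cipher, guessed_key) == toy_plain
```

For the real file you would read the ciphered bytes from disk, write the deciphered bytes to a new file, and mark that file executable before running it.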
Part 4: This is base64-encoded ciphertext, encrypted with 128-bit AES in CBC mode. The IV is the first 16 bytes of the ENCRYPTED (i.e. not the original) text, and the remaining bytes are the actual encrypted text. The 128-bit AES encryption key is h4ckth1sk3yp4d16. Submit the decrypted text. The correct solution is ASCII-encoded hexadecimal. Hints: (1) There are three stages to this problem: decode the base64, then separate the IV from the rest of the encrypted message, then decrypt. (2) If the text is mangled after decrypting, you probably need to double-check that the IV is set correctly.
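The first two stages from the hint can be sketched with Python's standard library. This is only an illustration on toy bytes: the function name split_iv_and_ciphertext is hypothetical, and the decryption stage is left out because the standard library has no AES.

```python
import base64


def split_iv_and_ciphertext(b64_text: str):
    """Decode base64 text and split it into (IV, ciphertext)."""
    raw = base64.b64decode(b64_text)
    # AES-CBC output is a whole number of 16-byte blocks; with the IV
    # prepended we expect at least two blocks in total.
    if len(raw) < 32 or len(raw) % 16 != 0:
        raise ValueError("unexpected length for IV + AES-CBC ciphertext")
    return raw[:16], raw[16:]


# Toy input for illustration only (not real ciphertext).
toy_raw = bytes(range(48))
toy_b64 = base64.b64encode(toy_raw).decode("ascii")

iv, ciphertext = split_iv_and_ciphertext(toy_b64)
assert iv == bytes(range(16))
assert ciphertext == bytes(range(16, 48))

# Tools that take the key and IV as hexadecimal need them converted first.
key_hex = b"h4ckth1sk3yp4d16".hex()
iv_hex = iv.hex()
assert len(key_hex) == 32 and len(iv_hex) == 32
```

One possible last stage is to write the ciphertext bytes to a file and decrypt with openssl enc -d -aes-128-cbc, passing the hex key with -K and the hex IV with -iv. Whether the plaintext was padded is not stated here, so treat the padding behavior as something to check.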

Tools:

john
gdb
xxd
base64
openssl
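For Part 1, the idea of combining a wordlist with rules can be illustrated with a small sketch. It uses Python's hashlib instead of john, and the wordlist, the mangling rules, and the target hash are all made up for illustration; they are neither the project's data nor John the Ripper's rule syntax.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Return the raw MD5 digest of text as lowercase hex."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def mangle(word: str):
    """Yield the word and a few simple variants (illustrative rules only)."""
    for base in (word, word.capitalize(), word.upper()):
        yield base
        for digit in "0123456789":
            yield base + digit


def crack(target_hash: str, wordlist):
    """Return the first mangled candidate whose MD5 matches, else None."""
    for word in wordlist:
        for candidate in mangle(word):
            if md5_hex(candidate) == target_hash:
                return candidate
    return None


# Toy data for illustration only.
toy_wordlist = ["sunshine", "dragon", "letmein"]
toy_target = md5_hex("Dragon7")

# "Dragon7" is not in the wordlist, but a rule (capitalize + digit) reaches it.
assert "Dragon7" not in toy_wordlist
assert crack(toy_target, toy_wordlist) == "Dragon7"
```

A dedicated cracker such as john applies the same idea with far larger rule sets and much faster hashing.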

A final note:

It is certainly possible for you to reverse-engineer the algorithms we used to generate your unique solutions. You are welcome to do so to receive full credit, but this is a much more challenging task than doing the project as intended.

Additional Readings

For those who are curious about how a kernel update can affect the things built upon containerization, we have some additional readings here:

To maximize obfuscation, the max page size of the part2 binary was set to 1 at the link stage when the part2 source file was compiled. As a result, three otherwise independent segments, text, data, and rodata (read-only data), are merged into a single page at runtime. Originally, these segments have different permissions: text->r-x, data->rw-, and rodata->r--. Before the update, the user program loader in the kernel set the execution permission of the final page to the OR of all the pages' execution permissions, so the final page was executable as long as at least one page was executable before merging. After the kernel update, the loader combines the permissions with AND instead of OR, which makes the final coalesced page of the part2 binary non-executable. You can confirm this by checking the memory map of the part2 binary at runtime, which shows the page without the "x" permission flag. If you launch the program through gdb, it page-faults and receives a SIGSEGV as soon as it tries to execute its first instruction.
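The difference between the two behaviors comes down to one line of bit arithmetic. The sketch below only models the description above; it is not kernel code, and the PF_R, PF_W, and PF_X names are borrowed from the ELF p_flags constants for readability.

```python
from functools import reduce

# ELF p_flags permission bits.
PF_X, PF_W, PF_R = 0x1, 0x2, 0x4

# Permissions of the three segments before they are merged.
segments = {
    "text": PF_R | PF_X,    # r-x
    "data": PF_R | PF_W,    # rw-
    "rodata": PF_R,         # r--
}

# Keep only the execute bit of each segment.
exec_bits = [flags & PF_X for flags in segments.values()]

# Old behavior as described: executable if ANY segment is executable.
merged_or = reduce(lambda a, b: a | b, exec_bits)

# New behavior as described: executable only if ALL segments are executable.
merged_and = reduce(lambda a, b: a & b, exec_bits)

assert merged_or == PF_X    # merged page keeps the x bit
assert merged_and == 0      # merged page loses the x bit
```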

To fix this, we need either an ad hoc user program loader that brings the old behavior back, or a tool that adds execution permission to all three pages of the part2 binary so that the part2 program is executable at runtime. To keep the patch simple, we chose the latter and implemented a tool that is accessible to all users under the / directory. Briefly, the tool finds the p_flags bit that controls the execution permission of each of the three segments (text, data, and rodata) in the program header and sets that bit to 1, so that all three pages are executable and therefore the final coalesced page is executable as well.
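As a rough idea of what such a patch involves, here is a minimal sketch that sets the execute bit in the p_flags field of program headers. It is not the source of /fix_part2: the function names are hypothetical, it assumes a little-endian ELF64 file whose relevant segments are PT_LOAD entries, and it is exercised on a synthetic header built in memory rather than on a real binary.

```python
import struct

PT_LOAD = 1      # loadable segment type
PF_X = 0x1       # execute permission bit in p_flags


def add_exec_flag(image: bytearray) -> int:
    """Set PF_X on every PT_LOAD program header; return how many changed."""
    if image[:4] != b"\x7fELF" or image[4] != 2 or image[5] != 1:
        raise ValueError("expected a little-endian ELF64 image")
    (e_phoff,) = struct.unpack_from("<Q", image, 0x20)
    e_phentsize, e_phnum = struct.unpack_from("<HH", image, 0x36)
    patched = 0
    for index in range(e_phnum):
        offset = e_phoff + index * e_phentsize
        # In ELF64, p_type is at +0 and p_flags at +4 of each entry.
        p_type, p_flags = struct.unpack_from("<II", image, offset)
        if p_type == PT_LOAD and not p_flags & PF_X:
            struct.pack_into("<I", image, offset + 4, p_flags | PF_X)
            patched += 1
    return patched


def build_toy_image(flags_list) -> bytearray:
    """Build a synthetic ELF64 header plus PT_LOAD entries (toy data only)."""
    header = bytearray(64)
    header[:6] = b"\x7fELF\x02\x01"
    struct.pack_into("<Q", header, 0x20, 64)                   # e_phoff
    struct.pack_into("<HH", header, 0x36, 56, len(flags_list)) # entsize, num
    body = bytearray()
    for flags in flags_list:
        entry = bytearray(56)
        struct.pack_into("<II", entry, 0, PT_LOAD, flags)
        body += entry
    return header + body


# text r-x (5), data rw- (6), rodata r-- (4)
toy = build_toy_image([5, 6, 4])
assert add_exec_flag(toy) == 2
new_flags = [struct.unpack_from("<I", toy, 64 + i * 56 + 4)[0] for i in range(3)]
assert new_flags == [5, 7, 5]
```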

Finally, let's come back to why containerization fails to preserve the portability of this project. The key point is that the container shares its kernel with the host machine. A process in the container is mapped to a process on the host machine, and it is isolated and scheduled by the host's kernel. Therefore, a significant change in the kernel can affect what happens inside the container. Similarly, a program that runs smoothly on Ubuntu 14.04 can easily crash on Ubuntu 24.04. The kernel keeps changing and evolving, fixing bugs and patching vulnerabilities along the way. The change that affects the part2 binary is probably a security update. We hope you enjoy this project!

Northwestern CS, Fall Quarter 2025
