ShellGym: Reversing for Exploitation

In keeping with our approach of using real-life examples to introduce basic reverse engineering concepts and tools, the recent ShellGym session with Div0 covered a WinRAR path traversal vulnerability (CVE-2025-8088) and its exploitation process. Exploit PoCs already exist, as do detection techniques that flag the malformed rarfile containing the path traversal. Can we evade that detection with a rarfile that has encrypted file names (i.e., the -hp option in rar.exe)? This is possible, but the existing PoCs no longer work, as they rely on editing an unencrypted rarfile. This short writeup goes through the process and our takeaways from the session.

Technical Writeup

Firstly, we can generally assume that the rarfile is built in memory, and that it exists in an unencrypted form while it is being built. Encryption is then applied and the rarfile is written to disk. We also know that the filenames are written inside this unencrypted rarfile, including the name of the alternate data stream (ADS) file. Hence, it should be possible to edit the ADS name and change it to the path traversal, as if we were editing a normal rarfile, before encryption is applied. We just need to find a good time and place to do it within rar.exe's execution, which is why we turn to some static and dynamic RE.
The "learning" comes from hypothesising what the developer would need to do in order to craft the rarfile. If files are being read, the program would be using the ReadFile API, and in order to use ReadFile, it needs a handle from the CreateFile API. We can use that as a starting point for our analysis, and see where and when CreateFile is called by rar.exe's functions. Naturally, we expect that rar.exe has many functions that might call CreateFile, so reversing it purely statically would be a tough approach.
Instead, we can use a debugger (e.g., WinDbg), set a breakpoint that triggers whenever CreateFile is called, and see when the ADS file is opened by rar.exe. After that, we can trace execution to see what rar.exe does to collate information from the file, including its name, content, and so on. Hence, we set the breakpoint and look at the contents of rcx (the first argument under the x64 calling convention), where the ADS filename should appear.

Note: The WinDbg command was written as a one-liner to produce this image, instead of typing “g” many times to advance the program. Beyond that bit of automation, there’s no need for such a command.

Now, we can follow the execution in IDA and WinDbg to see what rar.exe does with the filename. Pretty soon, we arrive at a memcpy call whose source is the ADS filename. We simply let the memcpy happen, then edit the copied buffer at the destination (pointed to by rcx before the call).
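The in-place edit can be modelled outside the debugger. The following is a minimal sketch, not the session's actual procedure: the buffer contents, the placeholder name, and the helper `patch_in_place` are all hypothetical, and it assumes the replacement is exactly as long as the placeholder so that nothing else in the buffer shifts.

```python
def patch_in_place(buf: bytearray, placeholder: bytes, replacement: bytes) -> int:
    """Overwrite the first occurrence of placeholder with replacement, in place.

    Returns the offset that was patched. The buffer length never changes.
    """
    # Assumption: same length only, so neighbouring fields keep their offsets.
    if len(replacement) != len(placeholder):
        raise ValueError("replacement must be the same length as the placeholder")
    offset = buf.find(placeholder)
    if offset < 0:
        raise ValueError("placeholder not found in buffer")
    buf[offset:offset + len(placeholder)] = replacement
    return offset


# Hypothetical stand-in for the destination buffer right after the memcpy.
copied = bytearray(b"HDR|decoy.txt:" + b"A" * 22 + b"|TAIL")
traversal = b"..\\..\\demo\\payload.txt"  # 22 bytes, same as the placeholder

offset = patch_in_place(copied, b"A" * 22, traversal)
print(offset, bytes(copied))
```

Choosing a placeholder ADS name of the right length up front keeps the edit to a plain overwrite, which is also easy to do by hand in a debugger's memory view.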
Letting rar.exe continue execution, our path traversal makes it into the encrypted rarfile, which bypasses any detection that looks for path traversal characters through file inspection. Naturally, threat actors running phishing campaigns would encrypt the rarfile with a convincing password (e.g., the usual date of birth + last few digits of IC, which are somewhat easy to acquire, or something else associated with the victim). So, update your WinRAR, people!

Note how the unencrypted rarfile has the path traversal indicators in the file, whereas the encrypted version doesn’t.
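To see why file inspection misses the encrypted variant, here is a minimal sketch of a naive byte-level check. Everything in it is an assumption for illustration: the marker list, the sample header fragment, and especially `toy_encrypt`, which is a single-byte XOR standing in for the real encryption that rar.exe applies with -hp. It is not how any particular detection product works.

```python
# Raw byte sequences a naive scanner might look for.
TRAVERSAL_MARKERS = (b"..\\", b"../")


def has_traversal_marker(data: bytes) -> bool:
    """Return True if any raw traversal sequence appears in the bytes."""
    return any(marker in data for marker in TRAVERSAL_MARKERS)


def toy_encrypt(data: bytes, key: int = 0x5A) -> bytes:
    """Toy single-byte XOR. A stand-in only, NOT the cipher rar.exe uses."""
    return bytes(b ^ key for b in data)


# Hypothetical header fragment holding an ADS name with a traversal.
plain_header = b"decoy.txt:..\\..\\demo\\payload.txt"
encrypted_header = toy_encrypt(plain_header)

print(has_traversal_marker(plain_header))      # True
print(has_traversal_marker(encrypted_header))  # False
```

Even this trivial transformation hides the markers from a scanner that only reads bytes on disk. With real header encryption, a scanner without the password cannot recover the names at all.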

Regarding this ShellGym and learning how to use tools

Learning tool-based reverse engineering from scratch in 4 hours is definitely not possible, and fighting with the tools tends to impede the higher-order thinking about how to approach the technical problem. Arguably, this ShellGym was the toughest one so far for newcomers. While we could use CTF-like puzzles, we still believe that real-life problems are more representative, and that there are simpler real-life problems to solve. I think many of our attendees, given some self-learning time and some online guides on the usage of tools, would be able to find the solution described above, perhaps over a longer duration. This will be our approach for the fundamental Forensics, Reverse Engineering and Exploitation (FREE) course in the future, specifically:
  1. No course content on how to use tools. Instead, we'll link trainees to suitable resources (and AI, of course).
  2. Walkthroughs on how to solve some problems, in which the tools are used and the thought process is explained.
  3. Self-guided practice and consultation, based on the walkthrough, and other similar problems at the same difficulty level.
Fundamentally, we believe that every individual has their own approach, preferences, and workflow. Instead of imposing our approach on trainees and assessing them rigidly, we want, as far as possible, all our trainees to build their own foundations and gain confidence in their own process. After all, the industry needs people who know how to think and conduct their own research independently, rather than people who know how to use tools but need perpetual guidance from their seniors. That way, you'll have a use after FREE.