“steganography” – obfuscating PDF exploits in depth
Shortly after last week’s discovery of a PDF exploit that used this.getPageNumWords() and this.getPageNthWord() for obfuscation, we found another, much more powerful obfuscation technique in PDF exploits. It uses a so-called “steganography” method to hide malicious JavaScript code in images embedded in PDF files. It is powerful because it can bypass almost all AV engines.
The sample was detected as “exploit CVE-2013-3346” by our EdgeLogic engine, the same as the previous one.
The sample was first submitted to VirusTotal on 2017-10-10, with the filename “oral-b oxyjet spec.pdf”.
Only 1 AV engine detected this exploit last week (as of writing, detection has increased to 5/57).
Once opened, the PDF, disguised as an IRS document, looks normal.
Two layers of obfuscation were used in this sample. The first layer is the one we previously disclosed: the “this.getPageNumWords()” and “this.getPageNthWord()” method. The exploit uses these two calls to read and execute the JavaScript hidden as “content”. The related code can be found in PDF stream-64.
stream-64
The second layer is new, and it is our focus in this blog. The “JavaScript content” is stored in stream-119. Let’s see what it looks like.
Beautified, the JavaScript looks as follows:
To figure out what the JavaScript does, we first need to understand two PDF JS APIs: this.getIcon() and util.iconStreamFromIcon(). The following is an extract from Adobe’s reference.
According to the API reference, these two APIs work together to read the stream of an image named “icon” stored in the PDF file.
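To make the idea of “reading the stream of an image” more concrete, here is a minimal Python sketch of what the raw icon data may look like once it has been read. It assumes that the icon stream hands back a hex-encoded string of 32-bit ARGB pixels, which is how we read Adobe’s description of the icon stream object. The function name parse_argb_hex and the two sample pixels are our own illustration and are not taken from the sample.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int, int]  # (alpha, red, green, blue)


def parse_argb_hex(hex_data: str) -> List[Pixel]:
    """Split a hex-encoded ARGB byte string into per-pixel channel tuples.

    Assumption: 4 bytes per pixel, channels interleaved in A, R, G, B order.
    """
    raw = bytes.fromhex(hex_data)
    if len(raw) % 4 != 0:
        raise ValueError("ARGB data must be a multiple of 4 bytes")
    # Slicing bytes and converting to a tuple yields four integers per pixel.
    return [tuple(raw[i:i + 4]) for i in range(0, len(raw), 4)]


# Two made-up pixels: opaque red, then half-transparent blue.
pixels = parse_argb_hex("ffff0000" "800000ff")
```

Any per-pixel value exposed this way can serve as a carrier for hidden bits, which is what makes an icon stream attractive for this kind of obfuscation.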
By examining the above JavaScript code, we figured out that its function is to read and decode the “message” hidden in the icon’s stream. Once it has read the “message” successfully, it executes the “message” as JavaScript code via “eval(msg)”.
The icon stream named “icon” in object-131 can be saved as a “jpg” file and viewed in an image viewer without problem, as shown below:
The malicious data is hidden in the image while the image is still viewable
Nevertheless, no suspicious data can be found inside the icon file, since the malicious code data is heavily obfuscated.
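The following toy Python sketch illustrates why an image can carry a hidden script while staying viewable and showing no suspicious strings. It uses the simplest textbook scheme, hiding one message bit in the lowest bit of each carrier byte. This is not the encoding used in the sample or in steganography.js; the names embed and extract, the harmless stand-in message, and the synthetic carrier are all assumptions made for illustration.

```python
def embed(carrier: bytes, message: bytes) -> bytes:
    """Hide message bits in the lowest bit of each carrier byte (toy scheme)."""
    # Expand the message into bits, most significant bit first.
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(carrier):
        raise ValueError("carrier too small for message")
    out = bytearray(carrier)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit  # overwrite only the lowest bit
    return bytes(out)


def extract(carrier: bytes, length: int) -> bytes:
    """Read `length` bytes back out of the carrier's lowest bits."""
    out = bytearray()
    for i in range(length):
        value = 0
        for b in carrier[i * 8:(i + 1) * 8]:
            value = (value << 1) | (b & 1)
        out.append(value)
    return bytes(out)


message = b"app.alert(1)"    # harmless stand-in for the hidden script
carrier = bytes(range(256))  # stand-in for decoded pixel data
stego = embed(carrier, message)

recovered = extract(stego, len(message))
max_change = max(abs(a - b) for a, b in zip(carrier, stego))
message_visible = message in stego
```

In this toy case each carrier byte changes by at most 1, so the picture would look the same to a viewer, and a plain byte search for the script text finds nothing. The hidden script only reappears when the matching decoder runs.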
What does the final executed JavaScript look like? Here is a piece of the real code after successful de-obfuscation.
From this, we confirmed that the exploited vulnerability is CVE-2013-3346.
Furthermore, we deduce that this sample and the previous one came from the same author, for the following reasons.
- Both of them exploit the same vulnerability (CVE-2013-3346).
- The JavaScript code in the two exploits is similar.
After some googling, we found that the attacker likely copied a project/technique called “steganography.js”, which is open sourced here. The project was developed to work in browsers. We believe the person behind the PDF samples innovated by successfully adapting the technique to the PDF format. We could not find any earlier mention of this technique in PDF exploits, so we believe this is the first time the “steganography” technique has been used to hide PDF exploits.
Conclusion
We were impressed by this technique, which is well suited to obfuscating malicious code in PDF exploits. With it, all streams look normal, all images are viewable, and everything looks legitimate. This probably explains why almost all AV engines missed it.
In this blog we examined the truly advanced “steganography” technique used for obfuscating PDF exploits. It also demonstrates the power of our EdgeLogic engine, which is able to beat this obfuscation technique, among many others.
Just like the previous one, the “steganography” technique can be used to obfuscate not only this exploit (CVE-2013-3346) but also many other PDF exploits, including zero-days. We ask security defenders to pay close attention to it.
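Because the carrier image itself looks clean, one place defenders can still look is the small loader script that reads the hidden data and executes it. Below is a rough Python triage sketch along those lines. It is not how EdgeLogic works, and it assumes the JavaScript has already been extracted and decompressed from the PDF streams (that step is not shown). The function name flag_script and the two test strings are hypothetical.

```python
import re

# Acrobat JS calls named in this post that pull data out of page text or icons.
READ_APIS = ("getPageNthWord", "getIcon", "iconStreamFromIcon")
EXEC_PATTERN = re.compile(r"\beval\s*\(")


def flag_script(js_source: str) -> list:
    """Return the data-reading APIs found, but only if the script also calls eval()."""
    if not EXEC_PATTERN.search(js_source):
        return []
    return [
        name for name in READ_APIS
        if re.search(r"\b" + name + r"\s*\(", js_source)
    ]


# Made-up loader-like script: reads an icon stream, then evaluates a string.
suspicious = 'var s = util.iconStreamFromIcon(this.getIcon("icon")); eval(msg);'
# Made-up benign script: counts words but never evaluates anything.
benign = "var n = this.getPageNumWords(0); app.alert(n);"

hits = flag_script(suspicious)
clean = flag_script(benign)
```

A check like this is only a hint for triage. A loader can hide the API names behind string tricks, so it cannot replace analysis of what the script actually does at run time.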