Screen captures are super useful in my workflow, and OS X makes them easy with just a few key combinations. However, I was really curious (worried) whether someone could take a screenshot without my knowledge. So, I decided to figure out how that mechanism works and see if there was a way to build malware to covertly steal these pixels.
I reversed the screencapture utility to find out how it uses the standard framework functions. Then I traced the mechanism to the WindowServer and wrote a utility to covertly grab screens; sandbox-exec can’t stop the screen grabs. I used Frida to detect someone grabbing the screen pixels covertly. There is malware, from 2013(!), that steals people’s pixels: macs.app. As expected, there are multiple ways to get the screenshots. This has been known since at least 2011!
Why does OS X allow any GUI/CUI program to capture the entire screen? There are dangerous security implications here! I propose adding Mach message ID filtering to the sandbox configuration. WindowServer needs a mechanism to whitelist signed binaries that can execute privileged RPC functions.
How does the capture work?
If you wish to reproduce or follow the steps I’ve taken, the binaries that I used for the reverse engineering are linked below. They are from macOS High Sierra version 10.13.3.
- screencapture (7a76ff24fbb9e2f1b1ca07e6d3f351114cf5af42)
- SkyLight (1481334038bd636ba0fc4c983c04e1787b33a5d5)
macOS comes with a utility for capturing the screen pixels into an image file: /usr/sbin/screencapture. It is a useful utility and, I’m guessing, screencapture is what gets executed when I press the right key combinations on the desktop to take full or partial screenshots. So, I decided to reverse it and see how it actually does the capturing. Turns out it wasn’t so complicated.
I started the trace at the very beginning, where the command line arguments are processed; see __text:100002640 and __text:10000287E. That makes it a sensible place to start tracing.
To be user friendly, the utility plays a shutter sound to indicate that the screen has been captured. So, I turned up my speakers and started debugging! The sound would serve as a guiding light to help narrow down the useful code.
Unfortunately, the sound is played very early in the process. At least, when I hear the sound, I know I’m on the right path.
I know this is the sound playing function because it is essentially the wrapper to these calls (below). Also, because I can hear the sound after the functions finish execution!
Let’s go back to the take_the_screenshot function (where the sound is played). Using a debugger, I step through a bunch of instructions (tedious!) when I notice a function that calls _CGDisplayCreateImage of the CoreGraphics framework. That looks promising!
I named this function doCapture but at this point I’m not 100% certain if the name is accurate. However, without going into that function, I notice that the calls after doCapture, within the take_the_screenshot function, record an image to disk. I’m guessing the image being written to disk is the screenshot in question. Seems like a reasonable assumption, so I decided to follow that thread.
I named this function writeImageToDisk. And if you look inside, there are all sorts of references to recording images to a file on disk. Particularly interesting are the error messages:
And so, this is more support that doCapture is the function that does all the interesting bits. Let’s keep the name and dig into it some more.
CGDisplayCreateImage looks promising, but at this point it could have any number of meanings. However, I’m a reverse engineer; I’m not afraid of going down a few rabbit holes! Well, this function is actually just a stub:
So, I go to the CoreGraphics (_CG gave that away!) framework and look for the function there:
Ummm, what? That function doesn’t look like it does anything useful! Worse, it does not look like it can even execute. What’s going on here? Well, we go to our trusty LLDB debugger! Obviously, there is some sort of a runtime linking mechanism that replaces the CoreGraphics function with something else.
Looking at the stack trace, it becomes obvious that the actual implementation used is actually the similarly named SLDisplayCreateImage function from the SkyLight private framework. So, what we saw in the CoreGraphics framework was some sort of a stub – makes sense, since there is non-executable content in there! Let’s keep digging 🙂
Looking at the assembly of _SLDisplayCreateImage, I can see that it is essentially a wrapper function for _SLSHWCaptureDesktop
Intuitively, I’d expect that the actual contents of the screen pixels will be in a buffer owned by some service. So, I would not expect the user application to access that buffer directly in order to capture an image. That means there should be some sort of an IPC mechanism between the user application and the GUI service. On OS X, IPC means Mach ports.
Below is the disassembly of the section of the function that sends a mach port message to the GUI Service in order to obtain the actual pixel content.
Even though the assembly looks messy, the message is very simple and looks like this in pseudo-code:
This message executes an RPC function which takes the capture rectangle dimensions along with the ID of the display to capture from.
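To make the shape of such a request concrete, here is a rough sketch in Python that packs a Mach message header followed by a body. The header layout (six 32-bit fields) and the message type constants are the standard Mach ones, and the message ID 0x732a is the one observed later in this article. The body layout (display ID followed by four floats for the rectangle) and the function name are assumptions for illustration only; the real MIG-generated message carries additional fields.

```python
import struct

# mach_msg_header_t: bits, size, remote port, local port, voucher port, id
MACH_HEADER = struct.Struct("<IIIIIi")

MACH_MSG_TYPE_COPY_SEND = 19
MACH_MSG_TYPE_MAKE_SEND_ONCE = 21
CAPTURE_MSG_ID = 0x732A  # message ID seen in the WindowServer trace


def build_capture_request(remote_port, reply_port, display_id, rect):
    """Pack an illustrative capture request (body layout is hypothetical)."""
    x, y, width, height = rect
    # Hypothetical body: display ID plus the capture rectangle as floats.
    body = struct.pack("<I4f", display_id, x, y, width, height)
    # MACH_MSGH_BITS(remote, local): remote disposition in the low byte.
    bits = MACH_MSG_TYPE_COPY_SEND | (MACH_MSG_TYPE_MAKE_SEND_ONCE << 8)
    size = MACH_HEADER.size + len(body)
    header = MACH_HEADER.pack(bits, size, remote_port, reply_port, 0, CAPTURE_MSG_ID)
    return header + body
```

The point of the sketch is only that the request is small: a routing header that names the destination port and the RPC (via the message ID), followed by a handful of scalar arguments.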
Tracing the remote_port variable, we can see that it is derived from a bootstrap call from within the _SLSMainConnectionID call.
It is a bit of a distraction to follow these steps in the same detail. However, there is a stack trace that looks like this:
Looking at the references to _bootstrap_look_up2, two names show up that look interesting:
We need to find out which service publishes these ports with these names. I wasn’t quite sure how to do that directly, so I took a slightly different approach. Instead, I set a breakpoint on the _mach_msg and looked at the message header to obtain the remote port number:
The remote port number is 0x00002113. Then using lsmp command line tool, I can see that port 0x2113 belongs to the WindowServer process:
Loading the WindowServer in IDAPro, I can see that it uses the same framework at its core as the screencapture utility. That’s kinda cool!
The WindowServer program is basically a simple wrapper for the functionality in the SkyLight library that I’ve been analyzing all this time. This makes life easier in many ways. So, I looked for a corresponding capture function – just thinking that one should exist by, perhaps, a slightly different name. Doing a simple text search, I found _XHWCaptureDesktop. Without hesitation, I attached the debugger and set a breakpoint. This is the resulting backtrace which looks super interesting!
Setting a breakpoint on _XHWCaptureDesktop and triggering a screencapture, we get a nice trace that confirms the theory! This is great because if we want to keep an eye on who takes screenshots on the system, we can just look for calls to this function!
Detecting a screenshot
After analyzing how the screencapture utility works, I became curious whether there was a way to detect when my screen gets captured. One mechanism is to use the mdfind utility. This is what Dave DeLong used in his method. However, it seems to depend on the capture utility generating an image file and setting the kMDItemIsScreenCapture = 1 attribute on the file. I’m fairly certain that malware wouldn’t do that. Well, unless you’re developing the KitM.A malware (see the Malware section)! This section is my exploration of how to detect someone capturing the pixels off of my screen using the method reverse engineered in this article.
Detecting whether some process has requested a screenshot is actually quite easy with the right tools. Using LLDB is too heavy, and we don’t really want to breakpoint a service that is in use. So, instead I decided to use Frida. It is a great tool for dynamic analysis and uses techniques similar to those that would be applied by a production endpoint security tool.
For some reason Frida would not resolve the _XHWCaptureDesktop function. However, I was able to specify it by its offset into the dynamic library. The name resolution failure is probably some sort of a bug within Frida, because all the other tools I’ve used (IDA Pro, LLDB, nm) resolved the symbol just fine.
Luckily for us, the mach message that contains the request from the client is passed in as an argument to the _XHWCaptureDesktop function. The pointer is passed in the RDI register.
We can see that the message ID is 0x0000732a (see the pseudocode above, in the screencapture reverse engineering section, for details) and the local port is 0x000153ab, which is the port this request was sent from. Let’s use lsmp to track this port.
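Decoding those fields from the raw bytes at the message pointer is mechanical. Here is a small sketch in Python, assuming you have dumped at least the first 24 bytes of the message (the standard mach_msg_header_t) from the hook. The names `parse_mach_header` and `is_capture_request` are mine, not part of any tool.

```python
import struct
from collections import namedtuple

MachHeader = namedtuple(
    "MachHeader", "bits size remote_port local_port voucher_port msg_id"
)

# mach_msg_header_t: five unsigned 32-bit fields followed by a signed 32-bit id
_HEADER_LAYOUT = struct.Struct("<IIIIIi")

CAPTURE_MSG_ID = 0x732A  # the ID observed for the capture request


def parse_mach_header(raw):
    """Decode the leading mach_msg_header_t from a raw byte buffer."""
    return MachHeader(*_HEADER_LAYOUT.unpack_from(raw))


def is_capture_request(header):
    """True if the header carries the capture message ID."""
    return header.msg_id == CAPTURE_MSG_ID
```

With the header decoded, `header.local_port` is the value to look up with lsmp, and `is_capture_request(header)` is a cheap filter if the same hook ever sees other message IDs.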
There’s not really a good way to format the output of lsmp, but if you scroll to the side you will see that 0x000153ab is connected to the WindowServer process. This is how we can derive the PID of the process that made the request.
Just to confirm, we can also see that the WindowServer process has a reference to this port as well:
Malware
In the conclusion, I mentioned that back in 2013 there was some malware that had screenshotting as one of its features. So, I obtained this sample. F-Secure calls it MAC.OSX.Backdoor.KitM.A and, by now, it is detected by everyone. You can download it here: malware_KitM.zip (password: infect3d).
Doing some quick reverse engineering, it’s easy to see that the malware actually uses the screencapture utility that comes with the OS. It generates the screenshot images and uploads them somewhere. What’s interesting is that it means these screen capture images could be found using the mdfind kMDItemIsScreenCapture:1 command.
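If you want to script that check, a thin wrapper around the command is enough. Below is a minimal Python sketch that runs the mdfind query quoted above and returns the matching paths. It assumes mdfind is on the PATH and simply returns an empty list where it is not; the function names are hypothetical.

```python
import shutil
import subprocess


def mdfind_screenshot_command():
    """The Spotlight query quoted in the text, as an argument list."""
    return ["mdfind", "kMDItemIsScreenCapture:1"]


def find_tagged_screenshots():
    """Return paths of files Spotlight has tagged as screen captures."""
    if shutil.which("mdfind") is None:
        return []  # not on macOS, or Spotlight tools unavailable
    result = subprocess.run(
        mdfind_screenshot_command(),
        capture_output=True,
        text=True,
        check=False,
    )
    return [line for line in result.stdout.splitlines() if line]
```

Keep in mind the limitation discussed earlier: this only finds captures made by tools that write a file and set the attribute, which is exactly why it catches this sample and would miss a custom grabber.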
Building the grabber
Let’s say I was a Russian Hacker and I wanted to covertly steal your pixels. Using the screencapture utility would work, but I don’t want to give myself away by shouting the shutter sound. Luckily for me there’s a super easy way of doing it myself! All I have to do is use the right libraries that are already on every OS X instance.
As it turns out, there is more than one way to grab screen pixels. In his blog, Felix Krause uses the CGWindowListCreateImage function to capture the image. He goes a step further and actually sends the image through an OCR tool to extract the text. Cool! Below is my code, which leverages the same mechanism that the screencapture utility uses, as revealed in the previous section.
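For readers who want to poke at the same public entry point without compiling anything, here is a rough Python sketch that calls CGDisplayCreateImage through ctypes and reports the size of the captured image. It is an illustration of the call path, not the grabber itself. It assumes CoreGraphics lives at its usual framework path and that the symbol is still exported; on newer macOS releases the call may be unavailable or gated by a screen recording permission, in which case the function returns None. The function name is mine.

```python
import ctypes
import sys

COREGRAPHICS_PATH = "/System/Library/Frameworks/CoreGraphics.framework/CoreGraphics"


def grab_main_display_size():
    """Capture the main display and return (width, height), or None on failure."""
    if sys.platform != "darwin":
        return None  # CoreGraphics only exists on macOS
    try:
        cg = ctypes.CDLL(COREGRAPHICS_PATH)
        # Declare signatures so pointers are not truncated to 32 bits.
        cg.CGMainDisplayID.restype = ctypes.c_uint32
        cg.CGDisplayCreateImage.restype = ctypes.c_void_p
        cg.CGDisplayCreateImage.argtypes = [ctypes.c_uint32]
        cg.CGImageGetWidth.restype = ctypes.c_size_t
        cg.CGImageGetWidth.argtypes = [ctypes.c_void_p]
        cg.CGImageGetHeight.restype = ctypes.c_size_t
        cg.CGImageGetHeight.argtypes = [ctypes.c_void_p]
        cg.CGImageRelease.restype = None
        cg.CGImageRelease.argtypes = [ctypes.c_void_p]
    except (OSError, AttributeError):
        return None  # framework or symbol not available on this system

    image = cg.CGDisplayCreateImage(cg.CGMainDisplayID())
    if not image:
        return None  # capture refused or no display attached
    try:
        return (cg.CGImageGetWidth(image), cg.CGImageGetHeight(image))
    finally:
        cg.CGImageRelease(image)
```

Note that nothing here plays a shutter sound or writes a file with a Spotlight attribute; those are conveniences of the screencapture utility, not of the underlying call.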
Let’s see how this code works in action:
The video on the left shows screen capturing via the command line or SSH. The video on the right shows the same thing via a Cocoa App that is running from within a very restrictive sandbox: the sandbox configuration that you would get if you got an application from the App Store. Below is the screenshot 😉 of the sandbox configuration that the app was built with. I know it was taking effect because I had to allow the App to store files in the Downloads folder; otherwise it would get blocked by the sandbox.
No other permission was given to the App. By default the App pretty much cannot do anything on the system. This means that malware could come fully sandboxed and still steal your precious pixels!
As far as I could tell, pretty much any user and any process that has access to the GUI window server (which is a lot!) can request all the pixels. The closest thing to prevention that I found was the sandbox mechanism, via the sandbox-exec command, with a strongly defined policy.
I’m not really an OS X expert, but I read some blogs. There I found that OSXReverser has developed a manual on how to configure the sandbox. The closest thing I could find was to prevent the process from looking up the WindowServer port via its name. However, this is not a practical mechanism because lots of applications will want to access the GUI and, more importantly, port numbers aren’t that hard to bruteforce!
Instead, I really wish there was a mechanism to block mach messages with a specific message ID. For example, something like this:
Dare I say that we need a way to do deep message inspection and filtering on OS X? Ideally, there should be a mechanism where the WindowServer could whitelist the processes that are allowed to call certain RPC functions.
This way not every process would be allowed to steal pixels. Pixels that could contain private, confidential information like banking records, secret keys, or plans to the Lockheed Martin F-35 Lightning II!
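As a thought experiment, the policy I have in mind is tiny. The Python sketch below models it: messages with ordinary IDs pass through, while privileged IDs (here just the capture ID 0x732a seen earlier) are only accepted from senders on a whitelist. Everything in it is hypothetical, including the idea that the server can learn a signing identifier for the sender and the example identifier itself. No such hook exists in WindowServer or the sandbox today.

```python
CAPTURE_MSG_ID = 0x732A  # capture request ID observed earlier

# Message IDs that should require explicit permission (hypothetical list).
PRIVILEGED_MESSAGE_IDS = frozenset({CAPTURE_MSG_ID})


def allow_message(msg_id, sender_signing_id, whitelist):
    """Decide whether a request may be dispatched to its RPC handler.

    msg_id: the msgh_id from the Mach message header.
    sender_signing_id: code-signing identifier of the sender (assumed available).
    whitelist: set of identifiers permitted to call privileged RPCs.
    """
    if msg_id not in PRIVILEGED_MESSAGE_IDS:
        return True  # ordinary GUI traffic is unaffected
    return sender_signing_id in whitelist


# Placeholder identifier for illustration only.
EXAMPLE_WHITELIST = {"com.example.trusted-capture"}
```

The design choice worth noting is that the filter keys on the message ID rather than on the port name, so it would keep working even if a process obtained the WindowServer port without a bootstrap lookup.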
0 – Mach Overview
1 – Frida
8 – @patrickwardle: …but we can’t say we weren’t ‘warned’ From 2011