Deconstructing and Reimplementing macOS’ screencapture CLI

I like to be prepared for worst-case scenarios.
And when I found out my Mac app ScreenFloat didn’t work anymore on Apple’s upcoming macOS Sierra because of a new sandbox restriction (you can read the backstory here), I knew fixing it could go one of two ways:

  1. Apple fixes it for me in a new beta of macOS Sierra (which, as I now know, is what happened in the form of a new sandbox entitlement), or
  2. I’d have to write my own solution for creating screenshots, basically reimplementing macOS’ screencapture command line utility

At the time, I didn’t know Apple would provide a new sandbox entitlement, so for me, the only choice was to take a couple of days and reimplement the functionality of macOS’ screencapture.
Now, when I say “reimplement”, I mean I looked at the features I needed in ScreenFloat and implemented them, leaving those I didn’t need aside (fullscreen screenshots, screenshot sounds or capturing windows’ shadows, to name a few).
Time was of the essence, after all, and I didn’t know how long it would take me to implement this stuff.

Deconstructing screencapture

Before I started working on my own solution, I thought it’d be good to understand how macOS’ screencapture utility was implemented.
I executed ‘strings /usr/sbin/screencapture’ in Terminal thinking I could find a clue as to how capturing the screen was done, but all I could find were references to private APIs, like CGSGetScreenRectForWindow or CGSGetWindowLevel:


kCGSMovementParent
CGSGetScreenRectForWindow failed: %d
CGSGetWindowLevel failed: %d

I could not find any references as to how capturing the screen is done, but I suspect the CGDisplay* APIs are used, and ‘nm /usr/sbin/screencapture’ seems to confirm that theory:


U _CGDisplayBounds
U _CGDisplayCreateImage
U _CGDisplayCreateImageForRect

Next, I wanted to know how screencapture draws its selection rectangle and cursor.
Knowing there’s basically no drawing on screen without an NSWindow, I created a small app that would filter out screencapture’s windows during an interactive screenshot, create an image of each and write them to disk.
In doing so, I learned the following:

  • screencapture uses 5 windows to display its selection rect: 4 for the edges and 1 for the fill
  • These windows are present even if you’re not currently making a selection (albeit transparent), and they follow around your mouse cursor
  • The selection cursor isn’t drawn in its own window, it’s an ordinary NSCursor, using private APIs to set it
screencapture CLI's windows during selection

The windows surprised me.
Why would you need 5 windows to draw something you could draw in one, using Core Animation or an NSView?

screencapture has existed since Mac OS X Jaguar (10.2), and there was no Core Animation framework yet, so that’s out of the running.
That leaves NSView’s -drawRect:. Why not use that? Frankly, I don’t know. But I suspect it’s a performance thing – perhaps drawing 5 individual single-colored windows was faster on Mac OS X Jaguar (and still might be on today’s macOS) than one NSView’s -drawRect: and they just kept going with it over the years.
A friend and indie-colleague of mine, Andreas Monitzer (@anlumo1 on twitter) confirmed my suspicions: “Single-colored windows don’t need -drawRect:, and that’s probably just way more efficient.”

Also interesting is that screencapture adds specific Spotlight metadata tags to the screenshot files it creates:

  • kMDItemIsScreenCapture – a boolean value indicating whether the image file is a screenshot (YES). Only present when the file is a screenshot, so there’s no case where NO would be specified – the tag would be missing instead.
  • kMDItemScreenCaptureType – the type of screenshot: “display” for a fullscreen screenshot, “window” for a window-selected screenshot and “selection” for an interactive screenshot.
  • kMDItemScreenCaptureGlobalRect – Not really a rectangular value. As far as I can tell, it only contains the interactive screenshot’s origin point’s x value (where on the screen the screenshot was taken).

To set the cursor, I suspect there’s some private API magic at play.
An app can set its cursor in different ways: via cursor rectangles or directly via NSCursor. But it only works if the app is active and its window is frontmost – something that isn’t the case with screencapture.
‘nm’ reveals the private API CGSRegisterCursorWithImages, and several more:

U _CGSCreateRegisteredCursorImage
U _CGSGetCurrentCursorLocation
U _CGSHardwareCursorActive
U _CGSHideCursor
U _CGSRegisterCursorWithImages
U _CGSSetSystemDefinedCursor
U _CGSShowCursor
U _CoreCursorSet

So much goodness I’m not able to use in a Mac App Store app, just because it’s hidden away in a private API…

screencapture also does another thing I was interested in: When you press ⌘-⇧-4, followed by the space bar, you can move your mouse cursor over individual windows to screenshot them exclusively, and it’ll tint the window under the cursor to let you know which one is selected.
I have no idea how it’s done – it’s either a window that gets painted over the selected window, inserted at the correct hierarchy level, or it’s a private API that lets you paint over any NSWindow.

Behind-Window-Window-Selection

Now I was ready to begin.

Reimplementing screencapture

In reimplementing the features I needed, I had to hit several milestones:

Milestone #1: Actually capturing the screen somehow

Because I was using screencapture via NSTask to do all the heavy lifting for me, I had no idea where to start for creating user-selected screenshots.
I started with giving AVFoundation a try, as I remembered a couple of WWDC sessions mentioning capturing the screen – for video. Soon enough, though, issues popped up.
Like the image’s compression. A screenshot created with AVFoundation wasn’t anywhere near the quality a screencapture screenshot has – although it was to be expected, since it’s for videos, and videos are heavily compressed.
When text is involved, it’s especially painful:

Comparison of AVFoundation capturing and screencapture's output. Left: AVFoundation’s output. Right: screencapture’s.

There are different quality settings you can try to play with to improve the shot a little, but it’ll never come close to anything screencapture produces. That’s just unacceptable.
After poking around screencapture, I learned Apple provides APIs for exactly this purpose: CGDisplayCreateImage, to capture an entire screen, and CGDisplayCreateImageForRect, for a manual selection, which is what I was interested in.

Milestone #2: Drawing into a completely transparent NSWindow

For NSWindow to react to mouse events, it needs to have a colored background with an alpha value of at least 0.05. That sounds very low, but it’s still very noticeable when it’s suddenly put over your screen. You may not be able to pinpoint it, but you know something happened.

A white window with an alpha value of 0.05. Left: The desktop. Right: The desktop below a white window with an alpha value of 0.05.

That’s why I’m very grateful to Nick Moore, indie developer of PilotMoon fame.
He discovered that you can have a completely transparent NSWindow accept mouse events, by setting its contentView’s layer’s contents to a transparent NSImage.

Milestone #2 and a half: Selection Drawing

With that out of the way, I could move on to actually drawing a selection.
For what the selection would look like, I didn’t have to think much. I wanted to keep it consistent with screencapture: white borders with a white-transparent fill.
But the APIs I’d use to draw the selection were up to me.
Since I didn’t want to use 5 windows like screencapture does, I briefly experimented with NSView’s -drawRect:, but discarded that in favor of something more modern and more performant: CAShapeLayer.
In my tests, it looks and feels just like the original, and even if it doesn’t, the difference is indistinguishable to my eyes (on a retina MacBook Pro Mid-2012).

The selection’s functionality would have to be the same as screencapture’s – drag to select, keep the space bar pressed to move the selection, release the space bar to continue the selection, press the space bar once to enter window-selection mode, press it again to exit again. Nothing that couldn’t be done with NSResponder’s ordinary methods.

Moving a selection with the space bar. My reimplementation’s behavior when using the spacebar to move the selection.
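The drag and space-bar behavior described above boils down to a little bookkeeping that is independent of AppKit. Here is a minimal sketch in plain C, assuming the event handlers (-mouseDragged:, -keyDown: and so on) feed it the mouse position and the space bar state. The names Vec2, Box, Selection, selection_drag and selection_rect are made up for illustration and are not taken from my actual code.

```c
#include <stdbool.h>

typedef struct { double x, y; } Vec2;
typedef struct { double x, y, w, h; } Box;

/* Hypothetical selection state: where the drag began and where the
   mouse is right now. */
typedef struct {
    Vec2 anchor;
    Vec2 current;
} Selection;

/* Call for every mouse-dragged event. While the space bar is held, the
   anchor shifts by the same delta as the mouse, so the whole selection
   moves; otherwise only the current corner follows the mouse, so the
   selection resizes. */
void selection_drag(Selection *s, Vec2 mouse, bool space_down)
{
    if (space_down) {
        s->anchor.x += mouse.x - s->current.x;
        s->anchor.y += mouse.y - s->current.y;
    }
    s->current = mouse;
}

/* Rectangle to draw (and later capture). Width and height are never
   negative, no matter in which direction the user dragged. */
Box selection_rect(const Selection *s)
{
    Box r;
    r.x = s->anchor.x < s->current.x ? s->anchor.x : s->current.x;
    r.y = s->anchor.y < s->current.y ? s->anchor.y : s->current.y;
    r.w = s->current.x > s->anchor.x ? s->current.x - s->anchor.x
                                     : s->anchor.x - s->current.x;
    r.h = s->current.y > s->anchor.y ? s->current.y - s->anchor.y
                                     : s->anchor.y - s->current.y;
    return r;
}
```

The resulting rectangle would then be handed to whatever draws the selection, for example as the path of the CAShapeLayer.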

Milestone #3: Custom Selection- and Window Capturing

Because of CGDisplayCreateImageForRect, creating a screenshot of a custom selection is fairly straightforward.
Window capturing is a little more work. You *could* do it with CGDisplayCreateImageForRect, but you’d have remnants beneath the window’s rounded corners of anything that was below it at the time of the capture.
It’s good to know Apple provides an API for those cases as well, then: CGWindowListCreateImage. It will let you define the window you’d like to capture and the “features” it should have (shadows, no shadows, include windows below it, don’t include them, etc).
You can get a reference to the window you’d like to capture using CGWindowListCopyWindowInfo – it’ll give you its rect on screen, its window level and more information about it.
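Finding the window under the mouse cursor from that information is a simple hit test. The sketch below is plain C and assumes the window list has already been copied into an array that is sorted front to back (which, as far as I know, is the order CGWindowListCopyWindowInfo uses for on-screen windows). WindowInfo is a simplified, hypothetical stand-in for the dictionaries the real API returns, and window_index_at_point is an illustrative name.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct { double x, y, w, h; } Box;

/* Hypothetical, simplified stand-in for one entry of the window list. */
typedef struct {
    unsigned int id;   /* window number */
    int layer;         /* window level */
    Box bounds;        /* rect on screen */
} WindowInfo;

static bool box_contains(Box b, double px, double py)
{
    return px >= b.x && px < b.x + b.w &&
           py >= b.y && py < b.y + b.h;
}

/* Return the index of the frontmost window under (px, py), or -1 if
   there is none. `windows` is assumed to be sorted front to back, so
   the first hit is the window the user sees under the cursor. */
int window_index_at_point(const WindowInfo *windows, size_t count,
                          double px, double py)
{
    for (size_t i = 0; i < count; i++) {
        if (box_contains(windows[i].bounds, px, py))
            return (int)i;
    }
    return -1;
}
```

The id of the window found this way is what you would pass on to the capturing call.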

Milestone #4: Window Selection Drawing

That last API, CGWindowListCopyWindowInfo, also comes in handy when drawing the window selection (for when you hit the spacebar).
After all, you need to know where all the windows are and their dimensions on the screen, so you can draw the selection accordingly.
Once I have that information, it’s easy to put up another CAShapeLayer above the selected window.
But wait. What if the selected window is beneath another window, or several windows?

Faulty Window Selection. A first try at implementing window selection, clearly failing for windows that are beneath other windows.

That’s where it gets tricky and why I assume Apple is using private APIs here, which let it either insert a new, selection-color-colored, translucent window into the window hierarchy at the right position or draw directly onto the window.

My solution was to use another CAShapeLayer.
It’s based on a CGPath (which I convert from an NSBezierPath) to draw the colored overlay.
I use the results from CGWindowListCopyWindowInfo to find out which windows are atop the currently selected window, create an NSBezierPath from their bounds, subtract them from my initial NSBezierPath and feed that to the CAShapeLayer.
It works pretty well:

Working Window Selection

It didn’t work like this right away, though. There was a lot of trial and error, sweat and, yes, tears involved in this. But I think it was worth the effort. Doesn’t it look just like the original?
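To illustrate the idea of subtracting the windows in front, here is a minimal sketch in plain C. Note that it swaps the technique: instead of subtracting bezier paths, it splits the selected window’s rectangle into smaller rectangles that remain visible, which gives the same result for plain rectangular windows (rounded corners are ignored). It assumes the window rects are sorted front to back. Box, MAX_PIECES, box_subtract and visible_pieces are hypothetical names.

```c
#include <stddef.h>

#define MAX_PIECES 64

typedef struct { double x, y, w, h; } Box;

static double dmax(double a, double b) { return a > b ? a : b; }
static double dmin(double a, double b) { return a < b ? a : b; }

/* Write the parts of `r` that are not covered by `cut` into `out`
   (room for 4 boxes) and return how many were written. */
static size_t box_subtract(Box r, Box cut, Box out[4])
{
    double ix0 = dmax(r.x, cut.x), iy0 = dmax(r.y, cut.y);
    double ix1 = dmin(r.x + r.w, cut.x + cut.w);
    double iy1 = dmin(r.y + r.h, cut.y + cut.h);
    size_t n = 0;

    if (ix0 >= ix1 || iy0 >= iy1) {      /* no overlap: r survives whole */
        out[n++] = r;
        return n;
    }
    if (iy0 > r.y)                       /* full-width strip before the overlap */
        out[n++] = (Box){ r.x, r.y, r.w, iy0 - r.y };
    if (iy1 < r.y + r.h)                 /* full-width strip after the overlap */
        out[n++] = (Box){ r.x, iy1, r.w, r.y + r.h - iy1 };
    if (ix0 > r.x)                       /* piece to the left of the overlap */
        out[n++] = (Box){ r.x, iy0, ix0 - r.x, iy1 - iy0 };
    if (ix1 < r.x + r.w)                 /* piece to the right of the overlap */
        out[n++] = (Box){ ix1, iy0, r.x + r.w - ix1, iy1 - iy0 };
    return n;
}

/* Compute the visible parts of windows[selected]. Every window with a
   smaller index is assumed to be in front of it. Returns the number of
   boxes written to `out`; pieces beyond MAX_PIECES are dropped. */
size_t visible_pieces(const Box *windows, size_t selected,
                      Box out[MAX_PIECES])
{
    size_t count = 1;
    out[0] = windows[selected];

    for (size_t w = 0; w < selected; w++) {
        Box next[MAX_PIECES];
        size_t next_count = 0;
        for (size_t p = 0; p < count; p++) {
            Box parts[4];
            size_t n = box_subtract(out[p], windows[w], parts);
            for (size_t k = 0; k < n && next_count < MAX_PIECES; k++)
                next[next_count++] = parts[k];
        }
        for (size_t p = 0; p < next_count; p++)
            out[p] = next[p];
        count = next_count;
    }
    return count;
}
```

The remaining boxes could then be added to one path and handed to the shape layer. A return value of 0 means the selected window is completely covered.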

Milestone #5: Drawing the Cursor

The cursor in screencapture has a unique feature: it displays the coordinates next to it, or if a selection is being made, the dimensions (width and height) of that selection.
I wanted the very same thing, so custom drawing was necessary.

I started out using NSCursor, but for every mouse move I’d need to create a new NSImage with the coordinates (or dimensions) and set it as the cursor – that seemed pretty inefficient to me.
I then moved on to using part NSCursor, part CATextLayer. The NSCursor part would be the crosshair, a constant image, set-it-and-forget-it.
The text layer would be updated in -mouseMoved:, updating its position to be next to the cursor and its contents to reflect the cursor position or dimensions.
Sure enough, it worked, but when moving the mouse cursor around fast, the text layer would “lag behind”, not correctly sticking to the cursor. It’s nothing major, but it bothered me.
With this, we come to my final approach.
I hide the system cursor entirely and draw both the crosshair and the text using CALayers. Since they both are now updated inside the same -mouseMoved: (or -mouseDragged:) call, there’s no noticeable “lag” for the text layer – they now move together nicely, as if drawn in one NSCursor object.
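The point of that final approach is that crosshair and label are derived from the same event in one go. A minimal sketch of that update step in plain C, assuming the values are afterwards copied onto the two layers: CursorOverlay, LABEL_OFFSET, cursor_overlay_update and the label format are all made up for illustration.

```c
#include <stdbool.h>
#include <stdio.h>

#define LABEL_OFFSET 12.0  /* hypothetical gap between crosshair and text */

/* Hypothetical state for the two layers that replace the system cursor. */
typedef struct {
    double crosshair_x, crosshair_y;
    double label_x, label_y;
    char label[32];
} CursorOverlay;

/* Call from both the mouse-moved and the mouse-dragged handler.
   Crosshair and label are updated together from the same mouse position,
   so the text cannot lag behind. While a selection is being made, the
   label shows its dimensions, otherwise the cursor coordinates. */
void cursor_overlay_update(CursorOverlay *c, double mouse_x, double mouse_y,
                           bool selecting, double sel_w, double sel_h)
{
    c->crosshair_x = mouse_x;
    c->crosshair_y = mouse_y;
    c->label_x = mouse_x + LABEL_OFFSET;
    c->label_y = mouse_y - LABEL_OFFSET;

    if (selecting)
        snprintf(c->label, sizeof c->label, "%d x %d", (int)sel_w, (int)sel_h);
    else
        snprintf(c->label, sizeof c->label, "%d, %d", (int)mouse_x, (int)mouse_y);
}
```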

Milestone #6: Focus without focus

The tricky thing about screencapture is that it has mouse and keyboard focus without it taking you out of the app you’re in (the window’s close, minimize and fullscreen buttons are not greyed out, for example).
Again, in my desire to be consistent with the Apple-provided command line utility, I needed the very same thing.

It’s difficult to get certain NSResponder method calls if your window is not key, and it prompted another trial and error session to get the right combination of -isKeyWindow, -acceptsFirstResponder, etc.
It now works, but it’s not as nice as Apple’s implementation.
Which brings us to the Caveats section.

Caveats of my implementation

Caveat #1: “Across-Screens Screenshots”

With screencapture, you can make a selection that spans several displays, due to the way it’s implemented (using NSWindows to draw the selection).
With my implementation, that’s not possible, as I put up one transparent window for each screen that’s connected to your Mac. But I settled – I think it’s very rare you’d make that sort of a selection.

Caveat #2: The Cursor

In some cases, the system cursor will pop up again, and it won’t go away until a new screenshot session is started.
This is the hideous result:

The system cursor drawn above the custom cursor

Caveat #3: Capturing Fullscreen Windows

When you click the green fullscreen button on a window, it transitions from being one window into being two windows without you knowing about it – one for the titlebar/toolbar (which moves down a little when you move your mouse to the top edge of your screen to reveal the menu bar), and one for the actual window.
screencapture is somehow aware of this, and when you do a window selection, it will properly draw its selection rectangle above the entire window.
My implementation doesn’t know about fullscreen windows and treats those two windows separately:

My implementation's fullscreen window selection bug. Instead of the entire window, just the titlebar and its shadow are selected.

I haven’t found a solution to this yet.

Caveat #4: Exposé

When you start Exposé to show all open windows, screencapture can be used to screenshot them individually.
My implementation falls short of that, as Exposé seems to be a semi-modal mode where other windows cannot be moved over it.
Of course, screencapture can.

Pros of my implementation

Apart from these caveats, I also see a couple of upsides to having your own implementation:

Pro #1: Control over the UX

With a custom implementation, I can change anything I want at any time, be it any cursor icon or the behavior in general. I don’t have that luxury with Apple’s built-in tool.

Pro #1 and a half: Timed Shots

Speaking of UX:
Yes, Apple’s built-in tool features timed shots (a screenshot that is not created immediately, but after a small delay). But I think it’s clumsily implemented.
When you start a timed shot:

  • You don’t know how much time there’s left until the shot is taken
  • You can’t click anything beneath the selection rectangle

With a custom implementation, I can provide a better experience here.

Pro #2: Screenshot placement

Something that concerns ScreenFloat exclusively is the placement of the floating shot after taking it.
Using screencapture, I just centred the shot at the mouse cursor. It works, but it’s not very nice.
With a custom implementation, I know exactly where the screenshot was taken and can place it accordingly:

Screenshot Appearance

or I can do a little animation to make it more clear that a screenshot was created:

Screenshot Appearance Animation

Pro #3: Sandbox

With a custom implementation, I don’t need to worry about temporary entitlements – it works without any.

Open Source?

I’m planning on making the source available at some point, but before I do, there’s still a couple of things I need to do, like implement timed shots, which I haven’t gotten around to yet.
Also, so I don’t make a fool of myself, I need to clean up the code, and that’s dependent on the free time I get, which recently is little (I’m not complaining – I love being busy).

Anyway, this has been quite the journey, and if I can manage to fix some of the caveats I described above, I might use my own implementation instead of Apple’s built-in screencapture CLI in ScreenFloat at some point. But for now, I’m happy I’m again able to use Apple’s built-in tool.


Yoink Automator Workflow: Add Last Saved File to Yoink

Yoink users have been automating adding various files to Yoink via Automator Workflows for a while now – from adding mail attachments or screenshots, to adding files from the Terminal.
Douglas (@douglasjsellers on twitter) today adds to this list of wonderful workflows an Automator Workflow that lets you quickly add files that you created/saved recently.

Here’s what he says about it:

I cooked up my Automator Service that lets you send the last file that you saved (from any application) to Yoink.
When bound to a key combination this allows you to do things like “Export to Web” from Adobe Photoshop, hit the key combo and then the newly created png is on Yoink.
Or say you’re editing a file in Emacs and you want to add it as an email attachment. You save the file, hit the key combo and the file will then be in Yoink for easy dragging into your email.
I also use it heavily to get recently downloaded files from Chrome to Yoink.

The Automator Workflow

Screenshot of Automator Workflow for Yoink

The main part of this workflow consists of a complex shell script that finds files that were recently saved, excluding files that are less likely to be needed in Yoink – obviously, this is something everybody needs to configure for themselves, but since this is an Automator Workflow, it is easily done.

Download

The Automator Workflow is available for download here (~130 KB).
My thanks to Douglas for his awesome work.

Installation & Keyboard Shortcut Setup

To install this workflow, download it from above, unzip it, double-click it and click on Install when this dialog comes up:

Installing the Automator Workflow. Click on Install if you’d like to install the service, click on Open with Automator if you’d like to make changes.

To create a keyboard shortcut for this service:

  1. Launch System Preferences
  2. Click on Keyboard -> Shortcuts -> Services
  3. Find ‘Send Last File to Yoink’ in the list, under ‘General’
  4. Click on ‘add shortcut’ and enter the shortcut you’d like to use to activate the service

If you have any feedback regarding this workflow or if you’d like to share a workflow of your own, please be sure to get in touch either via twitter or eMail. Thank you and enjoy :)

How To: Detect Force Touch-Capable Devices on the Mac

In an effort to show preferences for configuring Force Touch in my apps (in particular, Yoink) only when a Force Touch device is actually available, I had to find a way to detect Force Touch devices.

If this was iOS, I’d be done by now

On iOS, Apple provides a simple API for this:

UIForceTouchCapabilityUnavailable and UIForceTouchCapabilityAvailable

which you can check by reading the forceTouchCapability property of a UIView’s traitCollection. Alas, a simple API like this is not available on OS X. On the Mac, you have to do it yourself.

IOKit is where it’s at

Using IOKit, you can set up a set of properties you’re looking for in a device and see if it returns anything. This does not only include external or internal keyboards, mice or trackpads, but also graphics cards, for example. To find out what the right properties are, I downloaded Apple’s Hardware IO Tools for Xcode 7.1 from their developer downloads site (Apple Developer account required) and launched the app IORegistryExplorer.

Digging down the IO Registry

I do have a Magic Trackpad 2 “attached” to my Mac via Bluetooth, so I tried searching for the term “Trackpad”, and sure enough, I saw my internal and external ones:

IORegistry Trackpads. The Trackpads attached to my Mac, either via Bluetooth or internally.

Having found the Magic Trackpad 2, the next step is to see what properties it offers and if they are unique to the class of the device:

Trackpad Properties. Jackpot!

Sure enough, there it is – ForceSupported: True. I could not find such a key in the internal trackpad’s properties, hinting that it might be exclusive to devices that do support Force Touch. There’s also an entry for “Manufacturer”, which is “Apple Inc.”. Perfect.

Looking for Devices with IOKit

Now all I have to do is filter devices by the Manufacturer (“Apple Inc.”), iterate over the resulting devices, and pick out those that have a DefaultMultitouchProperties key containing a ForceSupported key with a value of true. If such a device is found, it means a device with Force Touch capabilities is available.

Code Listing #1

In this method, I create a dictionary mDict that is used to find matching devices. In this case, I’m looking for devices with the Manufacturer set to “Apple Inc.”. I query for possible devices using IOServiceGetMatchingServices. I can then iterate over the returned io_iterator_t iterator and recursively over the children in the core of all of this: – (BOOL)_containsForceTouchDevice:(io_iterator_t)iterator;.

Code Listing #2

Here, we iterate recursively over iterator’s objects, checking for the DefaultMultitouchProperties key and, subsequently, the ForceSupported key (and value).
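The shape of that recursion can be shown without IOKit. The following is only a sketch in plain C over a hypothetical in-memory tree, not the code from the listing: RegistryEntry and contains_force_touch_device are made-up names, and in the real thing the entries would come from the io_iterator_t (IOIteratorNext), the children from a child iterator and the two flags from the entry’s properties.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical in-memory stand-in for an I/O Registry entry. */
typedef struct RegistryEntry {
    const char *name;
    bool has_multitouch_properties;  /* has a DefaultMultitouchProperties key */
    bool force_supported;            /* ForceSupported is true inside of it */
    const struct RegistryEntry *children;
    size_t child_count;
} RegistryEntry;

/* Depth-first search: returns true as soon as any entry, or any of its
   descendants, advertises Force Touch support. */
bool contains_force_touch_device(const RegistryEntry *entries, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        const RegistryEntry *e = &entries[i];
        if (e->has_multitouch_properties && e->force_supported)
            return true;
        if (contains_force_touch_device(e->children, e->child_count))
            return true;
    }
    return false;
}
```

Checking both flags mirrors the two-step lookup described above: first the DefaultMultitouchProperties key, then ForceSupported within it.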

Testing it with other Trackpads

The code you see above is final. However, in a previous version, all I could use to test it with was my Magic Trackpad 2 – an internal Force Touch Trackpad (the likes of which the new MacBooks and MacBook Pros feature) was not available to me directly. So I sent the first draft of the code to my friend (and fellow developer) Maurice Kelly (@mauricerkelly on twitter) who was kind enough to volunteer as my “test subject”; and – of course – it didn’t work. Turns out I shouldn’t assume the vendorID to be the same (which I used in the first draft just like the Manufacturer to filter the possible results). After leaving it out, it worked fine over all currently available Force Touch Trackpads.

Hot (Un-)Plugging

Sometimes you might want to get notified when devices are plugged in to or unplugged from the Mac. In Yoink, I’d like to display preferences specific to Force Touch if corresponding hardware is available, but not show them if there isn’t any hardware connected that supports it. Also, as a nice touch, I’d like to hide the preferences when such hardware is disconnected and show them again when it is connected. We can accomplish this like so:

Code Listing #3

Sandbox

The code works just as well in the Sandbox environment. These entitlements have to be set, though, for the notifications to work:

Sandbox Entitlements

com.apple.security.device.usb and com.apple.security.device.bluetooth

Future Proof?

I’m not sure this code is future proof. The basic APIs will stay around, sure, but it all depends on the keys DefaultMultitouchProperties and ForceSupported and their corresponding values, none of which are defined in the IOKit header files (I searched them unsuccessfully for constants pertaining to force/pressure). So the code might break with future versions of the Force Touch Trackpad, and I’ll have to keep testing it as Apple releases new hardware. Nevertheless, I’m quite happy with the code and it works very well.

Getting the Source Code

I’ve uploaded a sample project to my server; you can download it here. The category on NSApplication in which this is available can be downloaded from GitHub. The example app is pretty simple:

Force Touch Device Detection Example App Screenshot

At launch, it asks if a Force Touch capable device is available and displays a corresponding message. If subsequently the availability status changes, the message will be updated accordingly.

If you’d like to get in touch, you can mail me or write me on twitter. I’m looking forward to hearing from you :)