Digital Media | Spotlight: Photography | Inside Aperture | Blogs
The Deep Dark Depths of the Database (and some fingerprinting)
Yeah, sounds promising as a blog post for mere mortals, doesn’t it? ;-)
In the process of developing Aperture Assistant, I’ve been quite a lot of time delving around inside Aperture’s SQLite3 database file. Initially, this was an extremely daunting prospect as my only formal programming training comes from some Basic and LOGO a couple of decades ago at school, but by starting slowly and reading around on the net I started getting somewhere.
So, why access the DB directly?
When I started automating Aperture it’s AppleScript Dictionary was relatively skimpy, and there were certain limits to how much information you could get via scripting. That’s changed a lot in the last few releases but there’s still information in there that you can’t get at any other way, like the location of the Master file, the capture minute and second (vital for making more sophisticated naming conventions), whether a Preview exists and a few other bits and pieces.
What is a database?
At it’s simplest, a database is a big series of tables, like those you’d find in Word, Pages, Excel, Numbers etc. So you might have a table which contains information about each Version (in Aperture’s database this is the ZRKVERSION table), such as the Version Name, whether there are any adjustments (so Aperture knows whether to show the ‘adjusted’ badge), if the Preview is up to date, what the rating is, what Project the Version belongs to, post-cropping and rotation pixel dimensions, the VersionUUID (Unique Universal ID, a way of uniquely identifying something, oddly enough) etc.
Then you might have another table (the ZRKFILE table, seeing a pattern here?) with all the information about the Master files, remembering that you can have multiple Versions per Master so you can’t easily stuff it all into the same table. In here you can store info such as the file size, Master location, file creation date, is it a RAW file etc.
Then another for all the searchable information such as EXIF and IPTC data, another for Projects etc. etc. Altogether there are 21 tables in an Aperture 2 database.
Why use databases?
Most of this information is also stored in the multitude of XML files buried in the Library package, so why doesn’t Aperture read the information from those? One word - speed. The database can be loaded up into RAM in one go as it’s a single file, reading the same info from tens or hundreds of thousands of files would be enormously slower.
So, where is this database file?
Hidden. I’m not going to tell you.
No, seriously. This is a file that you should not be fiddling with. Go away and forget about it.
Oh, OK then. Go find your Library in the Finder, Right/Control-click on it and choose ‘Show Package Contents’. You will see a new Finder window showing the contents of your Library. It will look surprisingly familiar - all your top-level folders are there, as are all your Projects, but with the file extension ‘.approject’ added. What we want is a folder called ‘Aperture.aplib’, go into this and the database file is called ‘Library.apdb’.
Do NOT make changes to this file! Reading it is fine, except sometimes when Aperture is launching, but writing to it, changing it’s name or doing other silly things can severely screw up your Library. Don’t do it!
So, how to read it? SQLite3 is a standard, widely-used database format, in fact it’s the exact same type of database used by LightRoom. This means that there are tools out there designed to let non-programmers access these database files. One that I’ve had good use out of is SQLite Database Browser. It’s not been updated for awhile, but it does what it’s designed to do, so that’s no problem. Drag ‘Library.apdb’ onto SQLite Database Browser and it will open the database, first giving you a list of all those oddly-named tables. You can also view the names of all the columns in the tables.
But the bit that’s really handy is the Browse Data tab - in here you can view all the data in a table, for instance looking at the information that is stored about each Version.
However, there’s a table you should avoid looking at if you have a large Library - ZRKSEARCHABLEPROPERTIES. This contains every searchable property for every Version in the Library, such as EXIF, IPTC & custom tags. This can be a scary table to look at - in my 51,000 image Library it has over 2,000,000 rows! SQL Database Browser struggles a bit with this.
Ch-Ch-Changes...
Aperture’s database underwent some pretty wide-ranging changes going from Aperture 1.5 to 2, with new tables, new columns, changed column names etc. But the changes are definitely for the better. There were a few bits of info that, frankly, were a pain to find - you’d have to look up a value in one table to know where to look in a second table to find the term to search for in a third or fourth table, and the columns might have different names in different tables, so the ‘zmaster’ column in one table was the ‘z_pk’ column in another table.
Aperture 2 added more columns to most of the tables and rationalised the column names, so that columns are now more likely to have the same name in different tables.
As a specific example, in Aperture 1.5 finding out the ID of a Version’s Master involved looking up the ‘zmaster’ column in the Versions table and then matching this up with the ‘z_pk’ column in the Masters table so you could look up the ‘zuuid’ column to get the Master ID.
In Aperture 2 you simply look at the ‘zmasteruuid’ column in the Versions table.
As you can imagine this is much faster as you’ve immediately cut the number of DB hits in half for a relatively small increase in DB file size, and the simplified naming makes it far easier to trace the paths of information through the different tables.
Preview Fingerprinting
A tie back into userland stuff - that VersionUUID we saw back in the ZRKVERSION table is the unique ID for finding a specific Version, and what is called the ‘image version ID’ in AppleScript - if you’ve listened to Derrick & Sal’s latest podcast about Aperture-InDesign integration, this is the fingerprint information that is embedded into Previews with Aperture 2.0.1 and above.
This makes it very easy for developers to do interesting things integrating Aperture with other apps - ‘simply’ read the IPTC Special Instructions tag from the Preview file and you’ve instantly got the image version ID for use in scripts!
As a simplified example of the kinds of things this makes possible, I took some of Sal’s InDesign example code and stripped it down to a near-minimum - take a set of exported Preview files and drop them onto an AppleScript droplet to add the originating Versions to a new Album in Aperture.
http://aperture-assistant.com/post/108
This means that you could pass reduced quality Preview JPEGs to a client, have them return the ones they want hi-res files of, and easily make an Album collecting them together by dropping the returned files onto the droplet. No searching through the Library for the right Versions, no worrying about people fiddling with the names of the files you sent them, no worrying that you’ve changed the Version name.
I tell you, this fingerprinting stuff is going to be HUGE when it comes to integrating Aperture into wider workflows. :-)
Hmm, thinks - do Aperture’s web output images contain the fingerprint? A quick test shows that they don’t, time to go visit the feedback page again...
Ian
Comments (6)

Just to clear one thing up - the speed difference between XML files and databases don't have anything to do with it being a single file that can be loaded into RAM in one go. In fact, just about everything about that statement is wrong. First, I can certainly create an XML schema that would store all of that data in file, but that wouldn't gain me any speed. Second, it is not often that we would load an entire database into RAM - usually they won't fit (in truth, most database systems bypass the filesystem altogether and have all sorts of sneaky ways to access the disk directly). Granted SQLite databases are a little different in that regard, but that isn't terribly important. The real difference in speed comes from how the data is accessed in the file. In an XML file, the data has to be accessed linearly - we start at the beginning and keep reading until we find the data we want. In fact, this is usually why we get collections of XML files than the single monolithic one. Picking the correct small file is faster than reading through the large one. To put this in the broadest terms, databases are stored in a much more clever fashion that allows us to jump nearly instantly to the information we want (depending on the database schema and what we are looking for, of course). That is where the speed difference comes from.
So I got the results right, but for the wrong reasons?
Ian
Hi Ian (and Black),
that´s the kind of article we should see more often in this blog.
Great information, thank you.
Francois
Yup.
So why did I go to such lengths to correct a relatively small point that had little bearing on your overall article (It seems like that is the subtext of your comment)? The short answer is that I teach computer science. As you have probably found, to most professors, details matter, and it is hard to resist the urge to teach. As you said, you've been picking up this information from the net. Well, this page just became part of the collective knowledge that will be used by the next person in your shoes. While my explanation isn't necessary for your article, perhaps it will help the reader who later has a project of his or her own (or even you, on further projects) where the difference does matter.
No subtext, just asking. :-)
I tend to slip into lecture mode fairly easily myself and can be *extremely* fussy about apparently minor details...
Ian
Great article. I agree that more detailed and technical Aperture articles are a great way forward. Also thanks to Black for the updates on XML performance. Interestingly, Firefox also uses SQLlite to store its internal configuration settings.
You mention that Apple significantly changed the database schema between 1.5 and 2. Do you see Apple making changes to the schema in subsequent Aperture 2 point releases ? This could add some risk to developing scripts that directly access the database. Does Apple provide any documents for accessing the Aperture database directly, or do they discourage it ?
The fingerprinting stuff does seem to open up a whole host of possibilities for external workflow scenarios. With the new plug-in architecture, interesting times ahead for Aperture.
Cheers, Ian