Up until this point, we have been adding to our DAM drives. Consolidation gathered everything together. Conformation added both order and context to the working DAM drive while identifying possible duplicates within it. Now, act on this information and begin to cull the repository of duplicate data and media you do not need to process.
This is the riskiest part of the process because it is subtractive.
When the cull is complete, you will have less data than at the start. In the end, this will be a good thing as you reclaim disk space and narrow your image collection down to only the images you want to process, preserve and distribute.
Continue to work only on your working DAM drive. The other drive remains a valuable backup in case anything goes wrong during the cull. And, just in case, our original source cards/drives remain untouched in the WIPE box, waiting for us to finish.
The first step in the process is removing duplicates, commonly known as deduping. There are two basic options for deduping: manual and automated. As one might suspect, manual deduping is more time-consuming, but allows for a more subjective analysis of the dupes to be removed. Automated dedupe tools are generally faster and, in many ways, more accurate.
By comparing file sizes and EXIF data and by running checksum analysis, dedupe software compares folders and files at a deeper, more technically precise level. The tradeoff is less opportunity for subjective (human) comparison of the files and, depending on how the software manages dupe deletion, the potential for a smaller but less well-organized repository.
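For readers curious about what checksum analysis involves, here is a minimal Python sketch. It is not how Gemini or any other dedupe product works internally; it only illustrates one common approach, assumed here for the sake of example: group files by size first, then hash only the files that share a size. The names `sha256_of` and `find_duplicates` are my own placeholders. The script only reports what it finds. It never deletes anything.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Group files under root whose contents are byte-for-byte identical."""
    # Pass 1: bucket files by size, which is cheap to read.
    by_size = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file() and not path.is_symlink():
            by_size[path.stat().st_size].append(path)

    # Pass 2: hash only the files that share a size with another file.
    groups = []
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue  # a unique size cannot have a duplicate
        by_hash = defaultdict(list)
        for path in same_size:
            by_hash[sha256_of(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return sorted(groups)
```

Calling `find_duplicates` on a folder returns a list of groups, each group being the paths that hold identical bytes. Note that a check like this only catches exact copies; a re-exported or resized version of the same frame will not match, which is where the human review still earns its keep.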
When deduping, we are concerned about two types of duplicates:
- Project Duplicates (folders or libraries)
- Media Duplicates (individual frames or movies)
How and when you dedupe will largely depend on where the media currently lives (folders vs. libraries). This guide covers both scenarios.
Duplicate Project Folders
For file system duplicates, I use Gemini, a tool that offers a hybrid approach. When it finds duplicates, Gemini presents them in a visual interface that allows confirmation of folders or individual files.
Let’s walk through a simple example. Here, we have our new repository with two suspected duplicate folders.
To confirm these are duplicates, drag the parent folder (01_FAMILY) over to Gemini and run a scan. The result looks like this:
Click on the individual folders to open them up and do a manual check. Once duplicates are confirmed, check the box next to the folder to be removed and click Remove Selected to send it to the Trash. In this example, if there were more than one duplicate folder, they would have all shown up in the interface.
We could also have selected just the two suspected duplicate folders and run the scan. The result would have shown suspected duplicates shared between the two (or more) duplicate folders.
This view shows suspected duplicate files. When a suspected duplicate file is clicked, the file is displayed in the interface. You can switch between them to do a manual check from right in the application. Removing duplicate files follows the same process as removing duplicate folders.
One could simply point Gemini at the root level of the repository and it would find all of the duplicate folders and files. However, I suggest moving through the subfolders (e.g. 00_FAMILY) that we created during the consolidation step. Large DAM drives can tax the best deduping tools. Working in smaller chunks puts less stress on the system and lets one focus on a single, discrete collection of folders at a time.
Step through each subfolder in order, and you will remove most, if not all, of your duplicate folders and loose files.
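If you would like a second opinion before sending a suspected duplicate folder to the Trash, the check can be sketched in a few lines of Python using the standard library's `filecmp` module. This is a rough sketch, not a replacement for the visual review: the function name `folders_identical` is hypothetical, and ignoring `.DS_Store` files is my own assumption about what is safe to skip on a Mac. The function reads both folders and changes nothing.

```python
import filecmp
import os


def folders_identical(left, right):
    """Return True if both folders hold the same files with the same bytes."""
    # Assumption: Finder's .DS_Store files are noise and can be skipped.
    comparison = filecmp.dircmp(left, right, ignore=[".DS_Store"])

    # Anything present on only one side, or not comparable, is a difference.
    if comparison.left_only or comparison.right_only or comparison.common_funny:
        return False

    # shallow=False forces a byte-by-byte comparison instead of trusting
    # file size and modification time alone.
    _, mismatch, errors = filecmp.cmpfiles(
        left, right, comparison.common_files, shallow=False
    )
    if mismatch or errors:
        return False

    # Recurse into the subfolders that exist on both sides.
    return all(
        folders_identical(os.path.join(left, name), os.path.join(right, name))
        for name in comparison.common_dirs
    )
```

A result of `True` means the two folders are exact copies of one another. A result of `False` means something differs, so open both folders and look before removing either one.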
Duplicate Aperture and iPhoto Libraries
With the file system clean, turn your attention to your Aperture and iPhoto libraries. If you have multiple libraries assembled in one folder, the first step is to run a Gemini scan against that folder to see if any of the libraries are duplicated. If so, remove the duplicated libraries.
Next, remove duplicates from within Aperture and iPhoto. In my case, I had a relatively small collection of libraries and was not in the habit of editing photos in Aperture. So, I chose to go through each library manually to find and delete duplicates. Then, I exported the remaining images as originals.
If you have large libraries or relied heavily on Aperture to edit images, you will ultimately need to migrate directly from Aperture to Lightroom using Rich Harrington’s guide. In this case, you may want to use a software tool like Duplicate Annihilator to find and remove duplicates within your Aperture libraries. For the record, I have not used Duplicate Annihilator myself, but it comes highly recommended by several trusted friends.
If you have iPhoto libraries to move as well, I suggest pulling them into Aperture first. To do so, follow this procedure outlined by Apple. Once you have everything moved over to Aperture, you are able to process those images right along with your other libraries.
A Few Thoughts on Bad Media
When I first became serious about still photography, I was a fan of the spray and pray school of shooting. I relied heavily on burst mode and bracketing to compensate for weakness in my framing, focus and exposure skill set. As a result, my early work is full of bad frames. Now, my work still has bad frames but far fewer of them and those are much closer to the target.
When building (or rebuilding) an archive, some advocate that one should keep everything. This argument begins with RAW, which allows the modern photographer to recover shots that are poorly exposed. Who knows if a future plugin will mimic light field cameras and allow one to correct focus errors as easily as RAW allows one to fix exposure? And, after all, storage gets cheaper and faster every year.
So, why not keep everything, just in case?
Ultimately, that is a choice you’ll need to make for your own peace of mind. As for me, I have chosen to aggressively cull my back catalog of bad frames and bad sequences (e.g. HDR, time lapse). Am I embarrassed by my earlier, less stellar work? I’d be lying if I said no. But, that is not the reason.
My reasoning is much more pragmatic. Why would I go back to old images that have significant problems when I now have the skills to produce better raw materials? The answer is I wouldn’t. So, I’ve devised the following checklist to help identify and eliminate bad frames.
- Is the image in focus?
- Is the image composed well?
- Is the image exposed well?
- Is the image of a subject I may never have the opportunity to shoot again?
- Can the image be used as an example of what not to do when teaching?
- Does the image mark an important milestone, personally or professionally?
This checklist works for me. Yours may look very different. The important thing is that you devise a cull checklist that works for you. Give it some serious thought. Write it down. Revise it. And, when you have it nailed down, print it and post it next to your workstation.
NOTE: Item #6 has saved many bad frames from the chopping block, especially those from my iPhone. My iPhone is primarily used to document my family. As great as the iPhone camera has become, it still has limitations. Still, if a shot captures some important moment in my family life, I’ll forgive a lot of technical problems in the image.
Planning for Media Duplicates and Bad Media
The primary focus of this step is to remove big chunks of duplicate data from your consolidated back catalog. Now armed with a lean, clean repository of images and a well-reasoned set of cull guidelines for media duplicates and bad media, we are almost ready to pull our images into Lightroom.
Before beginning the import, you need a clear plan for when to execute the cull. There are three points where one can execute the media cull; which to use depends on where the media originated and where the media currently lives.
- Before Lightroom Import: This option is entirely manual and involves scanning all of the images in the file system and making decisions before you import to Lightroom. This approach is the most time-consuming, least user-friendly and most prone to errors. As a result, I do not advise this approach. The only exception to this is an iPhoto library synced to your iPhone. If you are like me, you have a lot of images in your phone that are not suitable for archiving. Many of mine are snapshots I text my wife to make sure I am buying the right item when out grocery shopping. You’d be surprised how quickly shots like that add up in your iPhone (and therefore iPhoto) library. For this reason, I suggest doing this cull (and only this one) before your Lightroom import.
- During Lightroom Import: If you have lots of folders of loose images, import them into Lightroom just as you would if you were plugging in a card directly from your camera. Go folder by folder and add only the photos you think you can process with great results.
- After Lightroom Import: When importing your Aperture library directly into Lightroom, you may not need to cull at all. Most likely, you did a good job of culling during the original import into Aperture. If that is not the case, you need to do your cull after Lightroom finishes bringing over your library as the migration tool assumes you want to bring the Aperture library in as a complete set.
With your plan now fully formed, it is time for the COPY step, where you will import your images into Lightroom and, when done, make backups of everything.