Need to optimize messed up library

Status
Not open for further replies.

lbeck

Active Member
Joined
May 21, 2015
Messages
150
Location
Hillsborough, NC
Lightroom Experience
Intermediate
Lightroom Version
6.x
Bottom line - is there a way to filter a smart collection for the missing files marker?

I've known for some time that my library is fraught with too many duplicates. I didn't know how many until I ran a duplicate finder plugin and found that I have about 20K dups! This is the result of importing photos from several different HDDs, CDs, DVDs, SDs, etc. Not surprising, given my paranoia about throwing away images to an irretrievable place.

I have a pretty good (though not complete) understanding of the filtering process and the files are too similar to filter by any criteria that I can find (size, date, etc.)

I understand that there is no way to blanket delete dups for good reason. But I should be able to eliminate ~10K of duplicate files. I've begun the process that will likely take weeks of spare time on and off. I'm able to find the folders where the dups are located, delete the folders (in Windows file manager) that are truly duplicates, and see the ! where the missing files are in Lr catalog. Is there some way to then filter for missing files?

I'm tempted to simply start a new catalog but still for the 40K+ images I will need to determine what has already been imported since some are in different folders by event, person, and/or date. I eventually plan to have a date-only library and take the other stuff and place it in the keyword box since one date will likely have dozens of photos e.g., from the folder "John Smith Wedding".

So I'll accept any advice or criticism regarding my dilemma. But for now I'd like to determine if there is a way to aggregate the missing files (the ones with the ! notation) so that I can bulk-mark them as a set (probably with a color label) for later removal as I become more confident that they are truly worthless duplicates. I know that I can select each dup individually but they frequently are scattered through my "all duplicates" smart collection.

Thanks for your help.
 
Menu: 'Library - Find All Missing Photos'.
 
By identical, do you mean the same filenames? The same name may appear in multiple folders.

[PS: Johan's suggestion will find all missing images, but I would like to understand how you define duplicate images.]
 
Yes. Johan's report is accurate. Don't know how I missed it.

Gnits. No - the filenames are different and the duplicate photos appear in multiple folders. Prior to Lr I organized (disorganized) my library into folders like "Smith Wedding" and then named individual files like "Susan's Mother." I also kept the original files with the camera-assigned name like DSC-2467. For images that were edited there's no problem because they were not found as duplicates.

The attached screen capture may better define my problem.
  • Photo 110 has a filename "19971022-Shannon's Sydney1.JPG" and is located in F:\Photos sent to Lr - All Through 2015\Images - Dec 24, 2005 Archive\Family-Henegars\Scottsboro Feb '03 (DC).
  • Photo 111 has a filename "19971022-19971022-Shannon's Sydney1.JPG" and is located in folder F:\Photos sent to Lr - All Through 2015\1997\1997-10-22
As you can see, they are identified as dups. Everything is the same except the filename. Sometimes there are more than 2 dups and frequently with different source HDDs.
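
For anyone who wants to pre-screen a list of paths outside Lr, here is a minimal Python sketch of how names like the two above could be matched. It assumes the only difference is a repeated leading `YYYYMMDD-` prefix, which is a guess based on this one example; the function names are hypothetical. Matching names does not prove that two images are identical, so treat the result as candidates to review.

```python
import re
from collections import defaultdict
from pathlib import PureWindowsPath

# Assumption: extra prefixes are 8-digit dates followed by a hyphen.
_DATE_PREFIX = re.compile(r"^(\d{8}-)+")


def normalize_name(filename):
    """Strip any leading run of YYYYMMDD- prefixes and lowercase the rest."""
    return _DATE_PREFIX.sub("", filename).lower()


def group_by_normalized_name(paths):
    """Group full paths whose normalized file names match.

    Returns only the groups with more than one member.
    """
    groups = defaultdict(list)
    for path in paths:
        name = PureWindowsPath(path).name
        groups[normalize_name(name)].append(path)
    return {key: members for key, members in groups.items() if len(members) > 1}
```

Both example filenames above reduce to the same key, so the two paths would land in one group.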

I'm trying to find some genius way to bulk-tag the 10K dups and quickly browse groups to delete. The best way that I've found so far is to locate each folder where a duplicate file resides and delete the entire folder. Then I can color-tag the missing photos, review the action, and delete if wanted.

Any suggestions for minimizing this effort are appreciated.
2016-08-20.jpg
 
The downside of deleting the duplicates in Explorer is you don't know which version of the photo you've edited in LR. If you haven't deleted stuff in Windows yet, there's a Lightroom plug-in that would help identify the duplicates within LR itself: Lightroom Plugins - Duplicate Finder for Lightroom
 
Thanks, Victoria. Actually, the plugin that you cite here and in your Lr 6 FAQ book is what identified the dups still in Lr. That plugin created the Smart Collection shown in my screen shot (19.4K photos). My approach so far is to:
  1. locate each duplicate pair in Explorer
  2. decide if the entire folder is duplicated; if so, and there are no keywords involved in either folder, simply delete one folder on my HDD, then tag the missing photos and delete them in Lr
  3. If one folder has dups with keywords that's of course the one that I don't delete
The problem with this approach is that frequently a duplicate file is included in different, dissimilar folders. Then I need to add a step to delete only a subset of photos in the larger folder. Good news is that most folders contain 30-50 images so there's not >19K choices needed. Bad news is that there's likely >500 looks/choices needed.

I may try Johan's cited utility. Possibly that can winnow the needed choices.

I really appreciate the expertise and suggestions provided here. Hindsight is always 20/20, but once I get my library reasonably cleaned out, I'll be a much more efficient user from now on :)

Lee
 
I used Duplicate Cleaner Free to delete hundreds of MB of duplicate NEFs, JPGs, and even WAV files from my "uncatalog" file. Recommended and you can't beat the price.

Phil
 
Is there a way to sort by file path? This would be a great tool for me in this effort because most of the dups have the same filename and most other metadata but are located in different folders (file paths).
 
Use the plugins Lr_Transporter or Listview.

LR/Transporter - Import, Export and Manipulate your Metadata from Adobe Lightroom


Each of these allows you to export metadata including file names, folder names, path names and lots of other good stuff.

I use these to create mailmerge text files with Photoshop, Indesign and Microsoft Word when I want to place metadata professionally on a page layout. [ A gripe... something that Lr should allow ]

Just import the csv files into Excel or similar.

You can use these plug-ins to order your images by folder, name, size or whatever.
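
If you would rather script the ordering than do it in Excel, here is a minimal Python sketch that sorts an exported CSV by folder and then by file name. The column headers `Folder` and `File Name` are assumptions; use whatever headers your export actually contains.

```python
import csv
import io


def sort_rows_by_path(csv_text, folder_col="Folder", name_col="File Name"):
    """Return the CSV rows as dicts, ordered by folder and then file name.

    The default column names are hypothetical placeholders.
    Comparison is case-insensitive, as Windows paths are.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return sorted(rows, key=lambda r: (r[folder_col].lower(), r[name_col].lower()))
```

Read the exported file into a string (for example with `open(path, newline="").read()`) and pass it in; copies that live in the same folder then sit next to each other in the result.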
 
Thanks, Gnits. I have listview and transporter.

After sorting in Excel is there a way to transfer the sort info to Lr? If I sort and then mark files, I would then want to somehow import this info into Lr so that I could mark the duplicate files for deletion.

I can of course delete folders in Explorer but frequently a dup file resides in two dissimilar folders. One may contain 33 files and the other 44 files. What I have been doing in Lr is to:
  • identify a subset of my "all duplicates" smart collection
  • see that half of those in the subset are in a different folder (file path).
  • mark half for deletion - wanting to identify the half using filters rather than to look at each individual file

I see filter options for almost everything except file path.

If I could somehow do my sorting in Excel and then use the sort info to mark the files in Lr, that would work well for me. If I could only see the sorted info in Excel, it would still necessitate finding those individual files and manually marking each for deletion.

Thanks.
 
After sorting in Excel is there a way to transfer the sort info to Lr?

I have read of this option with some plug-ins, but have never tried it myself. Maybe someone with more knowledge or experience may be able to add to this conversation.

I see filter options for almost everything except file path.

I find others missing on a regular basis. A shame that a tool based on a database does not expose more commonly used fields.

One field that I really miss in Lr is the colour space of the image. I often receive images from others and would like to know the embedded profile... if any without a round trip to Photoshop.
 
Thanks, Gnits.

I've used Listview and Transporter but not for a while. My impression is that the info can't be transported to Lr - I hope I'm wrong.

I have used an old, no longer available utility called PhotoImpact. Originally by Ulead and then Corel. This was great because you COULD export data to DBase format and then of course to Excel, do work on files there and then export the info back to PI. Great for sorting and bulk addition of keywords. Not nearly as good as Lr for most other functions, however.

As you suggest, maybe a ListView or Transporter guru can enlighten me regarding their import/export functionality.
 
Do all of your images start with the date taken? If so, you could move them all to folders named by year, sort by the date taken, and find a bunch of files that are the same really quickly.
 
After sorting in Excel is there a way to transfer the sort info to Lr? If I sort and then mark files, I would then want to somehow import this info into Lr so that I could mark the duplicate files for deletion.

It can't import sort info, but you can import updated metadata. For example, you could get it to set a color label for all the photos you want to delete. Or just save the filenames of the ones you want to select, then use LR/Transporter's "Mark" facility to mark photos from the list.
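
As a rough illustration of building such a list, this Python sketch pulls the file names of every row that sits in or below one folder you have decided to discard. The column headers and the function name are assumptions, not anything defined by the plug-ins.

```python
import csv
import io


def names_to_mark(csv_text, discard_folder, folder_col="Folder", name_col="File Name"):
    """Return file names whose folder is discard_folder or a subfolder of it.

    Column names are hypothetical; paths are compared case-insensitively.
    """
    prefix = discard_folder.rstrip("\\/").lower()
    names = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        folder = row[folder_col].rstrip("\\/").lower()
        if folder == prefix or folder.startswith(prefix + "\\"):
            names.append(row[name_col])
    return names
```

Writing `"\n".join(names)` to a text file gives one name per line; whether that is the layout the "Mark" facility expects is an assumption to check against the plug-in's documentation. Note also that if both copies of a duplicate share the same file name, a name-only list cannot tell them apart.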
 
I was wondering if the following workflow may help (or a cleverer version..)
  1. Copy all files to a back up location....
  2. Create a number of smart collections to capture all your edited work, e.g. 'develop, has adjustments'. If you have keyworded some files, also add a smart collection for 'keywords aren't empty.' The goal of this is to export all 'worked on' photos.
  3. Export these photos as catalog to a new master photo location you will be using from now on
  4. Find all duplicates in all locations using recommended duplicate finder
  5. Where one of a pair of duplicates exists in the 'new master photo location' folder, delete file not in new location (i.e. delete file not 'worked on')
  6. Where neither of the duplicates are in the new location (i.e. neither file has been 'worked on') delete based on your preferred folder to keep.
  7. That will leave photos where the duplicates have both been edited, hopefully that will be a smaller number to manage.
  8. Import remaining files to catalogue created in step 3.
I am sure the gurus on the forum will point out if this is in any way workable. But there must be a better way than the painstaking method you have at the moment.
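
To make steps 5 to 7 concrete, here is a minimal Python sketch of the decision for one pair of duplicate paths. It only returns which path is a candidate for removal and deletes nothing; the function names and the idea of a single "preferred" folder root are assumptions added for illustration.

```python
from pathlib import PureWindowsPath


def is_under(path, root):
    """True if path lies inside root (Windows paths, case-insensitive)."""
    try:
        PureWindowsPath(path).relative_to(PureWindowsPath(root))
        return True
    except ValueError:
        return False


def choose_deletions(pair, master_root, preferred_root):
    """Return the paths in a duplicate pair that are candidates for removal.

    An empty list means the pair needs a manual decision.
    """
    in_master = [is_under(p, master_root) for p in pair]
    if in_master[0] != in_master[1]:
        # Step 5: exactly one copy was 'worked on'; drop the other one.
        return [p for p, worked_on in zip(pair, in_master) if not worked_on]
    if not any(in_master):
        # Step 6: neither copy was 'worked on'; keep the preferred folder's copy.
        keep = [p for p in pair if is_under(p, preferred_root)]
        if len(keep) == 1:
            return [p for p in pair if p not in keep]
    # Step 7, or an ambiguous step 6: leave both for manual review.
    return []
```

Pairs that come back empty are the ones where both copies were edited, or where the preferred folder does not settle it.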
 
The problem with this approach is that you ignore metadata such as keywords. You'd lose all of them for all the 'not worked on' images. So if you do decide to take this route, and you did keyword all your images, then select all images first and use 'Save Metadata to Files' to preserve your keywords before starting this.
 
Hi John,
I did suggest also adding a smart collection for 'keywords aren't empty.' This would also move the keyworded files to the new catalogue.
Or perhaps I have misunderstood your comment?
 

OK, so you did, but keywords were just an example. You would have to make smart collections for any possible added metadata: flags, color labels, captions, titles, etc. The other problem is that if you added at least one keyword to every image (as you should if you use Lightroom properly), this solution would fail because every set of duplicates would still end up in your new catalog...
 

John, I did say these 2 collections were examples...

Regarding the other problem, this may or may not help the op depending on how many pictures have been untouched. I am going to put my neck out and guess that at least some of the duplicates are untouched, in that case this method may help.
 
Yes, if images are indeed completely untouched, this method might help. I'm not disputing that. I'm just giving anyone who wants to use it a warning to make really sure that an image is completely untouched. If you make a small mistake in your smart collection setup or forget something, you risk losing it and that may do more harm than good. Saving metadata to xmp files before doing anything else is a good way to minimize that risk. That's all I wanted to say.
 