One of the many things the developers of Thumbsplus got right was a properly normalized database schema. When I first inspected the layout of a Thumbsplus database, I knew I was in good hands. In Thumbsplus, image files get unique keys and image galleries are simply lists of image keys. Images can appear in any number of galleries, without duplication, just the way the gods of database design intended.
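To make the keys-and-lists idea concrete, here is a minimal sketch in Python. It is not how Thumbsplus actually stores anything; the keys, file names, gallery names and helper functions are all made up for illustration.

```python
# Hypothetical data: every image is stored once, under a unique key.
images = {
    101: "sunset.jpg",
    102: "harbor.jpg",
    103: "portrait.jpg",
}

# A gallery is nothing more than a list of image keys.
galleries = {
    "Favorites": [101, 103],
    "Travel": [101, 102],
}


def galleries_containing(image_key):
    """Return the sorted names of all galleries that list the given key."""
    return sorted(name for name, keys in galleries.items() if image_key in keys)


def gallery_files(name):
    """Resolve a gallery's key list to file names, in gallery order."""
    return [images[key] for key in galleries[name]]


if __name__ == "__main__":
    print(galleries_containing(101))
    print(gallery_files("Travel"))
```

Image 101 shows up in both galleries, yet there is still exactly one entry for it in `images`. Adding it to a third gallery means appending one more key to a list, not copying a file.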
Assigning unique keys and grouping by key lists is so correct that it was a shock to discover that SmugMug, until recently, eschewed this principle. Prior to a recent upgrade, if you wanted to display an image in more than one gallery you had to … shudder with horror … make copies! Whenever I made an image copy, I felt like I was masturbating in an art museum.
This outrage is now fixed, and you can place an image in as many galleries as you want without copying. Unfortunately, there is a residual problem: how do you hunt down and exterminate all your bogus copies? In an acronym: MD5. SmugMug assigns an MD5 hash to every image. If two images have the same MD5, there is an extremely high likelihood that you are dealing with copies. So all you have to do is find images with identical MD5s and delete the extra copies. The following J verb uses image tables created from the XML captured by my SmugMug metadata dumper to do just this.