Finding Duplicate MP3’s in your Mac OSX iTunes Library

I recently ran into a problem in my ongoing process of switching to Mac OSX from Windows. I decided that I wanted to jump fully into using iTunes as my central music repository. In my manually organized MP3 folders, I had MP3’s in folders by Artist then Album along with a playlist (M3U) file for each Album folder. This is a carry-over from using WinAmp in Windows. To play a whole album in correct order, I found it easy to just have an M3U file in each directory. Anyway, when importing my whole collection into iTunes under Mac OSX Leopard, it created duplicates of all of the files that had an associated playlist/M3U file. Ugh.

Correcting this issue is either a manual process (with some help from the View/Show Duplicates option in iTunes), or something that you can get shareware programs to deal with. The manual process wasn’t an option for me, as I have quite a number of MP3’s, and now quite a number of duplicates. I’m not sure why Apple hasn’t created a little better routine to deal with this issue, or an option to ignore M3U files on importing. iTunes is free, and for this reason, I disagree with the idea of paying for the solution. I also had an issue with the amount of money some of the shareware authors were asking.

I found a few freebie solutions to the problem, but didn’t like the way they worked. With all credit to Adam Kalsey (who incidentally looks a lot like Adam Carolla in the picture on his blog), I was able to use the following method to fix the problem.


In a post on his blog, Adam Kalsey suggested using a Unix utility called Duff (DUplicate File Finder) for detecting the duplicates. Unfortunately, being a slight-newbie in Mac land, I didn’t have my system configured for compiling software, and also didn’t know how to make Mac OSX cooperate 100%.

The first thing to do is prepare your Mac to compile files that are distributed in source form. To do this, insert your original Mac OS install disc (in my case, Leopard 10.5). You will see the following screen:
[Screenshot: OSX Install Screen]
Double-click on the Optional Installs folder. You will get the following screen:
[Screenshot: OSX Xcode Tools Install]
Double-click on the Xcode Tools folder. Then you’ll get the following screen:
[Screenshot: OSX Xcode Tools Screen]
Double-click on XcodeTools.mpkg and follow through the installer to install the Mac development tools needed for compiling from source code.
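
Once the installer finishes, it may be worth confirming that the build tools actually ended up on your PATH before going any further. The following is just a minimal sketch of such a check. The function name check_build_tools is made up for this post, and the idea that gcc and make are the tools to look for is my assumption about what a typical source build needs.

```shell
# check_build_tools TOOL...
# Report whether each named command can be found on PATH.
# Returns 0 only if every tool is present. (Hypothetical helper name.)
check_build_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}
```

In the Terminal you would run something like check_build_tools gcc make. If anything comes back as missing, rerun the Xcode Tools installer before trying to compile.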

Once this was done, I tried compiling Duff, but ran into some issues. Basically, the ‘make install’ step was failing with:

/bin/sh: /Users/tgeorge/Desktop/duff-0.4/install-sh: Permission denied
make[1]: *** [install-binPROGRAMS] Error 126
make: *** [install-am] Error 2

Here’s the method I used to get Duff working. First, download the latest duff-0.x.tar.gz from the Duff Site to the Desktop on your Mac. Go to the Terminal (hint, Applications/Utilities/Terminal if you don’t know where it is).

Type the following:

cd ~/Desktop
tar -xzvf duff-0.4.tar.gz
(obviously replace with the proper version/filename if it’s different)
cd duff-0.4
./configure
(you shouldn’t see any errors from the output)
make
chmod 755 install-sh
(this is the key to avoiding the dreaded Error 126 from above)
make install
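
For the curious: exit status 126 is the shell's way of saying it found a command but was not allowed to execute it, which is why the chmod step matters. Here is a small sketch of checking the execute bit and fixing it only when needed. The helper name ensure_executable is something I made up for illustration; it is not part of Duff or its build.

```shell
# ensure_executable FILE
# If FILE exists but lacks the execute bit, add it with the same
# mode (755) used in the steps above. (Hypothetical helper name.)
ensure_executable() {
    if [ ! -e "$1" ]; then
        echo "not found: $1"
        return 1
    fi
    if [ -x "$1" ]; then
        echo "already executable: $1"
    else
        chmod 755 "$1"
        echo "made executable: $1"
    fi
}
```

Run from inside the duff source folder, ensure_executable install-sh should do the same job as the chmod line, and it will tell you if the file is not where you expect it to be.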

Now Duff should be working properly. To check, try the following:

duff -h

You should get the Duff help and command line switches. If so, awesome, you can continue using Adam’s method, which I will copy below (with some modifications to make it work with Mac).

duff -r /Volumes/music/ > duplicatemusic.txt

This will take quite some time depending on how many songs you have. Mine took almost 30 minutes, during which there is no screen output because the results are being written to a file instead.

Also, before running the following command, which will DELETE the MP3 files from your disk, I would strongly recommend opening the duplicatemusic.txt file in your favorite text editor and MAKE SURE all of the files listed are OK to delete. We can be pretty sure they are duplicates at this point, but we don’t know if you may have them duplicated on purpose, or whatever. Also, keep an eye out for files that may have been originally named ending in ‘1.mp3′ that you may wish to keep (like anything part 1.mp3, or 2001.mp3, stuff like that), as that is what the following command looks for in order to determine what files to delete.
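
If you would rather see exactly which lines a pattern picks up without deleting anything, a dry-run listing is one way to do it. This is only a sketch: the function name list_delete_candidates is hypothetical, and it assumes the list file has one path per line.

```shell
# list_delete_candidates LIST [N]
# Print the lines of LIST that end in "N.mp3" (N defaults to 1).
# Nothing is deleted; this only shows what a delete step would match.
list_delete_candidates() {
    list=$1
    n=${2:-1}
    # \. matches a literal dot and $ anchors the match to the end of the line
    grep -- "${n}\.mp3\$" "$list"
}
```

Running list_delete_candidates duplicatemusic.txt prints the candidates, and adding | wc -l on the end gives you a count. Note that a file such as 2001.mp3 still shows up here, which is exactly the kind of thing you want to catch by eye.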

You can also do a cat duplicatemusic.txt | grep '1 1.mp3'. This will give you some hints about where you might run into issues during the deletion process.

cat duplicatemusic.txt | grep '1\.mp3$' | tr '\012' '\000' | xargs -0 rm
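
If deleting outright makes you nervous, one alternative (my own variation, not part of Adam's method) is to move the matching files into a holding folder first and only empty that folder once you are happy. A rough sketch, with a made-up function name and the assumption that the list has one path per line:

```shell
# quarantine_candidates LIST DEST
# Move every file named in LIST that ends in "1.mp3" under DEST, keeping
# its original directory layout so that same-named tracks from different
# albums do not overwrite each other. (Hypothetical helper name.)
quarantine_candidates() {
    list=$1
    dest=$2
    grep -- '1\.mp3$' "$list" | while IFS= read -r f; do
        # Skip entries that no longer exist (already moved or deleted)
        [ -f "$f" ] || continue
        mkdir -p "$dest/$(dirname "$f")"
        mv -- "$f" "$dest/$f"
    done
}
```

For example, quarantine_candidates duplicatemusic.txt ~/Desktop/dupes-holding would leave your library free of the matched copies while keeping them recoverable. Just make sure the holding folder is on a disk with enough free space.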

You also might want to run the commands again looking for 2.mp3, 3.mp3, and up (in case there are multiple duplicates out there). There ya go, a free way to remove those nasty duplicates from your iTunes library.
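
Rather than rerunning the command once per number and once per file type, a single pattern can cover them in one pass. This is a sketch under an assumption: that the duplicates are named with a space and a number before the extension (like "Song 1.mp3" or "Song 2.m4a"). The function name list_numbered_copies is hypothetical, and the extension list is borrowed from the formats mentioned in the comments below.

```shell
# list_numbered_copies LIST
# Print lines of LIST whose name ends in " <number>.<ext>" for a few
# audio extensions, ignoring case.
# Assumption: duplicates are named "<title> <digits>.<ext>".
list_numbered_copies() {
    grep -E -i -- ' [0-9]+\.(mp3|m4a|m4b|wav)$' "$1"
}
```

Requiring the space means something like 2001.mp3 is left alone, but a track legitimately named "Part 1.mp3" would still match, so review the output before piping it through the same tr and xargs steps used above.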

A quick update: the above procedure doesn’t clean the iTunes database itself, so iTunes will end up with broken entries that point at the deleted files. To clean these up, check out this post on Paul Mayne’s blog.

12 Responses to “Finding Duplicate MP3’s in your Mac OSX iTunes Library” »»

  1. Comment by Buck | 04/04/08 at 8:42 am

    Very grateful for this write up! Dupe removal seems to require something new each time depending on what Apple’s updates have done to break third-party apps, but this one sounds reliable for the future. Thanks!

    As a side note, anyone implementing this should probably also run these for the alternate formats:

    cat duplicatemusic.txt | grep 1.m4a | tr '\012' '\000' | xargs -0 rm
    cat duplicatemusic.txt | grep 1.m4b | tr '\012' '\000' | xargs -0 rm
    cat duplicatemusic.txt | grep 1.wav | tr '\012' '\000' | xargs -0 rm

    And of course if anyone’s stubborn enough to be using the OGG plugin, well you’d have figured it out by now…

  2. Comment by Joe | 05/09/08 at 6:27 pm

    Thanks for the tutorial.
    I was killing myself! I changed install-sh to install.sh before i chmod’ded. (don’t know why I did that), but searching for:
    make[1]: *** [install-binPROGRAMS] Error 127
    make: *** [install-am] Error 2
    brought me here. Thanks!

  3. Comment by fredooo | 06/03/08 at 4:57 am

    nice tutorial, thx

  4. Comment by CoreDownload | 07/07/08 at 4:06 am

    I am a new MaxOS user, and I think MAC rules. Thanks for this tip, I will use it as soon as I register for iTunes.

  5. Comment by Matthias | 07/12/08 at 6:03 pm

    Dear Todd,

    thanks a lot for your explanation of Adam’s method to identify the duplicate files. I am not well versed in using the Terminal or the like, so it took me a while to get things to work (and for some reason even your process still created the same error for me; I had to go and change the access rights to the usr/local/ folders concerned, which worked fine also).

    My current problem is that I do not quite understand the command line for deleting the entries.

    For me the cause of the duplicate files on HDs is the same as for many others: I transferred my iTunes music to an external HD. Somewhere along the way my library got all confused, however, (or I may have accidentally deleted the right version of it) and so it thought that many of my files were still sitting on the laptop (which they were also, I had not removed many of them yet – they are, however, not in the standard folder) – so I consolidated the library, got all the wrongly located files copied over again – and had a nice set of duplicates on my hands, taking up many GBs of space. My problem with the procedure Adam and you describe is that I am now in the position that I actually want to keep the 1.mp3 dupes and get rid of the others (“original copies”) on the external HD. How do I amend the command line accordingly – could you help?

    Would be immensely grateful if you could give me a suggestion on how to proceed.

    Thanks a lot in advance,
    Matthias

    P.S.: And on an aside: how do I use for duplicate files found in different copies of iPhoto libraries, also now in a duff created text file? I like the list, but doing things manually will be a tiring process…. (I guess one could use a folder syncing tool for this). I might check out the duff manual before I bother anyone with this separate question…

  6. Comment by Kristian | 01/06/09 at 5:28 am

    Ouch that seems complicated. Is there no native duplicate file finder for mac?

  7. Comment by Joshua | 02/26/09 at 10:08 pm

    Thanks for pointing out this command-line tool. Just an FYI, duff is in MacPorts. (http://www.macports.org/) After you install MacPorts you can just do a ‘port install duff’ and you’re good to go.

  8. Comment by Pete | 03/07/09 at 7:36 pm

    Thank you for this, very educational!

  9. Comment by Bdwyne | 03/22/09 at 11:45 am

    It’s still a little confusing but I figured it out thanks to this post. Was getting tired of hearing the same songs so often.

    Thanks!

  10. Comment by Vishal | 05/21/09 at 9:31 pm

    Very useful, thanks.

    As an additional note, a couple of additional grep’s that were useful for my library:

    grep 1.MP3 (or alternatively, grep -i 1.mp3)
    grep -i ' .mp3'
    grep -i '1.mp2'

    To account for dupes that itunes creates when copying files with ‘-’, ‘(’:
    grep -i '\s*-\s*.*.mp3'
    grep -i '(.*.mp3'

  11. Comment by Leader | 05/22/09 at 6:43 am

    Thanks for good explanation how to detect duplicates.it is very timely.

Trackbacks/Pingbacks »»

  1. [...] After trying to convince various AppleScripts to make iTunes clean itself up, I stumbled across these instructions on the blog of Todd George on how to find and remove byte-for-byte duplicates from iTunes. It saved my sanity. Note that this simply removes the files from the filesystem, and not the entries from the iTunes library itself. Thankfully, Todd provides a link to a great method of finding the now dead entries in your library and removing them WITHOUT any additional scripts or programs. [...]

