New Challenge: ID3v2.4 autotagging w/ cover art in Python
I’ve been ripping my CDs into directories for years, mostly keeping an Artist / Album structure. This has worked for almost a decade, as I’d invariably use a folder-based browser to choose music to listen to. I’ve recently migrated to using Apple products, including iTunes, and am missing out on a good portion of my archived music collection. I could go through the albums by hand, but would prefer a scalable solution.
Existing tools:
Mutagen: http://code.google.com/p/quodlibet/wiki/Mutagen
Reads and writes ID3v2 tags, including ID3v2.4. Well-documented, clean code. I have not seen any code samples for writing the frame that holds the image (APIC, the "attached picture" frame), but this seems to be the best library available. If it doesn’t support it out of the box, it’d provide a great starting point for the feature.
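To get a feel for what writing that image frame involves, here is a rough sketch of the raw APIC frame layout as I read the ID3v2.4 spec: a 10-byte frame header (frame ID, syncsafe size, flags) followed by text encoding, MIME type, picture type, description and the image bytes. The function names `syncsafe` and `build_apic_frame` are my own, not part of Mutagen, and this only builds the frame bytes; it does not write a complete tag into a file.

```python
def syncsafe(n):
    """Encode a size as a 4-byte syncsafe integer (7 usable bits per byte)."""
    if not 0 <= n < 2 ** 28:
        raise ValueError("syncsafe integers hold at most 28 bits")
    return bytes([(n >> 21) & 0x7F, (n >> 14) & 0x7F, (n >> 7) & 0x7F, n & 0x7F])


def build_apic_frame(image_data, mime="image/jpeg", description="", picture_type=3):
    """Build the bytes of one ID3v2.4 APIC frame.

    picture_type 3 means "front cover" in the spec's picture type list.
    """
    body = (
        b"\x03"                              # text encoding 3 = UTF-8
        + mime.encode("latin-1") + b"\x00"   # MIME type, null-terminated
        + bytes([picture_type])              # picture type
        + description.encode("utf-8") + b"\x00"  # description, null-terminated
        + image_data                         # the raw image file contents
    )
    # Frame header: 4-byte ID, 4-byte syncsafe body size, 2 flag bytes (none set)
    header = b"APIC" + syncsafe(len(body)) + b"\x00\x00"
    return header + body


if __name__ == "__main__":
    # Placeholder bytes standing in for a real JPEG file
    frame = build_apic_frame(b"\xff\xd8\xff", description="Cover")
    print(len(frame), frame[:10])
```

In practice I would much rather let the tagging library do this; the sketch is mainly useful for checking what a library writes against the spec.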
FreeDB: http://www.freedb.org/
FreeDB is the public-domain offshoot of CDDB, a database of CD-media information (metadata). People add CDs into the system by scanning the CD and entering the Artist/Album/Tracks/etc. by hand. CDDB then associates the start and end times of the individual tracks with the information provided by the user, so the next person who scans the same CD gets that information for free. Although CDDB was bought by Gracenote, the user-contributed information continues to be available through FreeDB.
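The track-time matching boils down to a disc ID computed from the track start offsets. Below is a minimal sketch of the classic CDDB disc ID calculation as I understand it: offsets are in CD frames (75 per second), the checksum is the sum of the decimal digits of each track's start time in seconds, and the ID packs the checksum, the disc length in seconds and the track count into 32 bits. The function names are mine, and the offsets in the example are made up.

```python
FRAMES_PER_SECOND = 75  # CD audio is addressed in frames of 1/75 second


def digit_sum(n):
    """Sum of the decimal digits of n."""
    total = 0
    while n > 0:
        total += n % 10
        n //= 10
    return total


def cddb_disc_id(track_offsets, leadout_offset):
    """Return the 8-hex-digit disc ID for the given frame offsets.

    track_offsets: start of each track in frames, in track order.
    leadout_offset: frame offset of the lead-out (end of the last track).
    """
    checksum = sum(digit_sum(off // FRAMES_PER_SECOND) for off in track_offsets)
    total_seconds = (leadout_offset // FRAMES_PER_SECOND
                     - track_offsets[0] // FRAMES_PER_SECOND)
    disc_id = ((checksum % 0xFF) << 24) | (total_seconds << 8) | len(track_offsets)
    return "%08x" % disc_id


if __name__ == "__main__":
    # Made-up three-track disc: tracks start at 2 s, 200 s and 400 s
    print(cddb_disc_id([150, 15000, 30000], 45000))
```

The ID is not unique, so a lookup can return several candidate albums that still need to be narrowed down, for example by comparing track counts and titles against the directory contents.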
Pymad and MAD: http://spacepants.org/src/pymad/ and http://www.underbit.com/products/mad/
MPEG Audio Decoder and its Python interface. Although I’m not terribly interested in playing MP3 files at the moment, this lets me measure the start and end times of each track. Since I ripped the tracks in full, I can use those times to retrieve the album metadata from FreeDB.
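Once each track's length is known, the start and end times follow by adding the lengths up. A small sketch, assuming durations in milliseconds (however they were measured), 75 CD frames per second, and a first track that starts after the usual 2-second (150-frame) lead-in. The name `track_offsets` is my own, and MP3 durations may differ slightly from the original CD, so the result is an approximation.

```python
FRAMES_PER_SECOND = 75   # CD frames per second
LEAD_IN_FRAMES = 150     # assumed 2-second gap before the first track


def track_offsets(durations_ms):
    """Turn per-track durations (ms) into CD-style frame offsets.

    Returns (offsets, leadout): the start frame of every track, and the
    frame at which the last track ends.
    """
    offsets = []
    position = LEAD_IN_FRAMES
    for duration in durations_ms:
        offsets.append(position)
        # Convert milliseconds to frames, rounding to the nearest frame
        position += round(duration * FRAMES_PER_SECOND / 1000)
    return offsets, position


if __name__ == "__main__":
    # Two hypothetical tracks of 3:18 and 3:20
    print(track_offsets([198000, 200000]))
```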
My own directory parsing code: http://github.com/philippp/albumparser/tree/master
Parses a directory tree for groups of audio files and tries to infer Artist – Album information from the directory names.
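The inference step can be illustrated with a short sketch. This is not the code from the repository above, just a hypothetical helper (`infer_artist_album`) that assumes two common layouts: an "Artist/Album" pair of directories, or a single "Artist - Album" directory.

```python
from pathlib import PurePosixPath


def infer_artist_album(album_dir):
    """Guess (artist, album) from a directory path, or None if unclear."""
    parts = PurePosixPath(album_dir).parts
    if not parts:
        return None
    leaf = parts[-1]
    # Layout 1: a single directory named "Artist - Album"
    if " - " in leaf:
        artist, album = leaf.split(" - ", 1)
        return artist.strip(), album.strip()
    # Layout 2: parent directory is the artist, leaf directory is the album
    if len(parts) >= 2:
        return parts[-2], leaf
    return None


if __name__ == "__main__":
    print(infer_artist_album("music/Some Artist/Some Album"))
    print(infer_artist_album("music/Some Artist - Some Album"))
```

Anything that matches neither layout comes back as None, which makes it easy to collect the leftovers for tagging by hand.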
Strategy:
Use my directory parser and Mutagen to determine which albums need tagging
Use pymad and FreeDB to retrieve necessary metadata
Use Amazon to retrieve cover art where available
Use Mutagen to write new metadata
