
Drawbacks of TRM identifiers


Related link: http://www.gonze.com

MusicBrainz uses audio fingerprints as primary keys to link metadata from different rips of the same song. To do this it uses a toolkit called TRM from a company called Relatable. TRM ids have severe drawbacks.

One, TRM generates way too many false negatives. In my testing I found that it was barely better at finding duplicate files than a byte hash like sha1. Two, it is closed source and probably encumbered by patents, so it can't become an open standard for audio fingerprints, and it can't be tweaked to support the needs of third party applications.
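To make the byte-hash baseline concrete, here is a minimal sketch of duplicate detection with SHA-1, assuming nothing beyond Python's standard library. The file names and byte strings are made-up stand-ins for real rips, and this is not the test harness from my own experiments. It shows why a byte hash sets such a low bar: a single differing byte, such as a changed tag, is enough to make two copies of the same song look unrelated.

```python
import hashlib
from collections import defaultdict


def sha1_of_bytes(data: bytes) -> str:
    """Return the hex SHA-1 digest of a file's raw bytes."""
    return hashlib.sha1(data).hexdigest()


def group_by_key(items, key_func):
    """Group names by the key computed from their contents."""
    groups = defaultdict(list)
    for name, data in items.items():
        groups[key_func(data)].append(name)
    return dict(groups)


# Hypothetical stand-ins for file contents: the "audio" payload is the
# same in all three, but rip_b carries a different tag at the front.
audio = b"\x00\x01\x02\x03" * 4
files = {
    "rip_a.mp3": b"tag-v1" + audio,
    "rip_a_copy.mp3": b"tag-v1" + audio,  # byte-for-byte copy of rip_a
    "rip_b.mp3": b"tag-v2" + audio,       # same song, different bytes
}

groups = group_by_key(files, sha1_of_bytes)

# Only byte-identical files land in the same group.
duplicate_sets = sorted(
    sorted(names) for names in groups.values() if len(names) > 1
)
print(duplicate_sets)
```

The byte hash pairs up the exact copy and misses `rip_b.mp3` entirely, which is a false negative. A fingerprint used as a primary key is supposed to do much better than this across different rips and encodings.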

It's extremely clueful for MusicBrainz to use audio fingerprints for primary keys, a genuine innovation even. But MB is a metadata project, not a fingerprinter, and they couldn't use an open fingerprinter because one doesn't exist.

Details on my weblog.

Read More Entries by Lucas Gonze.

7 Comments

dscotson said:

how to make friends and influence people
I agree fully that open/free is better for many reasons, and remember that my initial mistaken reaction was to a brief intro. I just think you may be factually incorrect in your assessment of TRM, and your test seemed rather lax, which undermines the parts of your argument I agree with.

That 94.4% figure you quote, for example, I could be wrong but I think higher is better, since the other 5.6% give false positives, i.e. identify more than one unique track.

The figure that perhaps you meant to quote is the 24.9% of tracks that have more than one trm identifier which means a newly generated trm may not match one held in the DB even if they are the exact same song.

But obviously some margin of error is necessary given different ripping techniques and encoding schemes combined with plain user error and dirty CDs making the same song 'different', so whether this is particularly high or low I can't tell.

lucas_gonze said:

how to make friends and influence people
Well, whether a bleeped version is the same as the original is an application-level decision. Sometimes you want less exact matching. And that's pretty much my point -- applications are not able to make that decision for themselves because there's no open solution.

dscotson said:

how to make friends and influence people
I was referring more to concentrating on 'false negatives' of files when you didn't *know* they were the same song (thinking in particular of bleeped profanities on single versions of rap songs) and no mention of the balance between false positives and false negatives.

lucas_gonze said:

how to make friends and influence people
You should repeat my tests, then. This would be a good thing anyway, because we need to know more details about when TRM identifiers do work.

*However*, MusicBrainz' own database stats confirm my results. See http://www.musicbrainz.org/stats.html, and check out the number of TRM IDs which identify exactly one track -- 94.4%!

dscotson said:

how to make friends and influence people
Sorry, my mistake, I didn't realise this was a summary as the link was to the domain rather than a permalink to an individual article.

The piece as a whole sounds much more reasonable (though I have doubts about your testing methodology, but I assume it is merely a rhetorical device).

lucas_gonze said:

read full entry at http://gonze.com
I'm not sure if you've read the whole (much longer) piece at http://gonze.com, because I do go into the kind of detail you're talking about. The format of these O'Reilly weblogs doesn't really emphasize that link -- I'll edit to do that.

The purpose is to start a conversation about open standards for audio fingerprints used as primary keys. It's extremely clueful for MusicBrainz to use audio fingerprints for primary keys, a genuine innovation even. But MB is a metadata project, not a fingerprinter, and they couldn't use an open fingerprint because one doesn't exist.

dscotson said:

how to make friends and influence people
Are you trying to start a conversation or a fight?

I think you would achieve whatever goals you have if your criticism was a bit more constructive (or even informative).

Are there free alternatives?

You claim it is not a great tech, is it really better than nothing?

What do you want me, i.e. the general O'Reilly reading public, to do about it?
