jbailey – Project Bar-B-Q

2018 Workgroup Topic Proposals

MID.LY (Music Industry Data)

Posted on October 4, 2018

The MIDI standard went from bright idea, to being embraced by Roland, Yamaha, Korg, Kawai, and Sequential Circuits, to shipping in product in a span of only 18 months. Notwithstanding the currently perceived limitations of MIDI, that’s a remarkable time frame, especially considering how enduring the standard has been.

My proposal is that we come together again as an industry to create a new standard. But this time around, we create a standard for data.

Advances in machine learning are allowing breakthrough new approaches to solving previously hard- or impossible-to-solve audio problems. We can identify and label objects within an audio stream. We can unmix a mixture. We can blindly segment a recording based on musical or other events. We can interpolate between acoustic and timbral spaces.

Where machine learning differs from the traditional science in music technology is that we need – in addition to tech, know-how and creativity – large amounts of well-labelled and structured training data.

Those of us pursuing this work are having to build these data sets ourselves, likely duplicating one another’s effort to create proprietary sets. The academic community provides some data sets, but not enough.

In the biotech world, The Broad Institute was founded to support exactly this kind of collaborative partnership. Companies pool their data, collectively gain access, do work … and have been able to create major breakthroughs in genomics and biomedical research. Some companies in that space are going public without a business model or even a product … simply because of the tech they have created via access to this communal data set.

So, what if we joined forces in our corner of the world?