The Twelfth Annual Interactive Music Conference
PROJECT BAR-B-Q 2007
Group Report: Call for a Highly Distributed Metadata
Participants:
Ron Kuper, Sonos
Stefan Scheffler, Steinberg
Scott Snyder, Edge of Reality
Chris Grigg, Control G
Larry the O, Toys in the Attic
Barry Threw

Problem
The enormous number of assets created in digital media production begs for a comprehensive, interoperable, and extensible metadata solution. From creation to consumption, information attached to and derived from these assets is useful for management, search, cross-referencing, rights management, and processing. However, the amount of data being generated on the web, in production pipelines, and in personal libraries is growing at such a rate that accessing this information efficiently and accurately seems an almost insurmountable problem. The irony is that even while this pool of data must reach a critical mass to be valuable, the utility of this vast amount of information is limited by the speed and ease with which it can be filtered, recovered, updated, understood, and abstracted. Proprietary formats and ontologies have been created by virtually every company dealing with digital media production to fill this need, but with standardization these efforts could be leveraged to create a protocol for communication between every stage of the media production pipeline, from recording and production all the way through to delivery and music enjoyment. To realize this goal, we have examined some of these existing schemas, identified their shortcomings, and propose a solution based on highly distributed metadata.

Existing Formats
There are several existing formats for tagging audio, such as ID3, CDDB, MusicBrainz, and ASF, but they all suffer from one or more of the following deficiencies.
Because of these deficiencies, none of these existing standards is sufficient for our purpose of standardized metadata communication. Each of these existing solutions is tied to a single stage of the digital music life cycle, and thus does not take into account information from other producers that could be relevant if combined. Taking stock of the stages of content creation and consumption, and delineating the relevant data for each stage, is necessary for the formulation of a comprehensive data standard. The stages that digital audio assets go through include pre-production, live performance or computer-aided composition, recording, editing, mixing, mastering, distribution, and consumption, and each of these stages requires its own specialized set of metadata to describe the information relevant to its function.
The existing metadata standards catalog some of this information for their respective theaters of commerce. However, on closer examination the standards have surprisingly little overlap. A comparison of the metadata standards J2ME MMAPI, 3GP, SMIL 1 and 2, ID3v2, SMF, and XM finds that only the “title” field is common to all of them. Clearly, a system for coalescing this data, and for retrieving fields that a given format does not carry, is necessary for rich global data solutions. These existing standards also do not take into account computer-generated and community-generated data fields. There have also been several prior attempts at creating global identifiers for data files, such as SSRID, ISRC, and an RIAA-proposed project, but these all seem either not to have overcome inertia or to have shortcomings compared with our proposed GUID system. A few efforts, such as MPEG-7, aim to rectify a similar set of problems, and they should be considered both for interoperability and as prior work.

Solution
We propose a system of metadata distributed in a syndicated way around the web. This system, called Highly Distributed Metadata (HDM), consists of three primary components.
This system is based on a unique identifying key (GUID) and a URL accompanying every digital asset. The key is used to refer to the file in all subsequent situations where metadata is required. Given this key, you can navigate to a URL or a local database and retrieve all metadata in an RSS-like way. Additionally, other URLs could be specified to retrieve additional metadata. In this way all data is duplicated and distributed, and so can be accessed from any compliant software, website, or mobile device.
A suggested format for the URL identifier is:
http://<BaseDomain>/hdm/<GUID>/<FieldID>?<OptionalFormat>
Where:
BaseDomain = Second-level domain name (DNS) of the server, e.g. lotsametadata.org
GUID = String encoding the content ID GUID as hex characters
FieldID = The metadata field to return. More research into options and requirements for encoding these fields is necessary before making a specific recommendation.
OptionalFormat = The format in which you want the contents returned: ASCII, UTF-8, UTF-16, binary, etc.
Examples:
The returned information would have to include at least a result code, a format indicator, the data, and a signature to determine the authority of the provider. One of the major obstacles to overcome is creating a system of GUIDs that can be retrofitted into and cross-referenced from existing systems.

Benefits
The benefits of standardizing a metadata format are myriad: standardization benefits existing businesses and opens new avenues of commerce to entities inclined to aggregate, sort, and interpret the critical mass of information that will be accessible via this new format. Some of the first projected use cases are:
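Any of these use cases would start from the URL scheme suggested under Solution, so it may help to see how a client could assemble and take apart such a URL. The following is only a minimal sketch under stated assumptions: the report does not fix the GUID length or the FieldID encoding, so the sketch assumes a 128-bit GUID written as 32 hex characters and a plain-text field name such as "title". The names `HdmRef`, `build_hdm_url`, and `parse_hdm_url` are hypothetical and not part of any proposal.

```python
import uuid
from typing import NamedTuple, Optional
from urllib.parse import quote, unquote, urlsplit


class HdmRef(NamedTuple):
    """Parsed pieces of an HDM-style URL (hypothetical helper type)."""
    base_domain: str
    guid: uuid.UUID
    field_id: str
    fmt: Optional[str]


def build_hdm_url(base_domain: str, guid: uuid.UUID, field_id: str,
                  fmt: Optional[str] = None) -> str:
    """Assemble http://<BaseDomain>/hdm/<GUID>/<FieldID>?<OptionalFormat>."""
    # Assumption: the GUID is 128 bits, rendered as 32 lowercase hex characters.
    # Assumption: the FieldID is a plain string, percent-encoded for safety.
    url = f"http://{base_domain}/hdm/{guid.hex}/{quote(field_id, safe='')}"
    if fmt:
        url += "?" + quote(fmt, safe="")
    return url


def parse_hdm_url(url: str) -> HdmRef:
    """Split an HDM-style URL back into its components."""
    parts = urlsplit(url)
    segments = parts.path.split("/")
    # A well-formed path splits into ['', 'hdm', '<GUID>', '<FieldID>'].
    if len(segments) != 4 or segments[1] != "hdm":
        raise ValueError(f"not an HDM-style URL: {url!r}")
    guid = uuid.UUID(hex=segments[2])  # raises ValueError on malformed hex
    fmt = unquote(parts.query) or None
    return HdmRef(parts.netloc, guid, unquote(segments[3]), fmt)


if __name__ == "__main__":
    asset = uuid.UUID("12345678-1234-5678-1234-567812345678")
    url = build_hdm_url("lotsametadata.org", asset, "title", "UTF-8")
    print(url)
    print(parse_hdm_url(url))
```

The sketch covers only the request side. The response, which the report says must carry a result code, a format indicator, the data, and a provider signature, is left out because its encoding is not specified.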

Action Items
The following action items are necessary first steps toward solving this problem.
Copyright 2000-2014, Fat Labs, Inc., ALL RIGHTS RESERVED |