What are the different types of metadata we can use in production and post production?

I’ve been thinking a lot about metadata – data about the video and audio assets – particularly since we use metadata extensively in our Intelligent Assistance software products and for media items for sale in Open TV Network. And the new “Faces” and “Places” features in iPhoto ’09 show just how useful metadata can be.

Back in the days when tape-based acquisition ruled, there wasn’t much metadata available. If you were lucky there would be an identifying note on or with the tape. For linear editing that was all that was available at the source – the tape. The only other source metadata would be frame rate, frame size and tape format, and perhaps some user bits carried with the timecode. With a linear system that was all you could use anyway.

With non-linear editing we moved media into the digital domain and added more metadata: reel names, clip names, descriptions and so on. And with digital acquisition formats we’re getting more source metadata from the cameras themselves.

But there are more types of metadata than just what the camera provides and what an editor or assistant enters. In fact we think there are four types of metadata: Source, Added, Derived and Inferred. Before I expand on those, though, let me digress a little to talk about “Explicit” and “Implicit” metadata.

These terms have had reasonable currency on the Internet, and there’s a good post on the subject at Udi’s Spot, “Implicit kicks explicit’s *ss.” In that usage, explicit metadata is what people deliberately provide (like pushing a story to the top of Digg), while implicit metadata is based on the tracks we inadvertently leave behind.

Actions that create explicit metadata include:

  • Rating a video on YouTube.
  • Rating a song in your music player.
  • Digging a website on Digg.

Actions that create implicit metadata include:

  • Watching a video on YouTube.
  • Buying a product on Amazon.
  • Skipping past a song in your music player as soon as it gets annoying.

We didn’t think those terms were all that useful for production and post production, so instead we propose the four types noted above.

Source

Source Metadata is stored in the file from the outset by the camera or capture software, such as in EXIF format, and is usually immutable. Examples (a small code sketch follows the list):

  • timecode and timebase
  • date
  • reel number
  • codec
  • file name
  • duration
  • GPS data
  • focal length, aperture, exposure
  • white balance setting
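
To make that concrete, here’s a minimal sketch of reading source metadata from a clip. It assumes the free ffprobe tool (part of FFmpeg) is installed, and clip.mov is just a placeholder name:

```python
import json
import subprocess

def read_source_metadata(path):
    """Ask ffprobe (part of FFmpeg) to dump a clip's metadata as JSON."""
    result = subprocess.run(
        ["ffprobe", "-v", "quiet", "-print_format", "json",
         "-show_format", "-show_streams", path],
        capture_output=True, text=True, check=True)
    info = json.loads(result.stdout)
    # Find the first video stream among the clip's streams.
    video = next(s for s in info["streams"] if s["codec_type"] == "video")
    return {
        "codec": video["codec_name"],                    # e.g. "prores"
        "frame_rate": video["r_frame_rate"],             # e.g. "30000/1001"
        "duration": float(info["format"]["duration"]),   # in seconds
        "file_name": info["format"]["filename"],
    }

print(read_source_metadata("clip.mov"))
```

Everything returned here was written by the camera or capture software; no person had to enter any of it.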

Added

Added Metadata is beyond the scope of the camera or capture software and has to come from a human. It can be added by a person on set (e.g. with Adobe OnLocation) or during the logging process. Examples (a sketch of one way to store it follows the list):

  • keywords
  • comments
  • event name
  • person’s name
  • mark good
  • label
  • auxiliary timecode
  • transcription of speech (when done by a person rather than software)
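
Storage-wise, Added metadata is essentially a human-entered record that travels with the clip. Here’s one possible shape for it, purely as a sketch – the class and field names are hypothetical, not drawn from any particular application:

```python
from dataclasses import dataclass, field

@dataclass
class AddedMetadata:
    """Human-entered metadata attached to a clip (all names hypothetical)."""
    keywords: list[str] = field(default_factory=list)
    comments: str = ""
    event_name: str = ""
    people: list[str] = field(default_factory=list)
    mark_good: bool = False      # the editor flagged this take as good
    label: str = ""              # e.g. a color label from the NLE

# A logger might fill it in like this:
log = AddedMetadata(keywords=["interview", "CEO"],
                    people=["Jane Doe"],
                    event_name="Product launch",
                    mark_good=True)
```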

Derived

Derived Metadata is calculated by software using a non-human external information source. Examples:

  • speech recognition software can produce a transcription
  • a language algorithm can derive keywords from a transcription
  • locations can be derived from GPS data using mapping data (e.g. Eiffel Tower, Paris, France), or even just whether somewhere is in a city or in the country
  • recalculation of duration when video and audio have different timebases (see the sketch after this list)
  • OCR of text within a shot.
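
The timebase item is the easiest to show as a worked example. Here’s the arithmetic, as a toy sketch (the function name is my own):

```python
def frames_at_timebase(frame_count, source_fps, target_fps):
    """Re-express a duration counted in frames at one timebase against another."""
    seconds = frame_count / source_fps
    return round(seconds * target_fps)

# 100 frames of 24 fps film is ~4.17 seconds, which spans ~125 frames
# against an NTSC 29.97 (30000/1001) fps timebase.
print(frames_at_timebase(100, 24, 30000 / 1001))  # -> 125
```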

Derived metadata is in its infancy but I expect to see a lot more over the next few years.

Inferred

Inferred Metadata is metadata that can be assumed from other metadata, without an external information source. It may be used to help obtain Added metadata. Examples:

  • time of day and GPS data can group files that were shot at the same location during a similar time period (if that event is given a name, the name is Added metadata)
  • if the time-of-day timecode for a series of shots falls within one continuous period across different locations, and there is then a big gap before the next shot, it can be assumed that those shots were made together as a series of related events (and if they are named, that becomes Added metadata); see the sketch after this list
  • facial recognition software can recognize that the same person appears in 3 different shots (Inferred), but it needs to be told the person’s name and whether its guesses are correct (Added)
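
As a sketch of that second item – the two-hour gap threshold and the clip names are purely illustrative – grouping shots into inferred events needs nothing more than a timestamp comparison:

```python
from datetime import datetime, timedelta

def group_into_events(clips, max_gap=timedelta(hours=2)):
    """Group clips shot close together in time into inferred 'events'.

    clips: list of (name, datetime) pairs, assumed sorted by time.
    A gap longer than max_gap starts a new group. The grouping itself is
    Inferred; naming each group would turn the names into Added metadata."""
    groups, current = [], []
    last_time = None
    for name, shot_time in clips:
        if last_time is not None and shot_time - last_time > max_gap:
            groups.append(current)
            current = []
        current.append(name)
        last_time = shot_time
    if current:
        groups.append(current)
    return groups

clips = [("A001", datetime(2009, 2, 1, 9, 0)),
         ("A002", datetime(2009, 2, 1, 9, 40)),
         ("A003", datetime(2009, 2, 1, 15, 0))]   # big gap -> new event
print(group_into_events(clips))  # [['A001', 'A002'], ['A003']]
```

Note that nothing beyond the clips’ own timestamps is consulted, which is what separates Inferred metadata from Derived metadata.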

We already use Inferred metadata in some of our software products, and I think we’ll be using more of it in the future.

So that’s what we see as the different types of metadata that are useful for production and post production.