QuickStart Guide to Understand the PBCore Metadata
- What is Metadata, Anyway?
- What is a Metadata Dictionary?
- What is a Metadata Element?
- What is a Qualified or Refined Element?
- What is a Controlled Vocabulary?
- What are Element Attributes?
- What is an Application Profile?
- What is the Dublin Core (DCMI)?
- What is the Public Broadcasting Core (PBMD Project)?
- How do I Review the Metadata Elements of the PBCore?
- More Resources for the Curious and Compulsive
What is Metadata, Anyway?
Metadata is descriptive information
about a resource. The resource may be video or audio, an image or graphic,
a text-based document, or any other informational item whether electronic
or not.
The primary purpose of metadata is sharing...the
ability to describe a resource and allow someone to discover, review,
select, and retrieve an item.
Examples of metadata include the name
of an item; descriptions or abstracts about its content; keywords or
subject classifications; file formats; authors; producers; distributors;
publishers; copyright and usage restrictions; etc.
Metadata needs to be structured in some
way. The descriptions available through metadata cannot be created in
a random or ad hoc manner. In other words, metadata should follow a
well-documented, formalized scheme.
By the way, the "descriptions"
are called "metadata." However, the "thing" being
described is often referred to as the "essence." Essence +
Metadata yields a media asset that has value to various end-user communities.
For additional background information on
why we should be using the same metadata standards, link to these references:
- Mary Jane McKinven's article "The
Case for Shared Metadata Standards" in the May 13,
2002, issue of Current.
- The project background paper and "PBMD Project
Progress Report"
submitted to the 2003 Dublin Core Conference by the Public
Broadcasting Metadata Working Group
What is a Metadata Dictionary?
It's all about definitions. When creating a
systematic method for describing media resources, you have to start
by creating metadata categories. Whether created from scratch or harvested
from other sources, metadata categories must be defined...thus a metadata
dictionary.
In defining a dictionary,...
What is a Metadata Element?
The Periodic Table of Elements is a carefully
structured visualization of the chemical building blocks
of the universe as we know it. Metadata Elements are the descriptive
building blocks used to describe, verbally or visually, the world of resources,
assets, media, or "essence."
End of metaphor (our apologies to chemists
everywhere). The Public Broadcasting Core consists of dozens of Metadata
Elements. Currently we have 58 Elements which can be accessed from the
table on the left side of this web page.
What is a Qualified or Refined Element?
We're talking about drilling down to the descriptive
core of a media essence or resource. If the descriptors attached to
a specific element aren't specific or expressive enough to fully identify
an item, then that element may be further refined or qualified.
The reason to use qualified elements is to
achieve a detailed level of description that best suits a community
of users (such as Public Broadcasting) without going overboard. After
all, at some point in the process, a real, live person must describe
an asset using the elements provided. An overly simplistic set of elements
fails to capture the nature of that resource. An overly compulsive set
of elements may capture every aspect of a resource but be too difficult
to use, too time-consuming to implement, and too confusing for most
people to understand. One needs a set of elements and qualified elements that
is "just right." Currently the PBCore has 58 Elements, many
of which are "Refined Elements" or "Elements with Qualifiers."
Qualified Elements are recognized by a ".extension" (for example,
"Title.Alternative").
In addition to refining an element by creating
"Qualified" Elements, refinement can be achieved by using
a restricted set of descriptors (see What
is a Controlled Vocabulary?).
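The ".extension" convention for Qualified Elements lends itself to simple mechanical handling. Here is a minimal sketch in Python, assuming that the first dot in a name separates the element from its qualifier. The helper parse_element_name and the ElementName type are hypothetical names invented for this illustration; they are not part of the PBCore.

```python
from typing import NamedTuple, Optional


class ElementName(NamedTuple):
    base: str                 # the element itself, e.g. "Title"
    qualifier: Optional[str]  # the refinement, e.g. "Alternative", or None


def parse_element_name(name: str) -> ElementName:
    """Split a name such as "Title.Alternative" at its first dot.

    Assumption for this sketch: a name holds at most one qualifier.
    """
    base, dot, qualifier = name.partition(".")
    return ElementName(base, qualifier if dot else None)


print(parse_element_name("Title.Alternative"))
print(parse_element_name("Title"))
```

An unqualified name such as "Title" comes back with no qualifier, so the same routine can be applied to every element in a list.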
Refinement is also achieved if the grammar
of a descriptor is controlled. A good example of this type of refinement
is the order in which a person's name is displayed, e.g., LastName,
FirstName MiddleName, Credentials (for a very interesting discussion
on the complexities of displaying names, see "Representing
People's Names in Dublin Core").
Another example is the manner in which dates
are represented, whether you order the data by Month/Day/Year, Day/Month/Year
or Year/Month/Day (for a discussion of the representation variables
in displaying dates and times, see the W3C report on "Date
and Time Formats").
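To illustrate what "controlling the grammar" of a descriptor might look like in practice, here is a small sketch in Python. It assumes the name order given above (LastName, FirstName MiddleName, Credentials) and the Year-Month-Day form (YYYY-MM-DD) used in the W3C note. The function names and the sample person are hypothetical and serve only as illustration.

```python
from datetime import date


def format_name(last, first, middle="", credentials=""):
    """Render a name as "LastName, FirstName MiddleName, Credentials"."""
    given = " ".join(part for part in (first, middle) if part)
    pieces = [last, given]
    if credentials:
        pieces.append(credentials)
    return ", ".join(pieces)


def format_date(value: date) -> str:
    """Render a date in Year-Month-Day order as YYYY-MM-DD."""
    return value.isoformat()


print(format_name("Doe", "Jane", "Q.", "Ph.D."))
print(format_date(date(2003, 9, 28)))
```

Routing every name and date through one pair of functions like these keeps descriptions entered by different people in the same form.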
What is a Controlled Vocabulary?
By controlling the vocabulary or the descriptors
associated with a metadata element, we can improve the discovery of
media resources and their retrieval. For some elements, like TITLE,
there are few restrictions on the words and terms that are used. For
many other elements, the possible descriptors may be limited or controlled
in order to ensure consistency. Basically, a controlled vocabulary consists
of a carefully predefined set of values that are permissible. Without
the use of controlled vocabularies, users may enter their search criteria
only to find a limited number of "hits" or, in other situations,
an explosive number of irrelevant results.
Some controlled vocabularies are very short
and simple lists of allowable Terms. In other situations, the number
of authorized terms may be large and more complex. These vocabularies
are referred to as Authority Files and are carefully crafted compilations
of the appropriate manner and form in which to describe something. If
the relationships of meaning that exist between terms need to be expressed,
then a Thesaurus may be employed. A Thesaurus shows all allowable terms
and the relationships between them.
The PBCore uses Controlled Vocabularies wherever
precision is needed and ambiguity is to be avoided.
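A short, simple term list of the kind described above can be enforced with very little code. The sketch below is in Python. The term list ALLOWED_MEDIA_TYPES and the helper check_term are hypothetical; the actual PBCore vocabularies are defined with each element and are not reproduced here.

```python
# Hypothetical term list, for illustration only (not a PBCore vocabulary).
ALLOWED_MEDIA_TYPES = frozenset({"audio", "video", "image", "text"})


def check_term(value, vocabulary):
    """Return the normalized term if it is allowed; otherwise raise ValueError."""
    term = value.strip().lower()
    if term not in vocabulary:
        allowed = ", ".join(sorted(vocabulary))
        raise ValueError(f"{value!r} is not an allowed term; use one of: {allowed}")
    return term


print(check_term(" Video ", ALLOWED_MEDIA_TYPES))
```

Rejecting a term such as "movie" at entry time is what later spares a searcher from missed "hits." Authority Files and Thesauri need richer structures than a flat set, which this sketch does not attempt.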
What are Element Attributes?
Once again the topic focuses on specificity
and standardization. There are many specifications on how to define
data elements. If one hopes to share metadata descriptions with other
organizations and entities (interoperability), then it's best to follow
an established set of guidelines in setting up and defining metadata
elements. A commonly understood framework allows diverse groups to appreciate,
even harvest, data from each other.
The PBCore uses a modified form of a standard for
describing data elements in databases and documents:
ISO 11179,
Specification and Standardization of Data Elements. Under
this standard, each metadata element is identified by numerous attributes.
By defining each attribute in the set of attributes for a metadata element,
that element receives a carefully honed statement of meaning.
The attributes attached to the PBCore metadata
elements are displayed by selecting any of the elements listed on the
left side of this web page. What are the attributes we've used in the
PBCore Metadata Elements?
Name |
The actual name of the
element, including qualified elements. |
Definition |
A brief definition of the element. Guidelines for
entering values and actually applying an element are described under
the attribute Guidelines for Usage. |
Refinements and Encoding Schemes |
If a particular controlled vocabulary is to be used
with an element, then a URL reference is included, as well as a
pull-down list of the allowable values if the Term List is short
and manageable. Otherwise a reference or link to an authority file
and its originating organization is provided.
If a particular syntax, punctuation or grammar
is used to guide the form in which descriptions are entered, then
either the rules are presented or a URL link to the rules is provided. |
Guidelines for Usage |
Statements about the appropriate
way in which to apply a metadata element. The Guidelines are a brief
approximation of a user's guide to understanding and applying a particular
metadata element. |
Obligation to Use |
A metadata element does not have
to be employed when describing a media resource. Typically this
attribute indicates if the use of an element is MANDATORY, OPTIONAL,
or RECOMMENDED. This attribute may also be referred to as Obligation
or Status. |
Repeatable Element |
Some metadata schemes, such as the Dublin Core,
suggest that if you need to apply more than one value to a single element,
you can repeat the element along with each associated value.
This attribute may use such terms as REPEATABLE, UNBOUNDED, or actually
use a number. |
Type of Data Entry |
Any database designer must indicate
what type of data is permissible for a field as a value is typed
or entered. Typically, this attribute indicates if the value is
a TEXT STRING, NUMBER, or DATE. |
Examples |
Definitions of an element are often
enhanced by using real world examples. The PBCore provides these
examples as an aid to understanding. |
Element Label |
Usually the attribute Name and the
attribute Label are the same. The Label is used to indicate the exact
manner in which an element is referenced. |
Element Version |
While developing metadata, several
versions of elements or the meaning attached to them will emerge
over time. Like software editions that are released, Element Version
indicates the version you are viewing (hopefully, the most recent
version). |
Namespace Identifier |
http://library.csun.edu/mwoodley/dublincoreglossary.html
A unique name that identifies an organization that has developed
an XML schema. A namespace is identified via a Uniform Resource
Identifier (a URL or URN). For example, the namespace for Dublin
Core elements and qualifiers would be expressed respectively in
XML as:
xmlns:dc = "http://dublincore.org/elements/1.0/"
xmlns:dcq = "http://dublincore.org/qualifiers/1.0/"
The use of namespaces allows the definition
of an element to be unambiguously identified with a URI, even
though the label "title" alone might occur in many metadata
sets. In more general terms, one can think of any closed set of
names as a namespace. Thus, a controlled vocabulary such as the
Library of Congress Subject Headings, a set of metadata elements
such as DC, or the set of all URLs in a given domain can be thought
of as a namespace that is managed by the authority that is in
charge of that particular set of terms. |
Registration Authority |
http://library.csun.edu/mwoodley/dublincoreglossary.html
A system to provide management of metadata elements.
Metadata registries are formal systems that provide authoritative
information about the semantics and structure of data elements.
Each element will include the definition of the element, the qualifiers
associated with it, mappings to multilingual versions and elements
in other schema. A registration authority
facilitates the consistent use of a metadata element by all parties
and communities. It also contributes to the longevity of a metadata
element as it maintains its integrity over time. |
Language of the Element |
Depending on the Registration Authority
for a metadata element or its country of origination and usage,
the language used for the element is indicated. For the PBCore,
the Language is expected to be English and uses the designation
"eng". Standards exist to express languages in either
two-letter or three-letter codes. The three-letter codes are defined in
ISO 639-2: Codes for the representation of names of languages
(http://www.loc.gov/standards/iso639-2). |
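To show how a few of the attributes above might drive an actual check, here is a minimal sketch in Python. It models only Name, Obligation to Use, Repeatable Element, and Type of Data Entry. The class ElementDefinition, the function check_record, and the obligation and repeatability values assigned to the two sample elements are assumptions made for illustration; they are not taken from the PBCore definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ElementDefinition:
    name: str         # attribute "Name", e.g. "Title.Alternative"
    obligation: str   # "MANDATORY", "RECOMMENDED", or "OPTIONAL"
    repeatable: bool  # may the element occur more than once?
    data_type: type   # stand-in for "Type of Data Entry" (str = TEXT STRING)


def check_record(record, definitions):
    """Return a list of problems; record maps element names to lists of values."""
    problems = []
    for d in definitions:
        values = record.get(d.name, [])
        if d.obligation == "MANDATORY" and not values:
            problems.append(f"{d.name}: missing mandatory element")
        if len(values) > 1 and not d.repeatable:
            problems.append(f"{d.name}: element is not repeatable")
        if any(not isinstance(v, d.data_type) for v in values):
            problems.append(f"{d.name}: value has the wrong data type")
    return problems


# Hypothetical settings, chosen only to exercise the checks.
DEFINITIONS = [
    ElementDefinition("Title", "MANDATORY", False, str),
    ElementDefinition("Title.Alternative", "OPTIONAL", True, str),
]

print(check_record({"Title": ["Example Program"]}, DEFINITIONS))
print(check_record({}, DEFINITIONS))
```

A record that satisfies every definition yields an empty list; each violation adds one readable message, which is about as much as a person entering descriptions needs to see.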
What is an Application Profile?
http://library.csun.edu/mwoodley/dublincoreglossary.html
An Application Profile is a set of metadata
elements, policies, and guidelines defined for a particular application.
The elements may be from one or more element sets, thus allowing a
given application to meet its functional requirements by using metadata
from several element sets including locally defined sets. For example,
a given application might choose a subset of the Dublin Core that
meets its needs, or may include elements from the Dublin Core, another
element set, and several locally defined elements, all combined in
a single schema. An Application Profile is not complete without documentation
that defines the policies and best practices appropriate to the application.
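As a rough illustration of elements from several sets combined in a single schema, the Python sketch below builds one small XML record that mixes a Dublin Core element with a locally defined one. The Dublin Core namespace URI is copied from the Namespace Identifier entry in this guide and may not match the namespace in current use. The "local" namespace URI, the record wrapper, and the shelfLocation element are hypothetical.

```python
import xml.etree.ElementTree as ET

# URI as quoted in this guide's Namespace Identifier entry (may be outdated).
DC = "http://dublincore.org/elements/1.0/"
# Hypothetical namespace for a locally defined element set.
LOCAL = "http://example.org/local-elements/"

ET.register_namespace("dc", DC)
ET.register_namespace("local", LOCAL)

# One record drawing on two element sets.
record = ET.Element("record")
ET.SubElement(record, f"{{{DC}}}title").text = "Example Program"
ET.SubElement(record, f"{{{LOCAL}}}shelfLocation").text = "Vault 3"

xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Because each element carries its namespace, a reader of the record can tell which element set, and therefore which definition, a given "title" belongs to.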
For the Public Broadcasting Core of Metadata,
we have drawn metadata elements, policies and guidelines from the Dublin
Core Metadata Initiative, from the Video
Development Initiative (ViDe), from many digital asset management
projects underway at various public broadcasting stations, producers
and developers, and from other entities. Thus we are building an Application
Profile for the PBCore.
With the presentation of version 1.0 of the
PBCore, are we finished? Probably not. As the PBCore is used by various
communities, we will undoubtedly add extensions to the existing set
of metadata elements to accommodate special needs (see PBCore
Extensions).
For example, extensions that we know are important
to Public Broadcasting are those related to the use of their media resources
in educational venues. The Dublin Core has a draft proposal for metadata
elements being assembled by its Education
Working Group. The element Audience has already been folded
into the PBCore. Other educationally oriented elements include Standard
(academic/curriculum), Mediator, InteractivityType, InteractivityLevel,
and TypicalLearningTime. These elements are under consideration and
may be folded into the PBCore or treated as special extensions for use
by certain Public Broadcasting communities.
What is the Dublin Core (DCMI)?
http://www.dublincore.org
The Dublin Core Metadata Initiative (DCMI) is an
open forum engaged in the development of interoperable online metadata
standards that support a broad range of purposes and business models.
DCMI's activities include consensus-driven working groups, global
workshops, conferences, standards liaison, and educational efforts
to promote widespread acceptance of metadata standards and practices.
The Dublin Core is a 15-element metadata
element set intended to facilitate discovery of electronic resources.
The Dublin Core has been in development since 1995 through a series
of focused invitational workshops that gather experts from the library
world, the networking and digital library research communities, and
a variety of content specialties. See Section 1 of this guide or the
Dublin Core Web Site.
The Dublin Core Metadata Initiative is the
body responsible for the ongoing maintenance of Dublin Core. DCMI
is currently hosted by the OCLC
Online Computer Library Center, Inc., a not-for-profit
international library consortium. The work of DCMI is done by contributors
from many institutions in many countries. DCMI is a consensus-driven
organization organized into working groups to address particular problems
and tasks. DCMI working groups are open to all interested parties.
Instructions for joining can be found at the DCMI web site under Working
Groups.
What is the Public Broadcasting Core (PBMD Project)?
On its surface, “metadata” appears
to be an arcane topic reserved for librarians or systems engineers.
In fact, in a rapidly evolving and deeply challenging media environment,
a well-formed Metadata Dictionary directly addresses our core mission
of serving the people of the United States, and may, to a great extent,
determine our future relevance. By working hard to sensibly describe
our content, and to facilitate easy access and use by teachers, scholars,
lifelong learners, engaged citizens and community partners, we will
reaffirm our national and local value as providers of media of the highest
editorial integrity, offered for the public good.
It is our fervent hope that the Metadata Dictionary Project (or Public Broadcasting Metadata Dictionary Project, PBMD Project) also models
a process of cross-disciplinary consensus within our "industry"
around critical standards – a template that will serve us in good
stead as we meet the many data challenges of the future.
Within public broadcasting, the application
of a shared metadata dictionary will facilitate the exchange and delivery
of content and data (including both program elements and completed programs)
throughout our multiplatform production teams, our system of interconnected
licensees and out to our broadcast and Internet constituents. It is
a critical first step as PBS, NPR, PRI, individual stations, and others
begin to acquire and use asset management systems to organize their
content.
The project has been under way since January 2002,
and during its first two phases of CPB Future Fund support, a team
of individuals representing public broadcasting's key institutions
and endeavors, along with subject matter experts, has worked to:
- Develop consensus regarding project objectives
and timeline;
- Recognize and codify the way our constituents
use our content and content information. (Developed use cases based
on interviews with producers, broadcast operation staff, educators,
website creators, etc.);
- Examine relevant metadata standards in the media
and library communities, to ascertain their applicability to our
content and constituencies;
- Make information about the PBMD Project available via
numerous conference presentations and a project website;
- Contribute and combine the substantial metadata
work already performed at key institutions in public broadcasting
(PBS, NPR, WGBH, KUED, MPR);
- Form a preliminary consensus regarding a single
set of metadata protocols - the Public Broadcasting Core (PB Core)
Metadata, Preliminary Version 1.0.
For more information on the Project, please see the project background
paper and "PBMD Project
Progress Report"
submitted to the 2003 Dublin Core Conference by the Public Broadcasting
Metadata Working Group.
How do I Review the Metadata Elements of the PBCore?
The PBCore Metadata Descriptors are called
"Elements." Currently we have 58 Elements, many of which are
"Refined Elements" or "Elements with Qualifiers."
Qualified Elements are recognized by a ".extension" (for example,
"Title.Alternative").
We have gathered the PBCore Elements into
three clusters; each cluster houses elements of a similar nature...
- CONTENT...
20 elements describing the actual intellectual content of a media
asset or resource.
- INTELLECTUAL PROPERTY...
9 elements related to the creation, creators and usage of a media
asset or resource.
- INSTANTIATION...
29 elements that identify the nature of the media asset as it exists
in some form or format in the physical world or digitally.
To review detailed information about an Element
or Qualified Element, click on its name from the list of elements in
the left hand column of a page. There you will see outlined the attributes
for an element.
More Resources for the Curious and Compulsive
For additional background on Digital Asset
Management and Metadata, please refer to our two websites that provide
numerous links to presentations, papers and additional resources.
CPB Asset Management
http://www.pbcore.org/cpbasset
Public Broadcasting Metadata Dictionary Project
http://www.pbcore.org