Home Page for PBCore Metadata

DRAFT v0.9 03 Feb 2004

Project Background
QuickStart Guide
Glossary of Terms
RFC & Test Implementation Docs
THE ELEMENTS Help

Descriptions about the
CONTENT...

01.00
01.01
01.02
01.03
01.04
03.00
04.00
04.01
04.02
04.03
08.00
08.01
08.02
11.00
13.01
13.02
14.01
14.02
16.01
16.02

Descriptions related to
INTELLECTUAL PROPERTY...

02.00
02.01
05.00
05.01
06.00
06.01
15.01
15.02
15.03

Descriptions identifying
a media asset's
INSTANTIATION...

07.01
07.02
07.03
07.04
09.01
09.02
09.03
09.04
09.05
09.06
09.07
09.08
09.09
09.10
09.11
09.12
09.13
09.14
09.15
09.16
09.17
09.18
09.19
09.20
10.00
12.00
12.01
18.00
19.00

Descriptions beyond
the PBCore Metadata

99.00


QuickStart Guide
to Understand the PBCore Metadata


What is Metadata, Anyway?
What is a Metadata Dictionary?
What is a Metadata Element?
What is a Qualified or Refined Element?
What is a Controlled Vocabulary?
What are Element Attributes?
What is an Application Profile?
What is the Dublin Core (DCMI)?
What is the Public Broadcasting Core (PBMD Project)?
How do I Review the Metadata Elements of the PBCore?
More Resources for the Curious and Compulsive

 


What is Metadata, Anyway?

“Metadata” is descriptive information about a resource. The resource may be video or audio, an image or graphic, a text-based document, or any other informational item whether electronic or not.

The primary purpose of metadata is sharing...the ability to describe a resource and allow someone to discover, review, select, and retrieve an item.

Examples of metadata include the name of an item; descriptions or abstracts about its content; keywords or subject classifications; file formats; authors; producers; distributors; publishers; copyright and usage restrictions; etc.

Metadata needs to be structured in some way. The descriptions available through metadata cannot be created in a random or ad hoc manner. In other words, metadata should follow a well-documented, formalized scheme.

By the way, the "descriptions" are called "metadata." However, the "thing" being described is often referred to as the "essence." Essence + Metadata yields a media asset that has value to various end-user communities.

For additional background information on why we should be using the same metadata standards, link to these references:

    1. Mary Jane McKinven’s article "The Case for Shared Metadata Standards" in the May 13, 2002, issue of “Current.”
    2. The project background paper and "PBMD Project Progress Report" submitted to the 2003 Dublin Core Conference by the Public Broadcasting Metadata Working Group

What is a Metadata Dictionary?

It's all about definitions. When creating a systematic method for describing media resources, you have to start by creating metadata categories. Whether created from scratch or harvested from other sources, metadata categories must be defined...thus a metadata dictionary.

In defining a dictionary,...

What is a Metadata Element?

The Periodic Table of Elements contains a carefully structured visualization of the chemical building blocks of the universe as we know it. Metadata Elements are the descriptive building blocks used to verbally or visually describe the world of resources, assets, media, or "essence."

End of metaphor (our apologies to chemists everywhere). The Public Broadcasting Core consists of dozens of Metadata Elements. Currently we have 58 Elements which can be accessed from the table on the left side of this web page.

What is a Qualified or Refined Element?

We're talking about drilling down to the descriptive core of a media essence or resource. If the descriptors attached to a specific element aren't specific or expressive enough to fully identify an item, then that element may be further refined or qualified.

The reason to use qualified elements is to achieve a level of descriptive detail that best suits a community of users (such as Public Broadcasting) without going overboard. After all, at some point in the process, a real, live person must describe an asset using the elements provided. An overly simplistic set of elements fails to capture the nature of a resource. An overly compulsive set of elements may capture every aspect of a resource, but be too difficult to use, too time-consuming to implement, and too confusing for most people to understand. One needs a set of elements and qualified elements that is "just right." Currently the PBCore has 58 Elements, many of which are "Refined Elements" or "Elements with Qualifiers." Qualified Elements are recognized by a ".extension" (for example, "Title.Alternative").
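To make the ".extension" convention concrete, here is a minimal Python sketch that splits an element name into its base element and its qualifier. The function and class names are hypothetical, and the rule that the qualifier is everything after the first dot is an assumption for illustration, not a PBCore specification.

```python
from typing import NamedTuple, Optional


class ElementName(NamedTuple):
    base: str                 # e.g. "Title"
    qualifier: Optional[str]  # e.g. "Alternative", or None if unqualified


def parse_element_name(name: str) -> ElementName:
    """Split 'Base.Qualifier' at the first dot (assumed convention)."""
    base, dot, qualifier = name.partition(".")
    # Reject names such as ".Alternative" or "Title." that lack one half.
    if not base or (dot and not qualifier):
        raise ValueError(f"malformed element name: {name!r}")
    return ElementName(base, qualifier or None)


print(parse_element_name("Title.Alternative"))
print(parse_element_name("Title"))
```

An unqualified name such as "Title" simply comes back with no qualifier, so the same routine can handle both plain and refined elements.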

In addition to refining an element by creating "Qualified" Elements, refinement can be achieved by using a restricted set of descriptors (see What is a Controlled Vocabulary?).

Refinement is also achieved if the grammar of a descriptor is controlled. A good example of this type of refinement is the order in which a person's name is displayed, e.g., LastName, FirstName MiddleName, Credentials (for a very interesting discussion on the complexities of displaying names, see "Representing People's Names in Dublin Core").

Another example is the manner in which dates are represented, whether you order the data by Month/Day/Year, Day/Month/Year or Year/Month/Day (for a discussion of the representation variables in displaying dates and times, see the W3C report on "Date and Time Formats").
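As a rough illustration of controlling the grammar of a descriptor, the Python sketch below renders a person's name in the "LastName, FirstName MiddleName, Credentials" order and a date in Year-Month-Day order. The function names and the sample person are hypothetical, and the choice of the YYYY-MM-DD form is an assumption for the example rather than a stated PBCore rule.

```python
from datetime import date


def format_name(last: str, first: str, middle: str = "", credentials: str = "") -> str:
    """Render a name as 'LastName, FirstName MiddleName, Credentials'."""
    given = " ".join(part for part in (first, middle) if part)
    pieces = [last, given]
    if credentials:
        pieces.append(credentials)
    return ", ".join(pieces)


def format_date(d: date) -> str:
    """Render a date in Year-Month-Day order as YYYY-MM-DD."""
    return d.isoformat()


print(format_name("Smith", "Jane", "Q.", "PhD"))  # Smith, Jane Q., PhD
print(format_date(date(2004, 2, 3)))              # 2004-02-03
```

Whatever order is chosen, the point is that every cataloger produces the same form, so that sorting and searching behave predictably.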

What is a Controlled Vocabulary?

By controlling the vocabulary or the descriptors associated with a metadata element, we can improve the discovery of media resources and their retrieval. For some elements, like TITLE, there are few restrictions on the words and terms that are used. For many other elements, the possible descriptors may be limited or controlled in order to ensure consistency. Basically, a controlled vocabulary consists of a carefully predefined set of values that are permissible. Without the use of controlled vocabularies, users may enter their search criteria only to find a limited number of "hits" or, in other situations, an explosive number of irrelevant results.

Some controlled vocabularies are very short and simple lists of allowable Terms. In other situations, the number of authorized terms may be large and more complex. These vocabularies are referred to as Authority Files and are carefully crafted compilations of the appropriate manner and form in which to describe something. If the relationships of meaning that exist between terms need to be expressed, then a Thesaurus may be employed. A Thesaurus shows all allowable terms and the relationships between them.

The PBCore uses Controlled Vocabularies wherever precision is needed and ambiguity is to be avoided.
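A short controlled vocabulary can be sketched in Python as a fixed set of permissible terms plus a check that maps user input onto the authorized form. The term list and function name below are hypothetical placeholders, not an actual PBCore vocabulary.

```python
# Hypothetical term list for illustration; not an actual PBCore vocabulary.
ALLOWED_MEDIA_TYPES = frozenset({"Audio", "Video", "Image", "Text"})


def normalize_term(value: str, vocabulary: frozenset) -> str:
    """Return the authorized form of value, or raise ValueError if not permitted."""
    # Match case-insensitively, but always hand back the authorized spelling.
    lookup = {term.casefold(): term for term in vocabulary}
    key = value.strip().casefold()
    if key not in lookup:
        raise ValueError(f"{value!r} is not in the controlled vocabulary")
    return lookup[key]


print(normalize_term(" video ", ALLOWED_MEDIA_TYPES))  # Video
```

Rejecting or normalizing values at entry time is what keeps later searches from missing records that differ only in spelling or capitalization.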

What are Element Attributes?

Once again the topic focuses on specificity and standardization. There are many specifications on how to define data elements. If one hopes to share metadata descriptions with other organizations and entities (interoperability), then it's best to follow an established set of guidelines in setting up and defining metadata elements. A commonly understood framework allows diverse groups to appreciate, even harvest, data from each other.

The PBCore uses a modified form of an established standard for describing data elements in databases and documents: ISO/IEC 11179, Specification and Standardization of Data Elements. Under this standard, each metadata element is identified by numerous attributes. By defining each attribute in the set of attributes for a metadata element, that element receives a carefully honed statement of meaning.

The attributes attached to the PBCore metadata elements are displayed by selecting any of the elements listed on the left side of this web page. What are the attributes we've used in the PBCore Metadata Elements?


Name

The actual name of the element, including qualified elements.

Definition

A brief definition of the element. Guidelines for entering values and actually applying an element are described under the attribute Guidelines for Usage.

Refinements and Encoding Schemes

If a particular controlled vocabulary is to be used with an element, then a URL reference is included, as well as a pull-down list of the allowable values if the Term List is short and manageable. Otherwise a reference or link to an authority file and its originating organization is provided.

If a particular syntax, punctuation or grammar is used to guide the form in which descriptions are entered, then either the rules are presented or a URL link to the rules is provided.


Guidelines for Usage

Statements about the appropriate way in which to apply a metadata element. The Guidelines are a brief approximation of a user's guide to understanding and applying a particular metadata element.

Obligation to Use

Not every metadata element has to be employed when describing a media resource. Typically this attribute indicates whether the use of an element is MANDATORY, OPTIONAL, or RECOMMENDED. This attribute may also be referred to as Obligation or Status.

Repeatable Element

Some metadata schemes, such as the Dublin Core, suggest that if you need to apply more than one value to a single element, you can repeat the element together with each associated value. This attribute may use such terms as REPEATABLE or UNBOUNDED, or give an actual number.

Type of Data Entry

Any database designer must indicate what type of data is permissible for a field as a value is typed or entered. Typically, this attribute indicates if the value is a TEXT STRING, NUMBER, or DATE.

Examples

Definitions of an element are often enhanced by using real world examples. The PBCore provides these examples as an aid to understanding.

Element Label

Usually the attribute Name and the attribute Label are the same. The Label is used to indicate the exact manner in which an element is referenced.

Element Version

As metadata is developed, several versions of an element, or of the meaning attached to it, will emerge over time. As with released editions of software, Element Version indicates the version you are viewing (hopefully, the most recent version).

Namespace Identifier

http://library.csun.edu/mwoodley/dublincoreglossary.html

A unique name that identifies an organization that has developed an XML schema. A namespace is identified via a Uniform Resource Identifier (a URL or URN). For example, the namespace for Dublin Core elements and qualifiers would be expressed respectively in XML as:

xmlns:dc = "http://dublincore.org/elements/1.0/"
xmlns:dcq = "http://dublincore.org/qualifiers/1.0/"

The use of namespaces allows the definition of an element to be unambiguously identified with a URI, even though the label "title" alone might occur in many metadata sets. In more general terms, one can think of any closed set of names as a namespace. Thus, a controlled vocabulary such as the Library of Congress Subject Headings, a set of metadata elements such as DC, or the set of all URLs in a given domain can be thought of as a namespace that is managed by the authority that is in charge of that particular set of terms.
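The effect of a namespace can be sketched with Python's standard xml.etree.ElementTree module. In the example below, two elements share the label "title" but remain distinct because each is bound to a different URI. The Dublin Core URI is the one quoted above; the second namespace, the record structure, and the sample titles are hypothetical.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://dublincore.org/elements/1.0/"   # URI quoted in the text above
LOCAL_NS = "http://example.org/local-terms/"    # hypothetical second namespace

# Ask the serializer to use readable prefixes for these URIs.
ET.register_namespace("dc", DC_NS)
ET.register_namespace("local", LOCAL_NS)

record = ET.Element("record")
ET.SubElement(record, f"{{{DC_NS}}}title").text = "An Example Program"
ET.SubElement(record, f"{{{LOCAL_NS}}}title").text = "Internal working title"

# Both children carry the label "title", but their full names differ by URI,
# so a search for the Dublin Core title finds only the first one.
dc_titles = record.findall(f"{{{DC_NS}}}title")
print([t.text for t in dc_titles])
print(ET.tostring(record, encoding="unicode"))
```

The prefix ("dc" or "local") is only a shorthand; it is the URI behind the prefix that identifies the managing authority.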


Registration Authority

http://library.csun.edu/mwoodley/dublincoreglossary.html

A system to provide management of metadata elements. Metadata registries are formal systems that provide authoritative information about the semantics and structure of data elements. Each element will include the definition of the element, the qualifiers associated with it, mappings to multilingual versions and elements in other schema.

A registration authority facilitates the consistent use of a metadata element by all parties and communities. It also contributes to the longevity of a metadata element as it maintains its integrity over time.


Language of the Element

Depending on the Registration Authority for a metadata element or its country of origination and usage, the language used for the element is indicated. For the PBCore, the Language is expected to be English and uses the designation "eng". Standards exist to express languages in either two-letter or three-letter codes.

ISO 639-2: Codes for the representation of names of languages as a 3-letter code.
http://www.loc.gov/standards/iso639-2
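Several of the attributes described above (Name, Definition, Obligation to Use, Repeatable Element, Type of Data Entry, Element Version, Language) can be pictured as fields of a single record. The Python sketch below is one possible representation; the class name, field names, default values, and the sample definition text are all assumptions for illustration and are not drawn from the PBCore itself.

```python
from dataclasses import dataclass
from typing import List, Optional

OBLIGATIONS = {"MANDATORY", "OPTIONAL", "RECOMMENDED"}
DATA_TYPES = {"TEXT STRING", "NUMBER", "DATE"}


@dataclass(frozen=True)
class ElementDefinition:
    name: str
    definition: str
    obligation: str = "OPTIONAL"
    max_occurs: Optional[int] = None   # None stands for UNBOUNDED
    data_type: str = "TEXT STRING"
    version: str = "1.0"               # illustrative default only
    language: str = "eng"              # ISO 639-2 three-letter code

    def __post_init__(self) -> None:
        if self.obligation not in OBLIGATIONS:
            raise ValueError(f"unknown obligation: {self.obligation!r}")
        if self.data_type not in DATA_TYPES:
            raise ValueError(f"unknown data type: {self.data_type!r}")
        if self.max_occurs is not None and self.max_occurs < 1:
            raise ValueError("max_occurs must be at least 1 or None")

    def check_values(self, values: List[str]) -> List[str]:
        """Return a list of problems with the values supplied for this element."""
        problems = []
        if self.obligation == "MANDATORY" and not values:
            problems.append(f"{self.name} is mandatory but has no value")
        if self.max_occurs is not None and len(values) > self.max_occurs:
            problems.append(f"{self.name} allows at most {self.max_occurs} value(s)")
        return problems


title = ElementDefinition(
    name="Title",
    definition="A name given to the resource (hypothetical wording).",
    obligation="MANDATORY",
    max_occurs=1,
)
print(title.check_values([]))
print(title.check_values(["First title", "Second title"]))
```

Keeping the obligation and repeatability rules next to the definition means the same record can both document an element and validate the values entered for it.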



What is an Application Profile?

 

http://library.csun.edu/mwoodley/dublincoreglossary.html

An Application Profile is a set of metadata elements, policies, and guidelines defined for a particular application. The elements may be from one or more element sets, thus allowing a given application to meet its functional requirements by using metadata from several element sets including locally defined sets. For example, a given application might choose a subset of the Dublin Core that meets its needs, or may include elements from the Dublin Core, another element set, and several locally defined elements, all combined in a single schema. An Application Profile is not complete without documentation that defines the policies and best practices appropriate to the application.
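The idea of combining elements from several sets can be sketched in a few lines of Python. The profile below picks a subset of Dublin Core elements and adds elements from two other sets, recording where each one came from. Title, Creator, Subject, and Description are genuine Dublin Core elements; the other two sets, their element names, and the function name are hypothetical.

```python
# Title, Creator, Subject and Description are Dublin Core elements;
# the other two sets are hypothetical placeholders.
DUBLIN_CORE = {"Title", "Creator", "Subject", "Description"}
OTHER_SET = {"Duration", "FileSize"}
LOCAL_SET = {"StationCallSign"}


def build_profile(selections):
    """Combine chosen elements into one profile, mapping element -> source set.

    selections is a list of (source_name, source_set, chosen_elements) tuples.
    """
    profile = {}
    for source_name, source_set, chosen in selections:
        missing = set(chosen) - source_set
        if missing:
            raise ValueError(f"{sorted(missing)} not defined in {source_name}")
        for element in chosen:
            if element in profile:
                raise ValueError(f"{element} already taken from {profile[element]}")
            profile[element] = source_name
    return profile


profile = build_profile([
    ("dc", DUBLIN_CORE, ["Title", "Subject"]),
    ("other", OTHER_SET, ["Duration"]),
    ("local", LOCAL_SET, ["StationCallSign"]),
])
print(profile)
```

A real Application Profile would also carry the policies and usage guidelines for each element; this sketch covers only the selection step.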

For the Public Broadcasting Core of Metadata, we have drawn metadata elements, policies and guidelines from the Dublin Core Metadata Initiative, from the Video Development Initiative (ViDe), from many digital asset management projects underway at various public broadcasting stations, producers and developers, and from other entities. Thus we are building an Application Profile for the PBCore.

With the presentation of version 1.0 of the PBCore, are we finished? Probably not. As the PBCore is used by various communities, we will undoubtedly add extensions to the existing set of metadata elements to accommodate special needs (see PBCore Extensions).

For example, extensions that we know are important to Public Broadcasting are those related to the use of its media resources in educational venues. The Dublin Core has a draft proposal for metadata elements being assembled by its Education Working Group. The element Audience has already been folded into the PBCore. Other educationally oriented elements include Standard (academic/curriculum), Mediator, InteractivityType, InteractivityLevel, and TypicalLearningTime. These elements are under consideration and may be folded into the PBCore or treated as special extensions for use by certain Public Broadcasting communities.

What is the Dublin Core (DCMI)?

 

http://www.dublincore.org

The Dublin Core Metadata Initiative (DCMI) is an open forum engaged in the development of interoperable online metadata standards that support a broad range of purposes and business models. DCMI's activities include consensus-driven working groups, global workshops, conferences, standards liaison, and educational efforts to promote widespread acceptance of metadata standards and practices.

The Dublin Core is a 15-element metadata element set intended to facilitate discovery of electronic resources. The Dublin Core has been in development since 1995 through a series of focused invitational workshops that gather experts from the library world, the networking and digital library research communities, and a variety of content specialties. See the Dublin Core web site.

The Dublin Core Metadata Initiative is the body responsible for the ongoing maintenance of Dublin Core. DCMI is currently hosted by the OCLC Online Computer Library Center, Inc., a not-for-profit international library consortium. The work of DCMI is done by contributors from many institutions in many countries. DCMI is a consensus-driven organization organized into working groups to address particular problems and tasks. DCMI working groups are open to all interested parties. Instructions for joining can be found at the DCMI web site under Working Groups.

What is the Public Broadcasting Core (PBMD Project)?

On its surface, “metadata” appears to be an arcane topic reserved for librarians or systems engineers. In fact, in a rapidly evolving and deeply challenging media environment, a well-formed Metadata Dictionary directly addresses our core mission of serving the people of the United States, and may, to a great extent, determine our future relevance. By working hard to sensibly describe our content, and to facilitate easy access and use by teachers, scholars, lifelong learners, engaged citizens and community partners, we will reaffirm our national and local value as providers of media of the highest editorial integrity, offered for the public good.

It is our fervent hope that the Metadata Dictionary Project (or Public Broadcasting Metadata Dictionary Project, PBMD Project) also models a process of cross-disciplinary consensus within our "industry" around critical standards – a template that will serve us in good stead as we meet the many data challenges of the future.

Within public broadcasting, the application of a shared metadata dictionary will facilitate the exchange and delivery of content and data (including both program elements and completed programs) throughout our multiplatform production teams, our system of interconnected licensees and out to our broadcast and Internet constituents. It is a critical first step as PBS, NPR, PRI, individual stations, and others begin to acquire and use asset management systems to organize their content.

The project has been under way since January 2002. During its first two phases of CPB Future Fund support, a team of individuals representing public broadcasting's key institutions and endeavors, along with subject matter experts, has worked to:

  • Develop consensus regarding project objectives and timeline;
  • Recognize and codify the way our constituents use our content and content information, by developing use cases based on interviews with producers, broadcast operation staff, educators, website creators, etc.;
  • Examine relevant metadata standards in the media and library communities, to ascertain their applicability to our content and constituencies;
  • Make information about the PBMD Project available via numerous conference presentations and a project website;
  • Contribute and combine the substantial metadata work already performed at key institutions in public broadcasting (PBS, NPR, WGBH, KUED, MPR);
  • Form a preliminary consensus regarding a single set of metadata protocols - the Public Broadcasting Core (PB Core) Metadata, Preliminary Version 1.0.

For more information on the Project, please see the project background paper and "PBMD Project Progress Report" submitted to the 2003 Dublin Core Conference by the Public Broadcasting Metadata Working Group.

 
 
 

How do I Review the Metadata Elements of the PBCore?

The PBCore Metadata Descriptors are called "Elements." Currently we have 58 Elements, many of which are "Refined Elements" or "Elements with Qualifiers." Qualified Elements are recognized by a ".extension" (for example, "Title.Alternative").

We have gathered the PBCore Elements into three clusters; each cluster houses elements of a similar nature...

  1. CONTENT...
    20 elements describing the actual intellectual content of a media asset or resource.
  2. INTELLECTUAL PROPERTY...
    9 elements related to the creation, creators and usage of a media asset or resource.
  3. INSTANTIATION...
    29 elements that identify the nature of the media asset as it exists in some form or format in the physical world or digitally.

To review detailed information about an Element or Qualified Element, click on its name from the list of elements in the left hand column of a page. There you will see outlined the attributes for an element.

More Resources for the Curious and Compulsive

For additional background on Digital Asset Management and Metadata, please refer to our two websites that provide numerous links to presentations, papers and additional resources.

CPB Asset Management
http://www.pbcore.org/cpbasset

Public Broadcasting Metadata Dictionary Project
http://www.pbcore.org

 

 

 



© 2003 Corporation for Public Broadcasting