
A Primer for
Understanding the PBCore Metadata


What is Metadata, Anyway?
What is a Metadata Dictionary?
What is a Metadata Element?
What is a Content Class and an Element Container?
What is an Authority, Controlled Vocabulary & Structured Syntax?
What are Element Attributes?
What is an Application Profile?
What is the Dublin Core (DCMI)?
What is the Public Broadcasting Core (PBMD Project)?
How do I Review the Metadata Elements of the PBCore?
What is an XML Schema Definition (XSD)?
What is a Namespace?
Do the PBCore Elements have a Hierarchical Tree Structure?
What is PBCore Compliant?
Do PBCore Elements Map to other Metadata Schemas?
More Resources for the Curious and Compulsive

 


What is Metadata, Anyway?

“Metadata” is descriptive information about a resource. The resource may be video or audio, an image or graphic, a text-based document, or any other informational item whether electronic or not.

The primary purpose of metadata is to enhance findability and facilitate sharing...the ability to describe a resource and allow someone to discover, review, select, and retrieve an item.

Examples of metadata include the name of an item; descriptions or abstracts about its content; keywords or subject classifications; file formats; authors; producers; distributors; publishers; copyright and usage restrictions; etc.

Metadata needs to be structured in some way. The descriptions available through metadata should not be created in a random or ad hoc manner. In other words, metadata should follow a well-documented, formalized scheme. The flip side of using standardized metadata schemes is called "Folksonomy" and is described in a Wikipedia article ...

In contrast to professionally developed taxonomies with controlled vocabularies, folksonomies are unsystematic and, from an information scientist's point of view, unsophisticated; however, for Internet users, they dramatically lower content categorization costs because there is no complicated, hierarchically organized nomenclature to learn. One simply creates and applies tags on the fly.

By the way, the "descriptions" are called "metadata." However, the "thing" being described is often referred to as the "essence." Essence + Metadata yields a media asset that has value to various end-user communities.

An online Metadata Primer is available from the NSDL--the National Science Digital Library.

For additional background information on why we should be using the same metadata standards, link to these references:

    1. Mary Jane McKinven’s article "The Case for Shared Metadata Standards" in the May 13, 2002, issue of “Current.”
    2. The project background paper and "PBMD Project Progress Report" submitted to the 2003 Dublin Core Conference by the Public Broadcasting Metadata Working Group



What is a Metadata Dictionary?

It's all about definitions. When creating a systematic method for describing media resources, you have to start by creating metadata categories. Whether created from scratch or harvested from other sources, metadata categories must be defined...thus a metadata dictionary.

According to an article in Wikipedia entitled Data Dictionary,

A data dictionary is a set of metadata that contains definitions and representations of data elements. When an organization builds an enterprise-wide data dictionary, it may include both semantics and representational definitions for data elements. The semantic components focus on creating precise meaning of data elements. Representation definitions include how data elements are stored in a computer structure such as an integer, string or date format (see data type). Data dictionaries are one step along a pathway of creating precise semantic definitions for an organization.

Basically, a metadata dictionary describes the following:

  1. common metadata meanings (semantics),
  2. common grammar and rules for expressing data (syntax),
  3. commonly defined metadata dictionary element properties (attributes)

In defining a dictionary...

A step above a metadata dictionary is a metadata schema. According to the Moving Image Collections (MIC) description for a schema...

  • A metadata schema is a standardized structure for metadata which allows repositories or machines to share data with mutual understanding. The metadata schema defines the data elements (fields) or tags (labels) used to enable indexing, retrieval, display, and sharing of records by computer systems.



What is a Metadata Element?

The Periodic Table of Elements contains a carefully structured visualization of the chemical building blocks of the universe as we know it. Metadata Elements are the descriptive building blocks used to verbally or visually describe the world of resources, assets, media items, or "essence."

End of metaphor (our apologies to chemists everywhere). The Public Broadcasting Core consists of dozens of Metadata Elements. Currently we have 53 Elements which can be accessed from the table on the left side of most PBCore web pages within the User Guide.



What is a Content Class and an Element Container?

If the descriptors attached to a specific element aren't specific or expressive enough to fully identify an item, then that element needs to be further enhanced, refined or qualified by using a companion or related element.

An element and its sibling element descriptions are often bound together in a "Container," implying that the descriptions are related in nature and should be transported or conveyed to information systems intact, as a bundle. A simple example might be the contributing creators of a video program in which one "contributor container" contains an individual's name (e.g., Smithee, Alan) plus that person's role (director). A second instance of a "contributor container" contains another name (e.g., Doe, Jane) plus that person's role (executive producer). The name is "qualified" by a role or function. Another example is the title for a media item that is bundled with another metadata element, such as titleType (e.g., a series title or an episode title). Unless a name and its associated role are paired in a container, all you achieve is a list of names and disassociated roles.
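The contributor example can be sketched in a few lines of Python. This is only an illustration of the container idea: the tag names pbcoreContributor, contributor and contributorRole are assumed here rather than taken from this page, and the wrapping "record" element is hypothetical.

```python
import xml.etree.ElementTree as ET


def contributor_container(name: str, role: str) -> ET.Element:
    """Bundle one name with its role so the pair travels together."""
    container = ET.Element("pbcoreContributor")  # assumed container name
    ET.SubElement(container, "contributor").text = name
    ET.SubElement(container, "contributorRole").text = role
    return container


record = ET.Element("record")  # hypothetical wrapper for this sketch
record.append(contributor_container("Smithee, Alan", "director"))
record.append(contributor_container("Doe, Jane", "executive producer"))

# Reading the containers back keeps each name paired with its own role.
pairs = [
    (c.findtext("contributor"), c.findtext("contributorRole"))
    for c in record.findall("pbcoreContributor")
]
print(pairs)
```

Without the containers, the same data would be two flat lists (names and roles) with nothing to say which role belongs to which person.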

A standalone element or elements bound by element containers can be gathered together under what is called a "Content Class." Content Classes are created as "conceptual wrappers" that cluster together a list or structure of thematically-related Elements (metadata fields and their attributes and properties). These are often built in a hierarchical structure.

 



What is an Authority, Controlled Vocabulary, and Structured Syntax?

In addition to refining the descriptions of a media item by using related Elements combined into Element Containers, wrapped within the conceptual grouping of a Content Class, metadata descriptions can be refined further by establishing restrictions on how data is actually entered.

The grammar of a descriptor may be controlled. A good example of this type of refinement is the order in which a person's name is displayed, e.g., LastName, FirstName MiddleName, Credentials (for a very interesting discussion on the complexities of displaying names, see "Representing People's Names in Dublin Core").

Another example is the manner in which dates are represented, whether you order the data by Month/Day/Year, Day/Month/Year or Year/Month/Day (for a discussion of the representation variables in displaying dates and times, see the W3C report on "Date and Time Formats").

In order to better control the terms and descriptions used while cataloging, some metadata elements can employ ways to refine or "encode/enter" your data, using formal notations, vocabularies or parsing rules.

  • Use an "authority file" from another agency that specifies how to properly enter descriptive information for a type of metadata element. It may provide taxonomies of terms organized into logical hierarchies, such as the Library of Congress "subject" terms.

  • Use a short listing of prescribed terms, often called a "controlled vocabulary." The best practice is to select a term or terms from a picklist. The picklist ensures consistency in data entry.

  • Follow a particular structured syntax, punctuation or grammar when entering data, e.g., LastName, FirstName MiddleName, Credentials or dates as 2005-02-24 (YYYY-MM-DD).
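
As a rough illustration, a cataloging tool might enforce a picklist and a structured syntax with checks like these. The picklist values and function names are hypothetical; a real vocabulary would come from the documentation for the element in question.

```python
import re
from datetime import date

# Hypothetical picklist for a title-type element.
TITLE_TYPES = {"Series", "Episode", "Program"}

# "LastName, FirstName ..." : some text, a comma and a space, then more text.
NAME_PATTERN = re.compile(r"^[^,]+, \S.*$")


def is_controlled_term(term: str, picklist: set[str]) -> bool:
    """True when the term was chosen from the prescribed list."""
    return term in picklist


def is_iso_date(value: str) -> bool:
    """True only for valid calendar dates written as YYYY-MM-DD."""
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", value):
        return False
    try:
        date.fromisoformat(value)
    except ValueError:
        return False
    return True


def is_inverted_name(value: str) -> bool:
    """True when the name follows the 'LastName, FirstName' order."""
    return NAME_PATTERN.match(value) is not None


print(is_iso_date("2005-02-24"), is_iso_date("02/24/2005"))
print(is_inverted_name("Smithee, Alan"), is_inverted_name("Alan Smithee"))
```

Checks of this kind reject free-form entries at data-entry time, which is far cheaper than reconciling inconsistent records later.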

Controlling the descriptions entered for a metadata element ultimately means that end users are able to conduct successful searches for relevant media items and avoid an explosive number of irrelevant "hits."

The PBCore encourages Authorities, Controlled Vocabularies, and Structured Syntax wherever precision is needed and ambiguity is to be avoided.

For an excellent discussion on Controlled Vocabularies and Authorities, consult the MIC-- Moving Image Collections Cataloging and Metadata Portal for Standards and Tools at this web page: http://mic.imtc.gatech.edu/catalogers_portal/cat_cntrldVocab.htm



What are Element Attributes?

The focus is on specificity and standardization. There are many specifications on how to define data elements. If one hopes to share metadata descriptions with other organizations and entities (interoperability), then it's best to follow an established set of guidelines in setting up and defining metadata elements. A commonly understood framework allows diverse groups to appreciate, even harvest, data from each other.

The PBCore has used a modified standard for describing data elements used in databases and documents. It is called ISO/IEC 11179: Specification and Standardization of Data Elements. Technically speaking, PBCore is considered to be "cognizant of ISO/IEC 11179." In using this standard, each descriptor or metadata element is identified by numerous attributes or characteristics that define and refine the meaning of the element.

Within the PBCore User Guide, each metadata element is defined and described according to the ISO/IEC 11179 specifications. The attributes we've used for PBCore are fully explained on a separate web page entitled Attributes and Properties of the PBCore Metadata Elements.

 



What is an Application Profile?

 

An Application Profile is a set of metadata elements, policies, and guidelines defined for a particular application or situation. The elements may be harvested from one or more element sets, thus allowing a given application or profile to use pre-established, well-formed, standardized metadata in addition to other metadata descriptors that are created and defined locally (custom metadata). For example, a given application might choose a subset of the Dublin Core that meets its needs, or may select elements from the Dublin Core, another element set, and several locally defined elements, all combined in a single schema. An Application Profile is not complete unless adequate documentation is provided in order to identify and specify definitions, policies and best practices associated with an application profile's use.


http://www.ariadne.ac.uk/issue25/app-profiles/

"Application Profiles: Mixing and Matching Metadata Schemas"
by Rachel Heery and Manjula Patel

We define application profiles as schemas which consist of data elements drawn from one or more namespaces, combined together by implementers, and optimized for a particular local application. The experience of implementers is critical to effective metadata management...implementers use standard metadata schemas in a pragmatic way. This is not new, to re-work Diane Hillmann’s maxim ‘there are no metadata police’, implementers will bend and fit metadata schemas for their own purposes.
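
One way to picture an application profile is as a documented list of elements, each tagged with the namespace it was drawn from. The sketch below is illustrative only: the local namespace, the element tapeShelfLocation and the obligation values are invented for the example, while the Dublin Core namespace URI is the published one for the 15-element set.

```python
from dataclasses import dataclass

DC = "http://purl.org/dc/elements/1.1/"      # Dublin Core element set namespace
LOCAL = "http://example.org/station/terms/"  # hypothetical local namespace


@dataclass(frozen=True)
class ProfileElement:
    namespace: str
    name: str
    obligation: str  # "mandatory" or "optional", documented with the profile


# A profile mixing standard elements with one locally defined element.
PROFILE = [
    ProfileElement(DC, "title", "mandatory"),
    ProfileElement(DC, "description", "optional"),
    ProfileElement(LOCAL, "tapeShelfLocation", "optional"),
]


def namespaces_used(profile):
    """Return the distinct namespaces a profile draws from, in first-seen order."""
    seen = []
    for element in profile:
        if element.namespace not in seen:
            seen.append(element.namespace)
    return seen


print(namespaces_used(PROFILE))
```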

For the Public Broadcasting Core of Metadata, we have drawn metadata elements, policies and guidelines from the Dublin Core Metadata Dictionary Project, from the Video Development Project (ViDE), from many digital asset management projects underway at various public broadcasting stations, producers and developers, and other entities. Thus we are building an Application Profile for the PBCore.

With the presentation and documentation of PBCore, are we finished? Probably not. As the PBCore is used by various communities, we will undoubtedly add extensions to the existing set of metadata elements to accommodate special needs (see PBCore Extensions).

For example, extensions that we know are important to Public Broadcasting are those related to the use of their media resources in educational venues. The Dublin Core has a draft proposal for metadata elements being assembled by its Education Working Group. The element Audience has already been folded into the PBCore. Other educationally oriented elements include Standard (academic/curriculum), Mediator, InteractivityType, InteractivityLevel, and TypicalLearningTime. These elements are under consideration and may be folded into the PBCore or treated as special extensions for use by certain Public Broadcasting communities.



What is the Dublin Core (DCMI)?

 

http://dublincore.org

The PBCore is built on the foundation of the Dublin Core (ISO 15836), an international standard for resource discovery (http://dublincore.org), and has been reviewed by the Dublin Core Metadata Initiative Usage Board.

The Dublin Core Metadata Initiative (DCMI) is an open forum engaged in the development of interoperable online metadata standards that support a broad range of purposes and business models. DCMI's activities include consensus-driven working groups, global workshops, conferences, standards liaison, and educational efforts to promote widespread acceptance of metadata standards and practices.

The Dublin Core is a 15-element metadata element set intended to facilitate discovery of electronic resources. The Dublin Core has been in development since 1995 through a series of focused invitational workshops that gather experts from the library world, the networking and digital library research communities, and a variety of content specialties.

The Dublin Core Metadata Initiative is the body responsible for the ongoing maintenance of Dublin Core. The work of DCMI is done by contributors from many institutions in many countries. DCMI is a consensus-driven organization organized into working groups to address particular problems and tasks. DCMI working groups are open to all interested parties. Instructions for joining can be found at the DCMI web site under Working Groups.



What is the Public Broadcasting Core (PBMD Project)?

On its surface, "metadata" appears to be an arcane topic reserved for librarians or systems engineers. In actuality, in a rapidly evolving and deeply challenging media environment, a well-formed Metadata Dictionary directly addresses our core mission of serving the people of the United States, and may, to a great extent, determine our future relevance. By working hard to sensibly describe our content, and to facilitate easy access and use by teachers, scholars, lifelong learners, engaged citizens and community partners, we will reaffirm our national and local value as providers of media of the highest editorial integrity, offered for the public good.

It is our fervent hope that the Public Broadcasting Metadata Dictionary Project (PBMD) also models a process of cross-disciplinary consensus within our "industry" around critical standards – a template that will serve us in good stead as we meet the many data challenges of the future.

Within public broadcasting, the application of a shared metadata dictionary will facilitate the exchange and delivery of content and data (including both program elements and completed programs) throughout our multi-platform production teams, our system of interconnected licensees and out to our broadcast and Internet constituents. It is a critical first step as PBS, NPR, PRI, individual stations, and others begin to acquire and use asset management systems to organize their content.

The project has been extant since January of 2002, and during its first two phases of CPB Future Fund support, a team of individuals representing public broadcasting's key institutions and endeavors, along with subject matter experts, has worked to:

  • Develop consensus regarding project objectives and timeline;
  • Recognize and codify the way our constituents use our content and content information. (Developed use cases based on interviews with producers, broadcast operation staff, educators, website creators, etc.);
  • Examine relevant metadata standards in the media and library communities, to ascertain their applicability to our content and constituencies;
  • Make information about the PBMD Project available via numerous conference presentations and a project website;
  • Contribute and combine the substantial metadata work already performed at key institutions in public broadcasting (PBS, NPR, WGBH, KUED, MPR);
  • Form a preliminary consensus regarding a single set of metadata protocols - the Public Broadcasting Metadata Dictionary (PBCore), Version 1.0 (enhanced as v1.1 in January 2007).

In subsequent phases of the project, PBCore is building advocacy for the metadata dictionary and its associated XML Schema Definition (XSD), creating and conducting training in the use of PBCore (live and on-demand), and recommending and providing for the long-term sustainability and support of PBCore.

For more information on the Project, please see the project background paper and "PBMD Project Progress Report" submitted to the 2003 Dublin Core Conference by the Public Broadcasting Metadata Working Group.



How do I Review the Metadata Elements of the PBCore?

The PBCore Metadata Descriptors are called "Elements." Currently (in version 1.1) we have 53 Elements, many of which are related, sibling elements bound together in Element Containers (15 containers and 3 sub-containers).

All of the individual Elements and their associated Element Containers are gathered together into four "conceptual wrappers" or clusters called "Content Classes." Here is the breakdown.

  1. PBCoreIntellectualContent
    9 containers; 16 elements
    (metadata elements describing the actual intellectual content of a media asset or resource)
  2. PBCoreIntellectualProperty
    4 containers; 7 elements
    (metadata elements related to the creation, creators, usage, permissions, constraints, and use obligations associated with a media asset or resource)
  3. PBCoreInstantiation
    1 container, 3 sub-containers; 28 elements
    (metadata elements that identify the nature of the media asset as it exists in some form or format in the physical world or digitally)
  4. PBCoreExtensions
    1 container; 2 elements
    (additional descriptions that have been crafted by organizations outside of the PBCore Project. These extensions fulfill the metadata requirements for these outside groups as they identify and describe their own types of media with specialized, custom terminologies unique to their needs and community requirements)

 

The PBCore User Guide provides numerous explanations and discussions about the PBCore elements and how to best use them. Any individual element can be reviewed by selecting it from the element listing provided on the left-hand side of most pages within the User Guide.



What is a DTD and an XML Schema?

A metadata schema, as well as the actual descriptions of media items that may use the schema, need to be presented in some logical, clearly expressed manner so that the information can be understood. More importantly, using well-formed methods to express metadata schemas and descriptions allows different parties to share data; they are communicating using the same language and the same grammar.

A language that is often used to express well-formed data is XML, Extensible Markup Language. Unfortunately, unless the party offering and the party accepting the well-formed data are using a common grammar, information is likely to be mangled as it is interpreted and validated.

This situation is where an XML Schema (also called an XSD--XML Schema Definition) is used to define the grammar and validate the data being shared. Some have stated that an XML Schema functions as a blueprint for describing the structure of the XML language in a document. These blueprints supply the...

  • Sequence in which elements appear in an XML document
  • Interrelationships between different elements (parent-child associations or nested relationships)
  • Types of data that are used to express elements and attributes (text string, number, date, timestamp, etc.)
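
To make the blueprint idea concrete, here is a toy check written with the Python standard library. It is not XSD validation (the standard library does not validate against an XSD) and the element names are hypothetical; it only mimics two of the things a schema pins down, namely the sequence of child elements and the data type of each value.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Toy "blueprint": child elements in required order, each with a value parser.
# Element names are hypothetical; this is not XSD validation.
BLUEPRINT = [
    ("title", str),
    ("dateCreated", date.fromisoformat),
    ("durationSeconds", int),
]


def check(xml_text: str) -> list[str]:
    """Return a list of problems; an empty list means the document fits."""
    root = ET.fromstring(xml_text)
    children = list(root)
    expected_names = [name for name, _ in BLUEPRINT]
    actual_names = [child.tag for child in children]
    if actual_names != expected_names:
        return [f"expected sequence {expected_names}, found {actual_names}"]
    problems = []
    for child, (name, parse) in zip(children, BLUEPRINT):
        try:
            parse(child.text or "")
        except ValueError:
            problems.append(f"{name}: bad value {child.text!r}")
    return problems


good = (
    "<item><title>Rivers</title><dateCreated>2005-02-24</dateCreated>"
    "<durationSeconds>1800</durationSeconds></item>"
)
print(check(good))
```

A real XSD expresses the same constraints declaratively, so that any conforming validator, in any language, reaches the same verdict about a document.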

DTDs, or Document Type Definitions, are an alternative method for describing the blueprint. DTDs have been around longer than XML Schemas and are very widely used. However, they have some limitations: a DTD is written in non-XML syntax, supports only limited data types, cannot identify namespaces (see What is a Namespace?), and offers no support for extensibility or inheritance. XML Schemas do not have these limitations, and they also allow users to craft their own data types.

Typically, more complex data structures, with multiple data types, require the use of an XML Schema over a DTD.

PBCore is offering its blueprint via an XML Schema Definition. Use this link to access the XSD.

Get the PBCore XML Schema

Listed here are links to Primers, XML Schema Definitions and Specifications as provided by W3C

 



What is a Namespace?

There are many metadata schemes available for use by various industries and communities, each with its own set of elements and definitions.

Metadata Schemas

The creation of a "Namespace" that is referenced by schema makers and schema users is done in order to distinguish one set of element names from another set used by a different schema. For example, the element "description" may have divergent meanings from one set of metadata to another. Two or more developers may be using an identical element name.

By declaring a formal Namespace in which a specific metadata schema declares the existence and meaning of its metadata elements and names, we avoid name collisions and confusion.
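
A short sketch shows how namespaces keep two identically named elements apart. The Dublin Core namespace URI below is the published one; the second namespace and the sample values are hypothetical.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set namespace
LOCAL = "http://example.org/local/"      # hypothetical second schema

doc = ET.fromstring(
    f'<record xmlns:dc="{DC}" xmlns:loc="{LOCAL}">'
    "<dc:description>A documentary about rivers.</dc:description>"
    "<loc:description>Box 12, shelf 3</loc:description>"
    "</record>"
)

# ElementTree expands each prefix to {namespace}localname, so the two
# "description" elements remain distinct and can be addressed separately.
tags = [child.tag for child in doc]
dc_text = doc.findtext(f"{{{DC}}}description")
loc_text = doc.findtext(f"{{{LOCAL}}}description")

print(tags)
print(dc_text, "|", loc_text)
```

Both elements share the local name "description", yet a consumer that knows the namespaces can tell a content summary from a shelf location without guessing.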

Collision of Names

A Namespace declares a "bread crumb trail" between real world applications of a schema's metadata and its humble origins...or at least it points to the party responsible for its creation in the first place.

The documentation for each PBCore metadata element includes an identifier for its Namespace. Some are based on Dublin Core (http://dublincore.org), others are of our own crafting.

Access the PBCore Namespace

 



Do the PBCore Elements have a Hierarchical Tree Structure?

For a fuller discussion about hierarchical arrangements of metadata elements, go to our web page discussing Hierarchies and Element Interdependencies.




What is PBCore Compliant?

Of course, PBCore cannot satisfy all functions and requirements that the breadth and depth of our public broadcasting communities demand in their information systems. For many, PBCore is a starting point from which to build a metadata dictionary for their internal use. Local customizations, such as additional metadata elements or additions to picklists of terms, may be implemented. For others, interoperability and data sharing between different information systems (metadata islands) is of the greatest importance.

When is the use of metadata considered to be PBCore compliant? There are two possible perspectives.

COMPLY WITH THE PBCORE DICTIONARY...
One form of compliancy is to implement metadata by adhering to the PBCore Dictionary of metadata elements as documented in our User Guide. That means a metadata implementation must match the PBCore Dictionary with...

    • common metadata element definitions and meanings (semantics),
    • common grammar and rules for expressing data (syntax),
    • commonly defined metadata dictionary element properties (attributes)

If you refer to the documentation for a typical PBCore metadata element, you will see that for compliancy, an element must match the following attributes...

  • Name of the element (although what you call an element can be customized)
  • Definition
  • Refinements & Encoding Schemes
  • Guidelines for Usage
  • Obligation to Use
  • Repeatability
  • Type of Data Entry
  • Language of the Element

These attributes are more completely explained on the web page PBCore Element Attributes.

If a local implementation of PBCore varies from the PBCore Dictionary, then it is vital that the variance be documented.

 

COMPLY WITH THE PBCORE XML XSD FOR DATA SHARING...
Another form of compliancy emerges when interoperability between information systems is required. If your metadata cataloging system is never intended to share data or descriptions with other systems, then any variations and changes you make to the PBCore are confined to your own instance. If you need to export your data to another information system, then the manner in which you cataloged media items and assets must be transformed into a standard framework that other information systems can interpret correctly.

We have created what is called an XML Schema Definition document (XSD) for PBCore v1.1. It is a standard framework upon which data exported from one information system can be transformed into PBCore compliant structures. It is a standard framework with which data can be interpreted in a known fashion by another information system, and imported into its metadata structures. Below is an illustration of the process.

[Figure: PBCore XML compliancy]

A thorough discussion of XSD can be found on our web page for the PBCore XML Schema Definition (XSD).

When PBCore elements are declared "Mandatory" in the listings outlined in our web page PBCore Elements Viewed by Obligation to Use, compliance is significant when metadata is being shared between systems. The PBCore XSD breaks if the obligations specified are ignored. If the PBCore XSD breaks, then data cannot be properly exported or imported between information systems.
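
Before exporting, a system could run a pre-flight check against the documented obligations. The sketch below assumes a hypothetical obligation table and a simple record layout (element name mapped to a list of values); the real obligations are those listed in the PBCore documentation, and the XSD remains the authority for validation.

```python
# Hypothetical obligation table; the real one is in the PBCore documentation.
OBLIGATIONS = {
    "title": {"mandatory": True, "repeatable": True},
    "identifier": {"mandatory": True, "repeatable": True},
    "language": {"mandatory": False, "repeatable": False},
}


def compliance_problems(record: dict[str, list[str]]) -> list[str]:
    """List obligation and repeatability violations, plus undocumented elements."""
    problems = []
    for name, rule in OBLIGATIONS.items():
        values = record.get(name, [])
        if rule["mandatory"] and not values:
            problems.append(f"missing mandatory element: {name}")
        if not rule["repeatable"] and len(values) > 1:
            problems.append(f"element not repeatable: {name}")
    for name in record:
        if name not in OBLIGATIONS:
            problems.append(f"undocumented local element: {name}")
    return problems


print(compliance_problems({"title": ["Rivers"], "language": ["en", "fr"]}))
```

Flagging undocumented local elements reflects the point made earlier: local variations are acceptable, but they need to be documented before data is shared.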

 


Do PBCore Elements Map to Other Metadata Schemas?

Many PBCore metadata elements will crosswalk or map to metadata elements in other established dictionaries and schemas. To review documents that identify these mappings, go to our separate web page discussing and illustrating PBCore Mappings & Crosswalks.




More Resources for the Curious and Compulsive

For additional background on Digital Asset Management, the PBCore, and Metadata, please refer to these web-based materials:

CPB Asset Management
http://www.pbcore.org/cpbasset/

Public Broadcasting Metadata Dictionary Project
Papers & Presentations, Resources and Links
http://www.pbcore.org/resources/

© 2005 Corporation for Public Broadcasting
- CPB Privacy Policy -
- PBCore Licensing via Creative Commons -