Home Page for PBCore Metadata

PBCore Draft
v1.1
03 Feb 2004


This page summarizes the PBCore Elements
that received the lowest scores in the RFC
regarding their "refinements"


PBCore Elements with Low-Scoring Refinements


Organizing Principles from the Metadata Experts
Content Related Elements (lower scores)
Intellectual Property / Rights Related Elements (lower scores)
Instantiation Related Elements (lower scores)

 


Organizing Principles from the
Metadata Experts

1. Keep It Simple
  • Develop a core set of questions for each workflow area, decide what is truly “mandatory” versus “desired,” eliminate terms that don’t apply in the broadcast/media environment. “Remember, this [is] to be [a] real-world tool, not an arcane philosophical model.” Develop a “lay-person’s guide.”

Whatever PBMI comes up with should do the following:

  1. Be as intuitive as possible to the end user. Try to use industry terms as much as possible. If these can be coded through XML mapping in the background to link to other standard terms, fine. But you'll save yourself a lot of grief in the beginning if the user doesn't feel totally lost and can recognize familiar terms.
  2. Don't be wedded absolutely to standards. You are creating this to be a tool for PB stations. If it makes more sense to display notes next to the item they describe, do it. You want the record to be as easy to follow as possible. Again, field names in forms can be linked in the background to other standards terms and rearranged by the computer in the background.
  3. Allow for flexibility and station individuality. Standard lists to pick from are okay but allow for manual entry for exceptions or as needed.
  4. Remember stations have been exchanging programs without the assistance of computers for a long time. Being able to set it up online is a convenience, but it is still a tool, not an end in itself. It is not supposed to be an arcane model of philosophical perfection, this is something that is meant to be used. Don't make it so complicated it makes the IRS tax code look like a picnic.
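
The suggestion above, that familiar field names can be linked in the background to standard terms, can be sketched as a simple lookup. This is only an illustration under stated assumptions: the form labels on the left and the `to_standard_record` helper are hypothetical, and only the element names on the right come from this page.

```python
# Hypothetical station-friendly form labels mapped to element names
# mentioned on this page. The labels are invented for illustration.
LABEL_TO_ELEMENT = {
    "Program Summary": "Description.Abstract",
    "Transcript or Captions": "Description.ProgramRelatedText",
    "Where It Takes Place": "Coverage.Spatial",
    "Time Period Covered": "Coverage.Temporal",
}


def to_standard_record(form_entry):
    """Rename known form labels to element names; keep unknown labels as-is."""
    return {
        LABEL_TO_ELEMENT.get(label, label): value
        for label, value in form_entry.items()
    }


record = to_standard_record({
    "Program Summary": "A history of the Erie Canal.",
    "Station Notes": "Tape 2 has dropouts.",  # no mapping, so it passes through
})
```

The user only ever sees the familiar labels; the renaming happens when the record is saved or exchanged, and manual entries without a mapping are preserved rather than rejected.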

 

2. Don’t Do It Alone.

  • Continue to test your definitions with vendors and other broadcast organizations. SMPTE (MXF, RP210), MPEG (MPEG7), and the Library of Congress (METS, MODS) can all offer some guidance. The U.S. Department of Education’s “Gateway to Educational Materials”™ (GEM) metadata initiative can provide a useful “extension” for educational data elements.

3. Rights Management will require its own full schema.

  • PBCore can keep its classifications simple, but link to a more complex set of rules (such as MPEG21) being developed by media owners and distributors.

4. PBCore need not follow Dublin Core’s “one record per item” rule.

  • While two experts said, “stay with DC’s approach,” as discussed earlier, six said that in the world of computer searches and multiple formats of media content, adhering to DC was a step backwards, or worse.

 


Content Related Elements
Lower Scores

 

#  Name                                        All Mean  Low Mean  Low-Scoring Subset
1  Description.ProgramRelatedText Refinements  3.6       2.5       CONTENT ROLE
2  Description.Abstract Refinements            3.5       2.7       CONTENT ROLE
3  Source Refinements                          3.4       2.8       CONTENT ROLE
4  Relation.Identifier Refinements             3.5       2.8       CONTENT ROLE/EXPERTS
5  Coverage.Spatial Refinements                3.5       2.9       DISTRIBUTION/OPS ROLES
6  Coverage.Temporal Refinements               3.7       2.9       DAM EXPERTS/EDUCATION

Discussion:

Note: the following discussion is not designed to find the “answer,” but to guide the next PBCore team in refining the next version of the Dictionary.

Of greatest concern is the fact that four “content elements” received a 2.8 or lower mean score from the respondents who identified themselves as primarily working with program content. The two “coverage” elements received a 2.9 rating by at least one cluster of respondents.

Some of the comments associated with these fields (element, refinement and “confusing” entries) follow in an edited form. Where an “expert” responded to a question about this element, it too is included.

1. Description.ProgramRelatedText Refinements

  • I don't see the usefulness of this as a field element. Better would be to assume related program text as content not metadata. Individual systems might handle textual content as metadata but it serves no purpose in a metadata dictionary.
  • Metadata seems like the wrong place for Program Related Text (PRT). PRT is itself an object that should be linked to (or part of) the main object it’s related to and described with its own metadata.
  • Clearly identifying related text (and its language and usage) will be of immense help in assisting end-users in locating materials they want.
  • Good thing this is repeatable because with the automated text/speech extraction tools coming into use, this is going to be a popular field.

2. Description.Abstract Refinements

  • This is a key thing, and should be highlighted: "why an asset or media file is important at all or within certain contexts."
  • It's important but needs better definition. Vague (to me) line between this and description.

Experts’ Comment on “Description” Elements

  • I like the idea of minimizing the number of fields, but it seems to me that DESCRIPTION.TABLEOFCONTENTS, and DESCRIPTION.PROGRAMRELATEDTEXT are good stand-alone elements. DESCRIPTION.ABSTRACT might be rarely used but I see why it should be separated out. I would be inclined to leave it the way the scheme is but identify DESCRIPTION as part of the minimal elements within the core of the Core and make DESCRIPTION.ABSTRACT, DESCRIPTION.TABLEOFCONTENTS, and DESCRIPTION.PROGRAMRELATEDTEXT optional. In a situation like mine, we are rarely going to have the information for DESCRIPTION.ABSTRACT and DESCRIPTION.TABLEOFCONTENTS. But we will want DESCRIPTION and DESCRIPTION.PROGRAMRELATEDTEXT to be different fields.
  • Description.Abstract seems like a useful field for most types of programs. That said, I think a picklist gives you more flexibility for choosing types of description, rather than having to hard code specific types in a field name. Also it allows the individual station to use only the terms it wants to use. As long as these are common industry terms, I don't see a problem with a picklist.

 

3. Source Refinements

  • Though we use this element internally for capturing legacy metadata, I am not convinced of its usefulness for metadata change. Too similar to Relation.
  • I think that this part needs to be fleshed out more. There needs to be a clear definition of the entities (programs, producers, etc.) that can be a source.
  • Source should have derivation categories (e.g., books, film, program).
  • I think an example would help; I assume it means something like the underlying literary work of a broadcast play or musical, but it was not clear in the definition.

4. Relation.Identifier Refinements

  • What's confusing is the bond between this and Relation Type.
  • It’s probably good to specify Identifier but confusing when one looks at the Relation Type values: not all suggest an Identifier value.
  • If people can wrap their heads around the concepts involved in Source and Relation.Type, then this field is a piece of cake and highly relevant if people use it.
  • I think more emphasis needs to be put on uniquely identifying a related resource. Saying 'x is version of y' and only providing a shelf location for y could be a problem when someone decides to reshelve....

 

5. Coverage.Spatial Refinements

  • It could be a bit clearer that this concerns "spatial" elements within the program; at first, it almost seemed like an archival determination of the physical location of the program.
  • The examples add to my confusion in that they list both descriptive and geospatial metadata, which seems like apples and oranges, subject vs. format data.
  • Teachers really like to localize resources. I would suggest, at a minimum, using the ISO 3166-1 and -2 country and state codes to provide some searchable uniformity here.
  • Really need some sort of thesaurus or at least rules for entering information. Need to develop an authority file outlining form and definition.
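
As a rough sketch of the ISO 3166 suggestion above, a station system could accept a country or subdivision code and fall back to free text when the code is unknown. The tiny lookup tables below are an illustrative sample only (a real system would load the full ISO 3166-1 and ISO 3166-2 lists), the display names are informal, and `describe_place` is a hypothetical helper.

```python
# Small illustrative sample of ISO 3166-1 alpha-2 country codes.
COUNTRIES = {"US": "United States", "CA": "Canada", "MX": "Mexico"}

# Small illustrative sample of ISO 3166-2 subdivision codes.
SUBDIVISIONS = {"US-NY": "New York", "US-CA": "California", "CA-ON": "Ontario"}


def describe_place(code):
    """Return a display name for a known code, or None if it is not recognized."""
    code = code.strip().upper()
    if code in SUBDIVISIONS:
        country = COUNTRIES[code.split("-")[0]]
        return f"{SUBDIVISIONS[code]}, {country}"
    if code in COUNTRIES:
        return COUNTRIES[code]
    return None  # caller can keep the original free-text value instead
```

Returning `None` rather than raising an error leaves room for the manual-entry exceptions that respondents asked for elsewhere in this survey.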

 

6. Coverage.Temporal Refinements

  • I think you will be sorry not to provide uniformity here. MARC guidelines can provide uniformity and are readily available at loc.gov if you don't like the ISO standard.
  • Dates are messy. Allowing free text dates keeps them that way. Note that without applying not-yet-invented artificial intelligence, searches for 1863 won't find the asset labeled '1861-1865'. No good solutions here, unfortunately.
  • Time periods should be standardized and not made a free-form text entry.
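
The "1863 versus 1861-1865" problem raised above goes away once coverage is stored as a year range rather than arbitrary text. The following is a minimal sketch, assuming coverage is entered either as a single four-digit year or as `YYYY-YYYY`; the function names are hypothetical and no particular date standard is implied.

```python
import re


def parse_years(text):
    """Return (start, end) for 'YYYY' or 'YYYY-YYYY'; None for anything else."""
    match = re.fullmatch(r"\s*(\d{4})\s*(?:-\s*(\d{4}))?\s*", text)
    if match is None:
        return None
    start = int(match.group(1))
    end = int(match.group(2) or match.group(1))
    return (start, end) if start <= end else None


def covers(coverage, year):
    """True if the coverage value is a parseable range that includes the year."""
    span = parse_years(coverage)
    return span is not None and span[0] <= year <= span[1]
```

With this in place, `covers("1861-1865", 1863)` matches, while a free-text value such as "the Civil War era" cannot be matched by year at all, which is the respondents' point.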

 


Intellectual Property & Rights Related Elements
Lower Scores

 

#  Name                             All Mean  Low Mean  Low-Scoring Subset
1  Rights.Usage Refinements         3.8       2.8       EXPERTS
2  Rights.Reproduction Refinements  3.8       3.0       EDUCATION
3  Contributor.Role Refinements     3.8       3.1       EDUCATION
4  Rights.Access Refinements        4.0       3.4       EDUCATION

Discussion:

Note: the following discussion is not designed to find the “answer,” but to guide the next PBCore team in refining the next version of the Dictionary.

Some of the comments associated with these fields (element, refinement and “confusing” entries) follow in an edited form. Where an “expert” responded to a question about this element, it too is included.

1. Rights.Usage Refinements

  • Free text: you are doing the best you can. But a suggested (not enforced) vocabulary might be warranted.
  • I strongly recommend creating a standardized value list to enable interoperability across stations and standard information for end users via public portals. A more formal set of rules than free-form text would be useful.

 

2. Rights.Reproduction Refinements

  • How does this element differ significantly from the Rights.Usage element? There needs to be greater emphasis on the distinction between use (as in what can you do with this item) and reproduction (making copies). Can stations choose to put.
  • I think the whole Rights Elements domain needs to be re-considered and made clear. Each asset has a group of usage rights and each usage has terms and restrictions.

 

3. Rights.Access Refinements

  • Access is not an on/off switch. Access should be associated with Groups. Again, we may want to combine the simple drop-down list with a free text notes field.
  • Could there be a "conditional access" if triggered or would this be set-up at another level?
  • I think the key with this one is that this is the field used for mining. Clarifying or highlighting the purpose of this element might be helpful, because otherwise people are going to tend to want to lump all 3 rights elements into one field.
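
The comments above describe access as something tied to groups, chosen from a short list, and accompanied by free-text notes. One way to picture that combination is sketched below; the three access levels, the group names, and the `AccessRule` class are all hypothetical and are not part of the PBCore draft.

```python
from dataclasses import dataclass

# Hypothetical picklist; the draft does not define these values.
ACCESS_LEVELS = ("open", "restricted", "closed")


@dataclass
class AccessRule:
    level: str                      # one value from the picklist
    groups: frozenset = frozenset() # groups allowed when level is "restricted"
    notes: str = ""                 # free-text explanation for unusual cases

    def __post_init__(self):
        if self.level not in ACCESS_LEVELS:
            raise ValueError(f"unknown access level: {self.level!r}")

    def allows(self, user_groups):
        """Open allows everyone, closed allows no one, restricted checks groups."""
        if self.level == "open":
            return True
        if self.level == "closed":
            return False
        return bool(self.groups & set(user_groups))


rule = AccessRule(
    level="restricted",
    groups=frozenset({"educators"}),
    notes="Classroom use only until rights are renewed.",
)
```

The picklist keeps the value searchable, the group set captures that access is not an on/off switch, and the notes field holds whatever does not fit.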

Experts’ Comments on Rights Elements:

  • Formal data models for expressing rights information are *Very* difficult to create, and the environment in which public broadcasting operates strikes me as more prone than many to creating unusual rights situations. I would leave these elements free-text at the moment. You may wish to consider whether the public broadcast community needs a separate rights expression language, or whether one of the existing rights languages, such as ODRL, could be adapted to more specifically delineate rights & permissions covering various assets.
  • Given that rights are such a critical issue for PBS resources, I'd suggest developing a separate rights schema, utilizing MPEG21 (XrML) or ODRL, and referencing the rights metadata from the PBCore record.
  • I would prefer that all values be combined into a rights statement placed into a single metadata element with a standardized way of entering the data at least or a controlled vocabulary at best. However, this is going to be hard to implement too. Are you going to have some place for people to put rights statements that are unusual?

 

4. Contributor.Role Refinements

  • Even more than Creator roles, Contributor roles need to be accurate because they may include anyone from a cameraperson to an intern.
  • The enumerated list is pretty much focused on "creative" aspects and (not so much) on copyright-ownership aspects.
  • Definitions seem crucial here. Also some policy might be established on what text string is used for these roles: Official job titles? On-screen credit? We also must recognize that the list is too long for a drop-down.

Experts’ Comments on Contributor/Creator Elements

  • No one outside of people who have been trained by PBCore is going to be able to figure this one out, and even then it's murky. Can there be more than one creator? What if they're not all at the same creator level? What if someone is somewhat linked to the creating process but not a full-blown creator? Where is the cutoff? Creator.role helps, but deciding who's a creator to begin with is the biggest problem.
  • I think that this highlights the fact that having separate creator and contributor elements is not particularly valuable. In my opinion, Dublin Core made a mistake in asserting that distinction.

 


Instantiation Related Elements
Lower Scores

 

#  Name                         All Mean  Low Mean  Low-Scoring Subset
1  Annotation Refinements       3.5       2.5       EXPERTS
2  Format.Encoding Refinements  3.7       2.9       EXPERTS
3  Identifier Refinements       3.9       2.7       EXPERTS
4  Location Refinements         3.9       2.8       EXPERTS

 

Discussion:

Note: the following discussion is not designed to find the “answer,” but to guide the next PBCore team in refining the next version of the Dictionary.

Some of the comments associated with these fields (element, refinement and “confusing” entries) follow in an edited form. Where an “expert” responded to a question about this element, it too is included.

1. Annotation Refinements

  • Notes will ultimately make or break a metadata exchange initiative. I recommend the addition of an AnnotationType Element with a list that includes other top-level Dictionary Elements (Publisher Notes, Creator Notes, etc).
  • Risky to give people an unstructured notes space - they could get lazy and just use this instead of properly using the other elements. Also would be difficult to search/index. All necessary metadata should be capturable in structured elements.
  • Although this is an excellent tool, I know from experience it can be overused; perhaps add wording to indicate that it should not be used for all fields.

Experts’ Comments on Annotation

  • At some level, I think an annotation/note facility can become overkill, and part of the point of a metadata standard is to force people to express information within a particular structure, instead of allowing free text everywhere. I think a single annotation element provides the flexibility to give additional information not covered within the main metadata element set; separate annotation elements for every other metadata element would be unwieldy and probably pointless. I doubt many people are going to have the time to put in that much annotation information.
  • I think the possibilities of adding ANNOTATION as a qualifier of elements is a useful idea, but in reality, I don't know that they would actually be used as intended. I would think that in the rush to get the "paperwork" done, these elements would remain empty and useless while if something really remarkable stood out that had to be presented, people would want to look for a general notes field. I think that PBC should not support individual Annotation Elements for each major element, but leave the option open to see if any agencies actually do want to make use of fields like these.

 

2. Format.Encoding Refinements

  • I wonder if a Compression Standard Element with a Compression Rate would be more understandable. Why repeatable?
  • The definition is confusing regarding what information you expect to see in this element. It's only when you view the examples that you know what kind of information you are supposed to place in this field, and even then the definition itself is not understandable.
  • NEED CONTROLLED VALUE LIST. Encoding needs to be pre-defined from an authority based on format.type.

Experts’ Comments on Format.Encoding

  • I think that format.standard and format.encoding are, fundamentally, trying to express the same information: formally identifying the technical standard/specification that defines the data format used for the asset. I think they can be collapsed into a single element which *should* be more carefully defined, and perhaps employ a controlled vocabulary. Format.type, on the other hand, seems to be expressing something a bit different from the other two elements, a more 'high-level' description of the nature of the work's format.
  • I'd look at what is required for interoperability with schemas like MPEG7 and SMPTE and also ask what purpose these data elements serve, and who benefits from their existence. Are they important for migrating to newer technologies to support digital permanence? Are they important to end users who play back the files? Is this important for a station considering the purchase of the resource, or preparing to download the resource? The data elements should serve a purpose, perhaps tied to the 3 FRBR core user information needs--find, identify, select or obtain, or they should serve the purpose of maintaining the intellectual content in perpetuity.

 

3. Identifier Refinements

  • Of course, this Element as defined is imperative, but the examples seem all over the map. I emphatically do not think shelf location should be used as an identifier. I recommend distinct elements for Identifier.Barcode and Location.PhysicalLocation.
  • The definition is confusing; not until you see the examples do you understand that it's along the lines of "tape location."
  • I recommend identifying the scheme used (UMID, NOLA, etc.)
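
The comments above make two separable points: an identifier should carry the name of the scheme that issued it, and a shelf location should not double as an identifier. The sketch below illustrates both under stated assumptions; the class names, the `find_copy` helper, and the `EXAMPLE-0001` barcode are hypothetical and do not reflect any real numbering scheme.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Identifier:
    scheme: str  # which scheme issued the value, e.g. "UMID", "NOLA", "barcode"
    value: str


@dataclass
class AssetCopy:
    identifiers: list    # list of Identifier objects
    shelf_location: str  # kept separate so it can change freely


def find_copy(copies, scheme, value):
    """Return the first copy carrying the given scheme/value pair, else None."""
    wanted = Identifier(scheme, value)
    for item in copies:
        if wanted in item.identifiers:
            return item
    return None


tape = AssetCopy([Identifier("barcode", "EXAMPLE-0001")], "Vault A, shelf 12")
tape.shelf_location = "Vault B, shelf 3"  # reshelving does not affect lookup
found = find_copy([tape], "barcode", "EXAMPLE-0001")
```

Because the lookup compares the scheme as well as the value, the same string issued under a different scheme does not produce a false match.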

 

4. Location Refinements

  • Clarify re identifier elements; this is what a non-expert would look for first. In a particular case, if I can use either Location or Identifier should I use one, the other or both?
  • It is too similar to other Elements like Identifier and Format.Identifier to nail down its purpose.
  • Note that this is the same as the MODS metadata schema's location element (comes from MARC). Also used in DC-Library application profile.

Experts’ Comments on Location

  • How about multiple locations? How about electronic storage locations and physical storage locations? Should this element be broken down to those levels?
  • In the case of multiple manifestations for an item, the assets in different locations will very likely have different format characteristics as well (bit depth, data rate, frame size, frame rate, encoding, etc.). You'll need some way to associate all of the other formatting metadata elements with a particular location(s).
  • This is a good idea: using this element to note the several copies of something. However, how are you going to distinguish what version is kept in which place? Also, how do you identify which is your primary item and what FORMATS your other versions are in?
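
The experts' questions above amount to asking that each copy of an asset carry its own location and format details together, with one copy marked as primary. A minimal sketch of that grouping follows; the `Instantiation` class, its field names, and the sample values are assumptions made for illustration, not definitions from the draft.

```python
from dataclasses import dataclass


@dataclass
class Instantiation:
    location: str     # where this particular copy is kept
    format_type: str  # high-level nature of the copy
    encoding: str     # technical encoding of this copy
    is_primary: bool = False


def primary_copy(instantiations):
    """Return the single copy flagged as primary; fail if there is not exactly one."""
    primaries = [inst for inst in instantiations if inst.is_primary]
    if len(primaries) != 1:
        raise ValueError("expected exactly one primary instantiation")
    return primaries[0]


def formats_by_location(instantiations):
    """Map each location to the (format_type, encoding) of the copy kept there."""
    return {inst.location: (inst.format_type, inst.encoding) for inst in instantiations}


copies = [
    Instantiation("archive server", "digital video file", "MPEG-2", is_primary=True),
    Instantiation("web server", "digital video file", "MPEG-4"),
]
```

Keeping format fields on the copy rather than on the asset answers "what version is kept in which place" directly, since each location is paired with its own characteristics.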

 

 
 



© 2003 Corporation for Public Broadcasting