Survey: PBCore Supplementary Survey & Unresolved Questions

Author: CPB
Filter:
Responses Received: 9


Email NameFirst NameLast Organization RFC_WG RFC_PMX RFC_PMNX
mail@makxdekkers.com Makx Dekkers Dublin Core No Yes No
cfle@loc.gov Carl Fleischhauer Library of Congress No Yes No
aiz@unl.edu Art Zygielbaum Nebraska Educational Television Yes Yes No
jzm10@psulias.psu.edu Jin Ma Penn State University Yes Yes No
jerome.mcdonough@nyu.edu Jerome McDonough NYU Libraries No Yes No
shawn.rounds@mnhs.org Shawn Rounds Minnesota Historical Society, Library and Archives No Yes No
smohn@mpr.org Sylvia Mohn Minnesota Public Radio Yes Yes No
lisac@uky.edu Lisa Carter University of Kentucky/KET No Yes No
gagnew@rci.rutgers.edu Grace Agnew Rutgers University Yes Yes No


X_1.1 PBCore App Profile Enough Information

Mean = 4.00, Standard Deviation = 0.76

Response Count Percent
(1) 1 0 0.0%
(2) 2 0 0.0%
(3) 3 2 25.0%
(4) 4 4 50.0%
(5) 5 2 25.0%
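
The summary statistics above appear consistent with a mean and a sample (n - 1) standard deviation taken over the eight answers tallied in the table. A minimal sketch of that arithmetic follows; the function names are illustrative and not part of the survey tool.

```python
from statistics import mean, stdev

def expand(counts):
    """Turn a {rating: count} tally into a flat list of individual answers."""
    return [rating for rating, n in counts.items() for _ in range(n)]

def summarize(counts):
    """Return (mean, sample standard deviation), each rounded to two places."""
    answers = expand(counts)
    return round(mean(answers), 2), round(stdev(answers), 2)

# Tally copied from the X_1.1 table: rating -> number of responses.
x_1_1 = {1: 0, 2: 0, 3: 2, 4: 4, 5: 2}

if __name__ == "__main__":
    print(summarize(x_1_1))
```

Note that only eight of the nine respondents answered this item, so the percentages are computed over eight answers.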

X_1.2 Suggest Additions To PBCore


My comments in the main questionnaire list a number of specifics, and make the general comment that for technical metadata, there are elements in the SMPTE RP-210 set that may be worth considering. (I have not the time to do that crosswalk today.) In addition, I wonder about the lack of structural metadata, in the sense that digital librarians use the term. It is unlikely to be needed for simple programs but when PBS gets more into packaging and disseminating interactive multi-part content, it may be needed.
Given the attention to frame rate, data rates, bit depth, aspect ratio, etc., I was somewhat surprised to see that there wasn't anything regarding image sampling (4:2:2, 4:2:0, etc.) separated out. Also, nothing about the technical side of rights (is this DRM-protected? with what mechanism?). And in my area, getting details on the *exact* compression mechanism used is fairly important. It isn't enough to say a file is MJPEG2000; I need to know whether it's uncompressed MPEG2K or using wavelet compression.
Perhaps provide some examples of fully cataloged resources so that people can see how all of the elements work together for specific instances.
While I found the element information and guidelines comprehensible, other people with less cataloging background and experience might find this too intimidating and too much. So just be aware it might be useful to rewrite the guidelines using less jargon and more ordinary words. Or at least have an alternate available. This is kind of OT, but on my pc, for the first page, I couldn't get the attribute definitions to fully display. They were cut off on the right margin. Even after I adjusted the scrollbars as much as I could, and tried viewing it fullscreen and not fullscreen. My screen resolution is 800 x 600. I could view the rest of the material okay since it wasn't as close to the righthand margin.
The main things that I think are needed are: Come up with a smaller core set of elements and guidelines on how to use them if you choose not to use the others. For example, if Title appears alone in the core set, you need to be specific about how people should generate the title and how it should look. I don't think that in a normal television setting, you are going to get librarians or producers to wade through all those elements. If you don't get it down to 15 or 20, people will just not fill out the ones they never get to. They might spend so much time being frustrated about Title and its qualifiers and Creator and related elements that they never get to Location. Also, you need to work with some vendor or something that can come up with an application so that people can just pick up this application and use it and voilà, they have metadata that can be mapped to PBC. If they have to seek out vendors and work with those vendors to fix the software to map to PBC, they are never going to do it. Finally, I don't see how a full and effective implementation of PBC will take place without incentive or at least funding opportunities for stations to do it. Stations have been ignoring the cataloging of their assets for 30 years, not because they don't think the assets are valuable and they want to find them again, but because money and time go to keeping the station on the air and producing programming that means something to the audience.
My main concern with PBCore is that it is not really "core". The mandatory elements seem pretty exhaustive. The work is quite laudable, and I think you are almost there, but I suggest several steps: 1. Select a range of items that represent the different types of materials to be described and managed by PBCore. If (as it appears) both digital and analog materials are included, your sample set should include both. This will give you a real world test of PBCore. 2. Target the metadata creators (PBS stations) with this evaluation. I think your explanation of data elements and application profile are excellent and very clear for someone who is familiar with the principles and concepts of metadata. Unfortunately, I think the presentation is way too complex for people who are new to metadata. You need another presentation for those folks. Terms like application profile, registry, qualifiers, etc. will not be meaningful to the users you are targeting.


X_1.3.1 PBCore Reliance On DC Appropriate

Mean = 1.11, Standard Deviation = 0.33

Response Count Percent
(1) Yes 8 88.9%
(2) No 1 11.1%

X_1.3.2 Explain DC Reliance Concerns


Dublin Core, even qualified DC, always seems a little "soft." One need not go all the way to MARC for better identification of fields and association of elements, and many look to MODS as a middle ground. The problem is that simple is simple, and complex is labor intensive. Sigh.
It is a good place to start since Dublin Core is a widely used standard. On the other hand, it is limited. PBCore does a great job addressing unique needs for the specific community. Please note: the comments provided in this survey are based on general metadata expertise.
None.
It seems like there may be too much rigidity in everyone adhering to all aspects of the code. There need to be allowances made for local variations.
I like how you have taken DC and ViDE and worked with them to come up with something appropriate to PB. I think this shows consideration of the necessity of broader interoperability.
Yes and no. The Dublin Core element set ensures that you can interoperate with a range of organizations and initiatives that employ metadata. The fundamental issue is DC's lack of support for multiple media manifestations--from source objects through different digital transcodings. I have seen lots of attempts (including my own) to accommodate multiple media manifestations within Dublin Core, and they simply don't work very well. I much prefer the METS concept of separating technical (preservation) metadata from descriptive or the MPEG7 approach, which supports multiple media profiles. Making DC work for multiple manifestations is very difficult.


X_1.4.1 PBCore Favor Type/Size Asset

Mean = 1.63, Standard Deviation = 0.52

Response Count Percent
(1) Yes 3 37.5%
(2) No 5 62.5%

X_1.4.2 Explain Favor Type/Size Concerns


I'm not sure it's a concern, but the PBCore clearly is oriented towards digital data in computer files. From my perspective, that isn't a problem, although in areas like format.physical/format.digital, I think some inappropriate choices were made (to wit, digital formats should include things other than MIME types, and physical formats should include computer file on disk/HSM/etc.).
None.
Produced programs with a self-assigned title and credits. Also non-print/electronic formats. It doesn't look very adaptable to free-form content, like photos, audio diaries, blogs, home movies.
I think that to your TV people it will seem like there's radio stuff in the way, while to the radio people, there will seem to be TV stuff in the way. I think you've done a good job so far in trying to take all media in the PB arena into account.


X_1.5.1 Does 1:1 Relationship Work

Mean = 1.63, Standard Deviation = 0.52

Response Count Percent
(1) Yes 3 37.5%
(2) No 5 62.5%

X_1.5.2 Explain 1:1 Relationship Concerns


My group is in the multi-part object business (we are a library not a TV station). This question brings me back to my repeated concerns in the main questionnaire about "structural metadata," especially needed if the metadata is packaged with the content bitstreams. You always end up wanting some metadata for the whole object, and then distinguishing data for the parts, both "bibliographic" and technical.
In library cataloging practice, we need to make similar decisions about whether one record or multiple records should be created for each instantiation. Similarly, OCLC cataloging supports a 1:1 relationship. Pros: 1) It fully describes each instantiation of the resource. 2) Separate records may better facilitate retrieval. 3) There is no ambiguity about how to catalog something: if I get an electronic version of a book, I just catalog it as an electronic resource and do not have to ask if there is a book available, etc. Cons: 1) It's less efficient in terms of cataloging workflow.
It is typical practice in digital libraries to have multiple instantiations of a single logical 'item'; for a video work, we may hold a digital betacam master copy, a DVD use copy, and multiple streaming forms (high/low bandwidth versions, etc.). Having to replicate descriptive metadata is somewhat wasteful of space (a minor issue given the relative storage requirements of metadata vs. video), but more importantly, represents a serious data management problem, as any changes to the metadata would need to be written against numerous copies. It might be possible to work around this problem within our context using METS as a wrapper around PBCore metadata records, but on its face, succeeding in this would probably require ignoring some of PBCore's "Obligation to use" requirements for certain elements.
The 1:1 relationship could work, but would involve significantly more time and effort in examples such as you cite. The 1:many seems more flexible and desirable.
Absolutely not. This is a terrible idea. It's a total waste for each format version to be its own record. It's impractical in that it will artificially inflate search results and give you a lot more to look through and winnow out. It's much more economical to have one listing of an item, with all the variations (format and content) listed with it. For example, if a news story runs on shows A, B, and C, there should be one entry for the day for the story, and then a subdescription for Show A version, then subdescription for Show B, etc. Same for format. People are going to want to know what formats the item is available in all in one location. A record should give as much information as possible about an item in one spot, rather than having to hunt and peck through the database hoping you find everything. Even with all the relationship notes you want, this is unnecessarily cumbersome. It's important to create a system that's practical, not just fits some ideological plan in someone's head. If the data is set up in a clumsy way people aren't going to want to access the tool much.
This is going to be a problem for us. Our digitization project is going to double as a description project for the analog tapes. And yet, the way the Core is set up, the analog tapes will not really be accounted for except for that Relation Element. We will probably add something to the system to account for this, so that there is only one record for each "program" rather than each media item. Otherwise the effort will become unwieldy.
1:1 is falling into disfavor generally for digital assets. It simply doesn't work, particularly as technology changes to support dynamic transcoding according to user preferences. Even MARC, the "granddaddy" of metadata, is moving toward the integration of multiple physical manifestations with the intellectual content or "work."
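
Several responses above contrast one record per instantiation (1:1) with one descriptive record that carries many instantiations (1:many). The sketch below illustrates the data management point raised in those responses; it is not a PBCore structure, and the class and field names are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Instantiation:
    """One physical or digital form of a work (hypothetical fields)."""
    format: str
    location: str

@dataclass
class Work:
    """Descriptive metadata held once and shared by every instantiation."""
    title: str
    description: str
    instantiations: list[Instantiation] = field(default_factory=list)

def flatten_one_to_one(work):
    """Emit one flat record per instantiation, repeating the description.

    This is the 1:1 view: the same descriptive values are copied into
    every record, so a correction must reach every copy.
    """
    return [
        {"title": work.title, "description": work.description,
         "format": inst.format, "location": inst.location}
        for inst in work.instantiations
    ]

if __name__ == "__main__":
    work = Work("Example Program", "Draft description", [
        Instantiation("Digital Betacam master", "vault"),
        Instantiation("DVD use copy", "reading room"),
        Instantiation("Streaming, low bandwidth", "media server"),
    ])
    work.description = "Corrected description"  # one edit in the 1:many model
    for record in flatten_one_to_one(work):     # three copies in the 1:1 view
        print(record)
```

In the 1:many model the correction is written once; if the flattened records were the stored form, the same correction would have to be written three times.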


X_2.01 Element Title Expert Response


Dublin Core allows for "local refinements" that can be defined and maintained outside of DCMI.
Aha! Good idea to overcome the softness of DC! But I am not a cataloging specialist and therefore am not providing many responses re: DC. I skipped the "Content Related Elements" part of the main questionnaire.
I like sticking with the Dublin Core and the body of work that surrounds it. I suspect it is more effective to make use of a standard that has a body of supporting description and software than head off on your own. But someone would have to make a business case for either view to decide.
Promoting the use of refinements or qualifiers for the "Title" element is a reasonable extension of Dublin Core. It is more informative to specify a title such as "Title.Packaging" than just to say "Title.Alternative". To implement this, we should ensure that when users search on the first level of the hierarchy ("Title"), they do not have to worry about where to search for a specific title ("Title.Excerpt").
I typically try to take a user-oriented approach to questions like this, and it seems to me that there will probably be quite a number of users in the content creation arena, and a certain number of consumers, who will benefit from the finer granularity of information available using refinements/qualifiers. I would probably vote for going in that direction.
I would use the hierarchical approach you suggest rather than trying to force your users to work with an unfamiliar system. Terms such as Title.Packaging are self-descriptive and therefore add more value for your labor than an undifferentiated list of alternate titles. Also easier than multiple instances of Relation. I think it's more important to consider usability here than strict adherence to DC rules.
This is not a one-size-fits-all situation. In some cases you'd want to keep track of all the titles in a hierarchy because it would help you identify the components of the item at a glance without wading through a lot of relation notes. In other cases simply calling everything a variant title is enough information. Hierarchical titles will be hard to use because they will be meaningless. Depending on the material you work with one approach will be better than another. Can't the list of title options incorporate both?
I would not add any more complexity to Title. Personally, I still have problems with how to distinguish between what information actually goes in Title vs Title.Program vs Title.Episode. The more elements you have the less likely people are to use them correctly or consistently fill them out. I think that Title.Alternative is adequate to handle most of these things and that some attention should be given to recommending guidelines for how to fill out the Title elements you already have most effectively.
While MPEG7 has its own problems, it does use a recursive approach to incorporating "segments" within a title. What it does not do well is support a descriptive, text-based "table of contents." MARC also does not deal with this problem well. Series and collections can receive separate metadata records from the component items. Only the series or collection title provides a link, and this link does not show relationships, which can be important when items should be viewed in sequence. METS provides a structure map that can be used to link items within a resource. If you are going to use a "flat" schema like DC, you should probably separately define a "relationship" record that basically serves as a structure map to document relationships among the relevant resources documented within the structure map. Relationships should include whether the organizing principle for the relationships is a collection, series or program, and then structure the relationships in a sort of "structure map" or "digital manifest" that indicates relationships among the members (e.g., versions, consecutive segments, etc.) Then you can use the relationship field to simply link to the relationship record, rather than burdening the poor related material data element with qualifiers, etc. Let the relationship record document the relationships instead. Just an idea.
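
One response above asks that a search on the first level of the hierarchy ("Title") also reach every refinement, such as "Title.Excerpt". A minimal sketch of that behavior follows, assuming a record is stored as a list of (element, value) pairs; that storage layout and the function names are assumptions made for illustration.

```python
def matches(element, query):
    """True when `element` is `query` itself or one of its dotted refinements."""
    return element == query or element.startswith(query + ".")

def search(record, query, term):
    """Return the values of matching elements whose text contains `term`."""
    term = term.lower()
    return [value for element, value in record
            if matches(element, query) and term in value.lower()]

if __name__ == "__main__":
    # Hypothetical record; the element names follow the dotted style used in the survey.
    record = [
        ("Title", "Example Series"),
        ("Title.Episode", "Example Episode One"),
        ("Title.Alternative", "Working Name"),
        ("Creator", "Example Productions"),
    ]
    print(search(record, "Title", "example"))
    print(search(record, "Title.Episode", "example"))
```

Matching on the dotted prefix lets a user search "Title" without knowing which refinement a cataloger chose, while a narrower query such as "Title.Episode" still works.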


X_2.02 Element Title.Alternative Expert Response


No comment.
Any name for the creative object that uniquely identifies it should be attached to it. Alternative titles may help people who are searching for particular content or approaches.
I would encourage the PBCore developers to remember that there is already a very large community of metadata creators out there (librarians) who already have established definitions of terms like 'alternative title', and it is not the same thing as a 'working title.' The content creators are *not* the only people who'll be using your metadata (at least, I'm assuming you hope to achieve that goal), and maintaining some degree of harmonization with other metadata work would be a good thing. I'd argue for maintaining title.working and title.alternative as two separate options available for recording title information, with title.alternative reserved for the "aka" title.
Perhaps the answer depends on when the metadata is being created. If an item is catalogued at the time of its creation, AltTitle could be used for working title. If not until a later stage, just reserve the metadata for sharing purposes.
The titles assigned to material should be useful to the primary audience, that is, the institutions that store them. PBCore is not a commercial catalog, it is a set of rules for information exchange and as such can be designed in the way that suits its members best. As I see it, each institution is free to store the information in whatever way suits them (full shows, Content Management pieces). The titles should reflect what the item is. If you ordered this item, what would you get? It doesn't matter if it's a full program or smaller segments or objects, just so you know what you're getting. It's up to the requestor to decide how to use it. For example, suppose a program has a longer or shorter version, you'd want to know that via one of the titles (full version, abridged version, pledge version, whatever). If a member wants to publish a catalog of programs for a broader audience it can strip out/not display any irrelevant title types at that time.
I think effective definition of the possibilities of how Title.Alternative should be used is the best place to place your energies. I think that some guidance towards agencies developing their own metadata field might be helpful, but don't place them in the Core. I think that flexibility should be offered with effective guidance but that additional Title elements should not be added to the Core.
You may be mixing apples and oranges here. Alternative is really intended for a title by which the resource may be known. It may be common that working titles are different from the final title. However, in a workflow sense, prepublication, or working, is a lifecycle status for the resource. I would suggest that you consider a status field to document the lifecycle stage of the resource, if this is important to capture in descriptive metadata. At the point that the resource has the status of "prepublication" the title is the title. When the resource is published, if the title changes, I would tend to qualify that title as Title.Working, which is more meaningful than Title.Alternative. The status of the resource would change to "published" and the final title would become the main title. When you are sharing metadata with other initiatives, Title.Working, etc. become Title.Alternative.


X_2.03 Element Title.Series Expert Response


No comment.
See Title Comments
See Title Comments
I'd use the refinements/qualifiers. See previous comments on this issue.
See title comments
See Title comments
SEE TITLE COMMENTS. But I like Title.Series, I think it will be a very useful field if it's defined so that we are all using it the same way.
See my comment about a relationship or context metadata record that provides a separate structure map for meaningful relationships. I am giving serious thought to this approach, because you can support multiple contexts, external to the resource. Consider, for example, that several Arthur episodes dealing with Arthur learning mathematics are packaged for distribution and resale as "Arthur does mathematics". An episode in this mini-series is also an episode in the larger, complete series. You want to avoid having to edit the record for the episode every time someone externally repackages the episode for resale. At the same time, you want to document this new relationship in a fairly lightweight manner. You also want to be able to flexibly document new relationships, such as a digital video excerpt from the episode, "Arthur learns long division" on a PBS mathmagic website.
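
The separate relationship record proposed in the last response can be sketched as a small structure that lists its members in order, so that repackaging an episode adds a new relationship record and leaves the episode's own record untouched. This is only an illustration of the idea; the class, field names, and identifiers are hypothetical and are not drawn from PBCore, METS, or any other schema.

```python
from dataclasses import dataclass, field

@dataclass
class RelationshipRecord:
    """A lightweight 'structure map' kept outside the resources it links."""
    title: str
    kind: str                                         # e.g. "series", "collection", "program"
    members: list[str] = field(default_factory=list)  # resource identifiers, in viewing order

def contexts_for(resource_id, relationships):
    """List the titles of every relationship record that includes the resource."""
    return [r.title for r in relationships if resource_id in r.members]

if __name__ == "__main__":
    # Identifiers such as "ep-017" are invented for the example.
    full_series = RelationshipRecord("Arthur", "series", ["ep-016", "ep-017", "ep-018"])
    repackage = RelationshipRecord("Arthur does mathematics", "collection", ["ep-017"])
    print(contexts_for("ep-017", [full_series, repackage]))
```

Adding the resale package only required creating a second relationship record; nothing in the record for the episode itself had to be edited.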


X_2.04 Element Title.Program Expert Response


No comment
See Title Comments
See Title Comments
SEE TITLE COMMENTS.
See title comments
I don't understand how Title.Program is different from Title.Episode if Title.Series is the show name.
SEE TITLE COMMENTS. My main problem with Title.Program and Title.Episode is that it is unclear why these have to be two separate fields, especially in the environment where programs have either a program title or an episode title. How are agencies supposed to handle consistency issues when they have both fields? What's to keep an AP from putting an episode designation in program or vice versa? And which shows up in the main Title element, or do both?
See previous comments


X_2.05 Element Title.Episode Expert Response


No comment.
See Title Comments
See Title Comments
SEE TITLE COMMENTS.
See title comments
I think whatever PBCore decides it should go along or be compatible with other common practice in the broadcast industry. For example, for most ongoing tv shows, the series title is the show title, but then the individual episodes are called episodes. PBCore doesn't exist in a vacuum. What is common broadcast terminology for these show divisions? What do broadcasting textbooks say?
SEE TITLE COMMENTS and SEE TITLE.PROGRAM COMMENTS
See previous comments


X_2.06 Element Subject Expert Response


This should be done through the identification of the encoding scheme, which is a feature of DC. Defining an element refinement is the wrong solution
Yes, encourage people to identify the source of terms.
Interesting questions -- what technology can you use to ensure the richness and appropriateness of the entry? Latent Semantic Analysis? As with any standard, most people will, unfortunately, try to avoid work. The result is a short range optimization and a long range failure. A guided free-form including sample sentences and fill-in the blank might be best.
Words have different meanings in different contexts. It is important to use controlled headings and also identify the specific subject classification scheme being used. Picking up a scheme is the first step. The next step is to develop data dictionary within certain domains or fields.
I think the ability to specifically indicate the classification scheme used would be a good thing. In cases where an institution might want to use terms from different classification schemes for the same asset, Dublin Core's 'flat' approach, where every metadata element is discrete, makes it hard to associate a classification scheme with the subject terms from that scheme. You'll need to dictate certain best practices on how to order metadata elements in cases of multiple classification schemes, e.g. Subject.ClassificationSchemeUsed Subject Subject Subject Subject.ClassificationSchemeUsed Subject Subject Subject so that encoders are all on the same page regarding how to associate a scheme with its subject terms.
If an item is tagged using a classification scheme, it would be helpful to note which one was used. Subject.ClScUs could be added and automatically populated to default to whatever a particular agency, etc. normally uses.
Is using different subject classification schemes problematic? NO. This is real life. This will happen. No one subject classification suits everybody. Should there be a field to ID the scheme used? Perhaps. Is this going to be a standardized list of options? And then how detailed is someone going to determine this has to be (i.e. IPTC SRS v.9 or v.10 or v.11)? Isn't one of the choices going to be "local scheme" or "other" anyway? The idea, as I see it, is to get a rough idea of the chosen coding scheme. If you know what scheme the subjects are based on you can incorporate this into searches if it seems useful. I'm thinking this field would be useful if you wanted to search across more than just one database that uses the same subject scheme. How useful would this field be in practice? Is it just for the limited use of a few scholars? Would it allow search engines to automatically link separate databases using the same classification schemes for a given search? Is there any advantage to this field in every record? Or can the same thing be accomplished through a link to a general information page about the collection?
I think that there might be an inclination towards using a formal classification scheme. However, I also think that people might not take the extra step towards actually noting the classification scheme in the metadata. I don't know if adding another element is going to help this problem. More likely education about why it's important and implementation software that makes you fill out that info before you go on will be more helpful. I am loath to recommend adding another element, but it seems to make sense. I still don't think people will fill it out.
I think it is really critical to document the subject scheme being used. In fact, I am a great believer in controlled vocabularies whenever possible to provide authoritative headings. I suggest that you develop authoritative controlled headings before rolling out PBCore to stations. There is not much value in following a standardized metadata schema when the values that are actually searched are all over the map.
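
One response above proposes an ordering convention for a flat record: each Subject.ClassificationSchemeUsed element is followed by the Subject terms that belong to it. A reader of such a record could recover the grouping roughly as sketched below. The element name comes from that response; the (element, value) pair layout and the function name are assumptions.

```python
def group_subjects(elements):
    """Group Subject terms under the most recent scheme element seen.

    `elements` is an ordered list of (element_name, value) pairs.
    Terms that appear before any scheme element are grouped under None.
    """
    groups = {}
    current = None  # scheme in force; None until a scheme element is seen
    for name, value in elements:
        if name == "Subject.ClassificationSchemeUsed":
            current = value
            groups.setdefault(current, [])
        elif name == "Subject":
            groups.setdefault(current, []).append(value)
    return groups

if __name__ == "__main__":
    # Scheme labels and terms below are placeholders, not a recommended list.
    flat = [
        ("Subject.ClassificationSchemeUsed", "Scheme A"),
        ("Subject", "Mathematics"),
        ("Subject", "Education"),
        ("Subject.ClassificationSchemeUsed", "Local scheme"),
        ("Subject", "Pledge drive"),
    ]
    print(group_subjects(flat))
```

The grouping depends entirely on element order, which is why the respondent stresses that encoders would all need to follow the same best practice.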


X_2.07 Element Description Expert Response


The solution with Description.Type is contrary to the philosophy of DC.
No comment, but some form of qualification seems useful.
Stick with Dublin Core
Either way works. The second approach is more flexible. The "Description.Type" must be associated with a particular instance of "Description".
Using a separate description.type element is obviously somewhat more flexible than a separate element for every type of description, but you will have to specify encoding guidelines for how type will be associated with the description in cases of multiple descriptions for one object.
The first alternative seems better, less cumbersome, self-descriptive, and easier to use, although the "type" scheme probably provides greater flexibility and extensibility.
It depends what you want to do with the information. You may just want to search a certain type of description. I'm not sure exactly what Type description would look like in a record. If you're able to divide up the description in a way that's meaningful to you, it doesn't really matter if you call it Description.TableofContents or Description.Type.TableofContents. Without examples it's hard for me to see the difference.
If you put in DESCRIPTION.TYPE then you would have to make it repeatable. And how would you map the TYPE to the data dumped all into one field? I like the idea of minimizing the number of fields, but it seems to me that DESCRIPTION.TABLEOFCONTENTS and DESCRIPTION.PROGRAMRELATEDTEXT are good stand-alone elements. DESCRIPTION.ABSTRACT might be rarely used but I see why it should be separated out. I would be inclined to leave it the way the scheme is but identify DESCRIPTION as part of the minimal elements within the core of the Core and make DESCRIPTION.ABSTRACT, DESCRIPTION.TABLEOFCONTENTS, and DESCRIPTION.PROGRAMRELATEDTEXT optional. In a situation like mine, we are rarely going to have the information for DESCRIPTION.ABSTRACT and DESCRIPTION.TABLEOFCONTENTS. But we will want DESCRIPTION and DESCRIPTION.PROGRAMRELATEDTEXT to be different fields.
I don't really know. I think that you will probably want fairly tight formatting for these data elements, so whichever strategy you choose should result in formatting which clearly delineates the separate subelements within a Table of Contents, ComposerList, etc.
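
Several responses above compare two encodings: a refined element name such as Description.TableOfContents, and a generic Description paired with a separate type value. As one respondent notes, the difference is hard to see without an example. The sketch below shows that the two forms can carry the same information as long as each type stays bound to its own description text; the function names and the (type, text) pair layout are assumptions.

```python
def to_typed(element, value):
    """Convert ('Description.Abstract', text) into a (type, text) pair.

    A bare 'Description' element has no refinement, so its type is None.
    """
    base, _, refinement = element.partition(".")
    if base != "Description":
        raise ValueError(f"not a Description element: {element}")
    return (refinement or None, value)

def to_refined(desc_type, value):
    """Convert a (type, text) pair back into a refined element name and text."""
    name = "Description" if desc_type is None else f"Description.{desc_type}"
    return (name, value)

if __name__ == "__main__":
    refined = [
        ("Description", "General summary."),
        ("Description.TableOfContents", "Part 1; Part 2; Part 3"),
    ]
    typed = [to_typed(name, text) for name, text in refined]
    print(typed)
    print([to_refined(t, text) for t, text in typed])
```

Keeping the type and text together in one pair is what preserves the association when a record holds several descriptions; two independent flat elements would need an ordering rule instead.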


X_2.08 Element Description.Abstract Expert Response


No comment
See Description Comments
See Description Comments
SEE DESCRIPTION COMMENTS
see description comments
Description.Abstract seems like a useful field for most types of programs. That said, I think a picklist gives you more flexibility for choosing types of description, rather than having to hard code specific types in a field name. Also it allows the individual station to use only the terms it wants to use. As long as these are common industry terms, I don't see a problem with a picklist. The description is divided into specific types, the question is just the name for the field display.
SEE DESCRIPTION COMMENTS I think this element is a bit duplicative of DESCRIPTION. If any of the DESCRIPTION elements can be folded in I think this one should be.
See previous comments.


X_2.09 Element Description.TableOfContents Expert Response


No comment
See Description Comments
See Description Comments
SEE DESCRIPTION COMMENTS
see description comments
My concern with preset categories like Description.Table of Contents is that if you have description that doesn't fit a defined category you have to shoehorn it in someplace. It's not very flexible. So what actually goes in the category may be different from station to station because the model doesn't suit their needs.
SEE DESCRIPTION COMMENTS. I see the reasoning of keeping this element separate for data entry, but not for searching. I don't see that this element will get used much anyway.
See previous comments


X_2.10 Element Description.ProgramRelatedText Expert Response


No comment
See Description Comments
See Description Comments
SEE DESCRIPTION COMMENTS
See description comments
I don't see "script" listed in your examples. This is the most common text format for live productions (i.e. news shows). The talent starts with a script but there may be adlibs, live shots and unscripted material contained within the program. I'm guessing script would go in this description category but I don't see it on the "list of permutations". Unless it's called "speech to text", but to me this means running the audio through a speech-recognition program that may or may not be accurate depending if a human reviews it or not. And what the heck is "an electronic file with timecode synchronization data". Does this mean a digitized script with running times? Or what exactly?
SEE DESCRIPTION COMMENTS. I think this one should remain separate. Many agencies may be using software that will harvest the CC, lower thirds, speech recognition from its programming and will need somewhere to dump it. I don't think it's useful to dump it into a generalized DESCRIPTION.
See previous comments


X_2.11 Element Creator Expert Response


Creator.Role is contrary to the philosophy of DC. Several refinements are being defined to qualify Contributor.
Yes, role would be good here.
I like the idea of a creator hierarchy.
Creator.role must be associated with particular instance of Creator.
I think this is a good idea, but I think that this highlights the fact that having separate creator and contributor elements is not particularly valuable. In my opinion, Dublin Core made a mistake in asserting that distinction.
I've seen Creator and Creator.Role used elsewhere with success. Offers a great deal of flexibility to the user.
This is very confusing. Especially creator vs. contributor. No one outside of people who have been trained by PBCore is going to be able to figure this one out, and even then it's murky. Can there be more than one creator? What if they're not all at the same creator level? What if someone is somewhat linked to the creating process but not a full-blown creator? Where is the cutoff? Creator.role helps, but deciding who's a creator to begin with is the biggest problem.
Initially, I find it strange that the Creator and their Role are in two different "fields". How is this supposed to work in a simple database? Especially if you have several Creators with several Roles, what's to pull a Creator together with their Role and separate it from the next Creator and their Role? I think this is going to be a tough element (with Publisher and Contributor) to actually implement in a non-complex database that may not be set up properly to get this metadata to work properly.
I think you should follow the approach that MPEG-7, MODS and MIC employ and separate Creator into two subtypes--name and role. That way, you can have variant name forms. Variant names are pretty common for performers, such as Charlie "Bird" Parker, etc.


X_2.12 Element Creator.Role Expert Response


Yes, more qualifications.
See Creator Comments.
See Creator Comments
SEE CREATOR COMMENTS
See creator comments
I have one problem with Creator.Role. One thing I would find immensely stupid is to have a separate entry for each person for each separate role they perform. So if someone did multiple roles, there'd be an access entry point for each role. For example: Blow, Joe. Creator.Role: Illustrator. Blow, Joe. Creator.Role: Artist. That would be a totally redundant waste and artificially inflate the search results you have to wade through. What would make sense is to do this: Blow, Joe, Creator in these capacities: (check those that apply) artist, illustrator, set designer, title graphics, other. Then it's all in one place by each creator's name. Unless you have a huge program like a Hollywood movie where you have more traditional credits grouped by production role. But I doubt most PBCore productions are on that grand of a scale.
SEE CREATOR COMMENTS. I do think it's important to designate a Creator's Role though. The challenges inherent in getting CREATOR and CREATOR.ROLE to work together and be differentiated from the next CREATOR and their role are secondary to the unique problem that there are several creators with different roles in these resources.
See previous comments.
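The pairing problem raised in the Creator and Creator.Role comments (keeping each Creator together with their Roles, allowing variant names, and avoiding one entry per person per role) can be sketched as a small data structure. This is only an illustration, not a PBCore-defined structure: the class name `Creator`, its fields, and the helper `group_roles` are hypothetical, and the sample names are placeholders.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Creator:
    """One entry per person; roles and variant names hang off that entry (hypothetical layout)."""
    name: str
    variant_names: List[str] = field(default_factory=list)
    roles: List[str] = field(default_factory=list)


def group_roles(pairs):
    """Collapse flat (name, role) rows into one Creator per name, keeping first-seen order."""
    by_name = {}
    for name, role in pairs:
        creator = by_name.setdefault(name, Creator(name=name))
        if role not in creator.roles:
            creator.roles.append(role)
    return list(by_name.values())


# Flat rows, as a simple two-field database might store them.
rows = [
    ("Blow, Joe", "Illustrator"),
    ("Blow, Joe", "Artist"),
    ("Doe, Jane", "Producer"),
]
creators = group_roles(rows)
for c in creators:
    print(c.name, "->", ", ".join(c.roles))
```

Grouping at the person level gives a single access point per name while still recording every role, which is the arrangement one respondent asks for; a flat two-field layout would need some extra key to achieve the same association.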


X_2.13 Element Date.Created Expert Response


No comment
I like the Date treated like an object...hierarchy.
Either way works. The way with "Date.Type" is more flexible. Only qualified date elements such as "Date.created" might be more interoperable.
I prefer the alternative approach. Again, as in all these cases, you'll need to specify a best practice for relating date.type with the appropriate date element in cases of multiple dates for a single work. A recommended controlled vocabulary for date.type would also be a good thing.
The first alternative seems better, less cumbersome, self-descriptive, and easier to use, although the "type" scheme probably provides greater flexibility and extensibility.
I like Date.Type because it sounds like you'd have a picklist for relevant date choices. Again it depends on the needs of the institution. Some places will need lots of date indicators and some won't. The Date.Type sounds like it has more flexibility. In both cases the information being stored is the same, it's just a matter of what the fields are called.
I think the use of qualified Date elements is fine. I would leave it the way you have it, except, I'm still confused about the difference between DATE.ISSUED (or aired) and DATE.AVAILABLESTART. So more attention needs to be given to the definitions of these elements.
You may instead want to consider a resource status data element, that includes both a status type and a date. This isn't Dublin Core, of course, but you appear to be trying to accomplish more than description with PBCore--you appear to be wanting to support production workflow.
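Several responses above note that the qualified-element approach and the Date.Type approach store the same information under different field names. A minimal sketch of that equivalence follows. The function names, the dictionary layout, and the sample dates are assumptions made for illustration and are not taken from the PBCore documentation.

```python
def to_typed(qualified):
    """Turn {'Date.Created': '2004-03-03'} into [{'type': 'Created', 'value': '2004-03-03'}]."""
    return [
        {"type": name.split(".", 1)[1], "value": value}
        for name, value in qualified.items()
    ]


def to_qualified(typed):
    """Inverse of to_typed; assumes at most one date per type."""
    return {"Date." + d["type"]: d["value"] for d in typed}


# Hypothetical record using qualified date elements.
record = {"Date.Created": "2004-01-15", "Date.Issued": "2004-03-03"}
typed = to_typed(record)
print(typed)
print(to_qualified(typed) == record)
```

The typed form can hold several dates of the same type or accept a new type without a schema change, which is the flexibility respondents mention. The qualified form is closer to plain Dublin Core qualifiers. In this sketch the round trip is lossless only while each type occurs once.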


X_2.14 Element Date.Issued Expert Response


No comment
See Date.Created comments
See Date.Created Comments
SEE DATE.CREATED COMMENTS
See Date.Created comments
See Date.Created comments.
SEE DATE.CREATED COMMENTS
See previous comment


X_2.15 Element Date.AvailableStart Expert Response


No comment
See Date.Created comments
See Date.Created Comments
SEE DATE.CREATED COMMENTS
See Date.Created comments
This field looks like it's designed for web production, or booking availability for program syndication. It looks like a content management field. With the items I'm working with at the moment I wouldn't have any use for it.
SEE DATE.CREATED COMMENTS
See previous comment


X_2.16 Element Date.AvailableEnd Expert Response


No comment
See Date.Created comments
See Date.Created Comments
SEE DATE.CREATED COMMENTS
See Date.Created comments
See previous
SEE DATE.CREATED COMMENTS
See previous comment


X_2.17 Element Format.Identifier Expert Response


I do not understand the need for this refinement.
Sort out the TYPE of identifier, the VALUE of the identifier, perhaps the OWNER/ROLE (or something) for the implementation of the identifier, and allow for an identifier note or comment.
Leave them separate. The number of formats and the number of variations in a type of format will probably increase.
Agree.
I can equally envision many instances where you would want to record identifiers that represent the 'logical' work and not a particular physical instantiation, in which case identifier and format.identifier would be very distinct. I strongly recommend against merging these elements.
I think it's reasonable to combine the two since Identifier can be repeated, if necessary.
10.00 Identifier doesn't have any qualifiers in it. How do you distinguish between identifiers for multiple digital formats? For example, you have a show that's stored as a video file, a WAV file, a Real Audio file, CD, DVD. How do you distinguish between them using just Identifier? But if Identifier can be qualified it would be okay to use it instead.
I agree that there is little practical difference between these two elements. I would be for merging them.
I think there is a distinction between the entity identifier and the identifier for each physical manifestation. Every physical copy or manifestation must exist in a separate location, so you will have separate identifiers. However, the intellectual content or source object usually has an identifier as well.
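The distinction drawn in these responses, between an identifier for the "logical" work and identifiers for each physical or digital manifestation, with a type, value, owner, and note for each identifier, can be sketched as follows. All class names, field names, and sample values here are hypothetical and serve only to illustrate the suggestion. They are not PBCore definitions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Identifier:
    id_type: str                 # e.g. a local catalog scheme (assumed vocabulary)
    value: str
    owner: Optional[str] = None  # who assigned the identifier
    note: Optional[str] = None


@dataclass
class Instantiation:
    format_name: str
    identifiers: List[Identifier] = field(default_factory=list)
    location: Optional[str] = None


@dataclass
class Work:
    title: str
    identifiers: List[Identifier] = field(default_factory=list)        # work-level
    instantiations: List[Instantiation] = field(default_factory=list)  # per copy

    def find_instantiations(self, format_name):
        """Return the copies held in the given format."""
        return [i for i in self.instantiations if i.format_name == format_name]


# Placeholder data: one work, two manifestations, each with its own identifier.
work = Work(
    title="Example Program",
    identifiers=[Identifier("local-catalog", "PROG-0001", owner="Example Station")],
    instantiations=[
        Instantiation("WAV", [Identifier("file-name", "prog0001.wav")], "server-a"),
        Instantiation("DVD", [Identifier("shelf-number", "D-17")], "vault shelf 3"),
    ],
)
print([i.location for i in work.find_instantiations("WAV")])
```

Under this layout, merging Identifier and Format.Identifier into one repeatable element would still work only if each identifier carries a type and is attached to the right level. That is the condition several respondents place on the merge.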


X_2.18 Element Identifier Expert Response


See preceding comment, and comments made in main questionnaire.
See format.identifier comments.
SEE FORMAT.IDENTIFIER COMMENTS
I think it's reasonable to combine the two since Identifier can be repeated, if necessary.
see Format.Identifier comments
SEE FORMAT.IDENTIFIER COMMENTS
See previous comment


X_2.19 Element Format.FileSize Expert Response


Fine with me, but I always wonder if the units of measure ought not be an attribute. This is less critical here since everything is bytes, but why then type "bytes," as in your examples?
No thought
Nice.
None
I think a file-size field should be included, but specifying that it must be expressed in bytes seems very strange in this world of KBs, MBs, etc.
Why can't you use KB or MB if that's more meaningful and what the industry uses for file size measurements? This seems silly and less intuitive and user-friendly.
My problems with this element are: When you are dealing with video, it is likely that the file size will be measured in MB, not bytes, so the conversion seems unlikely to happen. How are you going to get people to take that extra step? I think it would be valuable to share why entry in this element would be useful, how it would be used. I think people are going to get tired when filling in the FORMAT elements and likely to get sloppy because of all the technical information that has to be recorded and where people might not think there's much use to having them.
This data element could be usefully autogenerated.
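One way to reconcile the responses above is to store the size in bytes, as the element requires, while letting people enter and read KB or MB. The sketch below assumes binary multiples (1 KB = 1024 bytes), which is an assumption and not something the survey specifies. The names `UNITS`, `to_bytes`, and `to_display` are hypothetical.

```python
# Assumed convention: binary multiples. Swap in powers of 1000 if decimal units are preferred.
UNITS = {"B": 1, "KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}


def to_bytes(text):
    """Parse a string such as '1.5 MB' into a whole number of bytes."""
    value, unit = text.split()
    return int(float(value) * UNITS[unit.upper()])


def to_display(size):
    """Format a byte count using the largest unit that fits."""
    for unit in ("GB", "MB", "KB"):
        if size >= UNITS[unit]:
            return f"{size / UNITS[unit]:.1f} {unit}"
    return f"{size} B"


stored = to_bytes("1.5 MB")
print(stored)
print(to_display(stored))
```

Where the file itself is available, the value could instead be read automatically (for example with `os.path.getsize`), in line with the suggestion that this element be autogenerated.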


X_2.20 Element Format.TimeStart Expert Response


DC.Identifier.Timestamp never made it into the DC standard. It was intended to support timestamped URIs, and not as a media dependent property.
No comment
I like keeping to Dublin Core for all the reasons I stated earlier.
Yes, the format.timestart approach seems more appropriate to me.
I agree that it makes more sense under format, since the same resource on different formats may have different Timestart while having the same Identifier.
Where is a user going to look for this information? It makes more sense to put all digital file information in one spot than to sprinkle it around arbitrarily in this field, that field, Identifier fields, etc.
Yes this seems to make sense to me.
I'm not sure I understand what is going on here.


X_2.21 Element Format.Duration Expert Response


No comment
See earlier comments.
OK
I don't see the use of the term format.duration as particularly problematic, although since the meaning is identical to format.extent, you might wish to consider whether having a comfortable terminology is more important than insuring compatibility with other metadata communities.
I agree that "duration" is the more commonly used term in this context.
I agree. Duration is better than Extent. The more you can match common industry terminology the less confused users will be and the more compliance you'll get. Duration makes more sense. At one place I worked they used the term "length"; this is also a common term.
I agree. Extent has always been a cataloger's word for books.
I agree


X_2.22 Element Format.Standard Expert Response


Well, if you let people choose different forms of timecoding, then you ought to have the three elements.
Controlled vocabularies are fine. Don't collapse. It's easier not to use than to create elements later. The marketplace will make the choice.
Could be a single element.
I think that format.standard and format.encoding are, fundamentally, trying to express the same information: formally identifying the technical standard/specification which defines the data format used for the asset. I think they can be collapsed into a single element which *should* be more carefully defined, and perhaps employ a controlled vocabulary. Format.type, on the other hand, seems to be expressing something a bit different from the other two elements, a more 'high-level' description of the nature of the work's format. I think format.type should probably be maintained as a separate element.
I think Format.Type is not needed given that you have Type as a separate element. However, Format.Standard and Format.Encoding seem useful to have. Rather than a controlled vocabulary, though, I would use a pick list of the common schemes in use.
Since Format is an area subject to fast technological change, there should be a way to describe the file so you know exactly what you are requesting/getting in data exchange. You want to know the standard you want (ex: PAL vs NTSC), the type (audio version rather than the video version), and any encoding you need to exchange the format. This is technical exchange information, and as such I'd consider all three format qualifiers useful and not too granular. The underlying consideration is: What is the essential information that will be needed to exchange program elements? If encoding is critical put it in. If not, leave it out.
I am not as familiar with how the media resources will be exchanged and the requirements necessary for that activity. I think you should collapse these three elements into a single metadata element. But I do think that some consideration should be given about how this one element would be used and what controlled vocabulary would be required to have it used most effectively.
I'd look at what is required for interoperability with schemas like MPEG7 and SMPTE and also ask what purpose these data elements serve, and who benefits from their existence. Are they important for migrating to newer technologies to support digital permanence? Are they important to end users who play back the files? Is this important for a station considering the purchase of the resource, or preparing to download the resource? The data elements should serve a purpose, perhaps tied to the 3 FRBR core user information needs--find, identify, select or obtain, or they should serve the purpose of maintaining the intellectual content in perpetuity.


X_2.23 Element Format.Type Expert Response


Keep the three.
See format.standard comments.
SEE FORMAT.STANDARD COMMENTS
See Format.Standard comments
I disagree with a number of the text picklist choices being put in Format. To me these belong in Description. I think it's going to be confusing to users to see text information in two places. Also Moving Image/ Screening Tape and Stock Footage are listed twice. Otherwise the Format/Type choices make sense, although I wouldn't use most of them.
SEE FORMAT.STANDARD COMMENTS
See previous comment


X_2.24 Element Format.Encoding Expert Response


Important, will have lots of nuances, e.g., MPEG-2 and -4 profiles and levels, etc.
See format.standard comments.
SEE FORMAT.STANDARD COMMENTS
See Format.Standard comments
This seems to be critical information for the successful transmission and delivery of exchanged programming. Keep.
SEE FORMAT.STANDARD COMMENTS
See previous comment


X_2.25 Element Annotation Expert Response


As part of the emerging DC data model, the use of the same refinement for more than one element is not allowed.
You may be glad for the "bound to" annotations, to sort things out later.
See earlier comments.
Annotation is a good idea to provide supplementary information for certain data elements.
At some level, I think an annotation/note facility can become overkill, and part of the point of a metadata standard is to force people to express information within a particular structure, instead of allowing free-text everywhere. I think a single annotation element provides the flexibility to give additional information not covered within the main metadata element set; separate annotation elements for every other metadata element would be unwieldy and probably pointless. I doubt many people are going to have the time to put in that much annotation information.
As I said in my main survey comments, I would strongly recommend against having any sort of "annotation" elements. They tend to be catch-alls for all sorts of detail that either isn't needed or that should more properly be captured elsewhere in the metadata. There's no way to ensure consistency and these fields can be difficult to index and search. I think if information is worth capturing in the metadata, it deserves its own element.
It doesn't matter. The information is the same regardless if all annotations are consolidated down in an annotation area, or separated out under the pertinent element. My preference is to have information about an element next to the element so you don't have to go hunt and peck for it. It seems more efficient, intuitive, and user-friendly.
As it stands now, it is unclear how the element is actually to be used. But I also think that you do need a general notes field to dump stray information into. I think the possibilities of adding ANNOTATION as a qualifier of elements is a useful idea, but in reality, I don't know that they would actually be used as intended. I would think that in the rush to get the "paperwork" done, these elements would remain empty and useless while if something really remarkable stood out that had to be presented, people would want to look for a general notes field. I think that PBC should not support individual Annotation Elements for each major element, but leave the option open to see if any agencies actually do want to make use of fields like these.
This is a really good idea, actually.


X_2.26 Element Location Expert Response


I'd like to see this connected to identifier somehow.
How about multiple locations? How about electronic storage locations and physical storage locations. Should this element be broken down to those levels?
I basically agree with these comments. I would note that as in many other aspects of PBCore, you're going to need to develop encoding guidelines to help your users. In the case of multiple manifestations for an item, the assets in different locations will very likely have different format characteristics as well (bit depth, data rate, frame size, frame rate, encoding, etc.). You'll need some way to associate all of the other formatting metadata elements with a particular location(s).
I agree with the above assessment.
It looks like this is intended to cover two things. 1. Gatekeeper: Who to contact to use the material 2. Location of formats. Should contact information go in a separate area of the record? Like Annotation.Contact? An aside: Often in libraries with an ongoing publisher series, there will be several kinds of records. There will be one (series title) record for the overall holdings, and then individual records for each work (episode). For example, if there's an ongoing series "Hearts of Space", there could be one "holdings" record listing all the episodes, the titles for each episode, the author/contributor and brief descriptive abstract. This acts as the major inventory listing for all possible holdings for a series. Then the individual records will be more specific. Example: Music of the Spheres by R.U. Kidding, Episode #703 in the Hearts of Space series. Looks at intergalactic tintinnabulation across the closest 10 light years. Available in WAV file, RealAudio, with accompanying PR text and image material. Don't think one episode or program can only be listed or displayed one way in the database. Work with the advantages of a database, not against it.
Ooo, this is a good idea. Using this element to note the several copies of something. That is how we have traditionally done it in the archives I work for normally. However, how are you going to distinguish what version is kept in which place? For example, if you have a 3/4" tape and a 1" tape and a digital file, it will be pretty clear which is the digital file, but what about the others? Also, how do you use the other elements (especially INSTANTIATION elements), how do you identify which is your primary item and what FORMATS your other versions are in?
Consider adding location as a subelement to a status element, particularly if you want to use PBCore to document the lifecycle of a resource.


X_2.27 Element Rights.Usage Expert Response


You may wish for parse-able data but that may be premature. Can you build this so that the standardization of usage-expression language develops in a way that will allow future parsing?
I like more elements. It gives more flexibility. Remember, you are going to use search engines. Keyword searches are more likely to be successful with a rich set of elements than with prose.
Prefer to use a hierarchy of rights categories. This is an area that cries out for standards because you will want to base system functions on this. The structure and standards are important to facilitate interoperability.
Formal data models for expressing rights information are *Very* difficult to create, and the environment in which public broadcasting operates strikes me as more prone than many to creating unusual rights situations. I would leave these elements free-text at the moment. You may wish to consider whether a separate rights expression language is needed by the public broadcast community, or whether one of the existing rights languages, such as ODRL, could be adopted to more specifically delineate rights & permissions covering various assets.
I think it makes more sense to break it down as you have rather than combined into one element.
I don't see how all circumstances can be accommodated in a hierarchical grouping of successive picklists, where each window that appears depends on what was chosen before. Although this might handle 95% of the cases and any exception could be entered manually in a separate rights statement. Even within a specific collection there might be restrictions on specific episodes. There should be an ability to enter the information both ways, with a preference given to hierarchy but where this isn't meaningful, a rights statement is acceptable. Also it looks like this element is supposed to be a freeform text entry. Is this where legal rights are spelled out in long legalese paragraphs, or is it designed to link to a site that contains the long legalese? I'm thinking here that terms of use could change rapidly and it would be a nuisance to have to update a record every time this happened.
I would prefer that all values be combined into a rights statement placed into a single metadata element with a standardized way of entering the data at least or a controlled vocabulary at best. However, this is going to be hard to implement too. Are you going to have some place for people to put rights statements that are unusual?
Given that rights are such a critical issue for PBS resources, I'd suggest developing a separate rights schema, utilizing MPEG21 (XrML) or ODRL, and referencing the rights metadata from the PBCore record.


X_2.28 Element Rights.Reproduction Expert Response


Same as last comment
see rights.usage comments.
See Rights.Usage Comments
SEE RIGHTS.USAGE COMMENTS
See Rights.Usage comments
SEE RIGHTS.USAGE COMMENTS
SEE RIGHTS.USAGE COMMENTS
See previous comment


X_2.29 Element Rights.Access Expert Response


Same as last comment. I trust that you are looking at the MPEG rights language effort.
see rights.usage comments.
See Rights.Usage Comments
SEE RIGHTS.USAGE COMMENTS
See Rights.Usage comments
SEE RIGHTS.USAGE COMMENTS
SEE RIGHTS.USAGE COMMENTS. I don't understand how this question applies. As it currently stands there are only two choices for ACCESS: Open Access and Restricted Access. I thought the generalized and restrictive nature of the values for this element was there for a reason. If this is still true then I would leave the element the way it is with limited choices. Otherwise, what is the difference between this element and the other two?
See previous comment


X_2.30 Special Extensions Expert Response


No comment
Get it into use. Learn how it works. Then be ready to make changes. A "gedanken" experiment is very useful but only in so far as it models the real world. You need the real world experience to answer these questions.
Agree.
Public broadcasting will not be able to create the metadata needed by all potential users of your content. You won't be able to provide the level of detail librarians require for cataloging. You probably won't be able to provide all of the detail the educational community wants on learning objects. This being the case, make sure your first concern is serving the needs of *your* community. Extensions are potentially useful, but can complicate interoperability. I'd recommend that in any future changes, you also start with the question "how exactly does this benefit us?"
I think the PBCore set is a well thought-out first set -- an excellent starting point for metadata collection. As you say, with time will come refinements. As long as you don't get so detailed that you lose flexibility to respond to changing needs and differing environments, this set should be useful over the long-term (or at least mappable to the "next big thing" that comes along).
It's good that you stress that the PBMI is not going to be all things to all people. The major focus on storing essential information for program exchange is solid. What I would guard against is any assumption that you can get all the information you need from the online databases. You won't. Stations will still be calling each other with questions about program content, whether it can be used in a specific context, whatever. The metadata records are just a starting point for discovery of program material, descriptive information about the material, the formats it comes in, and terms of use. The records form a comprehensive catalog, but that's all it is, a catalog. It is not the be-all and end-all of information exchange. Whatever PBMI comes up with should do the following: 1. Be as intuitive as possible to the end user. Try to use industry terms as much as possible. If these can be coded through XML mapping in the background to link to other standard terms, fine. But you'll save yourself a lot of grief in the beginning if the user doesn't feel totally lost and can recognize familiar terms. 2. Don't be wedded absolutely to standards. You are creating this to be a tool for PB stations. If it makes more sense to display notes next to the item they describe, do it. You want the record to be as easy to follow as possible. Again, field names in forms can be linked in the background to other standards terms and rearranged by the computer in the background. 3. Allow for flexibility and station individuality. Standard lists to pick from are okay but allow for manual entry for exceptions or as needed. 4. Remember stations have been exchanging programs without the assistance of computers for a long time. Being able to set it up online is a convenience, but it is still a tool, not an end in itself. It is not supposed to be an arcane model of philosophical perfection, this is something that is meant to be used.
Don't make it so complicated it makes the IRS tax code look like a picnic.
I think you are on target with this approach.
This is an acceptable strategy.



Generated: 3/3/2004 5:44:07 PM