Library Philosophy and Practice 2010

ISSN 1522-0222

Embedding Semantic Markup in Web Pages

Virginia Schilling
Libraries
University of California, Riverside
Riverside, California

Introduction

The World Wide Web first revolutionized the presentation of text and data. People thousands of miles away from each other could suddenly see the same exact text or data at the same time in the same format. The second wave of data handling has come with the collaboration technologies: social tagging, networking websites like Facebook and the interactivity of Wikis and blogs. Both of these technology sea changes have been aimed at making data accessible to people. Both have improved a person’s ability to read the text or data presented and interpret meaning from it, for good or for ill. The next wave of technology will make data accessible to computers as well as people. Instead of undifferentiated text presented on a web page, each data point will be coded in a way that computer programs will be able to understand and interpret. This next wave of technology change will lead us into the semantic web.

Web pages are generally coded using either HTML or the stricter XHTML markup languages (collectively known as X/HTML). However, these languages only tag data on the web page for presentation purposes (i.e. they say things like “make this word bold”), not for the actual meaning delivered by the content (they don’t say “this word is the name of a city”). Using markup languages that code for meaning in addition to presentation will allow software to find and use specific bits of information on the web page, such as a date or a person’s name, rather than just understanding everything on the page as one gigantic mass of text. Each bit becomes a separate piece of information with its own individual meaning. In some ways, the concept is like taking everything on the Internet and putting it into a gigantic database.

The real power of semantic markup, however, is that implicit relationships between bits of data can be established by the computer. People can read the text of two different web pages, for example, and be able to interpret implicit relationships between the data in each one. A computer cannot do this. If on one web page, a city is stated to be in a particular country and on another separate web page, a person is stated to be in that same city, then the implicit statement that the person is located in that same country can be understood easily by a person. Semantic markup will allow that implicit relationship to be also understood by a computer.
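The inference described above can be illustrated with a minimal sketch. It assumes the two web pages have already been reduced to simple (subject, predicate, object) statements; the names "Alice", "Riverside" and the predicate "locatedIn" are purely illustrative and do not come from any real vocabulary.

```python
# Toy statements as (subject, predicate, object); every name is hypothetical.
page_one = {("Riverside", "locatedIn", "United States")}
page_two = {("Alice", "locatedIn", "Riverside")}


def infer_locations(triples):
    """Chain 'locatedIn' statements until no new statement can be derived."""
    known = set(triples)
    changed = True
    while changed:
        changed = False
        for (a, p1, b) in list(known):
            for (c, p2, d) in list(known):
                if p1 == p2 == "locatedIn" and b == c:
                    derived = (a, "locatedIn", d)
                    if derived not in known:
                        known.add(derived)
                        changed = True
    return known


facts = infer_locations(page_one | page_two)
print(("Alice", "locatedIn", "United States") in facts)  # True
```

Neither page states that the person is in the country; the statement only appears once the data from both pages are combined.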

Underpinning the concept of the semantic web is the Resource Description Framework (RDF). RDF provides a structure to define an explicit connection between any two things. Currently, there are two basic methods to make RDF data available on the Internet. The first method is to create an external RDF document written typically as XML and that can be accessed and read by RDF-aware software. The second method is to extend the existing X/HTML coding of a web page using semantic tag attributes defined by one or more profiles, such as eRDF, RDFa or Microformats, and one or more descriptive vocabularies, such as Dublin Core (DC) or Friend Of A Friend (FOAF).

A software package designed to “understand” what the markup means will then be able to extract and use this tagged information. For example, a search engine designed to “read” the tags that indicate a person’s name versus those that indicate a corporate entity will be able to distinguish between a web page containing biographical information about the person Abraham Lincoln and a web page for an elementary school named Abraham Lincoln. The embedded semantic markup is not visible to the naked eye. The person reading the marked up pages still sees just the text. The only way to see the semantic markup is to look at the source code for the web page.

Semantic markup technology is still in its infancy. At this time, semantic markup of web resources is typically undertaken for a particular purpose and meeting particular requirements, such as a project to make specific resources available to OAIster. It is not undertaken as part of the “normal” development of any given web page. There are competing methods for using the markup in X/HTML coding with no clear indication yet of a general acceptance of any one for general purposes. The software available for use with RDF, when it is available, is lacking somewhat in robustness and maturity (or is available only at the enterprise level).

The two methods of markup require different levels of sophistication. Creating an external RDF document currently requires detailed knowledge of how to use RDF and the infrastructure on the server to handle requests for that document, something that may or may not be available to anyone not serving their own pages. Perhaps the web developers of the future will all have in depth knowledge of RDF, but people seem to most often follow the path of least resistance, so it seems likely to be something most web authors will never acquire. Embedding the RDF into existing X/HTML markup seems like a more logical choice for many web authors, but is it really?

This is essentially a small project to explore the second method of adding semantic markup to the web pages of the Center for Bibliographical Studies and Research (CBSR). Embedding RDF into X/HTML seems like a fairly straightforward concept in theory but how does it work in the real world? Will individual web site authors embrace and use semantic web practices of any kind? Presumably, with enough support from the community, eventually there will be fully developed mainstream software tools to help create semantic markup. But some authors never acquire more than the minimum level of skills required to code simple pages by hand (or rather to be able to use a wysiwyg editor). Will these web authors also embrace semantic tagging or will an entire segment of the web be ignored by this technology because users find it too difficult to implement?

There is no guarantee that the markup standard chosen for this project will actually present the data on CBSR’s website to any semantically-aware software that currently exists or that it will do so in the future.1 However, it is not completely without precedent to add the semantic markup to web pages. A study by Kolbitsch and Krottmaier in 2006 looked at the use of Dublin Core (DC) in HTML markup worldwide. Out of 48 web sites examined, they found that 13 sites used no DC at all. This means that, despite overall implementation being low, nearly three quarters of the websites examined used DC at least once in their pages (section 5.1).

This project will involve the encoding of DCMI’s Dublin Core metadata vocabulary into CBSR’s web pages according to the profile defined by RDFa. The choice of descriptive metadata vocabulary and profile will be discussed in detail below. Appendix A lists the DCMI vocabulary terms (or elements), their definition and a very general statement about their use for this project. Appendix B lists the CBSR web pages to which semantic markup will be applied.

Literature Review: Standards Development and Current Projects

Research in this field is led by the World Wide Web Consortium (W3C). The actual technology to carry out the goals of the semantic web is still in its infancy, where it exists at all. Current research is being directed primarily towards establishing standards and developing basic specifications to ensure interoperability in the future and to allow the construction of the tools and components that will form the invisible backbone of the semantic web.

W3C has developed standards/specifications for an abstract model to describe relationships between “things”, expressed as Resource Description Framework (RDF) (2004a), a semantic schema to allow the description of other vocabularies in RDF (RDFS) (2004b) and a syntax for RDF in XML (RDF/XML) (2004c). Gleaning Resource Descriptions from Dialects of Languages (GRDDL) is a specification for extracting RDF content from marked up XML or XHTML pages (2007). Simple Knowledge Organization System (SKOS) is a specification for converting existing controlled vocabularies into an RDF-compliant form (2009e). SPARQL Query Language for RDF is designed to do exactly what it says: query RDF-compliant data (2008c). The Web Ontology Language (OWL) is yet another extension of semantics for RDF, allowing for much more sophisticated use than that supported by the basic model and RDFS (2009d). RDFa is a specification for representing RDF in XML and XHTML documents (2008a). The specification for Protocol for Web Description Resources (POWDER) builds on these other specifications to allow the description of groups of web resources for purposes such as customized retrieval of resources or the identification of resource authenticity (2009c).

Completely separate from W3C but with the same idea in mind, the open source community has developed a set of formats called “Microformats” (About microformats, n.d.). Just like RDFa, these allow the use of existing XHTML tags to add meaning to the data they mark up. hCalendar allows events to be tagged in such a way that the information can be extracted and, for example, added to a calendar somewhere else. hCard allows contact information to be marked up in the same way. Formats exist to describe resumes, reviews and Atom feeds. Other formats are under construction to describe audio, recipes and citations (Microformat, 2009). Talis provides a third way of adding RDF-compliant tags to a web page with eRDF: Embeddable (or Embedded) RDF (Talis, 2006).

There are organizations with projects contributing to development all over the world. Library of Congress has made its subject headings data available in RDF/XML (n.d.). The DCMI/RDA Task Group has started a project to convert Resource Description and Access (RDA) into RDF (2008). The International Federation of Library Associations and Institutions (IFLA) is busy translating its Functional Requirements for Bibliographic Records (FRBR) into RDF (2008).

The National Archives in the United Kingdom has developed PRONOM, an authoritative registry of digital file formats for use in the RDF/XML environment (n.d.). The Global Digital Format Registry (GDFR), developed by Harvard (n.d.), is merging with PRONOM to become the UDFR or Unified Digital Formats Registry (2009). The Dublin Core Metadata Initiative (DCMI) has a registry of metadata schemes, The Dublin Core Metadata Registry (2008a), as does the National Science Digital Library (NSDL). The NSDL Metadata Registry “provides services to developers and consumers of controlled vocabularies and is one of the first production deployments of the RDF-based Semantic Web Community's Simple Knowledge Organization System (SKOS)” (2009, Welcome to The Registry! section). The JISC IE Metadata Schema Registry (IEMSR) “will act as the primary source for authoritative information about metadata schemas recommended by the JISC IE Standards framework” (2009, About IEMSR section).

There are numerous descriptive vocabularies that can be used for semantic markup of resources. DCMI designed Dublin Core (2005) specifically with web resources in mind. The Gateway to Educational Materials (GEM) describes web-based educational resources (2009). The Public Health Information Network (PHIN) vocabulary developed and maintained by the Centers for Disease Control and Prevention (CDC) “enables data from different programs to be consistently documented” (2005, p. 4). Hundreds of other vocabularies exist both within the focus of library work and completely outside of and unrelated to it: TEI, EAD, FOAF, DOAP and so on.

The literature documents a myriad of projects testing the possibilities of semantic web technology. Tonkin & Strelnikov (2009) discuss the JISC metadata registry and Heery & Wagner (2002), the DCMI metadata registry. Hildebrand et al. (2009), Angjeli et al. (2009) and Guzmán Luna, Torres Pardo & López García (2006) each discuss projects to develop or implement specific thesauri. Talantikite, Aissani & Boudjlida (2008), Arch-int & Sophatsathit (2003) and Uddin & Janecek (2007) all discuss the general development of ontologies. Chavarriaga & Macias (2009) look at modeling a semantic web-based interface. Damiani & Fugazza (2007) discuss the management of intellectual property rights using semantic web technologies.

A variety of tools exist to either generate or use semantically tagged data. DC-dot (n.d.) generates some semantic markup in X/HTML, RDF or XML for an existing web page without any. Other projects, like GEM and PHIN, have implemented search interfaces for their own vocabularies. Extensions for Firefox, like Operator (n.d.) and Piggy Bank (2008), can extract data from web pages tagged with Microformats markup. Tools run the gamut of sophistication from simple scripts like eRDF detector (Alexander, 2007) to full-fledged data processors like Altova’s Semantic Works (n.d.).

Despite the seeming multitude of tools available, this is where the infancy of the semantic web technology is most obvious. No one standard dominates the industry. There are still a variety of ways to have data semantically encoded without any correspondence, necessarily, between them. Most of the tools available are aimed at programmers or developers and not end-users who just want to code a web page to provide access for other users, not write an entire customized suite of scripts to create and process the metadata generated. SearchMonkey (Yahoo! Developer Network, 2009) crawls semantic markup primarily to provide data sets for developers.

Finally, there are no projects in the literature investigating the general use of semantic markup in web pages. Two studies of the use of Dublin Core in web pages both revealed low rates of implementation. Vinyard (2001) examined 299 pages but found that only 2.34% of them used Dublin Core (p. 13). More recently, Kolbitsch & Krottmaier (2006) used software to crawl 118,900 pages and discovered that 11% use Dublin Core. They note, however, that “only four websites account for more than two thirds (69.87%) of web pages with DC elements” (section 5.1). The relative newness of the technology is one probable reason for the lack of implementation, but another reason might also be that “adding Dublin Core to your website or weblog will most likely have little effect on the number of visitors” (Metadata, 2009, Dublin Core Metadata section, para 2). In a world where everything has to be justified for budgeting purposes, an idea that has no short-term benefits (and no proven long term ones either) would be a hard sell.

Technical Analysis: How to Add Semantic Metadata to Web Pages

The goal for this project is to investigate the ways to embed semantic markup into existing X/HTML pages in a way that meets the RDF standard and to implement one of them. Since RDF itself cannot be used in this way, “profiles” are created to define how the markup actually used can be translated into RDF (by another software program using the GRDDL standard). It is important to remember that the semantic markup describes the resources represented by the web pages and not the web page themselves. In the context of describing each of the various projects at CBSR, the web page URLs are treated as the unique identifiers (URIs) representing each of the individual projects and not as resources in themselves.

RDF breaks relationships between all things down to the basic level of "subject – predicate – object" or "thing1 is related to thing2." The eventual idea is to relate everything using URIs:

The subject is always a URI. For each of the profiles described below, it usually equals the base URL of the web page in which the semantic markup is encoded; however, each of the profiles also includes a way to change the subject to some other URI. The predicate is also always a URI. This URI is where the descriptive vocabulary is encoded for use in RDF. A vocabulary cannot be used unless it has been defined in an RDF schema and the terms given a URI. Finally, since there is simply no way, currently, to express all things as URIs, text is also acceptable as the object. URIs are called non-literals; text is called a literal. RDF places no other constraints on relationships between things (W3C, 2004a).
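As a rough illustration of these constraints, the sketch below models a statement as a small data structure in which the subject and predicate must be URIs while the object may be either a URI or a literal. The class names (URIRef, Literal, Triple) and the example.org address are assumptions made for the example only; they are not taken from the RDF specification or from any particular library.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class URIRef:
    """A non-literal: something identified by a URI."""
    value: str


@dataclass(frozen=True)
class Literal:
    """A literal: plain text used where no URI is available."""
    value: str


@dataclass(frozen=True)
class Triple:
    subject: URIRef
    predicate: URIRef
    obj: Union[URIRef, Literal]

    def __post_init__(self):
        # Subject and predicate are always URIs; only the object may be text.
        if not isinstance(self.subject, URIRef):
            raise TypeError("subject must be a URI")
        if not isinstance(self.predicate, URIRef):
            raise TypeError("predicate must be a URI")
        if not isinstance(self.obj, (URIRef, Literal)):
            raise TypeError("object must be a URI or a literal")


# Hypothetical statement: "the resource at example.org/project has the title ..."
statement = Triple(
    URIRef("http://example.org/project"),
    URIRef("http://purl.org/dc/terms/title"),
    Literal("An example project"),
)
print(statement.obj.value)  # An example project
```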

Choice of descriptive vocabulary.

A multitude of vocabularies exist for describing resources. Some have developed within the “traditional” cataloging environment found in libraries, archives and museums, like MARC21, Dublin Core (DC) and EAD. Others have been generated by communities completely separate from this environment, such as PHIN and DDMS. Still others have come specifically from the online community, like FOAF and DOAP. Most of the vocabularies exist to describe a particular kind of resource.

This is only a very small sampling of existing vocabularies. Because most of them serve a specific niche, they are not really translatable to other types of uses. The other limiting factor is whether a schema exists to describe how the use of the vocabulary should be interpreted by RDF-compatible software. MARC21, for example, doesn’t really seem to be defined this way yet. Crosswalks between MARC and other vocabularies exist and a schema for MARCXML has been developed, but neither of these really addresses the use of MARC with RDF.

On the other hand, DC is well known and commonly used both for web resources and physical objects. It is simple, with a much smaller set of elements than that found in MARC or some of the other vocabularies. This also works against it. While it is simple to apply to the resource of choice, some of the granularity inherent in MARC is lost. A date can be described with no more specificity than that it is a date. Whether it is the date of creation, the date of publication or the last date something was updated is not known (DCMI, 2009). For this project, the simplicity and flexibility of DC are points in its favor, as the timeline doesn’t allow for the thorough investigation of another more intricate vocabulary.

Choice of encoding method.

With the choice of descriptive vocabulary in hand, the question becomes how can it be encoded into the CBSR web pages? Basically, there has to exist somewhere on the web a “profile” that defines the rules for marking up the data for both the person creating the semantic markup and for the software that wants to then extract the markup. Anyone can create a profile and post it on the web; however, it can be dangerous to use just any profile. Web pages are the epitome of ephemera: here today, gone tomorrow. Using a profile that could disappear on the whim of a website owner means running a serious risk of losing the rules that define a page’s semantic markup and thus rendering it unintelligible to semantically-aware software. There are currently four major established profiles available for marking up X/HTML semantically.

The first and most basic profile, called DC-HTML, uses only the meta and link tags available in the X/HTML header. First, the encoding profile is defined in the head tag.

<head profile="http://dublincore.org/documents/2008/08/04/dc-html/">

Then the schema for the descriptive vocabulary is defined in the link tag.

<link rel="schema.DCTERMS" href="http://purl.org/dc/terms/" />

The schema declaration works similarly to a namespace declaration, allowing the property name to be concatenated with its URI.

DCTERMS.title = http://purl.org/dc/terms/title
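The concatenation can be sketched in a few lines. The function name expand_property and the schemas dictionary are hypothetical; the prefix and URI are the ones declared in the link tag above.

```python
def expand_property(name, schemas):
    """Turn a prefixed name such as 'DCTERMS.title' into a full property URI."""
    prefix, dot, local = name.partition(".")
    if not dot or prefix not in schemas:
        raise KeyError("no schema declared for " + repr(name))
    return schemas[prefix] + local


# Mirrors <link rel="schema.DCTERMS" href="http://purl.org/dc/terms/" />
schemas = {"DCTERMS": "http://purl.org/dc/terms/"}
print(expand_property("DCTERMS.title", schemas))  # http://purl.org/dc/terms/title
```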

Meta tags encode literals and link tags encode non-literals. Meta tags take the attributes “name” and “content.”

Table 1: Meta tag attributes

| Attribute | Function | RDF Description |
|---|---|---|
| name | Encodes the property name | Predicate |
| content | Contains the literal | Object |

Link tags take the attributes “rel,” “rev” and “href.”

Table 2: Link tag attributes

| Attribute | Function | RDF Description |
|---|---|---|
| rel | Encodes the property name | Predicate |
| rev | Encodes the property name (in a reverse relationship from rel) | Predicate |
| href | Contains the referenced resource | Object |

This encoding method doesn’t require a special DTD and no base href has to be defined (DCMI, 2008b).

This first method seems somewhat like a stopgap measure to be used until better standards can supplant it. Meta tags don’t identify data within the body of the page.2 The data therefore has to be extracted from the body and separately marked up in the head, meaning that if changed, it must be changed in two places. See appendix C, Example 1 for an example of how DC-HTML might be encoded for CBSR’s main page.
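To make the mechanics concrete, the following sketch shows how software might pull DC-HTML statements out of a page header using only Python's standard html.parser module. It is a simplified illustration, not the GRDDL transformation the profile actually relies on: the class name DCHTMLExtractor, the sample markup and the example.org addresses are assumptions, the page URL is simply taken as the subject, and the rev attribute is ignored.

```python
from html.parser import HTMLParser


class DCHTMLExtractor(HTMLParser):
    """Collects DC-HTML statements from meta and link tags (rev is ignored)."""

    def __init__(self, page_uri):
        super().__init__()
        self.page_uri = page_uri  # assumed subject of every statement
        self.schemas = {}         # prefix -> vocabulary URI
        self.pending = []         # (kind, property name, value)

    def handle_starttag(self, tag, attrs):
        a = {name: value or "" for name, value in attrs}
        rel = a.get("rel", "")
        if tag == "link" and rel.startswith("schema."):
            self.schemas[rel[len("schema."):]] = a.get("href", "")
        elif tag == "link" and rel and "href" in a:
            self.pending.append(("non-literal", rel, a["href"]))
        elif tag == "meta" and "name" in a and "content" in a:
            self.pending.append(("literal", a["name"], a["content"]))

    def triples(self):
        """Keep only the properties whose prefix matches a declared schema."""
        found = []
        for kind, name, value in self.pending:
            prefix, dot, local = name.partition(".")
            if dot and prefix in self.schemas:
                found.append((self.page_uri, self.schemas[prefix] + local, kind, value))
        return found


# Hypothetical page header, not CBSR's actual markup.
sample = """
<html><head profile="http://dublincore.org/documents/2008/08/04/dc-html/">
<link rel="schema.DCTERMS" href="http://purl.org/dc/terms/" />
<meta name="DCTERMS.title" content="Example project page" />
<link rel="DCTERMS.isPartOf" href="http://example.org/" />
<link rel="stylesheet" href="style.css" />
</head><body></body></html>
"""

extractor = DCHTMLExtractor("http://example.org/project")
extractor.feed(sample)
dc_triples = extractor.triples()
for triple in dc_triples:
    print(triple)
```

The stylesheet link is skipped because its rel value carries no declared prefix, which is how ordinary header tags and semantic ones can coexist.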

The next two profiles both allow markup in both the head and in the body of the X/HTML document. This has the advantage of allowing more flexibility with the markup positioning. Both extend the use of existing X/HTML coding within the web page. Bits of data are enclosed with tags such as "p," "div" or "span." Formatted as properties of the attributes of these tags, metadata vocabulary terms can be linked to the displayed text or to other URIs.

The first of these two profiles is called eRDF. It was developed by Talis, a corporation at the forefront of semantic web development and it predates the specification issued by the W3C (see the description of RDFa below). The profile is fairly simple and straightforward, but also less sophisticated than W3C’s RDFa. Also working against eRDF is the fact that it was developed by a private company, not a standards-issuing body. Should Talis refocus its interests in another direction, there is always the possibility that their profile could disappear from the web.

Encoding this profile begins the same way as DC-HTML. The profile is declared in the head tag.

<head profile="http://purl.org/NET/erdf/profile">

However, the next step is to explicitly define the base URI.

<base href="http://some/example" />

The schema for the descriptive vocabulary is then declared in the link tag as previously.

<link rel="schema.foaf" href="http://xmlns.com/foaf/0.1/" />

Here again, the schema declaration is similar to the concept of using namespaces but is not actually a namespace declaration. In fact, because the profile does not use namespace syntax, the semantic properties have to be formatted in two different ways: with a period in the head and with a dash in the body. Thus the example in Appendix C uses “DCTERMS.subject” in the head and “DCTERMS-hasPart” in the body.
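A consumer of eRDF therefore has to recognize both spellings of the same prefixed name. The sketch below is one simplified way this might be handled; the function name erdf_property_uris is hypothetical, and it assumes a class attribute may hold several space-separated tokens, only some of which are semantic.

```python
def erdf_property_uris(tokens, schemas):
    """Map period-form (head) or dash-form (body) names to property URIs."""
    uris = []
    for token in tokens.split():
        for separator in (".", "-"):
            prefix, found, local = token.partition(separator)
            if found and prefix in schemas:
                uris.append(schemas[prefix] + local)
                break
    return uris


schemas = {"DCTERMS": "http://purl.org/dc/terms/"}
print(erdf_property_uris("DCTERMS.subject", schemas))
print(erdf_property_uris("sidebar DCTERMS-hasPart", schemas))
```

Tokens without a declared prefix, such as the purely presentational "sidebar", are simply ignored.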

The meta and link tags can be used in the header as described above for DC-HTML, but normal XHTML tagging can also be used in the body. X/HTML tags are qualified with the following attributes:

Table 3: eRDF general attributes

| Attribute | Function | RDF Description |
|---|---|---|
| id | Defines the start of a new separate resource | Subject |
| class | Encodes the property name | Predicate |
| title | Used to assign a literal to the class property, in place of the text displayed in the XHTML document | Object |

The “a” (anchor) tag is qualified with rel, rev and href.

Table 4: eRDF anchor attributes

| Attribute | Function | RDF Description |
|---|---|---|
| rel | Encodes the property name | Predicate |
| rev | Encodes the property name (in a reverse relationship from rel) | Predicate |
| href | Contains the referenced resource | Object |

There is no special DTD required (Talis, 2006). See appendix C, Example 2 for an example of how eRDF might be used for encoding CBSR’s main page.

The next profile, RDFa, is the most sophisticated of the four major profiles, as well as being more recent than the two profiles described above. Developed by the worldwide standards-issuing body W3C, it is also probably the most stable of the profiles. However, it uses a special DTD that only validates with XHTML documents. It is perfectly feasible to use it with HTML documents, but said documents won’t validate against any HTML DTD (W3C, 2008b, section 1.1).

The RDFa specification states that XHTML document authors should use XML declarations in all their documents. XHTML document authors must use an XML declaration when the character encoding of the document is other than the default UTF-8 or UTF-16 and no encoding is specified by a higher-level protocol.

<?xml version="1.0" encoding="iso-8859-1"?>
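The rule about when the declaration becomes mandatory can be restated as a small check. This is only a paraphrase of the requirement as summarized above; the function name and its two parameters are hypothetical, with protocol_charset standing for an encoding announced by a higher-level protocol such as an HTTP header.

```python
def xml_declaration_required(document_encoding, protocol_charset=None):
    """Return True when the rule described above makes the declaration mandatory."""
    default_encodings = {"utf-8", "utf-16"}
    uses_default = document_encoding.lower() in default_encodings
    # Mandatory only for a non-default encoding that nothing else announces.
    return not uses_default and protocol_charset is None


print(xml_declaration_required("iso-8859-1"))                # True
print(xml_declaration_required("UTF-8"))                     # False
print(xml_declaration_required("iso-8859-1", "iso-8859-1"))  # False
```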

The document may include an RDFa-specific doctype.

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.0//EN"
    "http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd">

It must include a default namespace declaration for XHTML in the html tag.

<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">

Any vocabularies used for description are also declared here using namespace syntax as well.

<html xmlns="http://www.w3.org/1999/xhtml" version="XHTML+RDFa 1.0"
    xmlns:dcterms="http://purl.org/dc/terms/"
    xml:lang="en">

XHTML+RDFa documents should be labeled with the Internet Media Type "application/xhtml+xml" as defined in [RFC3236]. The document may include a profile attribute in the head tag.

<head profile="http://www.w3.org/1999/xhtml/vocab">

Finally, the base URI does not need to be set explicitly, unless it needs to be set to something other than the default URI (the URL from which the document is served) (W3C, 2008b, section 4.1). RDFa re-uses the following XHTML attributes.

Table 5: XHTML attributes

| Attribute | Function | RDF Description |
|---|---|---|
| rel | Encodes the property name | Predicate |
| rev | Encodes the property name (in a reverse relationship from rel) | Predicate |
| content | Contains the literal | Object |
| href | Contains the referenced resource | Object |
| src | Contains the embedded referenced resource | Object |

The following attributes are RDFa specific.

Table 6: RDFa attributes

| Attribute | Function | RDF Description |
|---|---|---|
| about | Defines a separate thing to be described | Subject |
| property | Encodes the property name | Predicate |
| resource | Encodes a URI that is not also a URL | Object |
| datatype | To express the datatype of a literal | |
| typeof | To express the RDF type of the subject | |

See appendix C, Example 3 for an example of RDFa using the main CBSR page.
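How the attributes in Tables 5 and 6 combine into statements can be sketched with Python's standard xml.etree.ElementTree module. This is a deliberately reduced illustration and not a conforming RDFa processor: it handles only about, property, content, rel, href and resource, treats prefix declarations as document-wide, does not resolve relative URIs, and ignores rev, src, datatype and typeof. The function name rdfa_triples, the sample markup and the example.org addresses are assumptions for the example.

```python
import io
import xml.etree.ElementTree as ET


def rdfa_triples(xhtml, base_uri):
    """Extract (subject, predicate, (kind, value)) statements in document order."""
    source = xhtml.encode("utf-8")
    # Collect the xmlns:prefix declarations so prefixed names can be expanded.
    prefixes = dict(
        ns for _event, ns in ET.iterparse(io.BytesIO(source), events=("start-ns",))
    )
    root = ET.fromstring(source)
    triples = []

    def expand(name):
        prefix, colon, local = name.partition(":")
        if colon and prefix in prefixes:
            return prefixes[prefix] + local
        return None

    def walk(element, subject):
        if element.get("about"):
            subject = element.get("about")  # a new thing is being described
        predicate = expand(element.get("property", ""))
        if predicate:
            value = element.get("content")
            if value is None:  # fall back to the displayed text
                value = "".join(element.itertext()).strip()
            triples.append((subject, predicate, ("literal", value)))
        target = element.get("resource") or element.get("href")
        predicate = expand(element.get("rel", ""))
        if predicate and target:
            triples.append((subject, predicate, ("uri", target)))
        for child in element:
            walk(child, subject)

    walk(root, base_uri)
    return triples


# Hypothetical page, not CBSR's actual markup.
sample = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:dcterms="http://purl.org/dc/terms/" xml:lang="en">
  <head><title>Example</title></head>
  <body>
    <h1 property="dcterms:title">Example Project</h1>
    <p about="http://example.org/other">
      <span property="dcterms:date" content="2009">last year</span>
      <a rel="dcterms:isPartOf" href="http://example.org/">home</a>
    </p>
  </body>
</html>"""

rdfa = rdfa_triples(sample, "http://example.org/project")
for triple in rdfa:
    print(triple)
```

The heading is attached to the page's own URI, while the two statements inside the paragraph take the URI given in its about attribute as their subject. The content attribute supplies the machine-readable value "2009" even though the reader sees "last year".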

The final profile is actually a group of profiles, called Microformats. Each one is designed for one type of markup. hCard describes contact information (implementing a version of the vCard specification), hCalendar marks up events and hResume marks up resumes and CVs. Other microformats still under development will describe recipes, citations and currency (About microformats, n.d.). In contrast to DC-HTML, eRDF and RDFa, which are all vocabulary-neutral, each microformat has its own vocabulary and syntax to describe one type of resource. They are being developed and maintained by the open source community, so it seems likely that they will be around for a while. Unfortunately, the development of individual microformats seems to be generated by individual need on an ad-hoc basis rather than as part of an overall planning and development process. A microformat for any given resource type may or may not even exist.

Ultimately, the choice of which method to use depends on the resource being described and for what purpose. Does a metadata harvester, for example, expect to find a particular profile and/or descriptive vocabulary? Guidelines for general use don’t really exist. For this project, Dublin Core is the descriptive vocabulary, embedded in XHTML using the RDFa standard.

Methodology

First, a basic table was created, describing the elements of the metadata vocabulary and stating, very generally, their possible values for CBSR’s pages. See Appendix A. Then, the investigation of the descriptive vocabulary and encoding scheme resulted in a secondary literature search specifically into the available software and guidelines for encoding and parsing RDF using the RDFa standard. Finally, working from the list of elements and the source code for each page, the RDF relationships to be marked up were mapped out. See appendix D.

A development version of the CBSR web site was created in a separate directory on the server. This served two purposes: to allow work on the pages without having to worry about disrupting any real time services and to provide a stable set of pages to work with, as CBSR was concurrently in the process of shifting its website into a content management system hosted by UC Riverside. Using the tables showing all of the RDF relationships as checklists, semantic markup was manually added to the web pages using the HTML editor Amaya, and the generated RDF relationships were then checked using a Firefox extension named Fuzz.

Discussion

A discussion of the descriptive vocabulary, Dublin Core, and the encoding scheme, RDFa, takes place above. The decision to use RDFa resulted in a need for more specific guidelines for its use, as well as a need for software that could validate and/or parse the markup. There was nothing in the library-related literature about using RDFa. A more general search for literature specific to using RDF retrieved a surprising number of items, given its newness, but not much in the way of practical how-to for the end-user. The majority of information was geared towards introducing the basic concept of the semantic web or towards developers (“how to build ontologies”) and not towards end-users. The RDFa Wiki comes the closest by having an actual “Best Practices” page (Best-practices, 2008). But even the guidance there doesn’t go beyond the level of “make sure your pages validate” (good advice that is probably ignored far too often).

A couple of websites profess to check the validity of RDFa markup (W3C [2009b] and Swignition [n.d.]). W3C also provides an HTML editor, called Amaya, that validates the XHTML+RDFa doctype (2009a). The Firefox extension Fuzz will extract and display the RDF relationships in a web page for trouble-shooting purposes (Digital Bazaar, n.d.). On a more general level of RDFa support, similar to Yahoo’s SearchMonkey project, Google is developing support for what it calls “rich snippets” (Google, 2009). In addition to the previously mentioned Operator and Fuzz, the RDFa Wiki lists other examples of extensions or plug-ins such as the W3C’s JavaScript bookmarklets and the SIOC Project’s Semantic Radar (Consume, 2009). There is also a whole group of software libraries developed for the various scripting languages, including everything from C to Ruby (Consume, 2009).

Choosing the vocabulary and encoding method was fairly straightforward and obvious. Even learning the “rules” for what would work and what would not in the markup was not that difficult. The most complex part of the markup process turned out to be the decisions about how to mark up the relationships expressed on each page. Adding a layer of semantic markup to web pages seems like a simple enough process on the face of it. However, we don’t consciously consider the implicit relationships between the things represented on those pages, and explicitly defining those implicit relationships is not as simple as it seems. Consider this sentence from the “About CBSR” web page:

During 2005-7 it worked with a consortium of libraries in the Consortium of University Research Libraries in the British Isles to add on line a record of all their pre-1701 British imprints, estimated at more than 40,000 copies.

First of all, while the page itself is about CBSR, the “it” in this sentence is not CBSR; “it” is defined a couple of sentences previously in the same paragraph as being the ESTC. Second, even though it is not named, this entire sentence is describing the Britain in Print (BIP) project. Finally, BIP was a Consortium of University Research Libraries (CURL) project. Not one of those three things is explicitly stated by that sentence. The tangle of relationships finally worked out to be:

Table 7: RDF relationships for Britain in Print
Subject | Predicate | Object
http://estc.ucr.edu/ | dcterms:contributor | http://www.britaininprint.net/
http://www.britaininprint.net/ | dcterms:date | 2005-2007
http://www.britaininprint.net/ | dcterms:creator | http://www.curl.ac.uk/
http://www.britaininprint.net/ | dcterms:description | add on line a record of all their pre-1701 British imprints, estimated at more than 40,000 copies.
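
The statements in Table 7 can be pictured as plain subject/predicate/object tuples. The Python sketch below is only an illustration of that structure; the names `TRIPLES` and `objects_of` are hypothetical and do not belong to any RDFa tool used in this project. It shows how the unnamed project (BIP) becomes the hub that links the ESTC, CURL and the date range.

```python
# Table 7 held as (subject, predicate, object) tuples.
# All variable and function names here are illustrative assumptions.
ESTC = "http://estc.ucr.edu/"
BIP = "http://www.britaininprint.net/"
CURL = "http://www.curl.ac.uk/"

TRIPLES = [
    (ESTC, "dcterms:contributor", BIP),
    (BIP, "dcterms:date", "2005-2007"),
    (BIP, "dcterms:creator", CURL),
    (BIP, "dcterms:description",
     "add on line a record of all their pre-1701 British imprints, "
     "estimated at more than 40,000 copies."),
]


def objects_of(subject, predicate, triples=TRIPLES):
    """Return every object stated for the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


# The sentence never names BIP, yet every statement passes through it.
print(objects_of(ESTC, "dcterms:contributor"))
print(objects_of(BIP, "dcterms:creator"))
```

Keeping each statement as a separate tuple makes it easier to see that none of the three implicit facts discussed above is lost once the sentence has been decomposed.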

Another issue with identifying relationships was defining a stopping point. Are relationships always “thing1 relates to thing2?” The ESTC is now in its third decade of existence. It turns out that there are many, many, many things it relates to on the web. A search in WorldCat for a description of the third edition of the CD-ROM version of the ESTC retrieved not just that edition but also the first and second editions, assorted editions of the original bibliographies (Pollard & Redgrave and Wing; see note 3) used to build parts of the database and the various modes of ESTC access that have been available over the years (RLIN/Eureka and BLAISE). A search of the Internet retrieves a Wikipedia entry, a multitude of library research guides, and references to the project in various blogs.

Obviously it would be impossible to try to include everything. But, for example, should the first and second editions of the CD-ROM be included? Technically, the third edition has replaced them both. Should that relationship be expressed instead? In the end, only the third edition of the CD-ROM was included, and only the most obvious relationships were marked up. There is a whole other level of relationships that could have been marked up. It could, in fact, be unnecessary to do so, in theory at least. If all (or most) pages are marked up semantically, those relationships should be expressed somewhere else and would not then need to be duplicated in this markup.

Other issues included the best way to plan out the relationships, what to do about reciprocal relationships between pages, what to do about the same relationship expressed multiple times, and where in the XHTML document the relationships should be expressed. It seemed logical to lay out all of the possible relationships based on the available descriptive vocabulary before ever looking at the actual web page to be marked up, but the web pages themselves were not predictable enough to make that work. It was easier (though probably less consistent) to actually comb through each page with the list of elements in hand, asking which element described this particular relationship. Also, some relationships varied from my original conception of them depending on whether there was a URI that could be used or only text on the web page.

Another question that was not resolved was the problem of what to do about two CBSR web pages that have reciprocal relationships. CBSR has the part ESTC and the ESTC is part of CBSR. This is the same relationship expressed two different ways. Does it need to be expressed on both pages? The relationship was marked up on both pages for this project.

Table 8: RDF relationship between CBSR and ESTC
Subject | Predicate | Object
http://cbsr.ucr.edu/ | dcterms:hasPart | http://estc.ucr.edu/
http://estc.ucr.edu/ | dcterms:isPartOf | http://cbsr.ucr.edu/
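
To make the reciprocity concrete, the following Python sketch assumes a consumer that treats dcterms:hasPart and dcterms:isPartOf as inverses of one another. The `INVERSE` table and the `with_inverses` function are hypothetical; nothing here claims that Fuzz or any other tool examined for this project behaves this way. Under that assumption, marking up either page alone yields the same pair of statements as marking up both.

```python
# Hypothetical inverse table: assumes the consumer knows these two
# Dublin Core terms describe the same relationship from opposite ends.
INVERSE = {
    "dcterms:hasPart": "dcterms:isPartOf",
    "dcterms:isPartOf": "dcterms:hasPart",
}


def with_inverses(triples):
    """Return the given triples plus the inverse of each one that has a known inverse."""
    closed = set(triples)
    for s, p, o in triples:
        if p in INVERSE:
            closed.add((o, INVERSE[p], s))
    return closed


# Row 1 of Table 8 (as found on the CBSR page) and row 2 (the ESTC page).
cbsr_page = {("http://cbsr.ucr.edu/", "dcterms:hasPart", "http://estc.ucr.edu/")}
estc_page = {("http://estc.ucr.edu/", "dcterms:isPartOf", "http://cbsr.ucr.edu/")}

# Either page on its own expands to both rows of Table 8.
print(with_inverses(cbsr_page) == with_inverses(estc_page))
```

Without such a rule on the consumer side, the two rows remain independent statements, which is one argument for marking up both pages as was done here.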

A related question was whether a relationship that occurs several times on one page needs to be marked up each time. This was easy to test and resolve. The same relationship was marked up in several places on the web page and the resulting RDF relationships were examined in Fuzz, which showed that the multiple markups produced identical statements. Thus, each relationship only needs to be marked up once on any given page.
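
This result is consistent with the RDF model, in which a graph is a set of triples, so a repeated statement adds nothing new. A minimal Python sketch of the idea follows; the variable names are illustrative only.

```python
# The same relationship marked up three times on one page
# (illustrative data, using the CBSR/ESTC relationship from Table 8).
triple = ("http://cbsr.ucr.edu/", "dcterms:hasPart", "http://estc.ucr.edu/")
extracted = [triple, triple, triple]

# Collecting the statements into a set collapses the repeats.
graph = set(extracted)
print(len(extracted), "markups ->", len(graph), "statement")
```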

Despite the fact that the main selling point of RDFa is that the user can mark up data directly in the body of the XHTML document, this is not always actually possible. If there was nothing in the body to which markup could be attached, it was placed in the head of the document. It is also of interest to note that it was actually easier to do the markup in the head than in the body of the document.

From a practical perspective, the basic markup had a shallow learning curve. The RDFa standard also allows for more sophisticated and complex possibilities that were not explored. Adding markup manually to the pages was no more than a little tedious and it went fairly quickly. Several things of minor importance were discovered along the way. First, since the web pages were built professionally in the first place, they have significant presentation markup, comments, links, etc. already embedded in the XHTML. As long as the semantic markup is done correctly, it appears that the two markup schemes do not affect each other in any way. The web browser, for example, uses the href attribute for the link displayed on the web page and the RDFa extraction software correctly retrieves the same href attribute as the subject/object of the RDF relationship. Second, RDFa cannot handle multiple subjects for the same predicate/object. The ESTC and the CNP both have the funder NEH, but since they are different projects, there is no way to attach the URI for both of them to the same bit of NEH data on the page. Third, the software that extracts the RDF relationships expressed using RDFa will not use a meta tag with the name and content attributes in the head. It will, however, pick up any link tags using the rel/rev and href attributes, including the style sheet link. Finally, it is important to remember that not every link or bit of text on a page contains a relationship that needs to be described.
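
The behavior observed with meta and link tags can be imitated with a very reduced extractor. The Python sketch below is not the W3C RDFa processing algorithm and not the code of any tool named in this article; `LinkTripleSketch` and the sample markup are hypothetical. It simply keeps elements that carry rel with href (or property with content) and skips meta elements that carry only name and content, which reproduces the effect described above, including the style sheet link being picked up.

```python
from html.parser import HTMLParser


class LinkTripleSketch(HTMLParser):
    """Reduced stand-in for an RDFa extractor (an assumption, not the real algorithm).

    It does not resolve relative URIs, expand prefixes, or track nested subjects.
    """

    def __init__(self, page_uri):
        super().__init__()
        self.page_uri = page_uri
        self.triples = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if "rel" in a and "href" in a:
            # link or anchor: the href becomes the object
            self.triples.append((self.page_uri, a["rel"], a["href"]))
        elif "property" in a and "content" in a:
            # literal value supplied through the content attribute
            self.triples.append((self.page_uri, a["property"], a["content"]))
        # meta with only name/content matches neither branch and is skipped


# Hypothetical head fragment for illustration.
SAMPLE = """
<head>
  <meta name="DCTERMS.title" content="skipped by this sketch" />
  <link rel="stylesheet" href="style.css" />
  <link rel="dcterms:hasPart" href="http://estc.ucr.edu/" />
</head>
"""

parser = LinkTripleSketch("http://cbsr.ucr.edu/")
parser.feed(SAMPLE)
for t in parser.triples:
    print(t)
```

Because the sketch ties every statement to a single page-level subject, it also hints at why one piece of text cannot easily serve two different subjects at once.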

There were limitations with the descriptive vocabulary as well. DC does not describe relationships between people but a vocabulary like Friend of a Friend (FOAF) does. DC also does not have an element to identify funding institutions. MARC relator terms have been encoded for use in the RDF environment, and so the term “funder” was used for this purpose.
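
Mixing vocabularies in this way depends on each prefix being bound to a namespace URI. The Python sketch below shows the expansion step in isolation. The dcterms URI is the one declared in Appendix C and the FOAF URI is the project's published namespace, but the `marcrel` prefix, its placeholder URI and the local name `funder` are assumptions for illustration only and should be replaced by whatever namespace is actually declared for the MARC relator terms.

```python
# Prefix bindings; only the first two are real namespace URIs.
PREFIXES = {
    "dcterms": "http://purl.org/dc/terms/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    # Placeholder only: not the real MARC relators namespace.
    "marcrel": "http://example.org/marc-relators/",
}


def expand(curie, prefixes=PREFIXES):
    """Turn a prefix:name pair into a full URI using the declared prefixes."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in prefixes:
        raise KeyError("no namespace declared for %r" % curie)
    return prefixes[prefix] + local


print(expand("dcterms:creator"))
print(expand("marcrel:funder"))
```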

One final lingering question on the usage of any particular vocabulary remains. RDF is designed to use any vocabulary; it is part of the core idea of RDF. However, to see, for example, improved search results in a semantic web browser, does the user who is searching have to use the same descriptive vocabulary in the query as the one used to encode the page for which they are searching? This would seemingly undermine the whole design of RDF being vocabulary-neutral, but this issue is not addressed, or even hinted at, in the literature.

Conclusion

The software that is currently available is primitive and buggy. Fuzz does an admirable job of displaying the RDF relationships, but not in the order in which they are encoded in the web page and not in any user-specified order. Amaya did not like the MARC relator terms namespace declaration being added to the list of pre-sets to be automatically added to a web document with the doctype. The addition of that namespace declaration resulted in gibberish instead; gibberish that also overwrote the entire existing document. The Swignition validation service never did seem to work. Clearly, while development has begun, the software has a long way to go before it will be ready for the mainstream.

Manual coding of the embedded RDF, while less resource intensive than creating an external RDF document, seems unlikely to appeal to more than a geeky segment of the web authoring population. Learning the syntax is easy enough; however, identifying the relationships is time-consuming and requires a great deal of in-depth knowledge of the resources involved. It is hard to imagine someone putting in that much work for no tangible benefit as yet.

Meaningful semantic web coding is not going to happen without sophisticated software to make it as easy as HTML coding. Even so, it seems likely to be a situation where better, more knowledgeable and careful web designers will document deeper and more meaningful relationships. Lazy web authors will have, at most, high-level relationships that barely skim the surface and provide little added value to their pages. Perhaps, if all web pages are marked up, that will be enough?

Notes

1. Actually, that is not strictly true; Yahoo’s SearchMonkey should be able to extract any semantic information added to CBSR’s web pages, but SearchMonkey is a tool that gathers data for developers, not for end-users.

2. O'Donnell (2006, Future Developments section) says that usage of the meta tag within the body is implied as possible in the XHTML2 specification, but even three years later, this standard does not seem to be in wide use yet.

3. Pollard, A. & Redgrave, G. R. (1926). A short-title catalogue of books printed in England, Scotland, & Ireland and of English books printed abroad, 1475-1640. London: Bibliographical Society; Wing, D. (1945). Short-title catalogue of books printed in England, Scotland, Ireland, Wales, and British America, and of English books printed in other countries, 1641-1700. New York: Index Society.

References

About microformats. (n.d.). Retrieved October 4, 2009 from http://microformats.org/about

Alexander, K. (2007). eRDF detector. Userscripts.org. Retrieved October 17, 2009 from http://userscripts.org/scripts/show/8260

Altova. (n.d.) SemanticWorks Semantic Web tool. Retrieved November 27, 2009 from http://www.altova.com/semanticworks.html

Angjeli, A., Isaac, A., Cloarec, T., Martin, F., Meji, L. van der, Matthezing, H., et al. (2009). Semantic web and vocabulary interoperability: an experiment with illumination collections. ICBC, 38(2), 25-29.

Arch-int, N. & Sophatsathit, P. (2003). A semantic information gathering approach for heterogeneous information sources on WWW. Journal of Information Science, 29(5), 357-374.

Best-practices. (2008). RDFa Wiki. Retrieved November 27, 2009 from http://rdfa.info/wiki/Best-practices

Centers for Disease Control and Prevention. (2005). Public Health Information Network vocabulary metadata standards. Version 1.2. 08/08/2005. Retrieved October 11, 2009 from http://www.cdc.gov/phin/library/documents/pdf/PHIN%20Vocabulary%20Metadata%20V1.2.pdf

Chavarriaga, E. & Macias, J. A. (2009). A model-driven approach to building modern semantic web-based user interfaces. Advances in Engineering Software, 40, 1329-1334.

Consume. (2009). RDFa Wiki. Retrieved November 27, 2009 from http://rdfa.info/wiki/Consume

Craig, J. (2007). hAccessibility. The Web Standards Project. Retrieved November 22, 2009 from http://www.webstandards.org/2007/04/27/haccessibility/

Damiani, E. & Fugazza, C. (2007). Toward semantics-aware management of intellectual property rights. Online Information Review, 31(1), 59-72.

DC-dot: Dublin Core metadata editor. (n.d.). Retrieved October 17, 2009 from http://www.ukoln.ac.uk/cgi-bin/dcdot.pl?n=0&guesspublisher=yes

DCMI/RDA Task Group. (2008). DCMI/RDA Task Group Wiki. Retrieved October 10, 2009 from http://dublincore.org/dcmirdataskgroup/

Dublin Core Metadata Initiative. (2005) Using Dublin Core. Retrieved October 17, 2009 from http://dublincore.org/documents/usageguide/

Dublin Core Metadata Initiative. (2008a). The Dublin Core metadata registry: Promoting the discovery and reuse of metadata. Retrieved October 10, 2009 from http://dcmi.kc.tsukuba.ac.jp/dcregistry/

Dublin Core Metadata Initiative. (2008b). Expressing Dublin Core metadata using HTML/XHTML meta and link elements. Retrieved November 27, 2009 from http://dublincore.org/documents/dc-html/

Dublin Core Metadata Initiative. (2009). DCMI metadata terms. Retrieved October 10, 2009 from http://www.dublincore.org/documents/dcmi-terms/

Digital Bazaar [msporny]. (n.d.). [Fuzz]. Retrieved November 27, 2009 from http://rdfa.digitalbazaar.com/fuzz/trac/wiki/WikiStart

Gateway to Educational Materials information. (2009). Retrieved October 17, 2009 from http://www.thegateway.org/about

Global Digital Format Registry. (n.d.). Retrieved October 17, 2009 from http://www.gdfr.info/

Google. (2009, October 26). Help us make the web better: An update on Rich Snippets. Webmaster Central Blog. Retrieved November 27, 2009 from http://googlewebmastercentral.blogspot.com/2009/10/help-us-make-web-better-update-on-rich.html

Guzmán Luna, J., Torres Pardo, D. & López García, A. N. (2006). Desarrollo de una ontología en el contexto de la web semántica a partir de un tesauro documental tradicional. Revista Interamericana de Bibliotecología, 29(2), 79-95.

Heery, R. & Wagner, H. (2002). A metadata registry for the semantic web. D-Lib Magazine, 8(5). Retrieved September 20, 2009 from http://www.dlib.org/dlib/may02/wagner/05wagner.html

Hildebrand, M., Ossenbruggen, J. van, Hardman, L. & Jacobs, G. (2009). Supporting subject matter annotation using heterogeneous thesauri: A user study in Web data reuse. International Journal of Human-Computer Studies, 67, 887-902.

International Federation of Library Associations and Institutions (2008). Declaring FRBR entities and relationships in RDF. Retrieved October 10, 2009 from http://www.ifla.org/files/cataloguing/frbrrg/namespace-report.pdf

JISC IE Metadata Schema Registry. (2009). Retrieved October 17, 2009 from http://www.ukoln.ac.uk/projects/iemsr/

Kolbitsch, J. & Krottmaier, H. (2006). The Use of HTML-Encoded Dublin Core in Academic and Educational Settings. Retrieved October 17, 2009 from http://www.kolbitsch.org/research/papers/2006-Dublin_Core_Analysis.pdf

Library of Congress. (n.d.). About [authorities & vocabularies]. Retrieved October 17, 2009 from http://id.loc.gov/authorities/about.html

Metadata. (2009). LISWiki. Retrieved November 27, 2009 from http://liswiki.org/wiki/Metadata

Microformat. (2009). Wikipedia. Retrieved October 4, 2009 from http://en.wikipedia.org/wiki/Microformats

The National Archives. (n.d.). The technical registry: PRONOM. Retrieved October 10, 2009 from http://www.nationalarchives.gov.uk/PRONOM/Default.aspx

National Science Digital Library. (2009). NSDL registry: Supporting metadata interoperability. Retrieved October 10, 2009 from http://metadataregistry.org/

O'Donnell, J. (2006). Naked Metadata. Retrieved October 17, 2009 from http://jod.id.au/tutorial/naked-metadata.html

Operator. (n.d.). Mike’s Musings: My musings about mozilla, microformats, me and my motivations. Retrieved October 17, 2009 from http://www.kaply.com/weblog/operator/

Piggy Bank. (2008). Retrieved October 17, 2009 from http://simile.mit.edu/wiki/Piggy_Bank

Swignition: Try Swignition online. (n.d.). Retrieved November 27, 2009 from http://buzzword.org.uk/swignition/try

Talantikite, H. N., Aissani, D. & Boudjlida, N. (2008). Semantic annotations for web services discovery and comparison. Computer Standards & Interfaces, 31, 1108-1117.

Talis. (2006). Rdf In Html. Retrieved October 12, 2009 from http://research.talis.com/2005/erdf/wiki/Main/RdfInHtml

Tonkin, E. & Strelnikov, A. (2009). Spinning a semantic web for metadata: Developments in the IEMSR. Ariadne, 59. Retrieved October 17, 2009 from http://www.ariadne.ac.uk/issue59/tonkin-strelnikov/

Uddin, M. N. & Janecek, P. (2006). Faceted classification in web information architecture: A framework for using semantic web tools. The Electronic Library 25(2), 219-233.

Unified Digital Format Registry (UDFR). (2009). Retrieved October 17, 2009 from http://www.udfr.org/

Vinyard, P. (2001). An analysis of embedded metadata usage on the world wide web. Unpublished master’s thesis, University of North Carolina, Chapel Hill. Retrieved November 27, 2009 from http://ils.unc.edu/MSpapers/2698.pdf

World Wide Web Consortium. (2004a). RDF primer: W3C recommendation 10 February 2004. Retrieved October 10, 2009 from http://www.w3.org/TR/rdf-primer/

World Wide Web Consortium. (2004b). RDF vocabulary description language 1.0: RDF Schema: W3C recommendation 10 February 2004. Retrieved October 4, 2009 from http://www.w3.org/TR/rdf-schema/

World Wide Web Consortium. (2004c). RDF/XML syntax specification (revised): W3C recommendation 10 February 2004. Retrieved October 17, 2009 from http://www.w3.org/TR/rdf-syntax-grammar/

World Wide Web Consortium. (2007). Gleaning resource descriptions from dialects of languages (GRDDL): W3C recommendation 11 September 2007. Retrieved October 4, 2009 from http://www.w3.org/TR/grddl/

World Wide Web Consortium. (2008a). RDFa primer: Bridging the human and data webs: W3C working group note 14 October 2008. Retrieved October 4, 2009 from http://www.w3.org/TR/xhtml-rdfa-primer/

World Wide Web Consortium. (2008b). RDFa in XHTML: Syntax and Processing. Retrieved November 27, 2009 from http://www.w3.org/TR/rdfa-syntax/

World Wide Web Consortium. (2008c). SPARQL query language for RDF: W3C recommendation 15 January 2008. Retrieved October 10, 2009 from http://www.w3.org/TR/rdf-sparql-query/

World Wide Web Consortium. (2009a). Welcome to Amaya. Retrieved November 27, 2009 from http://www.w3.org/Amaya/

World Wide Web Consortium. (2009b). Markup Validation Service. Retrieved November 27, 2009 from http://validator.w3.org/

World Wide Web Consortium. (2009c). Protocol for web description resources (POWDER): W3C working group note 1 September 2009: Primer. Retrieved October 4, 2009 from http://www.w3.org/TR/powder-primer/

World Wide Web Consortium. (2009d). Semantic web: Web ontology language (OWL). Retrieved October 4, 2009 from http://www.w3.org/2004/OWL/

World Wide Web Consortium. (2009e). SKOS simple knowledge organization system primer: W3C working group note 18 August 2009. Retrieved October 10, 2009 from http://www.w3.org/TR/skos-primer/

Yahoo! Developer Network. (2009). SearchMonkey. Retrieved November 27, 2009 from http://developer.yahoo.com/searchmonkey/

Additional Resources

Department of Defense. (2009). Department of Defense Discovery Metadata Specification Home Page. Retrieved November 29, 2009 from http://metadata.dod.mil/mdr/irs/DDMS/

DOAP: Description of a Project (n.d.). Retrieved October 17, 2009 from http://trac.usefulinc.com/doap

Dublin Core Metadata Initiative. (2009). Home. Retrieved October 23, 2009 from http://dublincore.org/

Dumbill, E. (n.d.). DOAP. Retrieved October 10, 2009 from http://trac.usefulinc.com/doap

EAD: Encoded Archival Description. (2009). Retrieved October 10, 2009 from http://www.loc.gov/ead/

Embedded RDF (2009). Wikipedia. Retrieved October 12, 2009 from http://en.wikipedia.org/wiki/Embedded_RDF

Feigenbaum, L. (2009 May). The 2009 semantic web landscape [slideshow]. Presentation at the PRISM Forum SIG Meeting, Luzern, Switzerland. Retrieved October 4, 2009 from http://www.slideshare.net/LeeFeigenbaum/semantic-web-landscape-2009

Fichter, D. & Wisniewski, J. (2008). Microformats and the search for meaning. Online, 32(4), 55-57.

The Friend of a Friend (FOAF) project. (n.d.). Retrieved October 10, 2009 from http://www.foaf-project.org/

Herman, I. (2009a June). Introduction to the Semantic Web (tutorial) [slideshow]. Presentation at the 2009 Semantic Technology Conference, San Jose, CA, USA. Retrieved October 4, 2009 from http://www.w3.org/2009/Talks/0615-SanJose-tutorial-IH/Slides.pdf

Herman, I. (2009b June). What is new in W3C land? [slideshow]. Presentation at the 2009 Semantic Technology Conference, San Jose, CA, USA. Retrieved October 4, 2009 from http://www.w3.org/2009/Talks/0615-SanJose-talk-IH/Slides.pdf

Msporny. (2008). RDFa basics [Video file]. Retrieved October 4, 2009 from http://www.youtube.com/watch?v=ldl0m-5zLz4&NR=1

OAIster …find the pearls. (2009). Retrieved October 17, 2009 from http://www.oaister.org/

Open Archives Initiative. (2008). The Open Archives Initiative Protocol for Metadata Harvesting. Retrieved October 23, 2009 from http://www.openarchives.org/OAI/openarchivesprotocol.html

RDFa. (2009). Wikipedia. Retrieved October 10, 2009 from http://en.wikipedia.org/wiki/Rdfa

Semantic Web. (2009). Wikipedia. Retrieved October 23, 2009 from http://en.wikipedia.org/wiki/Semantic_web

TEI: Text Encoding Initiative. (2009). Retrieved October 10, 2009 from http://www.tei-c.org/index.xml

World Wide Web Consortium. (2009). Semantic web: W3C semantic web frequently asked questions. Retrieved October 4, 2009 from http://www.w3.org/RDF/FAQ

Appendix A: A table of Dublin Core elements and their possible values

Notes

1. Elements refining other elements in the set are listed as sub-elements in the table.

2. Element descriptions from: DCMI Metadata Terms, retrieved November 27, 2009 from http://dublincore.org/documents/dcmi-terms/

Element/Sub-element | DCMI Description | Use/Values
accrualMethod | The method by which items are added to a collection. | NO
accrualPeriodicity | The frequency with which items are added to a collection. | NO
accrualPolicy | The policy governing the addition of items to a collection. | NO
audience | A class of entity for whom the resource is intended or useful. | researchers, scholars
audience/educationLevel | A class of entity, defined in terms of progression through an educational or training context, for which the described resource is intended. | NO
audience/mediator | An entity that mediates access to the resource and for whom the resource is intended or useful. In an educational context, a mediator might be a parent, teacher, teaching assistant, or care-giver. | NO
contributor | An entity responsible for making contributions to the resource. Examples of a Contributor include a person, an organization, or a service. Typically, the name of a Contributor should be used to indicate the entity. | contributing libraries
contributor/creator | An entity primarily responsible for making the resource. Examples of a Creator include a person, an organization, or a service. Typically, the name of a Creator should be used to indicate the entity. | ESTC: BL, ESTC/NA, AAS; CNP: CBSR; CCILA: CBSR; CDNC: CBSR; CNMA: CBSR
coverage | The spatial or temporal topic of the resource, the spatial applicability of the resource, or the jurisdiction under which the resource is relevant. Spatial topic and spatial applicability may be a named place or a location specified by its geographic coordinates. Temporal topic may be a named period, date, or date range. A jurisdiction may be a named administrative entity or a geographic place to which the resource applies. Where appropriate, named places or time periods can be used in preference to numeric identifiers such as sets of coordinates or date ranges. | print scope of projects
coverage/spatial | Spatial characteristics of the resource. Recommended best practice is to use a controlled vocabulary such as the Thesaurus of Geographic Names [TGN]. Where appropriate, named places can be used in preference to numeric identifiers such as sets of coordinates. | geographical scope for projects
coverage/temporal | Temporal characteristics of the resource. Where appropriate, named time periods can be used in preference to numeric identifiers such as date ranges. | time scope for projects
date | A point or period of time associated with an event in the lifecycle of the resource. Date may be used to express temporal information at any level of granularity. Recommended best practice is to use an encoding scheme, such as the W3CDTF profile of ISO 8601. | NO
date/available | Date (often a range) that the resource became or will become available. | database access
date/created | Date of creation of the resource. | When projects were created.
date/dateAccepted | Date of acceptance of the resource. Examples of resources to which a Date Accepted may be relevant are a thesis (accepted by a university department) or an article (accepted by a journal). | NO
date/dateCopyrighted | Date of copyright. | NO
date/dateSubmitted | Date of submission of the resource. Examples of resources to which a Date Submitted may be relevant are a thesis (submitted to a university department) or an article (submitted to a journal). | NO
date/issued | Date of formal issuance (e.g., publication) of the resource. | NO
date/modified | Date on which the resource was changed. | NO
date/valid | Date (often a range) of validity of a resource. | NO
description | An account of the resource. Description may include but is not limited to: an abstract, a table of contents, a graphical representation, or a free-text account of the resource. | Description of projects
description/abstract | A summary of the resource. | NO
description/tableOfContents | A list of subunits of the resource. | NO
format | The file format, physical medium, or dimensions of the resource. Examples of dimensions include size and duration. Recommended best practice is to use a controlled vocabulary such as the list of Internet Media Types [MIME]. | NO
format/extent | The size or duration of the resource. | NO
format/medium | The material or physical carrier of the resource. | NO
identifier | An unambiguous reference to the resource within a given context. Recommended best practice is to identify the resource by means of a string conforming to a formal identification system. | NO
identifier/bibliographicCitation | A bibliographic reference for the resource. Recommended practice is to include sufficient bibliographic detail to identify the resource as unambiguously as possible. | NO
instructionalMethod | A process, used to engender knowledge, attitudes and skills, that the described resource is designed to support. Instructional Method will typically include ways of presenting instructional materials or conducting instructional activities, patterns of learner-to-learner and learner-to-instructor interactions, and mechanisms by which group and individual levels of learning are measured. Instructional methods include all aspects of the instruction and learning processes from planning and implementation through evaluation and feedback. | lesson plans for CDNC
language | A language of the resource. Recommended best practice is to use a controlled vocabulary such as RFC 4646. | English or Spanish
provenance | A statement of any changes in ownership and custody of the resource since its creation that are significant for its authenticity, integrity, and interpretation. The statement may include a description of any changes successive custodians made to the resource. | ESTC only
publisher | An entity responsible for making the resource available. | ESTC: British Library and ESTC/NA; CNP: CBSR; CCILA: CBSR; CDNC: CBSR; CNMA: CBSR
relation | A related resource. Recommended best practice is to identify the related resource by means of a string conforming to a formal identification system. | miscellaneous resources related to projects
relation/conformsTo | An established standard to which the described resource conforms. | Cataloging standards.
relation/hasFormat | A related resource that is substantially the same as the pre-existing described resource, but in another format. | various incarnations of ESTC
relation/hasPart | A related resource that is included either physically or logically in the described resource. | CBSR --> projects --> pieces within projects
relation/hasVersion | A related resource that is a version, edition, or adaptation of the described resource. | NO
relation/isFormatOf | A related resource that is substantially the same as the described resource, but in another format. | reverse of hasFormat
relation/isPartOf | A related resource in which the described resource is physically or logically included. | reverse of hasPart
relation/isReferencedBy | A related resource that references, cites, or otherwise points to the described resource. | NO
relation/isReplacedBy | A related resource that supplants, displaces, or supersedes the described resource. | NO
relation/isRequiredBy | A related resource that requires the described resource to support its function, delivery, or coherence. | NO
relation/isVersionOf | A related resource of which the described resource is a version, edition, or adaptation. Changes in version imply substantive changes in content rather than differences in format. | reverse of hasVersion
relation/references | A related resource that is referenced, cited, or otherwise pointed to by the described resource. | bibliographies subsumed by ESTC and CCILA
relation/replaces | A related resource that is supplanted, displaced, or superseded by the described resource. | bibliographies subsumed by ESTC and CCILA
relation/requires | A related resource that is required by the described resource to support its function, delivery, or coherence. | bibliographies subsumed by ESTC and CCILA
relation/source | A related resource from which the described resource is derived. | NO
rights | Information about rights held in and over the resource. Typically, rights information includes a statement about various property rights associated with the resource, including intellectual property rights. | NO
rights/accessRights | Information about who can access the resource or an indication of its security status. Access Rights may include information regarding access or restrictions based on privacy, security, or other policies. | Define access available for databases (public or restricted).
rights/license | A legal document giving official permission to do something with the resource. | NO
rightsHolder | A person or organization owning or managing rights over the resource. | ESTC: British Library and ESTC/NA; CNP: CBSR; CCILA: CBSR; CDNC: CBSR; CNMA: CBSR
subject | The topic of the resource. Typically, the subject will be represented using keywords, key phrases, or classification codes. Recommended best practice is to use a controlled vocabulary. To describe the spatial or temporal topic of the resource, use the Coverage element. | LCSH terms and general keywords.
title | A name given to the resource. | each project name
title/alternative | An alternative name for the resource. The distinction between titles and alternative titles is application-specific. | acronyms, 18th century STC, translation of CCILA either way
type | The nature or genre of the resource. Recommended best practice is to use a controlled vocabulary such as the DCMI Type Vocabulary [DCMITYPE]. To describe the file format, physical medium, or dimensions of the resource, use the Format element. | Databases are Dataset.

Appendix B

CBSR web pages to be enhanced

Public Web Address | Development Web Address | Page Description
http://cbsr.ucr.edu/index.html | http://cbsrdb.ucr.edu/~ginger/cbsr/cbsr/index.html | CBSR main page
http://cbsr.ucr.edu/about.html | http://cbsrdb.ucr.edu/~ginger/cbsr/cbsr/about.html | Description of CBSR
http://estc.ucr.edu/index.html | http://cbsrdb.ucr.edu/~ginger/cbsr/estc/index.html | ESTC main page
http://cnp.ucr.edu/index.html | http://cbsrdb.ucr.edu/~ginger/cbsr/cnp/index.html | CNP main page
http://cdnc.ucr.edu/index.html | http://cbsrdb.ucr.edu/~ginger/cbsr/cdnc/index.html | CDNC main page
http://ccila.ucr.edu/index.html | http://cbsrdb.ucr.edu/~ginger/cbsr/ccila/index.html | CCILA main page (English)
http://ccila.ucr.edu/es/index.html | http://cbsrdb.ucr.edu/~ginger/cbsr/ccila/es/index.html | CCILA main page (Spanish)
http://cnma.ucr.edu/index.html | http://cbsrdb.ucr.edu/~ginger/cbsr/cnma/index.html | CNMA main page

Appendix C: Examples of X/HTML semantic markup

Notes

1. Changes for semantic markup are in bold and some existing markup and comments have been removed to save space.

Example C1.

DC-HTML semantic markup example

CBSR’s main web page shows how DC-HTML markup might be encoded.

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<html xmlns="http://www.w3.org/1999/xhtml">

<head profile="http://dublincore.org/documents/2008/08/04/dc-html/">

<title>Center for Bibliographical Studies and Research</title>

...

<link rel="schema.DCTERMS" href="http://purl.org/dc/terms/" />

<meta name="DCTERMS.title" content="Center for Bibliographical Studies and Research" />

<meta name="DCTERMS.alternative" content="CBSR" />

<meta name="DCTERMS.description" content="The Center for Bibliographical Studies and Research is currently involved in five major bibliographical projects: the California Newspaper Project, the English Short-Title Catalog, the Latin American Short-Title Catalog, the California Digital Newspaper Collection and the California Newspaper Microfilm Archive." />

<link rel="DCTERMS.subject" href="http://id.loc.gov/authorities/sh85013832#concept" />

<meta name="DCTERMS.hasPart" content="English Short-Title Catalog (1473-1800)" />

<meta name="DCTERMS.hasPart" content="California Newspaper Project" />

</head>

<body>

...

<div align="center">

<p><a class="project" href="http://estc.ucr.edu/">English Short-Title Catalog <br /> (1473-1800)</a></p>

</div>

<div align="center">

<p><a class="project" href="http://cnp.ucr.edu/">California Newspaper Project</a></p>

</div>

...

</body>

</html>
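To illustrate how software might read the DC-HTML markup in Example C1, the following Python sketch collects the head statements as subject, predicate, object rows. It is not part of the CBSR project: the names DCHTMLExtractor and extract_dc_html are hypothetical, only the standard library is used, and the sketch assumes that each schema link appears before the properties that use its prefix.

```python
from html.parser import HTMLParser


class DCHTMLExtractor(HTMLParser):
    """Collect (subject, predicate, object, object type) rows from DC-HTML markup."""

    def __init__(self, page_uri):
        super().__init__()
        self.page_uri = page_uri   # the page itself is the subject of every statement
        self.schemas = {}          # prefix -> namespace URI, from <link rel="schema.PREFIX">
        self.statements = []

    def _expand(self, name):
        # "DCTERMS.title" -> "http://purl.org/dc/terms/title" when the prefix is declared
        prefix, _, local = name.strip().partition(".")
        namespace = self.schemas.get(prefix)
        return namespace + local if namespace and local else None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "link":
            rel = (attrs.get("rel") or "").strip()
            href = (attrs.get("href") or "").strip()
            if rel.lower().startswith("schema."):
                # Namespace declaration for a prefix
                self.schemas[rel[len("schema."):]] = href
            else:
                predicate = self._expand(rel)
                if predicate:
                    self.statements.append((self.page_uri, predicate, href, "URI"))
        elif tag == "meta":
            predicate = self._expand(attrs.get("name") or "")
            if predicate:
                content = (attrs.get("content") or "").strip()
                self.statements.append(
                    (self.page_uri, predicate, content, "plain literal"))


def extract_dc_html(markup, page_uri):
    """Parse markup and return the DC-HTML statements found in it."""
    parser = DCHTMLExtractor(page_uri)
    parser.feed(markup)
    parser.close()
    return parser.statements


# A shortened copy of the head from Example C1
SAMPLE = """
<html><head profile="http://dublincore.org/documents/2008/08/04/dc-html/">
<link rel="schema.DCTERMS" href="http://purl.org/dc/terms/" />
<meta name="DCTERMS.alternative" content="CBSR" />
<link rel="DCTERMS.subject" href="http://id.loc.gov/authorities/sh85013832#concept" />
</head><body></body></html>
"""

for statement in extract_dc_html(SAMPLE, "http://cbsr.ucr.edu/"):
    print(statement)
```

Each printed row corresponds to one line of the tables in Appendix D: the page URI as subject, the expanded Dublin Core term as predicate, and either a literal or a URI as object.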

Example C2.

eRDF semantic markup example

The example below shows how eRDF might be used for encoding CBSR’s main page. In contrast to the DC-HTML profile above, the property DCTERMS-hasPart is used in the body of the document where the text actually occurs, not in the head.

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<html xmlns="http://www.w3.org/1999/xhtml">

<head profile="http://purl.org/NET/erdf/profile">

<title>Center for Bibliographical Studies and Research</title>

...

<base href="http://cbsr.ucr.edu/" />

<link rel="schema.DCTERMS" href="http://purl.org/dc/terms/" />

<meta name="DCTERMS.title" content="Center for Bibliographical Studies and Research" />

<meta name="DCTERMS.alternative" content="CBSR" />

<meta name="DCTERMS.description" content="The Center for Bibliographical Studies and Research is currently involved in five major bibliographical projects: the California Newspaper Project, the English Short-Title Catalog, the Latin American Short-Title Catalog, the California Digital Newspaper Collection and the California Newspaper Microfilm Archive." />

<link rel="DCTERMS.subject" href="http://id.loc.gov/authorities/sh85013832#concept" />

</head>

<body>

...

<div align="center">

<p><a class="project" href="http://estc.ucr.edu/"><span class="DCTERMS-hasPart">English Short-Title Catalog <br />

(1473-1800)</span></a></p>

</div>

<div align="center">

<p><a class="project" rel="DCTERMS-hasPart" href="http://cnp.ucr.edu/">California Newspaper Project</a></p>

</div>

...

</body>

</html>

Example C3.

RDFa semantic markup example

A final example using the main CBSR page shows how RDFa markup might look.

<?xml version="1.0" encoding="iso-8859-1"?>

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.0//EN"

"http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd">

<html xmlns="http://www.w3.org/1999/xhtml" version="XHTML+RDFa 1.0"

xml:lang="en"

xmlns:dcterms="http://purl.org/dc/terms/">

<head profile="http://www.w3.org/1999/xhtml/vocab">

<title>Center for Bibliographical Studies and Research</title>

<link href="css/home.css" rel="stylesheet" type="text/css" />

<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />

...

<meta property="dcterms:title" content="Center for Bibliographical Studies and Research" />

<meta property="dcterms:alternative" content="CBSR" />

<meta property="dcterms:description" content="The Center for Bibliographical Studies and Research is currently involved in five major bibliographical projects: the California Newspaper Project, the English Short-Title Catalog, the Latin American Short-Title Catalog, the California Digital Newspaper Collection and the California Newspaper Microfilm Archive." />

<link rel="dcterms:subject" href="http://id.loc.gov/authorities/sh85013832#concept" />

</head>

<body>

...

<div align="center">

<p><a rel="dcterms:hasPart" class="project" href="http://estc.ucr.edu/">English Short-Title Catalog <br />

(1473-1800) </a></p>

</div>

<div align="center">

<p><a class="project" href="http://cnp.ucr.edu/"> <span property="dcterms:hasPart">California Newspaper Project </span></a></p>

</div>

...

</body>

</html>
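RDFa predicates such as dcterms:title are CURIEs: a prefix declared with an xmlns attribute, followed by a reference that is appended to the namespace URI. The Python sketch below shows this expansion for the meta and link elements in the head of Example C3 only. It is an illustration under stated assumptions, not a conforming RDFa processor: the names expand_curie and RDFaHeadSketch are hypothetical, prefixes are treated as document-wide, and markup in the body is deliberately skipped because RDFa subject resolution (for example through about or href attributes) is outside the scope of the sketch.

```python
from html.parser import HTMLParser


def expand_curie(curie, prefixes):
    """Expand 'prefix:reference' to a full URI, or return None if the prefix is unknown."""
    prefix, sep, reference = curie.strip().partition(":")
    if not sep or prefix not in prefixes:
        return None
    return prefixes[prefix] + reference


class RDFaHeadSketch(HTMLParser):
    """Collect statements from <meta property> and <link rel> elements only."""

    def __init__(self, page_uri):
        super().__init__()
        self.page_uri = page_uri   # subject of every head statement in this sketch
        self.prefixes = {}         # prefix -> namespace URI, from xmlns:prefix attributes
        self.statements = []

    def handle_starttag(self, tag, attrs):
        attrs = {name: (value or "").strip() for name, value in attrs}
        for name, value in attrs.items():
            if name.startswith("xmlns:"):
                self.prefixes[name[len("xmlns:"):]] = value
        if tag == "meta" and "property" in attrs and "content" in attrs:
            predicate = expand_curie(attrs["property"], self.prefixes)
            if predicate:
                self.statements.append((self.page_uri, predicate, attrs["content"]))
        elif tag == "link" and "rel" in attrs and "href" in attrs:
            predicate = expand_curie(attrs["rel"], self.prefixes)
            if predicate:
                self.statements.append((self.page_uri, predicate, attrs["href"]))


# A shortened copy of the head from Example C3
SAMPLE = """
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:dcterms="http://purl.org/dc/terms/">
<head>
<link href="css/home.css" rel="stylesheet" type="text/css" />
<meta property="dcterms:alternative" content="CBSR" />
<link rel="dcterms:subject" href="http://id.loc.gov/authorities/sh85013832#concept" />
</head><body></body></html>
"""

parser = RDFaHeadSketch("http://cbsr.ucr.edu/")
parser.feed(SAMPLE)
parser.close()
for statement in parser.statements:
    print(statement)
```

The stylesheet link is ignored because its rel value has no declared prefix, which is how ordinary presentation markup and semantic markup can share the same head element.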

Appendix D: Tables of RDF relationships

Table D1

Web page: cbsr.ucr.edu/index.html

Location Subject Predicate Object Obj Type Data Type Notes
head http://cbsr.ucr.edu/ dcterms:title Center for Bibliographical Studies and Research plain literal    
head http://cbsr.ucr.edu/ dcterms:alternative CBSR plain literal    
head http://cbsr.ucr.edu/ dcterms:description The Center for Bibliographical Studies and Research is currently involved in five major bibliographical projects: the California Newspaper Project, the English Short-Title Catalog, the Latin American Short-Title Catalog, the California Digital Newspaper Collection and the California Newspaper Microfilm Archive. plain literal    
head http://cbsr.ucr.edu/ dcterms:subject http://id.loc.gov/authorities/sh85013832#concept URI    
head http://cbsr.ucr.edu/ dcterms:audience researchers, scholars plain literal    
head http://cbsr.ucr.edu/ dcterms:created 1989 typed literal gYear  
head http://cbsr.ucr.edu/ dcterms:subject uc riverside, ucr, riverside, university of california, center for bibliographical studies and research, cbsr, estc, cnp, cdnc, ccila, cnma, english short title catalog, california newspaper project, california digital newspaper collection, catalogo colectivo de impresos latinoamericanos, california newspaper microfilm archive, bibliographic studies, neh, national endowment for the humanities plain literal    
body http://cbsr.ucr.edu/ dcterms:hasPart English Short-Title Catalog <br /> (1473-1800) -- use: http://estc.ucr.edu/ URI   ESTC
body http://cbsr.ucr.edu/ dcterms:hasPart California Newspaper Project -- USE: http://cnp.ucr.edu/ URI   CNP
body http://cbsr.ucr.edu/ dcterms:hasPart California Digital Newspaper Collection -- USE: http://cdnc.ucr.edu/ URI   CDNC
body http://cbsr.ucr.edu/ dcterms:hasPart Catálogo Colectivo de Impresos Latinoamericanos hasta 1851 -- USE: http://ccila.ucr.edu/ URI   CCILA
body http://cbsr.ucr.edu/ dcterms:hasPart California Newspaper Microfilm Archive -- USE: http://cnma.ucr.edu/ URI   CNMA
body http://cbsr.ucr.edu/ marcrel:FND http://www.neh.gov/ URI    
body http://cbsr.ucr.edu/ dcterms:isPartOf http://www.ucr.edu/ URI    
body http://cbsr.ucr.edu/ dcterms:hasPart cbsr contacts -- USE: http://cbsr.ucr.edu/cbsrcontacts.html URI    
body http://cbsr.ucr.edu/ dcterms:description http://cbsr.ucr.edu/about.html URI    
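The columns of these tables map directly onto RDF statements. As a rough illustration, the Python sketch below serializes three rows of Table D1 as N-Triples. The function names and the ROWS list are hypothetical, only the dcterms prefix is mapped, the Data Type column is assumed to name an XML Schema datatype (as gYear does), and relative object URIs such as those in the later tables would first have to be resolved against the page address.

```python
DCTERMS = "http://purl.org/dc/terms/"
XSD = "http://www.w3.org/2001/XMLSchema#"


def escape_literal(text):
    """Escape the characters that N-Triples does not allow raw inside a quoted literal."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r"))


def row_to_ntriple(subject, predicate, obj, obj_type, data_type=None):
    """Turn one table row into a single N-Triples line."""
    prefix, _, local = predicate.partition(":")
    if prefix != "dcterms":
        raise ValueError("only the dcterms prefix is mapped in this sketch: " + predicate)
    if obj_type == "URI":
        obj_term = "<" + obj + ">"
    elif obj_type == "typed literal":
        if not data_type:
            raise ValueError("a typed literal needs a data type")
        obj_term = '"' + escape_literal(obj) + '"^^<' + XSD + data_type + ">"
    elif obj_type == "plain literal":
        obj_term = '"' + escape_literal(obj) + '"'
    else:
        raise ValueError("unknown object type: " + obj_type)
    return "<" + subject + "> <" + DCTERMS + local + "> " + obj_term + " ."


# Three rows taken from Table D1: (subject, predicate, object, obj type, data type)
ROWS = [
    ("http://cbsr.ucr.edu/", "dcterms:alternative", "CBSR", "plain literal", None),
    ("http://cbsr.ucr.edu/", "dcterms:created", "1989", "typed literal", "gYear"),
    ("http://cbsr.ucr.edu/", "dcterms:hasPart", "http://estc.ucr.edu/", "URI", None),
]

for row in ROWS:
    print(row_to_ntriple(*row))
```

Whichever of the three markup approaches in Appendix C is used on the page, a parser should arrive at the same statements, so a serialization like this gives a simple way to compare their output.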

Table D2

Web page: cbsr.ucr.edu/about.html

Location Subject Predicate Object Obj Type Data Type Notes
head http://cbsr.ucr.edu/ dcterms:title Center for Bibliographical Studies and Research plain literal    
head http://cbsr.ucr.edu/ dcterms:alternative CBSR plain literal    
head http://cbsr.ucr.edu/ dcterms:description The Center for Bibliographical Studies and Research is currently involved in five major bibliographical projects: the California Newspaper Project, the English Short-Title Catalog, the Latin American Short-Title Catalog, the California Digital Newspaper Collection and the California Newspaper Microfilm Archive. plain literal    
head http://cbsr.ucr.edu/ dcterms:subject http://id.loc.gov/authorities/sh85013832#concept URI    
head http://cbsr.ucr.edu/ dcterms:subject uc riverside, ucr, riverside, university of california, center for bibliographical studies and research, cbsr, estc, cnp, cdnc, ccila, cnma, english short title catalog, california newspaper project, california digital newspaper collection, catalogo colectivo de impresos latinoamericanos, california newspaper microfilm archive, bibliographic studies, neh, national endowment for the humanities plain literal    
body http://cbsr.ucr.edu/ dcterms:created 1989 typed literal gYear  
body http://cbsr.ucr.edu/ dcterms:isPartOf College of Humanities and Social Sciences: use http://chass.ucr.edu/ URI   CHASS
body http://cbsr.ucr.edu/ dcterms:hasPart Eighteenth century short-title catalog, use: http://estc.ucr.edu/ URI   ESTC
body http://estc.ucr.edu/ dcterms:alternative Eighteenth century short-title catalog (ESTC) plain literal    
body http://cnp.ucr.edu/ dcterms:created 1990 typed literal gYear  
body http://cbsr.ucr.edu/ dcterms:hasPart cal news project, use: http://cnp.ucr.edu/ URI   CNP
body http://cnp.ucr.edu/ dcterms:isPartOf project, use: http://www.neh.gov/projects/usnp.html URI    
body http://cnp.ucr.edu/ marcrel:FND NEH, use: http://www.neh.gov/ URI    
body http://ccila.ucr.edu/ dcterms:created 2000 typed literal gYear  
body http://cbsr.ucr.edu/ dcterms:hasPart ccila: use http://ccila.ucr.edu/ URI   CCILA
body http://ccila.ucr.edu/ dcterms:contributor colleagues and inst in NA… plain literal    
body http://cdnc.ucr.edu/ dcterms:created 2005 typed literal gYear  
body http://cbsr.ucr.edu/ dcterms:hasPart cal dig lib: use http://cdnc.ucr.edu/ URI   CDNC
body http://cdnc.ucr.edu/ marcrel:FND NEH, use: http://www.neh.gov/ URI
body http://cdnc.ucr.edu/ marcrel:FND state library, use: http://www.library.ca.gov/ URI
body http://estc.ucr.edu/ dcterms:hasFormat http://estc.bl.uk/ URI
body http://estc.bl.uk/ dcterms:available 2006 typed literal gYear
body http://estc.bl.uk/ dcterms:type http://purl.org/dc/dcmitype/Dataset URI    
body http://estc.bl.uk/ dcterms:accessRights free access plain literal    
body http://cbsr.ucr.edu/ dcterms:hasPart http://estc.ucr.edu/EESMain.html URI   Early English Serials
body http://estc.ucr.edu/ dcterms:contributor dozens of libraries… plain literal    
body http://estc.ucr.edu/ dcterms:contributor During 2005-2007, it worked with…: use http://www.britaininprint.net/ URI    
body http://www.britaininprint.net/ dcterms:date 2005-2007 plain literal
body http://www.britaininprint.net/ dcterms:creator consortium…: use (http://www.curl.ac.uk/) http://www.rluk.ac.uk/ URI    
body http://www.britaininprint.net/ dcterms:description text:  add online a record… URI    
body http://cnp.ucr.edu/ dcterms:description The california newspaper project is managing… plain literal    
body http://cnma.ucr.edu/ dcterms:description moreover, it has worked… plain literal    
body http://ccila.ucr.edu/ dcterms:description The latin american project will create … plain literal    
body http://cnp.ucr.edu/ dcterms:relation http://www.cnpa.com/ URI   Cal. News. Pubr. Assoc.
body http://ccila.ucr.edu/ dcterms:relation http://abinia.ucol.mx URI   ABINIA
body http://ccila.ucr.edu/ dcterms:relation http://www.library.cornell.edu/colldev/salalmhome.html URI   SALALM
body http://cbsr.ucr.edu/ marcrel:FND national endowment for…: use http://www.neh.gov/ URI    
body http://cbsr.ucr.edu/ marcrel:FND http://www.ed.gov/ URI    
body http://cbsr.ucr.edu/ marcrel:FND a number of private foundations plain literal    

Table D3

Web page: estc.ucr.edu

Location Subject Predicate Object Obj Type Data Type Notes
head http://estc.ucr.edu/ dcterms:title English Short-Title Catalog plain literal    
head http://estc.ucr.edu/ dcterms:alternative ESTC plain literal    
head http://estc.ucr.edu/ dcterms:replaces Eighteenth Century Short-Title Catalog plain literal    
head http://estc.ucr.edu/ dcterms:hasPart dcterms:replaces dcterms:references http://lccn.loc.gov/76374523 URI   STC
head http://estc.ucr.edu/ dcterms:hasPart dcterms:replaces dcterms:references  http://lccn.loc.gov/2007002668 URI   Wing
head http://estc.ucr.edu/ dcterms:conformsTo DCRM(B) plain literal    
head http://estc.ucr.edu/ dcterms:description The English Short Title Catalog (ESTC) is a database of books and periodicals covering the years 1475-1800. Included in each entry are a description of the item, microfilm availability, and locations worldwide. plain literal    
head http://estc.ucr.edu/ dcterms:subject http://id.loc.gov/authorities/sh85094710#concept URI    
head http://estc.ucr.edu/ dcterms:subject http://id.loc.gov/authorities/sh85079330#concept URI    
head http://estc.ucr.edu/ dcterms:subject http://id.loc.gov/authorities/sh85013850#concept URI    
head http://estc.ucr.edu/ dcterms:subject http://id.loc.gov/authorities/sh85013851#concept URI    
head http://estc.ucr.edu/ dcterms:subject http://id.loc.gov/authorities/sh85013852#concept URI    
head http://estc.ucr.edu/ dcterms:subject http://id.loc.gov/authorities/sh2008118518#concept URI    
head http://estc.ucr.edu/ dcterms:subject http://id.loc.gov/authorities/sh2008118519#concept URI    
head http://estc.ucr.edu/ dcterms:subject http://id.loc.gov/authorities/sh85020878#concept URI    
head http://estc.ucr.edu/ dcterms:subject center for bibliographical studies and research, cbsr, cnp, cdnc, estc, cnma, california newspaper project, california digital newspaper collection, english short-title catalog, california newspaper microfilm archive, catalogue, news, book, cclia, catálogo colectivo de impresos latinoamericanos, latin american imprints, printing, letterpress plain literal
head http://estc.ucr.edu/ dcterms:publisher British Library and ESTC/NA plain literal    
head http://estc.ucr.edu/ dcterms:rightsHolder British Library and ESTC/NA plain literal    
head http://estc.ucr.edu/ dcterms:audience researchers, scholars plain literal    
head http://estc.ucr.edu/ dcterms:created 1978 typed literal gYear  
head http://estc.ucr.edu/ dcterms:available 1982 typed literal gYear  
head http://estc.ucr.edu/ dcterms:provenance Project to catalog holdings of British Library started in 1978. North American imprints cataloged by the American Antiquarian Society. North American holdings of English items added from 1980. Since the 1980s, the ESTC has been co-owned by the ESTC/NA and British Library. In 1982 it was made available for searching through RLIN. In 2003 web login access for contributors was implemented at estc.ucr.edu. In 2006 free public search access was implemented at estc.bl.uk. plain literal
head http://estc.ucr.edu/ dcterms:hasVersion URN:ISBN:0753657457 URI   ESTC CD-ROM (3rd ed)
head http://estc.ucr.edu/ dcterms:hasVersion URN:ISBN:9780753657454 URI   ESTC CD-ROM (3rd ed)
head http://estc.ucr.edu/ marcrel:FND http://neh.gov/ URI   NEH
head http://estc.ucr.edu/ marcrel:FND http://www.mellon.org/ URI   Mellon Found.
head http://estc.ucr.edu/ marcrel:FND http://www.rockfound.org/ URI   Rockefeller Foun.
head http://estc.ucr.edu/ marcrel:FND http://www.hwwilson.com/ URI   H. W. Wilson F.
head http://estc.ucr.edu/ marcrel:FND http://www.theahmansonfoundation.org/ URI   Ahmanson F.
head http://estc.ucr.edu/ marcrel:FND http://www.pewtrusts.org/ URI   Pew Char. Trusts
head http://estc.ucr.edu/ marcrel:FND Carl & Lily Pforzheimer Foundation Inc plain literal   Pforzheimer Found.
head http://estc.ucr.edu/ marcrel:FND http://www.delmas.org/ URI   Krieble Delmas F.
head http://estc.ucr.edu/ marcrel:FND https://www.ed.gov/pubs/Biennial/611.html URI   Dept. Ed. II-C
body http://estc.ucr.edu/ dcterms:hasFormat http://estc.bl.uk/ URI    
body http://estc.bl.uk/ dcterms:type http://purl.org/dc/dcmitype/Dataset URI    
body http://estc.bl.uk/ dcterms:accessRights free public access plain literal    
body http://estc.bl.uk/ dcterms:available 2006 typed literal gYear  
body http://estc.bl.uk/ dcterms:replaces http://www.worldcat.org/oclc/43313545 URI   OCLC description of BLAISE access
body http://estc.bl.uk/ dcterms:replaces http://www.worldcat.org/oclc/55880642 URI   OCLC description of RLIN access
body http://estc.ucr.edu/ dcterms:requires http://estc.ucr.edu/cgi-bin/rlinlibsearch.pl URI   libraries db
body http://estc.ucr.edu/cgi-bin/rlinlibsearch.pl dcterms:type http://purl.org/dc/dcmitype/Dataset URI    
body http://estc.ucr.edu/ dcterms:hasFormat EstcPassword.html URI   webmatchers login access
body EstcPassword.html dcterms:available 2003 typed literal gYear  
body EstcPassword.html dcterms:type http://purl.org/dc/dcmitype/Dataset URI    
body EstcPassword.html dcterms:accessRights restricted access plain literal    
body http://estc.ucr.edu/ dcterms:relation reporting.html URI   reporting instructions for contributors
body http://estc.ucr.edu/ dcterms:spatial britem.html URI   describes geographical scope
body http://estc.ucr.edu/ dcterms:temporal CHRONOLOGY_1473-1640.html URI   stc chronology
body http://estc.ucr.edu/ dcterms:relation http://estc.ucr.edu/factotum_index.html URI   factotum index
body http://estc.ucr.edu/ dcterms:hasVersion estcfilm.html URI   film sets of ESTC-scope items
body EESMain.html dcterms:created 1994 typed literal gYear early english serials
body http://estc.ucr.edu/ dcterms:hasPart EESMain.html URI    
body http://estc.ucr.edu/ dcterms:isPartOf http://cbsr.ucr.edu/ URI    
body http://estc.ucr.edu/ dcterms:relation http://cbsr.ucr.edu/cbsrcontacts.html URI    
body http://estc.ucr.edu/ dcterms:relation http://cnma.ucr.edu/ URI    
body http://estc.ucr.edu/ dcterms:relation http://cnp.ucr.edu/ URI    
body http://estc.ucr.edu/ dcterms:relation http://cdnc.ucr.edu/ URI    
body http://estc.ucr.edu/ dcterms:relation http://ccila.ucr.edu/ URI    
body http://estc.ucr.edu/ dcterms:isPartOf http://www.ucr.edu/ URI    
body http://estc.ucr.edu/ dcterms:description The English Short-Title Catalog (ESTC) is a vast database … fully searchable online. plain literal    
body http://estc.ucr.edu/ dcterms:creator the British Library, plain literal   sentence begins: The ESTC is the joint effort of

body http://estc.ucr.edu/ dcterms:creator the American Antiquarian Society, plain literal
body http://estc.ucr.edu/ dcterms:creator the ESTC/NA plain literal
body http://estc.ucr.edu/ dcterms:contributor many contributing libraries throughout the world plain literal

Table D4

Web page: cnp.ucr.edu

Location Subject Predicate Object Obj Type Data Type Notes
head http://cnp.ucr.edu/ dcterms:title California Newspaper Project plain literal    
head http://cnp.ucr.edu/ dcterms:alternative CNP plain literal    
head http://cnp.ucr.edu/ dcterms:conformsTo CONSER plain literal    
head http://cnp.ucr.edu/ dcterms:description The California Newspaper Project identifies, describes and preserves California newspapers and provides a free public database of California newspaper titles and their locations. plain literal    
head http://cnp.ucr.edu/ dcterms:subject http://id.loc.gov/authorities/sh85094710#concept URI    
head http://cnp.ucr.edu/ dcterms:subject http://id.loc.gov/authorities/sh85079330#concept URI    
head http://cnp.ucr.edu/ dcterms:subject http://id.loc.gov/authorities/sh85091594#concept URI    
head http://cnp.ucr.edu/ dcterms:subject http://id.loc.gov/authorities/sh85091596#concept URI    
head http://cnp.ucr.edu/ dcterms:subject http://id.loc.gov/authorities/sh85091593#concept URI    
head http://cnp.ucr.edu/ dcterms:subject http://id.loc.gov/authorities/sh85020870#concept URI    
head http://cnp.ucr.edu/ dcterms:subject http://id.loc.gov/authorities/sh85091588#concept URI    
head http://cnp.ucr.edu/ dcterms:subject center for bibliographical studies and research, cbsr, cnp, cdnc, estc, cnma, california newspaper project, california digital newspaper collection, english short-title catalog, california newspaper microfilm archive, catalogue, news, book, cclia, catálogo colectivo de impresos latinoamericanos, latin american imprints, printing, letterpress plain literal
head http://cnp.ucr.edu/ dcterms:audience researchers, scholars plain literal    
head http://cnp.ucr.edu/ dcterms:created 1990 typed literal gYear  
head http://cnp.ucr.edu/ dcterms:publisher http://cbsr.ucr.edu/ URI    
head http://cnp.ucr.edu/ dcterms:rightsHolder http://cbsr.ucr.edu/ URI    
body http://cnp.ucr.edu/ dcterms:isPartOf http://cbsr.ucr.edu/ URI    
body http://cnp.ucr.edu/ dcterms:hasVersion cnpsearchdb.html URI    
body cnpsearchdb.html dcterms:accessRights public access plain literal    
body cnpsearchdb.html dcterms:type http://purl.org/dc/dcmitype/Dataset URI    
body cnpsearchdb.html dcterms:available 1995 typed literal gYear  
body http://cnp.ucr.edu/ dcterms:relation http://cnma.ucr.edu/ URI    
body http://cnp.ucr.edu/ dcterms:relation http://estc.ucr.edu/ URI    
body http://cnp.ucr.edu/ dcterms:relation http://cdnc.ucr.edu/ URI    
body http://cnp.ucr.edu/ dcterms:relation http://ccila.ucr.edu/ URI    
body http://cnp.ucr.edu/ dcterms:isPartOf http://www.ucr.edu/ URI    
body http://cnp.ucr.edu/ dcterms:relation http://cbsr.ucr.edu/cbsrcontacts.html URI    
body http://cnp.ucr.edu/ dcterms:description The California Newspaper Project is an 18 year effort by the CBSR to identify, describe and preserve California newspapers. Close to 9,000 California newspapers were inventoried in over 14,000 repositories throughout the state, 1.5 million pages of California newspapers were preserved and made available on microfilm, and 100,000 rolls of negative microfilm rolls are being processed for permanent storage at the UC Regional Library Storage Facilities. plain literal    
body http://cnp.ucr.edu/ dcterms:created ...an 18 year effort… : use 1990 typed literal gYear  
body http://cnp.ucr.edu/ dcterms:description The California Newspaper Project is a participant in the United States Newspaper Program. It is supported in part by the National Endowment for the Humanities; the California State Library; and the U.S. Institute of Museum and Library Services, under provisions of the Library Services and Technology Act administered in California by the State Librarian. plain literal    
body http://cnp.ucr.edu/ dcterms:isPartOf http://www.neh.gov/projects/usnp.html URI    
body http://cnp.ucr.edu/ marcrel:FND http://www.neh.gov/ URI    
body http://cnp.ucr.edu/ marcrel:FND http://www.library.ca.gov/ URI    
body http://cnp.ucr.edu/ marcrel:FND IMLS? http://www.library.ca.gov/grants/lsta/ URI   imls lsta funding
body http://cnp.ucr.edu/ dcterms:relation BMI Catalog -- use: http://cnma.ucr.edu/BMIcatalog.html URI    
body http://cnp.ucr.edu/ dcterms:relation BMI, Data and Custom film databases -- use: http://cnma.ucr.edu/additionalresources.html URI    

Table D5

Web page: cdnc.ucr.edu

Location Subject Predicate Object Obj Type Data Type Notes
head http://cdnc.ucr.edu/ dcterms:title California Digital Newspaper Collection plain literal    
head http://cdnc.ucr.edu/ dcterms:alternative CDNC plain literal    
head http://cdnc.ucr.edu/ dcterms:description A collection of digitized and searchable California newspapers spanning the years 1849-1911. plain literal    
head http://cdnc.ucr.edu/ dcterms:subject http://id.loc.gov/authorities/sh85091588#concept URI    
head http://cdnc.ucr.edu/ dcterms:subject http://id.loc.gov/authorities/sh2002011497#concept URI    
head http://cdnc.ucr.edu/ dcterms:subject center for bibliographical studies and research, cbsr, cnp, cdnc, estc, cnma, california newspaper project, california digital newspaper collection, english short-title catalog, california newspaper microfilm archive, catalogue, news, book, cclia, catálogo colectivo de impresos latinoamericanos, latin american imprints, printing, letterpress plain literal
head http://cdnc.ucr.edu/ dcterms:audience researchers, scholars plain literal    
head http://cdnc.ucr.edu/ dcterms:created 2005 typed literal gYear
head http://cdnc.ucr.edu/ dcterms:publisher http://cbsr.ucr.edu/ URI    
head http://cdnc.ucr.edu/ dcterms:rightsHolder http://cbsr.ucr.edu/ URI    
body http://cdnc.ucr.edu/ dcterms:isPartOf http://cbsr.ucr.edu/ URI    
body http://cdnc.ucr.edu/ dcterms:hasVersion /search URI    
body /search dcterms:accessRights public access plain literal    
body /search dcterms:type http://purl.org/dc/dcmitype/Dataset URI    
body /search dcterms:available 2007 typed literal gYear  
body http://cdnc.ucr.edu/ dcterms:description about.html URI    
body http://cdnc.ucr.edu/ dcterms:description CallHistory.html URI    
body http://cdnc.ucr.edu/ dcterms:description herald.html URI    
body http://cdnc.ucr.edu/ dcterms:instructionalMethod lessons.html URI    
body http://cdnc.ucr.edu/ dcterms:relation digitizationspecifications.html URI    
body http://cdnc.ucr.edu/ dcterms:relation cndp/index.html URI    
body http://cdnc.ucr.edu/ dcterms:relation DuplicateScanningofMicrofilms.html URI    
body http://cdnc.ucr.edu/ dcterms:relation http://www.ibiblio.org/slanews/conferences/sla2002/programs/slides/McCargar.ppt URI
body http://cdnc.ucr.edu/ dcterms:relation files/20071019_CDNC.pdf URI    
body http://cdnc.ucr.edu/ dcterms:relation http://cnma.ucr.edu/ URI    
body http://cdnc.ucr.edu/ dcterms:relation http://estc.ucr.edu/ URI    
body http://cdnc.ucr.edu/ dcterms:relation http://cnp.ucr.edu/ URI    
body http://cdnc.ucr.edu/ dcterms:relation http://ccila.ucr.edu/ URI    
body http://cdnc.ucr.edu/ dcterms:isPartOf http://www.ucr.edu/ URI    
body http://cdnc.ucr.edu/ dcterms:relation http://cbsr.ucr.edu/cbsrcontacts.html URI    
body http://cdnc.ucr.edu/ dcterms:description The California Digital Newspaper Collection offers over 200,000 pages of California newspapers spanning the years 1849-1911: the Alta California, 1849-1891; the San Francisco Call, 1893-1910; the Amador Ledger, 1900-1911; the Imperial Valley Press, 1901-1911; the Sacramento Record-Union, 1859-1890; and the Los Angeles Herald, 1905-1907. Additional years are forthcoming, as are other early California newspapers: the Californian; the California Star; the California Star and Californian; the Sacramento Transcript; the Placer Times; and the Pacific Rural Press. plain literal
body http://cdnc.ucr.edu/ marcrel:FND Library Services and Technology Act IMLS? -- use: http://www.library.ca.gov/grants/lsta/ URI    
body http://cdnc.ucr.edu/ marcrel:FND http://www.neh.gov/ URI    
body http://cdnc.ucr.edu/ dcterms:isPartOf http://www.neh.gov/projects/ndnp.html URI    

Table D6

Web page: ccila.ucr.edu

Location Subject Predicate Object Obj Type Data Type

head http://ccila.ucr.edu/ dcterms:title Catálogo Colectivo de Impresos Latinoamericanos plain literal
head http://ccila.ucr.edu/ dcterms:alternative CCILA plain literal
head http://ccila.ucr.edu/ dcterms:conformsTo AACR2 plain literal
head http://ccila.ucr.edu/ dcterms:description The Catálogo Colectivo de Impresos Latinoamericanos (CCILA) is a database of books and periodicals covering the years 1539-1850. Included in each entry are a description of the item and locations worldwide. plain literal
head http://ccila.ucr.edu/ dcterms:subject http://id.loc.gov/authorities/sh2008115995#concept URI  
head http://ccila.ucr.edu/ dcterms:subject http://id.loc.gov/authorities/sh85094710#concept URI  
head http://ccila.ucr.edu/ dcterms:subject http://id.loc.gov/authorities/sh85079330#concept URI  
head http://ccila.ucr.edu/ dcterms:subject http://id.loc.gov/authorities/sh85013850#concept URI  
head http://ccila.ucr.edu/ dcterms:subject http://id.loc.gov/authorities/sh85013851#concept URI  
head http://ccila.ucr.edu/ dcterms:subject http://id.loc.gov/authorities/sh85013852#concept URI  
head http://ccila.ucr.edu/ dcterms:subject http://id.loc.gov/authorities/sh2008118518#concept URI  
head http://ccila.ucr.edu/ dcterms:subject http://id.loc.gov/authorities/sh85020878#concept URI  
head http://ccila.ucr.edu/ dcterms:subject center for bibliographical studies and research, cbsr, cnp, cdnc, estc, cnma, california newspaper project, california digital newspaper collection, english short-title catalog, california newspaper microfilm archive, catalogue, news, book, cclia, catálogo colectivo de impresos latinoamericanos, latin american imprints, printing, letterpress plain literal
head http://ccila.ucr.edu/ dcterms:audience researchers, scholars plain literal  
head http://ccila.ucr.edu/ dcterms:created 2000 typed literal gYear
head http://ccila.ucr.edu/ dcterms:publisher http://cbsr.ucr.edu/ URI  
head http://ccila.ucr.edu/ dcterms:rightsHolder http://cbsr.ucr.edu/ URI  
head http://ccila.ucr.edu/ marcrel:FND http://nsf.gov/ URI  
head http://ccila.ucr.edu/ marcrel:FND http://www.ucmexus.ucr.edu/ URI  
body http://ccila.ucr.edu/es/ dcterms:language spa typed literal language
body http://ccila.ucr.edu/ dcterms:hasVersion http://ccila.ucr.edu/cgi-bin/starfinder/0?path=lastc.txt&id=lastc&pass=lastc&OK=OK URI
body http://ccila.ucr.edu/cgi-bin/starfinder/0?path=lastc.txt&id=lastc&pass=lastc&OK=OK dcterms:accessRights Free public access plain literal
body http://ccila.ucr.edu/cgi-bin/starfinder/0?path=lastc.txt&id=lastc&pass=lastc&OK=OK dcterms:type http://purl.org/dc/dcmitype/Dataset URI
body http://ccila.ucr.edu/cgi-bin/starfinder/0?path=lastc.txt&id=lastc&pass=lastc&OK=OK dcterms:available 2003 typed literal gYear
body http://ccila.ucr.edu/ dcterms:contributor CCILA_Contributors.html URI  
body http://ccila.ucr.edu/ dcterms:hasPart dcterms:replaces dcterms:references bibliographies.html URI
body http://ccila.ucr.edu/ dcterms:description about.html URI  
body http://ccila.ucr.edu/ dcterms:hasPart LASTC_Proposal.html URI  
body http://ccila.ucr.edu/ dcterms:coverage scope.html URI  
body http://ccila.ucr.edu/ dcterms:hasPart CCILA_Progress_report.html URI  
body http://ccila.ucr.edu/ dcterms:hasPart CCILA_report_Spanish.pdf URI  
body http://ccila.ucr.edu/ dcterms:relation http://cbsr.ucr.edu/cbsrcontacts.html URI  
body http://ccila.ucr.edu/ dcterms:isPartOf http://cbsr.ucr.edu/ URI  
body http://ccila.ucr.edu/ dcterms:relation http://cnma.ucr.edu/ URI  
body http://ccila.ucr.edu/ dcterms:relation http://estc.ucr.edu/ URI  
body http://ccila.ucr.edu/ dcterms:relation http://cdnc.ucr.edu/ URI  
body http://ccila.ucr.edu/ dcterms:isPartOf http://www.ucr.edu/ URI  
body http://ccila.ucr.edu/ dcterms:description The Catálogo Colectivo de Impresos Latinoamericanos hasta 1851 (CCILA), when complete, will provide digital access … plain literal
body http://ccila.ucr.edu/ dcterms:description Phase One of CCILA is based on keyed versions of all relevant bibliographies … plain literal  
body http://ccila.ucr.edu/ dcterms:description Phase Two is focused on expanding Phase One through cooperation … plain literal  
body http://ccila.ucr.edu/ dcterms:description Concurrently with Phases One and Two, the Director of CBSR … plain literal  
body http://ccila.ucr.edu/ dcterms:description As part of the effort to recover the print heritage of Latin America … plain literal  

Table D7

Web page: ccila.ucr.edu/es/

Location Subject Predicate Object Obj Type Data Type Notes
head http://ccila.ucr.edu/es/ dcterms:title Catálogo Colectivo de Impresos Latinoamericanos plain literal
head http://ccila.ucr.edu/es/ dcterms:alternative CCILA plain literal
head http://ccila.ucr.edu/es/ dcterms:conformsTo AACR2 plain literal
head http://ccila.ucr.edu/es/ dcterms:description The Catálogo Colectivo de Impresos Latinoamericanos (CCILA) es una base de datos para libros y seriales desde 1539 hasta 1851. Cada registro tiene una descripción, y una lista de bibliotecas de todo el mundo. plain literal   translation proofed
head http://ccila.ucr.edu/ dcterms:subject http://id.loc.gov/authorities/sh2008115995#concept URI    
head http://ccila.ucr.edu/ dcterms:subject http://id.loc.gov/authorities/sh85094710#concept URI    
head http://ccila.ucr.edu/ dcterms:subject http://id.loc.gov/authorities/sh85079330#concept URI    
head http://ccila.ucr.edu/ dcterms:subject http://id.loc.gov/authorities/sh85013850#concept URI    
head http://ccila.ucr.edu/ dcterms:subject http://id.loc.gov/authorities/sh85013851#concept URI    
head http://ccila.ucr.edu/ dcterms:subject http://id.loc.gov/authorities/sh85013852#concept URI    
head http://ccila.ucr.edu/ dcterms:subject http://id.loc.gov/authorities/sh2008118518#concept URI    
head http://ccila.ucr.edu/ dcterms:subject http://id.loc.gov/authorities/sh85020878#concept URI    
head http://ccila.ucr.edu/es/ dcterms:subject center for bibliographical studies and research, cbsr, cnp, cdnc, estc, cnma, california newspaper project, california digital newspaper collection, english short-title catalog, california newspaper microfilm archive, catalogue, news, book, cclia, catálogo colectivo de impresos latinoamericanos, latin american imprints, printing, letterpress plain literal
head http://ccila.ucr.edu/es/ dcterms:audience researchers, scholars plain literal    
head http://ccila.ucr.edu/es/ dcterms:created 2000 typed literal gYear  
head http://ccila.ucr.edu/es/ dcterms:publisher http://cbsr.ucr.edu/ URI    
head http://ccila.ucr.edu/es/ dcterms:rightsHolder http://cbsr.ucr.edu/ URI    
head http://ccila.ucr.edu/ marcrel:FND http://nsf.gov/ URI    
head http://ccila.ucr.edu/ marcrel:FND http://www.ucmexus.ucr.edu/ URI    
body http://ccila.ucr.edu/ dcterms:language en typed literal language  
body http://ccila.ucr.edu/es/ dcterms:hasVersion http://ccila.ucr.edu/cgi-bin/starfinder/0&#63;path=lastc.txt&amp;id=lastc&amp;pass=lastc&amp;OK=OK URI    
body http://ccila.ucr.edu/cgi-bin/starfinder/0&#63;path=lastc.txt&amp;id=lastc&amp;pass=lastc&amp;OK=OK dcterms:accessRights Free public access plain literal    
body http://ccila.ucr.edu/cgi-bin/starfinder/0&#63;path=lastc.txt&amp;id=lastc&amp;pass=lastc&amp;OK=OK dcterms:type http://purl.org/dc/dcmitype/Dataset URI    
body http://ccila.ucr.edu/cgi-bin/starfinder/0&#63;path=lastc.txt&amp;id=lastc&amp;pass=lastc&amp;OK=OK dcterms:available 2003 typed literal gYear  
body http://ccila.ucr.edu/es/ dcterms:contributor contribuidores.html URI    
body http://ccila.ucr.edu/es/ dcterms:hasPart dcterms:replaces dcterms:references bibliografias.html URI    
body http://ccila.ucr.edu/es/ dcterms:description sobre.html URI    
body http://ccila.ucr.edu/es/ dcterms:hasPart oferta.html URI    
body http://ccila.ucr.edu/es/ dcterms:coverage alcance.html URI    
body http://ccila.ucr.edu/es/ dcterms:hasPart report_2004.html URI    
body http://ccila.ucr.edu/es/ dcterms:hasPart ../CCILA_report_Spanish.pdf URI    
body http://ccila.ucr.edu/es/ dcterms:relation http://cbsr.ucr.edu/cbsrcontacts.html URI    
body http://ccila.ucr.edu/es/ dcterms:isPartOf http://cbsr.ucr.edu/ URI    
body http://ccila.ucr.edu/es/ dcterms:relation http://cnma.ucr.edu/ URI    
body http://ccila.ucr.edu/es/ dcterms:relation http://estc.ucr.edu/ URI    
body http://ccila.ucr.edu/es/ dcterms:relation http://cdnc.ucr.edu/ URI    
body http://ccila.ucr.edu/es/ dcterms:isPartOf http://www.ucr.edu/ URI    
body http://ccila.ucr.edu/es/ dcterms:description El Cat&aacute;logo Colectivo de Impresos Latinoamericanos hasta 1851 … plain literal    
body http://ccila.ucr.edu/es/ dcterms:relation ABINIA -- use: http://www.abinia.org/ URI   sentence begins: El CCILA ha
body http://ccila.ucr.edu/es/ dcterms:relation SALALM -- use: http://www.salalm.org/ URI   recibido respaldo
body http://ccila.ucr.edu/es/ dcterms:description La primera fase del CCILA est&aacute; basada … plain literal    
body http://ccila.ucr.edu/es/ dcterms:description Concurrentemente con las dos fases, el Director del CBSR … plain literal    
body http://ccila.ucr.edu/es/ dcterms:description Como parte del esfuerzo para recuperar … plain literal    
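
As an illustration of how rows like those in Table D7 might be turned into markup, the following Python sketch renders a few head rows as XHTML+RDFa elements. It is a minimal sketch and not necessarily the markup actually used on the CCILA pages: the function name row_to_rdfa is hypothetical, and it assumes that the dcterms and xsd prefixes are declared elsewhere on the page (for example with xmlns attributes on the html element). URI objects become link elements, while plain and typed literals become meta elements.

```python
from html import escape


def row_to_rdfa(subject, predicate, obj, obj_type, datatype=None):
    """Render one head-row triple as an XHTML+RDFa element (illustrative only)."""
    about = f'about="{escape(subject)}"'
    if obj_type == "URI":
        # A URI object is expressed as a link with rel/href.
        return f'<link {about} rel="{predicate}" href="{escape(obj)}" />'
    if obj_type == "typed literal":
        # A typed literal carries its XML Schema datatype.
        return (f'<meta {about} property="{predicate}" '
                f'content="{escape(obj)}" datatype="xsd:{datatype}" />')
    if obj_type == "plain literal":
        return f'<meta {about} property="{predicate}" content="{escape(obj)}" />'
    raise ValueError(f"unknown object type: {obj_type!r}")


# Three rows copied from Table D7: (subject, predicate, object, obj type, data type).
rows = [
    ("http://ccila.ucr.edu/es/", "dcterms:alternative", "CCILA", "plain literal", None),
    ("http://ccila.ucr.edu/es/", "dcterms:created", "2000", "typed literal", "gYear"),
    ("http://ccila.ucr.edu/es/", "dcterms:publisher", "http://cbsr.ucr.edu/", "URI", None),
]

head_markup = [row_to_rdfa(*row) for row in rows]
for line in head_markup:
    print(line)
```

The explicit about attribute is kept on every element so that rows whose subject differs from the page itself, such as the database URL, can be handled by the same function.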

Table D8

Web page: cnma.ucr.edu

Location Subject Predicate Object Obj type Data type Notes
head http://cnma.ucr.edu/ dcterms:title California Newspaper Microfilm Archive plain literal    
head http://cnma.ucr.edu/ dcterms:alternative CNMA plain literal    
head http://cnma.ucr.edu/ dcterms:description An archive of master negative microfilm rolls stored at the UC Regional Library Storage Facilities. plain literal    
head http://cnma.ucr.edu/ dcterms:subject http://id.loc.gov/authorities/sh85091636#concept URI    
head http://cnma.ucr.edu/ dcterms:subject http://id.loc.gov/authorities/sh85084838#concept URI    
head http://cnma.ucr.edu/ dcterms:subject http://id.loc.gov/authorities/sh85106451#concept URI    
head http://cnma.ucr.edu/ dcterms:subject center for bibliographical studies and research, cbsr, cnp, cdnc, estc, cnma, california newspaper project, california digital newspaper collection, english short-title catalog, california newspaper microfilm archive, catalogue, news, book, cclia, cat&aacute;logo colectivo de impresos latinoamericanos, latin american imprints, printing, letterpress plain literal    
head http://cnma.ucr.edu/ dcterms:audience researchers, scholars plain literal    
head http://cnma.ucr.edu/ dcterms:created 2009 typed literal gYear  
head http://cnma.ucr.edu/ dcterms:available 2010 typed literal gYear  
head http://cnma.ucr.edu/ dcterms:publisher http://cbsr.ucr.edu/ URI    
head http://cnma.ucr.edu/ dcterms:rightsHolder http://library.ucr.edu/ URI    
body http://cnma.ucr.edu/ dcterms:hasPart dcterms:replaces BMIcatalog.html URI    
body BMIcatalog.html dcterms:hasPart files/catalog_A-G.pdf URI    
body BMIcatalog.html dcterms:hasPart files/catalog_H-P.pdf URI    
body BMIcatalog.html dcterms:hasPart files/catalog_Q-Z.pdf URI    
body http://cnma.ucr.edu/ dcterms:hasPart additionalresources.html URI    
body http://cnma.ucr.edu/ dcterms:hasPart http://cnp.ucr.edu/cgi-bin/starfinder/0&#63;path=datafmpub.txt&amp;id=cnppub&amp;pass=cnppub&amp;OK=OK URI   data db access
body http://cnp.ucr.edu/cgi-bin/starfinder/0&#63;path=datafmpub.txt&amp;id=cnppub&amp;pass=cnppub&amp;OK=OK dcterms:accessRights Free public access plain literal    
body http://cnp.ucr.edu/cgi-bin/starfinder/0&#63;path=datafmpub.txt&amp;id=cnppub&amp;pass=cnppub&amp;OK=OK dcterms:type http://purl.org/dc/dcmitype/Dataset URI    
body http://cnma.ucr.edu/ dcterms:hasPart http://cnp.ucr.edu/cgi-bin/starfinder/0&#63;path=customlistpub.txt&amp;id=cnppub&amp;pass=cnppub&amp;OK=OK URI   custom db access
body http://cnp.ucr.edu/cgi-bin/starfinder/0&#63;path=customlistpub.txt&amp;id=cnppub&amp;pass=cnppub&amp;OK=OK dcterms:accessRights Free public access plain literal    
body http://cnp.ucr.edu/cgi-bin/starfinder/0&#63;path=customlistpub.txt&amp;id=cnppub&amp;pass=cnppub&amp;OK=OK dcterms:type http://purl.org/dc/dcmitype/Dataset URI    
body http://cnma.ucr.edu/ dcterms:hasPart http://cnp.ucr.edu/cgi-bin/starfinder/0&#63;path=BMIpub.txt&amp;id=cnppub&amp;pass=cnppub&amp;OK=OK URI   bmi db access
body http://cnp.ucr.edu/cgi-bin/starfinder/0&#63;path=BMIpub.txt&amp;id=cnppub&amp;pass=cnppub&amp;OK=OK dcterms:accessRights Free public access plain literal    
body http://cnp.ucr.edu/cgi-bin/starfinder/0&#63;path=BMIpub.txt&amp;id=cnppub&amp;pass=cnppub&amp;OK=OK dcterms:type http://purl.org/dc/dcmitype/Dataset URI    
body http://cnma.ucr.edu/ dcterms:hasPart http://cnma.ucr.edu/orderform.html URI    
body http://cnma.ucr.edu/ dcterms:relation http://cbsr.ucr.edu/cbsrcontacts.html URI    
body http://cnma.ucr.edu/ dcterms:isPartOf http://cbsr.ucr.edu/ URI    
body http://cnma.ucr.edu/ dcterms:relation http://estc.ucr.edu/ URI    
body http://cnma.ucr.edu/ dcterms:relation http://cnp.ucr.edu/ URI    
body http://cnma.ucr.edu/ dcterms:relation http://cdnc.ucr.edu/ URI    
body http://cnma.ucr.edu/ dcterms:relation http://ccila.ucr.edu/ URI    
body http://cnma.ucr.edu/ dcterms:isPartOf http://www.ucr.edu/ URI    
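
Several rows in Table D8 use relative URIs (for example BMIcatalog.html) and prefixed predicates, which a parser would expand before producing triples. The following Python sketch shows one way that expansion could look, writing the result as N-Triples lines. It is a sketch under stated assumptions: the base URI is taken to be http://cnma.ucr.edu/, the helper names expand and to_ntriple are hypothetical, and only the dcterms and xsd prefixes are handled.

```python
from urllib.parse import urljoin

# Namespace URIs for the prefixes used in this sketch.
PREFIXES = {
    "dcterms": "http://purl.org/dc/terms/",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}
BASE = "http://cnma.ucr.edu/"  # assumed base URI for relative references


def expand(curie):
    """Expand a prefixed name such as dcterms:title to a full URI."""
    prefix, local = curie.split(":", 1)
    return PREFIXES[prefix] + local


def to_ntriple(subject, predicate, obj, obj_type, datatype=None):
    """Serialize one table row as a single N-Triples line."""
    s = f"<{urljoin(BASE, subject)}>"
    p = f"<{expand(predicate)}>"
    if obj_type == "URI":
        o = f"<{urljoin(BASE, obj)}>"
    else:
        # Escape backslashes and double quotes inside the literal.
        text = obj.replace("\\", "\\\\").replace('"', '\\"')
        o = f'"{text}"'
        if obj_type == "typed literal":
            o += f"^^<{PREFIXES['xsd']}{datatype}>"
    return f"{s} {p} {o} ."


# Three rows copied from Table D8.
rows = [
    ("http://cnma.ucr.edu/", "dcterms:alternative", "CNMA", "plain literal", None),
    ("http://cnma.ucr.edu/", "dcterms:created", "2009", "typed literal", "gYear"),
    ("BMIcatalog.html", "dcterms:hasPart", "files/catalog_A-G.pdf", "URI", None),
]

triples = [to_ntriple(*row) for row in rows]
for line in triples:
    print(line)
```

Resolving every relative reference against a single base keeps the sketch short. A real RDFa parser would resolve each reference against the URI of the page in which it appears.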
