
3. General Issues

3.1 Before you start

This document assumes that readers are already familiar with XML, so XML itself is not discussed in detail. Readers are, however, introduced to the essential issues of converting to XML. This section describes the process of converting your local data to AGRIS AP compliant XML documents.

The AGRIS AP, or in database terms the AGRIS data model, prescribes the vocabulary, content and structure rules that can be used to share information between heterogeneous datasets without requiring any change to the local system. With tools such as XSLT, extracting and converting information becomes a simple yet important step towards interoperability. Because the resource itself does not have to be attached to the metadata, access rights to the resource remain easy to control.

The AGRIS AP is accompanied by a DTD [4] which is used to validate inputs from different resource centres. The first steps should be:

3.2 Exporting from XML-enabled databases

The production and export of XML resources from local databases to the AGRIS AP model (see Figure 1 below) is easier when the source database is XML-enabled, that is, when it supports extensions for transferring data between XML documents and its own data structures.

The following four steps briefly describe the process of generating valid AGRIS XML records from proprietary XML-enabled databases:

  1. Identification of the fields in the catalogue of the local database that will match the AGRIS AP XML DTD elements and schemes. The resulting mapping document links the fields of the local database to the elements and qualifiers of the DTD.
  2. An XSLT stylesheet encodes the mapping document produced by the cataloguers. The template will link and match the nodes from each field of the local database to the appropriate elements and schemes of the AGRIS AP XML DTD.
  3. The well-formed XML documents are converted to AGRIS AP XML resources by means of an XSLT processor.
  4. The XML documents are validated against the AGRIS AP XML DTD by means of XML parsers (see 3.5).

Figure 1: The AGRIS AP XML production process
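The mapping document from step 1 can be thought of as a simple lookup table. The following is a minimal sketch in Python, not part of the AGRIS tooling: only the dc:title row is taken from the example in section 3.4, while the other field names, the targets dc:creator and dc:subject, and the helper apply_mapping are illustrative assumptions. It shows how a mapping can be applied to one local record and how fields that have no target yet can be reported back to the cataloguers.

```python
# Hypothetical mapping document: local catalogue field -> target element.
# Only the dc:title row comes from the example in section 3.4; the other
# field names and targets are illustrative assumptions.
FIELD_MAP = {
    "Title---Eng-M": "dc:title",
    "Author": "dc:creator",
    "Keywords": "dc:subject",
}


def apply_mapping(local_record, field_map=FIELD_MAP):
    """Return (mapped, unmapped): mapped is a list of (element, value)
    pairs, unmapped is a list of local field names with no target."""
    mapped, unmapped = [], []
    for field, value in local_record.items():
        if field in field_map:
            mapped.append((field_map[field], value))
        else:
            unmapped.append(field)
    return mapped, unmapped


record = {
    "Title---Eng-M": "Conservation and use of native tropical fruit species biodiversity in Asia",
    "Shelfmark": "A-123",  # hypothetical local-only field without a target
}
mapped, unmapped = apply_mapping(record)
print(mapped)
print(unmapped)
```

In practice the same table would be encoded as XSLT templates (step 2); a plain data structure like this is only a convenient way to review the mapping before writing the stylesheet.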

3.3 The OAI-PMH [6] example

The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH for short) “provides an application-independent interoperability framework based on metadata harvesting”. By implementing the OAI protocol, data providers can generate well-formed and valid XML data by mapping their local repositories to a common DC metadata format. The Norwegian Univ. Library of Life Sciences [7], as a data provider, has implemented the OAI protocol and exposed its metadata to the AGRIS harvester by means of a unique identifier (URI). For detailed information on implementing OAI-PMH, refer to its guidelines [8].
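To make the harvesting side concrete, here is a minimal sketch of how a harvester might build an OAI-PMH request for Dublin Core records. The verb ListRecords, the metadataPrefix value oai_dc and the optional set and from arguments are defined by OAI-PMH 2.0; the base URL http://example.org/oai and the function name list_records_url are placeholders, not the address or code of any real data provider.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None, from_date=None):
    """Build an OAI-PMH ListRecords request URL.

    base_url is the data provider's OAI-PMH endpoint (placeholder here).
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec      # optional selective harvesting by set
    if from_date:
        params["from"] = from_date    # optional lower date bound, e.g. 2005-01-01
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint, used only for illustration.
print(list_records_url("http://example.org/oai"))
print(list_records_url("http://example.org/oai", from_date="2005-01-01"))
```

The response to such a request is an XML document containing the provider's records, which can then be transformed to AGRIS AP XML as described in the next section.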

3.4 XSLT transformation to the AGRIS AP metadata

The final production of valid AGRIS AP XML documents is achieved with XSLT stylesheets. The Extensible Stylesheet Language (XSL) provides elements that define rules for how one XML document is transformed into another XML document. In this context, developing an XSLT stylesheet can be an easy exercise if the structure of the local repository is DC compliant. For other, more complex metadata formats, such as MARC, it can require more work. There are several options to consider, each with particular requirements, such as different fields, different conditions and rules to apply, and accordingly different stylesheets to encode.

The example provided below is used by one of the AGRIS Resource Centres, which runs an XML-enabled ILMS called InMagic. It shows how an XML tag of a record extracted by InMagic is transformed (from well-formed “InMagic” XML to valid AGRIS AP XML) using an XPath expression that addresses the Title node of the local database and writes its content to the result tree. In plain language, the XSLT instructions say: if the title element exists and is not empty, select it and write it with the right AGRIS AP XML tag, in this case the core Dublin Core element dc:title.

Input

<inm:Title---Eng-M>Conservation and use of native tropical fruit species biodiversity in Asia</inm:Title---Eng-M>

XSLT instructions

<xsl:if test="string-length(inm:Title---Eng-M)>0">
  <dc:title xml:lang="eng">
    <xsl:value-of select="inm:Title---Eng-M"/>
    <xsl:text/>
  </dc:title>
</xsl:if>

Output

<dc:title xml:lang="eng">Conservation and use of native tropical fruit species biodiversity in Asia</dc:title>
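For readers who want to prototype the same rule outside an XSLT processor, the sketch below reproduces it with Python's standard xml.etree.ElementTree module. It is a stand-in for the stylesheet, not a replacement for it. The Dublin Core namespace URI is the standard one; the inm namespace URI, the inm:Record wrapper element and the function name convert_title are placeholders introduced for illustration, so substitute the values your InMagic export actually uses.

```python
import xml.etree.ElementTree as ET

INM_NS = "http://example.org/inmagic"        # placeholder, not the real InMagic URI
DC_NS = "http://purl.org/dc/elements/1.1/"   # Dublin Core element set namespace
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
ET.register_namespace("dc", DC_NS)           # serialize with the dc: prefix


def convert_title(record):
    """Mirror the xsl:if rule: emit dc:title only when the source title
    exists and its text has length greater than zero."""
    source = record.find("{%s}Title---Eng-M" % INM_NS)
    text = "".join(source.itertext()) if source is not None else ""
    if len(text) == 0:
        return None
    title = ET.Element("{%s}title" % DC_NS, {XML_LANG: "eng"})
    title.text = text
    return title


# Hypothetical wrapper element around the input line shown above.
SAMPLE = (
    '<inm:Record xmlns:inm="%s">'
    "<inm:Title---Eng-M>Conservation and use of native tropical fruit "
    "species biodiversity in Asia</inm:Title---Eng-M>"
    "</inm:Record>" % INM_NS
)
title = convert_title(ET.fromstring(SAMPLE))
print(ET.tostring(title, encoding="unicode"))
```

Unlike the XSLT output shown above, the serialized element also carries an xmlns:dc declaration, because it is printed on its own rather than inside a document that already declares the namespace.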

3.5 Tools to validate XML documents

Validating parsers check that XML documents are well-formed and verify that they conform to the rules of the AGRIS AP XML DTD. Validation can be carried out with the Microsoft XML Parser (MSXML), which is included in Microsoft Internet Explorer. As the next section shows, AGRIS AP XML validation is made easier because the AGRIS DTD is located at a fixed (PURL) location.

Other XML parsers, many of them freeware, are available on the Internet [9]. A widely used tool is XML Spy [10], a comprehensive package for creating, editing and validating XML, XSL and DTD/XML Schema documents.
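The first of the two checks described in this section, well-formedness, can be scripted before a record is handed to a validating parser. The sketch below uses Python's standard xml.etree.ElementTree module; the function name is_well_formed is an illustrative choice. Note the limitation: the parser in the Python standard library is non-validating, so this does not check a document against the AGRIS AP XML DTD. That second step still needs a validating parser such as the tools named above.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return (True, "") if xml_text parses as XML, else (False, reason).

    This checks well-formedness only; it does NOT validate against a DTD.
    """
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return False, str(err)
    return True, ""


good = '<dc:title xmlns:dc="http://purl.org/dc/elements/1.1/">A title</dc:title>'
bad = "<record><title>Unclosed title</record>"  # mismatched tags

print(is_well_formed(good))
print(is_well_formed(bad))
```

Running such a check in a batch export catches broken records early, so that DTD validation is only run on documents that can be parsed at all.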


[4] The AGRIS AP XML DTD is available for validation at http://purl.org/agmes/agrisap/dtd/ and, for display, in APPENDIX A
[5] http://www.fao.org/agris/
[6] http://www.openarchives.org/OAI/2.0/openarchivesprotocol.htm
[7] http://www.umb.no/
[8] http://www.openarchives.org/OAI/2.0/guidelines.htm
[9] http://www.xml.com/pub/rg/XML_Parsers
[10] http://www.altova.com


