This document assumes that readers are already familiar with XML, which is not discussed in detail here. Readers are, however, introduced to the essential issues of converting data to XML. This section describes the process of converting your local data to AGRIS AP compliant XML documents.
The AGRIS AP, or in database terms the AGRIS data model, prescribes the vocabulary, content and structure rules that can be used to share information between heterogeneous datasets without requiring any change to the local system. With tools such as XSLT, information extraction and conversion becomes a simple yet important step towards interoperability. Because the resource itself does not have to be attached to the metadata, access rights to it remain easy to control.
The AGRIS AP is accompanied by a DTD [4], which is used to validate input from different resource centres. The first steps should be:
The production and export of XML resources from local databases to the AGRIS AP model (see Figure 1 below) is easier when the source database is XML-enabled, that is, if it supports extensions for transferring data between XML documents and its own data structures.
The following four steps briefly describe the process of generating valid AGRIS XML records from proprietary XML-enabled databases:
Figure 1: The AGRIS AP XML production process
The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH for short) “provides an application-independent interoperability framework based on metadata harvesting”. By implementing the OAI protocol, data providers are able to generate well-formed and valid XML data by mapping their local repositories to a common DC metadata format. The Norwegian University Library of Life Sciences [7], as a data provider, has implemented the OAI protocol and exposed its metadata to the AGRIS harvester by means of a unique identifier (URI). For detailed information on implementing OAI-PMH, refer to its guidelines [8].
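As a rough illustration of what a harvester sends to such a data provider, the sketch below builds an OAI-PMH ListRecords request URL in Python. The base URL is a hypothetical placeholder, not the address of any real repository, and the helper name is an assumption of this example. The arguments verb, metadataPrefix, set and from are request arguments defined by the OAI-PMH 2.0 specification, and oai_dc is the Dublin Core format that OAI-PMH repositories are required to support.

```python
from urllib.parse import urlencode

# Hypothetical base URL; replace it with the address published by the data provider.
BASE_URL = "https://repository.example.org/oai"


def build_list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None, from_date=None):
    """Return the URL of an OAI-PMH ListRecords request (no network access is made)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec      # optional: restrict harvesting to one set
    if from_date:
        params["from"] = from_date    # optional: only records changed since this date
    return base_url + "?" + urlencode(params)


print(build_list_records_url(BASE_URL))
print(build_list_records_url(BASE_URL, from_date="2006-01-01"))
```

The function only assembles the request; fetching the URL and handling the XML response are left out of this sketch.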
The final production of valid AGRIS AP XML documents is achieved with XSLT stylesheets. The Extensible Stylesheet Language (XSL) provides elements that define rules for how one XML document is transformed into another XML document. In this context, developing an XSLT stylesheet can be an easy exercise if the structure of the local repository is DC compliant. For other, more complex metadata formats, such as MARC, it can require more work. There are several options to consider, each with particular requirements, such as different fields and different conditions and rules to apply, and accordingly different stylesheets to write.
The example provided below is used by one of the AGRIS Resource Centres, which runs an XML-enabled ILMS called InMagic. It shows how an XML tag of a record extracted by InMagic is transformed (from well-formed “InMagic” XML to valid AGRIS AP XML) using an XPath expression that addresses the Title node of the local database and writes it to the output result tree. In plain language, the XSLT instructions say: if the title element exists, select it and write it with the right AGRIS AP XML tag, in this case the core Dublin Core element dc:title.
Input
<inm:Title---Eng-M>Conservation and use of native tropical fruit species biodiversity in Asia</inm:Title---Eng-M>
XSLT instructions
<xsl:if test="string-length(inm:Title---Eng-M)>0">
Output
<dc:title xml:lang="eng">Conservation and use of native tropical fruit species biodiversity in Asia</dc:title>
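For readers who want to check the mapping without an XSLT processor, the same rule (if the title is non-empty, write it as dc:title with xml:lang="eng") can be sketched in Python with the standard xml.etree.ElementTree module. This is not XSLT and not the Resource Centre's stylesheet, only an equivalent illustration. The InMagic namespace URI and the inm:Record wrapper element are hypothetical placeholders, and the function name is an assumption of this example; the Dublin Core and XML namespace URIs are the standard ones.

```python
import xml.etree.ElementTree as ET

INM_NS = "urn:example:inmagic"  # hypothetical placeholder for the real InMagic namespace
DC_NS = "http://purl.org/dc/elements/1.1/"
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Serialize Dublin Core elements with the conventional "dc" prefix.
ET.register_namespace("dc", DC_NS)


def convert_title(record):
    """Return a dc:title element for the record, or None if it has no title text."""
    source = record.find(f"{{{INM_NS}}}Title---Eng-M")
    if source is None or not source.text:
        return None  # plays the role of the xsl:if test failing
    title = ET.Element(f"{{{DC_NS}}}title")
    title.set(f"{{{XML_NS}}}lang", "eng")
    title.text = source.text
    return title


# Hypothetical input record wrapping the title element from the example above.
SAMPLE = (
    f'<inm:Record xmlns:inm="{INM_NS}">'
    "<inm:Title---Eng-M>Conservation and use of native tropical fruit species "
    "biodiversity in Asia</inm:Title---Eng-M>"
    "</inm:Record>"
)

title = convert_title(ET.fromstring(SAMPLE))
print(ET.tostring(title, encoding="unicode"))
```

A record without a title yields None, so no empty dc:title element is written to the output.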
Validating parsers check the well-formedness of XML documents and verify that they conform to the specific rules of the AGRIS AP XML DTD. Validation can be performed with the Microsoft XML Parser (MSXML), which is included in Microsoft Internet Explorer. In the next section we will see that AGRIS AP XML validation is made easier because the AGRIS DTD is published at a fixed (PURL) location.
Other XML parsers, many of them freeware, are available on the Internet [9]. One widely used tool is XML Spy [10], a comprehensive package used to create, edit and validate XML, XSL and DTD/XML Schema documents.
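Validation involves two checks: well-formedness and conformance to the DTD. The sketch below covers only the first, using Python's standard xml.etree.ElementTree parser, which is non-validating; checking a document against the AGRIS AP DTD still requires a validating parser such as those mentioned above. The function name and the sample strings are assumptions of this example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return (True, "") if xml_text parses, else (False, parser error message)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return False, str(err)
    return True, ""


GOOD = '<dc:title xmlns:dc="http://purl.org/dc/elements/1.1/" xml:lang="eng">A title</dc:title>'
BAD = "<dc:title>namespace prefix is not declared</dc:title>"

for sample in (GOOD, BAD):
    ok, message = is_well_formed(sample)
    print("well-formed" if ok else "not well-formed: " + message)
```

Running this check before submitting records can catch unclosed tags and undeclared namespace prefixes early, before the DTD is consulted.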
4 The AGRIS AP XML DTD is available for validation at http://purl.org/agmes/agrisap/dtd/ and, for display, in APPENDIX A
5 http://www.fao.org/agris/
6 http://www.openarchives.org/OAI/2.0/openarchivesprotocol.htm
7 http://www.umb.no/
8 http://www.openarchives.org/OAI/2.0/guidelines.htm
9 http://www.xml.com/pub/rg/XML_Parsers
10 http://www.altova.com