Another Step Toward Lifting Library Metadata into the Cloud


This post is the beginning of what will eventually be a longer and more complete entry describing my effort toward, and reasons for, attempting to create a MODS ontology.

First, a simple statement of the reason for attempting to create a MODS ontology.

Long-term goal: To help open the door for the vast quantity of the world’s MARC-formatted, librarian-created metadata to migrate into the Linked Open Data space.

I hope that an established and accepted ontology for MODS could enable the conversion of MARC to MODS to RDF in a way that facilitates ingestion of that data into triplestores and supports “natural” SPARQL query formation. Unlike the XML schema representation of MODS, an OWL-based expression of MODS could also provide an ontological base for asserting equivalence (owl:sameAs) relationships to other ontologies, and other set-theoretic assertions about MODS elements. These additional benefits of a MODS ontology could, hopefully, help to establish richer potential for library metadata to be integrated and queried (via SPARQL) with other Linked Open Data sets.
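To make the MODS-to-RDF step more concrete, here is a minimal sketch in Python using only the standard library. It is not the conversion used for this ontology: the `ONT` namespace, the blank-node naming scheme, and the sample record are all assumptions made for illustration. Only the MODS XML namespace itself is taken from the published schema. The `match` helper stands in for the kind of pattern a SPARQL query would express, with `None` playing the role of a variable.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"        # the published MODS XML namespace
ONT = "http://example.org/mods-ontology#"      # hypothetical ontology namespace

# Hypothetical sample record, for illustration only.
SAMPLE = """<mods xmlns="http://www.loc.gov/mods/v3">
  <titleInfo><title>Moby-Dick</title></titleInfo>
  <name><namePart>Melville, Herman</namePart></name>
</mods>"""


def local(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.split("}", 1)[-1]


def mods_to_triples(xml_text, record_id="rec1"):
    """Walk the element tree and emit one triple per parent/child link or text value."""
    triples = []
    counter = 0

    def walk(elem, subject):
        nonlocal counter
        for child in elem:
            name = local(child.tag)
            text = (child.text or "").strip()
            if len(child) == 0 and text:
                # Leaf element with text: treat it as a literal-valued property.
                triples.append((subject, ONT + name, text))
            else:
                # Container element: mint an illustrative node id and recurse.
                counter += 1
                node = f"_:{name}{counter}"
                triples.append((subject, ONT + name, node))
                walk(child, node)

    walk(ET.fromstring(xml_text), f"_:{record_id}")
    return triples


def match(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


triples = mods_to_triples(SAMPLE)
titles = [o for _, _, o in match(triples, p=ONT + "title")]
```

A real conversion would need to handle attributes, repeated elements and controlled vocabularies, which this sketch ignores.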

Additionally, see my follow-up post, More on Motivation for Investment in Implementation of a MODS Ontology

As for my efforts toward developing such an ontology, the sequence of images below is intended to serve as a visual aid for comprehending the increasing complexity of structure involved in implementing an RDF/OWL ontology-based representation of the MODS XML schema discussed at: http://www.loc.gov/standards/mods/ and defined in detail at: http://www.loc.gov/standards/mods/v3/mods-3-3.xsd

Each succeeding image builds on the previous one in the sequence by adding representations of a new set of statements about the MODS ontology to the previous level. The sets of statements chosen for addition at each new level of the progression are selected so as to keep the overall structure of the representation “tree-like” for as long as possible, only introducing visually complicating overlapping (one-to-many) relationships in later levels, after most generally “tree-preserving” visualizations have been exhausted.
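One way to decide which statements keep a graph “tree-like” can be sketched as follows. This is an illustration of the idea, not the procedure actually used to build the levels (those were chosen by hand); the function name and sample statements are hypothetical. A statement preserves the tree structure when its subject and object are not yet connected to each other, and it breaks the structure when they already are, because the new edge then closes a loop. A union-find structure tracks which nodes are already connected.

```python
def split_tree_preserving(statements):
    """Split (subject, predicate, object) statements into those that keep the
    drawn graph a tree and those that connect already-connected nodes."""
    parent = {}

    def find(x):
        # Find the representative of x's connected component.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps lookups short
            x = parent[x]
        return x

    keep, defer = [], []
    for s, p, o in statements:
        root_s, root_o = find(s), find(o)
        if root_s == root_o:
            # Endpoints already connected: this edge would close a loop.
            defer.append((s, p, o))
        else:
            parent[root_s] = root_o
            keep.append((s, p, o))
    return keep, defer


# Hypothetical statements: the last one gives C a second path to Root.
example = [
    ("A", "p", "Root"),
    ("B", "p", "Root"),
    ("C", "p", "A"),
    ("C", "q", "B"),
]
keep, defer = split_tree_preserving(example)
```

Statements that land in `defer` are the ones that would be held back for the later, visually more complicated levels.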

Matching the increasingly complex visualization sequence to the increasingly complete set of RDF statements from which each visualization was derived is intended to help reviewers understand and verify the accuracy and completeness of the final set of statements that make up the full ontology being offered for consideration.

Thus, each image level corresponds to an increasingly large subset of the entire set of RDF statements about the MODS ontology and has been produced by the open source graph visualization program Cytoscape (available for free download from the main Cytoscape site). A table of RDF (subject/predicate/object) statements is associated with each image, and each image was produced by importing that set of statements into Cytoscape. The full spreadsheet is also available as: Excel File and Google Doc.
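The relationship between the master table and the per-level graphs can be sketched as below. The column layout (a level number followed by subject, predicate, object), the `ex:linkedTo` predicate and the sample rows are assumptions for illustration, not the contents of the actual spreadsheet. The output uses Cytoscape's simple interaction format (SIF), in which each line holds a source node, an interaction type and a target node separated by tabs; the statement tables here may well have been imported by a different route.

```python
# Hypothetical rows from a master statement table:
# (level, subject, predicate, object). The predicate is a placeholder.
MASTER = [
    (1, "ModsCollection", "ex:linkedTo", "owl:Thing"),
    (1, "Mods", "ex:linkedTo", "ModsCollection"),
    (1, "modsGroup", "ex:linkedTo", "Mods"),
    (2, "titleInfo", "ex:linkedTo", "modsGroup"),
]


def statements_up_to(rows, level):
    """Each level is cumulative: keep every statement at or below `level`."""
    return [(s, p, o) for lvl, s, p, o in rows if lvl <= level]


def to_sif(statements):
    """Render statements as tab-delimited SIF lines: source, interaction, target."""
    return "".join(f"{s}\t{p}\t{o}\n" for s, p, o in statements)


level1 = statements_up_to(MASTER, 1)
level2 = statements_up_to(MASTER, 2)
sif_text = to_sif(level1)
```

Writing `sif_text` to a file with a `.sif` extension gives something Cytoscape can open as a network, with the predicate appearing as the edge's interaction type.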

Furthermore, each image and data table level has a corresponding, already-imported Cytoscape (.cys) file, which may be downloaded, viewed and manipulated directly within Cytoscape.

Click on any of the images for larger versions.

Level 1
This graph shows the 20 top-level MODS elements connected to the “modsGroup” center, which is in turn connected to the most general “owl:Thing -> ModsCollection -> Mods” hierarchy.
level.01_statement_table
level.01.cys

Level 2
Adds another level of class structure to the top-level MODS elements that require it.
level.02_statement_table
level.02.cys

Level 3
Adds unique literal (owl:DatatypeProperty) values to appropriate locations in the MODS ontology structure.
level.03_statement_table
level.03.cys

Level 4
Adds the remaining components that preserve the pure tree structure of the graph, including repeated groups of predicates (such as Xlink and LanguageGroup), enumeration classes, enumeration values, and a few subclasses of owl:Thing.
level.04_statement_table
level.04.cys

Level 5
Adds only eight new statements, introduced at this stage because they begin to significantly alter the pure tree-like structure of the previous graphs. In other words, they add new branches between already connected nodes, and thus begin to introduce noticeable new complexity into the graph.
level5_statement_table
level.05.cys

Level 6
Adds numerous repeated subclass relationships, identifying which items are Date, LanguageGroup, Xlink and Enumeration. Adding these new nodes and branches significantly complicates the visual appearance of the graph, since these many-to-one relationships introduce crossing branches. However, bringing them in this late has allowed us to preserve some visual clarity of structure in the earlier levels, and thus to better understand and verify the completeness of the build-up of the ontology to this point.
level.06_statement_table
level.06.cys

Note a couple of other less useful, but possibly interesting, patterns that can be derived from the master spreadsheet. These are provided mostly to stimulate the imagination regarding the kinds of patterns that might be observed with this technique:

2-3-4

2-3-5

In addition to the as yet incomplete upload of data and graphs above, the following is little more than an outline of areas that remain to be completed in the rest of this article.

Key issues to be considered and perhaps debated:

My currently most complete OWL MODS ontology is downloadable from here.

An Example MODS Record expressed in the candidate ontology

Not yet available

Tools Used:

Basic list of owl:DatatypeProperty (literal) predicates defined in this MODS ontology:

Text outline of MODS structure

References and Resources:

UCSD implementation and utilization of MODS in XDRE/DAMS/PAS

Acknowledgements:

I would particularly like to thank Gokhan Soydan from TopQuadrant for his assistance in making use of the TopBraid XML Schema Importer which, although not 100% automatic, was extremely helpful in providing a concrete example of how XML schema language might be transformed into OWL. Gokhan has indicated that he expects future versions of TopBraid Composer to handle conversion of XML Schema constructs to OWL more completely. The current version (1.3.0), which I used, did most of the job, though it took me a while to realize that it failed to process the top-level tag elements which, unfortunately, define the whole first-level primary structure that binds the 20 top-level MODS elements to the central “modsGroup”.

  1. #1 by admin - July 30th, 2009 at 11:53

    2009.07.28

    From: twitter.com/declan

    @edsu @anarchivist @mjgiarlo Looking for feedback: Another Step Toward Lifting Library Metadata into the Cloud http://tinyurl.com/nfhhpk

    From: twitter.com/epoz

    “But why stick with MODS? Have you perused: http://bibliontology.com/ ?”

    2009.07.29

    From: twitter.com/cfrymann

    “Starting to share work on MODS ontology development” (http://tinyurl.com/nfhhpk)

    “@epoz Thanks for the reference to http://bibliontology.com”

    From: twitter.com/epoz

    “@cfrymann You’re welcome. Great work there.”

    “It is very modish to make everything MODS, but it feels like a square/round/peg/hole thing to me often.”
