Monday, April 28, 2014

General, Reusable Metadata Ontology - V0.2

This is just a short post to announce that a newer version of the general metadata ontology is available. The ontology was originally discussed in a blog post on April 16th. And, if you have trouble downloading the files, there is help in the blog post from April 17th.

I have taken all the feedback, and reworked and simplified the ontology (I hope). All the changes are documented in the ontology's changeNote.

Important sidebar: I strongly recommend using something like a changeNote to track the evolution of every ontology and model.

As noted in the April 16th post, most of the concepts in the ontology are taken from the Dublin Core Elements vocabulary and the SKOS data model. In this version, the well-established properties from Dublin Core and SKOS use the namespaces/IRIs from those sources (http://purl.org/dc/elements/1.1/ and http://www.w3.org/2004/02/skos/core#, respectively). Some examples are dc:contributor, dc:description and skos:prefLabel. Where the semantics are different, or more obvious names are defined (for example, creating names that provide "directions" for the skos:narrower and skos:broader relations), then the purl.org/ninepts namespace is used.
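To make the namespace reuse concrete, here is a minimal Python sketch that expands the prefixed names above into full IRIs. The two namespace IRIs are the ones quoted in this post; the prefix table and the expand helper are only illustrative and are not part of the ontology. The ninepts namespace is left out because its exact form is not spelled out here.

```python
# Namespace IRIs as quoted in the post. The prefixes "dc" and "skos" are the
# conventional ones; the table and helper below are illustrative only.
PREFIXES = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "skos": "http://www.w3.org/2004/02/skos/core#",
}


def expand(prefixed_name: str) -> str:
    """Expand a prefixed name such as 'dc:contributor' into a full IRI."""
    prefix, sep, local = prefixed_name.partition(":")
    if not sep or prefix not in PREFIXES:
        raise KeyError(f"unknown prefix in {prefixed_name!r}")
    return PREFIXES[prefix] + local


for name in ("dc:contributor", "dc:description", "skos:prefLabel"):
    print(name, "->", expand(name))
```

Reusing the original IRIs means that any tool that already understands dc:contributor or skos:prefLabel sees exactly the same property, rather than a look-alike in a new namespace.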

This release is getting much closer to a "finished" ontology. All of the properties have descriptions and examples, and most have scope/usage notes. The ontology's scope note describes what is not mapped from Dublin Core and SKOS, and why.

In addition, I have added two unique properties for the ontology. One is competencyQuestions and the other is competencyQuery. The concept of competency questions was originally defined in a 1995 paper by Gruninger and Fox as "requirements that are in the form of questions that [the] ontology must be able to answer." The questions help to define the scope of the ontology, and are [should be] translated to queries to validate the ontology. These queries are captured in the metadata ontology as SPARQL queries (and the corresponding competency question is included as a comment in the query, so that it can be tracked). This is a start at test-driven development for ontologies. :-)
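As a rough illustration of how a competency question can travel with its query, here is a small Python sketch. The question, the SPARQL text and the "Competency question:" comment marker are all made up for this example (they are not copied from the ontology), and the helper only recovers the question from the comment; it does not run the query against a triple store.

```python
import re
from typing import Optional

# Hypothetical competency query. The "Competency question:" marker is an
# assumed convention for this sketch, not something defined by the ontology.
COMPETENCY_QUERY = """\
# Competency question: Who contributed to a given resource?
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?resource ?contributor
WHERE { ?resource dc:contributor ?contributor }
"""


def competency_question(query: str) -> Optional[str]:
    """Return the question recorded in the query's comment, if there is one."""
    match = re.search(r"^#\s*Competency question:\s*(.+)$", query, re.MULTILINE)
    return match.group(1).strip() if match else None


print(competency_question(COMPETENCY_QUERY))
```

Keeping the question inside the query text means that a failing query can always be traced back to the requirement that it was meant to check.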

Please take a look at the ontology (even if you did before since it has evolved), and feel free to comment or (even better) contribute.

Andrea

Thursday, April 17, 2014

Downloading the Metadata Ontology Files from GitHub

Since I posted my ontology files to GitHub, and got some emails that the downloads were corrupted, I thought that I should clarify the download process.

You are certainly free to fork the repository and get a local copy. Or, you can just download the file(s) by following these instructions:
  • LEFT click on the file in the directory on GitHub
  • The file is displayed with several tabs across the top. Select the Raw tab.
  • The file is now displayed in your browser window as text. Save the file to your local disk using the "Save Page As ..." drop-down option, under File.
After you download the file(s), you can then load one of them into something like Protege. (It is only necessary to load one since they are all the same.) Note that there are NO classes, data or object properties defined in the ontology. There are only annotation properties that can be used on classes, data and object properties. Since I need this all to be usable in reasoning applications, I started with defining and documenting annotation properties.

I try to note this in a short comment on the ontology (but given the confusion, I should probably expand the comment). I am also working on a metadata-properties ontology which defines some of the annotation properties as data and object properties. This will allow (for example) validating dateTime values and referencing objects/individuals in relations (as opposed to using literal values). It is important to note, however, that you can only use data and object properties with individuals (and not with class or property declarations, or you end up with OWL Full with no computational guarantees/no reasoning).

Lastly, for anyone that objects to using annotation properties for mappings (for example, where I map SKOS' exactMatch in the metadata-annotations ontology), no worries ... More is coming. As a place to start, I defined exactMatch, moreGeneralThan, moreSpecificThan, ... annotation properties for documentation and human-consumption. (I have to start somewhere. :-) And, I tried to be more precise in my naming than SKOS, which names the latter two relations, "broader" and "narrower", with no indication of whether the subject or the object is more broad or more narrow. (I always get this mixed up if I am away from the spec for more than a week. :-)
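Here is a small Python sketch of the direction problem. It relies on the SKOS reading that "A skos:broader B" means that B is the broader concept, and it assumes that moreSpecificThan and moreGeneralThan are read as plain English from subject to object (check the ontology's descriptions for the definitive semantics). The restate helper and the ex: names are hypothetical.

```python
def restate(subject: str, predicate: str, obj: str) -> tuple:
    """Restate a SKOS hierarchy triple using a name that carries its direction.

    Assumption for this sketch: "X moreSpecificThan Y" and "X moreGeneralThan Y"
    are read as plain English sentences, from subject to object.
    """
    if predicate == "skos:broader":
        # "A skos:broader B" says that B is a broader concept than A.
        return (subject, "moreSpecificThan", obj)
    if predicate == "skos:narrower":
        # "A skos:narrower B" says that B is a narrower concept than A.
        return (subject, "moreGeneralThan", obj)
    raise ValueError(f"not a SKOS hierarchy relation: {predicate}")


# Hypothetical concepts, for illustration only.
print(restate("ex:cat", "skos:broader", "ex:animal"))
print(restate("ex:animal", "skos:narrower", "ex:cat"))
```

With the restated names, each triple reads as a sentence, so there is no need to go back to the spec to work out which side is the broader one.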

I want to unequivocally state that annotation properties are totally inadequate to do anything significant. But, they are a start, and something that another tool could query and use. Separately, I am working on a more formal approach to mapping, but documentation is where I am starting.

Obviously, there is a lot more work in the pipeline. I just wish I had more time (like everyone).

In the meantime, please let me know if you have more questions about the ontologies or any of my blog entries.

Andrea

Wednesday, April 16, 2014

General, Reusable, Metadata Ontology

I recently created a new ontology, following the principles discussed in Ontology Summit 2014's Track A. (If you are not familiar with the Summit, please check out some of my earlier posts.) My goal was to create a small, focused, general, reusable ontology (with usage and scope information, examples of each concept, and more). I must admit that it was a lot more time-consuming than I anticipated. It definitely takes time to create the documentation, validate and spell-check it, make sure that all the possible information is present, etc., etc.

I started with something relatively easy (I thought), which was a consolidation of basic Dublin Core and SKOS concepts into an OWL 2 ontology. The work is not yet finished (I have only been playing with the definition over the last few days). The "finished" pieces are the ontology metadata/documentation (including what I didn't map and why), and several of the properties (contributor, coverage, creator, date, language, mimeType, rights and their sub-properties). The rest is all still a work-in-progress.

It has been interesting creating and dog-fooding the ontology. I can definitely say that it was updated based on my experiences in using it!

You can check out the ontology definition on GitHub (http://purl.org/ninepts/metadata). My "master" definition is in the .ofn file (OWL functional syntax), and I used Protege to generate a Turtle encoding from it. My goals are to maintain the master definition in a version-control-friendly format (ofn), and also to provide a somewhat human-readable format (ttl). I also want to experiment with different natural language renderings that are more readable than Turtle (but I am getting ahead of myself).

I would appreciate feedback on this metadata work, and suggestions for other reusable ontologies (that would help to support industry and refine the development methodology). Some of the ontologies that I am contemplating are ontologies for collections, events (evaluating and bringing together concepts from several, existing event ontologies), actors, actions, policies, and a few others.

Please let me know what you think.

Andrea

Saturday, April 5, 2014

Ontology Reuse and Ontology Summit 2014

I've been doing a lot of thinking about ontology and vocabulary reuse (given my role as co-champion of Track A in Ontology Summit 2014). We are finally in our "synthesis" phase of the Summit, and I just updated our track's synthesis draft yesterday.

So, while this is all fresh in my mind, I want to highlight a few key take-aways ... For an ontology to be reused, it must provide something "that is commonly needed"; and then, the ontology must be found by someone looking to reuse it, understood by that person, and trusted as regards its quality. (Sam Adams made all these points in 1993 in a panel discussion on software reuse.) To be understood and trusted, it must be documented far more completely than is (usually) currently done.

Here are some of the suggestions for documentation:
  • Fully describe and define each of the concepts, relationships, axioms and rules that make up the ontology (or fragment)
  • Explain why the ontology was developed
  • Explain how the ontology is to be used (and perhaps how the uses may vary with different triple stores or tools)
  • Explain how the ontology was/is being used (history) and how it was tested in those environment(s)
    • Explain differences, if it is possible to use the ontology in different ways in different domains and/or for different purposes
  • Provide valid encoding(s) of the ontology
    • These encodings should discuss how each has evolved over time
    • "Valid" means that there are no consistency errors when a reasoner is run against the ontology
    • It is also valuable to create a few individuals, run a reasoner, and make sure that the individual's subsumption hierarchy is correct (e.g., an individual that is supposed to only be of type "ABC", is not also of type "DEF" and "XYZ")
    • Multiple encodings may exist due to the use of different syntaxes (Turtle and OWL Functional Syntax, for example, to provide better readability, and better version control, respectively) and to specifically separate the content to provide:
      • A "basic" version of the ontology with only the definitive concepts, axioms and properties
      • Other ontologies that add properties and axioms, perhaps to address particular domains
      • Rules that apply to the ontology, in general or for particular domains
Defining much of this information is a goal of the VOCREF (Vocabulary and Ontology Characteristics Related to Evaluation of Fitness) Ontology, which was a Hackathon event in this year's Ontology Summit. I participated in that event on March 29th and learned a lot. A summary of our experiences and learnings is posted on the Summit wiki.

VOCREF is a good start at specifying characteristics for an ontology. I will certainly continue to contribute to it. But, I also feel that too much content is contained in the vocref-top ontology (I did create an issue to address this). That makes it too top-heavy and not as reusable as I would like. Some of the content needs to be split into separate ontologies that can be reused independently of characterizing an ontology. Also, the VOCREF ontology needs to "dog-food" its own concepts, relationships, ... VOCREF itself needs to be more fully documented.

To try to help with ontology development and reuse, I decided to start a small catalog of content (I won't go so far as to call it a "repository"). The content in the catalog will vary from annotation properties that can provide basic documentation, to general concepts applicable to many domains (for example, a small event ontology), to content specific to a domain. The catalog may directly reference, document and (possibly) extend ontologies like VOCREF (with correct attribution), or may include content that is newly developed. For example, right now, I am working on some general patterns and a high level network management ontology. I will post my current work, and then drill-down to specific semantics.

All of the content will be posted on the Nine Points GitHub page. The content will be fully documented, and licensed under the MIT License (unless prohibited by the author and the licensing of the original content). In addition, for much of the content, I will also try to discuss the ontology here, on my blog.

Let me know if you have feedback on this approach and if there is some specific content that you would like to see!

Andrea