Saturday, April 5, 2014

Ontology Reuse and Ontology Summit 2014

I've been doing a lot of thinking about ontology and vocabulary reuse (given my role as co-champion of Track A in Ontology Summit 2014). We are finally in our "synthesis" phase of the Summit, and I just updated our track's synthesis draft yesterday.

So, while this is all fresh in my mind, I want to highlight a few key takeaways. For an ontology to be reused, it must provide something "that is commonly needed"; it must then be found by someone looking to reuse it, understood by that person, and trusted with regard to its quality. (Sam Adams made all these points in 1993 in a panel discussion on software reuse.) To be understood and trusted, it must be documented far more completely than is usually done today.

Here are some of the suggestions for documentation:
  • Fully describe and define each of the concepts, relationships, axioms and rules that make up the ontology (or fragment)
  • Explain why the ontology was developed
  • Explain how the ontology is to be used (and perhaps how the uses may vary with different triple stores or tools)
  • Explain how the ontology has been, or is being, used (its history) and how it was tested in those environments
    • Explain differences, if it is possible to use the ontology in different ways in different domains and/or for different purposes
  • Provide valid encoding(s) of the ontology
    • The documentation should describe how each encoding has evolved over time
    • "Valid" means that there are no consistency errors when a reasoner is run against the ontology
    • It is also valuable to create a few individuals, run a reasoner, and make sure that each individual's inferred types are correct (e.g., an individual that is supposed to be only of type "ABC" is not also of type "DEF" and "XYZ")
    • Multiple encodings may exist because different syntaxes are used (for example, Turtle for better readability and OWL Functional Syntax for better version control) and because the content is deliberately separated to provide:
      • A "basic" version of the ontology with only the definitive concepts, axioms and properties
      • Other ontologies that add properties and axioms, perhaps to address particular domains
      • Rules that apply to the ontology, in general or for particular domains
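
As a rough illustration of the individual check described in the list above, here is a minimal Python sketch. It is not a real reasoner: it only follows asserted subclass links, so it is a toy stand-in for what an OWL reasoner would infer. The class names (ABC, DEF, XYZ, Thing), the individual names, and both helper functions are hypothetical and exist only for this example.

```python
# Toy stand-in for a reasoner: it only computes the closure of asserted
# subclass links. All class and individual names here are hypothetical.

# Each class maps to its direct superclasses.
SUBCLASS_OF = {
    "ABC": {"Thing"},
    "DEF": {"Thing"},
    "XYZ": {"DEF"},
}

# Each test individual maps to its directly asserted types.
ASSERTED_TYPES = {
    "individual1": {"ABC"},
    "individual2": {"XYZ"},
}


def inferred_types(individual):
    """Return the asserted types of an individual plus all their superclasses."""
    result = set()
    pending = list(ASSERTED_TYPES.get(individual, ()))
    while pending:
        cls = pending.pop()
        if cls not in result:
            result.add(cls)
            pending.extend(SUBCLASS_OF.get(cls, ()))
    return result


def unexpected_types(individual, expected):
    """Return inferred types that the ontology author did not expect."""
    return inferred_types(individual) - set(expected)


if __name__ == "__main__":
    # individual1 should be an ABC (and a Thing), and nothing else.
    print(sorted(unexpected_types("individual1", {"ABC", "Thing"})))
    # If the author expected individual2 to be only an XYZ, the extra
    # inferred types point at something to review in the ontology.
    print(sorted(unexpected_types("individual2", {"XYZ"})))
```

With a real ontology, the same idea would apply to the types reported by an actual reasoner: list the types each test individual is expected to have, and investigate any inferred type that is not on the list.
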
Defining much of this information is a goal of the VOCREF (Vocabulary and Ontology Characteristics Related to Evaluation of Fitness) Ontology, which was the subject of a Hackathon event in this year's Ontology Summit. I participated in that event on March 29th and learned a lot. A summary of our experiences and lessons learned is posted on the Summit wiki.

VOCREF is a good start at specifying characteristics for an ontology, and I will certainly continue to contribute to it. But I also feel that too much content is contained in the vocref-top ontology (I did create an issue to address this). That makes it too top-heavy and not as reusable as I would like. Some of the content needs to be split into separate ontologies that can be reused independently of characterizing an ontology. Also, the VOCREF ontology needs to "dog-food" its own concepts, relationships, etc. That is, VOCREF itself needs to be more fully documented.

To try to help with ontology development and reuse, I decided to start a small catalog of content (I won't go so far as to call it a "repository"). The content in the catalog will vary from annotation properties that can provide basic documentation, to general concepts applicable to many domains (for example, a small event ontology), to content specific to a domain. The catalog may directly reference, document and (possibly) extend ontologies like VOCREF (with correct attribution), or may include content that is newly developed. For example, right now, I am working on some general patterns and a high-level network management ontology. I will post my current work, and then drill down to specific semantics.

All of the content will be posted on the Nine Points GitHub page. The content will be fully documented, and licensed under the MIT License (unless prohibited by the author and the licensing of the original content). In addition, for much of the content, I will also try to discuss the ontology here, on my blog.

Let me know if you have feedback on this approach and if there is some specific content that you would like to see!

Andrea

1 comment:

  1. I updated this post since I decided to put the source ontologies on GitHub (https://github.com/NinePts/reusable-ontologies) and use the MIT License.