Showing posts with label competency questions. Show all posts

Monday, March 5, 2018

Using Ontology Design Patterns as Templates for Alignment

The paper, When owl:sameAs isn't the Same, has an interesting observation that I (often) have to address when working with my customers ...
Contrary to popular belief in some circles, formal semantics are not a silver bullet. Just because a construct in a knowledge representation language is prescribed a behavior using formal semantics does not necessarily mean that people will follow those semantics when actually using that language “in the wild.” This can be laid down to a wide variety of reasons. In particular, the language may not provide the facilities needed by people as they actually try to encode knowledge, so they may use a construct that seems close enough to their desired one. A combination of not reading specifications - especially formal semantics, which even most software developers and engineers lack training in - and the labeling of constructs with “English-like” mnemonics naturally will lead to the use of a knowledge representation language by actual users that varies from what its designers intended.
A variant of this subject also came up in the opening keynote address (From Artwork to Cyber Attacks) at the U.S. Semantics Symposium 2018, held last week at Wright State University. Craig Knoblock (USC Information Sciences Institute) gave the keynote and said that mapping the information in the American Art Collaborative Project (slides 32-38) was difficult because different students interpreted the backing ontology in different ways. Although the students who mapped the data all used the CIDOC CRM cultural heritage ontology, much clean-up and coordination was still required. Slide 35 reviews the statistics - although only 76 files were mapped, 4636 commits were required to get consistency!

So, clearly there is a problem. To solve it, I have started applying Ontology Design Patterns (ODPs) to define a consistent interpretation and usage of one or more ontologies for an application. I have found that ODPs are not just for ontology development and reuse!

The list below shows the information that I include in an ODP to inform people/teams that are using ontologies to define consistent semantics for their data:
  • Description of why the pattern is needed and what problems are being solved - This includes:
    • A short statement describing what the pattern is used for and/or what is being mapped
    • The considerations that influenced the pattern or required the pattern to be created - for example:
      • When defining whole-part relationships, how deeply should the mereology hierarchy go?
      • What is needed to identify "the same" individuals?
      • How to determine the most appropriate type(s) for an individual?
      • What meta-data is needed?
  • Design sources for the ODP (this is not needed if the source is a single ontology)
  • High level (block) diagram of the main concepts and relationships in the ODP
  • Overview of each of the concepts/classes and relationships shown in the block diagram, as well as important sub-classes and sub-properties
    • Include in the overview or in another section of the ODP any differences in approaches - for example, why a detailed class hierarchy might be used in one place but not in another
  • Class and property diagrams for the classes and properties covered in the overview (such as can be generated by OntoGraph)
  • Information on using the ODP
    • One or more examples illustrating the use of the ODP, along with instance/individual diagrams
    • A list of competency questions with SPARQL queries representing the questions
    • The results of applying the queries to the examples generated above (this step then also provides a set of test cases to gauge conformance)
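As a rough illustration of the last three items, the sketch below pairs each competency question with a query and an expected result, so that the ODP's examples double as conformance tests. It is a minimal stand-in and not a real SPARQL engine: the triples, the `ex:` names, and the `match` and `run_conformance` helpers are all hypothetical, and in practice the queries would be SPARQL run against the example individuals.

```python
# Hypothetical example data: (subject, predicate, object) triples that
# stand in for the instance/individual diagrams of an ODP.
EXAMPLE_TRIPLES = {
    ("ex:painting1", "rdf:type", "ex:Artwork"),
    ("ex:painting1", "ex:createdBy", "ex:artistA"),
    ("ex:sketch7", "rdf:type", "ex:Artwork"),
    ("ex:sketch7", "ex:createdBy", "ex:artistA"),
    ("ex:artistA", "rdf:type", "ex:Person"),
}


def match(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return {
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    }


# Each competency question is stored with a query function and the
# result expected when that query runs over the example data.
COMPETENCY_TESTS = [
    {
        "question": "Which artworks are described?",
        "query": lambda ts: {
            s for s, _, _ in match(ts, p="rdf:type", o="ex:Artwork")
        },
        "expected": {"ex:painting1", "ex:sketch7"},
    },
    {
        "question": "Who created ex:painting1?",
        "query": lambda ts: {
            o for _, _, o in match(ts, s="ex:painting1", p="ex:createdBy")
        },
        "expected": {"ex:artistA"},
    },
]


def run_conformance(triples, tests):
    """Run every query and return the questions whose results differ."""
    failures = []
    for test in tests:
        if test["query"](triples) != test["expected"]:
            failures.append(test["question"])
    return failures


failures = run_conformance(EXAMPLE_TRIPLES, COMPETENCY_TESTS)
print("All competency questions answered" if not failures else failures)
```

Removing a triple from the example data should make the corresponding question show up in the list of failures, which is the kind of conformance signal the test cases are meant to give.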
I am pretty excited about how ODPs are working for my customers. Please let me know if you have similar problems and how/if you solved them.

P.S. If you aren't sure what competency questions are, please see the blog post, Breaking Down the "Documents and Policies" Project.

Andrea

Sunday, October 19, 2014

Breaking Down the "Documents and Policies" Project - Competency Questions

Our previous post defined a project for which a set of ontologies is needed ... "What access and handling policies are in effect for a document?" So, let's just jump into it!

The first step is always to understand the full scope of work and yet to be able to focus your development activities. Define what is needed both initially (to establish your work and ontologies) and ultimately (at the end of the project). Determine how to develop the ontologies, in increments, to reach the "ultimate" solution. Each increment should improve or expand your design, taking care to never go too far in one step (one development cycle). This is really an agile approach and translates to developing, testing, iterating until things are correct, and then expanding. Assume that your initial solutions will need to be improved and reworked as your development activities progress. Don't be afraid to find and correct design errors. But ... Your development should always be driven by detailed use cases and (corresponding) competency questions.

Competency questions were discussed in an earlier post, "General, Reusable Metadata Ontology - V0.2". (They are the questions that your ontology should be able to answer.) Let's assume that you and your customer define the following top-level questions:
  • What documents are in my repositories?
  • What documents are protected or affected by policies?
  • What documents are not protected or affected by policies? (i.e., what are the holes?)
  • What policies are defined?
  • What are the types of those policies (e.g., access or handling/digital rights)?
  • What are the details of a specific policy?
  • Who was the author of a specific policy?
  • List all documents that are protected by multiple access control policies. And, list the policies by document.
  • List all documents that are affected by multiple handling/digital rights policies. And, list the policies by document.
These questions should lead you to ask other questions, trying to determine the boundaries of the complete problem. Remember that it is unlikely that the customers' needs will be addressed in a single set of development activities. (And, work will hopefully expand with your successes!) Often, a customer has deeper (or maybe different) questions that they have not yet begun to define. Asking questions and working with your customer can begin to tease this apart. Even if the customer does not want to go further at this time, it is valuable to understand where and how the ontologies may need to be expanded. Always take care to leave room to expand your ontologies to address new use cases and semantics.
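To make a few of the competency questions listed above concrete, here is a minimal Python sketch that answers them over a small in-memory data set. Everything in it is an assumption for illustration: the document and policy names, the `APPLIES_TO` mapping, and the function names are hypothetical, and the real answers would come from queries over the ontologies rather than from dictionaries.

```python
# Hypothetical data: which policies apply to which documents, and the
# type of each policy. None of these names come from a real system.
DOCUMENTS = {"doc1", "doc2", "doc3", "doc4"}
POLICY_TYPES = {
    "accessA": "access",
    "accessB": "access",
    "rightsA": "handling",
}
APPLIES_TO = {  # policy -> documents it protects or affects
    "accessA": {"doc1", "doc2"},
    "accessB": {"doc1"},
    "rightsA": {"doc2", "doc3"},
}


def documents_with_policies():
    """What documents are protected or affected by policies?"""
    return set().union(*APPLIES_TO.values())


def documents_without_policies():
    """What documents are not covered by any policy (the holes)?"""
    return DOCUMENTS - documents_with_policies()


def policies_by_document(policy_type):
    """Map each document to the policies of one type that apply to it."""
    result = {}
    for policy, docs in APPLIES_TO.items():
        if POLICY_TYPES[policy] != policy_type:
            continue
        for doc in docs:
            result.setdefault(doc, set()).add(policy)
    return result


def documents_with_multiple(policy_type):
    """List documents covered by more than one policy of the given type."""
    return {
        doc: policies
        for doc, policies in policies_by_document(policy_type).items()
        if len(policies) > 1
    }


print(sorted(documents_without_policies()))
print(documents_with_multiple("access"))
```

With this sample data, `doc4` is the only hole, and `doc1` is the only document protected by more than one access control policy.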

This brings us back to "General Systems Thinking". It is important to understand a system, its parts and its boundaries.

Here are some follow-on questions (and their answers) that the competency questions could generate:
  • Q: Given that you have document repositories, how are the documents identified and tagged?
    • A: A subset of the Dublin Core information is collected for each document: author, modified-by, title, creation date, date last modified, keywords, proprietary/non-proprietary flag, and description.
  • Q: How are the documents related to policies?
    • A: Policies apply to documents based on a combination of their metadata.
  • Q: Will we ever care about parts of documents, or do we only care about the documents as a whole?
    • A: We may ultimately want to apply policies to parts of documents, or subset a document based on its contents and provide access to its parts. But, this is a future enhancement.
  • Q: Do policies change over time (for example, becoming obsolete)?
    • A: Yes, we will have to worry about policy evolution and track that.
  • Q: What policy repositories do you have?
    • A: Policies are defined in code and in some specific content management systems. The goal is to collect the details related to all the documents and all the policies in order to guarantee consistency and remove/reduce conflicts.
  • Q: Given the last 2 competency questions, and your goal of removing/reducing conflicts, would you ultimately like the system to find inconsistencies and conflicts? How about making recommendations to correct these?
    • A: Yes! (We will need to dig into this further at a later time in order to define conflicts and remediation schemes.)
Well, we now know more about the ontologies that we will be creating. Initially, we are concerned with document identification/location/metadata and related access and digital rights policies. We can then move onto the provenance and evolution of documents and policies, and understanding conflicts and their remediation.
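One of the answers above says that policies apply to documents based on a combination of their metadata. A minimal Python sketch of that idea follows. The field names mirror the metadata listed in the answers, but the `Policy` conditions (required keywords and a proprietary-only flag), the class names, and the sample values are assumptions made up for illustration, not the design that the ontologies will end up with.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Document:
    """The subset of Dublin Core-style metadata collected per document."""
    title: str
    author: str
    modified_by: str
    created: date
    modified: date
    keywords: set = field(default_factory=set)
    proprietary: bool = False
    description: str = ""


@dataclass
class Policy:
    """Hypothetical policy that applies when all of its conditions hold."""
    name: str
    policy_type: str  # e.g. "access" or "handling"
    required_keywords: set = field(default_factory=set)
    proprietary_only: bool = False

    def applies_to(self, doc: Document) -> bool:
        # Condition 1: some policies only cover proprietary documents.
        if self.proprietary_only and not doc.proprietary:
            return False
        # Condition 2: every required keyword must be on the document.
        return self.required_keywords <= doc.keywords


def policies_in_effect(doc, policies):
    """What access and handling policies are in effect for a document?"""
    return [p.name for p in policies if p.applies_to(doc)]


# Made-up sample data.
doc = Document(
    title="Design Spec",
    author="alice",
    modified_by="bob",
    created=date(2014, 1, 5),
    modified=date(2014, 10, 1),
    keywords={"design", "internal"},
    proprietary=True,
)
policies = [
    Policy("staff-only", "access", proprietary_only=True),
    Policy("no-export", "handling", required_keywords={"export-controlled"}),
    Policy("design-review", "access", required_keywords={"design"}),
]

print(policies_in_effect(doc, policies))
```

Keeping the conditions as data on the policy, instead of burying them in code, is one way to leave room for the later work on policy evolution and conflict detection.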

So, the next step is to flesh out the details for documents and policies. We will begin to do that in the next post.

Andrea