The first step is always to understand the full scope of the work while still being able to focus your development activities. Define what is needed both initially (to establish your work and ontologies) and ultimately (at the end of the project). Then determine how to develop the ontologies in increments to reach the "ultimate" solution. Each increment should improve or expand your design, taking care never to go too far in one step (one development cycle). This is essentially an agile approach: develop, test, iterate until things are correct, and then expand. Assume that your initial solutions will need to be improved and reworked as development progresses, and don't be afraid to find and correct design errors. Throughout, however, your development should always be driven by detailed use cases and their corresponding competency questions.
Competency questions were discussed in an earlier post, "General, Reusable Metadata Ontology - V0.2". (They are the questions that your ontology should be able to answer.) Let's assume that you and your customer define the following top-level questions:
- What documents are in my repositories?
- What documents are protected or affected by policies?
- What documents are not protected or affected by policies? (I.e., what are the holes?)
- What policies are defined?
- What are the types of those policies (e.g., access or handling/digital rights)?
- What are the details of a specific policy?
- Who was the author of a specific policy?
- List all documents that are protected by multiple access control policies, and list the policies by document.
- List all documents that are affected by multiple handling/digital rights policies, and list the policies by document.
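To make a few of these questions concrete, here is a minimal sketch in plain Python rather than in an ontology language. It assumes a tiny in-memory data set; the names `Policy`, `POLICIES`, `APPLIES`, and the sample documents and policies are all hypothetical and exist only for illustration. A real solution would pose the same questions as queries against the ontology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    name: str
    kind: str    # assumed values: "access" or "handling"
    author: str


# Hypothetical sample policies, keyed by name.
POLICIES = {
    "P1": Policy("P1", "access", "alice"),
    "P2": Policy("P2", "access", "bob"),
    "P3": Policy("P3", "handling", "alice"),
}

# Hypothetical mapping: document id -> names of the policies that apply to it.
APPLIES = {
    "doc-a": ["P1", "P2"],
    "doc-b": ["P3"],
    "doc-c": [],
}


def unprotected_documents(applies):
    """Documents that no policy protects or affects (the 'holes')."""
    return sorted(doc for doc, names in applies.items() if not names)


def documents_with_multiple(applies, policies, kind):
    """Documents covered by more than one policy of the given kind,
    together with the names of those policies."""
    result = {}
    for doc, names in applies.items():
        matching = [n for n in names if policies[n].kind == kind]
        if len(matching) > 1:
            result[doc] = matching
    return result


print(unprotected_documents(APPLIES))
print(documents_with_multiple(APPLIES, POLICIES, "access"))
```

Each function corresponds to one competency question, which is a useful check in itself: if a question cannot be written as a query over the data you plan to collect, the design is missing something.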
This brings us back to "General Systems Thinking". It is important to understand a system, its parts and its boundaries.
Here are some follow-on questions (and their answers) that the competency questions could generate:
- Q: Given that you have document repositories, how are the documents identified and tagged?
- A: A subset of the Dublin Core information is collected for each document: author, modified-by, title, creation date, date last modified, keywords, proprietary/non-proprietary flag, and description.
- Q: How are the documents related to policies?
- A: Policies apply to documents based on a combination of their metadata.
- Q: Will we ever care about parts of documents, or do we only care about the documents as a whole?
- A: We may ultimately want to apply policies to parts of documents, or subset a document based on its contents and provide access to its parts. However, this is a future enhancement.
- Q: Do policies change over time (for example, becoming obsolete)?
- A: Yes, we will have to worry about policy evolution and track that.
- Q: What policy repositories do you have?
- A: Policies are defined in code and in some specific content management systems. The goal is to collect the details related to all the documents and all the policies in order to guarantee consistency and remove/reduce conflicts.
- Q: Given the last two competency questions, and your goal of removing/reducing conflicts, would you ultimately like the system to find inconsistencies and conflicts? How about making recommendations to correct these?
- A: Yes! (We will need to dig into this further at a later time in order to define conflicts and remediation schemes.)
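The first two answers above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names in `DocumentMetadata` are one possible rendering of the metadata listed above, and `PolicyRule`, `RULES`, and the two sample rules are hypothetical. Conflict detection is deliberately left out, since conflicts have not been defined yet.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Callable, List


@dataclass
class DocumentMetadata:
    """Metadata collected per document (field names are assumptions)."""
    title: str
    author: str
    modified_by: str
    created: date
    last_modified: date
    keywords: List[str] = field(default_factory=list)
    proprietary: bool = False
    description: str = ""


@dataclass
class PolicyRule:
    """A hypothetical policy: a name plus a test over document metadata."""
    name: str
    applies_to: Callable[[DocumentMetadata], bool]


# Hypothetical rules, each combining one or more metadata fields.
RULES = [
    PolicyRule("restrict-proprietary", lambda d: d.proprietary),
    PolicyRule(
        "finance-handling",
        lambda d: d.proprietary and "finance" in d.keywords,
    ),
]


def policies_for(doc: DocumentMetadata, rules: List[PolicyRule]) -> List[str]:
    """Return the names of the rules whose metadata test matches the document."""
    return [rule.name for rule in rules if rule.applies_to(doc)]


# Hypothetical sample document.
report = DocumentMetadata(
    title="Q3 forecast",
    author="alice",
    modified_by="bob",
    created=date(2014, 1, 6),
    last_modified=date(2014, 2, 3),
    keywords=["finance", "forecast"],
    proprietary=True,
)
print(policies_for(report, RULES))
```

Expressing each policy as a test over metadata keeps the relationship between documents and policies derived rather than stored, so a change to a document's metadata or to a policy is reflected the next time the question is asked.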
So, the next step is to flesh out the details for documents and policies. We will begin to do that in the next post.