Wednesday, September 23, 2009

What Modelers Can Learn from WordNet

WordNet defines several significant types of relationships, each with very specific semantics. These semantics apply in modeling as well. However, some modeling approaches focus on (or only allow) a subset of them - for example, inheritance and property definition, with much else excluded.

So, what are the types of relationships in WordNet? They are listed below along with their linguistic terminology (holonym, meronym, troponym, etc.).
  • Inheritance information - the definition of hypernyms/superclasses and hyponyms/subclasses. Most modeling approaches handle this very well.
  • Coordinate terms - Related information, usually the sibling entities under a single superclass. Again, this is well covered.
  • Aggregation - the definition of holonyms/aggregates and meronyms/aggregated items. However, WordNet further refines aggregation as:

    • Whole/part information - For example, fingers are part of a hand, but can be treated as separate entities. Their lifetimes are influenced by the lifetime of the "whole". Obviously, if a hand is cut off, the fingers are cut off with the hand.
    • Substance/composition of an entity - For example, cement and sand are substances in concrete, but once mixed, they are not separate entities.
    • Membership information - For example, certain employees are members of a security group, but the entities are separate, with separate lifetimes. So, removing the security group does not remove the employees, and removing the employees from the group does not delete the group.

  • Attribute information - HAS-A data, well addressed by all modeling infrastructures.
  • Synonym information - Alias information and equivalent terms. Lack of this information (or meta-information) is usually what causes arguments over the single "name" of a modeled entity.
  • Antonym/opposite information - There is usually no need to reflect this in a model. Where it is needed, my preference is OWL's disjointWith, which states that two classes have no individuals in common.
  • Refinement information - Defining troponyms for verbs (relationships). This involves refining a verb by the manner in which it is performed. For example, to mumble is "to talk indistinctly by lowering the voice or partially closing the mouth". This could be modeled as a typing hierarchy involving associations. But, typically, such hierarchies are defined by restricting the referenced elements rather than by refining the semantics of the association. Often, we make too much of the restriction scenarios and too little of the refinement of semantics.
  • Entailment information (in WordNet, entailment of verbs) - Entailment is the implication of one fact from another. For verbs, it is based on temporal inclusion. For example, the act of snoring implies sleeping. OCL is one example of how this is supported in today's modeling infrastructure - across nouns and verbs/associations.
  • Cause data for transitive and intransitive verbs - This is best described by example ... the wind storm breaking the window is the CAUSE of the window being broken (a resulting state). Having this level of information as data or meta-data in a model could assist immeasurably with root cause analysis.
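
To make the three kinds of aggregation above a little more concrete, here is a minimal Python sketch of how their different lifetime rules might show up in a model. All of the names (Hand, Finger, mix_concrete, Employee, SecurityGroup) are hypothetical and invented for illustration; nothing here comes from WordNet or from any particular modeling tool.

```python
from dataclasses import dataclass, field


@dataclass
class Finger:
    name: str
    attached: bool = True


class Hand:
    """Whole/part: fingers are separate entities, but their lifetime follows the hand."""

    def __init__(self, finger_names):
        self.fingers = [Finger(n) for n in finger_names]

    def cut_off(self):
        # Removing the whole takes its parts with it.
        for finger in self.fingers:
            finger.attached = False


def mix_concrete(cement_kg, sand_kg):
    """Substance: once mixed, the ingredients exist only as proportions of the result."""
    total = cement_kg + sand_kg
    return {"cement": cement_kg / total, "sand": sand_kg / total}


@dataclass(frozen=True)
class Employee:
    name: str


@dataclass
class SecurityGroup:
    """Membership: the group only refers to employees, who exist on their own."""
    name: str
    members: set = field(default_factory=set)


# Whole/part: cutting off the hand affects every finger.
hand = Hand(["thumb", "index"])
hand.cut_off()

# Substance: there are no separate cement or sand objects left, only a composition.
concrete = mix_concrete(cement_kg=1.0, sand_kg=3.0)

# Membership: deleting the group does not delete the employee.
alice = Employee("Alice")
employees = [alice]
groups = {"admins": SecurityGroup("admins", {alice})}
del groups["admins"]
```

The point of the sketch is only that each kind of aggregation implies a different rule for what happens to the parts, which is exactly the distinction a single generic "aggregation" relationship tends to lose.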
I don't know about you, but I am very impressed with the knowledge in WordNet.
