Semantic Technology

with Dave McComb

Zen Mind Part 2: Multiple Inheritance v Multiple Classification

Preconceived Idea # 2: Multiple Inheritance

If you come from an Object Oriented background, in particular one that supports multiple inheritance, you might find an apparent similarity between multiple inheritance in OO and having a class be subsumed by two others. However, if you try this out, you’ll realize you’re not getting what you’re expecting. This is because the semantics are different. In OO there are really two things going on at the same time: subtyping and inheritance. The inheritance piece is giving you properties from both of your parents. If one parent had the “foo()” method and the other parent had the “bar()” method, the child now has both. The child has all of the attributes and all of the behaviors of both parents. The child is essentially the union of the behaviors of the two parents. Semantics is not dealing with behavior; it’s dealing with typing, membership and classification.

So, take a couple of koans and call me in the morning:

  • Subclassing from two classes makes you the intersection, not the union, of the two. If a class A is a subclass of class B and class C, all members of A must be members of both parents. This is the intersection of the two parents, not the union. It is really a subclass of the intersection, but we’ll do that in another post.

  • Multiply classify an instance – The power in semantics lies in the ability to classify an instance multiple ways. This gets at what most OO people want to do with MI, and it’s far more flexible.
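The two koans can be made concrete with plain sets. What follows is a minimal Python sketch, not an OWL reasoner; the class names (Employee, Pilot, a "flying employee") and the individuals are hypothetical, chosen only for illustration.

```python
# Classes modeled as sets of individuals (all names are hypothetical).
employee = {"ann", "bob", "cho"}
pilot = {"bob", "cho", "dee"}

# Koan 1: a class that is a subclass of both Employee and Pilot can only
# contain individuals found in both parents, so it sits inside the
# intersection, never the union.
intersection = employee & pilot
flying_employee = {"bob"}  # a subclass of the intersection
assert flying_employee <= intersection
assert not (employee | pilot) <= intersection

# Koan 2: multiple classification. One instance is simply a member of
# several classes at once; no combined class has to be declared first.
def classes_of(individual, classes):
    """Return the names of every class the individual belongs to."""
    return sorted(name for name, members in classes.items() if individual in members)

classes = {"Employee": employee, "Pilot": pilot}
print(classes_of("bob", classes))
print(classes_of("ann", classes))
```

Here "bob" is classified two ways without any new class being created, which is usually what an OO designer was reaching for with multiple inheritance.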

April 08, 2006 | Permalink | Comments (1)

"My Brain is Full"

I just returned from our second annual “Semantic Technology Conference.” I think the refrain that I heard over and over, and that captured the mood of the week, was the caption from the Gary Larson cartoon of the kid who wants to be excused because “My Brain is Full.”

It was heartening to me to hear even CTOs of technology companies who had been in this space tell me, as one did, “Normally when we attend conferences like this there isn’t anything new for me to learn, but this one was different. There were more sessions I wanted to attend than I could get to.”

I was very pleased. I learned a lot. Near as I could tell, most of the 600 attendees and 80+ presenters felt the same way. Many thanks for all the participation and interaction that really make an event like this.

March 17, 2006 | Permalink | Comments (0)

A Minimalist Upper Ontology

A title guaranteed to scare off just about everyone: if you're not familiar with work on upper ontologies, the title is just opaque. If you are familiar, you'll likely think that the combination of "minimalist" with "upper ontology" is an oxymoron.

So, now that I've gotten rid of all my audience, I can probably say just about anything.  And will.

Let's review our position here.  For two systems to communicate they must commit to a common ontology.  It doesn't matter how elegant or clever your ontology is, if no one else shares it, you don't participate in anything broader than your own ontology.

Given that, there are three main positions:

  • Wait until you want to integrate, and then build a bridge ontology. This works, but is numerically exhausting if you have a lot of other ontologies to link to.
  • Integrate on a topic by topic basis. Use a set of special purpose ontologies to link up. This is a reasonable strategy and works for a lot of things (geography, for instance).
  • Commit to an upper ontology early.  If you commit to a very broad upper ontology, you are conceptually linked to anyone else who does.
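A rough count shows why the first strategy becomes numerically exhausting. This is a back-of-the-envelope sketch under a stated assumption: every pair of ontologies needs its own bridge, while a shared upper ontology needs only one mapping per participant. The function names are made up for illustration.

```python
def pairwise_bridges(n: int) -> int:
    """Bridges needed if every pair of n ontologies is linked directly."""
    return n * (n - 1) // 2

def upper_ontology_links(n: int) -> int:
    """Mappings needed if each of n ontologies commits to one shared upper ontology."""
    return n

# Compare the two counts as the number of participating ontologies grows.
for n in (5, 20, 100):
    print(n, pairwise_bridges(n), upper_ontology_links(n))
```

In practice nobody bridges every pair, so the real number falls somewhere between the two columns, but the growth rate is the point.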

For some that third strategy is very appealing. And there are some options to choose from here, most notably Cyc and SUMO. But there is also a dark side. Any time you commit to an ontology, you agree to be bound by all the assertions made in that ontology. If nothing else, you need to review them, understand them, and determine whether committing to them will cause problems.

As a result, the most popular shared ontologies to date have been narrow scope upper ontologies, such as Dublin Core for documents, FOAF for contact lists and interests, and RSS for news feeds. What they all share is a small set of concepts, and relatively few constraints.

I have postulated, built and will present what I call a "minimalist upper ontology." That is, it is very broad in scope, comparable to the large upper ontologies (in this case I am trying to cover commercial information systems, so most of the corporate and government systems, but not games, compilers, embedded or scientific systems). But I have tried to mimic the size of the more popular ontologies: there are about 50 concepts in this ontology. I believe there are immediate benefits for projects adopting it just to remove ambiguity from their definitions. But longer term I think it sets a basis for much broader scale cooperation.

I think I'm on to something here. And it will only be of value if it is shared. So, I will be presenting it at the Semantic Technology Conference; if you can make it Wed afternoon, the session is called "Gist: Minimalist Upper Ontology." If you are not able to make it to the conference, it will be available shortly thereafter: I will have white papers, and the ontology itself will be available for free download. Available now at http://gist-ont.com/

I'm eager to get feedback on this project. 

February 24, 2006 | Permalink | Comments (9)

Autistic Systems

I’ve just come to the conclusion that the main reason we are frustrated with most of the application systems we have implemented in the last several decades is that they are autistic. By that I don’t mean that they are hard to communicate with (although many are); I mean something a little broader.

Lest I get a call from the autism defense league or whatever, let me tell you where I’m coming from. I'm not completely insensitive to the toll of autism, nor am I bleak about its outlook. We have very good friends who have an autistic son. He is my son’s age (14) and we have known them since both were about 5. Sammy, the autistic child, is a high functioning autistic in that he goes to school, communicates and plays with others (including my son), and is generally a very charming young man. When we first met him he was fairly hard to communicate with, and apparently the several years before we knew him were quite a struggle. These days he is well on his way to “mainstreaming” into society. But it was something his father told me last year, just reinforced by a book I’m reading, that led me to put these thoughts together and connect them to application systems.

Last year Sammy’s dad told me that he could teach Sammy to tie his shoes (which he did) but when he asked him to tie his boots, or tie a package, he couldn’t do it.  Boots and packages are “different.”

This weekend I was reading “The First Idea: How Symbols, Language, and Intelligence Evolved from our Primate Ancestors to Modern Humans.”  It is quite an interesting thesis: language isn’t hardwired into our brains, but arises from emotional responses to our environment.  It is the baby’s interactions with a caregiver well before distinguishable sounds are formed, that form the basis for concepts such as “safety,” “comfort” and “causation” and a whole host of other concepts, which we much later get around to attaching sounds and symbols to.  At one point in the book the authors mention some of their work with autistic children, and state “children with autism… have difficulties making inferences.”

And the coin dropped.

I’ve been railing a lot lately about corporate information systems (and I’m including in this set those that I built or implemented over the last several decades). In particular I’d been railing about the increase in complexity of these systems. They tell me that SAP now has 35,000 tables, and therefore hundreds of thousands of attributes. Most large enterprises have many systems of that level of complexity. Each entity, attribute and relation in these systems is distinct. That is the way procedural and relational technology works. Even Object Oriented adds only limited bits of generalization. The problem, as I had been saying, is that current technology is very good at differences, and not very good at similarity. The similarities and relationships between the hundreds of thousands of attributes in a complex system or enterprise have had to be negotiated by the only two things that up to now could deal with ambiguity: end users and programmers. Neither scales very well.

And now, a much more succinct way to say this: our systems are autistic.  They don’t make inferences.  When we learn something in one system or one area, it doesn’t carry over to other areas. 

We can deal with this now.  Semantic Technologies, and in particular those based on Description Logics, offer us a way to make inferences across broad domains of systems.  I’m convinced that our recurring problems will be addressed this way.  The classic problem of getting a “single view of the customer” is not a technology problem (although there are still plenty of technological hurdles to overcome).  It is essentially a semantic problem: what defines a “customer” and what about them is of interest that we would share in one place.  One approach is to come up with a really good definition and get all our systems (and systems outside our organization) to agree and implement that.  But this doesn’t work.  It is too hard in general, and inappropriate in many applications, to boot.  Many systems deal with users, or creditors, or agents or whatever, and it wouldn’t be appropriate to convert them (many of these systems wouldn’t work if you did convert them).

Better to come up with a definition of a customer that can be inferred from a set of properties (how about anyone who received final delivery of our product or contacted us about the technical characteristics of a shipment, as one of many possibilities). We can set up other definitions for closely related concepts, such as a debtor being "someone who owes us money." We can also set up criteria for establishing whether two parties are likely to be one and the same. Armed with this, we can make a broad set of inferences about who our best customers are, and what kind of activity we have had with them, despite the fact that the particulars on this are scattered over a large number of systems and called many different things.
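To make the idea of an inferred definition concrete, here is a minimal Python sketch. It is not Description Logic, only a membership test over recorded facts; the record layout, the fact names and the three example parties are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Party:
    """A party as gathered from several systems; field names are hypothetical."""
    name: str
    facts: set = field(default_factory=set)

# A defined class is a membership test over properties, not a table.
def is_customer(p: Party) -> bool:
    """Customer: received final delivery, or asked about a shipment's technical characteristics."""
    return bool(p.facts & {"received_final_delivery", "asked_about_shipment_specs"})

# A closely related concept, defined the same way.
def owes_us_money(p: Party) -> bool:
    return "has_open_receivable" in p.facts

parties = [
    Party("Acme", {"received_final_delivery", "has_open_receivable"}),
    Party("Birch", {"asked_about_shipment_specs"}),
    Party("Cedar", {"registered_as_user"}),
]

customers = [p.name for p in parties if is_customer(p)]
print(customers)
```

None of the source systems has to call anyone a "customer" for this to work; the classification falls out of the properties they already record.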

I'm quite optimistic that we can start to turn back the tide of complexity this way.

Inference: getting beyond autistic systems.

February 18, 2006 | Permalink | Comments (13)

Zen Mind, Part 1

We just conducted a weeklong training session on OWL/DL and Ontology Engineering. Several of the participants will be attending the Semantic Technology Conference, and felt they would get a lot more out of the conference because of the training. On drilling down a bit further, we found that the main benefit in this regard was breaking down their preconceived ideas of what semantics is. They were several days into the training before they were deprogrammed enough to completely follow what was going on.

In this blog, and perhaps the next couple, I want to summarize some of these preconceptions, and some ideas that will at least make you aware of them, and may help you get more out of the conference, or any other studying you may be doing in the area. We call this “Zen Mind” from the Zen masters’ belief that to really learn you have to get as many of your preconceived ideas out of your head long enough to establish some new patterns. I believe the Zen Masters called it “beginner’s mind” (perhaps they thought Zen Mind was too promotional).

In that spirit, let us offer up some preconceived ideas and the “koans” (statements meant to elicit additional thinking) that seem to best address them.

Preconceived idea #1: Properties belong to Classes.  People from a relational background make the partially correct analogy between relational attributes and semantic datatype properties and between foreign key relationships in relational and object properties in semantics.  However, this analogy will bite you.  Repeatedly, as our students demonstrated. 

They had a tough time remembering that the same property can be associated with many different classes. They were so used to each property being unique that when they did associate the same property with more than one class, they gave it different names (locatedIn became locatedInState, locatedInCountry, etc.).

The koans we decided were most useful in this case were two:

  • Classes are really “sets” (to help get past the idea that classes are some sort of template, as they are in relational and Object Oriented technologies. This seems to help overcome the temptation to believe that the property belongs to the class)

  • Properties own classes (when you define a restriction class in OWL/DL, what you have really done is use a pre-existing property to create a set of instances that have “someValues” from that property. It is the property that gives rise to the class, and therefore it is more useful to think of the properties owning the classes, at least compared to the classes owning the properties)
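Both koans can be sketched with sets and triples. The Python below is only an illustration that loosely mirrors an OWL someValuesFrom restriction; it is not an OWL/DL implementation, and the individuals, the helper function name and the triple data are all hypothetical.

```python
# Property assertions as (subject, property, value) triples (hypothetical data).
triples = {
    ("denver", "locatedIn", "colorado"),
    ("colorado", "locatedIn", "usa"),
    ("lyon", "locatedIn", "france"),
    ("alice", "worksFor", "acme"),
}

def some_values_from(prop, fillers, triples):
    """Set of subjects having at least one `prop` value inside `fillers`."""
    return {s for (s, p, o) in triples if p == prop and o in fillers}

# Classes are just sets of individuals.
states = {"colorado"}
countries = {"usa", "france"}

# One property, locatedIn, gives rise to two different classes.
# There is no need for locatedInState and locatedInCountry.
located_in_a_state = some_values_from("locatedIn", states, triples)
located_in_a_country = some_values_from("locatedIn", countries, triples)

print(sorted(located_in_a_state))
print(sorted(located_in_a_country))
```

The classes on the last lines exist only because the property does, which is the sense in which the property "owns" them.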

So, if you find yourself relapsing into relational thinking, just repeat the two koans until the symptoms disappear. 

February 12, 2006 | Permalink | Comments (5)

Strategy and Your Stronger Hand

I picked up this month’s (December 2005, which is still on the newsstands) Harvard Business Review at the airport yesterday. There were two excellent articles in there by two of my favorite business authors: Geoffrey Moore and Clayton Christensen (if I don’t get back to this thread, Christensen’s article was titled Marketing Malpractice, and is also applicable as we start looking at commercializing Semantic Technology).

Moore’s article had many fresh insights, chief among them that companies have a dominant business model. The model does not depend on the industry they are in (indeed virtually every industry has both), nor on their age or size. He likens this to our dominant “handed-ness” and, as the editor pointed out on the editorial page, “It’s easier to convert a shortstop into an outfielder than it is to change a southpaw into a righty.”

For some firms the dominant model is “volume operations” and for others it is “complex systems.” The first relies on many customers, brands, advertising, channels and compelling offers. The latter relies on targeted customers and the integration of third party products into total solutions. For each the grass often looks greener in the other model, but almost no business succeeds when it attempts to change models.

The rhythm of most high tech sectors is that the complex sale companies forge new territories and solve unique customer problems. The volume companies come in later and try to commoditize the solution. To survive, the complex sale companies need to do two things simultaneously: defend, for as long as possible, the position they have already won, and move up the solution chain and incorporate the newly commoditized components into an even more interesting solution. The one thing they need to avoid doing is trying to convert their own early wins into volume opportunities.

What does this have to do with semantics?  We are just beginning the commercial roll out of this technology.  We are going to have all the fits and starts of any new high tech sector.  We have an opportunity to be a bit more self aware.  Those of us in the complex sale sector need to be aware that volume operations from adjacent marketplaces will soon enter ours.  We need to be continually vigilant about incorporating rather than competing, and moving on up the solution chain.  Consumers of this technology have the opposite challenge: how to recognize which aspects of their problems require “complex” solutions and which aspects are ripe to be solved with “volume” solutions. 

January 23, 2006 | Permalink | Comments (2)

Supersumption

We've been doing training in Semantics and Description Logics lately, and have decided it's worth emphasizing the concept of supersumption. Of course, supersumption is nothing but the inverse of subsumption: that is, if we say A is subsumed by B, then we have also said B supersumes A.

But the reason we're finding it worth calling out the difference is a mindset that many of our students (and I) bring to the classroom. If you've been brought up with Object Oriented modeling, you naturally draw a parallel between subsumption and subclassing (or even better, subtyping). And the analogy is close enough that you can mostly find your way around in this aspect of semantics.

But us Object Oriented folks have a blind spot, and here's where it shows up. In OO if you imported a library, you might subtype some of the classes you found in your library for your use in a particular project (or you might use them as is). But it would never occur to you to supertype a class that you got from somewhere else: you would be virtually assured of breaking it.

But in Semantics this is routine. Let's say you're designing an ontology for contracting, and you want to have a concept of "Party" (as in party to a contract). You may be importing another ontology with the concept of Person that you'd like to use. For some people, it's an unnatural act to say that Person is subsumed by Party (because it seems like you are redefining the external source's definition of Person by making it a subtype of Party without their consent). You're not really redefining someone else's concept (except in your own local use). To get around this blind spot, we've found it useful to say that your new concept "Party" supersumes the imported "Person." And this seems to make it easier to bring this pattern to mind.
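Seen as sets, supertyping an imported class is harmless. The sketch below is a plain Python illustration, not OWL; the individuals and the local Organization class are hypothetical additions used only to give Party something beyond Person.

```python
# The imported ontology's class, modeled as a set of individuals. We only read it.
imported_person = frozenset({"ann", "bob"})

# Our own contracting ontology adds organizations (hypothetical data).
local_organization = frozenset({"acme_corp"})

# Local assertion: Party supersumes Person, so every Person is a Party.
# Party is built on top of the imported set; Person itself is unchanged.
party = imported_person | local_organization

assert imported_person <= party           # every Person is a Party
assert imported_person == {"ann", "bob"}  # the import was not redefined
print(sorted(party))
```

Nothing about Person changes for anyone else who uses the imported ontology; the new superset exists only in the local ontology that declares it.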

I did a quick Google search and found 400,000 references for subsumption, and only 32 for supersumption. Must be a point of view thing.

January 08, 2006 | Permalink | Comments (0)

The Treaty of Tordesillas

Lately we've been grappling with the issue of how to get a Semantic Inference Engine and a Business Rules Engine to play nice in an Enterprise Architecture.  Some long dormant neuron fired and the Treaty of Tordesillas was invoked as a possible exemplar.

For those of you whose recollection of the Papal role in the age of exploration has faded, allow me to refresh your memory. The Portuguese were the first out of the blocks in the Age of Discovery, sailing down the coast of Africa and, as a result, being granted by the Pope sovereignty over all the lands south of the Canary Islands (off the coast of Morocco), effectively giving them most of Africa, India, and Indonesia (the “East Indies”). Columbus, sailing under a Spanish flag, discovered islands in the new world (which he believed to be part of the East Indies). On his return, the King of Spain petitioned the Pope (also a Spaniard), who granted Spain everything 300 miles West of the Cape Verde Islands (south and west of the Canaries, but still in the Eastern Atlantic). The Portuguese howled in protest, and in 1494, in the Spanish town of Tordesillas, the two sides pounded out a compromise. The Treaty moved the line 800 miles further west. A few years later, figuring that the world was likely far larger than earlier estimates, they continued this imaginary line through the poles and to the “anti-meridian,” which effectively gave Spain an entire hemisphere but assured Portugal's claim over the East Indies. No one had any idea at the time, but the line bisected the still undiscovered South America, eventually giving Portugal domain over Brazil, and Spain the remainder. For the most part Spain and Portugal honored this Treaty for the rest of their explorations in the new world, and people in Brazil speak Portuguese today due to the placement of a line through a continent that the line drawers had no idea existed.

So what does this have to do with Semantics and Business Rules? Glad you asked. If you've looked at a Business Rules implementation, you will notice that the majority of the rules being written are either rules that define new terms or rules that create simple inferences between terms. At the same time, if you look at Semantic implementations, you'll see a great deal of effort spent trying to describe behavior, only to export it to scripting or other programming environments to be implemented.

Rather than embark on a series of unnecessary internecine battles over which tool to use for what, we recently postulated our own Treaty of Tordesillas: all term definition and inference of class membership would be done in a Semantic Inference environment, and all assertions about relationships, all calculations and all initiation of side effects would be done in the Rule Engine environment.
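As a rough illustration of where that line falls, here is a small Python sketch with one function standing in for each side of the demarcation. It uses no real inference engine or rule engine; the class names, the ten-order threshold, the discount rate and the outbox list are all hypothetical.

```python
# Semantic side: term definitions and inference of class membership only.
def infer_classes(facts: dict) -> set:
    classes = set()
    if facts.get("received_final_delivery"):
        classes.add("Customer")
    if "Customer" in classes and facts.get("orders_last_year", 0) >= 10:
        classes.add("FrequentCustomer")
    return classes

# Rule side: calculations, property assertions and side effects only.
# It consumes the inferred classes but never defines a term itself.
def apply_rules(facts: dict, classes: set, outbox: list) -> dict:
    asserted = dict(facts)
    if "FrequentCustomer" in classes:
        asserted["discount_rate"] = 0.05  # calculation and property assertion
        outbox.append(f"send loyalty letter to {facts['name']}")  # side effect
    return asserted

facts = {"name": "Acme", "received_final_delivery": True, "orders_last_year": 12}
outbox = []
classes = infer_classes(facts)
result = apply_rules(facts, classes, outbox)
print(sorted(classes), result["discount_rate"], outbox)
```

The term "FrequentCustomer" is defined once, on the semantic side, so other parts of the enterprise could reuse it without reading the rule base.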

As near as we can tell, we've just drawn an imaginary line in the middle of the ocean. However, it seems to be a reasonable place to start. I was a speaker at the recent Business Rules Forum, and had a chance to try this out with Business Rule vendors and practitioners. The general response I got, I took to be favorable. Most questioned the need to use a separate tool for inference, as the Business Rules could do that. But on reflection they could see where it solves two of the Business Rules community's problems: 1) rule implementations are getting large and unwieldy (and removing over half the rules could certainly help here) and 2) the terms and distinctions created in the Rule environment are typically not available as resources to the rest of the enterprise.

We haven't appealed to higher authority yet about this, but this is our starting point now for Enterprise Architectures: terms and inference on the Semantic side of the demarcation, calculation, property assertions and side effects on the Business Rule side.  Who knows, maybe future generations will speak other languages as a result.

I'd love to hear any feedback from anyone who has been to this new world.

December 24, 2005 | Permalink | Comments (8)

Welcome!

Welcome to the Semantic Technology Blog with Dave McComb.

December 20, 2005 | Permalink | Comments (2)
