Semantic Technology

with Dave McComb

Time Zones

We had a workshop last week on gist (our minimalist upper ontology; more on that in another post, maybe next week). Anyway, as part of the aftermath, I decided to get a bit more rigorous about some of the lowest level primitives.

One of the basic ideas about gist is that you may not be able to express every distinction you might want to make, but at least what you do exchange through gist will be understood and unambiguous. In the previous version of gist I had some low level concepts, like distance, which was a subtype of magnitude. And there was a class distanceUnit which was a subclass of unitOfMeasure. And unit of measure has a property that points to a conversion factor (i.e., how to convert from one unit of measure to the base unit of that "dimension").

But what occurred to me just after the workshop is that two applications or two organizations communicating through gist could still create problems by just picking a different base (i.e., if one said their base for distance was the meter and another said the foot, they have a problem). This was pretty easily solved by going to NIST and getting the best thinking on what these dimensions should be and what the base unit of each dimension should be. Looking at it, I don't think there ought to be much problem with people adopting these.
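To make the base unit point concrete, here is a minimal sketch in Python rather than in gist itself. The names `UnitOfMeasure`, `to_base` and `convert` are illustrative assumptions, not gist terms; the only facts relied on are that the meter is the base unit of length and that the international foot is exactly 0.3048 m.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UnitOfMeasure:
    """Hypothetical stand-in for a unit of measure with a conversion factor."""
    name: str
    dimension: str
    to_base: float  # multiply a value in this unit by this to get the base unit


# Both parties agree up front that the base unit of distance is the meter.
METER = UnitOfMeasure("meter", "distance", 1.0)
FOOT = UnitOfMeasure("foot", "distance", 0.3048)  # international foot


def convert(value: float, source: UnitOfMeasure, target: UnitOfMeasure) -> float:
    """Convert by going through the shared base unit of the dimension."""
    if source.dimension != target.dimension:
        raise ValueError(f"cannot convert {source.dimension} to {target.dimension}")
    return value * source.to_base / target.to_base


print(convert(10, FOOT, METER))
```

If one party had instead treated the foot as the base (a factor of 1.0 for the foot), the same numbers would silently mean something different, which is exactly the problem a single agreed base unit per dimension avoids.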

Emboldened, I thought I would do the same for time. For starters, universal time seems to be the way to go. However, many applications record time in local time, so we need some facility to recognize that and provide an offset.
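As a rough sketch of what "local time plus an offset" could look like, the following uses only Python's standard `datetime` module. The function name `to_universal` and the sample timestamp are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone


def to_universal(local_naive: datetime, offset_hours: float) -> datetime:
    """Attach a fixed UTC offset to a recorded local time, then convert to UTC."""
    local = local_naive.replace(tzinfo=timezone(timedelta(hours=offset_hours)))
    return local.astimezone(timezone.utc)


# A time recorded locally at an offset of six hours behind universal time.
recorded = datetime(2008, 10, 3, 9, 30)
print(to_universal(recorded, -6).isoformat())  # 2008-10-03T15:30:00+00:00
```

A fixed offset says nothing about daylight saving rules or which region the time came from; it only lets two parties agree on the instant.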

Here's where the problem came in, and maybe you, dear readers, can help. After about an hour of searching the web, the best I could find for a standard in this area is something called the tz database. And while you can look up various cities, I didn't see anything definitive on what the geographical regions are that make up each of the time zones. To make things worse, the abbreviations for time zones are not unique: there is an EST in North America and one in Australia, for instance.
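The abbreviation problem can be illustrated with a small lookup table. This is a hypothetical sketch: the table, `offsets_for` and `resolve` are invented for the example, and only the two EST offsets (five hours behind universal time in North America, ten hours ahead in Australia) are taken as fact.

```python
# Hypothetical table: one abbreviation can map to several offsets (in hours).
ABBREVIATIONS = {
    "EST": [
        ("North America, Eastern Standard Time", -5.0),
        ("Australia, Eastern Standard Time", 10.0),
    ],
    "UTC": [("Coordinated Universal Time", 0.0)],
}


def offsets_for(abbrev: str) -> list:
    """Return every offset an abbreviation might stand for."""
    return [offset for _, offset in ABBREVIATIONS.get(abbrev.upper(), [])]


def resolve(abbrev: str) -> float:
    """Return the offset only when the abbreviation is unambiguous."""
    candidates = offsets_for(abbrev)
    if len(candidates) != 1:
        raise ValueError(f"{abbrev!r} is unknown or ambiguous: {candidates}")
    return candidates[0]


print(offsets_for("EST"))
```

Under this sketch an exchange format would carry the numeric offset (or a region identifier) rather than the abbreviation.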

So if anyone has a thought in this area, I'm all ears.

October 03, 2008 | Permalink | Comments (19) | TrackBack (0)

Categories and Classes

We’ve been working with two clients lately, both of whom are using an ontology as a basis for their SOA messages as well as the design of their future systems. As we’ve been building an ontology for this purpose, we became aware of a distinction that we think is quite important. We wanted to formalize it and share it here.

In an ontology there is no real distinction that I know of between a class and a category. That is: classes are used for categorizing, and you categorize things into classes. If you wanted to make a distinction it might be that category is used more in the verb form as something you do, and the class is the noun form.

Categories and Classes in traditional apps

But back in the world of traditional applications there is a quite significant difference (although again I don’t believe this difference has ever been elaborated). In a traditional (relational or object oriented) application, if you just wanted to categorize something (say by gender: male and female) you would create a gender attribute and, depending on how much control you wanted to put on its content, you would either create an enum, a lookup table or just allow anything. On the other hand, if you wanted behavioral or structural differences between the categories (let’s say you wanted to distinguish subcontractors from employees) you would set up separate classes or tables for them, potentially with different attributes and relationships.
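A small Python sketch of the two choices in a traditional object oriented application follows. The attribute names (`salary`, `hourly_rate`, `contract_end`) are invented for illustration; the point is only that the category is a value on one class, while the class distinction adds structure.

```python
from dataclasses import dataclass
from enum import Enum


class Gender(Enum):
    """A category: a controlled list of values, no structure of its own."""
    MALE = "male"
    FEMALE = "female"


@dataclass
class Person:
    name: str
    gender: Gender  # categorizing a person costs one attribute


# A class distinction: each subclass carries different attributes,
# which is what makes it more expensive to add.
@dataclass
class Employee(Person):
    salary: float  # hypothetical attribute


@dataclass
class SubContractor(Person):
    hourly_rate: float  # hypothetical attribute
    contract_end: str   # hypothetical attribute


staff = [
    Employee("Ann", Gender.FEMALE, 50000.0),
    SubContractor("Bob", Gender.MALE, 75.0, "2008-12-31"),
]
for person in staff:
    print(type(person).__name__, person.gender.value)
```

Adding a new gender value touches only the enum, while adding a third kind of worker means a new class, new attributes and new code paths.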

We’ve been studying lately what drives the cost of traditional systems, and getting this category/class distinction right is one of the key drivers. Here’s why: in a traditional system, every time you add a new class you have increased the cost and complexity of the system. If you reverse engineer the function point methodology, you’ll see that the introduction of a new “entity” (class) is the single biggest cost driver for an estimate. So every distinction that might have been a class, that gets converted to a category, provides a big economic payoff.

It’s possible to overdo this though. If you make something a category that should have been a class you end up pushing behavior into the application code, which generally is even less tractable than the schema. So we were interested in coming up with some guidelines for when to make a distinction a category and when to make it a class.

Category and class in gist

As it turns out, we had foreshadowed this distinction, although not for this reason, in gist, our upper ontology. Gist has a class called “category” whose intent is to carry categorical distinctions (from one lower level ontology to another) without necessarily carrying their definitions.

For instance, when we worked with a State Department of Transportation, we had a class in their enterprise ontology called “Roadside Feature.” A Roadside Feature has properties such as location, and when and by what process it was recorded. Several of their applications had specific roadside features, for instance “fire hydrants.” In the application, fire hydrant is a class, and therefore is one in the application ontology. But in the enterprise ontology “fire hydrant” is an instance of the category class. Instances of fire hydrant are members of the roadside feature class at the enterprise ontology level, and associated with the category “fire hydrant” via a property “categorizedBy.” A fire hydrant can therefore be created in an application and communicated to another application that doesn’t know the definition of fire hydrant, with almost no loss of information. The only thing that is lost on the receiving end is the definition of fire hydrant, not any of the properties that had been acquired by this fire hydrant.
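The hand-off described above might be sketched like this in Python. It is an illustration under assumptions, not the gist model itself: `Category`, `RoadsideFeature`, `categorized_by`, `to_enterprise` and the `flow_rate` property are all names invented for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Category:
    """Enterprise level: a category carries a label, not a definition."""
    label: str


@dataclass
class RoadsideFeature:
    """Enterprise level class shared by all applications."""
    location: str
    recorded_on: str
    recorded_by_process: str
    categorized_by: Category
    other_properties: dict = field(default_factory=dict)


FIRE_HYDRANT = Category("fire hydrant")


@dataclass
class FireHydrant:
    """Application level: here fire hydrant is a full class."""
    location: str
    recorded_on: str
    recorded_by_process: str
    flow_rate: float  # hypothetical application-specific property

    def to_enterprise(self) -> RoadsideFeature:
        """Express this hydrant in enterprise terms for another application."""
        return RoadsideFeature(
            self.location,
            self.recorded_on,
            self.recorded_by_process,
            categorized_by=FIRE_HYDRANT,
            other_properties={"flow_rate": self.flow_rate},
        )


message = FireHydrant("Main St & 3rd", "2008-04-01", "field survey", 1500.0).to_enterprise()
print(message.categorized_by.label, message.other_properties)
```

The receiving application sees a roadside feature categorized as "fire hydrant" with all of its recorded properties, even though it has no class definition for fire hydrants.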

Category and Class a formal distinction

So we came to this: a category is an intensional set with criteria for defining membership. A class is an extensional set where membership is explicitly asserted and specific properties can be defined as necessary.



In an ontology these two definitions can, and almost always do, overlap. But in a traditional system they don’t. In a traditional system the overlap is “the excluded middle.”

This is helpful for us if we’re using our ontology to generate artifacts for traditional systems, but on closer inspection we’re finding interesting uses even in ontology driven systems. The area in the class rectangle outside the category box is essentially the primitive classes, those that cannot be defined in terms of other classes and properties. The intersecting region contains classes that are formally defined, and for which we could therefore infer membership. And the category outside class is something where we accept that a distinction has been made; we may know the sufficient properties, but we don’t necessarily know any other necessary properties. Nor do we need to know how the individual got categorized.

In an ontology, we focus most of our effort on the intersection, as these are the classes that have formal definitions. But if we think deeply about it, what we are doing when we define a class and give it a formal definition is: we name a class and say it is equivalent to a restriction. The two things we traditionally haven’t focused on are classes without definitions (primitive classes) and categories without classes (many instance based taxonomies fit this pattern).

Our initial work suggests that this is a key pattern for getting large constellations of ontologies to work together. Feedback, comments, brickbats all welcome. 

April 18, 2008 | Permalink | Comments (1) | TrackBack (0)

Semantisize

I was alerted to this site from a comment.  It's pretty cool.  You can while away a lot of time on this site which is rounding up lots of podcasts, videos etc etc all related to Semantic Technology.

semantisize.com

I got a kick out of a video of Eric Schmidt taking a question from the floor on "What is Web 3.0?" Schmidt's answer: "I think you [the questioner] just made it up."

March 22, 2008 | Permalink | Comments (0)

Web 3.0

I’m weighing in in favor of Web 3.0 as an alias for the Semantic Web. I know there are a lot of people who will roll their eyes and initiate some anti hype exorcism, but let’s have a sober look at the pluses and minuses here.

Web 3.0 is not without its problems. The first problem with Web 3.0 is that everyone is defining it to their own ends. As Montoya Herald summed it up at http://www.christianmontoya.com/2007/10/08/web-30-i-about-money/ , Web 3.0 is essentially whatever each of the companies that use the term is working on next. The second problem is that it does pander to the hypemeisters. But the very people who decry hype the loudest are often those who benefit from it the most (who can argue that the hype of the Web and Web 2.0 didn’t advance the careers and opportunities of the very people who now think Web 3.0 is hype?).

A lot of people seem to be comfortable with Web 2.0 now, despite the fact that it has no real unifying principle. Web 2.0 is blogs and wikis and Facebook and MySpace (user generated content) and AJAX and Rich Internet Applications for a richer user experience in a browser, but really there isn’t anything holding it together or giving it a defined shape.

Maybe we don’t need to call the Semantic Web “Web 3.0.” But if we don’t, some other marginal improvement in an existing technology will claim the moniker. In other words, there will be a Web 3.0 and we will find ourselves explaining to people: “well yes, but that is just a part of the vision...”

Isn’t the term “Semantic Web” good enough? It’s good for the population that is already “in the tent,” but for many others it suffers from having been the next big thing for too long. Many people have discounted what they believe the Semantic Web to be (often by making up things that it isn’t and then objecting to that straw man). Web services suffered a similar fate for a long time, as thought leaders confused them with services delivered over the web (Software as a Service, for instance), with which they have some things in common, but the two aren’t the same. For some, calling the Semantic Web “Web 3.0” gives an opportunity to take another look.

So, I’m coming down in favor of “Web 3.0 = Semantic Web.” What do you think?

November 20, 2007 | Permalink | Comments (3)

A $3.5 Billion Semantic Distinction

Steven Pinker is well on his way to a best selling book on, of all things, semantics. I haven’t finished it yet, but am finding it fascinating. What is surprising me is how popular it is. Must be a lot more armchair ontologists out there than I thought.

One of the things that sent me to the blog, though, was his quantification of a semantic distinction. It turns out that the insurance policy on the World Trade Center has a maximum payout of $3.5 billion per event. The $3.5 billion semantic question is: did the two plane crashes constitute one event or two?

Slightly longer review

September 20, 2007 | Permalink | Comments (0)

Blogging the Semantic Technology Conference

I noticed a big difference in the coverage in the blogosphere of this year’s conference versus last year’s. That leads of course to the question: is it the conference or growth in blogging? I don’t have any stats, but my sense was that blogging was very much in full swing last year, and its rapid growth may be behind it.

That said I thought I’d sum up what I did see. 

Meaningful Data http://meaningfuldata.wordpress.com/2007/05/29/re-naming-the-semantic-web/

Bruce MacVarish http://www.brucemacvarish.com/2007/06/index.html

Word Press http://wordpress.com/tag/semantic-technology-conference/

Library sputnik http://austlit.edublogs.org/2007/06/12/a-little-news/

Read/Write Web http://www.readwriteweb.com/archives/2007_semantic_technology_conference.php

Arch2Arch http://dev2dev.bea.com/blog/rmanning/archive/2007/05/semantics_and_e.html

Dean Allemang http://dallemang.typepad.com/my_weblog/2007/05/blind_ambition.html

Bill Trippe http://www.billtrippe.com/archives/2007/05/impressions_of.html

WebCentric http://webcentrics.blogspot.com/2007/05/2007-semantic-technology-conference.html

Panaton http://dagoneye.tumblr.com/post/2547584

Dr. Data Dictionary http://datadictionary.blogspot.com/2007/05/impressions-of-sem-tech-07.html

Copia http://copia.ogbuji.net/blog/2007-05-19/musings-of-a-semweb-architect

Oracle http://blogs.oracle.com/otn/2007/05/21#a503

Axonomics http://www.secondintegral.com/axonomics/?cat=4

Magazine coverage as well

Red Herring http://www.redherring.com/Article.aspx?a=22375&hed=Semantic+Web+For+the+Masses

InfoWorld http://www.infoworld.com/article/07/05/22/semantic_tech_conf_garlik_1.html?source=NLC-SOA&cgd=2007-05-31

Not everyone was happy

Business Intelligence Network http://www.b-eye-network.com/view/4605

This is probably just the tip of the iceberg, but last year it seemed there was only a mention or two. Something’s happening.

August 08, 2007 | Permalink | Comments (0)

What Will it Take to Build the Semantic Technology Industry?

I get asked this question a lot. And I’d like to get your help in answering it please.

As co-chairman of the Semantic Technology Conference, I see lots of customer organizations experimenting and adopting semantic technologies – especially ontology-driven development projects and semantic search tools - and seemingly as many start-ups and new products emerging to address their requirements. It’s an exciting time to be in this space and I’m glad to have a part to play.

But back to the question of “what will it take?” I don’t think anyone has all the answers, though it seems there’s a growing consensus about how semantics will eventually take hold:

1. A Little Semantics Goes a Long Way
I think it was Jim Hendler who first used the expression, and I find myself in stark agreement. Much of the criticism of the semantic web vision focuses on the folly of trying to boil the ocean, yet many of the successful early adopters are getting nice results by taking small incremental steps. There’s a good exchange at Dave Beckett’s blog on this point.

2. Realistic Expectations
I guess this relates to my first point, but I remain concerned about the hype and expectations that are being set around the semantic web, and now the term Web 3.0. I, as much as anyone, would love to see the semantics field explode with growth, but this market is going to be driven by customers, not vendors, and the corporate clients I see are taking a cautious approach. I think they’ll catch on eventually, but let’s not try to push them too far, too fast. 

3. We Don’t Need a Killer App
Personally I think we need to look at semantic capabilities as an increasing component of the web and computing infrastructure, as opposed to trying to identify a killer app that’s going to kickstart a buying frenzy. If a killer app emerges then that’s great, but don’t hold your breath. There’s plenty of value to be gained in the meantime. More than anything, we need to demonstrate speedy, cheap ways to get started with semantics. This will be far more useful in the long run.

4. We Need to Get Business Mindshare
It’s so obvious that I’m almost embarrassed to say it, but the main point is that we need to improve how we’re currently demonstrating the business value of semantic technology. I see a few key ways we can improve, starting with a greater willingness to talk about the projects already taking place. Secondly, I think we can leverage existing technology trends – especially SOA and mashups – to show how semantic technology can add value to these efforts. Third, and I might risk offending some people with this, but in the short term we should be emphasizing cost savings and reduced time to deployment over and above the extra intelligence and functionality that semantics can provide. Especially for corporate customers. Semantic SOA can save hugely over conventional approaches in data integration and interface projects, and this is where most businesses are really feeling the pain right now. 

OK, so this is a short and probably incomplete list of ideas. Feel free to chime in here or at the Semantic Technology Conference.

This year’s SemTech conference in particular will have numerous discussions around the theme of how to grow the semantic technology industry, including Mills Davis’s Semantic Wave 2007 tutorial, and the Keynote panel on Building the Semantic Technology Industry: A Conversation with Entrepreneurs and Investors.

I hope to see you there and to get your input to the conversation.

PS: Next Tuesday (April 24) is the deadline for early registration discounts on SemTech, as well as group rates at the conference hotel in San Jose.

April 20, 2007 | Permalink | Comments (14)

Shirky, Syllogism and the Semantic Web

A friend recently sent me the link to Clay Shirky’s piece on the Semantic Web with an “I assume you’ve seen this, what do you think?”

I had seen it, but I hadn’t looked at it for years. So I went back for another look.

As usual, Shirky’s writing is intelligent, insightful and even funny. Recommended reading. I had hoped the ensuing years would prove “us” (Semantic Technologists) right, and that the argument would look amusing in retrospect.

Alas we still have a long way to go to staunch the critics. More on that in a future blog.

For today’s blog I have to point out the real irony of the article that I managed to miss the first time I read it.

At the risk of oversimplifying his article to the same degree he oversimplified the Semantic Web, the essence of the article went like this:

· The Semantic Web relies on syllogisms “The semantic web is a machine for creating syllogisms”

· Nobody uses syllogisms “it will improve all the areas of your life where you currently use syllogisms. Which is to say, almost nowhere”

· Therefore nobody will use the Semantic Web “it requires too much coordination and too much energy to effect in the real world”

The first two quotes are from the opening, the last from the closing.

The irony being, of course, that this entire article is a syllogism. To make one of the major premises of an argument that something will fail because nobody uses that style of argument reminds me of the admonition Yogi Berra gave to some teammates who had suggested a restaurant for the evening’s dinner: “Nah, nobody goes there anymore. It’s too crowded.”

The article points out some areas we need to pay more attention to, including controlling the hype machine. Reading between the lines, it appears that one of his major points is: the web is complex and only humans can really understand the nuances that our complex utterances mean.

But traffic is complex, and we know that traffic lights will never be as good as police in managing an intersection, but we’ve decided that an automated solution that gets us consistently pretty good results is good enough.

Back to the article, he relies on Lewis Carroll’s syllogisms as a critique of the medium, and by extension, the Semantic Web. The knock out punch was meant to be a five line syllogism about soap-bubble poems. But even here there were two implications: one that humans could follow this logic, and two that formalized ontologies could not. I of course rose to the bait and tried to formalize this syllogism.

I was not successful. Not because of the poverty of expression in the Semantic Web, nor even my own understanding, but because attempting to get formal about this doggerel shone a light on the fact that it doesn’t make any sense at all. Indeed, if he makes a point at all, it is that humans can often get fooled by things that sound like they make sense, but actually don’t. Seems to me, defending that level of confusion and ambiguity isn’t an argument against the Semantic Web.

March 18, 2007 | Permalink | Comments (3) | TrackBack (0)

Necessary and Sufficient

We just completed another training class, and like they say, “no one learns more than the instructor.”  In this case the blindingly obvious, and yet elusive pattern that revealed itself was the separation of the sufficient from the necessary. 

Until last week, while we had an intellectual understanding of the distinction between “necessary” and “necessary and sufficient” (and a very tenuous grip on sufficient but not necessarily necessary), we weren’t using the distinction consistently in our designs. In the course of discussion, prompted by some questions in the class and elaborated in the bar (thank god for cocktail napkins), we were able to tease out the patterns of “sufficient” (technically a superclass of a restriction) from necessary (a subclass of a restriction), and to line them up with some design patterns. “Sufficient” is essentially the “rule in” pattern. For instance, having a child who is a human is sufficient to make you a human. But of course it is not necessary. Having a biologicalMother who is an animal is necessary to be a Person, but not sufficient.
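The two patterns can be sketched in plain Python, as a loose analogy rather than OWL semantics. All names here (`Individual`, `rule_in_human`, `violates_necessary`) are invented for the example, and the necessary check is a closed-world simplification: a reasoner working from a necessary condition would instead infer that such a mother exists.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Individual:
    name: str
    kinds: Set[str] = field(default_factory=set)
    children: List["Individual"] = field(default_factory=list)
    biological_mother: Optional["Individual"] = None


def rule_in_human(x: Individual) -> bool:
    """Sufficient ("rule in"): a child who is a Human makes x a Human."""
    if any("Human" in child.kinds for child in x.children):
        x.kinds.add("Human")
    return "Human" in x.kinds


def violates_necessary(x: Individual) -> bool:
    """Necessary: every Person has a biological mother who is an Animal.

    Only constrains things already known to be a Person; it never rules
    anything in.
    """
    if "Person" not in x.kinds:
        return False
    mother = x.biological_mother
    return mother is None or "Animal" not in mother.kinds


kid = Individual("kid", kinds={"Human"})
parent = Individual("parent", children=[kid])
print(rule_in_human(parent))  # True
```

Having an animal mother does not make something a Person under this sketch, which is the sense in which the condition is necessary but not sufficient.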

I’m going back to gist and factoring out the necessary from the sufficient. 

Meanwhile, a reminder: it’s last call for papers for next year’s Semantic Technology Conference. We’ve still got a few slots open.

December 15, 2006 | Permalink | Comments (15) | TrackBack (0)

SemBoK

As part of a Community Of Practice building exercise we're doing with Project10X, I have volunteered to put together an outline of a "Semantic Body of Knowledge," or SemBoK for short. What we're trying to do is put a vast array of competencies into an organized structure. I've got a reasonable start. It will become public in the near future, but in the meantime I'm looking for a few people who would give it a rigorous critique.

I’ve identified 60 competencies, and have a paragraph on about half of them so far. Hopefully by the time you read this I will have a first draft of all 60. I expect it to be 10-14 pages long at completion. It’s organized a bit like a college curriculum (although it’s been a long time since I’ve been in college).

I’m looking for people to review it for completeness and relevance (i.e., are there topics under- or over-represented) and to edit the descriptions and fill in the suggested readings for each topic. If you’re interested, and would really commit to preparing a critique, email me and I’ll get you a draft: [email protected]

August 07, 2006 | Permalink | Comments (0) | TrackBack (0)
