
FiguringOutRdf


  1. RDF made me feel stupid -- Maybe you won't have to
  2. Why have I been so interested in learning about RDF?
    1. Specific Problems to Solve with RDF
      1. semantic interoperability among various XML specifications
      2. representing disparate data formats and materials in the Scholar's Box
      3. extensibility of XML specifications
    2. Besides solving specific problems, RDF is getting some traction in a number of places. Hence it might be time to see what all the fuss is about.
    3. A quick note on my own background
  3. My current understanding of RDF
    1. RDF is promising technology in spite of all the confusing hype around it.
    2. RDF is not a monolithic topic. RDF can be used independently of the Semantic Web. RDF is not inherently tied to XML.
    3. The RDF triple concept is a simple, elegant, and seemingly powerful one at its heart.
    4. RDF/XML is obscure to the uninitiated and makes it easy to confuse the relationship between RDF and XML.
    5. RDF Tools help a lot to make RDF understandable -- and usable.
    6. RSS 1.0 is a good place to start with RDF.
    7. Non-hype filled assessments of RDF and the somewhat related Semantic Web are hard to find.
    8. Blending RDF vocabularies is probably easier than blending XML vocabularies but it's not magic either! Some human must do the mapping of meanings between vocabularies.
    9. Too much abstraction and confusion might kill off RDF.
  4. Comments and Feedback

RDF made me feel stupid -- Maybe you won't have to

Ever have some subject area that you think you should be able to understand but can't quite manage despite valiant attempts? You know that it shouldn't be that hard but, for some reason, it eludes you. You end up saying, "But I'm not stupid....ok, well, maybe I am -- no, it's the subject that's stupid!"

The SemanticWeb (SW), along with the many terms rightfully or unfairly associated with it, is exactly such a subject area for me. Last week, as I was preparing my talk about RSS to the "Technical Solutions Group" at the California Digital Library, I made my third serious attempt over the last several years to understand RDF -- one of the lower layers of the "semantic web layer cake". Even though I have been using RSS for several years, I've largely been ignoring one of the important flavors of RSS, RSS 1.0, because of its use of RDF. I remember looking at plenty of RSS 1.0 files and being puzzled about exactly what they meant. I figured that this was a good time to try to get to the bottom of what RDF was.

There have been a few eureka moments but no Damascus Road experience yet. Call me an honest seeker on the road to RDF/Semantic Web enlightenment. Although I didn't manage to figure out all the various pieces of the puzzle, I wanted to present some tentative conclusions and outline my current understanding of the topic. Hopefully, my write-up will help others have an easier time making sense of RDF and the semantic web.

Why have I been so interested in learning about RDF?

I've been doggedly pursuing RDF because I have suspected that it would be a very useful technology, one that helps to solve some very basic problems that I am tackling. Three specific problems come to mind:

Specific Problems to Solve with RDF

semantic interoperability among various XML specifications

We at the IU and the UCB Library have been studying how to translate materials encoded in various XML specifications useful in the library and educational technology world. The basic approach I've been taking is to sit down and write crosswalks -- often in XSLT -- to translate one specification to another. See "A Preliminary Crosswalk from METS to IMS-CP", which summarizes work by Rick Beaubien and myself. Now, we are looking at the next steps to take.

--> For doing these kinds of crosswalks, I think you'd also need OWL, although that's currently a level beyond me. --Tom Hoffman

I don't think that there is an alternative to the painstaking, laborious hand-generated crosswalks between specifications that we have been pursuing. If there is a better, even automagical, approach, we want to know about it! The rhetoric behind RDF seems to promise a better solution than what we have been pursuing. For example, the answer to the question "What is RDF?" is:

Such statements make RDF sound wonderful (especially at the expense of XML, I might add) -- but is RDF too wonderful to be true? The fact that key projects such as SIMILE and HARMONY directed at solving semantic interoperability issues use RDF gives credence to these assertions. So the question then becomes, "OK, RDF might help -- but exactly how does it help? With RDF, do we need to map elements between specifications or does RDF somehow magically take care of mappings? If so then why the hype of RDF over XML?" I could see in the abstract how RDF might be useful but I couldn't see how RDF is going to make life gloriously easy. And does using RDF mean first recasting METS and IMS-CP (two XML specifications we are looking at) into RDF/XML?

representing disparate data formats and materials in the Scholar's Box

The architecture for the Scholar's Box needs to handle the blending of multiple formats of data from many disparate sources. RDF is supposed to make such blending easier. But is it easier only if we are trying to blend RDF data? Can one retrofit XML data to fit into such a scheme, and if so, how? Increasing numbers of repositories are creating XML data (Amazon.com, the California Digital Library, a lot of non-RSS 1.0 data feeds) -- and I want to blend them. Does RDF help me?

--> I think it depends on exactly where your data stops and your metadata starts. If you want to create a repository for metadata about these sources of XML data, that is, a central index that points to the other XML resources, RDF does that quite well out of the box. You absolutely can whip up some plain old XML that will do the same thing, but you'll be in effect recreating the functionality of RDF, which may be harder than you think and make you look stupid in a few years if RDF continues to grow. --TEH

extensibility of XML specifications

In RSS 2.0, METS, and IMS-CP, one can extend the metadata elements by adding elements in non-native XML namespaces. Is this mechanism of extensibility practically much worse than that of RDF (and the specific instance of RSS 1.0)? I'd love to see real examples to demonstrate one way or the other. If RDF-type extensibility is so much better, then how do we retrofit all this non-RDF XML to take advantage of superior RDF-type extensibility?

--> Practically speaking, this may have more to do with RDF-based applications' ability to ignore elements they don't understand without failing than anything else. --TEH

--> On the other hand, I keep trying to come up with a plain XML analogue to Brownsauce. Even when it doesn't have an RDF schema to draw from, it allows one to browse arbitrary RDF in a meaningful manner. It seems like you should be able to make a similar XML browser, but I can't think of one. Later... actually just a DOM browser like Mozilla's is probably the proper analogue. Never mind... --TEH
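The namespace-based extension mechanism discussed in this section, and the "ignore what you don't understand" behavior mentioned in the comments, can be sketched in plain XML terms. This is only a minimal illustration using Python's standard library: the sample item, the `http://example.org/ext#` namespace, the `KNOWN` table, and the `read_item` function are all made up for the example and do not come from any of the specifications named above.

```python
import xml.etree.ElementTree as ET

# An RSS 2.0-style item extended with a Dublin Core element and with an
# element from a made-up namespace (http://example.org/ext#).
ITEM = """<item xmlns:dc="http://purl.org/dc/elements/1.1/"
      xmlns:ex="http://example.org/ext#">
  <title>Figuring out RDF</title>
  <link>http://example.org/blog/1</link>
  <dc:creator>A. Author</dc:creator>
  <ex:mood>puzzled</ex:mood>
</item>"""

# Elements this hypothetical consumer understands, keyed by the
# '{namespace}local' names that ElementTree reports.
KNOWN = {
    "title": "title",
    "link": "link",
    "{http://purl.org/dc/elements/1.1/}creator": "creator",
}

def read_item(xml_text):
    """Keep known elements; record and skip anything else instead of failing."""
    item, skipped = {}, []
    for child in ET.fromstring(xml_text):
        if child.tag in KNOWN:
            item[KNOWN[child.tag]] = (child.text or "").strip()
        else:
            skipped.append(child.tag)
    return item, skipped

item, skipped = read_item(ITEM)
print(item)
print(skipped)
```

The consumer survives the unknown `ex:mood` element, but it also learns nothing from it. Whether RDF does practically better on that point is exactly the open question raised above.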

Besides solving specific problems, RDF is getting some traction in a number of places. Hence it might be time to see what all the fuss is about.

Though I've been working with RSS for years now, I never fully understood RSS 1.0, a major flavor of RSS, because of its RDF basis; all the other flavors of RSS seem straightforward by comparison. Nevertheless, the fact that a number of applications (RSS 1.0, Chandler, Adobe XMP) are using RDF indicates to me that RDF is on the verge of critical adoption and has moved beyond a lab experiment. So even if I'm skeptical, I want to understand. (I've wanted to use those three applications for a while now.)

RDF and the semantic web seem to have hit the education world too. Terry Anderson's talk at Merlot2003, "Beyond Learning Objects: Towards the Educational Semantic Web", as reported by Greg Ritter, is the first detailed example I've seen of work in this direction. I now want to understand the connections between learning objects and the "Educational Semantic Web".

A quick note on my own background

I come at trying to understand RDF with a solid background in the XML family of technologies (XML, XSLT, XML schemas) and some specific applications of XML (RSS 0.9x, RSS 2.0, METS, IMS-CP, IMS-MD). However, I have little knowledge of artificial intelligence, knowledge representation, and functional languages. Your background is probably different -- some things that are clear to you might be unclear to me (and vice versa).

My current understanding of RDF

Here I outline a set of tentative conclusions. I don't attempt to provide a tutorial on RDF below, save in passing, though I do point to useful resources for understanding RDF.

RDF is promising technology in spite of all the confusing hype around it.

Though I don't think that RDF has yet proven itself, I can definitely see its potential. The PIE wiki nails this central point:

I've been frustrated that it's taken a long time to come to this very basic (and hardly earth-shattering) conclusion. The whole RDF scene is incredibly confusing -- and a lot of people are confused.

Why is RDF so hard to get? In RDF, What's It Good For?, Kendall Clark argues that RDF might be a victim of bad technology evangelism:

RDF is not a monolithic topic. RDF can be used independently of the Semantic Web. RDF is not inherently tied to XML.

I found the way that Mark Pilgrim's essay "Should Atom Use RDF?" lays out "four related but completely independent issues" extremely helpful. Let me quote the beginning of the essay:

--> I think that RDF is always overkill for a single, clearly defined, isolated application. It is only in the interaction between different applications and distributed data sources that it becomes worthwhile. --Tom

I had conflated these issues, which made everything difficult to understand. Take the issues individually, and things won't be so confusing.

The RDF triple concept is a simple, elegant, and seemingly powerful one at its heart.

For some reason, it took me a long time to get the key concept behind RDF -- not because the concept is that difficult, but because a lot of other things seem to obscure it.

Let me first try to explain in my own words (though the following explanation might not be quite right):

Tim Bray's "What is RDF?" was the first essay that I read in my attempts to understand RDF. It's still very good. However, the triples idea was still unclear to me after reading the essay. (And I don't blame Tim Bray for that, since the idea is clearly in the essay.) So I would say to readers that one should follow up Bray's essay by reading something like Aaron Swartz's "RDF Primer Primer". The two complement each other.
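To make the triple idea concrete, here is a minimal sketch in plain Python rather than in any RDF library. It assumes only that an RDF statement is a (subject, predicate, object) triple and that a set of such triples forms a graph. The `example.org` URIs, the `worksAt` property, and the `match` helper are made up for illustration; the Dublin Core namespace is the real one.

```python
# A tiny in-memory triple store: each statement is (subject, predicate, object).
DC = "http://purl.org/dc/elements/1.1/"

triples = {
    ("http://example.org/essay", DC + "title", "Figuring Out RDF"),
    ("http://example.org/essay", DC + "creator", "http://example.org/people/author"),
    ("http://example.org/people/author", "http://example.org/terms/worksAt", "Example University"),
}

def match(store, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return sorted(
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )

# Everything that is said about the essay.
about_essay = match(triples, s="http://example.org/essay")

# The object of one triple can be the subject of another, so following
# the creator link leads to a second node in the graph.
creator = match(triples, s="http://example.org/essay", p=DC + "creator")[0][2]
creator_facts = match(triples, s=creator)

print(about_essay)
print(creator_facts)
```

Nothing here depends on XML: the triples are the data model, and RDF/XML is just one way of writing them down.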

RDF/XML is obscure to the uninitiated and makes it easy to confuse the relationship between RDF and XML.

RSS 1.0 was my first encounter with an RDF-based format. I looked at a sample RSS 1.0 document and really didn't understand how to work with it. (For instance, I did not know whether I could embed arbitrary XML fragments in other namespaces into an RSS 1.0 document.) Coming from a non-RDF XML background and seeing other XML documents did not prepare me for understanding RSS 1.0.

Since then, I've learned the following:

RDF Tools help a lot to make RDF understandable -- and usable.

The W3C RDF Validation Service helped me to see the underlying simplicity of RDF. I dropped the RSS 1.0 feed from my personal blog into the validator and then related the triples that emerged to the actual RDF/XML document -- that helped a lot.
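The exercise of relating triples back to the RDF/XML they came from can be imitated on a very small scale. The following is a sketch, not a real RDF/XML parser: it assumes the simplest possible shape (a typed node carrying `rdf:about`, whose children are literal-valued properties) and ignores everything else that RDF/XML allows. The sample feed, its `example.org` URL and date, and the `simple_triples` and `uri` functions are made up for illustration.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# A cut-down RSS 1.0 document with a single item.
RSS10 = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <item rdf:about="http://example.org/blog/1">
    <title>Figuring out RDF</title>
    <link>http://example.org/blog/1</link>
    <dc:date>2003-08-20</dc:date>
  </item>
</rdf:RDF>"""

def uri(tag):
    """Turn ElementTree's '{namespace}local' form into namespace + local."""
    if tag.startswith("{"):
        ns, local = tag[1:].split("}", 1)
        return ns + local
    return tag

def simple_triples(xml_text):
    """Extract triples from nodes with rdf:about and literal-valued children."""
    out = []
    for node in ET.fromstring(xml_text):
        subject = node.get("{%s}about" % RDF_NS)
        if subject is None:
            continue  # this sketch skips anything without rdf:about
        # The element name of the node states its type.
        out.append((subject, RDF_NS + "type", uri(node.tag)))
        # Each child element is a property; its text is the value.
        for prop in node:
            out.append((subject, uri(prop.tag), (prop.text or "").strip()))
    return out

for triple in simple_triples(RSS10):
    print(triple)
```

Seen this way, the `item` element is a subject, each child element name is a predicate, and the element text is the object. That reading is what the nested XML syntax tends to hide from someone used to ordinary XML documents.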

I am now also using BrownSauce on TomHoffman's advice and such Python tools as RDFLib.

RSS 1.0 is a good place to start with RDF.

RSS 1.0 is a good place to study RDF since the conceptual model behind RSS is simple -- and if you understand other flavors of RSS (such as RSS 0.91 and 2.0) and use them (in blogging or RSS aggregation), you will know at least what RSS is supposed to be about.

Non-hype filled assessments of RDF and the somewhat related Semantic Web are hard to find.

Proponents of promising technologies such as XML and RDF often damage the reputation of their technologies by over-selling what those technologies can do. If you believed all the hype, you would have first believed that XML was magic and that it was going to solve all our interoperability problems. Then along came the RDF folks, who said XML didn't solve all those problems but RDF will.

Consider, for example, the statement "XML is syntax, RDF is semantics" from the Semaview "At-a-Glance" Illustration Series: RDF and XML. I can see how this statement is true in the specific case of RDF/XML serialization, where the XML is being used to express the RDF. However, as a general statement, this is nonsense: Is an XML rendition of a DocBook just syntax? (As Mark Butler says, "The following statements are nonsense: 'RDF is more semantic than XML', 'RDF allows us to reason concretely about the real world', 'The power of RDF is its semantic model'.")

Given the hype around the Semantic Web (and just around RDF), I've been wondering how these technologies relate to other efforts. There is a need for a non-hype filled assessment of RDF and its relationship to a lot of other stuff that it gets associated with (fairly or unfairly), such as the semantic web in general, to help place it in context and answer questions: What is their relationship to AI? Knowledge representation? What philosophical presuppositions lie behind RDF and the Semantic Web?

My co-worker TomSchirmer pointed out how much RDF reminded him of Prolog. I could believe it -- but this fact is not commonly mentioned in discussions of RDF and the Semantic Web. It was, therefore, gratifying to read "An Introduction to Prolog and RDF" by Bijan Parsia, in which he makes the helpful point that the SW is AI (so why don't people just say so?!):

He goes on:

Thanks for clearing up the confusion!

In a similar vein, one of the most helpful resources I have found is Mark Butler's "Is the Semantic Web Hype?", to which I have already referred. The slides are terse but very insightful, and extremely helpful in sorting through the hype. We need more of these level-headed evaluations of the technologies -- and they shouldn't be so hard to find!

Blending RDF vocabularies is probably easier than blending XML vocabularies but it's not magic either! Some human must do the mapping of meanings between vocabularies.

As I wrote above, a major reason I'm looking into RDF is to see whether RDF makes it easier to blend and interconvert various data and metadata formats. My current conclusion is that blending RDF data is easier than blending XML data but that the blended RDF data doesn't magically reconcile various vocabularies.
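A small sketch may make the distinction clearer. It assumes that each source has already been reduced to a set of (subject, predicate, object) triples. The two feeds, the `example.org` URI, the `EQUIVALENT` table, and the `normalize` function are made up for illustration; the RSS 1.0 and Dublin Core namespaces are the real ones.

```python
DC = "http://purl.org/dc/elements/1.1/"
RSS = "http://purl.org/rss/1.0/"

# Two sources describing the same resource with different vocabularies.
feed_a = {("http://example.org/doc", RSS + "title", "Figuring Out RDF")}
feed_b = {
    ("http://example.org/doc", DC + "title", "Figuring Out RDF"),
    ("http://example.org/doc", DC + "creator", "A. Author"),
}

# Blending is the easy part: a plain set union, with no schema to merge.
merged = feed_a | feed_b

# The merged graph still holds two unrelated "title" predicates, though.
# Some human has to state that they mean the same thing (hypothetical mapping).
EQUIVALENT = {RSS + "title": DC + "title"}

def normalize(store, mapping):
    """Rewrite predicates according to a hand-written equivalence table."""
    return {(s, mapping.get(p, p), o) for (s, p, o) in store}

reconciled = normalize(merged, EQUIVALENT)

print(len(merged))
print(len(reconciled))
```

The union needed no knowledge of either vocabulary, which is the "easier" part. The `EQUIVALENT` table is the part that is not magic: it encodes a human judgment about meaning, however it ends up being expressed.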

This is an area with major confusion, partly fueled by hype from RDF advocates (see the RDF FAQ). A thread on Sam Ruby's blog illustrates this confusion but also helps to bring some light into the topic. Let me trace parts of the conversation:

Mark Pilgrim kicks it off:

Mark Baker disagrees:

Danny Ayers:

Mark Pilgrim doesn't buy it:

Danny Ayers:

Michael Bernstein elaborates on these inference bots:

Dan Brickley has a nice summary:

Outside of this thread, Mark Butler backs up this conclusion:

Jon Udell, I think, concurs:

Too much abstraction and confusion might kill off RDF.

I've seen how hard it was for people to pick up XML. It's been hard for me to understand RDF. Granted, it doesn't need to be that hard -- and there is a real need for killer apps and good teaching materials. (This rant isn't it -- but my write-up might help somebody else with my background make sense of RDF.)

I like what Sean McGrath proposes: not everyone has to learn the most abstract stuff -- but then, we'll have to make connections between the different levels of abstraction:

--> I agree with all the above, except the last. If RDF was going to die, it would already be dead. But the role it fits is necessary and inevitable. If it dies it will have to be recreated in almost exactly the same form, with all the same annoying complexities and contradictions.

Comments and Feedback

TomHoffman added his helpful comments to my essay and added a few more points on his blog.

Looking good! You might want to check out the ESW Wiki (they're working on RDF FAQs) and Dan Brickley's FOAF blog (may be a Wiki too) - sorry can't get links at the moment, my connection's playing up. DannyAyers


Why not WikiAsYouLearn?

It's good that you've got notes here, but you should integrate them with the ESW wiki.

That way you learn, we learn, we all learn. :)

-- 216.254.10.130 2004-07-03 04:32:40