
The Simpsons - In RDF

This was to test a few things: screen scraping by hand, general RDF munging by CWM, and XSLT scraping into XHTML.

Implementation Notes

Firstly, I hacked up a Notation3 version of the Episodes QuickList (ql.n3). I'd also been taking down quotes from episodes, entering them into a form that output them in N3 format (quotes.n3).

"quotes-out.n3" is a cleaned-up version of that, produced with the following command line:-

     python cwm.py ql.n3 quotes.n3 -think -filter=quot-mergef.n3 > quotes-out.n3

Then, by running:-

     python cwm.py quotes-out.n3 -bySubject -rdf > quotes-out.rdf

you get the RDF version, as "quotes-out.rdf". If you go to that file, hopefully you'll actually get the XSLT transformation into XHTML, using "rdf2xhtml.xsl", because I've put the xml-stylesheet processing instruction (PI) at the top... that's if you're running IE5.5/6. If not, use the W3C's XSLT service to get the Quotes XHTML output.

There's some other stuff in the directory as well - for example, "sbp-top.n3" is a list of my favourite episodes by ID, which I could then merge in with all the other stuff if I wanted to.

What's Goin' On?

Screen scraping by hand - how slow?

To get the quick list into Notation3 format, I had to copy, paste, and edit by hand, although I did come up with a couple of on-the-fly hacks and use CWM to make it a bit easier. I could have run it through tidy and done an XSLT screen scrape on it I suppose, but why should I have to go to all that trouble?
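For what it's worth, the mechanical part of that hand editing could be sketched in a few lines of Python. This is only an illustration, not what I actually did: it assumes the quick list has already been reduced to tab-separated lines of episode number, season, and title, and the property names :season and :title are made up for the example (only :episodeNumber appears elsewhere in these notes).

```python
def n3_string(text):
    """Escape a Python string as a double-quoted N3 string literal."""
    return '"' + text.replace('\\', '\\\\').replace('"', '\\"') + '"'


def rows_to_n3(lines):
    """Turn tab-separated 'number, season, title' lines into N3 statements.

    The three-column input layout and the :season / :title property names
    are assumptions made for this sketch.
    """
    out = ["@prefix : <#> ."]
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines left over from copy and paste
        number, season, title = line.split("\t")
        # One anonymous node per episode, described by three properties.
        out.append(
            "[ :episodeNumber %s; :season %s; :title %s ] ."
            % (n3_string(number), n3_string(season), n3_string(title))
        )
    return "\n".join(out)


if __name__ == "__main__":
    # Placeholder season and title values, not real episode data.
    print(rows_to_n3(["3G01\tseason-placeholder\ttitle-placeholder"]))
```

The slow part, of course, is getting the page into tidy tab-separated lines in the first place, which is exactly the bit that still needs doing by hand.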

Anyhow, it's clear that we need to get more people to start providing the data that really counts in RDF... but when it comes down to it, you're probably going to have to do it yourself.

Just imagine a world where all of this was already done!

General RDF munging by CWM

The funny bit. All Simpsons episodes have unique episode numbers, which are noted in the closing credits. Because they're unique, you can use them as unambiguous properties for the episodes themselves:-

{ :episodeNumber a daml:UnambiguousProperty .
  :x :episodeNumber "3G01" . :y :episodeNumber "3G01" }
log:implies
{ :x = :y } .

Which is very useful for merging data. In this example, I merged a list of Homer quotes (with only the episode numbers) with the list of episodes to get information about the season, and title. Note that I could have done other things such as merging with a list of my favourite episodes, other people's favourite episodes, air dates, or whatever. Repurposability!
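To make the merging idea concrete outside of CWM, here is a rough Python analogy. It is not how CWM works internally, just a sketch of the same principle: records that share a value for an unambiguous key are treated as descriptions of one thing, and their properties are pooled. The function name, the dictionary layout, and the placeholder values are all invented for the example.

```python
from collections import defaultdict


def smush(key, *sources):
    """Pool the properties of all records that share a value for `key`.

    Each property maps to a set of values, since one episode may have
    several quotes (properties are multi-valued, as in RDF).
    """
    merged = defaultdict(lambda: defaultdict(set))
    for records in sources:
        for record in records:
            node = merged[record[key]]  # same key value, same thing
            for prop, value in record.items():
                node[prop].add(value)
    return merged


if __name__ == "__main__":
    # Placeholder data: only the episode number comes from the notes above.
    episodes = [{"episodeNumber": "3G01", "title": "title-placeholder"}]
    quotes = [
        {"episodeNumber": "3G01", "quote": "first quote placeholder"},
        {"episodeNumber": "3G01", "quote": "second quote placeholder"},
    ]
    combined = smush("episodeNumber", episodes, quotes)
    for number, props in combined.items():
        print(number, {prop: sorted(values) for prop, values in props.items()})
```

Adding another source, such as a list of favourite episodes keyed by the same number, is just one more argument to the call.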

XSLT scraping, a la DanC, into XHTML

Quite easy, but proves a point: XSLT is a useful tool. The whole point of this exercise is that I wanted to produce a short list of Simpsons quotes in XHTML for my own amusement and the amusement of others, but I didn't really want to have to look up all of the production details and so forth to do so. Having all of the information in RDF and then being able to merge it, process it, query it, and then transform it into perfectly generated XHTML is just great!

SBP (© Sean B. Palmer).