This is a general collection of my Semantic Web hackings, often using CWM and the Notation3 (N3) format. It's about time I started collecting this in one place.
EARL is a generic language for the evaluation of resources, based on the RDF model. See the RDF Schema (in XML RDF, and in N3). Significant SW hackings based upon EARL are the 0.9 to 0.95 conversion experiment, filtering out disagreements, and the graph merging with WCAG, for which I had to create a machine readable WCAG version.
See also my Python EARL report program.
Eep3 is my RDF API playground code, used to develop features that can later be folded into the more stable semplesh RDF API. Here's what Eep3 contains in general:-
Nearby: the afon.py Notation3 parser, and release notes.
I wrote a Guide to the Closed World Machine, billed as "an anthology of information about CWM, what you need to get to run it, related files and work, and so on". Since documentation on CWM has always been rather lacking, the guide has been rather well received. It contains links to the files, a tarball, related projects, a Web service, tests, and lots of little hints and tips on what to do with CWM once you've got it.
I published an article entitled The Semantic Web: An Introduction. It's quite lengthy, and is intended as something that I can point people at, now that whatIsSW has gotten a bit out of date. It's quite a good general discussion of many SW principles.
I've made available a Semantic Web and Resource Description Framework Hints and Tips page. I quote: "It is important that on the Semantic Web, people produce data that is clean and interoperable. Some RDF techniques can currently only be learned through the RDF community, through hours of research, or through implementation experience, so this is an attempt to gather some useful but quick hints and tips into one place.".
I prepared a fairly comprehensive paper on methods of embedding RDF in HTML, which has been received warmly by various people. Here's the abstract:-
Since there is no one standardized approach for associating RDF compatible metadata with HTML, and since this is one of the most frequently asked questions on the RDF mailing lists, this document is provided as an outline of some RDF-in-HTML approaches that the author is aware of.
I released a small cryptography module for CWM which has now been integrated into the SWAP stuff. It allows one to generate keypairs, and sign and verify documents, as well as providing MD5 and SHA-1 hashing facilities. It seems to work well (TimBL is already powering a crypto demonstration with it); I did have to invent a new format for the keys, though:-
All I did was to encode the encryption exponent and the modulus of the key as base64 printed quotable blocks (plus the other 3 important bits of the key for keypairs). If you have a relatively granular cryptographic package, it should be possible to reconstruct a key from the blocks quite easily.
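As a rough illustration of the idea, here's a minimal Python sketch of encoding a key component as a base64 printed quotable block, using only the standard library. The labels, wrapping width, and the toy numbers are my own stand-ins, not the actual format the module used:

```python
import base64

def int_to_b64_block(n, label):
    """Encode an integer key component (e.g. the exponent or modulus)
    as a labelled, line-wrapped base64 block. The delimiter style is
    hypothetical, just to sketch the 'printed quotable block' idea."""
    raw = n.to_bytes((n.bit_length() + 7) // 8, "big")
    b64 = base64.b64encode(raw).decode("ascii")
    # wrap at 64 characters, like a PEM-style block
    lines = [b64[i:i + 64] for i in range(0, len(b64), 64)]
    return "-----%s-----\n%s\n-----END-----" % (label, "\n".join(lines))

# toy values, not a real keypair
e, n = 65537, 3233 * 7919
print(int_to_b64_block(e, "EXPONENT"))
print(int_to_b64_block(n, "MODULUS"))
```

Reconstructing the integer is just the reverse: strip the delimiter lines, base64-decode, and read the bytes back as a big-endian integer.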
It uses PyCrypt and mxCrypto, which are not tremendously easy to install, but should pose no huge problems to anyone with a *nix shell or Cygwin.
See the RDF Lint directory. It contains some rules files that you can merge using CWM with any schemata and/or instances, and it will find deductions about the data. If you filter the output with sparser.n3, then you have an effective RDF Schema "validator", which could be tweaked to validate instances.
I've hacked up some Simpsons related stuff in RDF, to test out stuff like graph merging, inferences, conversion to XML RDF, and screen scraping into XHTML form. This little experiment has it all! It is probably the best demonstration that I've come up with so far as to how the Semantic Web can be useful, and it's just about the only time that the Semantic Web has helped me out considerably in a day-to-day context.
I proposed a basic stripped down syntax for RDF which I call BSWL, or the Basic Semantic Web Language. Basically, it lets you do simple triples, and other stuff. cf. the news item on the Cover Pages
Instead of just putting some information under each WikiPage, you have to supply a predicate (which can be a WikiName or a URI), and an object (which can be a WikiName or a Literal). The Wiki can be exported as N-Triples (and only N-Triples at the moment).
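The export might be sketched roughly as follows; the base URI, the crude WikiName test, and the example triple are hypothetical stand-ins, not the Wiki's actual code:

```python
# Hypothetical base URI for resolving WikiNames
WIKI_BASE = "http://example.org/wiki/"

def term(t):
    """Render a WikiName as a URI under WIKI_BASE, pass full URIs
    through, and quote anything else as an N-Triples literal."""
    if t.startswith("http://") or t.startswith("https://"):
        return "<%s>" % t
    if t[:1].isupper() and t.isalnum():   # crude WikiName test
        return "<%s%s>" % (WIKI_BASE, t)
    return '"%s"' % t.replace('"', '\\"')

def export(triples):
    """Serialize (subject, predicate, object) tuples as N-Triples."""
    return "\n".join("%s %s %s ." % (term(s), term(p), term(o))
                     for s, p, o in triples)

print(export([("FrontPage",
               "http://purl.org/dc/elements/1.1/title",
               "The Front Page")]))
```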
Coordinates: Processing Distances in RDF with CWM. I managed to scrape (using a regexp) a list of coordinates for the main cities in the USA into Notation3, i.e. latitudes, longitudes, and information such as the names of the cities.
Then, after modifying the math built-ins module for CWM, I was able to perform some calculations on the data, such that I can find the distance between any two cities in America (that are on the list).
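The underlying calculation is the standard haversine great-circle formula; here's a quick Python equivalent of the kind of arithmetic the modified math built-ins make possible (the coordinates are approximate, and the city pair is just an example):

```python
from math import radians, sin, cos, asin, sqrt

def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, via the haversine
    formula, using a mean Earth radius of 6371 km."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    dlat, dlon = lat2 - lat1, lon2 - lon1
    a = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))

# e.g. New York to Los Angeles: roughly 3900-4000 km
print(round(distance_km(40.71, -74.01, 34.05, -118.24)))
```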
SWN: Terms for the Semantic Web. Quite experimental in that it uses XML RDF + XSLT at the moment to display information about the terms. The interesting thing is how it uses CWM to keep track of and manage the terms in the namespace/vocabulary. The relevant files are in the swns directory.
This was meant to be a "fix" for what goes at the end of the namespace; XML RDF for the machines (should be served as application/rdf+xml), and a transformation into XHTML for the humans. What could be better? To those who doubt this method, swn-get-doc.n3 is a very simple Notation3 rules file that will follow the rdfs:isDefinedBy links on the hub page, and get the documentation for the terms. (result).
stuff.html is a general introduction to some of the files, and what they are supposed to do. It (and many more indexes like it) should probably be maintained as Notation3, e.g. stuff.n3 (no automatic conversion into XHTML yet).
The properties that one can use at the moment are:-
crypto:md5 a rdf:Property;
   rdfs:label "md5";
   rdfs:comment "The MD5 hash of a string";
   rdfs:domain string:String;
   rdfs:range string:String .

crypto:sha a daml:UnambiguousProperty, daml:UniqueProperty;
   rdfs:label "sha";
   rdfs:comment "The SHA hash of a string";
   rdfs:domain string:String;
   rdfs:range string:String .
To get the hash of a file, you of course have to use log:content on it... I did consider just putting in a built-in function that would do that for you, but it seems more sensible to deploy one standard approach. I also considered using a new "hash:" URI scheme to identify hashes as first class objects on the Web, but after considering it carefully, decided not to.
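For comparison, the same string-to-hash mapping is easy to reproduce in plain Python with the standard library's hashlib; the document content below is a made-up example, not a real file:

```python
import hashlib

def md5_of(content):
    """Analogue of crypto:md5 -- the MD5 hex digest of a string
    (what you'd hash after extracting a document via log:content)."""
    return hashlib.md5(content.encode("utf-8")).hexdigest()

def sha_of(content):
    """Analogue of crypto:sha, using SHA-1 as the module did."""
    return hashlib.sha1(content.encode("utf-8")).hexdigest()

doc = ":Joe :loves :Mary .\n"   # hypothetical N3 content
print(md5_of(doc))
print(sha_of(doc))
```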
"a.n3" and "b.n3" in that folder are taken to be files which are just found on the Web. "proof.n3" is a proof that involves these files, and "checksums.n3" is a kind of KB of checksums that could automatically have deduced from "a.n3" and "b.n3" using a proof-checking machine. Perhaps this functionality could be added to CWM. Note that this experiment could have been repeated using digital signatures rather than proofs, but that would have been more difficult.
What we want to do is prove some assertion using statements on the Semantic Web. We do that by gathering data, validating it, and then applying inference rules to it until we get to our answer. In this case, we want to prove that ":Joe :loves :Mary".
What it does is simply check that the files mentioned have a particular checksum, and that they contain data relative to the inference rules. If so, then we return a "p:Success" for those particular checks. Once we have a certain number of checks that are successes, then we have our proof.
From the announcement: CWM can now do addition, multiplication, subtraction, division, remainders, negation, and exponentiation, count the members in a DAML list, and do the normal truth-checking functions, subclassed for numeric values.
I was a co-founder of the Semantic Web Agreement Group, along with Seth Russell, Aaron Swartz, and William Loughborough. SWAG is a group committed to ensuring data interoperability on the Semantic Web. We set up the WebNS dictionary, and published RDF Namespace Best Practices, for example.
I also published a Python version of the SWAG Dictionary.
python cwm.py sbp-wai.n3 -think -bySubject -rdf > sbp-wai.rdf
Note that there's an embedded XML stylesheet PI in sbp-wai.rdf, so that if you have IE5 with MSXML3, or IE6, then you'll automatically get the DOT output. Otherwise, use the W3C's XSLT service (see the non-PI version).
William Loughborough and I set up UWIMP to investigate annotating and indexing Web content, making the service available to anyone.
I built a Semantic Web homepage for myself, but found out that most of the stuff about me was boring. But really, why didn't it work? Because it was too general.
[...] just scattering the Web with such remarks will in the end be very interesting, but in the short term won't produce repeatable results unless we restrict the expressiveness of documents to solve particular application problems. - Semantic Web Roadmap, Tim Berners-Lee
And there's really no data that people would want to infer from just the simple stuff such as my publications and so forth. I do, however, maintain a small Semantic Web KB for myself based on FOAF terms, just for novelty purposes. I also published an archive of a small percentage of the Notation3 files that I have worked on; cf. the file index.
Sean's work is inspiring, interesting, exciting, and quite often completely useless. - Aaron Swartz