This is a general collection of my Semantic Web hackings, often using CWM and the Notation3 (N3) format. It's about time I started collecting this in one place.
EARL is a generic language for the evaluation of resources, based on the RDF model. See the RDF Schema (in XML RDF, and in N3). Significant SW hackings based upon EARL are the 0.9 to 0.95 conversion experiment, filtering out disagreements, and the graph merging with WCAG, for which I had to create a machine readable WCAG version.
See also my Python EARL report program.
Eep3 is my RDF API playground code, used to develop features that can later be folded into the more stable semplesh RDF API. Here's what Eep3 contains in general:-
Nearby: the afon.py Notation3 parser, and release notes.
I wrote a Guide to the Closed World Machine, billed as "an anthology of information about CWM, what you need to get to run it, related files and work, and so on". Since documentation on CWM has always been rather lacking, the guide has been rather well received. It contains links to the files, a tarball, related projects, a Web service, tests, and lots of little hints and tips on what to do with CWM once you've got it.
I published an article entitled The Semantic Web: An Introduction. It's quite lengthy, and is intended as something that I can point people at, now that whatIsSW has gotten a bit out of date. It's quite a good general discussion of many SW principles.
I've made available a Semantic Web and Resource Description Framework Hints and Tips page. I quote: "It is important that on the Semantic Web, people produce data that is clean and interoperable. Some RDF techniques can currently only be learned through the RDF community, through hours of research, or through implementation experience, so this is an attempt to gather some useful but quick hints and tips into one place.".
I prepared a fairly comprehensive paper on methods of embedding RDF in HTML, which has been received warmly by various people. Here's the abstract:-
Since there is no one standardized approach for associating RDF compatible metadata with HTML, and since this is one of the most frequently asked questions on the RDF mailing lists, this document is provided as an outline of some RDF-in-HTML approaches that the author is aware of.
I released a small cryptography module for CWM which has now been integrated into the SWAP stuff. It allows one to generate keypairs, and sign and verify documents, as well as providing MD5 and SHA-1 hashing facilities. It seems to work well (TimBL is already powering a crypto demonstration with it); I did have to invent a new format for the keys, though:-
All I did was to encode the encryption exponent and the modulus of the key as base64 printed quotable blocks (plus the other 3 important bits of the key for keypairs). If you have a relatively granular cryptographic package, it should be possible to reconstruct a key from the blocks quite easily.
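As a rough illustration of the idea, here's a minimal Python sketch of encoding a key component as a base64 printed quotable block, using only the standard library. The labels, wrapping width, and the toy numbers are my own stand-ins, not the actual format the module used:

```python
import base64

def int_to_b64_block(n, label):
    """Encode an integer key component (e.g. the exponent or modulus)
    as a labelled, line-wrapped base64 block. The delimiter style is
    hypothetical, just to sketch the 'printed quotable block' idea."""
    raw = n.to_bytes((n.bit_length() + 7) // 8, "big")
    b64 = base64.b64encode(raw).decode("ascii")
    # wrap at 64 characters, like a PEM-style block
    lines = [b64[i:i + 64] for i in range(0, len(b64), 64)]
    return "-----%s-----\n%s\n-----END-----" % (label, "\n".join(lines))

# toy values, not a real keypair
e, n = 65537, 3233 * 7919
print(int_to_b64_block(e, "EXPONENT"))
print(int_to_b64_block(n, "MODULUS"))
```

Reconstructing the integer is just the reverse: strip the delimiter lines, base64-decode, and read the bytes back as a big-endian integer.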
It uses PyCrypt and mxCrypto, which are not tremendously easy to install, but should pose no huge problems to anyone with a *nix shell or Cygwin.
See the RDF Lint directory. It contains some rules files that you can merge using CWM with any schemata and/or instances, and it will find deductions about the data. If you filter the output with sparser.n3, then you have an effective RDF Schema "validator", which could be tweaked to validate instances.
I've hacked up some Simpsons related stuff in RDF, to test out stuff like graph merging, inferences, conversion to XML RDF, and screen scraping into XHTML form. This little experiment has it all! It is probably the best demonstration that I've come up with so far as to how the Semantic Web can be useful, and it's just about the only time that the Semantic Web has helped me out considerably in a day-to-day context.
I proposed a basic stripped down syntax for RDF which I call BSWL, or the Basic Semantic Web Language. Basically, it lets you do simple triples, and other stuff. cf. the news item on the Cover Pages
Instead of just putting some information under each WikiPage, you have to supply a predicate (which can be a WikiName or a URI), and an object (which can be a WikiName or a Literal). The Wiki can be exported as N-Triples (and only N-Triples at the moment).
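The export might be sketched roughly as follows; the base URI, the crude WikiName test, and the example triple are hypothetical stand-ins, not the Wiki's actual code:

```python
# Hypothetical base URI for resolving WikiNames
WIKI_BASE = "http://example.org/wiki/"

def term(t):
    """Render a WikiName as a URI under WIKI_BASE, pass full URIs
    through, and quote anything else as an N-Triples literal."""
    if t.startswith("http://") or t.startswith("https://"):
        return "<%s>" % t
    if t[:1].isupper() and t.isalnum():   # crude WikiName test
        return "<%s%s>" % (WIKI_BASE, t)
    return '"%s"' % t.replace('"', '\\"')

def export(triples):
    """Serialize (subject, predicate, object) tuples as N-Triples."""
    return "\n".join("%s %s %s ." % (term(s), term(p), term(o))
                     for s, p, o in triples)

print(export([("FrontPage",
               "http://purl.org/dc/elements/1.1/title",
               "The Front Page")]))
```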
Coordinates: Processing Distances in RDF with CWM. I managed to scrape (using a regexp) a list of coordinates for the main cities in the USA into Notation3, i.e. latitudes, longitudes, and information such as the names of the cities.
Then, after modifying the math built-ins module for CWM, I was able to perform some calculations on the data, such that I can find the distance between any two cities in America (that are on the list).
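The underlying calculation is the standard haversine great-circle formula; here's a quick Python equivalent of the kind of arithmetic the modified math built-ins make possible (the coordinates are approximate, and the city pair is just an example):

```python
from math import radians, sin, cos, asin, sqrt

def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, via the haversine
    formula, using a mean Earth radius of 6371 km."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    dlat, dlon = lat2 - lat1, lon2 - lon1
    a = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))

# e.g. New York to Los Angeles: roughly 3900-4000 km
print(round(distance_km(40.71, -74.01, 34.05, -118.24)))
```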
SWN: Terms for the Semantic Web. Quite experimental in that it uses XML RDF + XSLT at the moment to display information about the terms. The interesting thing is how it uses CWM to keep track of and manage the terms in the namespace/vocabulary. The relevant files are in the swns directory.
This was meant to be a "fix" for what goes at the end of the namespace; XML RDF for the machines (should be served as application/rdf+xml), and a transformation into XHTML for the humans. What could be better? To those who doubt this method, swn-get-doc.n3 is a very simple Notation3 rules file that will follow the rdfs:isDefinedBy links on the hub page, and get the documentation for the terms. (result).
stuff.html is a general introduction to some of the files, and what they are supposed to do. It (and many more indexes like it) should probably be maintained as Notation3, e.g. stuff.n3 (no automatic conversion into XHTML yet).
The properties that one can use at the moment are:-
crypto:md5 a rdf:Property;
   rdfs:label "md5";
   rdfs:comment "The MD5 hash of a string";
   rdfs:domain string:String;
   rdfs:range string:String .

crypto:sha a daml:UnambiguousProperty, daml:UniqueProperty;
   rdfs:label "sha";
   rdfs:comment "The SHA hash of a string";
   rdfs:domain string:String;
   rdfs:range string:String .
To get the hash of a file, you of course have to use log:content on it... I did consider just putting in a built-in function that would do that for you, but it seems more sensible to deploy one standard approach. I also considered using a new "hash:" URI scheme to identify hashes as first class objects on the Web, but after considering it carefully, decided not to.
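For comparison, the same string-to-hash mapping is easy to reproduce in plain Python with the standard library's hashlib; the document content below is a made-up example, not a real file:

```python
import hashlib

def md5_of(content):
    """Analogue of crypto:md5 -- the MD5 hex digest of a string
    (what you'd hash after extracting a document via log:content)."""
    return hashlib.md5(content.encode("utf-8")).hexdigest()

def sha_of(content):
    """Analogue of crypto:sha, using SHA-1 as the module did."""
    return hashlib.sha1(content.encode("utf-8")).hexdigest()

doc = ":Joe :loves :Mary .\n"   # hypothetical N3 content
print(md5_of(doc))
print(sha_of(doc))
```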
"a.n3" and "b.n3" in that folder are taken to be files which are just found on the Web. "proof.n3" is a proof that involves these files, and "checksums.n3" is a kind of KB of checksums that could automatically have deduced from "a.n3" and "b.n3" using a proof-checking machine. Perhaps this functionality could be added to CWM. Note that this experiment could have been repeated using digital signatures rather than proofs, but that would have been more difficult.
What we want to do is prove some assertion using statements on the Semantic Web. We do that by gathering data, validating it, and then applying inference rules to it until we get to our answer. In this case, we want to prove that ":Joe :loves :Mary".
What it does is simply check that the files mentioned have a particular checksum, and that they contain data relative to the inference rules. If so, then we return a "p:Success" for those particular checks. Once we have a certain number of checks that are successes, then we have our proof.
From the announcement: CWM can now do addition, multiplication, subtraction, division, remainders, negation, and exponentiation, count the members in a DAML list, and do the normal truth-checking functions, subclassed for numeric values.
I was a co-founder of the Semantic Web Agreement Group, along with Seth Russell, Aaron Swartz, and William Loughborough. SWAG is a group committed to ensuring data interoperability on the Semantic Web. We set up the WebNS dictionary, and published RDF Namespace Best Practices, for example.
I also published a Python version of the SWAG Dictionary.
python cwm.py sbp-wai.n3 -think -bySubject -rdf > sbp-wai.rdf
Note that there's an embedded XML stylesheet PI in sbp-wai.rdf, so that if you have IE5 with MSXML3, or IE6, then you'll automatically get the DOT output. Otherwise, use the W3C's XSLT service (see the non-PI version).
William Loughborough and I set up UWIMP to investigate annotating and indexing Web content, making the service available to anyone.
I built a Semantic Web homepage for myself, but found out that most of the stuff about me was boring. But really, why didn't it work? Because it was too general.
[...] just scattering the Web with such remarks will in the end be very interesting, but in the short term won't produce repeatable results unless we restrict the expressiveness of documents to solve particular application problems. - Semantic Web Roadmap, Tim Berners-Lee
And there's really no data that people would want to infer from just the simple stuff such as my publications and so forth. I do, however, maintain a small Semantic Web KB for myself based on FOAF terms, just for novelty purposes. I also published an archive of a small percentage of the Notation3 files that I have worked on; cf. the file index.
Sean's work is inspiring, interesting, exciting, and quite often completely useless. - Aaron Swartz