CWM is a popular Semantic Web program that can do the following tasks:-
CWM was written in Python from 2000-10 onwards by Tim Berners-Lee and Dan Connolly of the W3C.
This resource is provided so that people can use CWM, find out what it does (documentation used to be sparse), and perhaps even contribute to its development.
To install CWM, you will first need to install Python if you don't have it on your machine. The latest version is highly recommended: certainly upgrade if you are using a 1.x.x version, since CWM seems to depend upon SAX (the Simple API for XML). CWM worked properly with Python 2.2 as of 2002-08.
Next, you will need to get the CWM modules. All of the CWM material is developed and hosted on the W3C site, in the SWAP (Semantic Web Area for Play/Semantic Web Application Platform) directory. However, since not all of the files are publicly accessible (many of the files return "403 Forbidden"), Dan Connolly opened up the directory through CVS: /2000/10/swap/ in W3C CVS. The CVS mirror is approximately 50 minutes behind the canonical SWAP directory.
Important note: the latest version of CWM (files of 2002-02-25) is quite slow (check out the bug reports: 1, and 2). TimBL is gradually speeding it up, but note that the 1.82 distribution of CWM is relatively stable, and has the same kind of functionality. My personal recommendation is that you try the latest version first, but if that doesn't work, go for the .tar.gz of v1.82.
You should get the following modules and put them into a single directory:-
Reminder: the tar.gz file for CWM 1.82 is available as cwm1.82.tar.gz (Winzip should be able to read tar.gz, but if not, please let me know).
CWM relies upon the SAX XML parser in Python to parse XML RDF, but this usually does not come with the Python distribution, and has to be installed separately. On Debian you can run "apt-get install python-xml" (tip courtesy of DanB and DanC). The following details noted in sax2rdf.py may also help:-
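Before trying to parse XML RDF, it can help to confirm that a SAX parser is actually usable from your Python installation. The following is a minimal sketch written for a current Python interpreter (not the Python 2.2 setup described above); the function name sax_available is made up for illustration and is not part of CWM.

```python
def sax_available():
    """Return True if this Python installation can create a SAX parser."""
    try:
        # The import itself may fail on a stripped-down installation,
        # so it is kept inside the try block.
        import xml.sax
        xml.sax.make_parser()
    except Exception:
        # Deliberately broad: a missing module or a missing parser
        # driver both mean that XML RDF parsing will not work.
        return False
    return True


print("SAX parser available:", sax_available())
```

If this reports False, installing the XML package for your platform (as described above) is the first thing to try.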
If you run Python on CygWin, you can just grab the pre-compiled package from dbs. However, when you upgrade Python it seems that you have to reinstall it, which is very frustrating. If in doubt, try to use the standard Windows Python installation from CygWin (you can run Windows programs through CygWin).
There are a number of resources available to get you started on CWM and Notation3. The main ones come from TimBL himself: the Notation3 Primer, a page of CWM Examples, and the CWM Homepage. The CWM examples page is very good if you want to get started, but only describes a portion of what CWM can do, and leaves you to figure out the rest.
CWM is run at the command line as normal, and takes a number of different flags. The most common of these are listed on the SWAP page under Command line parameters (now up to date).
One important command line flag left off of the list is --strings. This prints out strings such that: "--strings Dump :s to stdout ordered by :k whereever [sic] { :k log:outputString :s }" (from cwm.py).
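To make the behaviour of --strings concrete, here is a small Python sketch of the idea described in that help text: collect every (:k, :s) pair from the log:outputString statements, order them by key, and write the strings out in that order. This is only a model of the description, not CWM's code; the function name dump_strings is hypothetical, and the assumption that keys are compared as plain strings is mine.

```python
def dump_strings(pairs):
    """Join the output strings in key order.

    pairs is an iterable of (key, string) tuples, one tuple for each
    { :k log:outputString :s } statement found in the store.
    """
    ordered = sorted(pairs, key=lambda pair: pair[0])
    return "".join(string for _key, string in ordered)


# Statements may be found in any order; the keys decide the output order.
print(dump_strings([("2", "world\n"), ("1", "hello ")]), end="")
```

The practical upshot is that the keys give you control over output order, which makes it possible to generate reports or other plain text from a set of rules.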
As a matter of interest, there is a log file of every time the string "CWM" was mentioned on the RDF IG channel.
I created a batch file in Windows for CWM, and I found that I most commonly used only a certain set of commands. You can of course customize your own commands, and perhaps make an sh file, but these are the commands that I find I most commonly use:-
python cwm.py %1 --think > %1.think
python cwm.py %1 --think --purge > %1.purge
python cwm.py %1 -rdf > %1.rdf
python cwm.py -rdf %1 -n3 > %1.n3
python cwm.py %1 -ntriples -bySubject > %1.temp || python cwmntclean.py %1.temp > %1.nt
I wrote a short CWM Utility as a Windows BAT or a Python script that can run some of the most popular commands; it allows you to have CWM running on your desktop, or from a very simple menu based interface.
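For readers who would rather drive CWM from Python than from a BAT file, the following sketch shows one way a wrapper could be laid out. It is not the CWM Utility mentioned above: the function names cwm_commands and run are invented for this example, and it assumes that "python" is on your PATH and that cwm.py is in the current directory. The flags are the ones from the command list in this section.

```python
import subprocess


def cwm_commands(filename, python="python", cwm="cwm.py"):
    """Map a short name to (argument list, output file) for one input file."""
    return {
        "think": ([python, cwm, filename, "--think"], filename + ".think"),
        "purge": ([python, cwm, filename, "--think", "--purge"],
                  filename + ".purge"),
        "rdf": ([python, cwm, filename, "-rdf"], filename + ".rdf"),
        "n3": ([python, cwm, "-rdf", filename, "-n3"], filename + ".n3"),
    }


def run(name, filename):
    """Run one named command, redirecting stdout to its output file."""
    argv, out_path = cwm_commands(filename)[name]
    with open(out_path, "w") as out:
        return subprocess.call(argv, stdout=out)
```

Calling run("think", "foo.n3") would then do the same job as the first batch command, writing the result to foo.n3.think.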
Because CWM does not produce valid NTriples (in version 1.82 and before, at least), I wrote a Python script that will clean it up: cwmntclean.py.
CWM can do many things: merge documents; apply rules (by adding or filtering); convert between XML RDF, NTriples, and Notation3; flatten contexts; perform queries (just a subset of rules); do math, string functions, and get os variables; and so on... It's difficult to know where to start getting into it.
The SWAP test directory contains a number of experiments run by Tim and Dan to test CWM's functions. Here is a summary of the pieces...
There is a large regression test which forms the basis of the things that get checked regularly in CWM.
One of the things that you notice when you have used CWM for a while is how close in philosophy it is to the set of *nix tools*. You tend to start setting up filters (N3 rules files etc.) that can do certain tasks. Here are just some of the files that have already been developed.
RDF Lint and check.n3 are basically things that assist with validation, but note that there is not really such a thing as "invalid" data on the Semantic Web: there is only fairly consistent, and inconsistent data. The utilities mentioned go through the data and flag the inconsistencies.
Quote from the SWAP homepage, exposing a central design philosophy:-
Cwm will run as a unix command, and is designed to be usable as a simple data manipulator for RDF on the lines of sed, awk, etc or xsl. - SWAP
Note that according to DanC, "Closed World Machine" is a misnomer because it can get documents from outside of what it is fed, by way of HTTP GET. It has two built-ins for getting stuff from files, log:semantics (gets the file and parses it as a set of N3 formulae) and log:content (gets the file and parses it as a string). Note also that in the strict sense of the phrase "Closed World", being able to gather files via HTTP GET is not a test of closed-worldness (thanks to Bijan Parsia for pointing that out).
CWM and Notation3 are synonymous, as N3 is the serialization of RDF that CWM was built up around. However, Notation3 itself was just designed as a Wiki RDF format by TimBL and DanC, and as such was never formally specified by a Working Group, nor recommended by the W3C. This has left implementations rather inconsistent, notwithstanding DanC's efforts to standardize the language.
I did a long survey of local N3 grammars, culminating with The Great QName Survey. The list of N3 implementations as of 2002-01 is as follows:-
23:04:31 <timbl> * timbl nurdles a p3pr:statement [ p:data [ p:ref :x ]]
[...]
23:05:07 <sbp> nurdle?
[...]
23:05:39 <DanC> danc:noodle = timbl:nurdle. it got garbled over the phone, I think.
[...]
23:06:14 <timbl> local:nurdle = english:cogitatesALittleAbout"
[...]
23:06:49 <sbp> Thanks. I'd always wondered about nurdle.n3.py, and all I could find were links to Tiddlywinks...
23:06:59 <timbl> In what langauge are your personal langauges expressed?
- http://ilrt.org/discovery/chatlogs/rdfig/2001-12-05.txt
There are currently a couple of Web services for CWM, including one on this very page.
SWAG have set up an N3 to RDF online service, which proves to be quite popular. It can also, in fact, convert from RDF to N3, and think about the stuff, so that you can enter rules etc.
The W3C maintain a small service, running Notation3.py as a CGI, referenced at the top of the N3 spec., and largely obsoleted by the SWAG service.
Here is a form for you to paste some N3, and convert into XML RDF (powered by some Aaron Swartz magic):-
There are a number of modules being written for CWM that let CWM do "special" things when it finds a rule with a certain predicate in it. For example, if a rule contains "<somefile.n3> log:content ?y", then CWM will actually open up "somefile.n3" and return its content as a string literal for ?y (N.B. adding the ? before a name is a shorthand for universally quantified variables).
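The general idea of a built-in can be modelled in a few lines of Python: a table maps predicate URIs to functions, and whenever the engine meets one of those predicates it calls the function on the subject instead of looking for a matching statement. The sketch below is only a toy model of that idea, not CWM's actual mechanism; the names BUILTINS, log_content and evaluate are invented here, and it handles local file paths only.

```python
LOG = "http://www.w3.org/2000/10/swap/log#"


def log_content(path):
    """Return the content of the named file as a string."""
    with open(path, encoding="utf-8") as handle:
        return handle.read()


# Hypothetical registry: predicate URI -> function computing the object.
BUILTINS = {LOG + "content": log_content}


def evaluate(subject, predicate):
    """Compute the object for a built-in predicate.

    Returns None when the predicate is an ordinary property, in which
    case a real engine would fall back to matching stored statements.
    """
    handler = BUILTINS.get(predicate)
    if handler is None:
        return None
    return handler(subject)
```

With this in place, evaluate("somefile.n3", LOG + "content") plays the role of binding ?y in the rule above.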
The "log:" namespace is very important:-
http://www.w3.org/2000/10/swap/log#
It contains the log:implies, log:forSome, and log:forAll pseudo-properties that are used for First Order Predicate Logic. However, there are a number of other terms in the namespace that do a certain amount of stuff (from $Id: llyn.py,v 1.4 2001/11/19 15:26:14 timbl Exp $):-
Note on using some of these together:-
{ ( [ is log:semantics of <../daml-ex.n3> ]
    [ is log:semantics of <../invalid-ex.n3> ]
    [ is log:semantics of <../schema-rules.n3> ] )
  log:conjunction [ log:conclusion :G] }
log:implies { :result :is :G }.
The above is a much more complicated way of writing the cwm command line "cwm daml-ex.n3 invalid-ex.n3 schema-rules.n3 --think".
- http://dev.w3.org/cvsweb/2000/10/swap/test/includes/conjunction.n3
http://www.w3.org/2000/10/swap/string#
This module contains built-ins that let you process strings. The following properties are defined:-
Try the string schema, and the cwm_string.py module for more information.
http://www.w3.org/2000/10/swap/os#
Try the os schema, and the cwm_os.py module; it contains a property that will make CWM get the appropriate OS environment variable.
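In Python terms, what such a property has to do is very small. The sketch below shows the lookup only; the function name os_environ is made up for illustration, and returning None for an unset variable is an assumption rather than documented CWM behaviour.

```python
import os


def os_environ(name):
    """Return the value of an OS environment variable, or None if unset."""
    return os.environ.get(name)


print(os_environ("PATH") is not None)
```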
http://www.w3.org/2000/10/swap/crypto#
The module is available as cwm_crypto.py. cf. Cryptography In CWM: Hashes. You'll need to add a couple of obvious lines to llyn.py in order to register the built-ins.
The properties that one can use at the moment are just hash functions, that is, CWM will return the hash of the string:-
crypto:md5 a rdf:Property;
    rdfs:label "md5";
    rdfs:comment "The MD5 hash of a string";
    rdfs:domain string:String;
    rdfs:range string:String .
crypto:sha a daml:UnambiguousProperty, daml:UniqueProperty;
    rdfs:label "sha";
    rdfs:comment "The SHA hash of a string";
    rdfs:domain string:String;
    rdfs:range string:String .
Note how SHA is assigned a higher trust level than its MD5 counterpart.
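For comparison, the same two hashes can be computed directly in Python with the standard hashlib module. This is a sketch of the underlying operation rather than the code in cwm_crypto.py; it assumes that "SHA" means SHA-1, that the string is encoded as UTF-8, and that the result is given as a hexadecimal digest. The function names are illustrative.

```python
import hashlib


def md5_hex(text):
    """MD5 hash of a string, as a hexadecimal digest."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def sha_hex(text):
    """SHA-1 hash of a string, as a hexadecimal digest (assumed variant)."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


print(md5_hex("abc"))
print(sha_hex("abc"))
```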
CWM: Mathematical Built-Ins. "CWM can now do addition, multiplication, subtraction, division, remainders, negation, exponentiation, count the members in a DAML list, and do the normal truth checking functions, only sub classed for numeric values."
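As a cross-check on the kind of arithmetic these built-ins perform, here is the calculation from the example in this section, (7 / 2) + ((7 % 2)^10000000) + 5, written out step by step in plain Python. It assumes Python 3, where / is true division; the variable names simply mirror the built-in names and are not CWM identifiers.

```python
quotient = 7 / 2                      # like math:quotientOf ("7" "2")
remainder = 7 % 2                     # like math:remainderOf ("7" "2")
power = remainder ** 10000000         # like math:exponentiationOf
member_count = len(["a", "b", "c", "d", "e"])  # like math:memberCount

total = quotient + power + member_count
print(total)  # 9.5
```

The huge exponent is harmless because the remainder is 1, and 1 raised to any power is still 1.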
For example:-
{ :x math:sumOf ([ math:quotientOf ("7" "2") ]
                 [ math:exponentiationOf ([ math:remainderOf ("7" "2")] "10000000") ]
                 [ is math:memberCount of ("a" "b" "c" "d" "e") ]) }
log:implies
{ :x :valueOf "(7 / 2) + ((7 % 2)^10000000) + 5 [should be 9.5]" } .
gives the correct output:-
"9.5" :valueOf "(7 / 2) + ((7 % 2)^10000000) + 5 [should be 9.5]" .
In development by Mark Nottingham.
20:51:57 <mnot> I'm working on a cwm_uri module, but I need to be able to instantiate complex, anonymous objects based on the subject, so it's slow going - http://ilrt.org/discovery/chatlogs/rdfig/2001-12-01.txt
Also in development by mnot, but this time with running code and tests! Try: CWM built-in for XPath. It requires PyXML to make it run (as does CWM in general for XML RDF processing).
In fact, the builtin syntax is rather simple - anyone with a fundamental knowledge of Python should be able to create a new builtin module just by going through the current builtins modules.
Deploying CWM on a large scale was never really on the cards, although lately it appears to be outgrowing its "play/demonstration code" status. CWM has managed to get implemented in a few projects.
NTriples is a special fixed subset of Notation3; without formulae, multiple object or po combinations, multiline literals, blank bNodes, or QNames; just good ol' triples, one per line. The NTriples specification is available from the W3C, edited by Dave Beckett and Art Barstow. NTriples is more expressive than XML RDF, easy to parse, and is an excellent lowest common denominator for serializations.
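To show how little is needed to read the one-triple-per-line format, here is a deliberately simplified Python sketch. It is not a conforming NTriples parser: it accepts URI references in angle brackets, named blank nodes, and double-quoted literals with an optional language tag or datatype, and it does not decode escape sequences or check which kind of term appears in which position. The names TERM, TRIPLE and parse_ntriples are illustrative.

```python
import re

# One term: <uri>, _:name, or "literal" with optional @lang or ^^<datatype>.
TERM = (r'(<[^>]*>'
        r'|_:[A-Za-z][A-Za-z0-9]*'
        r'|"(?:[^"\\]|\\.)*"(?:@[A-Za-z0-9-]+|\^\^<[^>]*>)?)')

# Three terms separated by whitespace, then a full stop.
TRIPLE = re.compile(r'^\s*' + TERM + r'\s+' + TERM + r'\s+' + TERM
                    + r'\s*\.\s*$')


def parse_ntriples(text):
    """Return a list of (subject, predicate, object) strings."""
    triples = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        match = TRIPLE.match(line)
        if match is None:
            raise ValueError("not a simple NTriples line: %r" % line)
        triples.append(match.groups())
    return triples
```

Each term is returned exactly as written, including its angle brackets or quotes, so the output can be written straight back out as NTriples.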
TimBL's "log", "string", "os", and "crypto" schemata, the PIM doc and contact schemata, and the EARL 0.95 schema.
The Euler proof mechanism is a proof engine written in Java by Jos De Roo, that uses Euler paths to infer without fear of endless loops. It can parse Notation3, including N3 rules.
CWMClone is an implementation of CWM in Prolog, under development by Bijan Parsia. The CWMClone page itself contains some useful instructions not only on running the program, but also on rolling your own CWM.
CWMClone is a development project at this stage, but does work rather solidly.
I also wrote a CWM clone (that wasn't initially meant to be a CWM clone): Eep RDF API, Inference Engine, and NTriples/N3 Parser.
There are plenty of things to consider when rolling your own version of CWM, besides the obvious "how many features should I implement?". Basically, CWM consists of the following parts:-
Each of these parts can be treated as essentially separate units.
@@ TODO: more stuff in this section.
I hope to reach cwm-enlightenment eventually, but I'm not holding my breath. - DanC, #rdfig 2001-05-07 00:18