Pondering RDF Path

Most RDF Path proposals to date are sketchy: they either do not provide clear equivalents of the facilities in XPath, or do not account for the problems caused when a resource is selected that occurs both as a node and as an arc.

The notable exception to the lack of XPath inheritance is Stefan Kokkelink's proposal, which correctly identifies primary selection (selection of context nodes), filtering (tests on nodes), and location steps (routes from context nodes) as universal path constructs.

The nodes & arcs problem is more subtle. If an RDF Path language only provides mechanisms for obtaining nodes, then it has a severe drawback: arcs can never be selected. If, on the other hand, it does not distinguish nodes from arcs, then making a location step may result in ambiguity: if you want to get the rdf:type of rdf:type, and you do that using rdf:type/*, you will get both rdf:type and rdfs:Class as a result, which is clearly incorrect.
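The ambiguity can be reproduced with a few lines of Python. This is only a sketch under assumptions: the two-triple graph and the function names untyped_step and type_of are invented for illustration and are not part of any RDF Path proposal.

```python
# Hypothetical graph in which rdf:type appears both as a node and as an arc.
TRIPLES = [
    ("rdf:type", "rdf:type", "rdf:Property"),  # rdf:type as subject and as predicate
    ("rdfs:Class", "rdf:type", "rdfs:Class"),  # rdf:type as predicate only
]


def untyped_step(term, triples):
    """Step from `term` without saying whether it is meant as a node or an arc."""
    as_node = {p for s, p, o in triples if s == term}  # arcs leaving the node
    as_arc = {o for s, p, o in triples if p == term}   # nodes the arc points at
    return as_node | as_arc


def type_of(term, triples):
    """Treat `term` strictly as a node and follow only its rdf:type arcs."""
    return {o for s, p, o in triples if s == term and p == "rdf:type"}


mixed = untyped_step("rdf:type", TRIPLES)
typed = type_of("rdf:type", TRIPLES)
print(sorted(mixed))
print(sorted(typed))
```

In this sketch the untyped step mixes the arc rdf:type with nodes such as rdfs:Class, while the typed reading returns only the type of the node rdf:type.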

The following is an attempt at a triple-oriented RDF Path language. It aims to be clear in its approach and to mimic XPath as far as possible, treating a graph as an extended tree with no root. It also does not conflate arcs and nodes.

The Syntax

The following functions are used to select--or, if an argument is passed, to type--context nodes. For typing, node() is used as a default.

The following are used for selection of context nodes only.

arc(pred()) is equivalent to arc(), of course, and arc(subj()) is valid, and will return, as an arc, any resource that occurs as a subject.

QNames will be resolved in context; if RDF Path is being used in XML, they can be taken from the XML Namespaces in the document. The default namespace is given using a colon, :.

You can OR the selectors together using a pipe |. The ...() selectors can also take single arguments which AND together. And we define a couple of other abbreviations, too.

The / slash denotes moving to an arc if a node is currently selected, and a node if an arc is currently selected. In other words, typing is striped depending upon the root context's type. For example, node()/* will always be typed as an arc(), and arc()/* will always be typed as a node().
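The alternation between nodes and arcs can be sketched in Python. The Selection class and the step function are hypothetical names chosen for this illustration, and the triples are the ones from the example graph in the Examples section. An arc selection is assumed to remember the triples it came from, so that a later step knows which objects it leads to.

```python
from dataclasses import dataclass

GRAPH = [
    (":Quacky", ":name", '"Quacky"'),
    (":Quacky", ":owner", ":Bob"),
    (":Quacky", ":owner", ":Mary"),
    (":Bob", ":name", '"Bob"'),
    (":Mary", ":name", '"Mary"'),
]


@dataclass(frozen=True)
class Selection:
    kind: str         # "node" or "arc"
    items: frozenset  # node terms, or whole triples when kind == "arc"


def step(sel, graph, test="*"):
    """Apply one '/' step; the result kind is always the opposite of sel.kind."""
    if sel.kind == "node":
        # node -> arc: keep triples leaving a selected node whose predicate matches
        hits = frozenset(t for t in graph if t[0] in sel.items and test in ("*", t[1]))
        return Selection("arc", hits)
    # arc -> node: keep the objects of the selected triples that match the test
    hits = frozenset(t[2] for t in sel.items if test in ("*", t[2]))
    return Selection("node", hits)


start = Selection("node", frozenset({":Quacky"}))
arcs = step(start, GRAPH)   # corresponds to :Quacky/*
nodes = step(arcs, GRAPH)   # corresponds to :Quacky/*/*
arc_names = sorted({p for _, p, _ in arcs.items})
print(arcs.kind, arc_names)
print(nodes.kind, sorted(nodes.items))
```

Each call to step flips the kind, which is the striping described above: a node selection always yields arcs, and an arc selection always yields nodes.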

Filtering, i.e. tests on nodes, can be performed using []. A node passes the filter if the path inside the brackets holds for at least one triple.

Todo: special syntax for handling rdf:List etc., and typed nodes.

Examples

We use the following example graph:-

:Quacky :name "Quacky" .
:Quacky :owner :Bob .
:Quacky :owner :Mary .
:Bob :name "Bob" .
:Mary :name "Mary" .

For example, :Quacky on the example graph gives:-

:Quacky :name "Quacky" .
:Quacky :owner :Bob .
:Quacky :owner :Mary .
:Bob :name "Bob" .
:Mary :name "Mary" .

:Quacky/* gives:-

:Quacky -> :name "Quacky" .
:Quacky -> :owner :Bob .
:Quacky -> :owner :Mary .
:Bob :name "Bob" .
:Mary :name "Mary" .

pred(:owner) will not return anything on this graph, since :owner only occurs as a predicate. However, arc(:owner) will give:-

:Quacky :name "Quacky" .
:Quacky :owner :Bob .
:Quacky :owner :Mary .
:Bob :name "Bob" .
:Mary :name "Mary" .

And arc(:owner)/* gives:-

:Quacky :name "Quacky" .
:Quacky :owner -> :Bob .
:Quacky :owner -> :Mary .
:Bob :name "Bob" .
:Mary :name "Mary" .

objt(:Bob) gives:-

:Quacky :name "Quacky" .
:Quacky :owner :Bob .
:Quacky :owner :Mary .
:Bob :name "Bob" .
:Mary :name "Mary" .

objt(:Quacky) would return nothing, since :Quacky only ever occurs as a subject.

Now suppose we wanted each of :Quacky's :owner's :name. That would be :Quacky/:owner/*/:name/*.

:Quacky :name "Quacky" .
:Quacky -> :owner -> :Bob .
:Quacky -> :owner -> :Mary .
:Bob -> :name "Bob" .
:Mary -> :name "Mary" .
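As a rough check of this query, the following Python sketch evaluates a slash-separated path against the example graph. The evaluate function is hypothetical. It assumes that the first step names a start node, that * matches anything, and that no term contains a slash.

```python
GRAPH = [
    (":Quacky", ":name", '"Quacky"'),
    (":Quacky", ":owner", ":Bob"),
    (":Quacky", ":owner", ":Mary"),
    (":Bob", ":name", '"Bob"'),
    (":Mary", ":name", '"Mary"'),
]


def evaluate(path, graph):
    """Evaluate a path such as ':Quacky/:owner/*' and return (kind, sorted terms)."""
    first, *steps = path.split("/")
    nodes = {first}
    arcs = None  # a set of triples while the current selection is arcs
    for test in steps:
        if arcs is None:
            # node -> arc: triples leaving the current nodes with a matching predicate
            arcs = {t for t in graph if t[0] in nodes and test in ("*", t[1])}
        else:
            # arc -> node: matching objects of the current triples
            nodes = {t[2] for t in arcs if test in ("*", t[2])}
            arcs = None
    if arcs is not None:
        return "arcs", sorted({t[1] for t in arcs})
    return "nodes", sorted(nodes)


result = evaluate(":Quacky/:owner/*/:name/*", GRAPH)
print(result)
```

Under these assumptions the query yields the two name literals as nodes, and the shorter path :Quacky/* yields the arcs :name and :owner.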

If we wanted to get all of the owners of the resource whose name is "Quacky", we would do *[:name/"Quacky"]/:owner/*

:Quacky :name "Quacky" .
:Quacky -> :owner -> :Bob .
:Quacky -> :owner -> :Mary .
:Bob :name "Bob" .
:Mary :name "Mary" .
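The filter step can be sketched in the same way. The helper names all_nodes and filter_nodes are hypothetical, and the sketch assumes that a leading * selects every resource or literal occurring in subject or object position, and that a filter [p/o] keeps a node when the triple (node, p, o) exists in the graph.

```python
GRAPH = [
    (":Quacky", ":name", '"Quacky"'),
    (":Quacky", ":owner", ":Bob"),
    (":Quacky", ":owner", ":Mary"),
    (":Bob", ":name", '"Bob"'),
    (":Mary", ":name", '"Mary"'),
]


def all_nodes(graph):
    """Every term that occurs as a subject or an object (assumed meaning of '*')."""
    return {s for s, _, _ in graph} | {o for _, _, o in graph}


def filter_nodes(nodes, pred, obj, graph):
    """Keep the nodes for which the triple (node, pred, obj) is in the graph."""
    return {n for n in nodes if (n, pred, obj) in graph}


# *[:name/"Quacky"]
matched = filter_nodes(all_nodes(GRAPH), ":name", '"Quacky"', GRAPH)
# .../:owner/*
owners = sorted(o for s, p, o in GRAPH if s in matched and p == ":owner")
print(sorted(matched), owners)
```

Only :Quacky survives the filter here, so the remaining steps behave exactly like :Quacky/:owner/*.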

Summary of the Examples

Each RDF Path query can be said to return a list of either arcs or nodes. We denote this here by either arcs[..., ...] or nodes[..., ...].

Query                              Result
:Quacky                            nodes[:Quacky]
:Quacky/*                          arcs[:name, :owner]
pred(:owner)                       nodes[]
arc(:owner)                        arcs[:owner]
arc(:owner)/*                      nodes[:Bob, :Mary]
objt(:Bob)                         nodes[:Bob]
objt(:Quacky)                      nodes[]
:Quacky/:owner/*/:name/*           nodes["Bob", "Mary"]
*[:name/"Quacky"]/:owner/*         nodes[:Bob, :Mary]

References

Other RDF Path attempts include...