Summary: This is a survey of the inconsistencies between the implementations of the Notation3 grammar, using the QName production as an example.
The following are relevant pieces excerpted from 8/9 Notation3 grammars and parsers. The /DesignIssues/Notation3 BNF is usually taken as definitive (but is broken). A conclusion and a recommendation on the actual QName production follow the various lists.
/DesignIssues/Notation3
    alpha = [A-Za-z]
    alphanumeric = [A-Za-z0-9_]
    prefix = ( alpha alphanumeric* ) | '_'
    localname = alpha alphanumeric*
    qname = prefix ":" localname

/2000/10/swap/notation3.py
    _namechars = [a-z] + [A-Z] + [0-9] + '_-'
    qname() => _namechars* ':' _namechars* # v 1.87+ 2001/08/23

/2000/10/swap/rdfn3.g
    PREFIX: r'[a-zA-Z0-9_-]*:'
    QNAME: r'([a-zA-Z][a-zA-Z0-9_-]*)?:[a-zA-Z0-9_-]+'
    EXVAR: r'_:[a-zA-Z0-9_-]+'

/2000/10/n3/notation3.py
    _namechars = [a-z] + [A-Z] + [0-9] + '_-'
    qname() => ( _namechars* ':' _namechars* ) | _namechars*

/2000/10/swap/n3spark.py
    qname: r' [a-zA-Z0-9_-]*:[a-zA-Z0-9_-]* '

/2001/03/flaten3/lexer.l
    wordchar ([_A-Za-z$!]|[0-9])
    {wordchar}*":"{wordchar}+

/cvsweb/~checkout~/2001/blindfold/sample/n3.bnf
    alpha ::= [a-zA-Z];
    alphanumeric ::= alpha | [0-9] | "_";
    nprefix ::= "" | ((alpha | "_") alphanumeric*);
    localname ::= alpha alphanumeric*;
    qname ::= nprefix ":" localname;

RDF::Notation3/Notation3.pm
    $tk =~ /^([_a-zA-Z]\w*)?:$/o)
    $tk =~ /^([_a-zA-Z]\w*)?:[a-zA-Z]\w*$/o

eep/n3.py
    Name = r'[A-Za-z0-9_]+'
    bNode = r'_:' + Name
    QName = r'[A-Za-z0-9]*:' + Name
    Prefix = r'[A-Za-z0-9]*:'
To summarize the various productions in a canonical format:-
/DesignIssues/Notation3
    prefix = [A-Za-z][A-Za-z0-9_]* | '_'
    name = [A-Za-z][A-Za-z0-9_]*

/2000/10/swap/notation3.py
    prefix = [A-Za-z0-9_-]*
    name = [A-Za-z0-9_-]*

/2000/10/swap/rdfn3.g
    prefix = [A-Za-z0-9_-]* | ([A-Za-z][A-Za-z0-9_-]*)? # ???
    name = [A-Za-z0-9_-]+

/2000/10/swap/n3spark.py
    prefix = [A-Za-z0-9_-]*
    name = [A-Za-z0-9_-]*

/2001/03/flaten3/lexer.l
    prefix = [A-Za-z0-9_$!]*
    name = [A-Za-z0-9_$!]+

/cvsweb/~checkout~/2001/blindfold/sample/n3.bnf
    prefix = '' | [A-Za-z_][A-Za-z0-9_]*
    name = [A-Za-z][A-Za-z0-9_]*

RDF::Notation3/Notation3.pm
    prefix = ([A-Za-z_]\w*)?
    name = [A-Za-z]\w*

eep/n3.py
    prefix = [A-Za-z0-9]* | '_'
    name = [A-Za-z0-9_]+
For comparison:-
Prefixes
    DesignIssues = [A-Za-z][A-Za-z0-9_]* | '_'
    notation3.py = [A-Za-z0-9_-]*
    rdfn3.g      = [A-Za-z0-9_-]* | ([A-Za-z][A-Za-z0-9_-]*)? # ???
    n3spark.py   = [A-Za-z0-9_-]*
    lexer.l      = [A-Za-z0-9_$!]*
    n3.bnf       = [A-Za-z_][A-Za-z0-9_]* | ''
    Notation3.pm = ([A-Za-z_]\w*)?
    Eep n3.py    = [A-Za-z0-9]* | '_'

Names
    DesignIssues = [A-Za-z][A-Za-z0-9_]*
    notation3.py = [A-Za-z0-9_-]*
    rdfn3.g      = [A-Za-z0-9_-]+
    n3spark.py   = [A-Za-z0-9_-]*
    lexer.l      = [A-Za-z0-9_$!]+
    n3.bnf       = [A-Za-z][A-Za-z0-9_]*
    Notation3.pm = [A-Za-z]\w*
    Eep n3.py    = [A-Za-z0-9_]+
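To see how far the implementations diverge on concrete input, the canonical productions can be transcribed into Python regular expressions and run against sample QNames. This is only a rough sketch: the PRODUCTIONS table and the accepts() helper are names made up for illustration, Perl's \w is read as the ASCII class [A-Za-z0-9_], and for rdfn3.g the prefix is taken from its QNAME token rather than its PREFIX token.

```python
import re

# (prefix, name) patterns transcribed from the comparison lists.
# Assumptions: \w is written out as [A-Za-z0-9_] (ASCII only), and the
# rdfn3.g prefix is the one embedded in its QNAME token.
PRODUCTIONS = {
    "DesignIssues": (r"[A-Za-z][A-Za-z0-9_]*|_", r"[A-Za-z][A-Za-z0-9_]*"),
    "notation3.py": (r"[A-Za-z0-9_-]*", r"[A-Za-z0-9_-]*"),
    "rdfn3.g": (r"([A-Za-z][A-Za-z0-9_-]*)?", r"[A-Za-z0-9_-]+"),
    "n3spark.py": (r"[A-Za-z0-9_-]*", r"[A-Za-z0-9_-]*"),
    "lexer.l": (r"[A-Za-z0-9_$!]*", r"[A-Za-z0-9_$!]+"),
    "n3.bnf": (r"([A-Za-z_][A-Za-z0-9_]*)?", r"[A-Za-z][A-Za-z0-9_]*"),
    # Looks the same as n3.bnf only under the ASCII reading of \w.
    "Notation3.pm": (r"([A-Za-z_][A-Za-z0-9_]*)?", r"[A-Za-z][A-Za-z0-9_]*"),
    "Eep n3.py": (r"[A-Za-z0-9]*|_", r"[A-Za-z0-9_]+"),
}


def accepts(qname):
    """Return the implementations whose production matches the whole QName."""
    prefix, colon, name = qname.partition(":")
    if not colon:
        return []  # every production listed here requires a colon
    return [
        impl
        for impl, (prefix_re, name_re) in PRODUCTIONS.items()
        if re.fullmatch(prefix_re, prefix) and re.fullmatch(name_re, name)
    ]


if __name__ == "__main__":
    for sample in ("dc:title", ":a", "_:x", "a-b:c"):
        print(sample, "->", ", ".join(accepts(sample)))
```

Under these assumptions, a plain QName such as dc:title passes everywhere, while the empty prefix, the "_" prefix, and a hyphenated prefix each split the implementations differently.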
It is interesting that only notation3.py and n3spark.py agree on the QName production. As already mentioned, the DesignIssues BNF is slightly ambiguous in that it lists "alpha", "alphanum*" and "_" as the prefixes, which doesn't make sense, and even disallows the "void" (empty) prefix. The rdfn3.g grammar is broken since it allows (for example) "_0" to be declared as a prefix, but not used in a QName. The Eep n3.py version was my initial interpretation of the production, and I will change it to the one that I recommend below.
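The rdfn3.g problem can be checked directly against the two token patterns excerpted from that grammar. The sketch below assumes that the tokens behave like ordinary Python regular expressions matched against the whole token; the sample strings "_0:" and "_0:thing" are made up for illustration.

```python
import re

# Token patterns copied from the rdfn3.g excerpt.
PREFIX = re.compile(r'[a-zA-Z0-9_-]*:')
QNAME = re.compile(r'([a-zA-Z][a-zA-Z0-9_-]*)?:[a-zA-Z0-9_-]+')

# "_0:" is a legal PREFIX token, so the prefix can be declared...
can_declare = PREFIX.fullmatch("_0:") is not None

# ...but a QName built on it is rejected, because the prefix part of
# QNAME must start with a letter (or be empty).
can_use = QNAME.fullmatch("_0:thing") is not None

print("declarable:", can_declare, "usable:", can_use)
```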
In general, it is better to "be conservative in what you write, and liberal in what you accept" (to paraphrase Tim), so Notation3 parsers should probably use the most liberal of the productions above, and Notation3 writers (including humans) the most conservative. However, it would be nice if everyone could agree on a production.
One implementation question is whether or not "_" as the bNode prefix should be overridable. CWM allows one to do this, but I think that this is confusing for people who are trying to learn Notation3, and not all that difficult to ban in a parser. The recommendation below is based upon all of the productions above.
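Banning the override could look something like the following in a parser's handling of prefix declarations. This is a minimal sketch, not taken from any of the parsers surveyed here: the PrefixError class, the bind_prefix() function and the dictionary of bindings are all hypothetical.

```python
class PrefixError(ValueError):
    """Raised when a document tries to rebind the reserved bNode prefix."""


def bind_prefix(bindings, prefix, uri):
    """Record a prefix declaration, refusing to rebind "_".

    `bindings` is a plain dict mapping prefix strings to namespace URIs
    (a hypothetical structure chosen for this sketch).
    """
    if prefix == "_":
        raise PrefixError('"_" is reserved for bNodes and cannot be declared')
    bindings[prefix] = uri
    return bindings


if __name__ == "__main__":
    table = {}
    bind_prefix(table, "dc", "http://purl.org/dc/elements/1.1/")
    try:
        bind_prefix(table, "_", "http://example.org/ns#")
    except PrefixError as err:
        print("rejected:", err)
```

A parser that would rather stay liberal could emit a warning at the same point instead of raising an error.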
Recommendation
    prefix = ([A-Za-z][A-Za-z0-9_]*)? | '_'
    name = [A-Za-z0-9_]+
Notes on the recommendation: The hyphen-minus "-" character is disallowed since the DesignIssues note excludes it from its grammar, reserving the character for future use. I have allowed "_" as the first character of a name since I have seen this used in various Notation3 files already, notwithstanding the fact that the DesignIssues Notation3 BNF, n3.bnf, and Notation3.pm disallow it.
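For anyone who wants to try the recommended production, it can be written as a single Python regular expression. The sketch below is one possible transcription; the QNAME pattern name and the split_qname() helper are made up for illustration.

```python
import re

# Recommended production: prefix = ([A-Za-z][A-Za-z0-9_]*)? | '_'
#                         name   = [A-Za-z0-9_]+
QNAME = re.compile(
    r"(?P<prefix>(?:[A-Za-z][A-Za-z0-9_]*)?|_)"
    r":"
    r"(?P<name>[A-Za-z0-9_]+)"
)


def split_qname(text):
    """Return (prefix, name) if text is a QName under the recommendation, else None."""
    match = QNAME.fullmatch(text)
    if match is None:
        return None
    return match.group("prefix"), match.group("name")


if __name__ == "__main__":
    for sample in ("dc:title", ":a", "_:b1", "dc:_x", "a-b:c", "_0:x", "dc:"):
        print(repr(sample), "->", split_qname(sample))
```

The empty prefix and the "_" prefix are both accepted, a name may begin with "_" or a digit, and anything containing a hyphen-minus is rejected.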
Todo: now repeat for all of the Notation3 productions!
Sean B. Palmer