This document describes the Compact Syntax for RELAX NG (a schema language for XML). The design goals of this syntax are:
The syntax is similar to the type syntax in the XQuery 1.0 Formal Semantics W3C Working Draft.
The syntax is defined by the following BNF:
topLevel ::= decl* topLevelBody
topLevelBody ::= pattern | grammar
decl ::= "namespace" identifier "=" (literal | "inherit")
  | "default" "namespace" identifier? "=" (literal | "inherit")
  | "datatypes" identifier "=" literal
pattern ::= particle
  | particle ("|" particle)+
  | particle ("," particle)+
  | particle ("&" particle)+
  | exceptParticle
particle ::= annotations primary followAnnotations occurrence?
exceptParticle ::= annotations datatypeName params? "-" annotations primary followAnnotations
primary ::= "(" pattern ")"
  | "element" nameClass "{" pattern "}"
  | "attribute" nameClass "{" pattern "}"
  | "mixed" "{" pattern "}"
  | "empty"
  | "notAllowed"
  | "text"
  | "list" "{" pattern "}"
  | datatypeName params?
  | datatypeName? datatypeValue
  | "grammar" "{" grammar "}"
  | ref
  | "parent" ref
  | "externalRef" literal inherit?
occurrence ::= ("*" | "+" | "?") followAnnotations
nameClass ::= basicNameClass followAnnotations
  | basicNameClass followAnnotations ("|" basicNameClass followAnnotations)+
  | openNameClass "-" basicNameClass followAnnotations
basicNameClass ::= annotations (identifier | CName)
  | openNameClass
  | annotations "(" nameClass ")"
openNameClass ::= annotations (nsName | anyName)
ref ::= identifierNotKeyword
datatypeName ::= CName | "string" | "token"
datatypeValue ::= literal
params ::= "{" (annotations identifier "=" literal)+ "}"
grammar ::= (definition | include | div | annotationElementNotKeyword)*
definition ::= annotations subject ("=" | "|=" | "&=") pattern
subject ::= "start" | identifierNotKeyword
include ::= annotations "include" literal inherit? includeBody?
includeBody ::= "{" (definition | includeDiv | annotationElementNotKeyword)* "}"
div ::= annotations "div" "{" grammar "}"
includeDiv ::= annotations "div" includeBody
inherit ::= "inherit" "=" identifier
followAnnotations ::= (">>" annotationElement)*
annotations ::= documentation* otherAnnotation?
otherAnnotation ::= "[" prefixedAnnotationAttribute* annotationElement* "]"
annotationAttribute ::= (identifier | CName) "=" literal
prefixedAnnotationAttribute ::= CName "=" literal
annotationElement ::= (identifier | CName) annotationElementBody
annotationElementNotKeyword ::= (identifierNotKeyword | CName) annotationElementBody
annotationElementBody ::= "[" annotationAttribute* (annotationElement | literal)* "]"
identifierNotKeyword ::= identifier - keyword
identifier ::= NCName | escapedIdentifier
keyword ::= "attribute" | "default" | "datatypes" | "div" | "element"
  | "empty" | "externalRef" | "grammar" | "include" | "inherit"
  | "list" | "mixed" | "namespace" | "notAllowed" | "parent"
  | "start" | "string" | "text" | "token"
CName ::= NCName ":" NCName
escapedIdentifier ::= "\" NCName
literal ::= literalSegment+
literalSegment ::= '"' [^"]* '"' | "'" [^']* "'"
nsName ::= NCName ":*"
anyName ::= "*"
documentation ::= "##" [^#xA]* (#xA [#x9#x20]* "##" [^#xA]*)*
The contents of consecutive literalSegments in a literal are concatenated.
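The concatenation rule can be illustrated with a minimal Python sketch. It follows the literalSegment production above; the function name `literal_value` is hypothetical, and allowing whitespace between segments is an assumption of this sketch rather than something the grammar states.

```python
import re

# One literalSegment: a double- or single-quoted run that contains no
# quote character of the same kind, as in the literalSegment production.
SEGMENT = re.compile(r'"([^"]*)"|\'([^\']*)\'')


def literal_value(source: str) -> str:
    """Concatenate the contents of consecutive literal segments.

    Assumption: segments may be separated by whitespace, which is skipped.
    """
    parts = []
    pos = 0
    while pos < len(source):
        if source[pos].isspace():
            pos += 1
            continue
        match = SEGMENT.match(source, pos)
        if match is None:
            raise ValueError(f"not a literal segment at offset {pos}")
        # Exactly one of the two groups matched; an empty segment yields "".
        parts.append(match.group(1) if match.group(1) is not None else match.group(2))
        pos = match.end()
    if not parts:
        raise ValueError("a literal needs at least one segment")
    return "".join(parts)
```

Because neither kind of segment can contain its own quote character, mixing the two kinds is the way to write a value that contains both quote characters.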
Comments start with a # followed by anything other than # and continue to the end of the line.
element is defined in the XML 1.0 Recommendation; NCName is defined in the XML Namespaces Recommendation.
Note that keywords are case-sensitive. To use a keyword as the name of a definition, the keyword must be escaped with \. It is not necessary to escape a keyword that is used as the name of an element, attribute or datatype parameter.
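A tool that generates compact syntax might apply the escaping rule roughly as follows. This is a sketch only: the keyword set is copied from the keyword production above, and the helper name `definition_name` is hypothetical.

```python
# Keywords from the keyword production; matching is case-sensitive.
KEYWORDS = frozenset({
    "attribute", "default", "datatypes", "div", "element", "empty",
    "externalRef", "grammar", "include", "inherit", "list", "mixed",
    "namespace", "notAllowed", "parent", "start", "string", "text", "token",
})


def definition_name(name: str) -> str:
    """Return the spelling of `name` to use as the name of a definition.

    A keyword is prefixed with a backslash; any other name is unchanged.
    Element, attribute and datatype parameter names need no such escaping.
    """
    return "\\" + name if name in KEYWORDS else name
```

For instance, a definition named element would be written \element, while Element is left alone because keywords are case-sensitive.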
Before parsing against the above grammar, the input string is first preprocessed by interpreting escapes. A sequence of characters \x{N}, where N consists of one or more hexadecimal digits, is replaced by the Unicode character with code N. For example,
element \x{66}\x{6f}\x{6f} { empty }
is equivalent to
element foo { empty }
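The preprocessing step could be sketched in Python as a single regular-expression substitution. The function name `interpret_escapes` is hypothetical, and the sketch does not decide what should happen when N is not a valid Unicode code point (Python's `chr` raises `ValueError` in that case).

```python
import re

# \x{N} where N is one or more hexadecimal digits.
ESCAPE = re.compile(r"\\x\{([0-9A-Fa-f]+)\}")


def interpret_escapes(text: str) -> str:
    """Replace each \\x{N} escape with the character whose code is N."""
    return ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)
```

Running this before tokenizing means that escapes can appear anywhere in the input, including inside keywords and identifiers, as in the example above.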
The correspondence between the compact syntax and RELAX NG's XML syntax is shown by the following tables.
Compact Syntax | RELAX NG Syntax |
---|---|
p1 \| p2 | <choice> p1 p2 </choice> |
p1 , p2 | <group> p1 p2 </group> |
p1 & p2 | <interleave> p1 p2 </interleave> |
p* | <zeroOrMore> p </zeroOrMore> |
p+ | <oneOrMore> p </oneOrMore> |
p? | <optional> p </optional> |
(p) | p |
element QName { p } | <element name="QName"> p </element> |
element nameClass { p } | <element> nameClass p </element> |
attribute QName { p } | <attribute name="QName"> p </attribute> |
attribute nameClass { p } | <attribute> nameClass p </attribute> |
empty |
<empty/> |
notAllowed |
<notAllowed/> |
text |
<text/> |
mixed { p } | <mixed> p </mixed> |
list { p } | <list> p </list> |
identifierNotKeyword |
<ref name="identifierNotKeyword"/> |
\identifier |
<ref name="identifier"/> |
externalRef "uri" |
<externalRef href="uri"/> |
parent identifier |
<parentRef name="identifier"/> |
grammar { defs } |
<grammar> defs </grammar> |
"string" |
<value>string</value> |
string |
<data type="string"/> |
token |
<data type="token"/> |
prefix:localName |
<data type="localName" datatypeLibrary="uri"/> |
prefix:localName "string" |
<value type="localName" datatypeLibrary="uri">string</value> |
prefix:localName - p |
<data type="localName" datatypeLibrary="uri"><except> p </except></data> |
prefix:localName { params } | <data type="localName" datatypeLibrary="uri"> params </data> |
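The operator rows of the table above can be read as a simple lookup from compact-syntax operator to XML element name. The following Python sketch builds the XML form with `xml.etree.ElementTree`; the helper names `combine` and `repeat` are hypothetical, and the RELAX NG namespace is left off the elements to keep the illustration short.

```python
import xml.etree.ElementTree as ET

# Connectors and occurrence indicators, as listed in the table.
BINARY = {"|": "choice", ",": "group", "&": "interleave"}
OCCURRENCE = {"*": "zeroOrMore", "+": "oneOrMore", "?": "optional"}


def combine(operator: str, patterns: list) -> ET.Element:
    """Wrap two or more patterns joined by the same connector."""
    elem = ET.Element(BINARY[operator])
    elem.extend(patterns)
    return elem


def repeat(operator: str, pattern: ET.Element) -> ET.Element:
    """Wrap a pattern that is followed by an occurrence indicator."""
    elem = ET.Element(OCCURRENCE[operator])
    elem.append(pattern)
    return elem
```

With these helpers, (text | empty)* would be built as repeat("*", combine("|", [ET.Element("text"), ET.Element("empty")])); the parentheses themselves produce no XML element.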
Compact Syntax | RELAX NG Syntax |
---|---|
QName |
<name>QName</name> |
prefix:* |
<nsName ns="uri"/> |
prefix:* - nameClass |
<nsName ns="uri"><except> nameClass </except></nsName> |
* |
<anyName/> |
* - nameClass |
<anyName><except> nameClass </except></anyName> |
nameClass1 \| nameClass2 | <choice> nameClass1 nameClass2 </choice> |
(nameClass) | nameClass |
Compact Syntax | RELAX NG Syntax |
---|---|
localName = "string" |
<param name="localName">string</param> |
Compact Syntax | RELAX NG Syntax |
---|---|
identifierNotKeyword = p |
<define name="identifierNotKeyword"> p </define> |
identifierNotKeyword |= p |
<define name="identifierNotKeyword" combine="choice"> p </define> |
identifierNotKeyword &= p |
<define name="identifierNotKeyword" combine="interleave"> p </define> |
start = p |
<start> p </start> |
\identifier = p |
<define name="identifier"> p </define> |
include "uri" |
<include href="uri"/> |
include "uri" { defs } |
<include href="uri"> defs </include> |
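The definition rows of the table above can be sketched in Python as follows. The function name `definition_to_xml` is hypothetical, the RELAX NG namespace is omitted for brevity, and applying the combine attribute to start is an assumption of this sketch, since the table only shows start with a plain =.

```python
import xml.etree.ElementTree as ET

# Assignment operators and the combine attribute value each one implies.
COMBINE = {"=": None, "|=": "choice", "&=": "interleave"}


def definition_to_xml(subject: str, operator: str, body: ET.Element) -> ET.Element:
    """Build the XML form of `subject operator body`.

    `subject` is the name as written in the compact syntax, so an escaped
    keyword arrives with its leading backslash.
    """
    if operator not in COMBINE:
        raise ValueError(f"unknown assignment operator: {operator!r}")
    if subject == "start":
        elem = ET.Element("start")
    else:
        # Drop the escaping backslash, if any; it is not part of the name.
        name = subject[1:] if subject.startswith("\\") else subject
        elem = ET.Element("define", {"name": name})
    combine = COMBINE[operator]
    if combine is not None:
        elem.set("combine", combine)
    elem.append(body)
    return elem
```

Note the difference between start and \start in this sketch: the first produces a start element, the second a define element whose name is start.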
A datatypes declaration declares a prefix used in a QName identifying a datatype. For example,
datatypes xsd = "http://www.w3.org/2001/XMLSchema-datatypes"
element height { xsd:double }
A namespace declaration declares a prefix used in a QName specifying the name of an element or attribute. For example,
namespace rng = "http://relaxng.org/ns/structure/1.0"
element rng:text { empty }
A default namespace declaration declares the namespace used for unprefixed names specifying the name of an element (but not of an attribute). For example,
default namespace = "http://example.com"
element foo { attribute bar { string } }
is equivalent to
namespace ex = "http://example.com"
element ex:foo { attribute bar { string } }
A default namespace declaration may have a prefix as well. For example,
default namespace ex = "http://example.com"
is equivalent to
default namespace = "http://example.com"
namespace ex = "http://example.com"
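The way these declarations interact when a name is resolved can be sketched in Python. The function `resolve_name` and its parameters are hypothetical; the sketch represents an absent namespace URI as the empty string and does not handle inherit.

```python
def resolve_name(qname: str, kind: str, prefixes: dict, default_ns: str) -> tuple:
    """Return (namespace URI, local name) for an element or attribute name.

    `prefixes` maps each prefix declared with `namespace` to its URI;
    `default_ns` is the URI from the `default namespace` declaration;
    `kind` is "element" or "attribute".
    """
    if ":" in qname:
        prefix, local = qname.split(":", 1)
        return prefixes[prefix], local
    if kind == "element":
        # Unprefixed element names take the default namespace.
        return default_ns, qname
    # Unprefixed attribute names are not affected by the default namespace.
    return "", qname
```

Under these assumptions, element foo and element ex:foo resolve to the same name when ex is bound to the default namespace URI, which matches the equivalence shown above.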
The URI may be empty. This makes the prefix stand for the absent namespace URI. This is necessary for specifying a name class that matches any name with an absent namespace URI. For example:
namespace local = ""
element foo { attribute * - local:* { string }* }
is equivalent to
<element xmlns="http://relaxng.org/ns/structure/1.0" name="foo" ns="http://example.com">
  <zeroOrMore>
    <attribute>
      <anyName>
        <except>
          <nsName ns=""/>
        </except>
      </anyName>
      <data type="string"/>
    </attribute>
  </zeroOrMore>
</element>
RELAX NG has the feature that if a file does not specify an ns attribute then the ns attribute can be inherited from the including file. To support this feature, the keyword inherit can be specified in place of the namespace URI in a namespace declaration. For example,
default namespace this = inherit
element foo { element * - this:* { string }* }
is equivalent to
<element xmlns="http://relaxng.org/ns/structure/1.0" name="foo">
  <zeroOrMore>
    <element>
      <anyName>
        <except>
          <nsName/>
        </except>
      </anyName>
      <data type="string"/>
    </element>
  </zeroOrMore>
</element>
In addition, the include and externalRef patterns can specify inherit = prefix to specify the namespace to be inherited by the referenced file. For example,
namespace x = "http://www.example.com"
externalRef "foo.rng" inherit = x
is equivalent to
<externalRef href="foo.rng"
    ns="http://www.example.com"
    xmlns="http://relaxng.org/ns/structure/1.0"/>
In the absence of an inherit parameter on include or externalRef, the default namespace will be inherited by the referenced file.
In the absence of a default namespace declaration, a declaration of default namespace = inherit is assumed.
RELAX NG supports two kinds of annotation: element annotations and attribute annotations. In the compact syntax, attribute annotations are written in a similar way to the XML syntax, for example, xml:lang = "en". Element annotations are written using the syntax
elementName [ attributesAndContent ]
where elementName is the QName of the element and attributesAndContent is a list of attributes followed by a list of elements and literals.
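Annotations are attached in one of the following ways:
- a construct may be preceded by annotations enclosed in [ and ]; in the XML syntax, attribute annotations become attributes of the corresponding element and element annotations become its child elements
- a pattern or name class may be followed by >> and then an element annotation; this is equivalent to a following sibling element in the XML syntax
For example,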
namespace a = "http://relaxng.org/ns/compatibility/annotations/1.0"
[ a:documentation [ "Represents a foo" ] ]
element foo {
  [ a:defaultValue = "42" ]
  attribute bar { text }?,
  empty
}
turns into
<element name="foo"
    xmlns="http://relaxng.org/ns/structure/1.0"
    xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0">
  <a:documentation>Represents a foo</a:documentation>
  <optional>
    <attribute a:defaultValue="42" name="bar">
      <text/>
    </attribute>
  </optional>
  <empty/>
</element>
Here's another example using the RelaxNGCC annotations:
datatypes xsd = "http://www.w3.org/2001/XMLSchema-datatypes"
namespace c = "http://www.xml.gr.jp/xmlns/relaxngcc"
[ c:class="sample1" ]
start =
  element team {
    element player {
      attribute number {
        [ c:alias="number" ] xsd:positiveInteger
        >> c:java [ "System.out.println(number);" ]
      },
      element name {
        [ c:alias="name" ] text
        >> c:java [ "System.out.println(name);" ]
      }
    }+
  }
turns into
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
    datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes"
    xmlns:c="http://www.xml.gr.jp/xmlns/relaxngcc">
  <start c:class="sample1">
    <element name="team">
      <oneOrMore>
        <element name="player">
          <attribute name="number">
            <data c:alias="number" type="positiveInteger"/>
            <c:java>System.out.println(number);</c:java>
          </attribute>
          <element name="name">
            <text c:alias="name"/>
            <c:java>System.out.println(name);</c:java>
          </element>
        </element>
      </oneOrMore>
    </element>
  </start>
</grammar>
In addition, there is a special syntax for specifying documentation elements from the http://relaxng.org/ns/compatibility/annotations/1.0 namespace as described in RELAX NG DTD Compatibility. For example,
## Represents a foo
element foo { empty }
turns into
<element name="foo"
    xmlns="http://relaxng.org/ns/structure/1.0"
    xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0">
  <a:documentation>Represents a foo</a:documentation>
  <empty/>
</element>
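How a reader of the compact syntax might separate leading ## lines from the construct they document can be sketched in Python. The function name `split_documentation` is hypothetical. Two details are assumptions inferred from the example rather than stated rules: a single space after ## is dropped, and the text of consecutive ## lines is joined with newlines.

```python
def split_documentation(source: str) -> tuple:
    """Split leading ## lines from the rest of the input.

    Returns (documentation text or None, remaining source).
    """
    lines = source.split("\n")
    doc = []
    index = 0
    # The grammar allows spaces and tabs before "##" on continuation lines.
    while index < len(lines) and lines[index].lstrip(" \t").startswith("##"):
        text = lines[index].lstrip(" \t")[2:]
        if text.startswith(" "):
            text = text[1:]  # assumption: one space after ## is not content
        doc.append(text)
        index += 1
    rest = "\n".join(lines[index:])
    return ("\n".join(doc) if doc else None), rest
```

The line break after the ## text is significant: documentation runs to the end of its line, so the documented construct has to start on a following line.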
value
There is a problem in translating a schema such as
<element xmlns="http://relaxng.org/ns/structure/1.0"
    datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes"
    name="foo">
  <choice>
    <value type="QName" xmlns:bar="http://example.com/1">bar:baz</value>
    <value type="QName" xmlns:bar="http://example.com/2">bar:baz</value>
  </choice>
</element>
into the compact syntax. Although this can be translated, for example, into
namespace bar1 = "http://example.com/1"
namespace bar2 = "http://example.com/2"
datatypes xsd = "http://www.w3.org/2001/XMLSchema-datatypes"
element foo { xsd:QName "bar1:baz" | xsd:QName "bar2:baz" }
doing so requires that the translator have knowledge of the QName datatype.
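The datatype-specific step can be made concrete with a small Python sketch. The function `translate_qname_value` and both mappings are hypothetical: `in_scope` holds the prefix bindings in scope at the XML value element, and `declared` holds the prefixes the translator has chosen for its namespace declarations. Leaving unprefixed values unchanged is a simplification of this sketch.

```python
def translate_qname_value(value: str, in_scope: dict, declared: dict) -> str:
    """Rewrite the prefix of a QName-typed value for the compact syntax.

    `in_scope` maps prefix -> namespace URI at the XML <value> element;
    `declared` maps namespace URI -> prefix declared in the compact schema.
    """
    prefix, sep, local = value.partition(":")
    if not sep:
        return value  # unprefixed: left unchanged in this sketch
    uri = in_scope[prefix]
    return f"{declared[uri]}:{local}"
```

The rewrite is only correct because the function treats the string as a QName; for a value of any other datatype the same text would have to be copied through untouched, which is why the translator needs to know the datatype.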
James Clark