This document defines SHACL Rules.
SHACL, the Shapes Constraint Language, is a language for describing the structure of RDF graphs. SHACL may be used for a variety of purposes such as validating, inferencing, modeling domains, generating ontologies to inform other agents, building user interfaces, generating code, and integrating data.
SHACL Rules provides inferencing with the generation of new RDF data from a combination of a set of rules and a base data graph. Rules can be expressed as RDF or in the SHACL Rules Language (SRL).
This specification is published by the Data Shapes Working Group.
This document introduces inference rules for SHACL 1.2, a mechanism for deriving new RDF triples from existing RDF data through declarative rules. The document defines the syntax and semantics of rule-based inference.
Implementations of SHACL Rules provide two operations. The infer operation applies the rules to a given base graph and produces an inference graph containing the RDF triples derived by rule execution. Combining the inference graph with the base graph is optional and left to users. The query operation determines whether a given goal pattern can be derived from the base graph using the rules.
The inference graph may contain RDF triples with IRIs, blank nodes or literals that were not present in the base graph. Users are responsible for ensuring that iterative applications of rules that generate new blank nodes or literals do not result in infinite loops.
SHACL Rules also support constructs, such as negation as failure, that could lead to different inferred graphs depending on the order in which rules are executed. To avoid this, rules are evaluated using the technique of stratification, which establishes a single, implicit ordering among rules, ensuring that the same inference graph is always produced.
Connect to definitions in RDF 1.2 Concepts.
The following definitions from other specifications are used in this document: @@
Some examples in this document use Turtle [[turtle]]. The reader is expected to be familiar with SHACL [[shacl]] and SPARQL [[sparql-query]].
Within this document, the following namespace prefix bindings are used:
| Prefix | Namespace |
|---|---|
| rdf: | http://www.w3.org/1999/02/22-rdf-syntax-ns# |
| rdfs: | http://www.w3.org/2000/01/rdf-schema# |
| sh: | http://www.w3.org/ns/shacl# |
| srl: | http://www.w3.org/ns/shacl-rules# |
| shnex: | http://www.w3.org/ns/shacl-node-expr# |
| xsd: | http://www.w3.org/2001/XMLSchema# |
| ex: | http://example.com/ |
Throughout the document, color-coded boxes containing RDF graphs in Turtle will appear. These fragments of Turtle documents use the prefix bindings given above.
@@Needs adjusting and links
This specification defines conformance criteria for:
A conforming Shapes Rules Language Document is an RDF string that conforms to the grammar, starting with the RuleSet production, and conforming to the additional constraints defined in
This specification does not define how SHACL Rules processors handle non-conforming input documents.
SHACL rules infer new triples given a [=base graph=] and a [=rule set=]. The output of evaluation is an [=inference graph=] containing the derived triples that do not appear in the base graph.
Each [=rule=] has a pattern, called the [=body=], and a result template, called the [=head=]. A rule is executed by finding the values for variables in the body so that the body matches the combined base graph and any inferred triples from the execution up to this point. These values are then used to instantiate the triple templates in the rule head to produce new inferred triples.
The rules are executed until no more triples are inferred, and rules may be executed more than once as new inferred triples become available.
SHACL Rules execution is defined so that the order of rule execution does not lead to different outcomes, either when creating new RDF terms (including new blank nodes) or when testing for the absence of a pattern. In other words, the same inference graph is produced regardless of the order of rule execution.
In this first example, we have the following data graph and rule set:
:A :fatherOf :X .
:B :motherOf :X .
:C :motherOf :A .
RULE { ?x :childOf ?y } WHERE { ?y :fatherOf ?x }
RULE { ?x :childOf ?y } WHERE { ?y :motherOf ?x }
The above rules, applied to the data, will conclude that `:X` is `:childOf` both `:A` and `:B`, and that `:A` is `:childOf` `:C`:
:X :childOf :A .
:X :childOf :B .
:A :childOf :C .
We can then derive `:descendedFrom` relationships by adding a rule that depends on `:childOf` triples produced by the other rules:
RULE { ?x :childOf ?y } WHERE { ?y :fatherOf ?x }
RULE { ?x :childOf ?y } WHERE { ?y :motherOf ?x }
RULE { ?x :descendedFrom ?y } WHERE { ?x :childOf ?y }
The outcome is:
:A :descendedFrom :C .
:X :descendedFrom :B .
:X :descendedFrom :A .
:X :childOf :B .
:X :childOf :A .
:A :childOf :C .
We can add a rule that depends on `:descendedFrom` triples to infer that `:X` is `:descendedFrom` `:C`:
RULE { ?x :childOf ?y } WHERE { ?y :fatherOf ?x }
RULE { ?x :childOf ?y } WHERE { ?y :motherOf ?x }
RULE { ?x :descendedFrom ?y } WHERE { ?x :childOf ?y }
RULE { ?x :descendedFrom ?y } WHERE { ?x :childOf ?z . ?z :descendedFrom ?y }
giving:
:A :descendedFrom :C .
:X :descendedFrom :C .
:X :descendedFrom :B .
:X :descendedFrom :A .
:X :childOf :B .
:X :childOf :A .
:A :childOf :C .
This adds the triple `:X :descendedFrom :C`.
This last rule is a recursive rule: the body of the rule depends on the head of the rule.
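The fixed-point behaviour of these four rules can be imitated in a few lines of Python. This is only an illustrative sketch, not the evaluation algorithm defined by this specification: triples are plain tuples, the rules are hard-coded, and the names `apply_rules` and `infer` are assumptions made for the example.

```python
# Triples are (subject, predicate, object) tuples mirroring the example data.
BASE = {
    (":A", ":fatherOf", ":X"),
    (":B", ":motherOf", ":X"),
    (":C", ":motherOf", ":A"),
}

def apply_rules(graph):
    """One pass over the four example rules; returns the triples they derive."""
    derived = set()
    for s, p, o in graph:
        if p in (":fatherOf", ":motherOf"):
            derived.add((o, ":childOf", s))
        if p == ":childOf":
            derived.add((s, ":descendedFrom", o))
            # Recursive rule: ?x :childOf ?z . ?z :descendedFrom ?y
            for s2, p2, o2 in graph:
                if p2 == ":descendedFrom" and s2 == o:
                    derived.add((s, ":descendedFrom", o2))
    return derived

def infer(base):
    """Repeat passes until nothing new appears; return only inferred triples."""
    graph = set(base)
    while True:
        new = apply_rules(graph) - graph
        if not new:
            return graph - base
        graph |= new

inferred = infer(BASE)
```

The recursive rule needs more than one pass: `:X :descendedFrom :C` can only be derived after `:A :descendedFrom :C` has been added to the graph.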
We can use expressions in the body of rules to restrict the values of variables in the matching of the body. For example, given data about towns and their populations, we can infer a class for towns with a population greater than 1500:
:town1 :population 1000 .
:town2 :population 2000 .
RULE { ?x rdf:type :largeTown } WHERE { ?x :population ?p . FILTER(?p > 1500) }
:town2 rdf:type :largeTown .
`FILTER` evaluates an expression: it keeps the current set of variable bindings if the expression evaluates to true, and discards it if the expression evaluates to false. This is the same as the `FILTER` operation of SPARQL, and SHACL Rules provides many of the same functions and operators as SPARQL.
Negation allows you to specify a pattern that must not match. This is called "negation as failure".
In order to evaluate a negation element, the rules evaluation algorithm ensures that all the rules that could produce triples matching the pattern in the negation element have been completed. This is called [=stratification=] and ensures that the negation is based on all the relevant possible triples, whether from the data or inferred from other rules.
:X1 rdf:type :Place ;
:population 1000 .
:X2 rdf:type :Place ;
:population 2000 .
:X3 rdf:type :Place .
RULE { ?x rdf:type :UnclassifiedSize } WHERE {
?x rdf:type :Place .
NOT { ?x :population ?p . }
}
:X3 rdf:type :UnclassifiedSize .
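As a rough illustration of negation as failure, the rule above can be mimicked with set operations in Python. The function name `unclassified` and the tuple encoding are assumptions for this sketch. It is only sound here because no rule produces `:population` triples, which is the situation that stratification guarantees in general.

```python
DATA = {
    (":X1", "rdf:type", ":Place"), (":X1", ":population", 1000),
    (":X2", "rdf:type", ":Place"), (":X2", ":population", 2000),
    (":X3", "rdf:type", ":Place"),
}

def unclassified(graph):
    """Places for which the pattern `?x :population ?p` has no match."""
    places = {s for s, p, o in graph if p == "rdf:type" and o == ":Place"}
    has_population = {s for s, p, o in graph if p == ":population"}
    return {(x, "rdf:type", ":UnclassifiedSize") for x in places - has_population}

result = unclassified(DATA)
```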
Assignment allows you to assign the result of an expression to a variable in the body of a rule. This can be used to create new RDF terms based on the data.
RULE { ?x :distanceKm ?kilometers }
WHERE {
?x :distanceMiles ?miles .
SET ( ?kilometers := ?miles * 1.60934 )
}
This can be combined with testing for the absence of a triple already recording the distance in kilometers.
RULE { ?x :distanceKm ?kilometers }
WHERE {
?x :distanceMiles ?miles
NOT { ?x :distanceKm ?km }
SET ( ?kilometers := ?miles * 1.60934 )
}
Rules involving assignment are [=run-once rules=]. Rules with assignment are run after all the rules that could produce the data that they depend on and before any rules that depend on the data they produce. Rules that involve a blank node in the [=rule head=] also create new RDF terms and are run-once rules.
This condition ensures that rules that create RDF terms do not generate new terms multiple times, with potentially different outcomes, and that such rules do not loop back to themselves and create an unbounded number of RDF terms.
A SHACL [=rule set=] can incorporate other rule sets by including their URLs in the [=rule imports=] of the rule set. This allows rules to be structured into libraries shared between rule sets.
The `IMPORTS` statements of a rule set are processed before any of the rules in the rule set are evaluated. During the importing step, if an imported rule set has its own imports, those are also processed recursively. Traversing `IMPORTS` statements during the processing of rule sets may lead to cyclic imports. A rule set is imported only once, so cycles in the graph of import statements do not lead to infinite loops.
SHACL Rules and SPARQL have a close relationship. SHACL Rules are designed to be compatible with SPARQL, and many of the constructs in SHACL Rules are inspired by SPARQL. However, there are some differences:
At risk:
Rule tuples are disjoint from triples. They are tuples of RDF terms (no variables) and exist only during evaluation of a rule set. They can be used to record intermediate results during rule evaluation and to pass data between rules.
Syntax of tuple patterns, templates and tuples:
Often, the first argument will be a fixed name.
There is a tuple store which holds tuples for the lifetime of the evaluation. The tuple store holds duplicate data tuples (unlike an RDF graph which is a set).
The SHACL Rules Abstract Syntax is the logical structure of SHACL Rules.
It is used to define the execution algorithm of SHACL Rules.
Each of the two concrete syntax forms of SHACL Rules, the SHACL Rules Language (SRL)
and the RDF syntax (SRL/RDF), provides a way to express the abstract syntax.
An [=expression=] is a function or a functional form;
the arguments are [=RDF terms=].
An expression is evaluated with respect to a [=solution mapping=] to give
an [=RDF term=] as the result.
Expressions are compatible with
SHACL list parameter functions
and with SPARQL expressions.
A [=condition=] is an [=expression=] that evaluates to true or false.
[=Conditions=] are used to restrict the values of variables in pattern matching.
In a [=triple pattern=] or a [=triple template=],
position 1 of the tuple is informally called the subject,
position 2 is informally called the predicate, and
position 3 is informally called the object.
Well-formedness is a set of conditions on the abstract syntax of
SHACL rules. Together, these conditions ensure that a [=variable=] in the
[=head=] of a rule has a value defined in the [=body=] of the rule;
that each variable in a [=condition element=]
or [=assignment expression=]
has a value at the point of evaluation; and that each
assignment in a rule introduces a new variable,
one that has not been used earlier in the rule body.
A [=rule=] is a well-formed rule if all of the following
conditions are met:
A [=rule set=] is "well-formed" if and only if all of the [=rules=] of the
rule set are "well-formed".
A rule `R1` depends on a rule `R2` if the output of the second rule
affects the evaluation of the body of the first rule. That is, the head of `R2`
has a [=triple template=] that might generate a triple that matches
a [=triple pattern=] in the body of `R1`,
either as a [=triple pattern element=] or inside a [=negation element=].
There are two kinds of dependencies:
a closed rule dependency
and an open rule dependency.
A closed dependency requires that rule `R2`
has generated all its possible output before rule `R1` can be executed.
This happens when `R1` has a [=triple pattern=] in a [=negation element=]
that matches a [=triple template=] of `R2`,
or when `R1` has any [=triple pattern=] that matches a [=triple template=] of `R2`
and `R1` has an [=assignment element=] or a [=triple template=] with a blank node.
If a rule dependency is not closed, it is an open dependency
which allows the first rule `R1` to be executed
while the rule `R2` might be run again to generate further triples
which can then cause `R1` to be reevaluated with the new triples from `R2`.
A [=triple pattern=] matches a [=triple template=] if
the triple template may generate a triple that matches the triple pattern.
A [=triple pattern=]
depends on a [=triple template=]
if the [=triple pattern=] could possibly match the [=triple template=].
A [=triple pattern=]
depends on a [=rule=]
if the [=triple pattern=] depends on any [=triple template=]
in the head of the rule.
Rule `R1` depends on rule `R2` if any [=triple pattern=]
of `R1` depends on `R2`.
The dependency is an [=open dependency=] if `R1` does not have an
[=assignment element=], if all the [=triple patterns=]
of `R1` that depend on `R2` occur only as [=triple pattern elements=]
and not as part of a [=negation element=], and if `R1` does not
have any [=triple template=] with a blank node.
Otherwise, that is, if `R1` has an [=assignment element=],
if `R1` has a [=triple template=] with a blank node,
or if a [=triple pattern=]
that might match triples generated by `R2` is present in a
[=negation element=], then the dependency is a [=closed dependency=].
A [=triple template=] with components `ts`, `tp`, `to`
can possibly generate a triple with component RDF terms
`s`, `p`, `o` if
In addition, if any pair of `ts`, `tp`, and `to` are the same variable,
then the corresponding pair of `s`, `p`, and `o` must be the same.
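The repeated-variable condition can be pictured with a small Python check. This is a sketch under stated assumptions, not the normative definition: it assumes that a variable component of the template places no restriction on the generated term, that a constant component must equal the generated term, and it does not model blank nodes. The names `is_var` and `can_generate` are hypothetical.

```python
def is_var(term):
    """Variables are written as strings starting with '?' in this sketch."""
    return isinstance(term, str) and term.startswith("?")

def can_generate(template, triple):
    """True if the template (ts, tp, to) could produce the triple (s, p, o)."""
    binding = {}
    for t, value in zip(template, triple):
        if is_var(t):
            # A repeated variable must map to the same RDF term each time.
            if binding.setdefault(t, value) != value:
                return False
        elif t != value:
            # Assumed: a constant component only generates itself.
            return False
    return True
```

For example, the template `?x :p ?x` can generate `:a :p :a` but not `:a :p :b`.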
The dependencies between rules are represented as a directed graph,
called the [=dependency graph=].
The vertices of the graph are the rules of the rule set, and the edges
are labeled either open or closed according to
whether the dependency is an [=open dependency=] or a [=closed dependency=].
A rule `R` has a [=recursive dependency=] if there is a cyclic path
in the [=dependency graph=] involving `R`.
The dependency graph is not affected by the data graph.
The following algorithm gives one
possible method for constructing the [=dependency graph=] from a
[=rule set=]. Conformance depends on producing a dependency graph
that meets the definitions of a dependency graph,
not on the use of this procedure.
Examples:
[=Stratification=] is the process of partitioning a [=rule set=]
into an ordered sequence of [=stratification layers=] (also known
as "strata", singular "stratum"). Rules in lower [=strata=] are
evaluated before rules in higher [=strata=].
[=Stratification=] imposes constraints on dependencies between [=rules=]
to ensure that [=negation elements=], [=assignment elements=], and
blank nodes created in a [=rule head=] depend only on results computed
using earlier (lower) [=strata=] and the [=base graph=].
This guarantees a single, well-defined, and finite
outcome from the evaluation of a [=rule set=] over a given [=base graph=].
A stratification process may also be used to make other evaluation
decisions. This document describes the necessary conditions for
consistent evaluation and gives one possible way to form a
stratification. Implementations need to meet the conditions
described here in order to get compatible behavior but they are not
required to implement the algorithm as presented.
A [=stratification layer=] `SL` is a pair of disjoint sets of
rules (`SL.once`, `SL.general`).
`SL.once` contains [=run-once rules=], which are
rules that use [=assignment elements=] or produce
blank nodes in the [=rule head=]; these rules are each evaluated exactly
once at the start of evaluation of the [=stratification layer=].
`SL.general` contains the remaining rules, which are evaluated
until no new triples are inferred.
[=Stratification=] is only defined when the following condition is
satisfied. If a [=rule set=] does not meet this condition, then this
specification does not define an outcome for the evaluation of
such a [=rule set=].
In other words, there is no `NOT` or run-once rule (assignment or rule [=triple template=]
involving a blank node) in any transitive dependency cycle of the [=dependency graph=].
The following algorithm gives one possible stratification based solely
on the rule set.
A consequence of the [=stratification condition=] is that once a
[=run-once rule=] is evaluated,
the data used to determine the outcome of the rule will not
change during further evaluation.
Dependency Graph Algorithm
define mergeLabel(oldLabel, newLabel):
## Closed dependency overrides open dependency.
if oldLabel == "open" and newLabel == "open":
return "open"
else:
return "closed"
endif
enddefine
## output -- Dependency graph with rule vertices and labeled edges.
define buildDependencyGraph(ruleSet):
## edgeLabelMap maps (R1, R2) to "open" or "closed"
let edgeLabelMap be a map from pair (rule, rule) to label
foreach rule R1 in ruleSet:
## Classify each triple pattern TP in the rule as requiring "open" or "closed"
## depending on whether it is in a negation element or not.
let bodyDependencies = {}
foreach rule body element RBE in the body of R1:
if RBE is a negation element:
foreach triple pattern TP in RBE:
let item be a pair (TP, "closed")
add item to bodyDependencies
endfor
else if RBE is a triple pattern element of triple pattern TP:
let item be a pair (TP, "open")
add item to bodyDependencies
else if RBE is a condition element:
## Do nothing
else if RBE is an assignment element:
## Do nothing
endif
endfor
foreach pair (triple pattern TP, depLabel) in bodyDependencies:
if R1 has an assignment element:
set depLabel to "closed"
endif
if R1 has a triple template with a blank node:
set depLabel to "closed"
endif
## Find dependencies for this triple pattern element or negation element.
foreach rule R2 in ruleSet:
foreach triple template TT in head of R2:
## "possibly generate" / matching is defined in
## section 3.3
if TT can possibly match triple pattern TP:
let key = (R1, R2)
if edgeLabelMap contains key:
let oldLabel = edgeLabelMap.get(key)
let merged = mergeLabel(oldLabel, depLabel)
edgeLabelMap.set(key, merged)
else:
edgeLabelMap.set(key, depLabel)
endif
endif
endfor
endfor
endfor
endfor
let DP = { }
foreach entry ((R1, R2), label) in edgeLabelMap:
add edge (R1 -> R2) labeled label to DP
endfor
the result is DP
enddefine
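For readers who prefer executable code, the procedure can be transcribed into Python roughly as follows. This is a sketch, not a reference implementation. The `Rule` class, the string encoding of terms (`?x` for variables, `_:b` for blank nodes), and the conservative `may_match` test (two positions clash only when both are different constants) are assumptions made for the example.

```python
from dataclasses import dataclass, field

def is_var(term):
    return isinstance(term, str) and term.startswith("?")

def is_bnode(term):
    return isinstance(term, str) and term.startswith("_:")

def may_match(template, pattern):
    # Assumed conservative test: ignores the repeated-variable refinement.
    return all(is_var(t) or is_var(p) or t == p for t, p in zip(template, pattern))

@dataclass
class Rule:
    name: str
    head: list                                     # triple templates
    patterns: list = field(default_factory=list)   # triple pattern elements
    negated: list = field(default_factory=list)    # triple patterns inside NOT
    has_assignment: bool = False

    def is_run_once(self):
        return self.has_assignment or any(is_bnode(t) for tt in self.head for t in tt)

def build_dependency_graph(rules):
    """Map (R1 name, R2 name) to "open" or "closed"; closed overrides open."""
    edges = {}
    for r1 in rules:
        deps = [(tp, "open") for tp in r1.patterns]
        deps += [(tp, "closed") for tp in r1.negated]
        for tp, label in deps:
            if r1.is_run_once():
                label = "closed"
            for r2 in rules:
                if any(may_match(tt, tp) for tt in r2.head):
                    key = (r1.name, r2.name)
                    both_open = edges.get(key, "open") == "open" and label == "open"
                    edges[key] = "open" if both_open else "closed"
    return edges

RULES = [
    Rule("child", head=[("?x", ":childOf", "?y")], patterns=[("?y", ":fatherOf", "?x")]),
    Rule("desc", head=[("?x", ":descendedFrom", "?y")],
         patterns=[("?x", ":childOf", "?z"), ("?z", ":descendedFrom", "?y")]),
    Rule("root", head=[("?x", "rdf:type", ":Root")],
         patterns=[("?x", "rdf:type", ":Person")], negated=[("?x", ":childOf", "?y")]),
]
graph = build_dependency_graph(RULES)
```

In this hypothetical rule set, `desc` has open dependencies on `child` and on itself, while `root` has a closed dependency on `child` because the matching pattern sits inside a negation.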
@@ Examples of triple pattern dependencies.
@@ Examples of rule dependencies.
Stratification Algorithm
## output -- Map: Integer -> Set of rules.
define stratification(ruleSet):
let DP = Dependency graph for the rule set.
let stratumMap be a map from rule to integer
## The dependency graph should satisfy the stratification condition.
## The check for unbounded stratification is a guard
## due to a violation of the stratification condition.
let limit = num rules + 1
let maxStratum = 0
## initialize stratumMap
foreach rule in ruleSet:
stratumMap.set(rule, 0)
endfor
boolean changed = true;
while changed:
changed = false;
foreach edge E in DP:
## Edge from pRule to qRule with a label
let pRule = source of edge
let qRule = destination of the edge
let label = edge label
if label == "open" :
if stratumMap.get(pRule) < stratumMap.get(qRule) :
stratumMap.set(pRule, stratumMap.get(qRule))
changed = true;
endif
endif
if label == "closed" :
if stratumMap.get(pRule) <= stratumMap.get(qRule) :
let xStratum = 1 + stratumMap.get(qRule)
if ( xStratum > limit )
## Stratification requirement violated
error "Stratification error"
endif
stratumMap.set(pRule, xStratum)
maxStratum = max(maxStratum, xStratum)
changed = true;
endif
endif
endfor
endwhile
## Initialize the result map.
let stratumRules be a map from integer to rules.
for i = 0 to maxStratum
stratumRules.set(i, {})
endfor
## Gather rules in stratumMap with the same level number
for rule R in map stratumMap:
let stratumNum = stratumMap.get(R)
add R to stratumRules.get(stratumNum)
endfor
## Partition each level into once and general
let stratumLevels be a sequence of pairs of sets of rules.
for i = 0 to maxStratum:
let rules = stratumRules.get(i)
let once = { R in rules | R is a run-once rule }
let general = rules \ once
stratumLevels.set(i, pair(once, general))
endfor
the result is stratumLevels
enddefine
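The layer-numbering part of this procedure might look as follows in Python. The sketch assumes the dependency graph is given as a dictionary from `(R1, R2)` name pairs to `"open"` or `"closed"`, and it leaves out the final split of each layer into `once` and `general`; the function name `stratify` is an assumption for the example.

```python
def stratify(rule_names, edges):
    """Assign each rule a stratum number; return the list of layers (sets of names)."""
    stratum = {r: 0 for r in rule_names}
    limit = len(rule_names) + 1   # guard against a violated stratification condition
    changed = True
    while changed:
        changed = False
        for (p, q), label in edges.items():
            if label == "open" and stratum[p] < stratum[q]:
                stratum[p] = stratum[q]
                changed = True
            elif label == "closed" and stratum[p] <= stratum[q]:
                if stratum[q] + 1 > limit:
                    raise ValueError("stratification condition violated")
                stratum[p] = stratum[q] + 1
                changed = True
    layers = [set() for _ in range(max(stratum.values(), default=0) + 1)]
    for rule, number in stratum.items():
        layers[number].add(rule)
    return layers

layers = stratify(
    ["child", "desc", "root"],
    {("desc", "child"): "open", ("desc", "desc"): "open", ("root", "child"): "closed"},
)
```

A rule with a closed dependency on itself never stabilises, so the guard raises an error instead of looping forever.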
There are two concrete syntaxes.
SHACL Rules Language (SRL):
PREFIX : <http://example/>
DATA { :x :p 1 ; :q 2 . }
RULE { ?x :bothPositive true . }
WHERE { ?x :p ?v1 FILTER ( ?v1 > 0 ) ?x :q ?v2 FILTER ( ?v2 > 0 ) }
RULE { ?x :oneIsZero true . }
WHERE { ?x :p ?v1 ; :q ?v2 FILTER ( ( ?v1 = 0 ) || ( ?v2 = 0 ) ) }
RDF syntax (SRL/RDF):
PREFIX : <http://example/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX sh: <http://www.w3.org/ns/shacl#>
PREFIX srl: <http://www.w3.org/ns/shacl-rules#>
PREFIX sparql: <http://www.w3.org/ns/sparql#>
:ruleSet-1
rdf:type srl:RuleSet;
srl:data (
[ srl:subject :x ; srl:predicate :p ; srl:object 1 ]
[ srl:subject :x ; srl:predicate :q ; srl:object 2 ]
);
srl:rules (
[
rdf:type srl:Rule;
srl:head (
[ srl:subject [ srl:varName "x" ] ; srl:predicate :bothPositive ; srl:object true ]
) ;
srl:body (
[ srl:subject [ srl:varName "x" ]; srl:predicate :p ; srl:object [ srl:varName "v1" ] ]
[ srl:expr [ sparql:greaterThan ( [ srl:varName "v1" ] 0 ) ] ]
[ srl:subject [ srl:varName "x" ] ; srl:predicate :q ; srl:object [ srl:varName "v2" ] ]
[ srl:expr [ sparql:greaterThan ( [ srl:varName "v2" ] 0 ) ] ]
);
]
[
rdf:type srl:Rule;
srl:head (
[ srl:subject [ srl:varName "x" ] ; srl:predicate :oneIsZero ; srl:object true ]
) ;
srl:body (
[ srl:subject [ srl:varName "x" ] ; srl:predicate :p ; srl:object [ srl:varName "v1" ] ]
[ srl:subject [ srl:varName "x" ] ; srl:predicate :q ; srl:object [ srl:varName "v2" ] ]
[ srl:filter [ sparql:function-or (
[ sparql:equals ( [ srl:varName "v1" ] 0 ) ]
[ sparql:equals ( [ srl:varName "v2" ] 0 ) ]
) ]
]
);
]
) .
The grammar is given below.
Mapping the AST to the abstract syntax.
Additional helpers (short-hand abbreviations):
These allow for well-known rule patterns and also specialised implementations in basic engines.
TRANSITIVE(uri)
SYMMETRIC(uri)
INVERSE(uri)
At risk:
`TRANSITIVE` has both implementation and concise expression advantages. Implementation advantages for `SYMMETRIC` and `INVERSE` are not clear.
Vocabulary: rdf-syntax-vocab.ttl
SHACL shapes: rdf-syntax-shapes.ttl
Well-formedness:
Describe how the abstract model maps to triples.
Process : accumulators, bottom up/ Walk the structure.
All triples not in the syntax are ignored. No other "srl:" predicates are allowed (??).
@@ Illustration: SHACL rule set in text and RDF syntaxes: all features:
PREFIX : <http://example/>
DATA { :s :p :o }
RULE { ?x :q :o } WHERE { ?x :p :o }
RULE { ?x :q :o } WHERE { ?x :p :o1 ; :p :o2 }
RULE { ?x :q :o } WHERE { ?x :p ?o . FILTER (?o < 18) }
RULE { ?x :q ?o } WHERE { ?x :p :o . BIND(18 AS ?o) }
RULE { ?x :q ?o } WHERE { ?x :p :o . NOT { ?s :p ?o . FILTER(?o < 18) } }
PREFIX : <http://example/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX sh: <http://www.w3.org/ns/shacl#>
PREFIX sparql: <http://www.w3.org/ns/sparql#>
PREFIX srl: <http://www.w3.org/ns/shacl-rules#>
:ruleSet-1
rdf:type srl:RuleSet;
srl:data (
[ srl:subject :s ; srl:predicate :p; srl:object :o ; ]
);
srl:rules (
[
rdf:type srl:Rule;
srl:body (
[ srl:subject [ srl:varName "x" ] ; srl:predicate :p ; srl:object :o ; ]
) ;
srl:head (
[ srl:subject [ srl:varName "x" ] ; srl:predicate :q ; srl:object :o ; ]
)
]
[
rdf:type srl:Rule ;
srl:body (
[ srl:subject [ srl:varName "x" ] ; srl:predicate :p ; srl:object :o1 ; ]
[ srl:subject [ srl:varName "x" ] ; srl:predicate :p ; srl:object :o2 ; ]
) ;
srl:head (
[ srl:subject [ srl:varName "x" ] ; srl:predicate :q ; srl:object :o ; ]
)
]
[
rdf:type srl:Rule ;
srl:body (
[ srl:subject [ srl:varName "x" ] ; srl:predicate :p ; srl:object [ srl:varName "o" ] ; ]
[
srl:filter [
sparql:less-than (
[ srl:varName "o" ]
18
)
]
]
) ;
srl:head (
[ srl:subject [ srl:varName "x" ] ; srl:predicate :q ; srl:object :o ; ]
)
]
[
rdf:type srl:Rule ;
srl:body (
[ srl:subject [ srl:varName "x" ] ; srl:predicate :p ; srl:object :o ; ]
[
srl:assign [
srl:assignValue 18 ;
srl:assignVar [ srl:varName "o" ]
]
]
) ;
srl:head (
[ srl:subject [ srl:varName "x" ] ; srl:predicate :q ; srl:object [ srl:varName "o" ] ; ]
)
]
[
rdf:type srl:Rule ;
srl:body (
[ srl:subject [ srl:varName "x" ] ; srl:predicate :p ; srl:object :o ; ]
[
srl:not (
[ srl:subject [ srl:varName "s" ] ; srl:predicate :p ; srl:object [ srl:varName "o" ] ; ]
[
srl:filter [
sparql:less-than (
[ srl:varName "o" ]
18
)
]
]
)
]
) ;
srl:head (
[ srl:subject [ srl:varName "x" ] ; srl:predicate :q ; srl:object [ srl:varName "o" ] ; ]
)
]
) .
This section defines the outcome of evaluating a rule set on given data. It does not prescribe the algorithm as the method of implementation. An implementation can use any algorithm that generates the same outcome.
Inputs: data graph G, called the base graph, and a rule set RS. Output: an RDF graph GI of inferred triples.
The inferred triples do not include triples present in the set of triples of the [=base graph=].
A [=solution mapping=], μ, is a partial function μ : V → T,
where V is the set of all variables
and T is the set of all [=RDF terms=].
The domain of μ is denoted
by dom(μ), and it is the subset
of V for which μ is defined. We use the term
[=solution=] where it is clear that a [=solution mapping=] is meant.
Write μ0 for the solution mapping, such that
dom(μ0) is the empty set.
The function subst(μ, [=triple pattern=])
returns a [=triple pattern=]
where each occurrence in the [=triple pattern=] of a variable that is in
dom(μ)
is replaced by the [=RDF term=] that the
[=solution mapping=] gives for that variable.
If the triple pattern result has no variables, then it is an [=RDF Triple=].
Let G be an [=RDF graph=] and TP be a triple pattern. The function `graphMatch(G, TP)` returns the set of all possible solutions that, when applied to the triple pattern, produce a triple that is in G.
graphMatch(G, TP) = { μ | subst(μ, TP) is a triple in G }
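A direct way to compute such solutions is to scan the graph and try to bind the pattern against each triple. The Python sketch below assumes triples and patterns are 3-tuples, variables are strings starting with `?`, and solutions are dictionaries. It returns only the solutions whose domain is exactly the set of variables in the pattern. The name `graph_match` is an assumption for the example.

```python
def is_var(term):
    return isinstance(term, str) and term.startswith("?")

def graph_match(graph, pattern):
    """Return one solution (a dict) per triple in the graph that the pattern matches."""
    solutions = []
    for triple in graph:
        mu = {}
        for p, value in zip(pattern, triple):
            if is_var(p):
                # A variable used twice must bind to the same term.
                if mu.setdefault(p, value) != value:
                    break
            elif p != value:
                break
        else:
            solutions.append(mu)
    return solutions

G = {(":town1", ":population", 1000), (":town2", ":population", 2000)}
towns = graph_match(G, ("?x", ":population", "?p"))
```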
Let μ1 and μ2 be solutions, and let S1 and S2 be sets of solutions.
compatible(μ1, μ2) = true
    if forall v in dom(μ1) intersection dom(μ2)
        μ1(v) = μ2(v)
compatible(μ1, μ2) = false otherwise
merge(μ1, μ2) = μ such that
    μ(v) = μ1(v) if v in dom(μ1)
    μ(v) = μ2(v) otherwise
merge(S1, S2) = { merge(μ1, μ2) |
    μ1 in S1, μ2 in S2
    and compatible(μ1, μ2) }
The domain of merge(μ1, μ2) is `dom(μ1) ∪ dom(μ2)`.
Two solutions that have no variables in common are compatible.
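With solutions represented as Python dictionaries, these operations are short to write down. The sketch below is illustrative only; the name `merge_all` for the set-level merge is an assumption chosen to avoid overloading `merge`.

```python
def compatible(mu1, mu2):
    """True if the two solutions agree on every variable they share."""
    return all(mu1[v] == mu2[v] for v in mu1.keys() & mu2.keys())

def merge(mu1, mu2):
    """Union of two solutions; values from mu1 win (they are equal when compatible)."""
    return {**mu2, **mu1}

def merge_all(s1, s2):
    """Merge every compatible pair drawn from two collections of solutions."""
    return [merge(a, b) for a in s1 for b in s2 if compatible(a, b)]
```

For instance, `{"?x": 1}` is compatible with `{"?y": 2}` because they share no variables, but not with `{"?x": 3}`.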
The first step in evaluating a [=rule set=] is to prepare a single, valid rule set. This involves gathering all imported rule sets, building a single, combined rule set, and then calculating the stratification for the combined rule set.
@@Walk imports - visit only once@@
@@ TODO: consider defining a rule set as having two components and a separate set of imports. i.e. imports are in the parsing and sorted out to give a usable "rule set" of "rules + data"
A [=rule set=] has three components: `R.rules`, `R.data`, and `R.imports`. The rule set merge of two rule sets, `RS1` and `RS2`, is a rule set, `MR`, defined as follows:
MR.rules = RS1.rules ∪︀ RS2.rules
MR.data = merge(RS1.data, RS2.data)
MR.imports = {}
where `merge` is the RDF merge operation.
define imports(rule set RS, set of URLs V), returning rule set
let I = the set of import URLs declared for the rule set RS
let RS2 = RS
foreach URL x in I:
if x ∉ V:
V = V ∪︀ { x }
read rule set RS3 from URL x
RS2 = rulesetMerge(RS2, imports(RS3, V))
endif
endfor
result is RS2
enddefine
let RS be a rule set
let V = {}
if RS has a location, V = { location of RS }
result is imports(RS, V)
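The visit-once traversal can be sketched in Python as below. The example only gathers rules (the merge of `DATA` is left out), reads rule sets from an in-memory dictionary instead of the network, and uses a shared `visited` set so that a cycle between two rule sets terminates. The names `load_with_imports` and `REGISTRY`, and the dictionary layout, are assumptions for the example.

```python
def load_with_imports(url, fetch, visited=None):
    """Collect the rules of the rule set at `url` and of everything it imports."""
    visited = set() if visited is None else visited
    visited.add(url)
    rule_set = fetch(url)
    rules = list(rule_set["rules"])
    for imported in rule_set["imports"]:
        if imported not in visited:      # each rule set is read at most once
            rules += load_with_imports(imported, fetch, visited)
    return rules

# Two hypothetical rule sets that import each other.
REGISTRY = {
    "http://example.com/a": {"rules": ["a1"], "imports": ["http://example.com/b"]},
    "http://example.com/b": {"rules": ["b1"], "imports": ["http://example.com/a"]},
}
all_rules = load_with_imports("http://example.com/a", REGISTRY.__getitem__)
```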
Sketch
@@ Reference SPARQL expression evaluation: Expression Evaluation
@@ Reference SPARQL EBV: Effective Boolean Value (EBV)
define evalFunction(F, μ):
    let [x/μ] be
        if x is an RDF term, then [x/row] is x
        if x is a variable, then [x/row] is μ(x)
        ## By well-formedness, it is an error if x is not in the row.
    return
        evalFunction(F(expr1, expr2 ...), row) = F(evalFunction(expr1, row), evalFunction(expr2, row), ...)
        if an error is returned by evalFunction:
            return error
    return
        evalFunction(FF(expr1, expr2), row) = ... things that are not functions like `IF`
let R be a well-formed rule.
let rule R = (H, B) where
H is the sequence of triple templates in the head
B is the sequence of triple pattern elements,
condition elements, negation elements,
and assignment elements in the body
# Solution sequence of one solution that does not map any variables.
let SEQ0: Solution sequence = { μ0 }
let G = evaluation graph
# Evaluate rule body
# This function returns a sequence of solutions
define evalRuleElements(B, SEQ, G):
for each rule element rElt in B:
if rElt is a triple pattern TP:
X = graphMatch(G, TP)
SEQ1 = {}
for each μ1 in X:
for each μ2 in SEQ:
if compatible(μ1, μ2)
μ3 = merge(μ1, μ2)
add μ3 to SEQ1
endif
endfor
endfor
endif
if rElt is a condition element with expression F:
SEQ1 = {}
for each solution μ in SEQ:
let x = evalFunction(F, μ)
if x is true:
add μ to SEQ1
endif
endfor
endif
if rElt is a negation element with body elements N:
SEQ1 = {}
for each solution μ in SEQ:
S = sequence{ μ }
NEG = evalRuleElements(N, S, G)
if NEG is empty
add μ to SEQ1
endif
endfor
endif
if rElt is an assignment element with variable V and expression expr:
SEQ1 = {}
for each solution μ in SEQ:
let x = evalFunction(expr, μ)
if x is not an error:
let μ1 = μ extended with the mapping V → x
add μ1 to SEQ1
else:
# Error: drop solution μ
endif
endfor
endif
if SEQ1 is empty
SEQ = {}
return SEQ
endif
SEQ = SEQ1
endfor
return SEQ
enddefine
let SEQ = evalRuleElements(B, SEQ0, G)
# Evaluate rule head
# H is the sequence of triple templates; T collects the generated triples.
let T = empty set
for each μ in SEQ:
let S = {}
for each triple template TT in H:
let triple = subst(μ, TT)
add triple to S
endfor
T = T union S
endfor
result eval(R, G) is T
Note that `T` may contain triples that are also in the data graph.
let G0 be the input base graph
let RS be the rule set
let D be the graph of all DATA triples in RS
Apply stratification to RS
let LS be the sequence of layers after stratification
# Inference graph
let GI = { t ∈ D | t ∉ G0 }
# Evaluation graph.
let GE = G0 ∪︀ D
for each stratum ST in LS:
for each rule R in ST.once:
let X = eval(R, GE)
let Y = { t ∈ X | t ∉ GE }
GI = Y ∪︀ GI
GE = Y ∪︀ GE
endfor
let finished = false
while !finished:
finished = true
for each rule R in ST.general:
let X = eval(R, GE)
let Y = { t ∈ X | t ∉ GE }
if Y is not empty:
finished = false
GI = Y ∪︀ GI
GE = Y ∪︀ GE
endif
endfor
endwhile
endfor
the result is GI
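The control flow of this loop can be sketched in Python. To keep the example short, each rule is modelled as a plain function from a graph to a set of triples, and the layers are passed in already stratified. Both simplifications, and the names `evaluate`, `to_km`, and `is_long`, are assumptions for the illustration and not part of this specification.

```python
def evaluate(base, data, layers):
    """layers: list of (once_rules, general_rules); each rule maps a graph to triples."""
    inferred = set(data) - set(base)
    graph = set(base) | set(data)
    for once, general in layers:
        for rule in once:                 # run-once rules: evaluated exactly once
            new = rule(graph) - graph
            inferred |= new
            graph |= new
        finished = False
        while not finished:               # general rules: repeat to a fixed point
            finished = True
            for rule in general:
                new = rule(graph) - graph
                if new:
                    finished = False
                    inferred |= new
                    graph |= new
    return inferred

def to_km(graph):      # stands in for a rule with an assignment element
    return {(s, ":distanceKm", o * 1.60934) for s, p, o in graph if p == ":distanceMiles"}

def is_long(graph):    # stands in for a rule with a FILTER
    return {(s, "rdf:type", ":Long") for s, p, o in graph if p == ":distanceKm" and o > 15}

inferred = evaluate({(":r", ":distanceMiles", 10)}, set(), [([to_km], [is_long])])
```

Only triples that are absent from the evaluation graph are added, so the loop over the general rules stops as soon as a full pass produces nothing new.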
A Shapes Rules Language document is an RDF string
encoded in UTF-8 [[!RFC3629]].
Only Unicode scalar values,
in the ranges U+0000 to U+D7FF
and U+E000 to U+10FFFF,
are allowed. This excludes
surrogate code points,
range U+D800 to U+DFFF.
White space
(production WS) is used
to separate two terminals which would otherwise be (mis-)recognized as one
terminal. Rule names below in capitals indicate where white space is
significant; these form a possible choice of terminals for constructing a
Shapes Rules Language parser.
White space is significant in the production
String.
Comments start with a # outside an
IRIREF,
STRING_LITERAL1,
STRING_LITERAL2,
STRING_LITERAL_LONG1, or
STRING_LITERAL_LONG2,
and continue to the end of line (marked by
LF, or
CR),
or end of file if there is no end of line after the comment marker.
Comments are treated as white space.
Relative IRI references are resolved with base IRIs as per [[[RFC3986]]] [[RFC3986]] using only the basic algorithm in section 5.2. Neither Syntax-Based Normalization nor Scheme-Based Normalization (described in sections 6.2.2 and 6.2.3 of RFC3986) are performed. Characters additionally allowed in IRI references are treated in the same way that unreserved characters are treated in URI references, per section 6.5 of [[[RFC3987]]] [[RFC3987]].
The BASE
directive defines the Base IRI used to
resolve relative IRI
references per [[RFC3986]]
section 5.1.1, "Base URI Embedded in Content".
Section 5.1.2, "Base URI from the Encapsulating Entity"
defines how the In-Scope Base IRI may come from an encapsulating document,
such as a SOAP envelope with an `xml:base` directive or a MIME multipart document with a
`Content-Location` header.
The "Retrieval URI" identified in 5.1.3, "Base URI from the Retrieval URI",
is the URL from which a particular Shapes Rules Language document was retrieved.
If none of the above specifies the Base URI, the default
Base URI (section 5.1.4, "Default Base URI") is used.
Each BASE directive sets a new In-Scope Base URI,
relative to the previous one.
There are three forms of escapes used in Shapes Rules documents:
A numeric escape sequence represents the value of a Unicode code point.
A numeric escape sequence MUST NOT produce a code point value
in the range U+D800 to U+DFFF,
which is the range for
Unicode surrogates.
| Escape sequence | Unicode code point |
|---|---|
| `\u` hex hex hex hex | A Unicode code point in the ranges U+0000 to U+D7FF and U+E000 to U+FFFF, corresponding to the value encoded by the four hexadecimal digits interpreted from most significant to least significant digit. |
| `\U` hex hex hex hex hex hex hex hex | A Unicode code point in the ranges U+0000 to U+D7FF and U+E000 to U+10FFFF, corresponding to the value encoded by the eight hexadecimal digits interpreted from most significant to least significant digit. |
where hex is a hexadecimal character
HEX ::= [0-9] | [A-F] | [a-f]
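A decoder for numeric escapes can be sketched with a regular expression. This Python fragment is illustrative only. It rejects surrogates and values above U+10FFFF, but it does not track an escaped backslash immediately before a `u`, which a real lexer would handle during tokenization. The name `decode_numeric_escapes` is an assumption for the example.

```python
import re

# \u followed by 4 hex digits, or \U followed by 8 hex digits.
_NUMERIC_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})")

def decode_numeric_escapes(text):
    """Replace each numeric escape sequence with the code point it encodes."""
    def replace(match):
        code_point = int(match.group(1) or match.group(2), 16)
        if 0xD800 <= code_point <= 0xDFFF or code_point > 0x10FFFF:
            raise ValueError(f"{match.group(0)} is not a Unicode scalar value")
        return chr(code_point)
    return _NUMERIC_ESCAPE.sub(replace, text)
```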
A string escape sequence represents a character traditionally escaped in string literals:
| Escape sequence | Unicode code point |
|---|---|
| `\t` | U+0009 |
| `\b` | U+0008 |
| `\n` | U+000A |
| `\r` | U+000D |
| `\f` | U+000C |
| `\"` | U+0022 |
| `\'` | U+0027 |
| `\\` | U+005C |
A reserved character escape sequence consists of a
\ followed by
one of these characters ~.-!$&'()*+,;=/?#@%_, and
represents the character to the right of
the \.
| | numeric escapes | string escapes | reserved character escapes |
|---|---|---|---|
| IRIs, used as RDF terms or in PREFIX or BASE declarations | yes | no | no |
| local names | no | no | yes |
| Strings | yes | yes | no |
%-encoded sequences are in the
character range for IRIs
and are explicitly allowed in local names.
These appear as a %
followed by two hex characters and represent that
same sequence of three characters. These sequences are not
decoded during processing.
A term written as <http://a.example/%66oo-bar>
designates the IRI http://a.example/%66oo-bar
and not IRI http://a.example/foo-bar.
A term written as ex:%66oo-bar with a prefix
PREFIX ex: <http://a.example/>
also designates the IRI http://a.example/%66oo-bar.
The EBNF used here is defined in XML 1.0 [[!EBNF-NOTATION]].
Notes:
a'
which is case-sensitive.
UCHAR
and ECHAR
are case sensitive.
RuleSet.
A text version of this grammar is available here.
This document uses some specific terminal literal strings [[EBNF-NOTATION]]. To clarify the Unicode code points used for these terminal literal strings, the following table describes specific characters used in this section.
| Code | Glyph | Description |
|---|---|---|
| U+000A | LF | Line feed |
| U+000D | CR | Carriage return |
| U+0023 | `#` | Number sign |
| U+0025 | `%` | Percent sign |
| U+005C | `\` | Backslash |
@@see the Turtle registration for format
The Internet Media Type (formerly known as MIME Type) for @@ is "text/shape-rules".
The information that follows has been submitted to the Internet Engineering Steering Group (IESG) for review, approval, and registration with IANA.
TODO