This document defines SHACL Rules.
SHACL, the Shapes Constraint Language, is a language for describing the structure of RDF graphs. SHACL may be used for a variety of purposes such as validating, inferencing, modeling domains, generating ontologies to inform other agents, building user interfaces, generating code, and integrating data.
SHACL Rules provides inferencing with the generation of new RDF data from a combination of a set of rules and a base data graph. Rules can be expressed as RDF or in the SHACL Rules Language (SRL).
This specification is published by the Data Shapes Working Group.
This document introduces inference rules for SHACL 1.2, a mechanism for deriving new RDF triples from existing RDF data through declarative rules. The document defines the syntax and semantics of rule-based inference.
Implementations of SHACL Rules provide two operations. The infer operation applies the rules to a given base graph and produces an inference graph containing the RDF triples derived by rule execution. Combining the inference graph with the base graph is optional and left to users. The query operation determines whether a given goal pattern can be derived from the base graph using the rules.
SHACL Rules allow the use of new RDF terms, including blank nodes, that can be used in triple templates in the head of rules.
SHACL Rules also support constructs, such as negation as failure, that could lead to different inferred graphs depending on the order in which rules are executed. To avoid this, rules are evaluated using the technique of stratification, which establishes a single, implicit ordering among rules, ensuring that the same inference graph is always produced.
Connect to definitions in RDF 1.2 Concepts.
The following definitions from other specifications are used in this document: @@
Some examples in this document use Turtle [[turtle]]. The reader is expected to be familiar with SHACL [[shacl]] and SPARQL [[sparql-query]].
Within this document, the following namespace prefix bindings are used:
| Prefix | Namespace |
|---|---|
| rdf: | http://www.w3.org/1999/02/22-rdf-syntax-ns# |
| rdfs: | http://www.w3.org/2000/01/rdf-schema# |
| sh: | http://www.w3.org/ns/shacl# |
| srl: | http://www.w3.org/ns/shacl-rules# |
| shnex: | http://www.w3.org/ns/shacl-node-expr# |
| xsd: | http://www.w3.org/2001/XMLSchema# |
| ex: | http://example.com/ |
Throughout the document, color-coded boxes containing RDF graphs in Turtle will appear. These fragments of Turtle documents use the prefix bindings given above.
@@Needs adjusting and links
This specification defines conformance criteria for:
A conforming Shapes Rules Language Document is an RDF string that conforms to the grammar, starting with the RuleSet production, and to the additional constraints defined in
This specification does not define how SHACL Rules processors handle non-conforming input documents.
A version label is a string that identifies the syntax and semantics conformance for the SHACL Rules Language.
| Version Label |
|---|
| "1.2" |
For serializations supporting in-line version announcement, the version announcement SHOULD be made early in the document.
SHACL rules infer new triples given a [=base graph=] and a [=rule set=]. The output of evaluation is an [=inference graph=] containing the derived triples that do not appear in the base graph.
Each [=rule=] has a pattern, called the [=body=], and a result template, called the [=head=]. A rule is executed by finding the values for variables in the body so that the body matches the combined base graph and any inferred triples from the execution up to this point. These values are then used to instantiate the triple templates in the rule head to produce new inferred triples.
The rules are executed until no more triples are inferred, and rules may be executed more than once as new inferred triples become available.
SHACL Rules execution is defined so that the order of rule execution does not lead to different outcomes when creating new RDF terms, including new blank nodes, nor when testing for the absence of a pattern. In other words, the same inference graph is produced regardless of the order of rule execution.
SHACL Rules has both an RDF syntax and a human-friendly syntax inspired by [[[SPARQL12-QUERY]]]. Rule set evaluation contains elements similar to SPARQL, with differences in the details to ensure that the same inference graph is produced regardless of the order of rule execution.
In this first example, we have the following data graph and rule set:
:A :fatherOf :X . :B :motherOf :X . :C :motherOf :A .
RULE { ?x :childOf ?y } WHERE { ?y :fatherOf ?x }
RULE { ?x :childOf ?y } WHERE { ?y :motherOf ?x }
The above rules, applied to the data, will conclude that `:X` is the `:childOf` both `:A` and `:B`, and that `:A` is the `:childOf` `:C`:
:X :childOf :A . :X :childOf :B . :A :childOf :C .
We can then derive `:descendedFrom` relationships by adding a rule that depends on `:childOf` triples produced by the other rules:
RULE { ?x :childOf ?y } WHERE { ?y :fatherOf ?x }
RULE { ?x :childOf ?y } WHERE { ?y :motherOf ?x }
RULE { ?x :descendedFrom ?y } WHERE { ?x :childOf ?y }
The outcome is:
:A :descendedFrom :C . :X :descendedFrom :B . :X :descendedFrom :A . :X :childOf :B . :X :childOf :A . :A :childOf :C .
We can add a rule that depends on `:descendedFrom` triples to infer that `:X` is `:descendedFrom` `:C`:
RULE { ?x :childOf ?y } WHERE { ?y :fatherOf ?x }
RULE { ?x :childOf ?y } WHERE { ?y :motherOf ?x }
RULE { ?x :descendedFrom ?y } WHERE { ?x :childOf ?y }
RULE { ?x :descendedFrom ?y } WHERE { ?x :childOf ?z . ?z :descendedFrom ?y }
giving:
:A :descendedFrom :C . :X :descendedFrom :C . :X :descendedFrom :B . :X :descendedFrom :A . :X :childOf :B . :X :childOf :A . :A :childOf :C .
This adds the triple `:X :descendedFrom :C`.
This last rule is a recursive rule: the body of the rule depends on the head of the rule.
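This example is non-normative. The Python sketch below replays the family example with a naive fixpoint loop, to show how the recursive rule is simply re-run until nothing new is inferred. Triples are modelled as Python tuples and each rule as a hand-written function from a graph to the triples its head would produce; the function names and `infer` are illustrative assumptions, not part of SHACL Rules, and the loop ignores stratification.

```python
# Naive fixpoint evaluation of the four family rules (illustrative only).
BASE = {
    (":A", ":fatherOf", ":X"),
    (":B", ":motherOf", ":X"),
    (":C", ":motherOf", ":A"),
}

def child_of_father(graph):
    # RULE { ?x :childOf ?y } WHERE { ?y :fatherOf ?x }
    return {(x, ":childOf", y) for (y, p, x) in graph if p == ":fatherOf"}

def child_of_mother(graph):
    # RULE { ?x :childOf ?y } WHERE { ?y :motherOf ?x }
    return {(x, ":childOf", y) for (y, p, x) in graph if p == ":motherOf"}

def descended_direct(graph):
    # RULE { ?x :descendedFrom ?y } WHERE { ?x :childOf ?y }
    return {(x, ":descendedFrom", y) for (x, p, y) in graph if p == ":childOf"}

def descended_recursive(graph):
    # RULE { ?x :descendedFrom ?y } WHERE { ?x :childOf ?z . ?z :descendedFrom ?y }
    return {
        (x, ":descendedFrom", y)
        for (x, p1, z) in graph if p1 == ":childOf"
        for (z2, p2, y) in graph if p2 == ":descendedFrom" and z2 == z
    }

def infer(base, rules):
    """Apply every rule repeatedly until no rule adds a new triple."""
    graph = set(base)
    changed = True
    while changed:
        changed = False
        for rule in rules:
            new_triples = rule(graph) - graph
            if new_triples:
                graph |= new_triples
                changed = True
    # The inference graph excludes triples already in the base graph.
    return graph - set(base)

INFERRED = infer(
    BASE,
    [child_of_father, child_of_mother, descended_direct, descended_recursive],
)
```

`INFERRED` holds the three `:childOf` triples and the four `:descendedFrom` triples listed above, including `:X :descendedFrom :C`, which only appears once the recursive rule sees `:A :descendedFrom :C`.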
We can use expressions in the body of rules to restrict the values of variables in the matching of the body. For example, given data about towns and their populations, we can infer a class for towns with a population greater than 1500:
:town1 :population 1000 . :town2 :population 2000 .
RULE { ?x rdf:type :largeTown } WHERE { ?x :population ?p . FILTER(?p > 1500) }
:town2 rdf:type :largeTown .
`FILTER` evaluates an expression and keeps the current set of variable bindings if the expression evaluates to true, and it discards the current set of variable bindings if the expression evaluates to false. This is the same as the `FILTER` operation of SPARQL and SHACL Rules provides many of the same functions and operators as SPARQL.
Negation allows you to specify a pattern that must not match. This is called "negation as failure".
In order to evaluate a negation element, the rules evaluation algorithm ensures that all the rules that could produce triples matching the pattern in the negation element have been completed. This is called [=stratification=] and ensures that the negation is based on all the relevant possible triples, whether from the data or inferred by other rules.
:X1 rdf:type :Place ;
:population 1000 .
:X2 rdf:type :Place ;
:population 2000 .
:X3 rdf:type :Place .
RULE { ?x rdf:type :UnclassifiedSize } WHERE {
?x rdf:type :Place .
NOT { ?x :population ?p . }
}
:X3 rdf:type :UnclassifiedSize .
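This example is non-normative. The Python sketch below illustrates why a negation element is evaluated only after the rules that could feed it have completed. It adds a hypothetical producer rule and a hypothetical place `:X4` with a `:censusCount` property, neither of which is in the example above, and compares two execution orders; `run_in_order` is an illustrative helper, not an operation defined by this specification.

```python
TYPE, PLACE, POP = "rdf:type", ":Place", ":population"

def population_from_census(graph):
    # Hypothetical producer rule: copies :censusCount values to :population.
    return {(s, POP, o) for (s, p, o) in graph if p == ":censusCount"}

def unclassified_size(graph):
    # RULE { ?x rdf:type :UnclassifiedSize }
    # WHERE { ?x rdf:type :Place . NOT { ?x :population ?p . } }
    places = {s for (s, p, o) in graph if p == TYPE and o == PLACE}
    with_population = {s for (s, p, o) in graph if p == POP}
    return {(s, TYPE, ":UnclassifiedSize") for s in places - with_population}

def run_in_order(base, rules):
    """Run each rule once, in the given order, and return the new triples."""
    graph = set(base)
    for rule in rules:
        graph |= rule(graph)
    return graph - set(base)

BASE = {
    (":X1", TYPE, PLACE), (":X1", POP, 1000),
    (":X2", TYPE, PLACE), (":X2", POP, 2000),
    (":X3", TYPE, PLACE),
    (":X4", TYPE, PLACE), (":X4", ":censusCount", 500),  # hypothetical
}

# Producer first: the negation sees every :population triple.
STRATIFIED = run_in_order(BASE, [population_from_census, unclassified_size])
# Negation first: :X4 is wrongly classified before its population is inferred.
UNSTRATIFIED = run_in_order(BASE, [unclassified_size, population_from_census])
```

Only the first order gives an answer that does not depend on when the negation happened to run, which is the property stratification is meant to guarantee.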
Assignment allows you to assign the result of an expression to a variable in the body of a rule. This can be used to create new RDF terms based on the data.
RULE { ?x :distanceKm ?kilometers }
WHERE {
?x :distanceMiles ?miles .
SET ( ?kilometers := ?miles * 1.60934 )
}
This can be combined with testing for the absence of a triple already recording the distance in kilometers.
RULE { ?x :distanceKm ?kilometers }
WHERE {
?x :distanceMiles ?miles
NOT { ?x :distanceKm ?km }
SET ( ?kilometers := ?miles * 1.60934 )
}
Rules involving [=assignments=] and rules that create blank nodes in their [=rule head=] are [=run-once rules=], because both kinds of rule create new RDF terms. Such rules are run after all the rules that could produce data that they depend on, and before any rules that depend on the data they produce.
This condition ensures that rules that create RDF terms do not generate new terms multiple times, with potentially different outcomes, and also that such rules do not loop back to themselves and cause an unbounded number of RDF terms.
If evaluating the expression in an [=assignment=] causes an error, then the current solution mapping is rejected by the [=assignment=].
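This example is non-normative. The Python sketch below shows the intended effect of an assignment on a sequence of solution mappings: each mapping is extended with the new variable, and a mapping is dropped when the expression raises an error. Solution mappings are modelled as dictionaries; the data values and the name `assign_kilometers` are illustrative assumptions.

```python
def assign_kilometers(solutions):
    """SET ( ?kilometers := ?miles * 1.60934 ) over a list of solution mappings."""
    extended = []
    for mu in solutions:
        try:
            kilometers = mu["miles"] * 1.60934
        except TypeError:
            # Expression error: this solution mapping is rejected.
            continue
        extended.append({**mu, "kilometers": kilometers})
    return extended

SOLUTIONS = [
    {"x": ":route1", "miles": 10},
    {"x": ":route2", "miles": "unknown"},  # not a number, so the expression fails
]
RESULT = assign_kilometers(SOLUTIONS)
```

Here only `:route1` survives; the mapping for `:route2` is discarded rather than aborting evaluation of the rule.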
A SHACL [=rule set=] can incorporate other rule sets by including their URLs in the [=rule imports=] of the rule set. This allows rules to be structured into libraries shared between rule sets.
The `IMPORTS` statements of a rule set are processed before any of the rules in the rule set are evaluated. During the importing step, if an imported rule set has its own imports, those are also processed recursively. Traversing `IMPORTS` statements during the processing of rule sets may lead to cyclic imports. A rule set is imported only once, so cycles in the graph of import statements do not lead to infinite loops.
SHACL Rules and SPARQL have a close relationship. SHACL Rules are designed to be compatible with SPARQL, and many of the constructs in SHACL Rules are inspired by SPARQL. However, there are some differences:
Data blocks allow concisely providing RDF triples directly to the rule set evaluation. Triples in datablocks are added to the inference graph and are available for matching in the body of rules.
@@examples to be updated
DATA {
:father rdfs:subClassOf :familyRelationship .
:mother rdfs:subClassOf :familyRelationship .
}
At risk:
Rule tuples are disjoint from triples. They are tuples of RDF terms (no variables) and exist only during evaluation of a rule set. They can be used to record intermediate results during rule evaluation and to pass data between rules.
Syntax of tuple patterns, templates and tuples:
Often, the first argument will be a fixed name.
There is a tuple store which holds tuples for the lifetime of the evaluation. The tuple store holds duplicate data tuples (unlike an RDF graph which is a set).
The Shape Rules Abstract Syntax is the logical structure of SHACL Rules. It is used to define the execution algorithm of SHACL Rules. Each of the two concrete syntax forms of SHACL Rules, the SHACL Rules Language (SRL) and the RDF syntax (SRL/RDF), provides a way to express the abstract syntax.
An [=expression=] is a function or a functional form; the arguments are [=RDF terms=]. An expression is evaluated with respect to a [=solution mapping=], giving an [=RDF term=] as the result. Expressions are compatible with SHACL list parameter functions and with SPARQL expressions.
A [=condition=] is an [=expression=] that evaluates to true or false. [=Conditions=] are used to restrict the values of variables in pattern matching.
In a [=triple pattern=] or a [=triple template=], position 1 of the tuple is informally called the subject, position 2 is informally called the predicate, and position 3 is informally called the object.
Well-formedness is a set of conditions on the abstract syntax of SHACL rules. Together, these conditions ensure that a [=variable=] in the [=head=] of a rule has a value defined in the [=body=] of the rule; that each variable in a [=condition element=] or [=assignment expression=] has a value at the point of evaluation; and that each assignment in a rule introduces a new variable, one that has not been used earlier in the rule body.
A [=rule=] is a well-formed rule if all of the following conditions are met:
A [=rule set=] is "well-formed" if and only if all of the [=rules=] of the rule set are "well-formed".
Revisit
A rule `R1` depends on a rule `R2` if the output of the second rule affects the evaluation of the body of the first rule. That is, the head of `R2` has a [=triple template=] that might generate a triple that matches a [=triple pattern=] in the body of `R1`, either as a [=triple pattern element=] or inside a [=negation element=].
There are two kinds of dependencies: [=closed dependencies=] and [=open dependencies=]. A closed dependency ensures that rule `R2` has generated all its possible output before rule `R1` is executed. If a rule dependency is not closed, it is an open dependency, which allows the first rule `R1` to be executed while the rule `R2` might be run again to generate further triples, which can then cause `R1` to be reevaluated with the new triples from `R2`.
In this first example, the first rule depends on the second rule. It is an open dependency.
RULE { ?x :descendedFrom ?y } WHERE { ?x :childOf ?y }
RULE { ?x :childOf ?y } WHERE { ?y :fatherOf ?x }
In this second example, the first rule depends on the second rule. It is a closed dependency.
RULE @@
A [=triple pattern=] matches a [=triple template=] if the triple template may generate a triple that matches the triple pattern.
A [=triple pattern=] depends on a [=triple template=] if the [=triple pattern=] could possibly match the [=triple template=].
A [=triple pattern=] depends on a [=rule=] if the [=triple pattern=] depends on any of the [=triple templates=] in the [=head=] of the rule.
Rule `R1` [=depends on=] `R2` if any [=triple pattern=] in the body of `R1`, whether as a [=triple pattern element=] or inside a [=negation element=], depends on a [=triple template=] in the head of `R2`.
A [=rule dependency=] of rule `R1` on rule `R2` is a [=closed dependency=] if any of the following conditions hold:
A [=rule dependency=] of rule `R1` on rule `R2` is an [=open dependency=] if the dependency is not a [=closed dependency=]. That is, any [=triple pattern=] of `R1` that depends on `R2` occurs only as a [=triple pattern element=].
A [=triple template=] with components `tSubj`, `tPred`, `tObj` can possibly generate a triple with component RDF terms `s`, `p`, `o` if
In addition, if any pair of `tSubj`, `tPred`, and `tObj` are the same variable, then the corresponding pair of `s`, `p`, and `o` must be the same.
The dependencies between rules are represented as a directed graph, called the [=dependency graph=]. The vertices of the graph are the rules of the rule set, and the edges are labeled either open or closed according to whether the dependency is an [=open dependency=] or a [=closed dependency=].
A rule `R` has a [=recursive dependency=] if there is a cyclic path in the [=dependency graph=] involving `R`.
The dependency graph is not affected by the data graph.
The following algorithm gives one possible method for constructing the [=dependency graph=] from a [=rule set=]. Conformance depends on producing a dependency graph that meets the definitions of a dependency graph, not on the use of this procedure.
define mergeLabel(oldLabel, newLabel):
    ## Closed dependency overrides open dependency.
    if oldLabel == "open" and newLabel == "open":
        return "open"
    else:
        return "closed"
    endif
enddefine
## output -- Dependency graph with rule vertices and labeled edges.
define buildDependencyGraph(ruleSet):
    ## edgeLabelMap maps (R1, R2) to "open" or "closed"
    let edgeLabelMap be a map from pair (rule, rule) to label
    foreach rule R1 in ruleSet:
        ## Classify each triple pattern TP in the rule as requiring "open" or "closed"
        ## depending on whether it is in a negation element or not.
        let bodyDependencies = {}
        foreach rule body element RBE in the body of R1:
            if RBE is a negation element:
                foreach triple pattern TP in RBE:
                    let item be a pair (TP, "closed")
                    add item to bodyDependencies
                endfor
            else if RBE is a triple pattern element of triple pattern TP:
                let item be a pair (TP, "open")
                add item to bodyDependencies
            else if RBE is a condition element:
                ## Do nothing
            else if RBE is an assignment element:
                ## Do nothing
            endif
        endfor
        foreach pair (triple pattern TP, depLabel) in bodyDependencies:
            if R1 has an assignment element:
                set depLabel to "closed"
            endif
            if R1 has a triple template with a blank node:
                set depLabel to "closed"
            endif
            ## Find dependencies for this triple pattern element or negation element.
            foreach rule R2 in ruleSet:
                foreach triple template TT in head of R2:
                    ## "possibly generate" / matching is defined in
                    ## section 3.3
                    if TT can possibly match triple pattern TP:
                        let key = (R1, R2)
                        if edgeLabelMap contains key:
                            let oldLabel = edgeLabelMap.get(key)
                            let merged = mergeLabel(oldLabel, depLabel)
                            edgeLabelMap.set(key, merged)
                        else:
                            edgeLabelMap.set(key, depLabel)
                        endif
                    endif
                endfor
            endfor
        endfor
    endfor
    let DP = { }
    foreach entry ((R1, R2), label) in edgeLabelMap:
        add edge (R1 -> R2) labeled label to DP
    endfor
    the result is DP
enddefine
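This example is non-normative. The Python sketch below follows the shape of the procedure above on a small rule set. The rule encoding (`Rule`, with body elements tagged `"tp"`, `"not"`, `"filter"`, `"set"`), the helper `may_match`, and the third rule `root` are all assumptions made for illustration. In particular, `may_match` is a simplified stand-in for "possibly match": it treats two positions as matching when either is a variable or both are the same term, and it does not implement the repeated-variable refinement.

```python
from dataclasses import dataclass

def is_var(term):
    return isinstance(term, str) and term.startswith("?")

def may_match(template, pattern):
    # Assumed approximation: position-wise, a variable matches anything.
    return all(is_var(t) or is_var(p) or t == p for t, p in zip(template, pattern))

@dataclass(frozen=True)
class Rule:
    name: str
    head: tuple  # triple templates
    body: tuple  # ("tp", triple) | ("not", (triple, ...)) | ("filter", ...) | ("set", ...)

def is_run_once(rule):
    has_assignment = any(element[0] == "set" for element in rule.body)
    has_bnode = any(
        isinstance(term, str) and term.startswith("_:")
        for template in rule.head for term in template
    )
    return has_assignment or has_bnode

def build_dependency_graph(rules):
    edges = {}  # (R1 name, R2 name) -> "open" | "closed"
    for r1 in rules:
        deps = []
        for element in r1.body:
            if element[0] == "tp":
                deps.append((element[1], "open"))
            elif element[0] == "not":
                deps.extend((tp, "closed") for tp in element[1])
        for pattern, label in deps:
            if is_run_once(r1):
                label = "closed"
            for r2 in rules:
                if any(may_match(tt, pattern) for tt in r2.head):
                    key = (r1.name, r2.name)
                    # A closed label overrides an open one.
                    if edges.get(key) != "closed":
                        edges[key] = label
    return edges

child = Rule("child", (("?x", ":childOf", "?y"),), (("tp", ("?y", ":fatherOf", "?x")),))
desc = Rule("desc", (("?x", ":descendedFrom", "?y"),), (("tp", ("?x", ":childOf", "?y")),))
root = Rule(  # hypothetical rule using negation
    "root",
    (("?x", "rdf:type", ":Root"),),
    (("tp", ("?x", "rdf:type", ":Person")), ("not", (("?x", ":childOf", "?p"),))),
)
EDGES = build_dependency_graph([child, desc, root])
```

With these three rules, `desc` has an open dependency on `child`, and `root` has a closed dependency on `child` because its `:childOf` pattern occurs inside a negation element.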
Examples:
@@ Examples of triple pattern dependencies.
@@ Examples of rule dependencies.
[=Stratification=] is the process of partitioning a [=rule set=] into an ordered sequence of [=stratification layers=] (also known as "strata", singular "stratum"). Rules in lower [=strata=] are evaluated before rules in higher [=strata=].
[=Stratification=] imposes constraints on dependencies between [=rules=] to ensure that [=negation elements=], [=assignment elements=], and blank nodes created in a [=rule head=] depend only on results computed using earlier (lower) [=strata=] and the [=base graph=]. This guarantees a single, well-defined, and finite outcome from the evaluation of a [=rule set=] over a given [=base graph=].
A stratification process may also be used to make other evaluation decisions. This document describes the necessary conditions for consistent evaluation and gives one possible way to form a stratification. Implementations need to meet the conditions described here in order to get compatible behavior but they are not required to implement the algorithm as presented.
A [=stratification layer=] `SL` is a pair of disjoint sets of rules (`SL.once`, `SL.general`). `SL.once` contains [=run-once rules=], which are rules that use [=assignment elements=] or produce blank nodes in the [=rule head=]; these rules are each evaluated exactly once at the start of evaluation of the [=stratification layer=]. `SL.general` contains the remaining rules, which are evaluated until no new triples are inferred.
[=Stratification=] is only defined when the following condition is satisfied. If a [=rule set=] does not meet this condition, then this specification does not define an outcome for the evaluation of such a [=rule set=].
In other words, there is no `NOT` or run-once rule (assignment or rule [=triple template=] involving a blank node) in any transitive dependency cycle of the [=dependency graph=].
The following algorithm gives one possible stratification based solely on the rule set.
## output -- Map: Integer -> Set of rules.
define stratification(ruleSet):
    let DP = Dependency graph for the rule set.
    let stratumMap be a map from rule to integer
    ## The dependency graph should satisfy the stratification condition.
    ## The check for unbounded stratification is a guard
    ## due to a violation of the stratification condition.
    let limit = num rules + 1
    let maxStratum = 0
    ## initialize stratumMap
    foreach rule in ruleSet:
        stratumMap.set(rule, 0)
    endfor
    let changed = true
    while changed:
        changed = false
        foreach edge E in DP:
            ## Edge from pRule to qRule with a label
            let pRule = source of edge
            let qRule = destination of the edge
            let label = edge label
            if label == "open":
                if stratumMap.get(pRule) < stratumMap.get(qRule):
                    stratumMap.set(pRule, stratumMap.get(qRule))
                    changed = true
                endif
            endif
            if label == "closed":
                if stratumMap.get(pRule) <= stratumMap.get(qRule):
                    let xStratum = 1 + stratumMap.get(qRule)
                    if xStratum > limit:
                        ## Stratification requirement violated
                        error "Stratification error"
                    endif
                    stratumMap.set(pRule, xStratum)
                    maxStratum = max(maxStratum, xStratum)
                    changed = true
                endif
            endif
        endfor
    endwhile
    ## Initialize the result map.
    let stratumRules be a map from integer to rules.
    for i = 0 to maxStratum:
        stratumRules.set(i, {})
    endfor
    ## Gather rules in stratumMap with the same level number
    for rule R in map stratumMap:
        let stratumNum = stratumMap.get(R)
        add R to stratumRules.get(stratumNum)
    endfor
    ## Partition each level into once and general
    let stratumLevels be a sequence of pairs of sets of rules.
    for i = 0 to maxStratum:
        let rules = stratumRules.get(i)
        let once = { R in rules | R is a run-once rule }
        let general = rules \ once
        stratumLevels.set(i, pair(once, general))
    endfor
    the result is stratumLevels
enddefine
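This example is non-normative. The Python sketch below mirrors the level-assignment loop above over a dependency graph given as a dictionary of labeled edges. Rule names are plain strings, the edge data is an illustrative assumption, and the partition of each layer into once and general rules is omitted for brevity.

```python
def stratify(rules, edges):
    """Return a list of sets of rule names, lowest stratum first.

    `edges` maps (p, q) to "open" or "closed", meaning rule p depends on rule q.
    """
    stratum = {rule: 0 for rule in rules}
    limit = len(rules) + 1  # guard against a violated stratification condition
    changed = True
    while changed:
        changed = False
        for (p, q), label in edges.items():
            if label == "open" and stratum[p] < stratum[q]:
                # p may share a stratum with q, but must not be below it.
                stratum[p] = stratum[q]
                changed = True
            elif label == "closed" and stratum[p] <= stratum[q]:
                # p must be strictly above q.
                level = stratum[q] + 1
                if level > limit:
                    raise ValueError("Stratification error")
                stratum[p] = level
                changed = True
    layers = [set() for _ in range(max(stratum.values(), default=0) + 1)]
    for rule, level in stratum.items():
        layers[level].add(rule)
    return layers

# Illustrative input: "root" negates over the output of "child".
LAYERS = stratify(
    ["child", "desc", "root"],
    {("desc", "child"): "open", ("root", "child"): "closed"},
)

def fails_on_closed_cycle():
    # A rule with a closed dependency on itself cannot be stratified.
    try:
        stratify(["a"], {("a", "a"): "closed"})
    except ValueError:
        return True
    return False
```

A closed edge inside a cycle keeps pushing levels upward, so the `limit` guard is what turns a violated stratification condition into an error instead of a non-terminating loop.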
A consequence of the [=stratification condition=] is that once a [=run-once rule=] is evaluated, the data used to determine the outcome of the rule will not change during further evaluation.
There are two concrete syntaxes.
Shape Rules Language:
PREFIX : <http://example/>
DATA { :x :p 1 ; :q 2 . }
RULE { ?x :bothPositive true . }
WHERE { ?x :p ?v1 FILTER ( ?v1 > 0 ) ?x :q ?v2 FILTER ( ?v2 > 0 ) }
RULE { ?x :oneIsZero true . }
WHERE { ?x :p ?v1 ; :q ?v2 FILTER ( ( ?v1 = 0 ) || ( ?v2 = 0 ) ) }
RDF Rules syntax:
PREFIX : <http://example/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX sh: <http://www.w3.org/ns/shacl#>
PREFIX srl: <http://www.w3.org/ns/shacl-rules#>
PREFIX sparql: <http://www.w3.org/ns/sparql#>
:ruleSet-1
rdf:type srl:RuleSet;
srl:data (
[ srl:subject :x ; srl:predicate :p ; srl:object 1 ]
[ srl:subject :x ; srl:predicate :q ; srl:object 2 ]
);
srl:rules (
[
rdf:type srl:Rule;
srl:head (
[ srl:subject [ srl:varName "x" ] ; srl:predicate :bothPositive ; srl:object true ]
) ;
srl:body (
[ srl:subject [ srl:varName "x" ]; srl:predicate :p ; srl:object [ srl:varName "v1" ] ]
[ srl:filter [ sparql:greater-than ( [ srl:varName "v1" ] 0 ) ] ]
[ srl:subject [ srl:varName "x" ] ; srl:predicate :q ; srl:object [ srl:varName "v2" ] ]
[ srl:filter [ sparql:greater-than ( [ srl:varName "v2" ] 0 ) ] ]
);
]
[
rdf:type srl:Rule;
srl:head (
[ srl:subject [ srl:varName "x" ] ; srl:predicate :oneIsZero ; srl:object true ]
) ;
srl:body (
[ srl:subject [ srl:varName "x" ] ; srl:predicate :p ; srl:object [ srl:varName "v1" ] ]
[ srl:subject [ srl:varName "x" ] ; srl:predicate :q ; srl:object [ srl:varName "v2" ] ]
[ srl:filter [ sparql:function-or (
[ sparql:equals ( [ srl:varName "v1" ] 0 ) ]
[ sparql:equals ( [ srl:varName "v2" ] 0 ) ]
) ]
]
);
]
) .
The grammar is given below.
Mapping the AST to the abstract syntax.
Additional helpers (short-hand abbreviations):
These allow for well-known rule patterns and also specialised implementations in basic engines.
TRANSITIVE(uri)
SYMMETRIC(uri)
INVERSE(uri, uri)
At risk:
`TRANSITIVE` has both implementation and concise expression advantages. Implementation advantages for `SYMMETRIC` and `INVERSE` are not clear.
Vocabulary: rdf-syntax-vocab.ttl
SHACL shapes: rdf-syntax-shapes.ttl
Well-formedness:
Describe how the abstract model maps to triples.
Process : accumulators, bottom up/ Walk the structure.
All triples not in the syntax are ignored. No other "srl:" predicates are allowed (??).
@@ Illustration: SHACL rule set in text and RDF syntaxes: all features:
PREFIX : <http://example/>
DATA { :s :p :o }
RULE { ?x :q :o } WHERE { ?x :p :o }
RULE { ?x :q :o } WHERE { ?x :p :o1 ; :p :o2 }
RULE { ?x :q :o } WHERE { ?x :p ?o . FILTER (?o < 18) }
RULE { ?x :q ?o } WHERE { ?x :p :o . SET (18 AS ?o) }
RULE { ?x :q ?o } WHERE { ?x :p :o . NOT { ?s :p ?o . FILTER(?o < 18) } }
PREFIX : <http://example/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX sh: <http://www.w3.org/ns/shacl#>
PREFIX sparql: <http://www.w3.org/ns/sparql#>
PREFIX srl: <http://www.w3.org/ns/shacl-rules#>
:ruleSet-1
rdf:type srl:RuleSet;
srl:data (
[ srl:subject :s ; srl:predicate :p; srl:object :o ; ]
);
srl:rules (
[
rdf:type srl:Rule;
srl:body (
[ srl:subject [ srl:varName "x" ] ; srl:predicate :p ; srl:object :o ; ]
) ;
srl:head (
[ srl:subject [ srl:varName "x" ] ; srl:predicate :q ; srl:object :o ; ]
)
]
[
rdf:type srl:Rule ;
srl:body (
[ srl:subject [ srl:varName "x" ] ; srl:predicate :p ; srl:object :o1 ; ]
[ srl:subject [ srl:varName "x" ] ; srl:predicate :p ; srl:object :o2 ; ]
) ;
srl:head (
[ srl:subject [ srl:varName "x" ] ; srl:predicate :q ; srl:object :o ; ]
)
]
[
rdf:type srl:Rule ;
srl:body (
[ srl:subject [ srl:varName "x" ] ; srl:predicate :p ; srl:object [ srl:varName "o" ] ; ]
[
srl:filter [
sparql:less-than (
[ srl:varName "o" ]
18
)
]
]
) ;
srl:head (
[ srl:subject [ srl:varName "x" ] ; srl:predicate :q ; srl:object :o ; ]
)
]
[
rdf:type srl:Rule ;
srl:body (
[ srl:subject [ srl:varName "x" ] ; srl:predicate :p ; srl:object :o ; ]
[
srl:assign [
srl:assignValue 18 ;
srl:assignVar [ srl:varName "o" ]
]
]
) ;
srl:head (
[ srl:subject [ srl:varName "x" ] ; srl:predicate :q ; srl:object [ srl:varName "o" ] ; ]
)
]
[
rdf:type srl:Rule ;
srl:body (
[ srl:subject [ srl:varName "x" ] ; srl:predicate :p ; srl:object :o ; ]
[
srl:not (
[ srl:subject [ srl:varName "s" ] ; srl:predicate :p ; srl:object [ srl:varName "o" ] ; ]
[
srl:filter [
sparql:less-than (
[ srl:varName "o" ]
18
)
]
]
)
]
) ;
srl:head (
[ srl:subject [ srl:varName "x" ] ; srl:predicate :q ; srl:object [ srl:varName "o" ] ; ]
)
]
) .
This section defines the outcome of evaluating a rule set on given data. It does not prescribe the algorithm as the method of implementation. An implementation can use any algorithm that generates the same outcome.
Inputs: data graph G, called the base graph, and a rule set RS. Output: an RDF graph GI of inferred triples
The inferred triples do not include triples present in the set of triples of the [=base graph=].
A [=solution mapping=], μ, is a partial function μ : V → T, where V is the set of all variables and T is the set of all [=RDF terms=]. The domain of μ is denoted by dom(μ), and it is the subset of V for which μ is defined. We use the term [=solution=] where it is clear that a [=solution mapping=] is meant.
Write μ0 for the solution mapping such that dom(μ0) is the empty set.
The function subst(μ, [=triple pattern=]) returns a [=triple pattern=] where each occurrence in the [=triple pattern=] of a variable `var` that is in dom(μ) is replaced by the [=RDF term=] given by the [=solution mapping=] for `var`. If the triple pattern result has no variables, then it is an [=RDF Triple=].
Let G be an [=RDF graph=] and TP be a triple pattern. The function `graphMatch(G, TP)` returns the set of all solutions that, when applied to the triple pattern, produce a triple that is in G:
graphMatch(G, TP) = { μ | subst(μ, TP) is a triple in G }
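This example is non-normative. The Python sketch below gives one possible reading of `subst` and `graphMatch`, with triples as tuples, variables as strings beginning with `?`, and solution mappings as dictionaries. These representations are assumptions for illustration. The sketch returns only the solutions whose domain is exactly the variables of the triple pattern.

```python
def is_var(term):
    return isinstance(term, str) and term.startswith("?")

def subst(mu, pattern):
    """Replace each variable of the pattern that is in dom(mu) by its value."""
    return tuple(mu.get(term, term) if is_var(term) else term for term in pattern)

def graph_match(graph, pattern):
    """Return the solutions mu for which subst(mu, pattern) is a triple in graph."""
    solutions = []
    for triple in graph:
        mu = {}
        for p_term, t_term in zip(pattern, triple):
            if is_var(p_term):
                # A repeated variable must bind to the same term each time.
                if mu.setdefault(p_term, t_term) != t_term:
                    break
            elif p_term != t_term:
                break
        else:
            solutions.append(mu)
    return solutions

GRAPH = {
    (":A", ":fatherOf", ":X"),
    (":B", ":motherOf", ":X"),
    (":n", ":sameAs", ":n"),
    (":n", ":sameAs", ":m"),
}
FATHERS = graph_match(GRAPH, ("?y", ":fatherOf", "?x"))
REFLEXIVE = graph_match(GRAPH, ("?s", ":sameAs", "?s"))
```

Applying `subst` to any returned solution and the original pattern yields a triple of `GRAPH`, which is the defining property of `graphMatch`.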
Let μ1 and μ2 be solutions, and let S1 and S2 be solution sequences.
compatible(μ1, μ2) = true
    if forall v in dom(μ1) intersection dom(μ2)
        μ1(v) = μ2(v)
compatible(μ1, μ2) = false otherwise
merge(μ1, μ2) = μ, where
    μ(v) = μ1(v) if v in dom(μ1)
    μ(v) = μ2(v) otherwise
merge(S1, S2) = { μ |
    μ1 in S1, μ2 in S2
    and compatible(μ1, μ2)
    and μ = merge(μ1, μ2) }
The domain of merge(μ1, μ2) is `dom(μ1) ∪ dom(μ2)`.
Two solutions that have no variables in common are compatible.
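This example is non-normative. The Python sketch below expresses compatibility and merging over solution mappings modelled as dictionaries; the function names and sample values are illustrative assumptions.

```python
def compatible(mu1, mu2):
    """True if the two mappings agree on every variable they share."""
    return all(mu1[v] == mu2[v] for v in mu1.keys() & mu2.keys())

def merge(mu1, mu2):
    """Union of two mappings; values from mu1 win, which only matters if incompatible."""
    return {**mu2, **mu1}

def merge_sequences(s1, s2):
    """Pairwise merge of every compatible pair drawn from s1 and s2."""
    return [merge(a, b) for a in s1 for b in s2 if compatible(a, b)]

S1 = [{"?x": ":a"}, {"?x": ":b"}]
S2 = [{"?x": ":a", "?y": ":c"}]
JOINED = merge_sequences(S1, S2)
```

Mappings with disjoint domains share no variables, so `compatible` is vacuously true for them and their merge is simply the union.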
The first step in evaluating a [=rule set=] is to prepare a single, valid rule set. This involves gathering all imported rule sets, building a single, combined rule set, and then calculating the stratification for the combined rule set.
@@Walk imports - visit only once@@
@@ TODO: consider defining a rule set as having two components and a separate set of imports. i.e. imports are in the parsing and sorted out to give a usable "rule set" of "rules + data"
A [=rule set=] has three components: `R.rules`, `R.data`, and `R.imports`. The rule set merge of two rule sets, `RS1` and `RS2`, is a rule set, `MR`, defined as follows:
MR.rules = RS1.rules ∪︀ RS2.rules
MR.data = merge(RS1.data, RS2.data)
MR.imports = {}
where `merge` is the RDF merge operation.
define imports(rule set RS, set of URLs V), returning rule set
    ## V records the URLs already visited; it is shared across recursive calls.
    let I = the set of import URLs declared for the rule set RS
    let RS2 = RS
    foreach URL x in I:
        if x ∉ V:
            V = V ∪ { x }
            read rule set RS3 from URL x
            RS2 = rulesetMerge(RS2, imports(RS3, V))
        endif
    endfor
    result is RS2
enddefine
let RS be a rule set
let V = {}
if RS has a location, V = { location of RS }
result is imports(RS, V)
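This example is non-normative. The Python sketch below shows the visit-once traversal on two rule sets that import each other. The `RuleSet` class, the `urn:example:` locations, and the `fetch` function standing in for reading a rule set from a URL are all assumptions for illustration, and plain set union is used where the specification calls for an RDF merge of the data.

```python
from dataclasses import dataclass, field

@dataclass
class RuleSet:
    rules: set = field(default_factory=set)
    data: set = field(default_factory=set)
    imports: list = field(default_factory=list)

def ruleset_merge(rs1, rs2):
    # Simplification: set union instead of a blank-node-aware RDF merge.
    return RuleSet(rs1.rules | rs2.rules, rs1.data | rs2.data, [])

def resolve_imports(rule_set, visited, fetch):
    """Merge rule_set with everything it imports, reading each URL at most once."""
    merged = RuleSet(set(rule_set.rules), set(rule_set.data), [])
    for url in rule_set.imports:
        if url not in visited:
            visited.add(url)  # the same set object is shared by all recursive calls
            imported = resolve_imports(fetch(url), visited, fetch)
            merged = ruleset_merge(merged, imported)
    return merged

# Hypothetical in-memory store with a cyclic import: a -> b -> a.
STORE = {
    "urn:example:a": RuleSet({"ruleA"}, set(), ["urn:example:b"]),
    "urn:example:b": RuleSet({"ruleB"}, set(), ["urn:example:a"]),
}
FETCHED = []

def fetch(url):
    FETCHED.append(url)
    return STORE[url]

COMBINED = resolve_imports(STORE["urn:example:a"], {"urn:example:a"}, fetch)
```

Seeding `visited` with the location of the starting rule set is what stops the cycle from re-reading it.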
An expression, whether used in a [=condition element=] or an [=assignment element=], is evaluated with respect to a solution mapping, which provides a value, an RDF term, for each variable in the expression. The well-formedness requirements ensure that all variables in the expression appear in the solution mapping.
define evalFunction(F, μ):
    ## [x/μ] is the value of argument x under the solution mapping μ:
    ##   if x is an RDF term, then [x/μ] is x
    ##   if x is a variable, then [x/μ] is μ(x)
    ## By well-formedness, it is an error if x is not in dom(μ).
    return evalFunction(F(expr1, expr2 ...), μ) = F(evalFunction(expr1, μ), evalFunction(expr2, μ), ...)
    if an error is returned by evalFunction: return error
    @@return evalFunction(FF(expr1, expr2) , μ) = ... things that are not functions like `IF`
enddefine
The function `EBV(x)` returns the effective boolean value for an RDF term.
A [=rule=] is evaluated by calculating a [=solution sequence=] from the [=rule body=] and then using each [=solution mapping=] of the [=solution sequence=] to generate triples using [=rule head=].
let R be a well-formed rule.
let rule R = (H, B) where
    H is the sequence of triple templates in the head
    B is the sequence of triple pattern elements,
        condition elements, negation elements,
        and assignment elements in the body
# Solution sequence of one solution that does not map any variables.
let SEQ0: Solution sequence = { μ0 }
let G = evaluation graph
# Evaluate rule body
# This function returns a sequence of solutions
define evalRuleElements(B, SEQ, G):
    for each rule element rElt in B:
        if rElt is a triple pattern element with triple pattern TP:
            X = graphMatch(G, TP)
            SEQ1 = {}
            for each μ1 in X:
                for each μ2 in SEQ:
                    if compatible(μ1, μ2)
                        μ3 = merge(μ1, μ2)
                        add μ3 to SEQ1
                    endif
                endfor
            endfor
        endif
        if rElt is a condition element with expression F:
            SEQ1 = {}
            for each solution μ in SEQ:
                let x = evalFunction(F, μ)
                if EBV(x) is true:
                    add μ to SEQ1
                endif
            endfor
        endif
        if rElt is a negation element with body elements N:
            SEQ1 = {}
            for each solution μ in SEQ:
                S = sequence{ μ }
                NEG = evalRuleElements(N, S, G)
                if NEG is empty
                    add μ to SEQ1
                endif
            endfor
        endif
        if rElt is an assignment element with variable V and expression expr
            SEQ1 = {}
            for each solution μ in SEQ:
                let x = evalFunction(expr, μ)
                if x is not an error:
                    ## Add mapping V -> x to solution μ
                    let μ2 be a solution mapping μ ∪ { (V, x) }
                    add μ2 to SEQ1
                else
                    # Error: drop solution μ
                endif
            endfor
        endif
        if SEQ1 is empty
            SEQ = {}
            return SEQ
        endif
        SEQ = SEQ1
    endfor
    return SEQ
enddefine
let SEQ = evalRuleElements(B, SEQ0, G)
# Evaluate rule head
let OUT = empty set
for each μ in SEQ:
    let S = {}
    for each triple template TT in H:
        let triple = subst(μ, TT)
        Add triple to S
    endfor
    OUT = OUT union S
endfor
result eval(R, G) is OUT
Note that `OUT` may contain triples that are also in the data graph.
Evaluation of a [=rule set=] is defined as the execution of each [=stratum=] of the [=stratification=] of the rule set, where each stratum is executed completely and in order before moving on to the next [=stratum=]. A [=stratum=] is evaluated by first evaluating each of the [=run-once rules=] of that stratum, and then evaluating [=general rules=] of the stratum repeatedly until no new triples are produced.
let G0 be the input base graph
let RS be the rule set
let D be the graph of all DATA triples in RS
Apply stratification to RS
let LS be the sequence of layers after stratification
# Inference graph
let GI = { t ∈ D | t ∉ G0 }
# Evaluation graph.
let GE = G0 ∪ D
for each stratum ST in LS:
    for each rule R in ST.once:
        let X = eval(R, GE)
        let Y = { t ∈ X | t ∉ GE }
        GI = Y ∪ GI
        GE = Y ∪ GE
    endfor
    let finished = false
    while !finished:
        finished = true
        for each rule R in ST.general:
            let X = eval(R, GE)
            let Y = { t ∈ X | t ∉ GE }
            if Y is not empty:
                finished = false
                GI = Y ∪ GI
                GE = Y ∪ GE
            endif
        endfor
    endwhile
endfor
the result is GI
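This example is non-normative. The Python sketch below follows the layer-by-layer loop above, with each rule represented as a function from the evaluation graph to the triples its head would produce. The sample rules, the `:Person` data, and the hand-written strata are illustrative assumptions; a real implementation would derive the strata from the dependency graph.

```python
def evaluate(base, data, strata):
    """strata is a list of (once_rules, general_rules) pairs, lowest stratum first."""
    inferred = set(data) - set(base)  # GI
    graph = set(base) | set(data)     # GE
    for once, general in strata:
        for rule in once:
            new = rule(graph) - graph
            inferred |= new
            graph |= new
        finished = False
        while not finished:
            finished = True
            for rule in general:
                new = rule(graph) - graph
                if new:
                    finished = False
                    inferred |= new
                    graph |= new
    return inferred

def to_km(graph):      # run-once rule: uses an assignment
    return {(s, ":distanceKm", o * 1.60934) for (s, p, o) in graph if p == ":distanceMiles"}

def child_of(graph):   # general rule
    return {(x, ":childOf", y) for (y, p, x) in graph if p == ":fatherOf"}

def root(graph):       # negates over :childOf, so it sits in a higher stratum
    persons = {s for (s, p, o) in graph if p == "rdf:type" and o == ":Person"}
    children = {s for (s, p, o) in graph if p == ":childOf"}
    return {(s, "rdf:type", ":Root") for s in persons - children}

BASE = {(":A", ":fatherOf", ":X"), (":r", ":distanceMiles", 10)}
DATA = {(":A", "rdf:type", ":Person"), (":X", "rdf:type", ":Person")}
INFERRED = evaluate(BASE, DATA, [([to_km], [child_of]), ([], [root])])
```

Because `root` runs only after the first stratum has reached its fixpoint, `:X` is never marked as `:Root`, and the `DATA` triples appear in the result because they are not in the base graph.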
A Shapes Rules Language document is an RDF string encoded in UTF-8 [[!RFC3629]]. Only Unicode scalar values, in the ranges U+0000 to U+D7FF and U+E000 to U+10FFFF, are allowed. This excludes the surrogate code points, which occupy the range U+D800 to U+DFFF.
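As a non-normative illustration, the scalar-value restriction can be checked as follows in Python. The function names are hypothetical. The sketch relies on Python's strict UTF-8 decoder, which rejects byte sequences that encode surrogate code points.

```python
def is_scalar_only(text: str) -> bool:
    """True if no code point of text lies in the surrogate range U+D800..U+DFFF."""
    return not any(0xD800 <= ord(ch) <= 0xDFFF for ch in text)

def decode_document(raw: bytes) -> str:
    """Decode a document from UTF-8; strict mode raises UnicodeDecodeError
    on malformed input, including encoded surrogates."""
    return raw.decode("utf-8")

ok = is_scalar_only("caf\u00e9 \U0001F600")   # ordinary scalar values
bad = is_scalar_only("bad\ud800")             # contains a lone surrogate
```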
White space (production WS) is used to separate two terminals which would otherwise be (mis-)recognized as one terminal. Rule names below in capitals indicate where white space is significant; these form a possible choice of terminals for constructing a Shapes Rules Language parser. White space is significant in the production String.
Comments start with a # outside an IRIREF, STRING_LITERAL1, STRING_LITERAL2, STRING_LITERAL_LONG1, or STRING_LITERAL_LONG2, and continue to the end of line (marked by LF or CR), or to the end of file if there is no end of line after the comment marker. Comments are treated as white space.
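The comment rule above can be approximated with a small scanner. The Python sketch below is non-normative and simplified: it recognizes IRIREF and the four string terminals with regular expressions modeled on the Turtle terminals, and replaces each comment with a single space. It does not handle every case a full parser would. For example, a `<` used as a comparison operator directly adjacent to other characters could be mistaken for the start of an IRIREF. The names `PROTECTED`, `SCANNER`, and `strip_comments` are introduced for this sketch.

```python
import re

# Terminals inside which '#' does not start a comment (longest forms first).
PROTECTED = "|".join([
    r'"""(?:[^"\\]|\\.|"(?!""))*"""',   # STRING_LITERAL_LONG2
    r"'''(?:[^'\\]|\\.|'(?!''))*'''",   # STRING_LITERAL_LONG1
    r'"(?:[^"\\\n\r]|\\.)*"',           # STRING_LITERAL2
    r"'(?:[^'\\\n\r]|\\.)*'",           # STRING_LITERAL1
    r'<(?:[^<>"{}|^`\\\x00-\x20]|\\u[0-9A-Fa-f]{4}|\\U[0-9A-Fa-f]{8})*>',  # IRIREF
])

# Either a protected terminal (group 1) or a comment running to end of line.
SCANNER = re.compile("(" + PROTECTED + ")|#[^\n\r]*", re.DOTALL)

def strip_comments(text: str) -> str:
    """Keep protected terminals unchanged; turn each comment into one space."""
    return SCANNER.sub(
        lambda m: m.group(1) if m.group(1) is not None else " ", text)

sample = 'PREFIX ex: <http://example/ns#> # a prefix\n:s :p "a # b" . # trailing'
cleaned = strip_comments(sample)
```

The `#` inside the IRI and the `#` inside the string literal are preserved, while both trailing comments are replaced by white space and the line break is kept.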
Relative IRI references are resolved with base IRIs as per [[[RFC3986]]] [[RFC3986]] using only the basic algorithm in section 5.2. Neither Syntax-Based Normalization nor Scheme-Based Normalization (described in sections 6.2.2 and 6.2.3 of RFC3986) are performed. Characters additionally allowed in IRI references are treated in the same way that unreserved characters are treated in URI references, per section 6.5 of [[[RFC3987]]] [[RFC3987]].
The BASE directive defines the Base IRI used to resolve relative IRI references per [[RFC3986]] section 5.1.1, "Base URI Embedded in Content". Section 5.1.2, "Base URI from the Encapsulating Entity", defines how the In-Scope Base IRI may come from an encapsulating document, such as a SOAP envelope with an `xml:base` directive or a MIME multipart document with a `Content-Location` header. The "Retrieval URI" identified in section 5.1.3, "Base URI from the Retrieval URI", is the URL from which a particular Shapes Rules Language document was retrieved. If none of the above specifies the Base URI, the default Base URI (section 5.1.4, "Default Base URI") is used. Each BASE directive sets a new In-Scope Base URI, relative to the previous one.
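The chaining of BASE directives can be illustrated with Python's standard `urllib.parse.urljoin`, which follows the reference resolution algorithm of RFC 3986 for common schemes such as `http`. This is a non-normative sketch: the function `apply_base_directives` and the example IRIs are hypothetical, and `urljoin` is used only as an approximation of the section 5.2 algorithm.

```python
from typing import List
from urllib.parse import urljoin

def apply_base_directives(initial_base: str, directives: List[str]) -> str:
    """Resolve each BASE IRI reference against the previous in-scope base."""
    base = initial_base
    for iri_ref in directives:
        base = urljoin(base, iri_ref)
    return base

# Start from the retrieval URI, then apply two relative BASE directives.
retrieval_uri = "http://example.org/rules/main.srl"
base = apply_base_directives(retrieval_uri, ["shapes/", "v2/"])
resolved = urljoin(base, "person")
```

Here the first directive yields `http://example.org/rules/shapes/`, the second yields `http://example.org/rules/shapes/v2/`, and the relative reference `person` resolves to `http://example.org/rules/shapes/v2/person`.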
There are three forms of escapes used in Shapes Rules documents:
A numeric escape sequence represents the value of a Unicode code point. A numeric escape sequence MUST NOT produce a code point value in the range U+D800 to U+DFFF, which is the range for Unicode surrogates.
| Escape sequence | Unicode code point |
|---|---|
| `\u` hex hex hex hex | A Unicode code point in the ranges U+0000 to U+D7FF and U+E000 to U+FFFF, corresponding to the value encoded by the four hexadecimal digits interpreted from most significant to least significant digit. |
| `\U` hex hex hex hex hex hex hex hex | A Unicode code point in the ranges U+0000 to U+D7FF and U+E000 to U+10FFFF, corresponding to the value encoded by the eight hexadecimal digits interpreted from most significant to least significant digit. |

where hex is a hexadecimal character

HEX ::= [0-9] | [A-F] | [a-f]
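A non-normative Python sketch of numeric escape decoding follows. It handles numeric escapes only, which matches their use in IRIs, and rejects values in the surrogate range or above U+10FFFF. The function name `decode_numeric_escapes` is an assumption of this sketch.

```python
import re

# \u followed by four hex digits, or \U followed by eight hex digits.
_UCHAR = re.compile(r"\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})")

def decode_numeric_escapes(text: str) -> str:
    def repl(m: "re.Match[str]") -> str:
        cp = int(m.group(1) or m.group(2), 16)
        if 0xD800 <= cp <= 0xDFFF:
            raise ValueError("numeric escape produces a surrogate: " + m.group(0))
        if cp > 0x10FFFF:
            raise ValueError("numeric escape out of range: " + m.group(0))
        return chr(cp)
    return _UCHAR.sub(repl, text)

iri = decode_numeric_escapes(r"http://a.example/\u0041\U0001F600")
```

The escape markers are matched case-sensitively, so `\u` takes exactly four hexadecimal digits and `\U` takes exactly eight.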
A string escape sequence represents a character traditionally escaped in string literals:
| Escape sequence | Unicode code point |
|---|---|
| `\t` | U+0009 |
| `\b` | U+0008 |
| `\n` | U+000A |
| `\r` | U+000D |
| `\f` | U+000C |
| `\"` | U+0022 |
| `\'` | U+0027 |
| `\\` | U+005C |
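Because both numeric and string escapes may occur in string literals, they are best decoded in a single left-to-right pass, so that an escaped backslash followed by `u0041` is not mistaken for a numeric escape. The following Python sketch is non-normative, and the names `ECHAR`, `_ESCAPE`, and `unescape_string` are introduced for illustration.

```python
import re

# String escapes from the table above: escaped character -> replacement.
ECHAR = {"t": "\t", "b": "\b", "n": "\n", "r": "\r", "f": "\f",
         '"': '"', "'": "'", "\\": "\\"}

# A backslash followed by a numeric escape, or by any single character.
_ESCAPE = re.compile(r"\\(?:u([0-9A-Fa-f]{4})|U([0-9A-Fa-f]{8})|(.)?)", re.DOTALL)

def unescape_string(body: str) -> str:
    """Decode numeric and string escapes in the body of a string literal."""
    def repl(m: "re.Match[str]") -> str:
        hex4, hex8, ch = m.groups()
        if hex4 or hex8:
            cp = int(hex4 or hex8, 16)
            if 0xD800 <= cp <= 0xDFFF or cp > 0x10FFFF:
                raise ValueError("invalid code point in " + m.group(0))
            return chr(cp)
        if ch in ECHAR:
            return ECHAR[ch]
        raise ValueError("invalid escape sequence: " + repr(m.group(0)))
    return _ESCAPE.sub(repl, body)

decoded = unescape_string(r'line1\nline2\t\"q\"\\')
```

Any backslash that does not begin one of the listed escapes, including a trailing lone backslash, is reported as an error in this sketch.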
A reserved character escape sequence consists of a `\` followed by one of these characters `~.-!$&'()*+,;=/?#@%_`, and represents the character to the right of the `\`.
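As a non-normative illustration, reserved character escapes in a local name can be removed as follows in Python. The function name `unescape_local_name` is hypothetical, and the sketch treats a backslash before any other character as an error.

```python
import re

RESERVED = "~.-!$&'()*+,;=/?#@%_"

# A backslash followed by at most one character (none at end of input).
_LOCAL_ESCAPE = re.compile(r"\\(.?)", re.DOTALL)

def unescape_local_name(local: str) -> str:
    """Replace each reserved character escape with the escaped character."""
    def repl(m: "re.Match[str]") -> str:
        ch = m.group(1)
        if ch == "" or ch not in RESERVED:
            raise ValueError("invalid escape in local name: " + repr(m.group(0)))
        return ch
    return _LOCAL_ESCAPE.sub(repl, local)

local = unescape_local_name(r"item\~1\.0\%")
```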
| | numeric escapes | string escapes | reserved character escapes |
|---|---|---|---|
| IRIs, used as RDF terms or in PREFIX or BASE declarations | yes | no | no |
| local names | no | no | yes |
| Strings | yes | yes | no |
%-encoded sequences are in the character range for IRIs and are explicitly allowed in local names. These appear as a % followed by two hex characters and represent that same sequence of three characters. These sequences are not decoded during processing.
A term written as <http://a.example/%66oo-bar> designates the IRI http://a.example/%66oo-bar and not the IRI http://a.example/foo-bar.
A term written as ex:%66oo-bar with a prefix PREFIX ex: <http://a.example/> also designates the IRI http://a.example/%66oo-bar.
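The following non-normative Python sketch shows the same behavior: expanding a prefixed name is plain concatenation, and the %-encoded sequence is carried through unchanged. The function `expand_prefixed_name` is a hypothetical helper. The call to `urllib.parse.unquote` is included only to show what decoding would produce, which is a different IRI.

```python
from typing import Dict
from urllib.parse import unquote

def expand_prefixed_name(pname: str, prefixes: Dict[str, str]) -> str:
    """Concatenate the namespace IRI and the local name; no %-decoding."""
    prefix, _, local = pname.partition(":")
    return prefixes[prefix] + local

prefixes = {"ex": "http://a.example/"}
iri = expand_prefixed_name("ex:%66oo-bar", prefixes)

# Decoding is NOT part of processing; it would yield a different IRI.
decoded = unquote(iri)
```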
The EBNF used here is defined in XML 1.0 [[!EBNF-NOTATION]].
Notes:
'a' which is case-sensitive.
UCHAR and ECHAR are case sensitive.
RuleSet.
A text version of this grammar is available here.
This document uses some specific terminal literal strings [[EBNF-NOTATION]]. To clarify the Unicode code points used for these terminal literal strings, the following table describes specific characters used in this section.
| Code | Glyph | Description |
|---|---|---|
| U+000A | LF | Line feed |
| U+000D | CR | Carriage return |
| U+0023 | `#` | Number sign |
| U+0025 | `%` | Percent sign |
| U+005C | `\` | Backslash |
The Internet Media Type (formerly known as MIME Type) for the Shapes Rules Language is "application/shape-rules".
The information that follows has been submitted to the Internet Engineering Steering Group (IESG) for review, approval, and registration with IANA.
version: The values of `version` are defined in Version Labels.
profile: The `profile` parameter is a non-empty list of space-separated URIs. For more information and background, please refer to [[RFC6906]].
TODO
TODO
TODO
TODO