This document describes Shapes Constraint Language (SHACL) Node Expressions.
This specification is published by the Data Shapes Working Group.
Node expressions
Some examples in this document use Turtle [[rdf12-turtle]]. The reader is expected to be familiar with SHACL [[shacl12-core]] and SPARQL [[sparql12-query]].
Within this document, the following namespace prefix bindings are used:
Prefix | Namespace |
---|---|
rdf: | http://www.w3.org/1999/02/22-rdf-syntax-ns# |
rdfs: | http://www.w3.org/2000/01/rdf-schema# |
sh: | http://www.w3.org/ns/shacl# |
skos: | http://www.w3.org/2004/02/skos/core# |
xsd: | http://www.w3.org/2001/XMLSchema# |
ex: | http://example.com/ns# |
Formal definitions appear in blue boxes:
# This box contains textual definitions.
Grey boxes such as this include syntax rules that apply to the shapes graph.
true denotes the RDF term "true"^^xsd:boolean.
false denotes the RDF term "false"^^xsd:boolean.
TODO
This section is informative.
SHACL shapes graphs can declare node expressions as values of various properties where dynamic computation is useful, such as sh:targetNode, sh:values and sh:deactivated.
Node expressions are represented by RDF nodes and can be evaluated to produce a list of output nodes.
For example, when used at sh:targetNode, a node expression produces the list of target nodes of a shape.
When used at sh:values, a node expression produces the derived values for the property specified by sh:path.
The following example contains a node expression that states that the target nodes of the shape ex:EstonianCompanyShape are the instances of ex:Company where the ex:headQuarterCountry is ex:Estonia.
The following diagram illustrates how this node expression is interpreted, from a logical point of view.
During validation, a SHACL processor will determine the target nodes of the shape by evaluating the filterShape expression.
The filterShape expression, however, first evaluates its input expression, which is specified via sh:nodes and is an instancesOf expression.
This will produce all instances of the given class ex:Company.
The sh:filterShape is then applied to all of these instances, to only keep the companies that conform to the provided shape by having their headquarters in Estonia.
The same scenario as above can also be expressed using SPARQL select expressions.
Specific implementations of the SHACL node expressions may (for example, for performance reasons) internally convert node expressions such as the sh:filterShape above to SPARQL.
The next example uses a node expression to compute the values of the property ex:employeeCount as the number of values of the property ex:employee at each instance of ex:Company.
A difference between this example and the previous examples about sh:targetNode is that these node expressions are evaluated against a given focus node.
So when a data visualization needs to render an instance of ex:Company, the currently displayed company is the focus node, from which the number of employees will be fetched.
Clarify when these derived properties can be used (e.g. in sh:path expressions but not as triples elsewhere)
This section introduces the general syntax of SHACL node expressions.
The term node expression function refers to the kind or type of a node expression.
For example, sh:FilterShapeExpression is a node expression function, while a specific instance of this function in the graph is the node expression itself.
The most basic node expression functions are constant node expressions, which are either literals or IRIs and simply evaluate to these constants. All other node expressions are represented by blank nodes and come in the following two variations.
The node expression functions in this section are called constant node expressions. They were introduced in the SHACL Core specification but are repeated here to keep this document self-contained.
A node expression that is an IRI is called an IRI expression with the function name sh:IRIExpression.
A node in an RDF graph is a well-formed IRI expression if it is an IRI.
The output nodes of an IRI expression are the list consisting of exactly the node expression itself:
evalExpr(expr, focusGraph, focusNode, scope) -> [expr]
A node expression that is a literal is called a literal expression with the function name sh:LiteralExpression.
A node in an RDF graph is a well-formed literal expression if it is a literal.
The output nodes of a literal expression are the list consisting of exactly the node expression itself:
evalExpr(expr, focusGraph, focusNode, scope) -> [expr]
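The two constant node expressions can be modeled in a few lines of Python. This is a minimal sketch rather than a normative implementation: the classes `IRI` and `Literal` and the function name `eval_constant` are assumptions made for illustration, and only the argument order mirrors evalExpr.

```python
from typing import NamedTuple, Union


class IRI(NamedTuple):
    """Hypothetical stand-in for an IRI node."""
    value: str


class Literal(NamedTuple):
    """Hypothetical stand-in for a literal: lexical form plus datatype IRI."""
    lexical: str
    datatype: str


Term = Union[IRI, Literal]


def eval_constant(expr: Term, focus_graph=None, focus_node=None, scope=None) -> list:
    """Evaluate an IRI expression or a literal expression.

    The arguments after expr are ignored: a constant always evaluates to
    the one-element list [expr].
    """
    if not isinstance(expr, (IRI, Literal)):
        raise TypeError("not a constant node expression")
    return [expr]
```

The focus graph, focus node and scope play no role here, which is why constants can be used wherever a node expression is expected.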
A named parameter function is a node expression function that is represented by a blank node that is the subject of at least one triple where the predicate can be used to uniquely identify the function, which is known as the key parameter.
The evaluation of a named parameter function can produce either:
For example, the named parameter function sh:FilterShapeExpression has sh:filterShape as its key parameter.
In this document, key parameters are marked in bold face.
Expressions based on named parameter functions often take other node expressions as arguments, evaluate those input node expressions and then produce a different list of nodes as output nodes.
The remainder of this section is informative.
This document includes many examples of named parameter functions, such as the Estonian Company Shape example.
A list parameter function is a node expression function that is represented by a blank node that is the subject of a single triple where the object is a SHACL list. The predicate of this triple is called the list parameter property.
The evaluation of a list parameter function can produce either:
Furthermore, all arguments of a list parameter function must evaluate to individual nodes and not lists of nodes. If an argument is a node expression then this node expression must evaluate to at most one output node. An evaluation failure must be produced if there is more than one output node. This is different from named parameter functions, where arguments may produce lists of multiple nodes.
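The single-node rule for arguments of list parameter functions could be enforced along the following lines. This is a sketch under assumptions: the exception class `EvaluationFailure` and the helper `single_argument` are hypothetical names, and representing an argument without output nodes as `None` is a choice of this sketch, not something the text above prescribes.

```python
class EvaluationFailure(Exception):
    """Hypothetical signal for a failed node expression evaluation."""


def single_argument(output_nodes: list):
    """Reduce the output nodes of one argument to a single node.

    Returns None when the argument produced no node, the node itself when
    it produced exactly one, and raises EvaluationFailure otherwise.
    """
    if len(output_nodes) > 1:
        raise EvaluationFailure(
            f"argument produced {len(output_nodes)} output nodes, at most 1 allowed"
        )
    return output_nodes[0] if output_nodes else None
```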
The remainder of this section is informative.
Note that some named parameter functions such as sh:IntersectionExpression also use SHACL lists as object of the key parameter, similar to list parameter functions, which always have SHACL lists as object of their list parameter property.
However, these may produce more than one output node and also accept lists as input nodes.
The following example uses multiple (imaginary) list parameter functions ex:coalesce and ex:concat to compute the ex:displayName of a person as either the value of ex:fullName or (if that doesn't exist) as the concatenation of first and last names, with a space in between.
Node expressions may produce a failure instead of a list of output nodes.
Some node expressions evaluate other, nested node expressions.
For example, if expressions evaluate nested expressions for sh:if, sh:then and sh:else.
In general, if any such nested expression produces a failure then the surrounding expression also produces the same failure.
The remainder of this section is informative.
Note that this policy impacts the evaluation order of node expressions.
For example, sh:if expressions are evaluated first and sh:then will be evaluated only if the sh:if has returned ( true ).
Even if the sh:else branch would produce a failure, the output would still only be the output nodes of the sh:then branch.
This section defines all node expression functions that are built into SHACL engines that implement this specification.
The syntax definitions of node expression functions that are based on blank nodes typically use a table of properties that these blank nodes can or must have. Such blank nodes are only well-formed when they are not the subject of any other triples, and when none of these properties is used more than once. The tables may also list SHACL constraints with which the property values are required to conform. In the tables, mandatory properties are rendered in bold face.
A blank node that is the subject of the following properties is called a var expression with the function name sh:VarExpression:
Property | Constraints | Description |
---|---|---|
sh:var | sh:datatype xsd:string, sh:minLength 1 | The variable name, e.g. "focusNode". |
Let var be the value of sh:var in the var expression.
The output nodes of the var expression are computed as follows, in order:
1. If var is "focusNode" then evalExpr(expr, focusGraph, focusNode, scope) -> [focusNode]
2. Otherwise, if var is in the scope then evalExpr(expr, focusGraph, focusNode, scope) -> [ scope[var] ]
3. Otherwise, evalExpr(expr, focusGraph, focusNode, scope) -> []
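The ordered rules for var expressions translate directly into a short function. The sketch below assumes that the scope is a Python dict from variable names to nodes and that nodes are plain strings; the function name `eval_var` is hypothetical.

```python
def eval_var(var: str, focus_node, scope: dict) -> list:
    """Evaluate a var expression following the rules in order."""
    # First rule: the name "focusNode" always refers to the focus node,
    # even if the scope happens to bind the same name.
    if var == "focusNode":
        return [focus_node]
    # Second rule: any other name is looked up in the scope.
    if var in scope:
        return [scope[var]]
    # Last rule: an unbound variable yields no output nodes.
    return []
```

Because the rules are applied in order, a scope entry named "focusNode" can never shadow the actual focus node in this sketch.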
The remainder of this section is informative.
The following example illustrates the use of a var expression pointing at the current focus node to state that the default value of the ex:loves relationship is the current instance of ex:Person, creating a self-reference.
A blank node that is the subject of the following properties is called a list expression with the function name sh:ListExpression:
Property | Constraints | Description |
---|---|---|
rdf:first | | The first member of the list. |
rdf:rest | Must be a well-formed SHACL list. | The rest of the list, e.g. rdf:nil. |
The output nodes of a list expression are the members of the SHACL list.
The remainder of this section is informative.
Note that rdf:nil itself is not a list expression because it will be interpreted as an IRI expression.
As a result, all well-formed list expressions have at least one member.
The following example declares a property for instances of rdfs:Class where the values are derived from the values of the path rdfs:subClassOf* but with the constants from the list ( owl:Thing rdfs:Resource ) removed using sh:minus.
A blank node that is the subject of the following properties is called a path expression with the function name sh:PathExpression:
Property | Constraints | Description |
---|---|---|
sh:path | Must be a well-formed SHACL property path. | The path to get the values from. |
sh:nodes | Optional, must be a well-formed node expression. | A node expression producing the focus nodes, defaulting to the current focus node from the evaluation context. |
Let path be the value of sh:path, and nodes be the value of sh:nodes in the path expression.
If sh:nodes is not given, nodes is the list consisting of exactly the focus node.
Let N be the nodes produced by evalExpr(nodes, focusGraph, focusNode, scope).
The output nodes of the path expression are the list of nodes produced by concatenating the value nodes of the path at each node in N.
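The concatenation step for path expressions can be sketched as follows. Several simplifications are assumed: the graph is a Python list of (subject, predicate, object) tuples, only single-predicate paths are handled, and `nodes` is passed as an already evaluated list. The names `value_nodes` and `eval_path` are hypothetical. The sketch keeps duplicates, which is one possible reading of the definition.

```python
from typing import Optional


def value_nodes(graph: list, node: str, predicate: str) -> list:
    """Return the objects of all triples (node, predicate, ?o), in graph order."""
    return [o for (s, p, o) in graph if s == node and p == predicate]


def eval_path(graph: list, predicate: str, focus_node: str,
              nodes: Optional[list] = None) -> list:
    """Concatenate the value nodes of the path at each node in N."""
    # Without sh:nodes, N is the list holding exactly the focus node.
    n = [focus_node] if nodes is None else nodes
    output = []
    for node in n:
        output.extend(value_nodes(graph, node, predicate))
    return output
```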
TODO: Clarify if those can contain duplicates.
The remainder of this section is informative.
The following example illustrates the use of a path expression to compute the value of the property ex:topConceptCount.
The path expression returns the values of skos:hasTopConcept at each skos:ConceptScheme and these are processed by the sh:count to return the number of top concepts.
TODO: Add second example that uses sh:nodes
A blank node that is the subject of the following properties is called an exists expression with the function name sh:ExistsExpression:
Property | Constraints | Description |
---|---|---|
sh:exists | A well-formed node expression. | A node expression. If this evaluates to a list with at least one member then the output nodes are ( true ); otherwise, the output nodes are ( false ). |
Let exists be the value of sh:exists in the exists expression.
Let N be the list of nodes produced by evalExpr(exists, focusGraph, focusNode, scope).
The output nodes of the exists expression are ( true ) if and only if N has at least one member; otherwise, the output nodes are ( false ).
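As a minimal sketch, the exists expression reduces to a length check on the already evaluated input. Literals are modeled here as (lexical form, datatype IRI) tuples, and the name `eval_exists` is hypothetical.

```python
XSD_BOOLEAN = "http://www.w3.org/2001/XMLSchema#boolean"
TRUE = ("true", XSD_BOOLEAN)    # stands for "true"^^xsd:boolean
FALSE = ("false", XSD_BOOLEAN)  # stands for "false"^^xsd:boolean


def eval_exists(input_nodes: list) -> list:
    """Return ( true ) if the input has at least one member, else ( false )."""
    return [TRUE] if len(input_nodes) >= 1 else [FALSE]
```

Note that the result is always a one-element list, so an exists expression never produces an empty list of output nodes.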
The remainder of this section is informative.
The Example for sh:if uses sh:exists.
A blank node that is the subject of the following properties is called an if expression with the function name sh:IfExpression:
Property | Constraints | Description |
---|---|---|
sh:if | A well-formed node expression. | A node expression. The sh:then branch is returned when the sh:if expression returns true as its only output node, in all other cases sh:else. |
sh:then | A well-formed node expression. Optional but at least one of sh:then or sh:else is required. | The node expression that is returned when the sh:if evaluated to [true]. |
sh:else | A well-formed node expression. Optional but at least one of sh:then or sh:else is required. | The node expression that is returned when the sh:if did not evaluate to [true]. |
Let if be the value of sh:if, then be the value of sh:then, and else be the value of sh:else for the if expression.
Let IFs be the nodes produced by evalExpr(if, focusGraph, focusNode, scope).
If IFs is the list ( true ), then the output nodes of the if expression are the nodes produced by evalExpr(then, focusGraph, focusNode, scope), or the empty list if then has no value.
Otherwise, the output nodes are the nodes produced by evalExpr(else, focusGraph, focusNode, scope), or the empty list if else has no value.
Implementations MUST apply lazy evaluation techniques, so the sh:then or sh:else branches are only evaluated when necessary.
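One way to obtain the required lazy behavior is to pass each nested expression as a zero-argument callable that is only invoked when its result is needed. The sketch below assumes this representation; `eval_if` and its parameter names are hypothetical, and literals are modeled as (lexical form, datatype IRI) tuples.

```python
XSD_BOOLEAN = "http://www.w3.org/2001/XMLSchema#boolean"
TRUE = ("true", XSD_BOOLEAN)  # stands for "true"^^xsd:boolean


def eval_if(if_expr, then_expr=None, else_expr=None) -> list:
    """Evaluate an if expression whose parts are zero-argument callables.

    Only the selected branch is called, so a failure in the other branch
    can never surface.
    """
    if then_expr is None and else_expr is None:
        raise ValueError("at least one of then_expr or else_expr is required")
    # The then branch is chosen only if the condition is exactly the list ( true ).
    branch = then_expr if if_expr() == [TRUE] else else_expr
    # A missing branch produces the empty list.
    return branch() if branch is not None else []
```

In this sketch a condition that yields two nodes, even two copies of true, selects the else branch, because the condition must be exactly the list ( true ).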
The remainder of this section is informative.
The following example illustrates the use of sh:if to compute the values of a derived property ex:fillColor that may be queried to compute the colors of cities on a map.
In the example, instances of ex:City that have a value for ex:capitalOf will be displayed in "blue", while the others will be "red".
A blank node that is the subject of the following properties is called a distinct expression with the function name sh:DistinctExpression:
Property | Constraints | Description |
---|---|---|
sh:distinct | A well-formed node expression. | The node expression that shall be reduced to its distinct members. |
Let distinct be the value of sh:distinct in the distinct expression.
Let input be the output nodes of evalExpr(distinct, focusGraph, focusNode, scope).
The output nodes of the distinct expression are the list of nodes in input in the same order but with duplicates eliminated (the first occurrence of each node shall be kept, the others removed).
Nodes are compared using term equality, i.e. "01"^^xsd:integer is distinct from "1"^^xsd:integer.
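Order-preserving duplicate elimination under term equality can be sketched as follows. Literals are modeled as (lexical form, datatype IRI) tuples, so Python tuple equality behaves like term equality for this purpose; the function name `eval_distinct` is hypothetical.

```python
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


def eval_distinct(input_nodes: list) -> list:
    """Keep the first occurrence of each node and drop later repeats."""
    seen = set()
    output = []
    for node in input_nodes:
        if node not in seen:
            seen.add(node)
            output.append(node)
    return output
```

Since the comparison is on terms rather than values, the literals with lexical forms "01" and "1" both survive even though they denote the same number.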
The remainder of this section is informative.
The following example declares a derived property ex:superClassesIncludingRoot that is computed as the union of the (transitive) values of rdfs:subClassOf and the list expression ( rdfs:Resource ).
Since the asserted values of rdfs:subClassOf may already include rdfs:Resource (for example, due to an active inference engine on the data graph), sh:distinct will make sure that the output nodes do not include rdfs:Resource twice.
A blank node that is the subject of the following properties is called an intersection expression with the function name sh:IntersectionExpression:
Property | Constraints | Description |
---|---|---|
sh:intersection | A well-formed SHACL list where each member is a well-formed node expression. | The node expressions that shall be intersected. |
Let members be the members of the value of sh:intersection in the intersection expression.
The output nodes of the intersection expression are the nodes that form the intersection of the output nodes produced by each node expression NE in members, using evalExpr(NE, focusGraph, focusNode, scope).
Nodes must be equal using term equality, i.e. "01"^^xsd:integer is distinct from "1"^^xsd:integer.
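The following sketch computes the intersection over already evaluated member outputs. The definition above does not state an output order, so keeping the order (and any repeats) of the first member is an assumption of this sketch, as is the name `eval_intersection`. Literals are modeled as (lexical form, datatype IRI) tuples.

```python
def eval_intersection(member_outputs: list) -> list:
    """Return the nodes of the first member that occur in every other member.

    member_outputs is a list of lists, one per member expression.
    """
    if not member_outputs:
        return []
    first, *rest = member_outputs
    rest_sets = [set(nodes) for nodes in rest]
    return [node for node in first if all(node in s for s in rest_sets)]
```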
The remainder of this section is informative.
The following example uses sh:intersection as a sh:targetNode node expression.
This shape will target all nodes that are SHACL instances of ex:Australian and ex:German at the same time.
A blank node that is the subject of the following properties is called a union expression with the function name sh:UnionExpression:
Property | Constraints | Description |
---|---|---|
sh:union | A well-formed SHACL list where each member is a well-formed node expression. | The node expressions that shall be unioned. |
Let members be the members of the value of sh:union in the union expression.
The output nodes of the union expression are the concatenation of all output nodes for each node expression NE in members, using evalExpr(NE, focusGraph, focusNode, scope).
The order is preserved, evaluating the members from left to right and keeping the order of each list of output nodes.
The remainder of this section is informative.
Note that a union expression may produce duplicate output nodes if the individual output nodes overlap. Use sh:distinct to eliminate duplicates.
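The interplay of union and distinct can be sketched in Python. The functions `eval_union` and `distinct` are hypothetical names operating on already evaluated lists of nodes, with nodes modeled as plain strings.

```python
def eval_union(member_outputs: list) -> list:
    """Concatenate the member outputs from left to right, keeping duplicates."""
    output = []
    for nodes in member_outputs:
        output.extend(nodes)
    return output


def distinct(nodes: list) -> list:
    """Drop repeats while keeping first occurrences in order."""
    # dict preserves insertion order, so the keys are the de-duplicated list.
    return list(dict.fromkeys(nodes))


superclasses = ["ex:Animal", "rdfs:Resource"]
with_root = eval_union([superclasses, ["rdfs:Resource"]])
without_repeats = distinct(with_root)
```

Here `with_root` contains rdfs:Resource twice, while `without_repeats` contains it once.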
The Example for sh:distinct uses sh:union.
A blank node that is the subject of the following properties is called a minus expression with the function name sh:MinusExpression:
Property | Constraints | Description |
---|---|---|
sh:minus | A well-formed node expression. | The nodes that shall be removed from the sh:nodes. |
sh:nodes | A well-formed node expression. | The input nodes. |
Let minus be the value of sh:minus and nodes be the value of sh:nodes in the minus expression.
Let M be the output nodes of evalExpr(minus, focusGraph, focusNode, scope).
Let N be the output nodes of evalExpr(nodes, focusGraph, focusNode, scope).
The output nodes of the minus expression are the nodes in N except those that are also in M, preserving the order of N.
Nodes must be equal using term equality, i.e. "01"^^xsd:integer is distinct from "1"^^xsd:integer.
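A minimal sketch of the minus step, assuming both inputs are already evaluated lists and nodes are hashable Python values; the name `eval_minus` is hypothetical.

```python
def eval_minus(nodes: list, minus: list) -> list:
    """Return the members of nodes that do not occur in minus, in order."""
    removed = set(minus)
    return [node for node in nodes if node not in removed]
```

For instance, removing the constants owl:Thing and rdfs:Resource from a list of superclasses leaves the remaining classes in their original order.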
The remainder of this section is informative.
The List Expression example uses sh:minus.
A blank node that is the subject of the following properties is called a filterShape expression with the function name sh:FilterShapeExpression:
Property | Constraints | Description |
---|---|---|
sh:filterShape | A well-formed shape. | The shape that all input nodes need to conform to. |
sh:nodes | A well-formed node expression. | A node expression producing the nodes that are validated. |
Let filterShape be the value of sh:filterShape, and nodes be the value of sh:nodes in a filterShape expression.
The output nodes of the filterShape expression are the output nodes of evalExpr(nodes, focusGraph, focusNode, scope) except those that do not conform to the shape filterShape, preserving the order in the list.
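The filtering step can be sketched by treating shape conformance as a black-box predicate. In the sketch below, `conforms` is a hypothetical callable standing in for SHACL validation of one node against the shape, and the company data is made up for illustration.

```python
from typing import Callable


def eval_filter_shape(nodes: list, conforms: Callable[[str], bool]) -> list:
    """Keep the input nodes that conform to the shape, in their original order."""
    return [node for node in nodes if conforms(node)]


# Made-up data: the headquarter country of three hypothetical companies.
HEADQUARTER_COUNTRY = {
    "ex:CompanyA": "ex:Estonia",
    "ex:CompanyB": "ex:Germany",
    "ex:CompanyC": "ex:Estonia",
}


def is_estonian(company: str) -> bool:
    """Stand-in for validating a company against an Estonian-company shape."""
    return HEADQUARTER_COUNTRY.get(company) == "ex:Estonia"


estonian = eval_filter_shape(["ex:CompanyA", "ex:CompanyB", "ex:CompanyC"], is_estonian)
```

A real processor would replace `is_estonian` with a full validation of each node against the value of sh:filterShape.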
The remainder of this section is informative.
The following example illustrates the use of sh:filterShape to return a subset of values of the ex:child property where the ex:gender property has the value "male".
A blank node that is the subject of the following properties is called a limit expression with the function name sh:LimitExpression:
Property | Constraints | Description |
---|---|---|
sh:limit | sh:datatype xsd:integer, sh:minInclusive 0 | The maximum number of nodes that shall be returned. |
sh:nodes | A well-formed node expression. | The input nodes. |
Let limit be the value of sh:limit and nodes be the value of sh:nodes in the limit expression.
Let N be the output nodes of evalExpr(nodes, focusGraph, focusNode, scope).
The output nodes of the limit expression are the first limit nodes in N from left to right, in the same order.
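In Python the limit step corresponds to a prefix slice. The sketch assumes an already evaluated input list and a limit given as a Python int; the name `eval_limit` is hypothetical.

```python
def eval_limit(nodes: list, limit: int) -> list:
    """Return at most the first `limit` nodes, in their original order."""
    if limit < 0:
        # Mirrors the sh:minInclusive 0 constraint on sh:limit.
        raise ValueError("limit must not be negative")
    return nodes[:limit]
```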
The remainder of this section is informative.
The following example illustrates the use of sh:limit to compute the values of a derived property ex:oldestChildren to be a sub-list of values of ex:child at the current focus node (which is an instance of the class ex:Person).
The values are computed by first fetching the values of ex:child, then ordering them by their ex:dateOfBirth, and finally keeping at most 2 of these children.
A blank node that is the subject of the following properties is called an offset expression with the function name sh:OffsetExpression:
Property | Constraints | Description |
---|---|---|
sh:offset | sh:datatype xsd:integer, sh:minInclusive 0 | The number of nodes that shall be skipped from the sh:nodes. |
sh:nodes | A well-formed node expression. | The input nodes. |
Let offset be the value of sh:offset and nodes be the value of sh:nodes in the offset expression.
Let N be the output nodes of evalExpr(nodes, focusGraph, focusNode, scope).
The output nodes of the offset expression are the nodes in N except for the first offset nodes from left to right, in the same order.
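The offset step corresponds to a suffix slice. As before, the sketch assumes an already evaluated input list and an offset given as a Python int; the name `eval_offset` is hypothetical.

```python
def eval_offset(nodes: list, offset: int) -> list:
    """Skip the first `offset` nodes and return the rest in order."""
    if offset < 0:
        # Mirrors the sh:minInclusive 0 constraint on sh:offset.
        raise ValueError("offset must not be negative")
    return nodes[offset:]
```

An offset that is at least as large as the input simply yields the empty list.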
The remainder of this section is informative.
The following example illustrates the use of sh:offset to compute the values of a derived property ex:remainingChildren to be a sub-list of values of ex:child at the current focus node (which is an instance of the class ex:Person).
The values are computed by first fetching the values of ex:child, then ordering them by their ex:dateOfBirth, and finally skipping the first of these children.
TODO: This should be cleaned up from the SHACL-AF definition and requires thought
The Example of sh:limit also illustrates sh:orderBy.
A blank node that is the subject of the following properties is called a count expression with the function name sh:CountExpression:
Property | Constraints | Description |
---|---|---|
sh:count | A well-formed node expression. | The input nodes that shall be counted. |
Let count be the value of sh:count in the count expression.
Let N be the output nodes of evalExpr(count, focusGraph, focusNode, scope).
The output nodes of the count expression are the list consisting of exactly one xsd:integer literal that is computed as the length of N.
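A sketch of the count step, with literals modeled as (lexical form, datatype IRI) tuples and the hypothetical function name `eval_count`. Because the result is the length of the list, repeated nodes in the input are counted individually.

```python
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


def eval_count(input_nodes: list) -> list:
    """Return a one-element list holding the input length as an xsd:integer."""
    return [(str(len(input_nodes)), XSD_INTEGER)]
```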
The remainder of this section is informative.
The following example illustrates the use of sh:count to derive a property ex:topConceptCount as the number of values of the skos:hasTopConcept property in a skos:ConceptScheme.
A blank node that is the subject of the following properties is called a min expression with the function name sh:MinExpression:
Property | Constraints | Description |
---|---|---|
sh:min | A well-formed node expression. | The input nodes from which the minimum value shall be returned. |
Let min be the value of sh:min in the min expression.
Let N be the output nodes of evalExpr(min, focusGraph, focusNode, scope).
The output nodes of the min expression are the list consisting of at most one node that is computed as the minimum value from N.
Clarify exactly how that is computed, maybe via reference to SPARQL MIN
The remainder of this section is informative.
The following example illustrates the use of sh:min to derive a property ex:minStartDate as the smallest of the values that can be reached using the property path ex:employee/ex:startDate.
In other words, it walks through all employees of the given company and returns the earliest date on which an employee started.
A blank node that is the subject of the following properties is called a max expression with the function name sh:MaxExpression:
Property | Constraints | Description |
---|---|---|
sh:max | A well-formed node expression. | The input nodes from which the maximum value shall be returned. |
Let max be the value of sh:max in the max expression.
Let N be the output nodes of evalExpr(max, focusGraph, focusNode, scope).
The output nodes of the max expression are the list consisting of at most one node that is computed as the maximum value from N.
Clarify exactly how that is computed, maybe via reference to SPARQL MAX
The remainder of this section is informative.
The Example for sh:min can be easily adapted for sh:max.
A blank node that is the subject of the following properties is called a sum expression with the function name sh:SumExpression:
Property | Constraints | Description |
---|---|---|
sh:sum | A well-formed node expression. | The input nodes from which the sum shall be returned. |
Let sum be the value of sh:sum in the sum expression.
Let N be the output nodes of evalExpr(sum, focusGraph, focusNode, scope).
The output nodes of the sum expression are the list consisting of exactly one node that is computed as the sum of all nodes from N.
Clarify exactly how that is computed, maybe via reference to SPARQL SUM
The remainder of this section is informative.
Note that sh:sum needs to be used with care and is easily misunderstood when used with property paths.
The problem is that when a path expression is used as input to a sum expression, the path expression will have eliminated duplicates before they can be processed by the sh:sum.
As a result, only the distinct values will be added up.
TODO: Find good example, or drop the feature if none makes sense
This section enumerates node expression functions that did not fit into other categories.
A blank node that is the subject of the following properties is called an instancesOf expression with the function name sh:InstancesOfExpression:
Property | Constraints | Description |
---|---|---|
sh:instancesOf | sh:nodeKind sh:IRI | The class that the output nodes must be instances of. |
Let type be the value of sh:instancesOf in an instancesOf expression.
The output nodes of the instancesOf expression are the nodes that are SHACL instances of type in the focus graph.
The remainder of this section is informative.
Note that the definition of SHACL instance includes instances of subclasses of the given class.
So if the focus graph contains ex:SubClass rdfs:subClassOf ex:SuperClass and ex:SubInstance a ex:SubClass then ex:SubInstance will also be returned as instance of ex:SuperClass.
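The subclass behavior can be sketched by first collecting the class together with its transitive subclasses and then gathering the nodes typed with any of them. The graph is assumed to be a Python list of (subject, predicate, object) tuples with prefixed names as plain strings; the function names are hypothetical, and returning the instances in triple order is a choice of this sketch.

```python
RDF_TYPE = "rdf:type"
RDFS_SUBCLASS_OF = "rdfs:subClassOf"


def class_and_subclasses(graph: list, cls: str) -> set:
    """Return cls together with all of its transitive subclasses."""
    found = {cls}
    stack = [cls]
    while stack:
        current = stack.pop()
        for s, p, o in graph:
            if p == RDFS_SUBCLASS_OF and o == current and s not in found:
                found.add(s)
                stack.append(s)
    return found


def eval_instances_of(graph: list, cls: str) -> list:
    """Return each node typed with cls or one of its subclasses, once."""
    classes = class_and_subclasses(graph, cls)
    output = []
    for s, p, o in graph:
        if p == RDF_TYPE and o in classes and s not in output:
            output.append(s)
    return output
```

The `found` set also protects the traversal against cycles in the rdfs:subClassOf hierarchy.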
The interpretation of sh:instancesOf is similar to sh:targetClass and sh:class.
Users of this node expression function should be aware that the list of output nodes may be very large.
The Example for sh:intersection uses sh:instancesOf.
TODO
TODO
Many people contributed to this document, including members of the RDF Data Shapes Working Group.
TODO