This document describes Shapes Constraint Language (SHACL) Node Expressions.

This specification is published by the Data Shapes Working Group.

Introduction

Node expressions

Terminology

Basic RDF Terminology
This document uses the terms RDF graph, RDF triple, IRI, literal, blank node, node of an RDF graph, datatype, RDF term, term equality, and subject, predicate, and object of RDF triples as defined in RDF 1.2 Concepts and Abstract Syntax [[!rdf12-concepts]].
Basic SHACL Terminology
This document uses the terms focus node, value, value node, constraint, constraint component, parameter, mandatory parameter, optional parameter, parameter value, shape, node shape, property shape, SHACL property path, SPARQL property path, shapes graph, target, validator, node expression, node expression function, function name, output nodes, focus graph, evaluation, evaluation failure, conform, failure, SHACL instance, SHACL subclass, SHACL type, SHACL list, members, and well-formed as defined in the SHACL 1.2 Core specification [[!shacl12-core]].

Document Conventions

Some examples in this document use Turtle [[rdf12-turtle]]. The reader is expected to be familiar with SHACL [[shacl12-core]] and SPARQL [[sparql12-query]].

Within this document, the following namespace prefix bindings are used:

Prefix Namespace
rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns#
rdfs: http://www.w3.org/2000/01/rdf-schema#
sh: http://www.w3.org/ns/shacl#
skos: http://www.w3.org/2004/02/skos/core#
xsd: http://www.w3.org/2001/XMLSchema#
ex: http://example.com/ns#

Formal definitions appear in blue boxes:

TEXTUAL DEFINITIONS
          # This box contains textual definitions. 

Grey boxes such as this include syntax rules that apply to the shapes graph.

true denotes the RDF term "true"^^xsd:boolean. false denotes the RDF term "false"^^xsd:boolean.

TODO

Getting started with Node Expressions

This section is informative.

SHACL shapes graphs can declare node expressions as values of various properties where dynamic computation is useful, such as sh:targetNode, sh:values and sh:deactivated. Node expressions are represented by RDF nodes and can be evaluated to produce a list of output nodes. For example, when used at sh:targetNode, a node expression produces the list of target nodes of a shape. When used at sh:values, a node expression produces the derived values for the property specified by sh:path.

The following example contains a node expression that states that the target nodes of the shape ex:EstonianCompanyShape are the instances of ex:Company where the ex:headQuarterCountry is ex:Estonia.

The following diagram illustrates how this node expression is interpreted, from a logical point of view. During validation, a SHACL processor will determine the target nodes of the shape by evaluating the filterShape expression. The filterShape expression, however, first evaluates its input expression, which is specified via sh:nodes and is an instancesOf expression. This will produce all instances of the given class ex:Company. The sh:filterShape is then applied to all of these instances, to keep only the companies that conform to the provided shape by having their headquarters in Estonia.

Illustration of the data flow between node expressions

The same scenario as above can also be expressed using SPARQL select expressions. Specific implementations of the SHACL node expressions may (for example, for performance reasons) internally convert node expressions such as the sh:filterShape above to SPARQL.

The next example uses a node expression to compute the values of the property ex:employeeCount as the number of values of the property ex:employee at each instance of ex:Company.

Illustration of the data flow between node expressions computing the employee count

A difference between this example and the previous examples about sh:targetNode is that these node expressions are evaluated against a given focus node. So when a data visualization needs to render an instance of ex:Company, the currently displayed company is the focus node, from which the number of employees will be fetched.

TODO: Clarify when these derived properties can be used (e.g. in sh:path expressions but not as triples elsewhere)

Node Expression Syntax

This section introduces the general syntax of SHACL node expressions.

The term node expression function refers to the kind or type of a node expression. For example, sh:FilterShapeExpression is a node expression function while a specific instance of this function in the graph is the node expression itself.

The most basic node expression functions are constant node expressions, which are either literals or IRIs and simply evaluate to these constants. All other node expressions are represented by blank nodes and come in two variations, named parameter functions and list parameter functions, which are described after the constant node expressions.

Constant Node Expressions

The node expression functions in this section are called constant node expressions. They were introduced in the SHACL Core specification but are repeated here to keep this document self-contained.

IRI Expressions

A node expression that is an IRI is called an IRI expression with the function name sh:IRIExpression.

A node in an RDF graph is a well-formed IRI expression if it is an IRI.

EVALUATION OF IRI EXPRESSIONS

The output nodes of an IRI expression are the list consisting of exactly the node expression itself:

evalExpr(expr, focusGraph, focusNode, scope) -> [expr]

Literal Expressions

A node expression that is a literal is called a literal expression with the function name sh:LiteralExpression.

A node in an RDF graph is a well-formed literal expression if it is a literal.

EVALUATION OF LITERAL EXPRESSIONS

The output nodes of a literal expression are the list consisting of exactly the node expression itself:

evalExpr(expr, focusGraph, focusNode, scope) -> [expr]
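
As a non-normative illustration of the two definitions above, the following Python sketch models RDF terms as small immutable values. The names `IRI`, `Literal` and `eval_constant` are hypothetical and chosen only for this sketch; they are not defined by SHACL.

```python
from dataclasses import dataclass

XSD = "http://www.w3.org/2001/XMLSchema#"


@dataclass(frozen=True)
class IRI:
    value: str


@dataclass(frozen=True)
class Literal:
    lexical: str
    datatype: str = XSD + "string"


def eval_constant(expr, focus_graph=None, focus_node=None, scope=None):
    """Constant node expressions ignore the context and return themselves."""
    if not isinstance(expr, (IRI, Literal)):
        raise TypeError("not a constant node expression")
    return [expr]


estonia = IRI("http://example.com/ns#Estonia")
answer = Literal("42", XSD + "integer")
```

Whatever focus graph, focus node or scope is supplied, the result is always the one-element list containing the expression itself.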

Node Expressions based on Blank Nodes

Named Parameter Functions

A named parameter function is a node expression function that is represented by a blank node that is the subject of at least one triple where the predicate can be used to uniquely identify the function, which is known as the key parameter.

The evaluation of a named parameter function can produce either a list of output nodes or a failure.

For example, the named parameter function sh:FilterShapeExpression has sh:filterShape as its key parameter. In this document, key parameters are marked in bold face.

Expressions based on named parameter functions often take other node expressions as arguments, evaluate those input node expressions and then produce a different list of nodes as output nodes.

The remainder of this section is informative.

This document includes many examples of named parameter functions, such as the Estonian Company Shape example.

List Parameter Functions

A list parameter function is a node expression function that is represented by a blank node that is the subject of a single triple where the object is a SHACL list. The predicate of this triple is called the list parameter property.

The evaluation of a list parameter function can produce either a list with at most one output node or a failure.

Furthermore, all arguments of a list parameter function must evaluate to individual nodes and not lists of nodes. If an argument is a node expression then this node expression must evaluate to at most one output node. An evaluation failure must be produced if there is more than one output node. This is different from named parameter functions, where arguments may produce lists of multiple nodes.

The remainder of this section is informative.
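
The single-node rule for arguments of list parameter functions stated above can be sketched as follows. This is a minimal illustration only: `EvaluationFailure` and `single_argument` are hypothetical names, and representing an absent value as `None` is an assumption of the sketch.

```python
class EvaluationFailure(Exception):
    """Stands in for an evaluation failure in this sketch."""


def single_argument(output_nodes):
    """Reduce the output nodes of one argument to a single node.

    Returns None when the argument produced no output node and raises
    EvaluationFailure when it produced more than one.
    """
    if len(output_nodes) > 1:
        raise EvaluationFailure("argument produced more than one output node")
    return output_nodes[0] if output_nodes else None
```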

Note that some named parameter functions such as sh:IntersectionExpression also use SHACL lists as the object of the key parameter, similar to list parameter functions which always have SHACL lists as the object of their list parameter property. However, these may produce more than one output node and also accept lists as input nodes.

The following example uses multiple (imaginary) list parameter functions ex:coalesce and ex:concat to compute the ex:displayName of a person as either the value of ex:fullName or (if that doesn't exist) as the concatenation of first and last names, with a space in between.

Handling of Failures

Node expressions may produce a failure instead of a list of output nodes. Some node expressions evaluate other, nested node expressions. For example, If Expressions evaluate nested expressions for sh:if, sh:then and sh:else. In general, if any such nested expressions produce a failure then the surrounding expression also produces the same failure.

The remainder of this section is informative.

Note that this policy impacts the evaluation order of node expressions. For example, sh:if expressions are evaluated first and sh:then will be evaluated only if the sh:if has returned ( true ). Even if the sh:else branch would produce a failure, the output would still only be the output nodes of the sh:then branch.

Node Expressions Library

This section defines all node expression functions that are built into SHACL engines that implement this specification.

The syntax definitions of node expression functions that are based on blank nodes typically use a table of properties that these blank nodes can or must have. Such blank nodes are only well-formed when they are not the subject of any other triples, and when none of these properties is used more than once. The tables may also list SHACL constraints with which the property values are required to conform. In the tables, mandatory properties are rendered in bold face.

Basic Node Expressions

Var Expressions

A blank node that is the subject of the following properties is called a var expression with the function name sh:VarExpression:

Property Constraints Description
sh:var sh:datatype xsd:string; sh:minLength 1. The variable name, e.g. "focusNode".

EVALUATION OF VAR EXPRESSIONS

Let var be the value of sh:var in the var expression. The output nodes of the var expression are computed as follows, in order:

  1. if var is "focusNode" then evalExpr(expr, focusGraph, focusNode, scope) -> [focusNode]
  2. if var is in the scope then evalExpr(expr, focusGraph, focusNode, scope) -> [ scope[var] ]
  3. otherwise evalExpr(expr, focusGraph, focusNode, scope) -> []

The remainder of this section is informative.
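
The three evaluation rules above can be read as an ordered case distinction. The following Python sketch is one possible rendering, assuming the scope is a dictionary from variable names to nodes; `eval_var` is a hypothetical name.

```python
def eval_var(var, focus_node, scope):
    """Evaluate a var expression by applying the three rules in order."""
    if var == "focusNode":
        # Rule 1: the reserved name always yields the focus node.
        return [focus_node]
    if var in scope:
        # Rule 2: any other name is looked up in the scope.
        return [scope[var]]
    # Rule 3: unknown variables produce no output nodes.
    return []
```

Because the rules are applied in order, a scope entry named "focusNode" would never be consulted in this sketch.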

The following example illustrates the use of a var expression pointing at the current focus node to state that the default value of the ex:loves relationship is the current instance of ex:Person, creating a self-reference.

List Expressions

A blank node that is the subject of the following properties is called a list expression with the function name sh:ListExpression:

Property Constraints Description
rdf:first The first member of the list.
rdf:rest Must be a well-formed SHACL list. The rest of the list, e.g. rdf:nil.

EVALUATION OF LIST EXPRESSIONS

The output nodes of a list expression are the members of the SHACL list.

The remainder of this section is informative.

Note that rdf:nil itself is not a list expression because it will be interpreted as an IRI expression. As a result, all well-formed list expressions have at least one member.

The following example declares a property for instances of rdfs:Class where the values are derived from the values of the path rdfs:subClassOf* but with the constants from the list ( owl:Thing rdfs:Resource ) removed using sh:minus.

Path Expressions

A blank node that is the subject of the following properties is called a path expression with the function name sh:PathExpression:

Property Constraints Description
sh:path Must be a well-formed SHACL property path. The path to get the values from.
sh:nodes Optional, must be a well-formed node expression. A node expression producing the focus nodes, defaulting to the current focus node from the evaluation context.

EVALUATION OF PATH EXPRESSIONS

Let path be the value of sh:path, and nodes be the value of sh:nodes in the path expression. If sh:nodes is not given, nodes is the list consisting of exactly the focus node. Let N be the nodes produced by evalExpr(nodes, focusGraph, focusNode, scope). The output nodes of the path expression are the list of nodes produced by concatenating the value nodes of the path at each node in N. TODO: Clarify if those can contain duplicates.

The remainder of this section is informative.
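
The concatenation step described above can be sketched in Python for the simplest case, a path that is a single predicate. The sketch assumes the graph is a list of (subject, predicate, object) tuples and keeps duplicates that arise across different input nodes, which is an assumption since that question is still open in this document. `eval_predicate_path` is a hypothetical name.

```python
def eval_predicate_path(triples, predicate, nodes):
    """Concatenate the values of one predicate at each node, in order."""
    output = []
    for node in nodes:
        for s, p, o in triples:
            if s == node and p == predicate:
                output.append(o)
    return output


triples = [
    ("ex:Scheme1", "skos:hasTopConcept", "ex:A"),
    ("ex:Scheme1", "skos:hasTopConcept", "ex:B"),
    ("ex:Scheme2", "skos:hasTopConcept", "ex:B"),
]
```

With the sample data, evaluating at ex:Scheme1 alone produces two nodes, while evaluating at both schemes produces ex:B twice.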

The following example illustrates the use of a path expression to compute the value of the property ex:topConceptCount. The path expression returns the values of skos:hasTopConcept at each skos:ConceptScheme and these are processed by the sh:count to return the number of top concepts.

TODO: Add second example that uses sh:nodes

Exists Expressions

A blank node that is the subject of the following properties is called an exists expression with the function name sh:ExistsExpression:

Property Constraints Description
sh:exists A well-formed node expression. A node expression. If this evaluates to a list with at least one member then the output nodes are ( true ); otherwise, the output nodes are ( false ).

EVALUATION OF EXISTS EXPRESSIONS

Let exists be the value of sh:exists in the exists expression. Let N be the list of nodes produced by evalExpr(exists, focusGraph, focusNode, scope). The output nodes of the exists expression are ( true ) if and only if N has at least one member; otherwise, the output nodes are ( false ).

The remainder of this section is informative.

The Example for sh:if uses sh:exists.

If Expressions

A blank node that is the subject of the following properties is called an if expression with the function name sh:IfExpression:

Property Constraints Description
sh:if A well-formed node expression. A node expression. The sh:then branch is returned when the sh:if expression returns true as its only output node, in all other cases sh:else.
sh:then A well-formed node expression. Optional but at least one of sh:then or sh:else is required. The node expression that is returned when the sh:if evaluated to [true].
sh:else A well-formed node expression. Optional but at least one of sh:then or sh:else is required. The node expression that is returned when the sh:if did not evaluate to [true].

EVALUATION OF IF EXPRESSIONS

Let if be the value of sh:if, then be the value of sh:then, and else be the value of sh:else for the if expression. Let IFs be the nodes produced by evalExpr(if, focusGraph, focusNode, scope). If IFs is the list ( true ), then the output nodes of the if expression are the nodes produced by evalExpr(then, focusGraph, focusNode, scope), or the empty list if then has no value. Otherwise, the output nodes are the nodes produced by evalExpr(else, focusGraph, focusNode, scope), or the empty list if else has no value. Implementations MUST apply lazy evaluation techniques, so the sh:then or sh:else branches are only evaluated when necessary.

The remainder of this section is informative.
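
The lazy evaluation requirement above can be sketched by passing the branches as zero-argument callables, so that a branch is only evaluated once it has been selected. This is a minimal sketch: `eval_if`, `failing_branch` and the tuple representation of the literal true are assumptions of the illustration.

```python
TRUE = ("true", "http://www.w3.org/2001/XMLSchema#boolean")


def eval_if(if_nodes, then_branch=None, else_branch=None):
    """Select a branch based on the already evaluated sh:if output nodes.

    then_branch and else_branch are zero-argument callables or None.
    Only the selected branch is ever called.
    """
    if if_nodes == [TRUE]:
        return then_branch() if then_branch is not None else []
    return else_branch() if else_branch is not None else []


def failing_branch():
    # Stand-in for a nested expression that produces a failure.
    raise RuntimeError("evaluation failure")
```

Calling `eval_if([TRUE], lambda: ["blue"], failing_branch)` returns `["blue"]` without ever touching the failing else branch.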

The following example illustrates the use of sh:if to compute the values of a derived property ex:fillColor that may be queried to compute the colors of cities on a map. In the example, instances of ex:City that have a value for ex:capitalOf will be displayed in "blue", while the others will be "red".

List Operator Expressions

Distinct Expressions

A blank node that is the subject of the following properties is called a distinct expression with the function name sh:DistinctExpression:

Property Constraints Description
sh:distinct A well-formed node expression. The node expression that shall be reduced to its distinct members.

EVALUATION OF DISTINCT EXPRESSIONS

Let distinct be the value of sh:distinct in the distinct expression. Let input be the output nodes of evalExpr(distinct, focusGraph, focusNode, scope). The output nodes of the distinct expression are the list of nodes in input in the same order but with duplicates eliminated (the first occurrence of each node shall be kept, the others removed). Nodes are compared using term equality, so, for example, "01"^^xsd:integer is distinct from "1"^^xsd:integer.

The remainder of this section is informative.
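
The order-preserving duplicate elimination defined above can be sketched as follows. The sketch assumes nodes are hashable Python values whose equality mirrors term equality, with literals modelled as (lexical form, datatype) tuples; `eval_distinct` is a hypothetical name.

```python
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


def eval_distinct(input_nodes):
    """Keep the first occurrence of each node and drop later ones."""
    seen = set()
    output = []
    for node in input_nodes:
        if node not in seen:
            seen.add(node)
            output.append(node)
    return output


nodes = [("01", XSD_INTEGER), "ex:a", ("1", XSD_INTEGER), "ex:a", ("01", XSD_INTEGER)]
```

Because the lexical forms differ, the two integer literals in `nodes` are both kept, while the repeated entries are dropped.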

The following example declares a derived property ex:superClassesIncludingRoot that is computed as the union of the (transitive) values of rdfs:subClassOf and the list expression ( rdfs:Resource ). Since the asserted values of rdfs:subClassOf may already include rdfs:Resource (for example, due to an active inference engine on the data graph), sh:distinct will make sure that the output nodes do not include rdfs:Resource twice.

Intersection Expressions

A blank node that is the subject of the following properties is called an intersection expression with the function name sh:IntersectionExpression:

Property Constraints Description
sh:intersection A well-formed SHACL list where each member is a well-formed node expression. The node expressions that shall be intersected.

EVALUATION OF INTERSECTION EXPRESSIONS

Let members be the members of the value of sh:intersection in the intersection expression. The output nodes of the intersection expression are the nodes that form the intersection of the output nodes produced by each node expression NE in members, using evalExpr(NE, focusGraph, focusNode, scope). Nodes are compared using term equality, so, for example, "01"^^xsd:integer is distinct from "1"^^xsd:integer.

The remainder of this section is informative.
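
The following Python sketch shows one way to compute such an intersection. The definition above does not state an order for the result, so keeping the order of the first member's output nodes is purely an assumption of this sketch, as is the name `eval_intersection`.

```python
def eval_intersection(member_outputs):
    """Intersect the output node lists of all members.

    member_outputs is a non-empty list of lists of hashable nodes.
    """
    first, *rest = member_outputs
    rest_sets = [set(nodes) for nodes in rest]
    return [node for node in first if all(node in s for s in rest_sets)]


australians = ["ex:Kim", "ex:Lee", "ex:Sam"]
germans = ["ex:Sam", "ex:Kim"]
```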

The following example uses sh:intersection as a sh:targetNode node expression. This shape will target all nodes that are SHACL instances of ex:Australian and ex:German at the same time.

Union Expressions

A blank node that is the subject of the following properties is called a union expression with the function name sh:UnionExpression:

Property Constraints Description
sh:union A well-formed SHACL list where each member is a well-formed node expression. The node expressions that shall be unioned.

EVALUATION OF UNION EXPRESSIONS

Let members be the members of the value of sh:union in the union expression. The output nodes of the union expression are the concatenation of all output nodes for each node expression NE in members, using evalExpr(NE, focusGraph, focusNode, scope). The order is preserved, evaluating the members from left to right and keeping the order of each list of output nodes.

The remainder of this section is informative.

Note that a union expression may produce duplicate output nodes if the individual output nodes overlap. Use sh:distinct to eliminate duplicates.
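
The behaviour described in the note above can be sketched in Python as plain list concatenation. `eval_union` is a hypothetical name, and the sample nodes are made up for illustration.

```python
from itertools import chain


def eval_union(member_outputs):
    """Concatenate the output node lists of all members, left to right."""
    return list(chain.from_iterable(member_outputs))


super_classes = ["ex:Animal", "rdfs:Resource"]
root = ["rdfs:Resource"]
combined = eval_union([super_classes, root])

# Order-preserving duplicate elimination, as sh:distinct would apply it.
deduplicated = list(dict.fromkeys(combined))
```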

The Example for sh:distinct uses sh:union.

Minus Expressions

A blank node that is the subject of the following properties is called a minus expression with the function name sh:MinusExpression:

Property Constraints Description
sh:minus A well-formed node expression. The nodes that shall be removed from the sh:nodes.
sh:nodes A well-formed node expression. The input nodes.

EVALUATION OF MINUS EXPRESSIONS

Let minus be the value of sh:minus and nodes be the value of sh:nodes in the minus expression. Let M be the output nodes of evalExpr(minus, focusGraph, focusNode, scope). Let N be the output nodes of evalExpr(nodes, focusGraph, focusNode, scope). The output nodes of the minus expression are the nodes in N except those that are also in M, preserving the order of N. Nodes are compared using term equality, so, for example, "01"^^xsd:integer is distinct from "1"^^xsd:integer.

The remainder of this section is informative.
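
As a minimal sketch of the definition above, assuming nodes are hashable Python values, the difference can be computed with a set lookup while iterating over N in order. `eval_minus` is a hypothetical name.

```python
def eval_minus(nodes, minus):
    """Return the nodes that are not in minus, preserving their order."""
    excluded = set(minus)
    return [node for node in nodes if node not in excluded]


super_classes = ["ex:Animal", "owl:Thing", "ex:LivingThing", "rdfs:Resource"]
roots = ["owl:Thing", "rdfs:Resource"]
```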

The List Expression example uses sh:minus.

FilterShape Expressions

A blank node that is the subject of the following properties is called a filterShape expression with the function name sh:FilterShapeExpression:

Property Constraints Description
sh:filterShape A well-formed shape. The shape that all input nodes need to conform to.
sh:nodes A well-formed node expression. A node expression producing the nodes that are validated.

EVALUATION OF FILTERSHAPE EXPRESSIONS

Let filterShape be the value of sh:filterShape, and nodes be the value of sh:nodes in a filterShape expression. The output nodes of the filterShape expression are the output nodes of evalExpr(nodes, focusGraph, focusNode, scope) except those that do not conform to the shape filterShape, preserving the order in the list.

The remainder of this section is informative.
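
The filtering step defined above can be sketched in Python by treating shape conformance as a black-box predicate. This is an illustration only: `eval_filter_shape`, `is_male` and the sample data are hypothetical, and a real implementation would validate each node against the shape instead of calling a Python function.

```python
def eval_filter_shape(nodes, conforms):
    """Keep the nodes for which conforms(node) is true, in input order."""
    return [node for node in nodes if conforms(node)]


# Hypothetical data: the ex:gender value of each child.
gender = {"ex:Tom": "male", "ex:Ann": "female", "ex:Max": "male"}


def is_male(node):
    # Stand-in for validating the node against a shape that requires
    # ex:gender to have the value "male".
    return gender.get(node) == "male"


sons = eval_filter_shape(["ex:Tom", "ex:Ann", "ex:Max"], is_male)
```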

The following example illustrates the use of sh:filterShape to return a subset of values of the ex:child property where the ex:gender property has the value "male".

Limit Expressions

A blank node that is the subject of the following properties is called a limit expression with the function name sh:LimitExpression:

Property Constraints Description
sh:limit sh:datatype xsd:integer; sh:minInclusive 0. The maximum number of nodes that shall be returned.
sh:nodes A well-formed node expression. The input nodes.

EVALUATION OF LIMIT EXPRESSIONS

Let limit be the value of sh:limit and nodes be the value of sh:nodes in the limit expression. Let N be the output nodes of evalExpr(nodes, focusGraph, focusNode, scope). The output nodes of the limit expression are the first limit nodes in N from left to right, in the same order.

The remainder of this section is informative.

The following example illustrates the use of sh:limit to compute the values of a derived property ex:oldestChildren to be a sub-list of values of ex:child at the current focus node (which is an instance of the class ex:Person). The values are computed by first fetching the values of ex:child, then ordering them by their ex:dateOfBirth, and finally getting only 2 of these children at most.

Offset Expressions

A blank node that is the subject of the following properties is called an offset expression with the function name sh:OffsetExpression:

Property Constraints Description
sh:offset sh:datatype xsd:integer; sh:minInclusive 0. The number of nodes that shall be skipped from the sh:nodes.
sh:nodes A well-formed node expression. The input nodes.

EVALUATION OF OFFSET EXPRESSIONS

Let offset be the value of sh:offset and nodes be the value of sh:nodes in the offset expression. Let N be the output nodes of evalExpr(nodes, focusGraph, focusNode, scope). The output nodes of the offset expression are the nodes in N except for the first offset nodes from left to right, in the same order.

The remainder of this section is informative.
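
Offset expressions and the limit expressions of the previous section both select a contiguous part of the input list, which maps directly onto list slicing. The following Python sketch assumes the input nodes have already been evaluated and ordered; `eval_limit`, `eval_offset` and the sample nodes are hypothetical.

```python
def eval_limit(nodes, limit):
    """Return at most the first `limit` nodes."""
    if limit < 0:
        raise ValueError("sh:limit must not be negative")
    return nodes[:limit]


def eval_offset(nodes, offset):
    """Return the nodes after skipping the first `offset` nodes."""
    if offset < 0:
        raise ValueError("sh:offset must not be negative")
    return nodes[offset:]


# Assumed to be ordered by ex:dateOfBirth already.
children = ["ex:Anna", "ex:Ben", "ex:Cleo"]
```

Nesting the two, for example `eval_limit(eval_offset(children, 1), 1)`, selects a single page of the input.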

The following example illustrates the use of sh:offset to compute the values of a derived property ex:remainingChildren to be a sub-list of values of ex:child at the current focus node (which is an instance of the class ex:Person). The values are computed by first fetching the values of ex:child, then ordering them by their ex:dateOfBirth, and finally skipping the first of these children.

OrderBy Expressions

TODO: This should be cleaned up from the SHACL-AF definition and requires thought

The Example of sh:limit also illustrates sh:orderBy.

Aggregation Expressions

Count Expressions

A blank node that is the subject of the following properties is called a count expression with the function name sh:CountExpression:

Property Constraints Description
sh:count A well-formed node expression. The input nodes that shall be counted.

EVALUATION OF COUNT EXPRESSIONS

Let count be the value of sh:count in the count expression. Let N be the output nodes of evalExpr(count, focusGraph, focusNode, scope). The output nodes of the count expression are the list consisting of exactly one xsd:integer literal that is computed as the length of N.

The remainder of this section is informative.

The following example illustrates the use of sh:count to derive a property ex:topConceptCount as the number of values of the skos:hasTopConcept property in a skos:ConceptScheme.

Min Expressions

A blank node that is the subject of the following properties is called a min expression with the function name sh:MinExpression:

Property Constraints Description
sh:min A well-formed node expression. The input nodes from which the minimum value shall be returned.

EVALUATION OF MIN EXPRESSIONS

Let min be the value of sh:min in the min expression. Let N be the output nodes of evalExpr(min, focusGraph, focusNode, scope). The output nodes of the min expression are the list consisting of at most one node that is computed as the minimum value from N. TODO: Clarify exactly how that is computed, maybe via reference to SPARQL MIN.

The remainder of this section is informative.

The following example illustrates the use of sh:min to derive a property ex:minStartDate as the smallest of the values that can be reached using the property path ex:employee/ex:startDate. In other words, it walks through all employees of the given company and returns the earliest date on which an employee started.

Max Expressions

A blank node that is the subject of the following properties is called a max expression with the function name sh:MaxExpression:

Property Constraints Description
sh:max A well-formed node expression. The input nodes from which the maximum value shall be returned.

EVALUATION OF MAX EXPRESSIONS

Let max be the value of sh:max in the max expression. Let N be the output nodes of evalExpr(max, focusGraph, focusNode, scope). The output nodes of the max expression are the list consisting of at most one node that is computed as the maximum value from N. TODO: Clarify exactly how that is computed, maybe via reference to SPARQL MAX.

The remainder of this section is informative.

The Example for sh:min can be easily adapted for sh:max.

Sum Expressions

A blank node that is the subject of the following properties is called a sum expression with the function name sh:SumExpression:

Property Constraints Description
sh:sum A well-formed node expression. The input nodes from which the sum shall be returned.

EVALUATION OF SUM EXPRESSIONS

Let sum be the value of sh:sum in the sum expression. Let N be the output nodes of evalExpr(sum, focusGraph, focusNode, scope). The output nodes of the sum expression are the list consisting of exactly one node that is computed as the sum of all nodes from N. TODO: Clarify exactly how that is computed, maybe via reference to SPARQL SUM.

The remainder of this section is informative.

Note that sh:sum needs to be used with care and is often misunderstood when used with property paths. When a path expression is used as input to a sum expression, the path expression will have eliminated duplicates before they can be processed by sh:sum. As a result, only the distinct values will be added up.
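
The effect described above is plain arithmetic. The following Python sketch uses made-up salary values and a hypothetical property ex:salary, both assumptions for illustration only, to show how removing duplicates before summing changes the result.

```python
# Hypothetical values reached via a path such as ex:employee/ex:salary.
# Two employees happen to earn the same amount.
salaries = [50000, 50000, 60000]

# Sum over every value, one per employee.
sum_with_duplicates = sum(salaries)

# Sum after duplicate elimination, which is what a sum would see
# if its input list only contained distinct nodes.
distinct_salaries = list(dict.fromkeys(salaries))
sum_of_distinct = sum(distinct_salaries)
```

Here the first sum is 160000 while the second is 110000, because one of the two equal salaries is no longer counted.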

TODO: Find good example, or drop the feature if none makes sense

Miscellaneous Node Expressions

This section enumerates node expression functions that did not fit into other categories.

InstancesOf Expressions

A blank node that is the subject of the following properties is called an instancesOf expression with the function name sh:InstancesOfExpression:

Property Constraints Description
sh:instancesOf sh:nodeKind sh:IRI The class that the output nodes must be instances of.

EVALUATION OF INSTANCESOF EXPRESSIONS

Let type be the value of sh:instancesOf in an instancesOf expression. The output nodes of the instancesOf expression are the nodes that are SHACL instances of type in the focus graph.

The remainder of this section is informative.

Note that the definition of SHACL instance includes instances of subclasses of the given class. So if the focus graph contains ex:SubClass rdfs:subClassOf ex:SuperClass and ex:SubInstance a ex:SubClass then ex:SubInstance will also be returned as instance of ex:SuperClass.
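
The subclass behaviour described in the note above can be sketched in Python. The sketch assumes the focus graph is a list of (subject, predicate, object) tuples with prefixed names as plain strings, and it returns each instance once in order of first appearance, which is an assumption since no order is defined above. `subclasses_of` and `eval_instances_of` are hypothetical names.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


def subclasses_of(triples, cls):
    """Return cls together with all of its transitive subclasses."""
    found = {cls}
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if p == SUBCLASS_OF and o == current and s not in found:
                found.add(s)
                frontier.append(s)
    return found


def eval_instances_of(triples, cls):
    """Return the nodes typed with cls or any of its subclasses."""
    classes = subclasses_of(triples, cls)
    output = []
    for s, p, o in triples:
        if p == RDF_TYPE and o in classes and s not in output:
            output.append(s)
    return output


triples = [
    ("ex:SubClass", SUBCLASS_OF, "ex:SuperClass"),
    ("ex:SubInstance", RDF_TYPE, "ex:SubClass"),
    ("ex:SuperInstance", RDF_TYPE, "ex:SuperClass"),
]
```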

The interpretation of sh:instancesOf is similar to sh:targetClass and sh:class.

Users of this node expression function should be aware that the list of output nodes may be very large.

The Example for sh:intersection uses sh:instancesOf.

Security Considerations

TODO

Privacy Considerations

TODO

Acknowledgements

Many people contributed to this document, including members of the RDF Data Shapes Working Group.

Internationalization Considerations

TODO