This document defines the Core of the SHACL Shapes Constraint Language.
SHACL is a language for validating RDF graphs against a set of conditions.
These conditions are provided as shapes and other constructs expressed in the form of an RDF graph.
RDF graphs that are used in this manner are called "shapes graphs" in SHACL and
the RDF graphs that are validated against a shapes graph are called "data graphs".
Because SHACL shapes graphs are used to validate that data graphs satisfy a set of conditions,
they can also be viewed as a description of the data graphs that do satisfy these conditions.
Such descriptions may be used for a variety of purposes besides validation, including
user interface building, code generation and data integration.
Sections 2 and 3 cover SHACL shapes and constraints, as well as property paths.
Section 4 introduces node expressions, while section 5 defines validation in SHACL.
Section 6 defines the built-in SHACL Core constraint components, and section 7 discusses non-validating properties.
The syntax of SHACL is RDF.
The examples in this document use Turtle [[rdf12-turtle]], JSON-LD [[json-ld]], and SHACL-C [[?shacl12-shacl-c]].
Other RDF serializations such as RDF/XML may be used in practice.
The reader should be familiar with basic RDF concepts [[rdf12-concepts]] such as triples.
Introduction
This document specifies the Core of SHACL (Shapes Constraint Language), a language for describing and validating RDF graphs.
This section introduces SHACL with an overview of the key terminology and an example to illustrate basic concepts.
Terminology
Throughout this document, the following terminology is used.
Terminology that is linked to portions of RDF 1.2 Concepts and Abstract
Syntax is used in SHACL as defined there. Terminology that is linked to
portions of SPARQL 1.2 Query Language is used in SHACL as defined there. A
single linkage is sufficient to provide a definition for all occurrences of a
particular term in this document.
Definitions are complete within this document, i.e., if there is no rule to
make some situation true in this document then the situation is false.
Basic RDF Terminology
This document uses the terms
RDF graph,
RDF triple,
IRI,
literal,
datatype,
blank node,
triple term,
reifier,
node of an RDF graph,
RDF term,
subject,
predicate, and
object of RDF triples
as defined in RDF 1.2 Concepts and Abstract Syntax [[!rdf12-concepts]].
Language tags are defined as in [[!BCP47]].
Property Value and Path
A property is an IRI.
An RDF term n has a value v
for property p in an RDF graph if there is an RDF triple in the graph
with subject n, predicate p, and object v.
The phrase "Every value of P in graph G ..." means "Every object of a triple in G with predicate P ...".
(In this document, the verbs specify or declare are sometimes used to express the fact that an RDF term has values for a given predicate in a graph.)
SPARQL property paths are defined as in SPARQL 1.2.
An RDF term n has value v for SPARQL property path expression
p in an RDF graph G if there is a solution mapping in the result of the SPARQL query
SELECT ?s ?o WHERE { ?s p' ?o } on G that binds ?s to
n and ?o to v, where p' is SPARQL surface syntax for p.
SHACL Lists
A SHACL list in an RDF graph G is an IRI or a blank node
that is either rdf:nil (provided that rdf:nil has no value
for either rdf:first or rdf:rest), or has exactly one value
for the property rdf:first in G and exactly one value
for the property rdf:rest in G that is also a SHACL list in G,
and the list does not have itself as a value of the property path rdf:rest+ in G.
The members of any SHACL list except rdf:nil in an RDF
graph G consist of its value for rdf:first in G followed by
the members in G of its value for rdf:rest in G.
The SHACL list rdf:nil has no members in any RDF graph.
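The definition of SHACL lists and their members can be read as a short traversal. The following Python sketch is illustrative only and not part of this specification: it assumes a graph is a set of (subject, predicate, object) tuples of strings, and the helper names values and list_members are hypothetical. It does not check that the list node is an IRI or a blank node.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDF_FIRST, RDF_REST, RDF_NIL = RDF + "first", RDF + "rest", RDF + "nil"

def values(graph, node, prop):
    """Return the set of values of `prop` for `node` in `graph`."""
    return {o for (s, p, o) in graph if s == node and p == prop}

def list_members(graph, head):
    """Return the members of the SHACL list `head`, or None if `head` is not a SHACL list."""
    members, seen, node = [], set(), head
    while node != RDF_NIL:
        if node in seen:
            # The list has itself as a value of the path rdf:rest+.
            return None
        seen.add(node)
        firsts = values(graph, node, RDF_FIRST)
        rests = values(graph, node, RDF_REST)
        if len(firsts) != 1 or len(rests) != 1:
            # Exactly one rdf:first and exactly one rdf:rest are required.
            return None
        members.append(next(iter(firsts)))
        node = next(iter(rests))
    # rdf:nil must have no value for rdf:first or rdf:rest.
    if values(graph, RDF_NIL, RDF_FIRST) or values(graph, RDF_NIL, RDF_REST):
        return None
    return members
```

Returning None for every ill-formed case keeps the sketch short; an implementation would more likely report which condition failed.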
Binding, Solution
A binding is a pair (variable, RDF term), consistent with the term's use in [[!sparql12-query]].
A solution is a set of bindings, informally often understood as one row in the body of the result table of a SPARQL query.
Variables are not required to be bound in a solution.
SHACL Subclass, SHACL superclass
A node Sub in an RDF graph is a SHACL subclass of another node Super
in the graph if there is a sequence of triples in the graph each with predicate rdfs:subClassOf such that the subject of the first triple is Sub,
the object of the last triple is Super, and the object of each triple except the last is the subject of the next.
If Sub is a SHACL subclass of Super in an RDF graph then Super
is a SHACL superclass of Sub in the graph.
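The chain of rdfs:subClassOf triples described above amounts to reachability in the graph. The following Python sketch is illustrative only: it assumes a graph is a set of (subject, predicate, object) tuples of strings, and the function names are hypothetical. How to treat a class that reaches itself through a cycle is left open here.

```python
RDFS_SUBCLASSOF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"

def shacl_superclasses(graph, sub):
    """Return every node reachable from `sub` via one or more rdfs:subClassOf triples."""
    found, frontier = set(), [sub]
    while frontier:
        current = frontier.pop()
        for (s, p, o) in graph:
            if s == current and p == RDFS_SUBCLASSOF and o not in found:
                found.add(o)
                frontier.append(o)
    return found

def is_shacl_subclass(graph, sub, sup):
    """True if a chain of rdfs:subClassOf triples leads from `sub` to `sup`."""
    return sup in shacl_superclasses(graph, sub)
```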
For a node n in a graph sourceGraph,
the deep copy of n in a graph targetGraph
is n in targetGraph plus, if n is a blank node,
any triples from sourceGraph that can be reached by transitively traversing
the blank nodes that appear in the object position of a triple that can be reached
starting with n as the subject. This is similar to
the Concise Bounded Description, but without reification.
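One possible reading of the deep copy is a traversal that follows blank nodes in the object position. The following Python sketch is illustrative only: it assumes a graph is a set of (subject, predicate, object) tuples of strings and that blank nodes are written with a leading "_:", and the function names are hypothetical.

```python
def is_blank(term):
    """Assumption of this sketch: blank nodes are strings starting with '_:'."""
    return isinstance(term, str) and term.startswith("_:")

def deep_copy_triples(source_graph, n):
    """Collect the triples that a deep copy of `n` would add to a target graph."""
    copied = set()
    if not is_blank(n):
        # For IRIs and literals the deep copy is just the node itself.
        return copied
    visited, frontier = set(), [n]
    while frontier:
        subject = frontier.pop()
        if subject in visited:
            continue
        visited.add(subject)
        for triple in source_graph:
            s, p, o = triple
            if s == subject:
                copied.add(triple)
                if is_blank(o):
                    # Only blank nodes in the object position are traversed further.
                    frontier.append(o)
    return copied
```

The visited set keeps the traversal finite when blank nodes refer to each other in a cycle.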
Document Conventions
Within this document, the following namespace prefix definitions are used:
Note that the URI of the graph defining the SHACL vocabulary itself is equivalent to
the namespace above, i.e., it includes the #.
References to the SHACL vocabulary, e.g., via owl:imports, should include the #.
Throughout the document, color-coded boxes containing RDF graphs in Turtle, JSON-LD and SHACL-C will appear.
These fragments of Turtle documents use the prefix bindings given above.
The JSON-LD document fragments use the context given above.
Only the Turtle documents may highlight certain parts.
The SHACL-C specification is unstable; SHACL-C document fragments in this document are informative.
# This box represents an input shapes graph
<s> ex:p <o> .
// This box represents an input shapes graph
{
"@id": "s",
"ex:p": {
"@id": "o"
}
}
# This box represents an input data graph.
# When highlighting is used in the examples:
# Elements highlighted in blue are focus nodes
ex:Bob a ex:Person .
# Elements highlighted in red are focus nodes that fail validation
ex:Alice a ex:Person .
Grey boxes such as this include syntax rules that apply to the shapes graph.
true denotes the RDF term "true"^^xsd:boolean.
false denotes the RDF term "false"^^xsd:boolean.
This document defines the SHACL Core language, also referred to as just SHACL.
This specification describes conformance criteria for:
SHACL Core processors as processors that support validation with the SHACL Core Language
This document includes syntactic rules that shapes and other nodes need to fulfill in the shapes graph.
These rules are typically of the form "A shape must have...", "The values of X are literals", or "All objects of triples with predicate P must be IRIs".
The complete list of these rules can be found in the appendix.
Nodes that violate any of these rules are called ill-formed.
Nodes that violate none of these rules are called well-formed.
A shapes graph is ill-formed if it contains at least one ill-formed node.
Relationship between SHACL and RDFS inferencing
SHACL uses the RDF and RDFS vocabularies, but full RDFS inferencing is not required.
However, SHACL processors MAY operate on RDF graphs that include entailments [[!sparql12-entailment]],
whether pre-computed before being submitted to a SHACL processor or performed on the fly as
part of SHACL processing (without modifying either data graph or shapes graph).
To support processing of entailments, SHACL includes the property
sh:entailment to indicate the inferencing that is required
by a given shapes graph.
The values of the property sh:entailment are IRIs.
Common values for this property are covered by [[!sparql12-entailment]].
SHACL implementations MAY, but are not required to, support entailment regimes.
If a shapes graph contains any triple with the predicate sh:entailment and object E
and the SHACL processor does not support E as an entailment regime for the given data graph,
then the processor MUST signal a failure.
Otherwise, the SHACL processor MUST provide the entailments for all of the values of sh:entailment in the shapes graph,
and any inferred triples MUST be returned by all queries against the data graph during the validation process.
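The failure condition for unsupported entailment regimes can be sketched as a check performed before validation starts. The following Python sketch is illustrative only: it assumes a graph is a set of (subject, predicate, object) tuples of strings, and the exception class and function name are hypothetical.

```python
SH_ENTAILMENT = "http://www.w3.org/ns/shacl#entailment"

class EntailmentNotSupported(Exception):
    """Stands in for the failure a processor signals for an unsupported regime."""

def required_entailments(shapes_graph, supported):
    """Return the regimes required by the shapes graph, or fail if any is unsupported."""
    required = {o for (s, p, o) in shapes_graph if p == SH_ENTAILMENT}
    unsupported = required - set(supported)
    if unsupported:
        raise EntailmentNotSupported(sorted(unsupported))
    # A processor would now have to provide these entailments for the data graph.
    return required
```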
Getting Started with SHACL Core
In this section, we will walk you through a simple example that introduces the basics of SHACL Core.
You will learn to describe how your data should look, and how a SHACL processor checks whether your data meets that description.
Use Case
Imagine you have a set of entities (Alice, Bob, Calvin) and you want to explain to a computer or another human that:
there is a class called ex:Person that is the type of these entities
every instance of ex:Person has at most one Social Security Number (SSN), and that SSN needs to be a properly formatted text (like 123-45-6789)
every instance of ex:Person can work for one or more companies, but those companies must be typed as an ex:Company in your data
no other properties are allowed for an ex:Person, other than the SSN (ex:ssn), the work affiliation (ex:worksFor), and the mandatory typing (rdf:type)
Our Sample Data (Data Graph)
Here is the data we want to describe and validate:
Writing the Shapes and Classes (Shapes Graph)
Here is a self-contained example of how to represent our domain of interest.
In SHACL terminology, this is called a shapes graph, but you can also think
of this as a domain model or an ontology.
Running the Validation (Validation Report)
When we run SHACL validation on our data graph using our shapes graph, the validator checks each Person against the constraints that we wrote.
In plain English, here is what it finds:
Alice: SSN does not match the expected pattern (987-65-432A has a letter where a digit should be).
Bob: Has more than one SSN (two values found, but sh:maxCount says only one is allowed).
Calvin:
Works for something (ex:UntypedCompany) that is not declared as an ex:Company.
Has an extra property ex:birthDate that is not allowed by the shape.
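To make these findings concrete, the four rules of the use case can be hand-coded in a few lines. The following Python sketch is illustrative only and is not how a SHACL processor works internally: the data values are assumptions chosen to reproduce the findings above, the function names are hypothetical, prefixed names are kept as plain strings, and the class check ignores subclasses.

```python
import re

# Hypothetical data graph as (subject, predicate, object) tuples.
DATA = [
    ("ex:Alice", "rdf:type", "ex:Person"),
    ("ex:Alice", "ex:ssn", "987-65-432A"),
    ("ex:Bob", "rdf:type", "ex:Person"),
    ("ex:Bob", "ex:ssn", "123-45-6789"),
    ("ex:Bob", "ex:ssn", "124-35-6789"),
    ("ex:Calvin", "rdf:type", "ex:Person"),
    ("ex:Calvin", "ex:birthDate", "1971-07-07"),
    ("ex:Calvin", "ex:worksFor", "ex:UntypedCompany"),
]
SSN_PATTERN = re.compile(r"^\d{3}-\d{2}-\d{4}$")
ALLOWED = {"ex:ssn", "ex:worksFor", "rdf:type"}

def values(graph, node, prop):
    """Return the values of `prop` for `node`, in graph order."""
    return [o for (s, p, o) in graph if s == node and p == prop]

def check_person(graph, person):
    """Return a list of plain-English problems found for one person."""
    problems = []
    ssns = values(graph, person, "ex:ssn")
    if len(ssns) > 1:  # at most one SSN
        problems.append("more than one SSN")
    problems += [f"bad SSN {v}" for v in ssns if not SSN_PATTERN.match(v)]
    for company in values(graph, person, "ex:worksFor"):
        if "ex:Company" not in values(graph, company, "rdf:type"):
            problems.append(f"{company} is not an ex:Company")
    for (s, p, o) in graph:  # no other properties are allowed
        if s == person and p not in ALLOWED:
            problems.append(f"property {p} is not allowed")
    return problems

def validate(graph):
    """Check every node typed as ex:Person and map it to its problems."""
    people = [s for (s, p, o) in graph if p == "rdf:type" and o == "ex:Person"]
    return {person: check_person(graph, person) for person in people}
```

Calling validate(DATA) yields one problem each for ex:Alice and ex:Bob and two for ex:Calvin, mirroring the list above. A shapes graph expresses the same rules declaratively, so they do not have to be coded by hand.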
Here is what a SHACL validation report for this example might look like (simplified for readability):
How to read the report:
sh:ValidationReport is the overall report, with sh:conforms false meaning that there was at least one violation.
Each sh:ValidationResult is one problem found:
sh:resultSeverity — tells you how serious the problem is. In this example, all issues are sh:Violation (the highest and default severity).
sh:focusNode — the data node that failed.
sh:resultPath — the property involved.
sh:value — the actual value that triggered the failure.
sh:sourceConstraintComponent — which kind of constraint was broken (max count, pattern, class, etc.).
sh:sourceShape — the shape that defined the constraint.
sh:resultMessage — a human-readable explanation.
Syntactic Variations of Shapes and Classes
While SHACL is primarily designed to represent shapes, it also borrows terms and concepts such as
rdfs:Class and rdfs:subClassOf from the RDF Schema namespace.
Some people prefer to keep those concepts separate, as shown in the original example above
which had separate entities for ex:Person and ex:PersonShape.
However, it is also possible to couple them more closely together, and use sh:ShapeClass
to declare both a class and a shape at the same time.
Furthermore, sometimes you will see property shapes declared as blank nodes instead of IRIs.
This is a more compact notation, but it means that the property shape cannot easily be referenced from
the outside; for example, if some other graph wants to reuse a node shape but deactivate a property shape.
The following Turtle example shows these two syntactic variations in action.
Introducing some SHACL Terminology
We can use the shape declarations above to introduce some of the formal terminology used by SHACL.
This may help you read the remainder of this specification.
The target for the shape ex:PersonShape is the set of all SHACL instances of the class ex:Person.
This is specified using the property sh:targetClass.
During the validation, these target nodes become focus nodes for the shape.
The shape ex:PersonShape is a node shape, which means that it applies to the focus nodes.
It declares constraints on the focus nodes, for example using the parameters sh:closed and sh:ignoredProperties.
The node shape also declares two other constraints with the property sh:property,
and each of these is backed by a property shape.
These property shapes declare additional constraints using parameters such as sh:datatype and sh:maxCount.
The following informal diagram provides an overview of some of the key classes in the SHACL vocabulary.
Each box represents a class.
The boxes under the class name list a small subset of the frequently used properties
that instances of these classes may have, together with their value types.
The arrows indicate rdfs:subClassOf triples.
Note that the definition above does not include all of the syntax rules of well-formed shapes.
Those are found throughout the document and summarized in the appendix.
For example, shapes that have literals as values for sh:targetClass are ill-formed.
Informally, a shape determines how to validate a focus node based on the values of properties and other characteristics of the focus node.
For example, shapes can declare the condition that a focus node be an IRI or that a focus node has a particular value for a property and also a minimum number of values for the property.
The SHACL Core language defines two types of shapes:
shapes about the focus node itself, called node shapes
shapes about the values of a particular property or path for the focus node, called property shapes
sh:Shape is the SHACL superclass of those two shape types in the SHACL vocabulary.
Its subclasses sh:NodeShape and sh:PropertyShape can be used as the SHACL types of node shapes and property shapes, respectively.
A constraint component is an IRI.
Each constraint component has one or more mandatory parameters, each of which is a property.
Each constraint component has zero or more optional parameters, each of which is a property.
The parameters of a constraint component are its mandatory parameters plus its optional parameters.
For example, the component sh:MinCountConstraintComponent declares the parameter sh:minCount to represent the restriction
that a node has at least a minimum number of values for a particular property.
Some constraint components declare only a single parameter.
For example, sh:ClassConstraintComponent has the single parameter sh:class.
These parameters may be used multiple times in the same shape,
and each value of such a parameter declares an individual constraint.
The interpretation of such declarations is conjunction, i.e., all constraints apply.
The following example specifies that the values of ex:customer have to be SHACL instances of both
ex:Customer and ex:Person.
Some constraint components such as sh:PatternConstraintComponent declare more than one parameter.
Shapes that have more than one value for any of the parameters of such components are ill-formed.
One way to bypass this syntax rule is to spread the constraints across multiple (property) shapes, as illustrated in the following example.
Constraint components are associated with validators, which provide instructions (for example expressed via SPARQL queries)
on how the parameters are used to validate data.
Validating an RDF term against a shape involves validating the term against each constraint where the
shape has values for all mandatory parameters of the component of the constraint,
using the validators associated with the respective component.
specified as explicit input to the SHACL processor for validating a specific RDF term against a shape
Targets
Target declarations of a shape in a shapes graph are
triples with the shape as the subject and certain properties described in this document
(e.g., sh:targetClass) as predicates.
Furthermore, sh:shape triples can declare targets in the data graph.
Target declarations can be used to produce focus nodes for a shape.
The target of a target declaration is the set of RDF terms produced
by applying the rules described in the remainder of this section to the data graph.
The target of a shape is the union of all RDF terms produced by the individual
targets that are declared for the shape.
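The union of individual targets can be sketched as follows. This Python sketch is illustrative only: it assumes a data graph is a set of (subject, predicate, object) tuples of strings and models each target declaration as a hypothetical function from a data graph to a set of RDF terms.

```python
def target_of_shape(target_declarations, data_graph):
    """Return the union of the RDF terms produced by each declared target."""
    result = set()
    for produce in target_declarations:
        result |= produce(data_graph)
    return result
```

A shape without any target declarations has an empty target under this reading.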
SHACL Core includes the following kinds of targets:
The remainder of this introduction is informative.
RDF terms produced by targets are not required to exist as nodes in the data graph.
Targets of a shape are ignored whenever a focus node is provided directly as input to the validation process for that shape.
This includes the cases where the shape is a value of one of the
shape-expecting constraint parameters (such as sh:node) and
a focus node is determined during the validation of the corresponding constraint component (such as sh:NodeConstraintComponent).
In such cases, the provided focus node does not need to be in the target of the shape.
Node targets (sh:targetNode)
A node target is specified using the sh:targetNode predicate.
Each value of sh:targetNode in a shape is a well-formed node expression.
With the example data below, only ex:Alice is the target of the provided shape:
Class-based Targets (sh:targetClass)
A class target is specified with the sh:targetClass predicate.
Each value of sh:targetClass in a shape is an IRI.
TEXTUAL DEFINITION
If s is a shape in a shapes graph SG and s has value c for
sh:targetClass in SG then the set of SHACL instances of c in a data graph
DG is a target from DG for s in SG.
The remainder of this section is informative.
In this example, only ex:Alice and ex:Bob are focus nodes.
Note that, according to the SHACL instance definition, all the rdfs:subClassOf declarations needed to walk the class hierarchy need to exist in the data graph.
However, the ex:Person a rdfs:Class triple is not required to exist in either graph.
In the following example, the selected focus node is only ex:Who.
Note that the rdfs:subClassOf triples may be queried from the shapes graph,
in which case the rdfs:subClassOf triple
from the example above would not be required to be in the data graph.
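A class target can be sketched as a lookup of rdf:type triples combined with a walk down the class hierarchy. The following Python sketch is illustrative only: it assumes a data graph is a set of (subject, predicate, object) tuples of strings, reads "SHACL instance of c" as "has an rdf:type value that is c or a SHACL subclass of c", and looks for rdfs:subClassOf triples in the data graph only. The function names are hypothetical.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_SUBCLASSOF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"

def subclasses_or_self(graph, cls):
    """Return `cls` plus every node that reaches it via rdfs:subClassOf triples."""
    found, frontier = {cls}, [cls]
    while frontier:
        current = frontier.pop()
        for (s, p, o) in graph:
            if p == RDFS_SUBCLASSOF and o == current and s not in found:
                found.add(s)
                frontier.append(s)
    return found

def class_target(data_graph, cls):
    """Return the nodes whose rdf:type is `cls` or one of its SHACL subclasses."""
    classes = subclasses_or_self(data_graph, cls)
    return {s for (s, p, o) in data_graph if p == RDF_TYPE and o in classes}
```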
If s is a SHACL instance of sh:NodeShape or sh:PropertyShape
in an RDF graph G and s is also a SHACL instance of
rdfs:Class in G and s is not an IRI then s is an ill-formed shape in G.
The SHACL namespace includes a dedicated class sh:ShapeClass that can serve as a syntactic shortcut for the implicit class targets pattern.
TEXTUAL DEFINITION
The class sh:ShapeClass is a subclass (via rdfs:subClassOf) of both sh:NodeShape and rdfs:Class.
If s is a SHACL instance of sh:ShapeClass in a shapes graph SG
then the set of SHACL instances of s in a data graph DG is a target from DG for s in SG.
Please keep in mind that sh:ShapeClass may not be understood to be a subclass of rdfs:Class by some SHACL-unaware implementations.
It is therefore recommended (but not required) that graphs that use sh:ShapeClass include an owl:imports sh: statement.
The remainder of this section is informative.
In the following example, ex:Alice is a focus node, because it is a SHACL instance of
ex:Person which is both a class and a shape in the shapes graph.
In the following variation of the example above, ex:Person is declared as an instance of sh:ShapeClass,
with the same interpretation.
Subjects-of targets (sh:targetSubjectsOf)
A subjects-of target is specified with the predicate sh:targetSubjectsOf.
The values of sh:targetSubjectsOf in a shape are IRIs.
TEXTUAL DEFINITION
If s is a shape in a shapes graph SG and s has value p for sh:targetSubjectsOf in SG then the set of nodes in a
data graph DG that are subjects of triples in DG with predicate p is a target from DG for s in SG.
The remainder of this section is informative.
In the example above, only ex:Alice is validated against the given shape,
because it is the subject of a triple that has ex:knows as its predicate.
Objects-of targets (sh:targetObjectsOf)
An objects-of target is specified with the predicate sh:targetObjectsOf.
The values of sh:targetObjectsOf in a shape are IRIs.
TEXTUAL DEFINITION
If s is a shape in a shapes graph SG and s has value p for sh:targetObjectsOf in SG then the set of nodes in a
data graph DG that are objects of triples in DG with predicate p is a target from DG for s in SG.
The remainder of this section is informative.
In the example above, only ex:Bob is validated against the given shape,
because it is the object of a triple that has ex:knows as its predicate.
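Subjects-of and objects-of targets mirror each other, which a small sketch makes visible. The following Python code is illustrative only: it assumes a data graph is a set of (subject, predicate, object) tuples of strings, and the function names are hypothetical.

```python
def subjects_of_target(data_graph, predicate):
    """Return the subjects of all triples in the data graph with the given predicate."""
    return {s for (s, p, o) in data_graph if p == predicate}

def objects_of_target(data_graph, predicate):
    """Return the objects of all triples in the data graph with the given predicate."""
    return {o for (s, p, o) in data_graph if p == predicate}
```

For a data graph containing only the triple ex:Alice ex:knows ex:Bob, the first function returns ex:Alice and the second returns ex:Bob for the predicate ex:knows.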
Where Targets (sh:targetWhere)
A where target is specified with the sh:targetWhere predicate.
Each value of sh:targetWhere in a shape is a well-formed shape.
TEXTUAL DEFINITION
If s is a shape in a shapes graph SG and s has value w for
sh:targetWhere in SG then the set of nodes in a data graph
DG that conform to w is a target from DG for s in SG.
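A where target can be sketched as a filter over the nodes of the data graph. The following Python code is illustrative only: it assumes a data graph is a set of (subject, predicate, object) tuples of strings, takes the nodes of the graph to be its subjects and objects, and uses a hypothetical callback conforms(data_graph, node) in place of validating a node against the shape w.

```python
def where_target(data_graph, conforms):
    """Return the nodes of the data graph for which `conforms` holds."""
    nodes = {s for (s, p, o) in data_graph} | {o for (s, p, o) in data_graph}
    return {n for n in nodes if conforms(data_graph, n)}
```

The callback keeps the sketch independent of any particular validation engine; a processor would evaluate conformance to w using the validation rules defined in this document.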