This specification defines elements of the SHACL Shapes Constraint Language created to allow for profiles of SHACL and profiling with SHACL.

SHACL is a language for validating RDF graphs against a set of conditions, so this document's scope is limited to profiling of RDF graphs, including graphs containing SHACL Shapes.

The namespace for SHACL Profiling terms is http://www.w3.org/ns/shpr#

The suggested prefix for the SHACL Profiling namespace is shpr

Document Outline

The introduction provides background concepts of profiling and states this specification's scope. It also provides term definitions and describes document conventions.

Section 2 covers packaging of SHACL for management.

Sections 3 & 4 cover the two main modes of profiling with SHACL, building on packaging.

Introduction

What is profiling?

Profiling is the act of creating a "profile" of something.

Generically, in English, a "profile" of something is as follows:

The outline of a physical object or feature, or a representation of this

- Oxford English dictionary, use of the word "profile" since the 17th century

Within the world of data, a derived definition of "profile" consistent with the above is:

A summary or an extraction

In this definition, the essence of the English word is retained, since a summary or extraction of or from a data object may be an outline of it; for example, a 2D representation of a 3D spatial object, or a statistical summary of a dataset having many parts.

Warning: This background of profiling up to this point in the Introduction may be up-streamed to an updated version of [[dx-prof]].

By definition, SHACL constrains RDF data. Therefore, any data that is valid according to a shapes graph will be a profile of the data graph that was validated. If a shapes graph validates all elements of a data graph, the resulting valid data will be a "null" profile of the data graph, meaning it is identical to the original data graph.
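As a minimal sketch of this idea, consider the shapes graph and data graph below, shown together in one listing for brevity. The shape ex:PersonShape and the property ex:name are hypothetical and are not defined by this specification.

```trig
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix ex: <http://example.com/ns#> .

# Shapes graph (hypothetical): every ex:Person needs at least one ex:name
ex:PersonShape
    a sh:NodeShape ;
    sh:targetClass ex:Person ;
    sh:property [
        sh:path ex:name ;
        sh:minCount 1 ;
    ] ;
.

# Data graph: ex:Bob conforms to ex:PersonShape
ex:Bob
    a ex:Person ;
    ex:name "Bob" ;
.

# Data graph: ex:Alice has no ex:name, so she does not conform
ex:Alice
    a ex:Person ;
.
```

Under these assumptions, the triples about ex:Bob are the valid portion of the data graph and can be read as a profile of it. Had ex:Alice also carried an ex:name, the valid data would have been the whole data graph, i.e., a "null" profile.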

[[[dx-prof]]] defines a profile in the context of a specification to be:

a specification that constrains, extends, combines, or provides guidance or explanation about the use of other specifications

If a shapes graph is taken to be a "specification," then not only is the data that is valid according to the shapes graph a profile of the validated data graph, but the shapes graph itself also serves as a profile of the data model used for the data graph.

Within this document, we describe how to package SHACL information for optimal profiling and we exemplify this with packaging of the SHACL shapes created for each of the SHACL Specifications.

Scope

With the above section's concepts in mind, this specification defines the following:

  1. packaging SHACL
  2. profiles of SHACL
  3. profiling with SHACL

Terminology

Terminology used throughout this specification is either defined here or taken from one of several sources. Those other sources are:

[[[dx-prof]]]
general profiling terminology and profiling ontology terms
SHACL 1.2 Core
SHACL technical terminology
[[[rdf12-concepts]]]
RDF terminology

Terms taken from other sources are linked to their definitions in text.

The terms defined here are:

null profile
a profile of a specification that implements some, and likely all, of the specification's rules, but no other rules

Warning: The definition of null profile may be up-streamed to an updated version of [[dx-prof]].

All the defined terms used in this specification are listed in the index.

Document Conventions

Within this specification, the following namespace prefix definitions are used:

Prefix Namespace
rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns#
rdfs: http://www.w3.org/2000/01/rdf-schema#
sh: http://www.w3.org/ns/shacl#
xsd: http://www.w3.org/2001/XMLSchema#
ex: http://example.com/ns#

Within this specification, the following JSON-LD context is used:

{
  "@context": {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "sh": "http://www.w3.org/ns/shacl#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "ex": "http://example.com/ns#"
  }
}

Note that the URI of the graph defining the SHACL vocabulary itself is equivalent to the namespace above, i.e., it includes the #. References to the SHACL vocabulary, e.g., via owl:imports, should include the #.
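A short sketch of such a reference follows. The ontology IRI http://example.com/ns/my-shapes is hypothetical; only the imported IRI, with its trailing #, is taken from the text above.

```trig
@prefix owl: <http://www.w3.org/2002/07/owl#> .

# Hypothetical ontology importing the SHACL vocabulary graph.
# Note the trailing # on the imported IRI.
<http://example.com/ns/my-shapes>
    a owl:Ontology ;
    owl:imports <http://www.w3.org/ns/shacl#> ;
.
```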

Throughout the specification, color-coded boxes containing RDF graphs in Turtle and JSON-LD will appear. The color and title of a box indicate whether it is a Shapes graph, a Data graph, or something else. The Turtle specification fragments use the prefix bindings given above. The JSON-LD specification fragments use the context given above. Only the Turtle specifications will have parts highlighted.

# This box represents an input shapes graph
<s> <p> <o> .
// This box represents an input shapes graph
{
  "@id": "ex:s",
  "ex:p": {
    "@id": "ex:o"
  }
}
# This box represents an input data graph.
# When highlighting is used in the examples:
# Elements highlighted in blue are focus nodes
ex:Bob a ex:Person .
# Elements highlighted in red are focus nodes that fail validation
ex:Alice a ex:Person .
// This box represents an input data graph
{
	"@graph": [
		{
			"@id": "ex:Alice",
			"@type": "ex:Person"
		},
		{
			"@id": "ex:Bob",
			"@type": "ex:Person"
		}
	]
}
# This box represents an output results graph
// This box represents an output results graph

Grey boxes such as this include syntax rules that apply to the shapes graph.

true denotes the RDF term "true"^^xsd:boolean. false denotes the RDF term "false"^^xsd:boolean.

This document defines extensions to the data model of the SHACL Core specification [shacl12-core] for the purposes of profiling.

As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.

The syntactic rules specified here are given in English and use keywords MAY, MUST, MUST NOT, RECOMMENDED, SHOULD, and SHOULD NOT that are to be interpreted as described in BCP 14 [[RFC2119]] [[RFC8174]] when, and only when, they appear in all capitals, as shown here.

Conformance claims of data to this specification's rules can be tested by validation using the SHACL-SHACL Profiling validator in the SHACL-SHACL Annex of the SHACL Overview specification [shacl12-overview].

Packaging SHACL

Motivation

For the efficient distribution and reuse of SHACL elements, we recommend packaging conventions. We make recommendations based on common SHACL packaging practice observed since the first release of SHACL in 2017 and also follow generic Semantic Web practice for packaging definitional resources. However, to achieve the level of consistency required to meet some of the aims of the following sections of this document, we introduce some minor additions, such as the directive to use a particular part indicator predicate for the ontology → shapes relationship.

Use of named graphs in files

SHACL elements SHOULD be serialized using graph-aware RDF formats.

SHACL, as with much other RDF data, is often managed and distributed in files. Files of RDF data in graph-unaware formats, such as Turtle [[turtle]] and N-Triples [[rdf12-n-triples]], act as default graphs that cannot be identified or referenced directly in RDF data: they can only be referenced by file name. When loaded into RDF databases, they do not have a guaranteed identity, as they may be placed into that system's default graph or into any other graph, which has no way of being consistently identified across systems.

If RDF data is stored in named graphs in files using graph-aware RDF formats, such as TriG [[trig]], TriX, JSON-LD [[json-ld]], or N-Quads [[rdf12-n-quads]], then the set of RDF data will be identified with an IRI within the RDF data, and this IRI may be both referenced within RDF databases and dereferenced (resolved) on the Internet.
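To illustrate what referencing a named graph by its IRI can look like, the following sketch places a statement about a graph in the default graph, alongside the named graph itself. The names ex:graph-a and ex:shape-a are hypothetical, and the use of schema:name to label the graph is an assumption made for this illustration only.

```trig
@prefix ex: <http://example.com/ns#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix schema: <https://schema.org/> .

# Default graph: a statement about the named graph, referenced by its IRI
ex:graph-a schema:name "Graph A" .

# Named graph: the SHACL content itself
ex:graph-a {
    ex:shape-a
        a sh:NodeShape ;
    .
}
```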

An example of three SHACL Shapes, ex:shape-1 and ex:shape-2 stored in named graph ex:graph-x and ex:shape-3 stored in named graph ex:graph-y, serialized as RDF using the TriG and JSON-LD formats, is as follows:

ex:graph-x {
    ex:shape-1
        a sh:NodeShape ;
        schema:name "Shape 1" ;
    .

    ex:shape-2
        a sh:NodeShape ;
        schema:name "Shape 2" ;
    .
}

ex:graph-y {
    ex:shape-3
        a sh:NodeShape ;
        schema:name "Shape 3" ;
    .
}
{
  "@context": {
    "ex": "http://example.com/ns#",
    "schema": "https://schema.org/",
    "sh": "http://www.w3.org/ns/shacl#"
  },
  "@graph": [
    {
      "@id": "ex:graph-x",
      "@graph": [
        {
          "@id": "ex:shape-1",
          "@type": "sh:NodeShape",
          "schema:name": "Shape 1"
        },
        {
          "@id": "ex:shape-2",
          "@type": "sh:NodeShape",
          "schema:name": "Shape 2"
        }
      ]
    },
    {
      "@id": "ex:graph-y",
      "@graph": [
        {
          "@id": "ex:shape-3",
          "@type": "sh:NodeShape",
          "schema:name": "Shape 3"
        }
      ]
    }
  ]
}
Remove "Shapes graph" heading and convert the above "turtle" to "trig".

Use of owl:Ontology containers

SHACL elements SHOULD be contained within instances of owl:Ontology, and instances of SHACL classes should be linked to the ontology they are first defined in using the rdfs:isDefinedBy predicate and to subsequent ontologies using the rdfs:member predicate.

SHACL information, like much other RDF definitional data, is often contained within instances of OWL's [[owl2-rdf-based-semantics]] owl:Ontology class.

For example, the Open Geospatial Consortium's GeoSPARQL 1.1 standard for spatial RDF data provides a SHACL validator containing 27 sh:NodeShape instances within a Turtle file that presents metadata for the validator as properties of an owl:Ontology instance.

Often, as per the GeoSPARQL example, the SHACL elements are not explicitly linked to the ontology object using RDF predicates but are seen to be grouped within it by virtue of being declared within the same file.

Better than making this assumption is to explicitly link the SHACL elements to ontologies using predicates so that, within RDF databases and without resorting to graph-aware RDF data groupings, the SHACL elements can be known to be contained by ontology objects.

The recommendation to differentiate between a SHACL element's relationship to the ontology that defined it and its relationship to other ontologies allows for a simple form of provenance: a SHACL element characterised in this way may be seen to have originated in one ontology and to have been reused in others.

An example of two SHACL Shapes, ex:shape-1 and ex:shape-2, defined in one ontology, ex:ont-m, with one, ex:shape-2, being reused in another ontology, ex:ont-n, is as follows:

ex:graph-x {
    ex:ont-m
        a owl:Ontology ;
    .

    ex:shape-1
        a sh:NodeShape ;
        rdfs:isDefinedBy ex:ont-m ;
    .

    ex:shape-2
        a sh:NodeShape ;
        rdfs:isDefinedBy ex:ont-m ;
    .
}

# perhaps in a different file from the one containing the above data...
ex:graph-y {
    ex:ont-n
        a owl:Ontology ;
    .

    ex:shape-2
        a sh:NodeShape ;
        rdfs:isDefinedBy ex:ont-m ;
        rdfs:member ex:ont-n ;
    .
}
{
  "@graph": [
    {
      "@graph": [
        {
          "@id": "ex:ont-m",
          "@type": "owl:Ontology"
        },
        {
          "@id": "ex:shape-1",
          "@type": "sh:NodeShape",
          "rdfs:isDefinedBy": {
            "@id": "ex:ont-m"
          }
        },
        {
          "@id": "ex:shape-2",
          "@type": "sh:NodeShape",
          "rdfs:isDefinedBy": {
            "@id": "ex:ont-m"
          }
        }
      ],
      "@id": "ex:graph-x"
    },
    {
      "@graph": [
        {
          "@id": "ex:ont-n",
          "@type": "owl:Ontology"
        },
        {
          "@id": "ex:shape-2",
          "@type": "sh:NodeShape",
          "rdfs:isDefinedBy": {
            "@id": "ex:ont-m"
          },
          "rdfs:member": {
            "@id": "ex:ont-n"
          }
        }
      ],
      "@id": "ex:graph-y"
    }
  ]
}
Remove "Shapes graph" heading and convert the above "turtle" to "trig".

In the example above, the ontology defining the shapes, ex:ont-m, contains no references to the reusing ontology, ex:ont-n, and likely would have been created before it.

The reusing ontology retains the indication of ex:shape-2 having been defined by ex:ont-m and then adds an indicator of it being reused within its own ontology: use of the predicate rdfs:member.

When a set of SHACL information is contained within an owl:Ontology, it may be useful to identify the named graph used to package the content with the same IRI as the ontology.
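A minimal sketch of this arrangement, reusing the names ex:ont-m and ex:shape-1 from the example above for illustration:

```trig
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix ex: <http://example.com/ns#> .

# The named graph and the ontology it packages share the IRI ex:ont-m
ex:ont-m {
    ex:ont-m
        a owl:Ontology ;
    .

    ex:shape-1
        a sh:NodeShape ;
        rdfs:isDefinedBy ex:ont-m ;
    .
}
```

With this pattern, a single IRI leads both to the ontology's metadata and to the graph that holds its shapes.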

Profiles of SHACL

This section describes some profiles of SHACL 1.2 that were created alongside the SHACL 1.2 Specifications. It also indicates how to create more profiles of SHACL.

To validate RDF data against any of these Specification Profiles, or the Union Profile defined below, use the validators listed in the SHACL 1.2 Overview's SHACL-SHACL appendix.

Specification Profiles

The Specification Profiles are profiles of SHACL 1.2 defined in each of the specifications listed above.

When described using terminology from [[[dx-prof]]], the online document form of each specification is just one of many resources that together make up the specification. It plays the resource role of Specification - "Defining the profile in human-readable form" and other resources play other roles. Of particular interest to SHACL users is the SHACL validator for each profile, which plays the role of Validation - "Supplies instructions about how to verify conformance of data to the profile". These validation resources are listed in SHACL 1.2 Overview's SHACL-SHACL appendix.

Together, these Specification Profiles cover all of SHACL 1.2 and they are bundled together within the Union Profile, defined below.

Union Profile

The Union profil

Creating other Profiles

Profiling with SHACL

This section describes how to profile specifications with SHACL.

Profile Part Roles

SHACL is a language that can implement constraints and inference rules for RDF data. These two things are not the complete set of things that a profile designer may wish to do; for example, they may want to include documentation about why and how the profile was created, or provide new model elements (classes, predicates, etc.) within an extended schema.

Since SHACL cannot be used for all possible profile parts, profile designers need to look to specifications outside the SHACL family of specifications for guidance on a profile's total set of parts and how to relate them to one another. [[[dx-prof]]] is expected to be used for this.

Referencing [[[dx-prof]]]'s vocabulary of Resource Role Instances, it seems clear that SHACL graphs can be applied to the role of Validation within a profile and perhaps also to the role of Specification, since SHACL can be used to declare Node and Property Shapes just as OWL can be used to declare Classes and Properties.

SHACL resources can also document the constraints for which they implement validation shapes; therefore, they can also be applied to the Constraints role.

They could also be applied to the Mapping role, if SHACL Rules are implemented to transform data from one model to another.

SHACL lists of values required for use with a model could be applied to the Vocabulary role.
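As a sketch of how such role assignments might be recorded, the following uses terms from the [[[dx-prof]]] vocabulary and its set of role instances. The profile, the specification it profiles, the resource descriptors and the artifact IRIs are all hypothetical.

```trig
@prefix prof: <http://www.w3.org/ns/dx/prof/> .
@prefix role: <http://www.w3.org/ns/dx/prof/role/> .
@prefix ex: <http://example.com/ns#> .

# Hypothetical profile of a hypothetical specification
ex:my-profile
    a prof:Profile ;
    prof:isProfileOf ex:some-specification ;
    prof:hasResource ex:my-profile-validator , ex:my-profile-code-lists ;
.

# A SHACL shapes graph playing the Validation role
ex:my-profile-validator
    a prof:ResourceDescriptor ;
    prof:hasRole role:validation ;
    prof:hasArtifact <http://example.com/my-profile/validator.trig> ;
.

# SHACL lists of allowed values playing the Vocabulary role
ex:my-profile-code-lists
    a prof:ResourceDescriptor ;
    prof:hasRole role:vocabulary ;
    prof:hasArtifact <http://example.com/my-profile/code-lists.trig> ;
.
```

Resources in the Constraints or Mapping roles could be described in the same way with role:constraints or role:mapping.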

Profile Hierarchies

Creating Null Profiles

A null profile of a specification, created using SHACL, is a profile of that specification in which SHACL is used to implement some, and likely all, of the specification's rules, but no other rules.

The main purpose of creating a null profile of a specification using SHACL is to enable the testing of conformance to that specification by SHACL validation. This purpose exists because many RDF data models exist that have a model specification, perhaps an OWL model or only a natural-language document of a model, but do not provide a mechanism for data validation, such as a SHACL validator.
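For illustration only, suppose a hypothetical specification states the rule "Every ex:Dataset has exactly one ex:title". A null profile of it might contain a shape such as the following sketch, where the class, property and shape names are all assumptions:

```trig
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix ex: <http://example.com/ns#> .

# Implements the hypothetical rule: "Every ex:Dataset has exactly one ex:title"
ex:DatasetShape
    a sh:NodeShape ;
    sh:targetClass ex:Dataset ;
    sh:property [
        sh:path ex:title ;
        sh:minCount 1 ;
        sh:maxCount 1 ;
    ] ;
.
```

To remain a null profile, the shape adds nothing that the specification does not state. For example, it places no sh:datatype constraint on ex:title, because the hypothetical rule does not mention one.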

While null profiles of specifications can be created using mechanisms other than SHACL, here we focus only on the use of SHACL.

An example of a non-SHACL validator for RDF data that acts as a null profile is the W3C's [[[prov-constraints]]], which provides a list of constraints that apply to provenance data formulated according to [[[prov-o]]]. [[[prov-constraints]]] implements no constraints beyond those stated or implied in [[[prov-o]]] and the conceptual [[[prov-dm]]]; however, it does include tests for things that the models do not explicitly model but that their proper use requires, e.g., the ordering of temporal entities. Implementations of those constraints have been made as Python scripts that execute SPARQL queries, allowing for RDF data validation.

Summary of Syntax Rules from this Specification

Security and Privacy Considerations

Like most RDF-based technologies, SHACL processors may operate on graphs that are assembled from various sources. Some applications may have an open "linked data" ("LD") architecture and dynamically assemble RDF triples from sources that are outside an organization's network of trust. Since RDF allows anyone to add statements about any resource, triples may modify the originally intended semantics of shape definitions or nodes in a data graph and thus feed into misleading results. Protection against this (and the following) scenario is achievable by using only trusted and verified RDF sources and eliminating the possibility that graphs are dynamically added via owl:imports and sh:shapesGraph.
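For illustration, the statements below are of the kind through which further graphs can be pulled in dynamically. A cautious deployment might refuse to follow them unless the referenced graphs are known and trusted. All ex: IRIs are hypothetical.

```trig
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix ex: <http://example.com/ns#> .

# In a data graph: suggests a shapes graph to the SHACL processor
ex:data-graph sh:shapesGraph ex:untrusted-shapes .

# In a shapes graph: pulls in a further graph
ex:untrusted-shapes owl:imports ex:more-shapes .
```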

When creating profiles of other specifications, profile creators need to ensure that their constraints do not violate those specifications' rules. If any did so, and if only the profile's rules, but not the specification's rules, were used to check for data validity, by accident or by design, data could be wrongly calculated to be valid. This could lead to accidental data release or use, potentially introducing security issues.

Acknowledgements

Many people contributed to this specification, including members of the RDF Data Shapes Working Group.

Internationalization Considerations

TODO

Specification Profiles Listing

A description of each SHACL 1.2 specification as a Profiles Vocabulary profile, including a listing of all its resources and the roles they play, is given in Appendix G below.

The individual profiles are as follows:

SHACL 1.2 Core Profile

http://www.w3.org/ns/shacl#shacl12-core-profile

The set of SHACL 1.2 elements defined in the SHACL 1.2 Core specification.

SHACL 1.2 SPARQL Profile

http://www.w3.org/ns/shacl#shacl12-sparql-profile

The set of SHACL 1.2 elements defined in the SHACL 1.2 SPARQL specification.