SPARQL

SPARQL, short for “SPARQL Protocol and RDF Query Language” and pronounced “sparkle,” is a query language that allows users to query triplestores.

SPARQL queries take the form of a string. They are directed at a SPARQL endpoint, a location on the internet that is capable of receiving and processing SPARQL queries.

It is useful to think of a SPARQL query as a set of sentences with blanks. The database will take this query and find every set of matching statements that correctly fills in those blanks. In other words, the query is looking for data that follows a pattern that you have described. What makes SPARQL powerful is the ability to create complex queries that reference many variables at a time.

SPARQL queries can be used to query named graphs, such as those created and maintained by LINCS.

To do SPARQL queries, you will need to know:

  • How to construct queries
  • What sorts of questions can be asked with a query
info

Check out the LINCS SPARQL Endpoint and run queries without leaving the LINCS site.

Construct a Query

A SPARQL query is like a recipe. There are four main ingredients:

  1. Prefix(es)
  2. Type of Query
  3. Query
  4. Modifier(s)

Prefixes

Prefixes are shorthand abbreviations for the full Internationalized Resource Identifiers (IRIs) that tell the SPARQL endpoint where to go to look for the data. Prefixes are placed at the top of your query so that you do not have to type out the full IRIs every time you want to refer to them.

In the following example, prefixes have been added for the identity namespace, the Resource Description Framework (RDF), the Resource Description Framework Schema (RDFS), and the Simple Knowledge Organization System (SKOS):

PREFIX identity: <http://id.lincsproject.ca/identity/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
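
As a rough illustration of what a prefix saves you, the two triple patterns in the sketch below mean the same thing: the first spells out the full IRI and the second uses the rdfs: prefix. The variable names ?thing and ?label are arbitrary choices made for this example.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?thing ?label
WHERE {
  # Full IRI written out in angle brackets
  ?thing <http://www.w3.org/2000/01/rdf-schema#label> ?label .
  # Same predicate, abbreviated with the rdfs: prefix
  ?thing rdfs:label ?label .
}
LIMIT 10
```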

Type of Query

Following your prefixes, you need to declare the type of SPARQL query. There are four types of queries: ASK, SELECT, DESCRIBE, and CONSTRUCT. Each type of SPARQL query includes the same essential components, but each serves a different purpose and will give you a different type of results.

ASK Query

ASK queries return a yes or no answer.
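
A minimal sketch of an ASK query, assuming the data records names as rdfs:label values. The name "Allen Sapp" is borrowed from the example later on this page; whether a label in exactly this form exists in a given dataset is an assumption.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# Is there anything labelled "Allen Sapp"?
ASK
WHERE {
  ?person rdfs:label ?label .
  # STR() drops any language tag so that "Allen Sapp"@en also matches
  FILTER (STR(?label) = "Allen Sapp")
}
```

The endpoint answers true if at least one match exists and false otherwise.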

SELECT Query

SELECT queries return a table of all of the values that match your query pattern, with one column for each variable you select.
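
A minimal sketch of a SELECT query, assuming only that some resources in the data have an rdfs:label. The variable names are arbitrary.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# List resources together with their labels
SELECT ?item ?label
WHERE {
  ?item rdfs:label ?label .
}
LIMIT 25
```

Each row of the result pairs one ?item with one ?label.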

DESCRIBE Query

DESCRIBE queries return all known information about a particular entity.
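
A minimal sketch of a DESCRIBE query, using the identity term URI quoted elsewhere on this page. A DESCRIBE query that names a URI directly does not need a WHERE clause, and exactly which triples come back is decided by the endpoint, so results may differ between triplestores.

```sparql
# Ask the endpoint for what it knows about one entity
DESCRIBE <http://id.lincsproject.ca/identity/woman>
```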

CONSTRUCT Query

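CONSTRUCT queries return new triples. You provide a template, and the endpoint fills it in with the values that match your query pattern, which lets you pull information from multiple triples into a new graph.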
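
A minimal sketch of a CONSTRUCT query, assuming the data contains rdfs:label values. The choice to restate them as skos:prefLabel triples is purely illustrative.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

# Template: the shape of the triples to build
CONSTRUCT {
  ?item skos:prefLabel ?label .
}
# Pattern: where the values for the template come from
WHERE {
  ?item rdfs:label ?label .
}
```

The result is a set of triples rather than a table, and the stored data itself is not changed.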

info

Coming soon! Example queries will be provided once the LINCS SPARQL Endpoint is live.

Query

Triples

After you have declared which type of query you are going to construct, you need to fill in the structure of the query. The query structure is composed of triples: a subject, predicate, and object. Each component of a triple is either a query variable or a Uniform Resource Identifier (URI); the object can also be a literal value, such as a name or a date.

A query variable is a placeholder for the value that you are searching for. Variables are indicated with a question mark followed by a word. The word you choose for a variable is arbitrary, but should be human-readable for ease of understanding if shared with others. It is important that you use a variable consistently within a query, because the same name always refers to the same value.

?name

?item

A URI is a unique identifier that represents a thing that exists in the LINCS triplestore. It can be a property, entity, graph, class in the ontology (or ontologies), or even a vocabulary term (type). URIs are typically shortened using a prefix, or namespace. For example, the full URI for the identity property “woman,” <http://id.lincsproject.ca/identity/woman>, can be shortened to identity:woman using the identity namespace.
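
As a sketch, the query below contains a single triple whose subject and predicate are URIs and whose object is a variable. It assumes that the identity term carries a skos:prefLabel, which may not hold for every vocabulary.

```sparql
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

SELECT ?label
WHERE {
  # subject: a full URI; predicate: a prefixed URI; object: a variable
  <http://id.lincsproject.ca/identity/woman> skos:prefLabel ?label .
}
```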

WHERE Statement

Almost every query has a WHERE statement (a DESCRIBE query that names a single URI is the main exception). The WHERE statement follows the declaration of the query type and, in a SELECT query, the list of variables that will be used as headings in the table of results. It comes before the query pattern and indicates that what follows is WHERE to look for the pattern that the query must match.

Syntax

Your query will only work if you use the proper syntax.

Following your WHERE statement, use a curly bracket to enclose your query pattern. Curly brackets must appear in pairs. Every bracket you open must close later in your query. You can add extra curly brackets to help you organize your query, but each opening bracket must have a corresponding closing bracket.

Every triple in a query must end in a period, unless it ends in a semicolon because the next line reuses the same subject.

If a subject used in one line is repeated in the next line, the subject can be omitted as long as there is a semicolon at the end of the first line to indicate use of the same subject in the second line.
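
The syntax rules above can be sketched as follows. The long form is shown in comments, and the short form with a semicolon is the part that runs; both describe the same pattern. The variable names are arbitrary.

```sparql
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?item ?type ?label
WHERE {
  # Long form: the subject is written out in both triples
  #   ?item rdf:type ?type .
  #   ?item rdfs:label ?label .

  # Short form: the semicolon carries ?item over to the next line
  ?item rdf:type ?type ;
        rdfs:label ?label .
}
LIMIT 10
```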

info

Coming soon! Example queries will be added to the LINCS SPARQL Endpoint as data is published.

Modifiers

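In a more complicated query, you can add modifiers to combine multiple criteria and to shape your results. OPTIONAL, UNION, and FILTER are written inside the curly brackets of the WHERE statement, while ORDER BY and LIMIT come after the closing bracket. For a list of all possible modifiers, see W3C’s Solution Sequence Modifiers. Common modifiers are described below.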

OPTIONAL Modifier

The OPTIONAL modifier allows you to indicate something extra that you would like to have included in your results, if it is present in the data. For example, you can ask for optional images or optional additional information. Using the OPTIONAL modifier means that you will get results even for things that do not contain the information that you have marked as optional. For example, a query with optional images will return all correct results with and without images, but will include the images where they are available.
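
A minimal sketch of the optional-image case. It assumes that images are attached with foaf:depiction, a property from the FOAF vocabulary; a given dataset may well use a different property.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?item ?label ?image
WHERE {
  ?item rdfs:label ?label .
  # Items without an image are still returned; ?image is simply left empty
  OPTIONAL { ?item foaf:depiction ?image . }
}
LIMIT 25
```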

UNION Modifier

The UNION modifier allows you to combine the results of multiple graph patterns. You can use it to pull together multiple queries, or to ask the same query in more than one way. Asking the same query in more than one way can potentially broaden your results.
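
A minimal sketch of asking the same question in two ways, assuming that some resources record their name with rdfs:label and others with skos:prefLabel.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

SELECT ?item ?name
WHERE {
  # Either pattern may match; the results of both are combined
  { ?item rdfs:label ?name . }
  UNION
  { ?item skos:prefLabel ?name . }
}
LIMIT 25
```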

FILTER Modifier

The FILTER modifier allows you to filter your results so that you only see a subset of what appears in the data. For example, you can use the YEAR function within a filter to retrieve results from a specific time period. If you are querying a dataset that includes multiple languages and would only like to see results in one language, you can use the LANG function within a filter.
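
A minimal sketch combining both filters. It assumes that labels carry language tags and that dates are stored with dcterms:created as xsd:dateTime values; both are assumptions about the data, and YEAR may raise an error on other kinds of date values.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dcterms: <http://purl.org/dc/terms/>

SELECT ?item ?label ?date
WHERE {
  ?item rdfs:label ?label ;
        dcterms:created ?date .
  # Keep only English labels
  FILTER (LANG(?label) = "en")
  # Keep only dates in the twentieth century
  FILTER (YEAR(?date) >= 1900 && YEAR(?date) <= 1999)
}
```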

ORDER BY ?variable Modifier

The ORDER BY ?variable modifier allows you to sort the order in which your results appear, for example in alphabetical or chronological order.
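
A minimal sketch of sorting results alphabetically by label, assuming the data contains rdfs:label values.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?item ?label
WHERE {
  ?item rdfs:label ?label .
}
# Ascending order is the default; write ORDER BY DESC(?label) to reverse it
ORDER BY ?label
```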

LIMIT ?number Modifier

The LIMIT modifier, written as LIMIT followed by a whole number (for example, LIMIT 10), allows you to limit the number of results that come back. This modifier is useful if you want to check that your query works without returning hundreds or thousands of results, as the more results your query generates, the slower it will run.
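
A minimal sketch of a test run that returns at most ten rows, assuming the data contains rdfs:label values.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?item ?label
WHERE {
  ?item rdfs:label ?label .
}
# Stop after the first 10 results
LIMIT 10
```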

Determine Questions

To construct a SPARQL query, you first need to determine what you can ask:

  • Make a list of the things you want to know
  • Break down your question into as many smaller questions as possible
  • Come up with some potential correct and incorrect answers so you will be able to check that your query is working
  • Look through your data to see what information it has and make up a question that will lead you back to this information

Here is a simplified example of a graph showing the results generated by a SPARQL query. This example, from the University of Saskatchewan Art Collection, presents information about the painting “People Going into the Dancing Hall,” which was created by Allen Sapp in the twentieth century.

Ovals represent entities. Each entity is a URI and has a human-readable label. Rectangles represent literals—strings of characters that are human-readable rather than machine processable, such as names and vague dates.

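[Figure: graph of the entities and literals returned by a SPARQL query about “People Going into the Dancing Hall”]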

While you may expect the graph to centre on the art object, the relationships of interest to our query (who made the object and when) are actually linked through an intermediary node (in blue), which represents an event, in this case a Production Event. This is because LINCS has adopted CIDOC CRM as its upper-level ontology. CIDOC CRM is an event-centric model, so many of the triples within the datasets hosted by LINCS include an event somewhere within them. Understanding the data model is essential to using SPARQL. If you are interested in learning more about the data model, here are some resources to get you started:

  • Ontologies
  • CIDOC-CRM
  • LINCS Application Profile [Coming soon!]
  • Project-Specific Application Profiles [Coming soon!]
  • Using SPARQL with LINCS Data [Coming soon!]
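
The event-centric pattern described above can be sketched as follows. The property names come from CIDOC CRM, but the namespace IRI shown and the exact path from object to maker and time-span are assumptions; a given LINCS dataset may model this differently, so check its application profile before relying on this query.

```sparql
PREFIX crm: <http://www.cidoc-crm.org/cidoc-crm/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?objectLabel ?makerLabel ?timeSpanLabel
WHERE {
  # The object is linked to its production event, not directly to its maker
  ?object rdfs:label ?objectLabel ;
          crm:P108i_was_produced_by ?production .

  # The production event links onward to the maker
  ?production crm:P14_carried_out_by ?maker .
  ?maker rdfs:label ?makerLabel .

  # The date, if recorded, also hangs off the production event
  OPTIONAL {
    ?production crm:P4_has_time-span ?timeSpan .
    ?timeSpan rdfs:label ?timeSpanLabel .
  }
}
LIMIT 25
```

Note how both questions (who made the object and when) are answered by following the production event rather than the object itself.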
tip

Tips for learning to build queries:

  • Start small
  • Borrow components from other queries, and tweak them bit by bit
  • Make simple queries and then look for ways to make them more complex
  • Backtrack and try again if your query breaks

Summary

  • SPARQL is a query language that allows users to query triplestores.
  • SPARQL queries are directed at a SPARQL endpoint, a location on the internet that is capable of receiving and processing SPARQL queries.
  • SPARQL queries have four main ingredients: prefix(es); type of query; query; and modifier(s).
  • There are four types of SPARQL queries: ASK, SELECT, DESCRIBE, and CONSTRUCT.
  • To construct an effective SPARQL query, you first need to determine what you can ask.
