XSPARQL: Implementation and Test-cases

Authors:
Nuno Lopes - DERI, NUI Galway
Thomas Krennwallner - Institute of Information Systems, Vienna University of Technology
Axel Polleres - DERI, NUI Galway
Waseem Akhtar - DERI, NUI Galway
Stéphane Corlosquet - DERI, NUI Galway

This work is supported by Science Foundation Ireland under grant numbers SFI/02/CE1/I131 and SFI/08/CE/I1380 and by the European Commission under the FP6 project inContext (IST-034718).


Abstract

XSPARQL is a query language combining XQuery and SPARQL for transformations between RDF and XML. This document describes a prototype implementation of the language based on off-the-shelf XQuery and SPARQL engines. Along with a high-level description of the prototype, the document presents a set of test queries and their expected output, which are intended as an illustrative aid for other implementers.



1. Introduction

This document introduces a simple implementation of the XSPARQL language as a proof of concept. The implementation described here can be found in the XSPARQLer Open Source project page at SourceForge.net.

2. XSPARQL Implementation

The main idea behind our implementation is to translate XSPARQL queries into corresponding XQueries, which may use interleaved calls to a SPARQL endpoint. The architecture of our prototype, shown in Figure 1, consists of three main components: (1) a query rewriter, which turns an XSPARQL query into an XQuery; (2) a SPARQL endpoint, for querying RDF from within the rewritten XQuery; and (3) an XQuery engine for computing the result document.

The rewriter (Algorithm 1) takes as input a full XSPARQL QueryBody [XQUERYSEMANTICS] q (i.e., a sequence of FLWOR' expressions), a set of bound variables b, and a set of position variables p, which we explain below. For an FL (or F', resp.) clause s, we denote by vars(s) the list of all newly declared variables (or the varlist, resp.) of s. We only sketch the core rewriting function rewrite() here; the full implementation needs additional machinery for handling the prolog, including function, variable, module, and namespace declarations. The rewriting is initiated by invoking rewrite(q, ∅, ∅) with empty sets of bound and position variables and results in a syntactically valid XQuery that can be executed using an off-the-shelf XQuery implementation.

Algorithm 1: rewrite(q, b, p): Rewrite XSPARQL q to an XQuery
Input: XSPARQL query q, set of bound variables b, set of position variables p
Result: XQuery

 1 if q is of form s1, ... , sk then
 2     return rewrite(s1, b, p), ... , rewrite(sk, b, p)
 3 else if q is of form for $x1 in XPathExpr1, ... , $xk in XPathExprk s1 then
 4     return for $x1 at $x1_pos in XPathExpr1, ... , $xk at $xk_pos in XPathExprk
 5         rewrite(s1, b, p ∪ {$x1_pos, ... , $xk_pos})
 6 else if q is of form for $x1 ... $xn from D where { pattern } M s1 then
 7     return let $aux_query := sparql(D, {$x1, ... , $xn}, pattern, M, b)
 8         for $aux_result in doc($aux_query)//sparql:result
 9         auxvars({$x1, ... , $xn})    rewrite(s1, b ∪ vars(q), p)
10 else if q is of form construct {template} then
11     return return (rewrite-template(template, b, p))
12 else
13     split q into its subexpressions s1, ... , sn
14     for j := 1, ... , n do bj := b ∪ ⋃(1 ≤ i ≤ j-1) vars(si)
15     if n > 1 then return q[s1/rewrite(s1, b1, p), ... , sn/rewrite(sn, bn, p)]
16     else return q
17 end
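
To make lines 3-5 of Algorithm 1 more concrete, the following Python sketch decorates the variables of an XQuery for clause with position variables. It is only an illustration and not the XSPARQLer code: the function name decorate_for_clause and the representation of a for clause as a list of (variable, expression) pairs are assumptions made for this example. The `$_x_Pos` naming follows the rewritten queries shown in Section 3.

```python
def decorate_for_clause(bindings, positions=frozenset()):
    """Add an `at $_<name>_Pos` position variable to each for-binding.

    bindings  -- list of (variable, xpath_expression) pairs (hypothetical representation)
    positions -- position variables already in scope (the set p of Algorithm 1)

    Returns the decorated clause text and the extended set of position variables.
    """
    parts = []
    new_positions = set(positions)
    for var, expr in bindings:
        pos_var = "$_" + var.lstrip("$") + "_Pos"
        parts.append(f"{var} at {pos_var} in {expr}")
        new_positions.add(pos_var)
    return "for " + ",\n    ".join(parts), new_positions


clause, p = decorate_for_clause([
    ("$person", 'doc("relations.xml")//person'),
    ("$nameA", "$person/@name"),
])
print(clause)
```

The returned set plays the role of p ∪ {$x1_pos, ... , $xk_pos}, which would be passed on to the recursive call for the body of the loop.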

The rewriter is implemented as a Python script which is part of the XSPARQLer Open Source distribution. We provide an online interface where example queries can be found and tested at http://xsparql.deri.org/demo/. Figure 3 shows the output of our translation for the construct query in Figure 2. Let us explain the algorithm, which may be viewed as consisting of two parts, responsible for lifting and lowering (cf. Section 2 of [XSPARQLLANGUAGE]), respectively, using this sample output.


Figure 2: RDF-to-RDF mapping in XSPARQL
prefix vc: <http://www.w3.org/2001/vcard-rdf/3.0#>
prefix foaf: <http://xmlns.com/foaf/0.1/>
construct {_:b foaf:name
             { fn:concat($N," ",$F) } . }
from <vc.rdf>
where { $P vc:Given $N. $P vc:Family $F.}

Figure 3: XQuery rewriting output for the query from Figure 2:
 1 import module namespace _xsparql = "http://xsparql.deri.org/XSPARQLer/xsparql.xquery"
 2  at "http://xsparql.deri.org/XSPARQLer/xsparql.xquery";
 3 declare namespace vc = "http://www.w3.org/2001/vcard-rdf/3.0#";
 4 declare namespace foaf = "http://xmlns.com/foaf/0.1/";
 5 declare namespace _sparql_result = "http://www.w3.org/2005/sparql-results#";
 6 declare variable $_NS1 := "prefix vc: <http://www.w3.org/2001/vcard-rdf/3.0#> ";
 7 declare variable $_NS2 := "prefix foaf: <http://xmlns.com/foaf/0.1/> ";
 8 _xsparql:_serialize(("@",$_NS1,".","@",$_NS2,".")),
 9 let $_aux1 := _xsparql:_serialize((
10  "http://example.org/sparql?query=", fn:encode-for-uri(_xsparql:_serialize(($_NS1, $_NS2,
11  "select $P $N $F from <vc.rdf> where {$P vc:Given $N. $P vc:Family $F.}")))))
12 for $_aux_result1 at $_aux_result1_Pos in doc($_aux1)//_sparql_result:result
13  let $_P_Node := $_aux_result1/_sparql_result:binding[@name="P"]
14  let $P := data($_P_Node/*)
15  let $_P_NodeType := name($_P_Node/*)
16  let $_P_NodeDatatype := string($_P_Node/*/@datatype)
17  let $_P_NodeLang := string($_P_Node/*/@lang)
18  let $_P_RDFTerm := _xsparql:_rdf_term($_P_NodeType,$P)
19  let $_N_Node := $_aux_result1/_sparql_result:binding[@name="N"]
20  let $N := data($_N_Node/*)
21  let $_N_NodeType := name($_N_Node/*)
22  let $_N_NodeDatatype := string($_N_Node/*/@datatype)
23  let $_N_NodeLang := string($_N_Node/*/@lang)
25  let $_N_RDFTerm := _xsparql:_rdf_term($_N_NodeType,$N)
26  let $_F_Node := $_aux_result1/_sparql_result:binding[@name="F"]
27  let $F := data($_F_Node/*)
28  let $_F_NodeType := name($_F_Node/*)
29  let $_F_NodeDatatype := string($_F_Node/*/@datatype)
30  let $_F_NodeLang := string($_F_Node/*/@lang)
31  let $_F_RDFTerm := _xsparql:_rdf_term($_F_NodeType,$F)
32  let $_validSubject1 := _xsparql:_serialize(("_:b", "_", data($_aux_result1_Pos)))
33  let $_validObject2 := _xsparql:_serialize(('"', fn:concat($N, " ", $F), '"'))
34  return if (_xsparql:_validSubject("", $_validSubject1)) then (
35    if (_xsparql:_validObject("", $_validObject2)) then (
36      _xsparql:_serialize(($_validSubject1, " foaf:name ", $_validObject2, " ."))
37    ) else "") else ""

Before we rewrite the QueryBody q, we process the prolog (P) of the XSPARQL query and output every namespace declaration as Turtle string literals "@prefix ns: <URI>." After generating the prolog (lines 1-8 of the output), the rewriting of the QueryBody is performed recursively following the syntax of XSPARQL. During the traversal of the nested FLWOR' expressions, SPARQL-like bodies (lowering) or heads (lifting) will be replaced by XQuery expressions, which handle our two tasks. The lowering part is processed first:

Lowering. Normal XQuery-like FLWO expressions are simply copied to the output and "decorated" (cf. [XSPARQLSEMANTICS]) with position variables, see lines 3-5 of Algorithm 1. The lowering part of XSPARQL, i.e., SPARQL-like F'DWM blocks, is "encoded" in XQuery with interleaved calls to an external SPARQL endpoint. To this end, we translate F'DWM blocks into equivalent XQuery FLWO expressions which retrieve SPARQL result XML documents [SPARQLRESULT] from a SPARQL engine; i.e., we "push" each F'DWM body to the SPARQL side by translating it to a native select query string, see lines 6-9 of Algorithm 1. The auxiliary function sparql() in line 7 of our rewriter transforms the where { pattern } part of F'DWM clauses into XQuery expressions in which all bound variables in pattern are replaced by their values; "free" XSPARQL variables serve as binding variables for the SPARQL query result. The outcome of the sparql() function is a list of expressions, which is concatenated and URI-encoded using XQuery's XPath functions, and wrapped into a URI with http scheme pointing to the SPARQL query service (lines 9-11 of the output), cf. [SPARQLPROTOCOL].

Then we create a new XQuery for-loop that iterates over variable $aux_result, i.e., over the query answers extracted from the SPARQL XML result returned by the SPARQL query processor (line 12 of the output). For each variable $xi ∈ vars(s) (i.e., in the (F') for clause of the original F'DWM body), new auxiliary variables are defined in separate let-expressions extracting its node, content, type (i.e., literal, uri, or blank), datatype URI or language tag if present, and the corresponding RDFTerm ($xi_Node, $xi, $xi_NodeType, $xi_NodeDatatype, $xi_NodeLang, and $xi_RDFTerm, resp.) by appropriate XPath expressions (lines 13-31 of Figure 3); the auxvars() helper function in line 9 of Algorithm 1 is responsible for this. Thereafter, the rewriter is called again recursively, with the newly declared variables added to b.
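
The substitution and URI-encoding step described for sparql() can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the rewriter's actual code: the function name sparql_url, its parameters, and the example IRI http://example.org/alice are hypothetical, and the substitution is done here on concrete values, whereas the rewriter emits XQuery expressions that perform it at query time. The endpoint is the placeholder URL that appears in the rewritten queries.

```python
import re
from urllib.parse import quote

# Placeholder endpoint, as used in the rewritten queries of this document.
ENDPOINT = "http://example.org/sparql?query="


def sparql_url(prefixes, variables, dataset, pattern, bound):
    """Build the endpoint URL for one F'DWM block (hypothetical helper).

    prefixes  -- SPARQL prefix declarations as one string
    variables -- free variables to select, e.g. ["$FName"]
    dataset   -- IRI of the from clause
    pattern   -- body of the where clause
    bound     -- dict mapping bound variables to serialized RDF terms
    """
    def substitute(match):
        var = match.group(0)
        return bound.get(var, var)

    # Replace each bound variable by its value; free variables stay untouched.
    body = re.sub(r"\$\w+", substitute, pattern)
    query = f"{prefixes} select {' '.join(variables)} from <{dataset}> where {{ {body} }}"
    # Percent-encode the whole query, comparable to fn:encode-for-uri.
    return ENDPOINT + quote(query, safe="")


url = sparql_url(
    "prefix foaf: <http://xmlns.com/foaf/0.1/>",
    ["$FName"],
    "relations.rdf",
    "$Person foaf:knows $Friend. $Friend foaf:name $FName.",
    {"$Person": "<http://example.org/alice>"},  # made-up binding for illustration
)
print(url)
```

In the resulting query string, $Person is replaced by its value, while $Friend and $FName remain variables to be bound by the SPARQL engine.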

Lifting. For the lifting part, i.e., SPARQL-like constructs in the R part, the transformation process is straightforward: Algorithm 1 is called on q and recursively decorates every for $Var expression with fresh position variables (line 12 of our example output); ultimately, construct templates are rewritten to an assembled string of the pattern's constituents, filling in variable bindings and evaluated subexpressions (lines 32-37 of the output).

Blank nodes in constructs need special care since, according to SPARQL's semantics, a new blank node identifier must be created for each solution binding. This is solved by "adorning" each blank node identifier in the construct part with the above-mentioned position variables from any enclosing for-loops, thus creating a new, unique blank node identifier in each loop (line 32 in the output). The auxiliary function rewrite-template() in line 11 of the algorithm provides this functionality by simply appending the list of all position variables in p as expressions to each blank node identifier; if there are nested expressions in the supplied construct {template}, rewrite-template() returns a sequence of nested FLWORs, with rewrite() applied recursively to these expressions with the in-scope bound and position variables. rewrite-template() creates new variables for each of the RDFTerms that need to be dynamically evaluated for validity (variables $_validSubject1 and $_validObject2 in lines 32 and 33 of the output). Finally, rewrite-template() generates a return clause which checks the validity of the generated triples in Turtle syntax by respective helper function calls (validSubject(), validPredicate(), and validObject(), which are declared, along with all other helper functions, in the http://xsparql.deri.org/XSPARQLer/xsparql.xquery library), see lines 34-37 of the output.
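
The effect of adorning blank node identifiers with position variables can be mimicked in a few lines of Python. The sketch below is an illustration only: adorn_blank_node and construct_names are hypothetical names, the loop index stands in for the XQuery position variable, and the second solution in the usage example is made up. The first solution corresponds to the expected output of the vCard example in Section 3.3.

```python
def adorn_blank_node(label, position_values):
    """Append the values of all enclosing loop positions to a blank node label."""
    return "_".join([label] + [str(v) for v in position_values])


def construct_names(solutions):
    """Emit one Turtle triple per solution, each with its own blank node."""
    triples = []
    # XQuery position variables start at 1, so the enumeration does too.
    for pos, (given, family) in enumerate(solutions, start=1):
        subject = adorn_blank_node("_:b", [pos])
        triples.append(f'{subject} foaf:name "{given} {family}" .')
    return triples


for triple in construct_names([("Axel", "Polleres"), ("Nuno", "Lopes")]):
    print(triple)
# _:b_1 foaf:name "Axel Polleres" .
# _:b_2 foaf:name "Nuno Lopes" .
```

With nested loops, the list of position values simply grows by one entry per enclosing loop, so identifiers stay unique across all iterations.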

Note that expressions involving SPARQL-like construct clauses create Turtle [TURTLE] output. If other output formats such as RDF/XML are needed, our implementation can optionally generate them by simple post-processing of the Turtle output with standard RDF processing tools.

Finally, let us remark that although both our implementation and XSPARQL's semantics definition [XSPARQLSEMANTICS] use the SPARQL result format [SPARQLRESULT] for retrieving the results of a SPARQL query, other implementations can be conceived that do not require this intermediate step, but either use optimized internal data structures to pass results between native SPARQL and XQuery processors, or implement a completely native, integrated processor.
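
For readers unfamiliar with the SPARQL result format used as the intermediate step, the following Python sketch extracts, for each binding, the same pieces of information that the generated let-expressions extract (content, node type, datatype, and language tag). It is an illustration using only the standard library; the function name read_bindings and the sample document are assumptions made for this example. Note that the sketch reads the language tag from the xml:lang attribute, as defined by the SPARQL Query Results XML Format.

```python
import xml.etree.ElementTree as ET

SR = "{http://www.w3.org/2005/sparql-results#}"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def read_bindings(result_xml):
    """Yield one dict per solution: variable name -> (value, node type, datatype, language)."""
    root = ET.fromstring(result_xml)
    for result in root.iter(SR + "result"):
        row = {}
        for binding in result.findall(SR + "binding"):
            term = binding[0]  # the uri, literal, or bnode child element
            row[binding.get("name")] = (
                term.text or "",
                term.tag[len(SR):],  # strip the namespace to get uri/literal/bnode
                term.get("datatype", ""),
                term.get(XML_LANG, ""),
            )
        yield row


# Made-up sample result document for illustration.
SAMPLE = """<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head><variable name="P"/><variable name="N"/></head>
  <results>
    <result>
      <binding name="P"><bnode>r1</bnode></binding>
      <binding name="N"><literal xml:lang="en">Axel</literal></binding>
    </result>
  </results>
</sparql>"""

rows = list(read_bindings(SAMPLE))
print(rows[0]["N"])  # ('Axel', 'literal', '', 'en')
```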

3. Test cases

In the following, a set of XSPARQL test queries is presented. All these queries, along with the necessary input data, are also available in the Examples, Test cases and Use cases file which is part of the specification. Along with the original queries we also present the rewriting performed by our implementation as well as the expected query results.

3.1 foaf_lifting_naive.xsparql

This example query generates FOAF data from attribute values and element content in an input XML document, extracted by simple XPath expressions. It is intended to demonstrate a simple lifting transformation from XML to RDF.

XSPARQL query:

declare namespace foaf = "http://xmlns.com/foaf/0.1/";
for $person in doc("relations.xml")//person,
    $nameA in $person/@name,
    $nameB in $person/knows
construct
{
[ foaf:name {data($nameA)}; a foaf:Person ]
foaf:knows
[ foaf:name {data($nameB)}; a foaf:Person ].
}

Rewritten query:

import module namespace _xsparql = "http://xsparql.deri.org/XSPARQLer/xsparql.xquery"
at "http://xsparql.deri.org/XSPARQLer/xsparql.xquery";

declare namespace _sparql_result = "http://www.w3.org/2005/sparql-results#";
declare namespace foaf = "http://xmlns.com/foaf/0.1/" ;

declare variable $_NS1 := "prefix  foaf:  <http://xmlns.com/foaf/0.1/>";

 _xsparql:_serialize(("  @", $_NS1, ".", "")),

for $person at $_person_Pos  in doc("relations.xml")//person  , 
 $nameA at $_nameA_Pos  in $person/@name  , 
 $nameB at $_nameB_Pos  in $person/knows   

let $_validObject1 := _xsparql:_serialize(('"', data($nameA), '"')) 
let $_validObject2 := _xsparql:_serialize(('"', data($nameB), '"')) 

  return (_xsparql:_removeEmpty(_xsparql:_serialize((
                 "[", 
                  if ( _xsparql:_validObject("",  $_validObject1)) 
                      then (_xsparql:_serialize(" foaf:name ", $_validObject1, ";")) else "",
                  _xsparql:_serialize(" a ",   'foaf:Person', ";"),  
                  _xsparql:_serialize("foaf:knows ", 
                            "[",
                            if ( _xsparql:_validObject( "", $_validObject2)) 
                               then (_xsparql:_serialize(" foaf:name ", $_validObject2, ";")) else "",  
                            _xsparql:_serialize(" a ",   'foaf:Person',";"), 
                           " ]") , 
                  " ] ."))))

Expected output:

@prefix foaf: <http://xmlns.com/foaf/0.1/>.

[ foaf:name "Alice"; a foaf:Person;foaf:knows [ foaf:name "Bob"; a foaf:Person; ] ] . 
[ foaf:name "Alice"; a foaf:Person;foaf:knows [ foaf:name "Charles"; a foaf:Person; ] ] . 
[ foaf:name "Bob"; a foaf:Person;foaf:knows [ foaf:name "Charles"; a foaf:Person; ] ] . 

3.2 foaf_lifting.xsparql

This query performs a task very similar to the previous one: generating FOAF data from input XML. It demonstrates a full lifting transformation from XML to RDF. In particular, the difference from the previous transformation is that the same blank node identifier is given to people with the same name (assuming that names uniquely identify people in the input XML file at hand). The blank node identifier is "computed" from the position of the last occurrence of the name node in the source XML tree, since the where clause keeps only nodes that are not followed by another node with the same name.
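
The identifier computation can be illustrated with a small Python sketch. The input document below is an assumption: relations.xml is not reproduced in this document, so the sketch uses a plausible reconstruction that is consistent with the output of the lowering example in Section 3.4. The function name blank_node_ids is likewise hypothetical. The index of an element in document order equals count(preceding::*) + count(ancestor::*), and the where clause of the query keeps only the node that has no following node with the same name.

```python
import xml.etree.ElementTree as ET

# Assumed input (a reconstruction, not the original relations.xml).
RELATIONS = """<relations>
  <person name="Alice"><knows>Bob</knows><knows>Charles</knows></person>
  <person name="Bob"><knows>Charles</knows></person>
  <person name="Charles"/>
</relations>"""


def blank_node_ids(xml_text):
    """Map each name to a blank node label derived from its last occurrence."""
    root = ET.fromstring(xml_text)
    ids = {}
    # Element.iter() walks the tree in document order, so the running index
    # is the number of elements that start before the current one.
    for index, elem in enumerate(root.iter()):
        if elem.tag == "person":
            name = elem.get("name")
        elif elem.tag == "knows":
            name = elem.text
        else:
            continue
        ids[name] = "_:b" + str(index)  # later occurrences overwrite earlier ones
    return ids


print(blank_node_ids(RELATIONS))
# {'Alice': '_:b1', 'Bob': '_:b4', 'Charles': '_:b6'}
```

Under this assumed input, the labels agree with the blank node identifiers in the expected output of this test case.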

XSPARQL query:

declare namespace foaf="http://xmlns.com/foaf/0.1/"; 
let $doc := doc("relations.xml")
let $persons := $doc//*[@name or ../knows] 
return 
 for $p in $persons 
 let $n := if( $p[@name]) then $p/@name else $p 
 let $id := count($p/preceding::*) + count($p/ancestor::*) 
 where not(exists($p/following::*[@name=$n or data(.)=$n])) 
 construct 
 { _:b{$id} a foaf:Person; 
            foaf:name {data($n)}. 
   { for $k in $persons 
     let $kn := if( $k[@name]) then $k/@name else $k 
     let $kid :=count($k/preceding::*) + count($k/ancestor::*) 
     where $kn = data($doc//*[@name=$n]/knows) and 
                 not(exists($kn/../following::*[@name=$kn or data(.)=$kn])) 
     construct 
     { _:b{$id}  foaf:knows _:b{$kid}. 
       _:b{$kid} a foaf:Person. }
   } 
 }

Rewritten query:

import module namespace _xsparql = "http://xsparql.deri.org/XSPARQLer/xsparql.xquery"
at "http://xsparql.deri.org/XSPARQLer/xsparql.xquery";

declare namespace _sparql_result = "http://www.w3.org/2005/sparql-results#";
declare namespace foaf = "http://xmlns.com/foaf/0.1/" ;

declare variable $_NS1 := "prefix  foaf:  <http://xmlns.com/foaf/0.1/>";

_xsparql:_serialize(("  @", $_NS1, ".", "")),

let $doc  := doc("relations.xml")  

let $persons  := $doc//*[@name  or ../knows ]  
 return 
  for $p at $_p_Pos  in $persons  
   let $n  := if ( $p[@name  ]) then $p/@name   else $p  
   let $id  := count($p/preceding::*)+count($p/ancestor::*)  

   let $_validSubject4 := _xsparql:_serialize(("_:b",  data($id))) 
   let $_validObject5 := _xsparql:_serialize(( '"',   data($n)   ,  '"')) 
   
   where not(exists($p/following::*[@name =$n  or data(.) =$n ]))  

  return ( 
   if ( _xsparql:_validSubject( "",  $_validSubject4)) 
          then (_xsparql:_serialize(($_validSubject4,  " a ",   'foaf:Person',  " .")),  
                if ( _xsparql:_validObject( "",  $_validObject5)) 
                 then (_xsparql:_serialize(($_validSubject4,  " foaf:name ", $_validObject5, " ."))) else ""  
          ) else "" ,
 
   for $k at $_k_Pos  in $persons  
    let $kn  := if ( $k[@name  ]) then $k/@name   else $k  
    let $kid  := count($k/preceding::*)+count($k/ancestor::*)  

    let $_validSubject1 := _xsparql:_serialize(("_:b",  data($id))) 
    let $_validObject2 := _xsparql:_serialize(("_:b",  data($kid))) 
    let $_validSubject3 := _xsparql:_serialize(("_:b",  data($kid))) 

    where $kn =data($doc//*[@name =$n  ]/knows) and not(exists($kn/../following::*[@name =$kn  or data(.) =$kn ]))  
    return ( 
     if ( _xsparql:_validSubject( "",  $_validSubject1)) then (
        if ( _xsparql:_validObject( "",  $_validObject2)) 
          then (_xsparql:_serialize(($_validSubject1,  " foaf:knows ", $_validObject2, " ."))) else ""  
     ) else "" ,
    if ( _xsparql:_validSubject( "",  $_validSubject3)) 
     then (_xsparql:_serialize(($_validSubject3,  " a ",   'foaf:Person',  " ."))) else ""  
   )  
  )

Expected output:

@prefix foaf: <http://xmlns.com/foaf/0.1/>. 

_:b1 a foaf:Person . 
_:b1 foaf:name "Alice" . 
_:b1 foaf:knows _:b4 . 
_:b4 a foaf:Person . 
_:b1 foaf:knows _:b6 . 
_:b6 a foaf:Person . 
_:b4 a foaf:Person . 
_:b4 foaf:name "Bob" . 
_:b4 foaf:knows _:b6 . 
_:b6 a foaf:Person . 
_:b6 a foaf:Person . 
_:b6 foaf:name "Charles" . 

3.3 vCard2foaf.xsparql

This query performs a simple mapping from vCard given and family name properties into FOAF full names; it shows the use of XPath and XQuery built-in functions for manipulating RDF.

XSPARQL query:

prefix vc: <http://www.w3.org/2001/vcard-rdf/3.0#>
prefix foaf: <http://xmlns.com/foaf/0.1/>
construct { _:b foaf:name {fn:concat($N," ", $F)}.}
from <vCard.rdf>
where { $P vc:Given $N. $P vc:Family $F. }

Rewritten query:

import module namespace _xsparql = "http://xsparql.deri.org/XSPARQLer/xsparql.xquery"
at "http://xsparql.deri.org/XSPARQLer/xsparql.xquery";

declare namespace _sparql_result = "http://www.w3.org/2005/sparql-results#";
declare namespace vc = "http://www.w3.org/2001/vcard-rdf/3.0#";
declare namespace foaf = "http://xmlns.com/foaf/0.1/";

declare variable $_NS1 := "prefix  vc:  <http://www.w3.org/2001/vcard-rdf/3.0#>";
declare variable $_NS2 := "prefix  foaf:  <http://xmlns.com/foaf/0.1/>";

_xsparql:_serialize((  "@", $_NS1, ".", "  @", $_NS2, ".", "")),

let $_aux1 := _xsparql:_serialize(("http://example.org/sparql?query=", fn:encode-for-uri( _xsparql:_serialize((  $_NS1,  $_NS2, "
select $N $P $F from <vCard.rdf>  where {    $P   vc:Given    $N   .    $P   vc:Family    $F   . } ")))))
for $_aux_result1 at $_aux_result1_Pos in doc($_aux1)//_sparql_result:result
 let $_N_Node := $_aux_result1/_sparql_result:binding[@name="N"]
 let $N := data($_N_Node/*)
 let $_N_NodeType := name($_N_Node/*)
 let $_N_NodeDatatype := string($_N_Node/*/@datatype)
 let $_N_NodeLang := string($_N_Node/*/@lang)
 let $_N_RDFTerm :=  _xsparql:_rdf_term($_N_NodeType,$N)
 let $_P_Node := $_aux_result1/_sparql_result:binding[@name="P"]
 let $P := data($_P_Node/*)
 let $_P_NodeType := name($_P_Node/*)
 let $_P_NodeDatatype := string($_P_Node/*/@datatype)
 let $_P_NodeLang := string($_P_Node/*/@lang)
 let $_P_RDFTerm := _xsparql:_rdf_term($_P_NodeType,$P)
 let $_F_Node := $_aux_result1/_sparql_result:binding[@name="F"]
 let $F := data($_F_Node/*)
 let $_F_NodeType := name($_F_Node/*)
 let $_F_NodeDatatype := string($_F_Node/*/@datatype)
 let $_F_NodeLang := string($_F_Node/*/@lang)
 let $_F_RDFTerm := _xsparql:_rdf_term($_F_NodeType,$F)

 let $_validSubject1 := _xsparql:_serialize(("_:b", "_", data($_aux_result1_Pos))) 
 let $_validObject2 := _xsparql:_serialize(( '"',   fn:concat($N   , " "   , $F)   ,  '"')) 

 return if ( _xsparql:_validSubject( "",  $_validSubject1)) 
           then (if ( _xsparql:_validObject( "",  $_validObject2)) 
                    then (_xsparql:_serialize(($_validSubject1,  " foaf:name ", $_validObject2, " ."))) else ""  
           ) else "" 

Expected output:

@prefix vc: <http://www.w3.org/2001/vcard-rdf/3.0#>. 
@prefix foaf: <http://xmlns.com/foaf/0.1/>. 

_:b_1 foaf:name "Axel Polleres" . 

3.4 foaf_lowering.xsparql

This query generates XML data from an input RDF file containing FOAF data. It demonstrates the lowering task, i.e., mapping from RDF to XML.

XSPARQL query:

declare namespace foaf = "http://xmlns.com/foaf/0.1/";
<relations>
{ for $Person $Name from <relations.rdf>
  where { $Person foaf:name $Name }
  order by $Name
  return <person name="{$Name}">
         { for $FName from <relations.rdf>
           where { $Person foaf:knows $Friend.
                   $Person foaf:name $Name.
                   $Friend foaf:name $FName. }
           return <knows> { $FName }</knows>
         }
         </person>
}
</relations>

Rewritten query:

import module namespace _xsparql = "http://xsparql.deri.org/XSPARQLer/xsparql.xquery"
at "http://xsparql.deri.org/XSPARQLer/xsparql.xquery";

declare namespace _sparql_result = "http://www.w3.org/2005/sparql-results#";
declare namespace foaf = "http://xmlns.com/foaf/0.1/" ;

declare variable $_NS1 := "prefix  foaf:  <http://xmlns.com/foaf/0.1/>";
<relations>{ 
 let $_aux1 := _xsparql:_serialize(("http://example.org/sparql?query=", fn:encode-for-uri( _xsparql:_serialize((  $_NS1, "
  select $Person $Name from <relations.rdf>  where {    $Person   foaf:name    $Name   . } order by $Name")))))

 for $_aux_result1 at $_aux_result1_Pos in doc($_aux1)//_sparql_result:result
  let $_Person_Node := ($_aux_result1/_sparql_result:binding[@name = "Person"])
  let $Person := data($_Person_Node/*)
  let $_Person_NodeType := name($_Person_Node/*)
  let $_Person_NodeDatatype := string($_Person_Node/*/@datatype)
  let $_Person_NodeLang := string($_Person_Node/*/@lang)
  let $_Person_RDFTerm :=  _xsparql:_rdf_term($_Person_NodeType, $Person)
  let $_Name_Node := ($_aux_result1/_sparql_result:binding[@name = "Name"])
  let $Name := data($_Name_Node/*)
  let $_Name_NodeType := name($_Name_Node/*)
  let $_Name_NodeDatatype := string($_Name_Node/*/@datatype)
  let $_Name_NodeLang := string($_Name_Node/*/@lang)
  let $_Name_RDFTerm :=  _xsparql:_rdf_term($_Name_NodeType, $Name)

  return <person name = "{$Name}">{ 
   let $_aux2 := _xsparql:_serialize(("http://example.org/sparql?query=", fn:encode-for-uri( 
                  _xsparql:_serialize((  $_NS1, "select $FName from <relations.rdf>  where {     ", 
                            $_Person_RDFTerm, " foaf:knows $Friend .", 
                            $_Person_RDFTerm, " foaf:name ", 
                            $_Name_RDFTerm, " . $Friend foaf:name $FName . }")))
                 ))

   for $_aux_result2 at $_aux_result2_Pos in doc($_aux2)//_sparql_result:result
    let $_FName_Node := ($_aux_result2/_sparql_result:binding[@name = "FName"])
    let $FName := data($_FName_Node/*)
    let $_FName_NodeType := name($_FName_Node/*)
    let $_FName_NodeDatatype := string($_FName_Node/*/@datatype)
    let $_FName_NodeLang := string($_FName_Node/*/@lang)
    let $_FName_RDFTerm :=  _xsparql:_rdf_term($_FName_NodeType, $FName)

    return <knows>{ $FName   }</knows>   
  }</person>   
}</relations>  

Expected output:

<relations>
   <person name="Alice">
      <knows>Charles</knows>
      <knows>Bob</knows>
   </person>
   <person name="Bob">
      <knows>Charles</knows>
   </person>
   <person name="Charles"/>
</relations> 

3.5 simple.xsparql

This query selects only persons "known by somebody" in the input RDF data. All these persons are then mapped to a class whose URI is assigned to a variable using an XQuery let clause. The example demonstrates the combination of constructs from XQuery and SPARQL, more specifically the reuse of XQuery variables within SPARQL-like construct clauses.

XSPARQL query:

prefix : <http://www.example.org>
prefix foaf: <http://xmlns.com/foaf/0.1/>
let $y := "http://www.example.org/knownPerson"
for $x from <foaf.rdf>
where {$s foaf:knows $x}
construct {$x a <{$y}> }

Rewritten query:

import module namespace _xsparql = "http://xsparql.deri.org/XSPARQLer/xsparql.xquery"
at "http://xsparql.deri.org/XSPARQLer/xsparql.xquery";

declare namespace _sparql_result = "http://www.w3.org/2005/sparql-results#";
declare default element namespace "http://www.example.org";
declare namespace foaf = "http://xmlns.com/foaf/0.1/";

declare variable $_NS1 := "prefix  :  <http://www.example.org>";
declare variable $_NS2 := "prefix  foaf:  <http://xmlns.com/foaf/0.1/>";

_xsparql:_serialize((  "  @", $_NS1, ".", "  @", $_NS2, ".", "" )),

let $y  := "http://www.example.org/knownPerson"  

let $_aux1 := _xsparql:_serialize(("http://example.org/sparql?query=", fn:encode-for-uri( _xsparql:_serialize((  $_NS1,  $_NS2, "
 select $x from <http://www.polleres.net/foaf.rdf>  where {    $s   foaf:knows    $x   . } ")))))

for $_aux_result1 at $_aux_result1_Pos in doc($_aux1)//_sparql_result:result
 let $_x_Node := ($_aux_result1/_sparql_result:binding[@name = "x"])
 let $x := data($_x_Node/*)
 let $_x_NodeType := name($_x_Node/*)
 let $_x_NodeDatatype := string($_x_Node/*/@datatype)
 let $_x_NodeLang := string($_x_Node/*/@lang)
 let $_x_RDFTerm :=  _xsparql:_rdf_term($_x_NodeType, $x)

 let $_validObject1 := _xsparql:_serialize(("<" , $y   , ">")) 

 return ( if ( _xsparql:_validSubject( "",  $_x_RDFTerm))
             then (if ( _xsparql:_validObject( "",  $_validObject1)) 
                      then (_xsparql:_serialize(($_x_RDFTerm,  " a ", $_validObject1, " ."))) else ""                 
             ) else ""  
        )

Expected output:

@prefix : <http://www.example.org>. 
@prefix foaf: <http://xmlns.com/foaf/0.1/>. 

_:b0 a <http://www.example.org/knownPerson> . 
<http://danbri.org/foaf.rdf#danbri> a <http://www.example.org/knownPerson> . 
_:b1 a <http://www.example.org/knownPerson> . 
<http://richard.cyganiak.de/foaf.rdf#cygri> a <http://www.example.org/knownPerson> . 
<http://nets.ii.uam.es/~rlara/foaf.rdf#me> a <http://www.example.org/knownPerson> . 
_:b2 a <http://www.example.org/knownPerson> . 
_:b3 a <http://www.example.org/knownPerson> . 
_:b4 a <http://www.example.org/knownPerson> . 
<http://eyaloren.org/foaf.rdf#me> a <http://www.example.org/knownPerson> . 
_:b5 a <http://www.example.org/knownPerson> . 
<http://harth.org/andreas/foaf#ah> a <http://www.example.org/knownPerson> . 
<http://www.aifb.uni-karlsruhe.de/Personen/viewPersonOWL/id2084instance> a <http://www.example.org/knownPerson> . 
<http://page.mi.fu-berlin.de/mochol/foaf.rdf#me> a <http://www.example.org/knownPerson> . 
_:b6 a <http://www.example.org/knownPerson> . 
<http://page.mi.fu-berlin.de/~nixon/foaf.rdf#nixon> a <http://www.example.org/knownPerson> . 
_:b7 a <http://www.example.org/knownPerson> . 
_:b8 a <http://www.example.org/knownPerson> . 
_:b9 a <http://www.example.org/knownPerson> . 
<http://www.postsubmeta.net/foaf.rdf#TK> a <http://www.example.org/knownPerson> . 
_:b10 a <http://www.example.org/knownPerson> . 
_:b11 a <http://www.example.org/knownPerson> . 
_:b12 a <http://www.example.org/knownPerson> . 
<http://sw.deri.org/~haller/foaf.rdf#ah> a <http://www.example.org/knownPerson> . 
_:b13 a <http://www.example.org/knownPerson> . 

3.6 simple_filter.xsparql

This query performs the same task as the previous one, but uses a SPARQL filter expression to remove persons that are identified only by a blank node.

XSPARQL query:

prefix : <http://www.example.org>
prefix foaf: <http://xmlns.com/foaf/0.1/>
let $y := "http://www.example.org/knownPerson"
for * from <foaf.rdf>
where {$s foaf:knows $x filter (!isblank($x))}
construct {$x a <{$y}> }

Rewritten query:

import module namespace _xsparql = "http://xsparql.deri.org/XSPARQLer/xsparql.xquery"
at "http://xsparql.deri.org/XSPARQLer/xsparql.xquery";

declare namespace _sparql_result = "http://www.w3.org/2005/sparql-results#";
declare default element namespace "http://www.example.org";
declare namespace foaf = "http://xmlns.com/foaf/0.1/";

declare variable $_NS1 := "prefix  :  <http://www.example.org>";
declare variable $_NS2 := "prefix  foaf:  <http://xmlns.com/foaf/0.1/>";

_xsparql:_serialize((  "  @", $_NS1, ".", "  @", $_NS2, ".", "" )),

let $y  := "http://www.example.org/knownPerson"  

let $_aux1 := _xsparql:_serialize(("http://example.org/sparql?query=", fn:encode-for-uri( _xsparql:_serialize((  $_NS1,  $_NS2, "
 select $s $x from <http://www.polleres.net/foaf.rdf>  where {    $s   foaf:knows    $x   . filter(!isblank($x))} ")))))

for $_aux_result1 at $_aux_result1_Pos in doc($_aux1)//_sparql_result:result
 let $_s_Node := ($_aux_result1/_sparql_result:binding[@name = "s"])
 let $s := data($_s_Node/*)
 let $_s_NodeType := name($_s_Node/*)
 let $_s_NodeDatatype := string($_s_Node/*/@datatype)
 let $_s_NodeLang := string($_s_Node/*/@lang)
 let $_s_RDFTerm :=  _xsparql:_rdf_term($_s_NodeType, $s)
 let $_x_Node := ($_aux_result1/_sparql_result:binding[@name = "x"])
 let $x := data($_x_Node/*)
 let $_x_NodeType := name($_x_Node/*)
 let $_x_NodeDatatype := string($_x_Node/*/@datatype)
 let $_x_NodeLang := string($_x_Node/*/@lang)
 let $_x_RDFTerm :=  _xsparql:_rdf_term($_x_NodeType, $x)
 
 let $_validObject1 := _xsparql:_serialize(("<" , $y   , ">")) 

 return ( if ( _xsparql:_validSubject( "",  $_x_RDFTerm)) 
           then (if ( _xsparql:_validObject( "",  $_validObject1)) 
                  then (_xsparql:_serialize(( $_x_RDFTerm,  " a ",     $_validObject1  ,  " ."))) else ""
                ) else ""  
        )

Expected output:

@prefix : <http://www.example.org>. 
@prefix foaf: <http://xmlns.com/foaf/0.1/>. 

<http://danbri.org/foaf.rdf#danbri> a <http://www.example.org/knownPerson> . 
<http://richard.cyganiak.de/foaf.rdf#cygri> a <http://www.example.org/knownPerson> . 
<http://nets.ii.uam.es/~rlara/foaf.rdf#me> a <http://www.example.org/knownPerson> . 
<http://eyaloren.org/foaf.rdf#me> a <http://www.example.org/knownPerson> . 
<http://www.aifb.uni-karlsruhe.de/Personen/viewPersonOWL/id2084instance> a <http://www.example.org/knownPerson> . 
<http://harth.org/andreas/foaf#ah> a <http://www.example.org/knownPerson> . 
<http://page.mi.fu-berlin.de/mochol/foaf.rdf#me> a <http://www.example.org/knownPerson> . 
<http://page.mi.fu-berlin.de/~nixon/foaf.rdf#nixon> a <http://www.example.org/knownPerson> . 
<http://www.postsubmeta.net/foaf.rdf#TK> a <http://www.example.org/knownPerson> . 
<http://sw.deri.org/~haller/foaf.rdf#ah> a <http://www.example.org/knownPerson> . 

3.7 distribution_simple.xsparql

Given a set of dated entries, this example extracts the distribution of these entries over time, grouping the entries by day and counting the entries for each day. A typical scenario where such data could apply would be a set of IRC logs in a given month, annotated in RDF, where one wants to get an overview of the activity on a channel over the month. This example shows how XQuery features can be used to perform aggregation of data, which is not possible with pure SPARQL. In our current implementation, the naive rewriting generated leaves some room for improvement, and we expect future XSPARQL engines to optimize such queries.

XSPARQL query:

prefix foaf: <http://xmlns.com/foaf/0.1/>
prefix dct:  <http://purl.org/dc/terms/>

let $results :=
  for $entry $date
  from <sample_distribution_data.nt>
  where {$entry dct:created $date}
  return <entry date="{$date}"/>
return
    let $days := for $day in data($results/@date)
             return day-from-dateTime(xs:dateTime($day))
    for $day in distinct-values($days)
    order by $day
    return <day d="{$day}">{count($results[day-from-dateTime(xs:dateTime(@date)) = $day])}</day>

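Observe that, assuming an optimal sorting algorithm for evaluating the order by clause (i.e., O(n log n)), this query runs in O(n² log n), where n is the number of dated entries in the input data. The dominant cost is the filter inside count(), which rescans all entries once for every distinct day.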
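The evaluation strategy of this query can be illustrated outside XQuery. The following is a minimal sketch in Python, not part of the XSPARQL implementation; the sample dates and the names day_of and naive_distribution are assumptions made for illustration. It shows why the cost grows quickly: every distinct day triggers a full rescan of the entries.

```python
from datetime import datetime

# Hypothetical sample data: xs:dateTime strings standing in for the
# dct:created values that the SPARQL part of the query would return.
dates = [
    "2008-12-13T09:15:00",
    "2008-12-12T10:00:00",
    "2008-12-13T18:30:00",
    "2008-12-12T11:45:00",
    "2008-12-12T23:59:59",
]


def day_of(date_string):
    # Counterpart of day-from-dateTime(xs:dateTime($date)).
    return datetime.fromisoformat(date_string).day


def naive_distribution(date_strings):
    days = [day_of(d) for d in date_strings]
    result = []
    # sorted(set(...)) plays the role of distinct-values plus order by.
    for day in sorted(set(days)):
        # Like the predicate inside count(), this rescans every entry
        # for each distinct day.
        count = sum(1 for d in date_strings if day_of(d) == day)
        result.append((day, count))
    return result


print(naive_distribution(dates))  # prints [(12, 3), (13, 2)]
```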

Rewritten query:

import module namespace _xsparql = "http://xsparql.deri.org/XSPARQLer/xsparql.xquery"
at "http://xsparql.deri.org/XSPARQLer/xsparql.xquery";

declare namespace _sparql_result = "http://www.w3.org/2005/sparql-results#";

declare namespace foaf = "http://xmlns.com/foaf/0.1/";
declare namespace dct = "http://purl.org/dc/terms/";


declare variable $_NS1 := "prefix foaf: <http://xmlns.com/foaf/0.1/>";
declare variable $_NS2 := "prefix dct: <http://purl.org/dc/terms/>";

let $results :=
let $_aux1 := _xsparql:_serialize(("http://example.org/sparql?query=", fn:encode-for-uri( _xsparql:_serialize(( $_NS1, $_NS2, "
select $entry $date from <sample_distribution_data.nt> where { $entry dct:created $date . } ")))))
for $_aux_result1 at $_aux_result1_Pos in doc($_aux1)//_sparql_result:result
let $_entry_Node := ($_aux_result1/_sparql_result:binding[@name = "entry"])
let $entry := data($_entry_Node/*)
let $_entry_NodeType := name($_entry_Node/*)
let $_entry_NodeDatatype := string($_entry_Node/*/@datatype)
let $_entry_NodeLang := string($_entry_Node/*/@lang)
let $_entry_RDFTerm := _xsparql:_rdf_term($_entry_NodeType, $entry )
let $_date_Node := ($_aux_result1/_sparql_result:binding[@name = "date"])
let $date := data($_date_Node/*)
let $_date_NodeType := name($_date_Node/*)
let $_date_NodeDatatype := string($_date_Node/*/@datatype)
let $_date_NodeLang := string($_date_Node/*/@lang)
let $_date_RDFTerm := _xsparql:_rdf_term($_date_NodeType, $date )
return <entry date = "{$date}"/>
return
let $days := for $day at $_day_Pos in data($results/@date ) return day-from-dateTime(xs:dateTime($day ) )
for $day at $_day_Pos in distinct-values($days ) 
order by $day return <day d = "{$day}">{ count($results[day-from-dateTime(xs:dateTime(@date ) ) = $day ] ) }</day>

Expected output:

<day d="12">41</day>
<day d="13">22</day>
<day d="14">166</day>
<day d="15">252</day>
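The rewritten query above sends the SPARQL part to an endpoint as a URI-encoded query string and then reads each variable binding from the SPARQL Query Results XML Format [SPARQLRESULT]. The following Python sketch mirrors these two steps under stated assumptions: the sample response document and the function names build_query_url and read_bindings are hypothetical, and urllib.parse.quote with an empty safe set is used as a stand-in for fn:encode-for-uri. It is an illustration only, not the code used by the prototype.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote

SPARQL_NS = "{http://www.w3.org/2005/sparql-results#}"

# Hypothetical endpoint response, made up for illustration.
SAMPLE = """<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head><variable name="entry"/><variable name="date"/></head>
  <results>
    <result>
      <binding name="entry"><uri>http://example.org/entry/1</uri></binding>
      <binding name="date"><literal datatype="http://www.w3.org/2001/XMLSchema#dateTime">2008-12-12T10:00:00</literal></binding>
    </result>
  </results>
</sparql>"""


def build_query_url(query, endpoint="http://example.org/sparql?query="):
    # Stand-in for fn:encode-for-uri: escape everything except
    # unreserved characters.
    return endpoint + quote(query, safe="")


def read_bindings(xml_text):
    # Counterpart of iterating over //_sparql_result:result and reading
    # the value, node type and datatype of each binding.
    root = ET.fromstring(xml_text)
    rows = []
    for result in root.iter(SPARQL_NS + "result"):
        row = {}
        for binding in result.findall(SPARQL_NS + "binding"):
            term = binding[0]  # the single uri, literal or bnode child
            row[binding.get("name")] = {
                "type": term.tag.replace(SPARQL_NS, ""),
                "value": term.text,
                "datatype": term.get("datatype", ""),
            }
        rows.append(row)
    return rows


rows = read_bindings(SAMPLE)
print(rows[0]["date"]["value"])  # prints 2008-12-12T10:00:00
```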

3.8 distribution.xsparql

This query performs the same task as the previous one, except that a custom function is used to improve the complexity of the algorithm. The example shows that there is considerable room for query optimizers for XSPARQL.

XSPARQL query:

prefix foaf: <http://xmlns.com/foaf/0.1/>
prefix dct:  <http://purl.org/dc/terms/>

declare function local:_distribution_count($s, $i, $c) {
  let $x :=
    if ($i > count($s)) then
      ()
    else if ($s[$i] eq $s[$i + 1]) then
      local:_distribution_count($s, $i + 1, $c + 1)
    else
      fn:concat( fn:concat($s[$i], ", ", $c) , "
", local:_distribution_count($s, $i + 1, 1) )
  return $x
};

let $days  := 
  for $entry $date
  from <sample_distribution_data.nt>
  where {$entry dct:created $date}
  let $day := day-from-dateTime(xs:dateTime($date))
  order by $day
  return $day

return local:_distribution_count($days, 1, 1)

Observe that the sorting in the first let again takes O(n log n) steps for an optimal engine. The recursive custom function for counting is then called on the entries sorted by day. Since this function boils down to a simple iteration over the sorted days, it runs in O(n), and the overall complexity thus stays within O(n log n), which is an improvement over the previous query. An intelligent query optimizer could possibly catch such cases.
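The counting step can be pictured as a single pass over the sorted days that emits one line whenever a run of equal values ends. The following Python sketch illustrates this idea; it is written iteratively rather than recursively, the name distribution_count and the sample input are assumptions, and it is not the code produced by the rewriter.

```python
def distribution_count(sorted_days):
    """Count runs of equal values in an already sorted list in one pass."""
    lines = []
    count = 1
    for i, day in enumerate(sorted_days):
        # A run ends at the last element or when the next value differs,
        # which corresponds to the "$s[$i] eq $s[$i + 1]" test failing.
        is_last_of_run = i + 1 == len(sorted_days) or sorted_days[i + 1] != day
        if is_last_of_run:
            lines.append(f"{day}, {count}")
            count = 1
        else:
            count += 1
    return "\n".join(lines)


# Hypothetical input: the day numbers, sorted as by "order by $day".
days = sorted([13, 12, 13, 12, 12])
print(distribution_count(days))
# prints:
# 12, 3
# 13, 2
```

Each element is visited once, so the pass itself is linear and the sort remains the dominant cost.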

Rewritten query:

import module namespace _xsparql = "http://xsparql.deri.org/XSPARQLer/xsparql.xquery"
at "http://xsparql.deri.org/XSPARQLer/xsparql.xquery";

declare namespace _sparql_result = "http://www.w3.org/2005/sparql-results#";

declare namespace foaf = "http://xmlns.com/foaf/0.1/";
declare namespace dct = "http://purl.org/dc/terms/";


declare function local:_distribution_count ( $s , $i , $c ) {
let $x := if ( $i > count($s ) ) then () else if ( $s[$i ] eq $s[$i+1 ] ) 
then local:_distribution_count($s , $i+1 , $c+1 ) else fn:concat(fn:concat($s[$i ] , ", " , $c ) , "
" , local:_distribution_count($s , $i+1 , 1 ) )
return $x } ;
declare variable $_NS1 := "prefix foaf: <http://xmlns.com/foaf/0.1/>";
declare variable $_NS2 := "prefix dct: <http://purl.org/dc/terms/>";

let $days :=
let $_aux1 := _xsparql:_serialize(("http://example.org/sparql?query=", fn:encode-for-uri( _xsparql:_serialize(( $_NS1, $_NS2, "
select $entry $date from <sample_distribution_data.nt> where { $entry dct:created $date . } ")))))
for $_aux_result1 at $_aux_result1_Pos in doc($_aux1)//_sparql_result:result
let $_entry_Node := ($_aux_result1/_sparql_result:binding[@name = "entry"])
let $entry := data($_entry_Node/*)
let $_entry_NodeType := name($_entry_Node/*)
let $_entry_NodeDatatype := string($_entry_Node/*/@datatype)
let $_entry_NodeLang := string($_entry_Node/*/@lang)
let $_entry_RDFTerm := _xsparql:_rdf_term($_entry_NodeType, $entry )
let $_date_Node := ($_aux_result1/_sparql_result:binding[@name = "date"])
let $date := data($_date_Node/*)
let $_date_NodeType := name($_date_Node/*)
let $_date_NodeDatatype := string($_date_Node/*/@datatype)
let $_date_NodeLang := string($_date_Node/*/@lang)
let $_date_RDFTerm := _xsparql:_rdf_term($_date_NodeType, $date )

let $day := day-from-dateTime(xs:dateTime($date ) )
order by $day return $day
return local:_distribution_count($days , 1 , 1 ) 

Expected output:

12, 41
13, 22
14, 166
15, 252

4. References

[SPARQLPROTOCOL]
Kendall Grant Clark, Lee Feigenbaum, and Elias Torres. SPARQL Protocol for RDF, November 2007. W3C Proposed Recommendation, available at http://www.w3.org/TR/2007/PR-rdf-sparql-protocol-20071112/.
[SPARQLRESULT]
Dave Beckett and Jeen Broekstra. SPARQL Query Results XML Format, November 2007. W3C Proposed Recommendation, available at http://www.w3.org/TR/2007/PR-rdf-sparql-XMLres-20071112/.
[TURTLE]
David Beckett and Tim Berners-Lee. Turtle - Terse RDF Triple Language, 14 January 2008. W3C Team Submission, available at http://www.w3.org/TeamSubmission/turtle/.
[XQUERYSEMANTICS]
Denise Draper, Peter Fankhauser, Mary Fernández, Ashok Malhotra, Kristoffer Rose, Michael Rys, Jérôme Siméon, and Philip Wadler. XQuery 1.0 and XPath 2.0 Formal Semantics, January 2007. W3C Recommendation, available at http://www.w3.org/TR/xquery-semantics/.
[XSPARQLLANGUAGE]
XSPARQL Language Specification. Document included in the present specification.
[XSPARQLSEMANTICS]
XSPARQL: Semantics. Document included in the present specification.