diff options
Diffstat (limited to 'README.md')
-rw-r--r-- | README.md | 150 |
1 files changed, 79 insertions, 71 deletions
@@ -34,32 +34,11 @@ Combined approach for Conjunctive Query answering in RSA | |||
34 | 34 | ||
35 | </p> | 35 | </p> |
36 | 36 | ||
37 | <!-- TABLE OF CONTENTS --> | ||
38 | <details close="close"> | ||
39 | <summary>Table of Contents</summary> | ||
40 | <ol> | ||
41 | <li><a href="#about">About</a></li> | ||
42 | <li><a href="#preliminaries">Preliminaries</a></li> | ||
43 | <li> | ||
44 | <a href="#using-the-software">Using the software</a> | ||
45 | <ul> | ||
46 | <li><a href="#provide-rdfox-license">Provide RDFox license</a></li> | ||
47 | <li><a href="#compiling-and-running-the-project">Compiling and running the project</a></li> | ||
48 | <li><a href="#running-tests">Running tests</a></li> | ||
49 | </ul> | ||
50 | </li> | ||
51 | <li><a href="#changes-introduced">Changes introduced</a></li> | ||
52 | <li><a href="#acknowledgements">Acknowledgements</a></li> | ||
53 | <li><a href="#credits">Credits</a></li> | ||
54 | <li><a href="#license">License</a></li> | ||
55 | </ol> | ||
56 | </details> | ||
57 | |||
58 | ## About | 37 | ## About |
59 | 38 | ||
60 | This is a re-implementation of the combined approach for CQ answering over RSA ontologies described in [[1](#references)]. | 39 | This is an *improved* re-implementation of the combined approach for CQ answering over RSA ontologies described in [[1](#references)]. |
61 | 40 | ||
62 | > Please note that the prototype mentioned in [[1](#references)] is not available (and the contributors of this repository have never seen it); | 41 | > Please note that the prototype mentioned in [[1](#references)] is not available (and the contributors to this repository have never seen it); |
63 | > therefore, this "re-implementation" could be completely different from that prototype (potentially using different tools and programming language). | 42 | > therefore, this "re-implementation" could be completely different from that prototype (potentially using different tools and programming language). |
64 | 43 | ||
65 | ## Preliminaries | 44 | ## Preliminaries |
@@ -68,7 +47,58 @@ In order to use this program you need to have [RDFox](https://www.oxfordsemantic | |||
68 | RDFox is proprietary software and as such we are not able to distribute it along with our code. | 47 | RDFox is proprietary software and as such we are not able to distribute it along with our code. |
69 | Please refer to [this link](https://www.oxfordsemantic.tech/tryrdfoxforfree) to request a free trial. | 48 | Please refer to [this link](https://www.oxfordsemantic.tech/tryrdfoxforfree) to request a free trial. |
70 | 49 | ||
71 | This software has been developed and tested with RDFox v.4.1 | 50 | This software has been developed and tested with RDFox v5.2.1 |
51 | |||
52 | ## Changes introduced | ||
53 | |||
54 | We tried to implement the system as close as possible to the theoretical description provided in [[1](#references)]. | ||
55 | Regardless, we had to deal with the fact that we are using different tools to carry out reasoning tasks and we are probably using a different language to implement the system. | ||
56 | The following is a (non exhaustive) summary of fixes (🔧), changes (🔄) and improvements (⚡), we introduced along the way: | ||
57 | |||
58 | + 🔄 [RDFox](https://www.oxfordsemantic.tech/product) is used instead of DLV as the underlying LP engine. | ||
59 | |||
60 | + âš¡ The system accepts unrestricted OWL ontologies as input and takes care of normalising and approximating the ontology to RSA. | ||
61 | At the time of writing, two approximation algorithms are provided, to compute a sound (or complete) set of answer to the input queries, respectively. | ||
62 | |||
63 | + âš¡ The different steps of the combined approach (namely, the canonical model computation and the filtering step) are executed in isolation using different *named graphs*. | ||
64 | This allows us to reuse partial products of the computation and can even be used to parellalise filtering and answering steps. | ||
65 | |||
66 | + 🔧 In Def.4, the definition of built-in predicate `notIn` is wrong and should reflect the implicit semantics implied by the name, i.e., | ||
67 | |||
68 | > let [...] `notIn` be a built-in predicate which holds when the first argument is **not** an element of the set given as second argument | ||
69 | |||
70 | This has been fixed by (1) introducing a built-in predicate `In` (note that instances of `In` can be computed beforehand since they only depend on the input ontology), and (2) implement `notIn` as the negation of `In` using RDFox *NaF* built-in support. | ||
71 | |||
72 | + 🔄 Top (`owl:Thing`) axiomatisation is performed introducing rules as follows. | ||
73 | Given `p` predicate (arity *n*) *in the original ontology*, the following rule is introduced: | ||
74 | ``` | ||
75 | owl:Thing[?X1], ..., owl:Thing[?Xn] :- p(?X1, ..., ?Xn) . | ||
76 | ``` | ||
77 | Note that, by definition, arity can be either 1 or 2. | ||
78 | |||
79 | + 🔄 Equality axiomatisation is performed introducing the following rules: | ||
80 | ``` | ||
81 | rsacomb:congruent[?X, ?X] :- owl:Thing[?X] . | ||
82 | rsacomb:congruent[?Y, ?X] :- rsacomb:congruent[?X, ?Y] . | ||
83 | rsacomb:congruent[?X, ?Z] :- rsacomb:congruent[?X, ?Y], rsacomb:congruent[?Y, ?Z] . | ||
84 | ``` | ||
85 | defining equivalence as a congruence relation over terms in the ontology. | ||
86 | Substitution rules propagate the equivalence to all existing atoms. | ||
87 | |||
88 | + 🔧 In Def. 4, the definition of built-in predicate `NI` is not consistent with its use in Table 3 and related description in Sec. 4.2. | ||
89 | We redefined `NI` as the set of all constants that are *equal* to a constant in the original ontology (according to the internal equality predicate `rsa:congruent`). | ||
90 | Note that, in this scenario, there is no need to introduce `NI` instances as facts in the system; | ||
91 | instead we can add a rule to populate the new predicate: | ||
92 | ``` | ||
93 | rsa:NI[?X] :- rsa:congruent[?X, ?Y], rsa:named[?Y] . | ||
94 | ``` | ||
95 | where `rsa:named` is an internal predicate keeping track of all constants in the original ontology. | ||
96 | |||
97 | + âš¡ In Def. 3, regarding the generation of the logic program used for the RSA check, only T5 axioms involving an unsafe role will introduce the internal predicates `PE` and `U`. | ||
98 | |||
99 | + âš¡ Both in the canonical model and the filtering program computations, rules without a body are loaded into RDFox as facts. | ||
100 | |||
101 | + ⚡ The `cycle` function introduced in Def.4 establishing the direction of the *unraveling* of loops is defined over triples `(A,R,B)`. We are currently limiting the triple only to those appearing in a T5 axiom `A ⊑ ∃R.B`. Note that this greatly limits the size of cycle for a given triple, and as a consequence limits the number of rules used to compute the canonical model. | ||
72 | 102 | ||
73 | ## Using the software | 103 | ## Using the software |
74 | 104 | ||
@@ -79,9 +109,9 @@ Download links for specific versions and operating systems can be found [here](h | |||
79 | 109 | ||
80 | ```{.bash} | 110 | ```{.bash} |
81 | mkdir -p lib && pushd lib | 111 | mkdir -p lib && pushd lib |
82 | wget https://rdfox-distribution.s3.eu-west-2.amazonaws.com/release/v4.2.0/RDFox-linux-4.2.0.zip | 112 | wget https://rdfox-distribution.s3.eu-west-2.amazonaws.com/release/v5.2.1/RDFox-linux-x86_64-5.2.1.zip |
83 | unzip RDFox-linux-4.2.0.zip | 113 | unzip RDFox-linux-x86_64-5.2.1.zip |
84 | ln -s RDFox-linux-4.2.0.zip/lib/JRDFox.jar | 114 | ln -s RDFox-linux-x86_64-5.2.1.zip/lib/JRDFox.jar |
85 | popd | 115 | popd |
86 | ``` | 116 | ``` |
87 | 117 | ||
@@ -106,11 +136,7 @@ Run the following from the base directory of the project to produce a standalone | |||
106 | sbt assembly | 136 | sbt assembly |
107 | ``` | 137 | ``` |
108 | 138 | ||
109 | The output of the command will print the location of the produced jar. Execute it with | 139 | The output of the command will print the location of the produced jar. |
110 | ``` | ||
111 | java -jar <path/to/fat.jar> [<option> ...] | ||
112 | ``` | ||
113 | |||
114 | Note that the fat jar file distributed with this repository excludes the RDFox as a dependency. Provided that you have the RDFox setup on your machine, you can run the program as follows | 140 | Note that the fat jar file distributed with this repository excludes the RDFox as a dependency. Provided that you have the RDFox setup on your machine, you can run the program as follows |
115 | ``` | 141 | ``` |
116 | java -cp <path/to/JRDFox.jar>:<path/to/fat.jar> uk.ac.ox.cs.rsacomb.RSAComb [<option> ...] | 142 | java -cp <path/to/JRDFox.jar>:<path/to/fat.jar> uk.ac.ox.cs.rsacomb.RSAComb [<option> ...] |
@@ -144,50 +170,32 @@ To run only functional tests for LUBM, excluding tests tagged as *slow* (that re | |||
144 | sbt "testOnly *functional.LUBM -- -l org.scalatest.tags.Slow" | 170 | sbt "testOnly *functional.LUBM -- -l org.scalatest.tags.Slow" |
145 | ``` | 171 | ``` |
146 | 172 | ||
147 | ## Changes introduced | 173 | ### Debugging |
148 | 174 | ||
149 | We tried to implement the system as close as possible to the theoretical description provided in [[1](#references)]. | 175 | You can set the logging level of RSAComb using the `-l | --logger` flag (see the help screen for more information). |
150 | Regardless, we had to deal with the fact that we where using different tools to carry out reasoning tasks and we where probably using a different language to implement the system. | 176 | When the logger is set to `verbose`, RSAComb will generate a set of files that contain the intermediate products of the program execution (these include the set of rules to generate the canonical model for the input ontology and the filtering rules derived from the input query). |
151 | The following is a (non exhaustive) summary of fixes (🔧), changes (🔄) and improvements (⚡), we introduced along the way: | 177 | These files are stored in the working directory, in a new folder named `rsacomb-<timestamp>`. |
152 | |||
153 | + 🔄 [RDFox](https://www.oxfordsemantic.tech/product) is used instead of DLV as the underlying LP engine. | ||
154 | |||
155 | + 🔧 In Def.4, the definition of built-in predicate `notIn` is wrong and should reflect the implicit semantics implied by the name, i.e., | ||
156 | 178 | ||
157 | > let [...] `notIn` be a built-in predicate which holds when the first argument is **not** an element of the set given as second argument | 179 | You can load these files directly into RDFox to simulate the same environment used by RSAComb, leaving you in a state just before the answer gathering process. |
158 | 180 | We also provide a convenient `simulate.rdfox` RDFox script that can be used to load all the necessary files in RDFox for you. | |
159 | This has been fixed by (1) introducing a built-in predicate `In` (note that instances of `In` can be computed beforehand since they only depend on the input ontology), and (2) implement `notIn` as the negation of `In` using RDFox *NaF* built-in support. | ||
160 | |||
161 | + 🔄 Top (`owl:Thing`) axiomatisation is performed introducing rules as follows. | ||
162 | Given `p` predicate (arity *n*) *in the original ontology*, the following rule is introduced: | ||
163 | ``` | ||
164 | owl:Thing[?X1], ..., owl:Thing[?Xn] :- p(?X1, ..., ?Xn) . | ||
165 | ``` | ||
166 | Note that, by definition, arity can be either 1 or 2. | ||
167 | 181 | ||
168 | + 🔄 Equality axiomatisation is performed introducing the following rules: | 182 | Let's suppose you run the following command from the root of the project |
169 | ``` | 183 | ```{.sh} |
170 | rsa:congruent[?X, ?X] :- owl:Thing[?X] . | 184 | java -cp lib/JRDFox.jar:target/scala-2.13/RSAComb-assembly-0.2.0.jar uk.ac.ox.cs.rsacomb.RSAComb -l verbose -o tests/lubm/univ-bench.owl -d tests/lubm/data/lubm1.ttl -q tests/lubm/queries.sparql |
171 | rsa:congruent[?Y, ?X] :- rsa:congruent[?X, ?Y] . | 185 | ``` |
172 | rsa:congruent[?X, ?Z] :- rsa:congruent[?X, ?Y], rsa:congruent[?Y, ?Z] . | ||
173 | ``` | ||
174 | defining equivalence as a congruence relation over terms in the ontology. | ||
175 | |||
176 | + 🔧 In Def. 4, the definition of built-in predicate `NI` is not consistent with its use in Table 3 and related description in Sec. 4.2. | ||
177 | We redefined `NI` as the set of all constants that are *equal* to a constant in the original ontology (according to the internal equality predicate `rsa:congruent`). | ||
178 | Note that, in this scenario, there is no need to introduce `NI` instances as facts in the system; | ||
179 | instead we can add a rule to populate the new predicate: | ||
180 | ``` | ||
181 | rsa:NI[?X] :- rsa:congruent[?X, ?Y], rsa:named[?Y] . | ||
182 | ``` | ||
183 | where `rsa:named` is an internal predicate keeping track of all constants in the original ontology. | ||
184 | |||
185 | + âš¡ In Def. 3, regarding the generation of the logic program used for the RSA check, only T5 axioms involving an unsafe role will introduce the internal predicates `PE` and `U`. | ||
186 | |||
187 | + âš¡ Both in the canonical model and the filtering program computations, rules without a body are loaded into RDFox as facts. | ||
188 | 186 | ||
189 | + ⚡ The `cycle` function introduced in Def.4 establishing the direction of the *unraveling* of loops is defined over triples `(A,R,B)`. We are currently limiting the triple only to those appearing in a T5 axiom `A ⊑ ∃R.B`. Note that this greatly limits the size of cycle for a given triple, and as a consequence limits the number of rules used to compute the canonical model. | 187 | This will answers all the queries in `tests/lubm/queries.sparql` and generate debug information in a new folder in the current working directory (let's say, `rsacomb-20211005120845/`). |
188 | You can run the provided RDFox script as follows | ||
189 | ```{.sh} | ||
190 | ./lib/RDFox-linux-x86_64-5.2.1/RDFox sandbox . "simulate <debug-folder> <data> <query-id>" | ||
191 | ``` | ||
192 | where | ||
193 | - `debug-folder` is the newly generated folder (`rsacomb-20211005120845` in this example) | ||
194 | - `<data>` is the path to the data file used with RSAComb (`tests/lubm/data/lubm1.ttl` in this example) | ||
195 | - `query-id` is the identifier of the query we want to simulate (if we want to simulate query 16 we will pass `16` as an argument) | ||
190 | 196 | ||
197 | This will launch a sandboxed RDFox console, where you will be able to explore a simulation of the datastore used by RSAComb. | ||
198 | You can also access the same datastore from the web interface at [http://localhost:12110/console/](http://localhost:12110/console/). | ||
191 | 199 | ||
192 | ## References | 200 | ## References |
193 | 201 | ||