Hey guys<br><br>I've lately been working on better automated testing for Nepomuk, and since I've never attempted anything like this before, I'm not sure if I'm going in the right direction.<br><br>Nepomuk has a server-client architecture: there is a 'nepomukstorage' process which hosts the Virtuoso database, and all other Nepomuk processes communicate with it via D-Bus + a local socket. Most of the client libraries are just thin wrappers around these socket/D-Bus calls, along with some caching. So, in order to test any of the client libraries, we need to have the Nepomuk server components running.<br>
<br>I wrote a small library to create a fake D-Bus + KDE session and then start a proper Nepomuk environment. This environment is created before each of the tests.<br><br>Problem -<br><br>* Unit tests are very slow - each requires a fresh Nepomuk instance to be run. We cannot test any one class independently, since they all generally need to communicate with the database and require the ontologies to be installed.<br>
<br>Approximate time to run a test: 3 to 5 minutes. Maybe someone could look into the code (nepomuk-core/autotests/lib/tools/)? It's just a couple of shell scripts.<br><br>Installing the library<br>----------------------------<br>
<br>Should I be installing the library? It's purely for testing, but it would be useful if someone else wanted to write a Nepomuk-enabled test.<br>
<br>Query Testing<br>--------------------<br>Nepomuk has a query library which provides a C++ interface for writing queries, which it then converts to SPARQL. The existing tests simply compare the string output of the query library against hand-crafted SPARQL queries. Maintaining these tests is hard, since a slight optimization might change the generated SPARQL even though the results are the same.<br>
<br>To improve this situation, I started writing proper tests for the query library which actually check that the correct results are returned. For these tests to work, we need to push data into Nepomuk, which requires the entire unit testing environment described above. It also requires injecting data into Nepomuk, which, depending on the quantity, can take some time.<br clear="all">
<br>I wrote a simple DataGenerator class, which creates contacts, files, emails and other data in Nepomuk. The queries are then run against this data, and the results are checked. Is this the right way to go about it?<br>
<br>Backup-Restore Testing<br>----------------------------------<br>Exact same problem: we need data and a test environment, and both take a lot of time.<br><br>Benchmarking<br>---------------------<br>I have been thinking about using this data generator to quantify improvements. How fast are searches for an email when you have 100 emails? 1k? 100k? The same goes for pushing that kind of data into Nepomuk. It's fairly slow right now, and we need proper measurements.<br>
<br>Where do these benchmarks go? Are they supposed to be in the main repo? They also require this entire test environment.<br>-----<br><br>And finally - Are these really unit tests? We aren't testing one class at a time. Should these reside in the autotests directory?<br>
<br>Overall, I'd just like to know that I'm not doing something stupid :)<br><br>-- <br><span style="color:rgb(192,192,192)">Vishesh Handa</span><br><br>