[Nepomuk] More information about queries performance issues

Ignacio Serantes kde at aynoa.net
Mon Feb 27 00:32:13 UTC 2012


Hi,

Here is more information about the query performance issues, obtained by
running some tests on my home system.

1) If a query fails and the error "[Virtuoso Server]SQ200: The memory pool
size 400016176 reached the limit 400000000 bytes, try to increase the
MaxMemPoolSize ini setting." is not raised, Virtuoso goes crazy and the
only solution is to restart it.
2) When Virtuoso goes crazy it keeps working and queries still work as
usual, but it consumes a lot of CPU on the failing query. This can be
observed by executing the command status(); in isql.
3) The current QueryParser() only works with up to 4 terms; with 5 terms,
queries begin to fail.
4) With Nepoogle I'm hacking the generated query to use subqueries, and the
problems are reduced but not solved. There is a new limit of 28 terms
before the problems begin again.
5) Nepoogle's SPARQL queries are slower than QueryParser() queries (hacked
subqueries version) up to a certain number of terms (around 8-10); beyond
that, Nepoogle's queries are faster. This comparison always uses hastag:,
to minimize the impact of other aspects of the queries.
6) I tried Nepoogle with more than 200 terms (repeating 11 terms, because
I can't obtain results using 200 different terms) without errors, and it is
really fast. I'm assuming that the Virtuoso optimizer is reusing subquery
results, which would be correct and expected behavior, because the query
takes less than 1 second and that would not be logical otherwise.
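To make the "flat query versus subqueries" distinction above more concrete,
here is a minimal Python sketch that builds both shapes for a list of tags.
It only illustrates the structure: it is not the SPARQL that QueryParser()
or Nepoogle actually emits, and the use of nao:hasTag and nao:prefLabel with
plain string literals, as well as all function names, are assumptions made
for the example.

```python
# Illustrative only: two possible shapes for a multi-tag SPARQL query.
# Assumed vocabulary: nao:hasTag / nao:prefLabel from the NAO ontology.
NAO = "http://www.semanticdesktop.org/ontologies/2007/08/15/nao#"


def tag_pattern(index, tag):
    """Return triple patterns matching resources ?r tagged with `tag`."""
    # Escape backslashes and double quotes so the literal stays well-formed.
    label = tag.replace("\\", "\\\\").replace('"', '\\"')
    return f'?r nao:hasTag ?t{index} . ?t{index} nao:prefLabel "{label}" .'


def flat_query(tags):
    """All tag patterns joined in one group graph pattern."""
    patterns = "\n  ".join(tag_pattern(i, t) for i, t in enumerate(tags))
    return f"PREFIX nao: <{NAO}>\nSELECT DISTINCT ?r WHERE {{\n  {patterns}\n}}"


def subquery_query(tags):
    """Each tag pattern wrapped in its own subquery that projects only ?r."""
    blocks = "\n  ".join(
        "{ SELECT DISTINCT ?r WHERE { " + tag_pattern(i, t) + " } }"
        for i, t in enumerate(tags)
    )
    return f"PREFIX nao: <{NAO}>\nSELECT DISTINCT ?r WHERE {{\n  {blocks}\n}}"


if __name__ == "__main__":
    print(flat_query(["tag1", "tag2"]))
    print()
    print(subquery_query(["tag1", "tag2"]))
```

In the second shape each subquery exposes only ?r, so the outer query joins
small result sets instead of one large pattern. Whether that is what helps
Virtuoso here is a guess.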

I ran these tests using QueryParser(), but the full search API must be
affected too if it builds queries the way QueryParser() does.

The test machine:
-Distribution: openSUSE 11.4
-KDE version: 4.8.0 (official packages updated to the latest available version)
-Memory reserved for Nepomuk: 100 MBytes
-Files indexed: 27,318
-Database size: 262.1 MBytes

I had little free time this weekend, so *these tests may not be accurate*,
in particular the speed comparison and the term limits.

Nepoogle's git version has a new prefix "e2" that uses the API query hacked
with subqueries, so if others are interested in doing some tests, it's easy
to generate queries that can be tested with NepSak, or your favorite
method, by running Nepoogle in a console. Examples:

   - tag1 using Nepoogle's engine: *nepoogle --verbose=on --results=off
   hastag:tag1*
   - tag1 using QueryParser(): *nepoogle --verbose=on --results=off e0
   hastag:tag1*
   - tag1 using hacked version of QueryParser(): *nepoogle --verbose=on
   --results=off e2 hastag:tag1*

The previous commands print the equivalent SPARQL query and a summary to
stdout and don't display results.
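For anyone who wants to probe the term limits, here is a small Python sketch
that generates the command lines shown above for a growing number of terms.
The flags and the e0/e2 prefixes are the ones from this mail; passing several
hastag: terms as separate arguments is an assumption, and the tag names and
function names are made up for the example. Terms are repeated by cycling
through a short list, as I did for the 200-term test.

```python
from itertools import cycle, islice

# Engine prefixes as described in this mail; "" means Nepoogle's own engine.
ENGINES = {
    "nepoogle": "",
    "queryparser": "e0",
    "queryparser-subqueries": "e2",
}


def repeat_terms(base_tags, count):
    """Return `count` tags by cycling through `base_tags` (terms repeat)."""
    return list(islice(cycle(base_tags), count))


def build_command(engine, tags):
    """Build the argument list for one nepoogle run.

    Assumption: several hastag: terms can be given as separate arguments.
    """
    args = ["nepoogle", "--verbose=on", "--results=off"]
    prefix = ENGINES[engine]
    if prefix:
        args.append(prefix)
    args.extend(f"hastag:{tag}" for tag in tags)
    return args


if __name__ == "__main__":
    base = [f"tag{i}" for i in range(1, 12)]  # 11 hypothetical tag names
    for count in (4, 5, 28, 29, 200):
        for engine in ENGINES:
            print(" ".join(build_command(engine, repeat_terms(base, count))))
```

The script only prints the commands, so they can be pasted into a console or
fed to whatever test method you prefer.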

-- 
Best wishes,
Ignacio