[Nepomuk] Re: Advances in my alternative query system and several questions

Mon Feb 28 08:32:55 CET 2011

On 02/27/2011 02:32 PM, Ignacio Serantes wrote:
> Hi!
> 
> As I explained in my presentation, I'm working on an alternative to the
> nepomuksearch protocol and Dolphin's filter panel. My first attempt last
> weekend had basic success, but my SPARQL builder had some logic
> problems and the code was a mess, so yesterday I rewrote it from scratch.
> 
> I created a new class, nsSparqlBuilder, to build queries, and I think
> I have solved all the previous logic problems with negations and with
> mixing different ontologies. The current version supports the following
> query syntax:
> 
> exact_string = "string" | 'string'
> value = exact_string | number | string
> 
> term = [ + | - ] value [:ontology]
> query = term [ [ and | or ] term ]...
> 
> This is not the final syntax. Dates, comparison operators for numbers
> and dates (currently only "!=", "==" and "=" are supported), wildcards
> and parentheses are not ready yet, but they probably will be by next
> weekend.
> 
> Here are some working samples taken from my test cases:
> "kim sa rang ha ji won"
> "'kim sa rang' -'ha ji won'"
> "música:hastag tomtom:hastag or 'son dam bi':hasTag música:hastag"
> "guilty 9:rating"
> "悪魔と契約し:name"
> "'giruti akuma' or 'otoko':description"
> "+'abe hiroshi' 'natsukawa yui'"
> 
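A minimal sketch of how the term syntax quoted above could be tokenized,
in plain Python with no KDE bindings. This is not the nsSparqlBuilder
code: the names Term, TERM_RE and parse_query are hypothetical, the
implicit connective is assumed to be "and" (the message says "and" is
optional), and no operator precedence is applied because the message
does not define one.

```python
import re
from collections import namedtuple

# Hypothetical record for one parsed term; not part of nsSparqlBuilder.
Term = namedtuple("Term", "connective sign value prop")

# One term: optional +/- sign, a quoted or bare value, optional :property.
TERM_RE = re.compile(r"""
    \s*
    (?P<sign>[+-])?
    (?:
        "(?P<dq>[^"]*)"
      | '(?P<sq>[^']*)'
      | (?P<bare>[^\s:'"]+)
    )
    (?::(?P<prop>\w+(?::\w+)?))?
""", re.VERBOSE)


def parse_query(text):
    """Split a query string into Term tuples, left to right."""
    terms = []
    pending = "and"  # assumed implicit connective between adjacent terms
    pos = 0
    text = text.strip()
    while pos < len(text):
        match = TERM_RE.match(text, pos)
        if match is None:
            raise ValueError("cannot parse query at position %d" % pos)
        pos = match.end()
        bare = match.group("bare")
        is_keyword = (bare is not None
                      and bare.lower() in ("and", "or")
                      and match.group("sign") is None
                      and match.group("prop") is None)
        if is_keyword and terms:
            # "and"/"or" between two terms only sets the next connective.
            pending = bare.lower()
            continue
        # Take whichever of the three value alternatives matched.
        value = next(v for v in (match.group("dq"), match.group("sq"), bare)
                     if v is not None)
        terms.append(Term(pending if terms else None,
                          match.group("sign") or "",
                          value,
                          match.group("prop")))
        pending = "and"
    return terms
```

Calling parse_query("'kim sa rang' -'ha ji won'") yields two terms, the
second carrying the "-" sign and the implicit "and" connective. Keeping
the result as plain tuples leaves the query builder free to change
independently of the parser.
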
> There are no dependencies between the parser engine and the query
> builder, so don't worry: changing the query syntax is easy, and it would
> be no problem to implement a syntax similar to the current nepomukquery
> one with some enhancements. I have at least added "and" to this version,
> but you can write queries without using it :)
> 
> For lack of time I have only written 15 test cases, so some human
> testers are welcome. You only need Python 2.6 and the KDE Python
> bindings installed.
> 
> While coding, some questions came up, so I am asking you guys for help.
> 
> 1) What should the search engine do when you write a simple
> query like "music"?
> 
> Currently I use nsSparqlBuilder.ontologyFilters to determine which
> filters I must add to the query. So far nao:description, nao:identifier,
> nie:url and nao:hasTag are supported, but this list is maintained by
> hand, so I wonder if there is a better solution out there.

That is why I wrote the query API. It handles all that for you and is
optimized. There is no need for you to try to solve all the SPARQL
problems again.

> I added two methods, nsSparqlBuilder.AddOntologyFilter() and
> nsSparqlBuilder.RemoveOntologyFilter() (not coded yet :)),
> to change the behavior of the query builder dynamically, but the best
> solution would be an automatic method. Any suggestions?

Again: do not do it manually, use the query API.

> 2) Is there a method to determine an ontology type?

What do you mean by "ontology type"?

> Currently I store the type in nsSparqlBuilder.ontologyFilters because
> this information is required to build the filter clause, and this is far
> from a good solution. Coercing all data to strings causes problems
> with comparison operators and carries a time penalty, so I am looking
> for a better solution.
> 
> 3) When I search for a term, must I follow relations to find more
> items? At what level should I stop the search?
> 
> For example, I search for "korea" (without any ontology) and in my
> database there are files tagged with "korea", but there are also tags or
> contacts with "nao:isRelated" "korea". The case gets more complex if
> you consider that there are many kinds of relations: "nmm:actor",
> "nmm:writer", "nco:creator", etc.
> 
> Adding all of this will produce slow queries, so I am not sure about
> the best approach. There is even more work to do if you don't stop at
> the first level, so you could end up with a pretty big and slow query.
> 
> On the other hand, what is the best way to determine which
> relations I must use? I know that they are all in the documentation, but
> I'm asking for a non-hardcoded solution.

Use LiteralTerm("korea") and have it all done for you. :)

> 4) What are the basic ontologies that must be implemented?
> 
> I currently support the following abbreviations: description, hasTag,
> rating or numericRating, prefLabel, title, url or name, and tag, which
> are converted to the proper ontology. You can even write complete
> ontology names, so nao:url and url are the same. There is no limitation
> here and you can add any available ontology but, given the problem in
> point 2, this will probably not work well with non-string ontologies;
> more work is needed and tests are still pending.

I do not really see the problem. Do you mean you check for the
"nao:hasTag" string in the query? An alternative is always to query the
database for the properties. This is what the current query parser does.
If performance
is an issue you can always cache the properties.
Keep in mind that all ontologies are loaded into the database.
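
A minimal sketch of the caching idea, assuming the property names are
fetched once and then looked up by short name. PropertyCache and the
load_properties callable are hypothetical names for illustration; in a
real setup the loader would query the database for the known
properties, which is not shown here.

```python
class PropertyCache(object):
    """Resolve user-typed names such as "hastag" to full property names.

    load_properties is a hypothetical callable returning prefixed names
    such as "nao:hasTag". It is called once, on first lookup.
    """

    def __init__(self, load_properties):
        self._load_properties = load_properties
        self._index = None

    def _build_index(self):
        index = {}
        for qname in self._load_properties():
            # Index both the full name and the part after the prefix,
            # lowercased so that lookups are case-insensitive.
            index.setdefault(qname.lower(), qname)
            short = qname.split(":", 1)[-1].lower()
            # If two ontologies share a short name, the first one wins.
            index.setdefault(short, qname)
        return index

    def resolve(self, name):
        """Return the full property name, or None if it is unknown."""
        if self._index is None:
            self._index = self._build_index()
        return self._index.get(name.lower())
```

With a loader returning "nao:hasTag" and "nie:url", resolve("hastag")
gives "nao:hasTag" and later lookups do not call the loader again.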

> But the really big problem is with relation ontologies like nao:hasTag.
> In this particular case the query is "?x nao:hasTag ?y . ?y nao:identifier
> ?z . FILTER(?z)", so you need an inner join and you do the real
> filtering on nao:identifier, but this is not applicable to all
> relations. Some light on this would be welcome.

Actually the query should look like this:
?r nao:hasTag ?t . ?t nao:prefLabel ?l . ?l bif:contains "foobar" .
And that is why I can only stress: use the query API. This is why I
wrote it. :)
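
For illustration only, a sketch of what assembling that pattern by hand
involves, as plain Python string building. The function names are
hypothetical and this is not how the query API produces its SPARQL.
bif:contains is a Virtuoso extension, and multi-word search text may
need extra quoting inside the literal, which this sketch does not
handle.

```python
NAO_PREFIX = ("PREFIX nao: "
              "<http://www.semanticdesktop.org/ontologies/2007/08/15/nao#>")


def escape_sparql_literal(text):
    """Escape backslashes and double quotes for a "..." SPARQL literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def tag_label_pattern(search_text, resource_var="?r"):
    """Graph pattern matching resources whose tag label contains the text."""
    literal = escape_sparql_literal(search_text)
    return ('%s nao:hasTag ?t . ?t nao:prefLabel ?l . '
            '?l bif:contains "%s" .' % (resource_var, literal))


def tag_label_query(search_text):
    """Wrap the pattern in a complete SELECT query (hypothetical helper)."""
    return "%s\nSELECT DISTINCT ?r WHERE { %s }" % (
        NAO_PREFIX, tag_label_pattern(search_text))
```

tag_label_pattern("foobar") reproduces the pattern quoted above. Every
relation needs its own pattern of this kind plus correct escaping, which
is the bookkeeping the query API takes over.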

Cheers,
Sebastian