[Nepomuk] Query parser - idea.

Fri Apr 19 20:39:32 UTC 2013

Hi,

My name is Lukasz, I'm from Poland, and I'd like to take part in GSoC this
year. I'm pretty impressed by the quality of the code and the way KDE
development works. You're doing a great job and I want to work with you! :)

To the point: I want to rewrite the current query parser. I've been having a
good time thinking about it, and I wonder whether using Bison + Flex to
generate the parser is desirable. Bison is already used in several KDE
subprojects, so I think it might be the best choice if we consider future
maintenance and code readability.

A few weeks ago I made a patch which optimizes the regexp cache mechanism in
Nepomuk. Some pattern recognition is involved, and the whole code looks
pretty complicated and might be hard to understand at first glance. On the
other hand, code written for Bison and Flex looks clearer, and I believe we
should go that way. What is your opinion?

To achieve the goal (
http://community.kde.org/GSoC/2013/Ideas#Project:_A_.22real.22_query_parser_and_Query_Builder_Widget),
we will need two parsers: one for checking a completed query and compiling
it to a SPARQL request, and another for checking the correctness of an
unfinished query. If an unfinished query is correct, it will be compiled to
a more general request (for example, a query ending with "hastag:" will be
compiled to something like "get all tags, taking the earlier part of the
query into account"). I'll try to show how I imagine the process of parsing
a single query.

First, all query extensions (localized words like "music" or "yesterday",
dates and times) will be mapped in the traditional way, by simply replacing
strings in the query.
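
As a rough sketch of this first step (in Python for brevity, not as
proposed Nepomuk code): the table LOCALIZED_TERMS, the Polish words in it,
the canonical forms "type:music" and "date:yesterday", and the function
name normalize_query are all made up for illustration.

```python
import re

# Hypothetical mapping from localized words to canonical query terms.
# Keys must be lowercase; none of these names come from Nepomuk itself.
LOCALIZED_TERMS = {
    "muzyka": "type:music",
    "wczoraj": "date:yesterday",
}


def normalize_query(query, table=None):
    """Replace every whole-word localized term with its canonical form."""
    table = LOCALIZED_TERMS if table is None else table
    if not table:
        return query
    # Try longer words first so a short key never shadows a longer one.
    words = sorted(table, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(w) for w in words) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: table[m.group(1).lower()], query)


print(normalize_query("Muzyka wczoraj"))
```

Plain replacement cannot tell a keyword from a value: with this sketch,
"hastag:muzyka" would be rewritten too, which is one reason to leave
everything after this step to a real parser.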

Second, the query will be checked by a parser which will tell whether it is
already complete or not. If it's not, but it is, let's say, semi-complete,
a mechanism will provide a list of available keywords. For example, for the
semi-complete query "hastag:" the autocomplete list will contain all
available tags. If the query is more specific, the autocomplete list will
take that into account as well.
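
To make the second step more concrete, here is a hand-written sketch in
Python. It is not Bison/Flex output and not Nepomuk code: the key list
KNOWN_KEYS, the tag list KNOWN_TAGS and the three state names are
assumptions, and a real implementation would fetch the tags from the
Nepomuk store.

```python
# Hypothetical data; a real version would query Nepomuk for the tags.
KNOWN_KEYS = ("hastag", "type", "date")
KNOWN_TAGS = ["holiday", "home", "work"]


def check_term(term):
    """Classify one whitespace-separated term of the query."""
    if ":" not in term:
        return "complete"  # plain search word
    key, _, value = term.partition(":")
    if key not in KNOWN_KEYS:
        return "invalid"
    return "complete" if value else "incomplete"


def analyze(query):
    """Return (state, suggestions) for a possibly unfinished query."""
    terms = query.split()
    if not terms:
        return "incomplete", []
    states = [check_term(t) for t in terms]
    # Only the last term is allowed to be unfinished.
    if "invalid" in states or "incomplete" in states[:-1]:
        return "invalid", []
    suggestions = []
    last = terms[-1]
    if last.startswith("hastag:"):
        # An empty prefix matches every tag; a longer one narrows the list.
        prefix = last[len("hastag:"):]
        suggestions = [t for t in KNOWN_TAGS if t.startswith(prefix)]
    return states[-1], suggestions


print(analyze("hastag:"))
print(analyze("photo hastag:ho"))
```

With this toy data, "hastag:" yields every tag, while "photo hastag:ho"
narrows the list to the tags starting with "ho".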

Lastly, the finished query will be parsed and compiled into a SPARQL request.
Any parsing errors might be shown to the user as suggestions and... that's
all :) The most difficult part of this task will probably be writing the
grammar and optimizing the autocomplete mechanism so that it provides hints
quickly and without using much CPU.
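
For the last step, a minimal Python sketch of what "compiled into a SPARQL
request" could mean, handling only "hastag:<name>" terms. I am assuming
that tags are attached with nao:hasTag and named with nao:prefLabel; the
function names and the exact shape of the generated query are
illustrative, and the query Nepomuk actually needs may differ.

```python
# Assumed namespace of the NAO ontology used for tagging.
PREFIX = "PREFIX nao: <http://www.semanticdesktop.org/ontologies/2007/08/15/nao#>"


def escape_literal(text):
    """Escape backslashes and double quotes for a SPARQL string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def compile_to_sparql(query):
    """Compile a finished query made of 'hastag:<name>' terms to SPARQL."""
    patterns = []
    for i, term in enumerate(query.split()):
        key, sep, value = term.partition(":")
        if key != "hastag" or not sep or not value:
            # A real parser would report this to the user as a suggestion.
            raise ValueError("unsupported or unfinished term: %r" % term)
        patterns.append(
            '?r nao:hasTag ?t%d . ?t%d nao:prefLabel "%s" .'
            % (i, i, escape_literal(value))
        )
    if not patterns:
        raise ValueError("empty query")
    body = "\n  ".join(patterns)
    return "%s\nSELECT DISTINCT ?r WHERE {\n  %s\n}" % (PREFIX, body)


print(compile_to_sparql("hastag:work hastag:home"))
```

Each term contributes one graph pattern, so several "hastag:" terms select
only resources that carry all of the named tags.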

You have much deeper knowledge of and experience with Nepomuk, so could you
tell me what you think? Or is there some reason why parts of my idea may
not work as expected? I realize this description is very general; I'd be
very happy to describe it in more detail :)

Have a nice day,

Lukasz