What is my name?

Frans Englich englich at kde.org
Tue May 30 01:08:00 CEST 2006


On Monday 29 May 2006 21:43, Cornelius Schumacher wrote:
> On Monday 29 May 2006 23:10, Frans Englich wrote:
> > I am personally very interested in XQuery in the KDE context (what a
> > surprise). For those who don't know XQuery, it is basically SQL for
> > XML (XML Query == XQuery).
>
> Well, in contrast to SQL XQuery is missing the data manipulation part.

XQuery Update Facility
http://www.w3.org/TR/xqupdate/

Still at an early draft stage, though.

There is enormous pressure from the industry for XQuery. Basically, the database industry wants to get a piece of the XML market, and the XML market badly needs the backend software. IBM, Oracle, Microsoft, Datadirect, as well as about 40 other implementors (registered by the working group), have XQuery engines in their product lineups.

> > Combined with that the data model is abstracted,
> > means that we in KDE can let loose a full fledged query engine on any
> > data that can be represented in a hierarchical way. In other words, it
> > doesn't have to be "XML" as in "those text files" that is
> > queried/searched, but pretty much everything. Later on one can also
> > implement Full Text searching.[1]
>
> How useful is XQuery outside of a database context?

The more complex the queries you need to do, the larger the gain, I'd say. It can be very simple stuff. Try writing DOM code that evaluates the equivalent of "/foo/bar/moo[@myProperty = kde:myFunc()][1]" and that doesn't walk more elements than necessary to find the first "moo" element that is a child of "bar" and has an attribute named myProperty whose value equals the one returned from kde:myFunc().
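
To make the comparison concrete, here is a minimal Python sketch of the hand-written side. It is an illustration only, not code from our engine: xml.etree.ElementTree stands in for a DOM, my_func() is a hypothetical stand-in for kde:myFunc(), and first_matching_moo() is a name made up for this example. It reads the expression the way it is described above, as the first match in document order, and it stops walking as soon as that match is found.

```python
import xml.etree.ElementTree as ET


def my_func():
    # Hypothetical stand-in for kde:myFunc(); returns the value to match.
    return "wanted"


def first_matching_moo(root):
    """Hand-written walk for /foo/bar/moo[@myProperty = kde:myFunc()][1]."""
    if root.tag != "foo":
        return None
    wanted = my_func()  # evaluate the function once, not once per element
    for bar in root.iterfind("bar"):      # lazily visit 'bar' children of 'foo'
        for moo in bar.iterfind("moo"):   # lazily visit 'moo' children of 'bar'
            if moo.get("myProperty") == wanted:
                return moo                # stop at the first match
    return None


doc = ET.fromstring(
    "<foo>"
    '<bar><moo myProperty="other" id="a"/></bar>'
    '<bar><moo myProperty="wanted" id="b"/><moo myProperty="wanted" id="c"/></bar>'
    "</foo>"
)
match = first_matching_moo(doc)
print(match.get("id"))
```

Even for this small case the traversal order, the early exit, and the single evaluation of the function all have to be spelled out by hand, which is the part a query engine would take care of.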

Then, try to maintain that DOM code. That's also an important aspect. That is, writing for example "<foo>{my/data/to/select[last() - 1]}</foo>" can be a lot simpler, safer, and more readable than coding it by hand.
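
For a sense of what "coding it by hand" means for that one-liner, here is a rough Python sketch, again using xml.etree.ElementTree as a stand-in for a DOM. The function name build_foo() and the sample document are assumptions made up for this illustration.

```python
import copy
import xml.etree.ElementTree as ET


def build_foo(context):
    """Hand-written equivalent of <foo>{my/data/to/select[last() - 1]}</foo>."""
    foo = ET.Element("foo")
    # Each path step may match several elements, so every step is a loop.
    for my in context.findall("my"):
        for data in my.findall("data"):
            for to in data.findall("to"):
                selects = to.findall("select")
                # last() - 1 is the second-to-last 'select' under this 'to'.
                if len(selects) >= 2:
                    # Copy the node so the source tree is left untouched.
                    foo.append(copy.deepcopy(selects[-2]))
    return foo


context = ET.fromstring(
    "<root><my><data><to>"
    "<select>1</select><select>2</select><select>3</select>"
    "</to></data></my></root>"
)
result = build_foo(context)
print(ET.tostring(result, encoding="unicode"))  # <foo><select>2</select></foo>
```

The nested loops, the bounds check, and the explicit copy are all things a reader has to verify in the hand-written version, while the query states the intent in one line.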

Determining when XQuery/XSL-T/XPath can be useful is probably best done on a per-case basis. One can't really give an all-in-one answer.

Btw, Apple's Sherlock uses XQuery (but I know zero about it).

> Without the help of a 
> database backend it seems to be a bit questionable, if the enormous
> flexibility of XQuery is worth the performance.

I'm not sure what you refer to here with the word "flexibility". When I know, I can comment on "costs", etc.

> > It would probably be healthy to brainstorm a bit on the possibilities of
> > this on k-c-d or similar, to open up opportunities, and so on.
>
> This sounds like a good idea, but I'm not really sure that would lead to
> anywhere, when it's not clear yet how somebody would be able to use it at
> all given the lack of a public API.

Yeah, sounds sensible.

> > > Isn't this tied to the rest of KDom?
> >
> > Actually not, it will be the other way around; KHTML/KDOM will depend on
> > the XPath/XSL-T/XQuery code. KDOM/KHTML's trees are specialized for
> > rendering/web compatibility, leading to performance impacts for those who
> > only need data representation (nor do they allow different tree
> > implementations), and they also bring in dependencies on X11 and what not
> > :) We'll probably depend on kxmlcore beyond kdecore/qtcore, though.
> >
> > May I ask where your interest stems from? Identity constraints in XML
> > Schema? XPath need in XForms? Our parser/tokenizers are already geared
> > toward different XQuery/XPath dialects, so doing that kind of stuff is on
> > the roadmap.
>
> I would love to have an XPath implementation which could seamlessly be used
> in existing XML processing code. e.g. applying an XPath to a QDom document
> and getting a QDomElement as result.
>
> It would also be interesting to have an XSLT implementation which could be
> used instead of libxslt. But I'm not sure how much we would be able to gain
> from it. A very simple wrapper of libxslt already provides most of the API
> one would need and I don't know, if the performance of a new implementation
> would be comparable to libxslt.

It is hard to tell. libxslt beats many processors even though it's not very optimized, simply by being written in C. We have written performance-conscious C++ code, and in several areas we surpass libxml2 in optimizations. I think we are in a very good position in terms of performance because of C++ and the sophistication of the engine. The algorithms used have a massive impact on performance, and that's why, for example, Saxon (Java) can beat libxslt in many cases, and we're moving in the direction of Saxon's sophistication.

However, it is rather commonly agreed among implementors that among the most costly parts of an XSL-T transform are constructing the result tree (that is, memory allocation) and serialization (I/O, memory allocation). Using libxslt in KDE carries a notable performance cost here, since one must convert between trees and string representations. For example, I would like to see KOffice export to HTML/Docbook/etc. by transforming its internal tree directly, and that's an expensive operation if libxslt is used.

It is also a big matter of features, of course. Working with XSL-T 2.0/XPath 2.0 is a breeze compared to 1.0 (and this in turn also affects performance).

Another aspect is integration.

If the XSL-T engine is well-integrated into KHTML (it's a lot of work until then, but anyway), one can construct KHTML's tree directly, and render incrementally.

However, I'm still strongly interested in the KDE integration part. With the 2.0 technologies, the data model allows elements/attributes to contain any XML Schema type, so you can have an element with base64 content or a list of floating point numbers. One can also do things like registering your own XQuery/XSL-T expressions/functions and in this way hook "back" into your KDE/C++ code (with Qt introspection, elegant things can probably be done with that).

> Would be interesting to port for example 
> the help kioslave to your xslt lib and compare the performance. As with
> XPath it might be interesting to be able to apply XSLTs to existing
> representations of XML data, not only text, but also a QDom tree.
>
> XQuery is quite a beast,

In what way?

> so I don't know how interesting this would be to 
> application developers.

A widespread opinion is the exact opposite. People who teach courses in XSL-T or work with it a lot say that people with procedural backgrounds (C++ and Java programmers, etc.) have a hard time getting XSL-T's declarative approach. They don't use the power of templates, and write tons of for-each/if instead. So, many think XQuery will be a hit because it is easier for SQL people, has Perl-ish syntax, and so on.

Basically, there is no task you can do in XQuery that you cannot do with XSL-T (and vice versa). The languages are practical for different tasks (doing something like transforming Docbook to XHTML is hell with XQuery), and optimized for different things (XQuery is tailored towards computation and heavier querying).

So, if I were to pick up my crystal ball, I would actually put my money on people liking XQuery more.

> It still would be nice to have an implementation. 
> The API for this could be very simple as well when it just uses existing
> XML representations.
>
> I would really like to try your implementation. If it doesn't depend on
> KDom it should be possible to create something like a stand-alone package
> which already works now, right?

Yeah, it is very close. Here are the obstacles:

* It doesn't have a name, so we don't have a command line util yet, grr.

* It depends on a certain kdelibs revision; more specifically, we're not aligned with the i18n() and (k/kd)Debug() changes -- but that's easily fixed. The reason I haven't fixed it is that KDOM is non-trivial, and it's done in a branch (work/khtml-svg).

* The implementation of the features you would use, XML literals ("<foo/>"), path expressions ("foo/bar"), and I/O, has been postponed and is not done. But that is essentially all that remains, so after that XQuery is pretty much complete.

So, I unfortunately have to decline. But once a name is found, we can align with kdelibs, create a command line util, and do a kdelibs merge (and continue development in a branch).


Cheers,

		Frans




More information about the kde-quality mailing list