Using ODF Relax NG schema to generate easier XML writing classes

Jos van den Oever jos.van.den.oever at kogmbh.com
Sat Jun 11 18:07:52 BST 2011


Hi again,

Here is a new version with some improvements:
 - attribute setters have the right type (double, int, string, bool, etc.)
 - attribute setters carry information on whether the attribute is required
 - attribute setters carry information on which values from a list are allowed
 - each Writer now has an end() call which has to be called: using the 
destructor for this is not nice for script bindings
 - better parsing of all the information in the Relax NG schema (was 
incomplete before)
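
To make the list above concrete, here is a minimal sketch of what a typed,
schema-aware writer for text:h could look like. It is not the generated odf.h:
the class name TextHWriterSketch, the use of std::ostream instead of
KoXmlWriter, and the choice to report a missing required attribute by throwing
from end() are all assumptions made for illustration.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a generated writer; not the real odf.h class.
class TextHWriterSketch {
public:
    explicit TextHWriterSketch(std::ostream& out) : out_(out) {
        out_ << "<text:h";
    }
    // Typed setter: the schema declares text:outline-level as an integer.
    void writeTextOutlineLevel(int level) {
        attribute("text:outline-level", std::to_string(level));
        hasOutlineLevel_ = true;
    }
    // Typed setter for a boolean attribute.
    void writeTextIsListHeader(bool value) {
        attribute("text:is-list-header", value ? "true" : "false");
    }
    // Text content closes the start tag first (no escaping in this sketch).
    void addTextNode(const std::string& text) {
        closeStartTag();
        out_ << text;
    }
    // Explicit end() instead of relying on the destructor.
    void end() {
        if (!hasOutlineLevel_) {
            throw std::logic_error("text:outline-level is required");
        }
        closeStartTag();
        out_ << "</text:h>";
    }

private:
    void attribute(const std::string& name, const std::string& value) {
        if (startTagClosed_) {
            throw std::logic_error("attributes must precede content");
        }
        out_ << ' ' << name << "=\"" << value << '"';
    }
    void closeStartTag() {
        if (!startTagClosed_) {
            out_ << '>';
            startTagClosed_ = true;
        }
    }
    std::ostream& out_;
    bool hasOutlineLevel_ = false;
    bool startTagClosed_ = false;
};

// Usage example: writes one heading into a string.
std::string headingExample() {
    std::ostringstream out;
    TextHWriterSketch h(out);
    h.writeTextOutlineLevel(1);
    h.addTextNode("Hello ODF!");
    h.end();
    return out.str();
}
```

Passing a string where an int is expected then fails at compile time, while
the required-attribute check in this sketch only fires at run time.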

On Saturday, June 11, 2011 04:57:19 AM Sebastian Sauer wrote:
> Also this seems to be a very regression friendly approach. I don't think
> throwing away what we have is a good idea nor that doing the job
> (loading+saving) in 2 completely different ways is good.
This approach can coexist with the current approach since the KoXmlWriter is 
always available. However, see below for why it would be better to eventually 
have only this approach.
The new approach is not radically different from what we have. As long as it is 
optional, it is only syntactic sugar.

On Saturday, June 11, 2011 07:53:43 AM Thorsten Zachmann wrote:
> Sometimes different type of attributes are possible e.g. number or text
Fixed, this is supported. It could be improved with types for QUrl, QTime, etc., 
which KoXmlWriter does not support yet. See the mapping in this patch:
http://gitorious.org/odfkit/webodf/commit/303ae84171634ea51be602d961e957eb61b3ed10/diffs

On Saturday, June 11, 2011 07:52:25 AM Thorsten Zachmann wrote:
> >   TextPWriter textP1 = textContentWriter.startTextPWriter();
> >   TextPWriter textP2 = textContentWriter.startTextPWriter();
> >   textP1.writeText("hello");
> > 
> > At debug time, such errors can be detected by passing a digital baton
> > between the classes and reporting an error if a class tries to write
> > without having a baton. In the above code, the textP1 would have the
> > baton and textContentWriter cannot instantiate a textP2 until it gets
> > back the baton when textP1 is destructed. That would add overhead that
> > can be disabled in a release.
> 
> Not sure if I understood correctly, but how would that solve the issue of
> nested tags?
Yes, nesting as in the example is not easy to fix at compile time. What would 
be possible to fix is auto-closing of tags, but at an overhead cost.
So now the situation is the same as with startElement() and endElement() in 
KoXmlWriter.
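
The baton idea from the quoted text can be sketched as follows. This is only
an illustration under assumptions: Baton and ElementSketch are hypothetical
names, output goes to a std::string rather than KoXmlWriter, and violations
are reported with exceptions so they are easy to observe. In a real debug aid
the checks would more likely be assertions that are compiled out in release
builds.

```cpp
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical debug aid: records which writer may currently write.
struct Baton {
    const void* holder = nullptr;
};

// Hypothetical generic element writer that passes the baton around.
class ElementSketch {
public:
    // Root element: nobody may hold the baton yet.
    ElementSketch(Baton& baton, std::string& out, std::string tag)
        : baton_(baton), out_(out), tag_(std::move(tag)), parent_(nullptr) {
        if (baton_.holder != nullptr) {
            throw std::logic_error("baton already taken");
        }
        open();
    }
    // Child element: the parent must hold the baton and hands it over.
    ElementSketch(ElementSketch& parent, std::string tag)
        : baton_(parent.baton_), out_(parent.out_), tag_(std::move(tag)),
          parent_(&parent) {
        parent.check();
        open();
    }
    void addTextNode(const std::string& text) {
        check();
        out_ += text;
    }
    // Closes the element and returns the baton to the parent.
    void end() {
        check();
        out_ += "</" + tag_ + ">";
        baton_.holder = parent_;
    }

private:
    void open() {
        baton_.holder = this;
        out_ += "<" + tag_ + ">";
    }
    void check() const {
        if (baton_.holder != this) {
            throw std::logic_error(tag_ + " does not hold the baton");
        }
    }
    Baton& baton_;
    std::string& out_;
    std::string tag_;
    ElementSketch* parent_;
};

// Opening a second paragraph while the first is still open is rejected.
bool secondParagraphRejected() {
    Baton baton;
    std::string xml;
    ElementSketch text(baton, xml, "office:text");
    ElementSketch p1(text, "text:p");
    try {
        ElementSketch p2(text, "text:p");  // text gave its baton to p1
    } catch (const std::logic_error&) {
        return true;
    }
    return false;
}

// Ending each paragraph before starting the next one works.
std::string sequentialParagraphs() {
    Baton baton;
    std::string xml;
    ElementSketch text(baton, xml, "office:text");
    ElementSketch p1(text, "text:p");
    p1.addTextNode("one");
    p1.end();
    ElementSketch p2(text, "text:p");
    p2.addTextNode("two");
    p2.end();
    text.end();
    return xml;
}
```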

Now for some more serious advantages to this approach that go beyond merely 
avoiding coding errors and coding faster with autocompletion.

Once the writing and reading of ODF XML is behind this generated API, we get 
these advantages:
 1) easy to support different ODF versions or at least easy to see where 
certain tags or attributes are not valid at compile time.
 2) easy to upgrade to a new experimental ODF version and find changes
 3) possibility to use a different serialization such as EXI (Efficient XML 
Interchange) or JSON.
 4) possibility to optimize serialization more easily: all code is in one 
place and the API has no XML tag or attribute name strings, only values.
 5) ability to easily change the prefixes (text:, style: etc)

Point 3 is especially important: the filters generate ODF XML, write it to a 
ZIP, then unzip and read it again. The mobile developers know how much of a 
performance burden this is. If we could use a faster serialization (perhaps 
EXI [1]) then importing non-ODF documents will be much faster.
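
A rough sketch of why a generated API makes the serialization swappable: the
calling code only names elements through typed classes, so the backend behind
them can change. Everything here is hypothetical (SinkSketch, XmlSinkSketch,
EventSinkSketch, TextPWriterSketch); the current code sits directly on
KoXmlWriter, and the in-memory event list merely stands in for an alternative
format, it is not EXI.

```cpp
#include <string>
#include <vector>

// Hypothetical backend interface that generated writers would talk to.
class SinkSketch {
public:
    virtual ~SinkSketch() = default;
    virtual void startElement(const std::string& name) = 0;
    virtual void text(const std::string& value) = 0;
    virtual void endElement(const std::string& name) = 0;
};

// Backend 1: plain XML text (no escaping in this sketch).
class XmlSinkSketch : public SinkSketch {
public:
    void startElement(const std::string& name) override {
        xml += "<" + name + ">";
    }
    void text(const std::string& value) override { xml += value; }
    void endElement(const std::string& name) override {
        xml += "</" + name + ">";
    }
    std::string xml;
};

// Backend 2: keeps the events in memory instead of serializing them.
class EventSinkSketch : public SinkSketch {
public:
    void startElement(const std::string& name) override {
        events.push_back("start " + name);
    }
    void text(const std::string& value) override {
        events.push_back("text " + value);
    }
    void endElement(const std::string& name) override {
        events.push_back("end " + name);
    }
    std::vector<std::string> events;
};

// Stand-in for a generated writer: the element name lives here only.
class TextPWriterSketch {
public:
    explicit TextPWriterSketch(SinkSketch& sink) : sink_(sink) {
        sink_.startElement("text:p");
    }
    void addTextNode(const std::string& value) { sink_.text(value); }
    void end() { sink_.endElement("text:p"); }

private:
    SinkSketch& sink_;
};

// Filter code is identical for every backend.
void writeParagraph(SinkSketch& sink) {
    TextPWriterSketch p(sink);
    p.addTextNode("Hello ODF!");
    p.end();
}
```

With this shape, a filter could hand its output to the reader without the
XML and ZIP round trip, provided the reading side accepts the same events.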

The API as I use it now is of course up for discussion. I can imagine people 
have ideas for improvement. E.g. one could support chaining, like so:

 TextPWriter(text).writeTextStyleName("bold")
    .addTextNode("Hello ODF!").end();
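
Chaining like this would work if every setter returned a reference to the
writer itself. A minimal sketch, assuming a hypothetical
ChainedTextPWriterSketch that writes to a std::string instead of a parent
writer:

```cpp
#include <string>

// Hypothetical chainable writer; not the generated TextPWriter.
class ChainedTextPWriterSketch {
public:
    explicit ChainedTextPWriterSketch(std::string& out) : out_(out) {
        out_ += "<text:p";
    }
    // Each setter returns *this so that calls can be chained.
    ChainedTextPWriterSketch& writeTextStyleName(const std::string& name) {
        out_ += " text:style-name=\"" + name + "\"";
        return *this;
    }
    ChainedTextPWriterSketch& addTextNode(const std::string& text) {
        closeStartTag();
        out_ += text;  // no escaping in this sketch
        return *this;
    }
    // end() returns nothing, so it terminates the chain.
    void end() {
        closeStartTag();
        out_ += "</text:p>";
    }

private:
    void closeStartTag() {
        if (!closed_) {
            out_ += ">";
            closed_ = true;
        }
    }
    std::string& out_;
    bool closed_ = false;
};

// Usage example: the temporary lives until the end of the full expression.
std::string chainedParagraph() {
    std::string xml;
    ChainedTextPWriterSketch(xml)
        .writeTextStyleName("bold")
        .addTextNode("Hello ODF!")
        .end();
    return xml;
}
```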

Also, the naming scheme could be changed. At the moment text:p becomes 
TextPWriter. This could also be text::pWriter or odf1_2::text::out::P.
Take into account that for setting the attributes, no '::' is possible. For 
this reason I've not used them in the class names either. An ODF_1_2 namespace 
makes sense though.
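
For comparison, the namespaced variant could look roughly like this. The
class body is a hypothetical placeholder writing to a std::string; only the
naming is the point. Note that the attribute setter keeps a flattened name,
because a method name cannot contain '::'.

```cpp
#include <string>

namespace odf1_2 {
namespace text {
namespace out {

// Hypothetical writer for text:p under the namespaced naming scheme.
class P {
public:
    explicit P(std::string& out) : out_(out) { out_ += "<text:p"; }
    // The prefix has to stay inside the method name.
    void writeTextStyleName(const std::string& name) {
        out_ += " text:style-name=\"" + name + "\"";
    }
    void end() { out_ += "/>"; }

private:
    std::string& out_;
};

}  // namespace out
}  // namespace text
}  // namespace odf1_2

// An alias can keep the flat name available for existing code.
using TextPWriterAlias = odf1_2::text::out::P;

std::string namespacedParagraph() {
    std::string xml;
    TextPWriterAlias p(xml);
    p.writeTextStyleName("bold");
    p.end();
    return xml;
}
```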

I leave you with the current example code:

#include "odf.h"
#include <QtCore/QBuffer>
#include <QtCore/QFile>

int
main() {
    // Send the generated XML to standard output.
    QFile out;
    out.open(stdout, QIODevice::WriteOnly);
    KoXmlWriter xml(&out);
    {
        // Writers are nested by passing the parent writer to the constructor.
        OfficeDocumentWriter doc(&xml);
        OfficeBodyWriter content(doc);
        OfficeTextWriter text(content);
        TextHWriter h(text);
        h.writeTextOutlineLevel(1);
        h.addTextNode("Hello ODF!");
        h.end();
        TextPWriter p(text);
        p.addTextNode("This is paragraph 1.");
        p.end();
        p = TextPWriter(text);
        p.addTextNode("This is paragraph 2.");
        p.end();
        // end() has to be called explicitly on every writer.
        text.end();
        content.end();
        doc.end();
    }
    out.close();
    return 0;
}

Note that this currently produces output without the required office:version 
attribute, and that text:outline-level in text:h is not part of the 
constructor at the moment even though it is required.

Cheers,
Jos

[1] http://www.w3.org/TR/2011/REC-exi-20110310/


-- 
Jos van den Oever, software architect
+49 391 25 19 15 53
074 3491911
http://kogmbh.com/legal/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: odf.h.bz2
Type: application/x-bzip
Size: 61883 bytes
Desc: not available
URL: <http://mail.kde.org/pipermail/calligra-devel/attachments/20110611/44989ba3/attachment.bin>

