<html>

  <head>

    <meta content="text/html; charset=ISO-8859-1"

      http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    <div class="moz-cite-prefix">On 03/26/2013 04:32 PM, Sebastian Sauer

      wrote:<br>

    </div>

    <blockquote cite="mid:51516B4A.6050205@dipe.org" type="cite">


      <div class="moz-cite-prefix">On 03/26/2013 02:51 PM, Lassi

        Nieminen wrote:<br>

      </div>

      <blockquote

cite="mid:CABCpZwrTLGFWnrfTF6NaTeJgKtRqAcc46OC6UmAqu3ktA+mHKA@mail.gmail.com"

        type="cite">Hola,<br>

        <br>

        <div class="gmail_quote">On Mon, Mar 25, 2013 at 8:12 PM, Inge

          Wallin <span dir="ltr">&lt;<a moz-do-not-send="true"

              href="mailto:inge@lysator.liu.se" target="_blank">inge@lysator.liu.se</a>&gt;</span>

          wrote:<br>

          <blockquote class="gmail_quote" style="margin:0 0 0

            .8ex;border-left:1px #ccc solid;padding-left:1ex">

            <div class="im">On Monday, March 25, 2013 17:54:53 <a

                moz-do-not-send="true"

                href="mailto:matus.uzak@gmail.com">matus.uzak@gmail.com</a>

              wrote:<br>

              > Hi,<br>

              ><br>

              > sorry for not discussing earlier, but I did not have

              much free time last<br>

              > two weeks.<br>

              ><br>

              > I think we should continue the parser type discussion

              in order to also<br>

              > improve state of things in libmsooxml.  What we have

              there is a PULL<br>

              > parser. And I identified the following problems

              (Would be cool if Lassi<br>

              > could check those):<br>

              ><br>

              > 1. OOXML sometimes requires us to run the parser

              twice at one element in<br>

              > order to first collect selected information required

              to convert the content<br>

              > of child elements.<br>

              ><br>

              > 2. There are situations when conversion of the 1st

              child of the root<br>

              > element requires information from the last child of

              the root element.<br>

              <br>

            </div>

            It would be interesting to see some examples of these two

            issues.</blockquote>

          <div><br>

          </div>

          <div>As an example: in pptx files, in slides,</div>

          <div>there can be text which is specified to use theme color

            lt1</div>

          <div><br>

          </div>

          <div>Don't remember the exact syntax, but something like</div>

          <div>&lt;p&gt;</div>

          <div>&lt;rPr "color" = "lt1"/&gt;</div>

          <div>&lt;r&gt;Hejsan&lt;/r&gt;</div>

          <div>&lt;/p&gt;</div>

          <div> <br>

          </div>

          <div>Then as the last element of that slide there may or may

            not be</div>

          <div>&lt;clrMap "lt1" = "bg1" ...../&gt; // or something

            similar</div>

          <div><br>

          </div>

          <div>Which means that lt1 should be interpreted to be bg1 for

            this particular slide.</div>

          <div>Currently what we're doing is that we first read the

            slide once, skipping everything</div>

          <div>except clrMap. Then we read the slide again (yay!) and

            start the real conversion.</div>
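          <div>A minimal sketch of that two-pass approach, assuming a simplified slide format that mirrors the pseudo-markup above (real PresentationML uses namespaces and different element and attribute names). The function names and the SLIDE sample are hypothetical, and this is not the libmsooxml code:</div>

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical, simplified slide markup modelled on the pseudo-markup in this
# mail; real pptx slides look different.
SLIDE = """<slide>
  <p><rPr color="lt1"/><r>Hejsan</r></p>
  <clrMap lt1="bg1"/>
</slide>"""


def collect_color_map(xml_text):
    """Pass 1: ignore everything except clrMap and return its mapping."""
    color_map = {}
    source = io.BytesIO(xml_text.encode("utf-8"))
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "clrMap":
            color_map.update(elem.attrib)
    return color_map


def convert_runs(xml_text, color_map):
    """Pass 2: the real conversion, resolving theme colors through the map."""
    runs = []
    current_color = None
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start" and elem.tag == "p":
            current_color = None  # run properties do not leak across paragraphs
        elif event == "end" and elem.tag == "rPr":
            name = elem.get("color")
            # Fall back to the original name when the slide has no override.
            current_color = color_map.get(name, name)
        elif event == "end" and elem.tag == "r":
            runs.append((elem.text, current_color))
    return runs


color_map = collect_color_map(SLIDE)
runs = convert_runs(SLIDE, color_map)
```

          <div>The cost is visible here: the same input is parsed twice, only because clrMap comes after the text that depends on it.</div>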

          <div><br>

          </div>

          <div>There was something similar in xlsx filters too if my

            memory serves me correctly.</div>

          <div><br>

          </div>

        </div>

      </blockquote>

      <br>

      See also the somewhat related XmlWriteBuffer in

      filters/libmsooxml/MsooXmlUtils.h, which is used "when information

      that has to be written in advance is based on XML elements parsed

      later.  In such case the information cannot be saved in one pass"

      for OOXML=>ODF.<br>
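      <br>
      The general idea behind such a buffer can be sketched roughly as follows. This is only an illustration of deferred writing and not the XmlWriteBuffer API; the function name, the input format and the output markup are made up for the example, and no XML escaping is done:<br>

```python
import io


def convert(paragraphs):
    """paragraphs: list of (text, style_name) tuples in document order."""
    body = io.StringIO()  # temporary buffer for content that is seen "too early"
    used_styles = []      # only complete once all content has been read

    for text, style in paragraphs:
        if style not in used_styles:
            used_styles.append(style)
        body.write(f'<p style="{style}">{text}</p>')

    out = io.StringIO()
    # The style header must come first in the output, but it depends on
    # information gathered while reading the content.
    out.write("<styles>")
    for style in used_styles:
        out.write(f'<style name="{style}"/>')
    out.write("</styles>")
    out.write(body.getvalue())  # release the buffered content after the header
    return out.getvalue()


result = convert([("a", "S1"), ("b", "S2"), ("c", "S1")])
```

      The input is read only once; the price is that the buffered part has to be kept in memory until the header is known.<br>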

      <br>

      In the case of XSLT I also remember that there was a problem

      with offset-references, meaning something like (pseudo-xml):<br>

      <br>

      &lt;style&gt;<br>

        &lt;item&gt;index 0&lt;/item&gt;<br>

        &lt;item&gt;index 1&lt;/item&gt;<br>

        &lt;item&gt;index 2&lt;/item&gt;<br>

      &lt;/style&gt;<br>

      <br>

      &lt;content&gt;<br>

        &lt;content withStyleIndex="1"/&gt; // where 1 refers to the

      second style-item<br>

      &lt;/content&gt;<br>

      <br>

      XSLT does, iirc, not allow such index-based reference fetching,

      making it necessary to for-loop with a counter over the &lt;style&gt;

      items every time they are referenced. Super expensive, and iirc

      no caching is done (my knowledge there is a few years old, so

      maybe that has changed). A classic case where one would like to

      introduce a "caching concept": read all the items at once,

      prepare them, and access them later directly by index from a

      style container/manager. OOXML makes quite a lot of use of such

      index-based references, being a 1:1 port from C/C++ to XML.<br>
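      <br>
      Outside of XSLT, such a "caching concept" is only a few lines. A rough sketch, assuming a cleaned-up version of the pseudo-xml above wrapped in a hypothetical &lt;doc&gt; root; the function name and the sample document are made up for illustration:<br>

```python
import xml.etree.ElementTree as ET

# Hypothetical document based on the pseudo-xml in this mail.
DOC = """<doc>
  <style>
    <item>index 0</item>
    <item>index 1</item>
    <item>index 2</item>
  </style>
  <content>
    <content withStyleIndex="1"/>
  </content>
</doc>"""


def resolve_styles(xml_text):
    """Return the style referenced by each content node, in document order."""
    root = ET.fromstring(xml_text)
    # Read all style items once; afterwards every lookup is a plain list index
    # instead of a counting loop over the <style> children.
    styles = [item.text for item in root.find("style")]
    resolved = []
    for node in root.iter("content"):
        index = node.get("withStyleIndex")
        if index is not None:
            resolved.append(styles[int(index)])
    return resolved


resolved = resolve_styles(DOC)
```

      The style list is built once per document, so the cost of a reference no longer grows with the number of style items.<br>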

    </blockquote>

    <br>

    Also somewhat related: years ago, when the CleverAge OOXML=&gt;ODF

    converter sponsored by Microsoft appeared during the OOXML ISO

    battle, I investigated that code (for my diploma thesis, which had

    OOXML&lt;=&gt;ODF as its subject). It has lots of intermediate steps

    (pre- and post-processing, multiple XSLT runs). Hard to say whether

    that was caused by ugly design decisions alone or driven by XSLT

    limitations (I would think both).<br>

    <br>

    Code is still available at:

<a class="moz-txt-link-freetext" href="http://odf-converter.svn.sourceforge.net/viewvc/odf-converter/trunk/source/">http://odf-converter.svn.sourceforge.net/viewvc/odf-converter/trunk/source/</a><br>

    Readme:

<a class="moz-txt-link-freetext" href="http://odf-converter.svn.sourceforge.net/viewvc/odf-converter/trunk/source/Readme.txt?revision=5309&view=markup">http://odf-converter.svn.sourceforge.net/viewvc/odf-converter/trunk/source/Readme.txt?revision=5309&view=markup</a><br>

    The main converter lib:

<a class="moz-txt-link-freetext" href="http://odf-converter.svn.sourceforge.net/viewvc/odf-converter/trunk/source/Common/OdfConverterLib/">http://odf-converter.svn.sourceforge.net/viewvc/odf-converter/trunk/source/Common/OdfConverterLib/</a><br>

    The xsl's:

<a class="moz-txt-link-freetext" href="http://odf-converter.svn.sourceforge.net/viewvc/odf-converter/trunk/source/Common/OdfConverterLib/resources/oox2odf/">http://odf-converter.svn.sourceforge.net/viewvc/odf-converter/trunk/source/Common/OdfConverterLib/resources/oox2odf/</a><br>

    <br>

    It wasn't that bad, but I can confirm what Rob Weir's blog said back

    then: the converter takes &gt;10x longer than anything else and is a

    memory monster.<br>

    <br>

    <blockquote cite="mid:51516B4A.6050205@dipe.org" type="cite"> <br>

      <br>

      <fieldset class="mimeAttachmentHeader"></fieldset>

      <br>

      <pre wrap="">_______________________________________________

calligra-devel mailing list

<a class="moz-txt-link-abbreviated" href="mailto:calligra-devel@kde.org">calligra-devel@kde.org</a>

<a class="moz-txt-link-freetext" href="https://mail.kde.org/mailman/listinfo/calligra-devel">https://mail.kde.org/mailman/listinfo/calligra-devel</a>

</pre>

    </blockquote>

    <br>

  </body>

</html>