How to fix import of headings in DOC/DOCX (text:h instead of text:p)?

Lassi Nieminen lassniem at gmail.com
Sun Jul 13 17:36:27 BST 2014


For DOCX, the magical file is DocxXmlDocumentReader.cpp (I don't have the
code at hand at the moment, so method names may be slightly different).

For a style, the structure seems to be:

<w:pPr><w:outlineLvl w:val="0"/></w:pPr>

In read_pPr you need to add a call to a new method, read_outlineLvl, where
you can read the value if it is present.
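
As a rough illustration of what read_outlineLvl could do with the w:val
attribute, here is a minimal standalone sketch. It does not use the real
reader code: outlineLevelFromVal and toOdfOutlineLevel are hypothetical
names, and it assumes that w:val runs from 0 to 8 for headings (with 9
meaning "no level") while ODF's text:outline-level starts at 1.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not Calligra API): map an already validated
// w:outlineLvl value (0..8) to an ODF text:outline-level (1..9).
int toOdfOutlineLevel(int ooxmlLvl)
{
    assert(ooxmlLvl >= 0 && ooxmlLvl <= 8);
    return ooxmlLvl + 1;
}

// Hypothetical helper: interpret the w:val attribute string.
// Returns 0 when the value does not denote a heading level
// (missing, malformed, or 9, which is assumed to mean body text).
int outlineLevelFromVal(const std::string &val)
{
    if (val.size() != 1 || val[0] < '0' || val[0] > '8')
        return 0;
    return toOdfOutlineLevel(val[0] - '0');
}
```

The result would then be stored with the current paragraph properties or
style so that read_p can look at it later.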

In the actual document, w:pPr may contain the outlineLvl directly, or it may
reference some other style which has outlineLvl, here Heading1:

<w:p>
<w:pPr>
<w:pStyle w:val="Heading1"/>
</w:pPr>
<w:r>
<w:t>Hi</w:t>
</w:r>
</w:p>

This means that in read_pStyle you should also check whether the
referenced style has outlineLvl.
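
The style lookup could be sketched roughly like this. It is only an
illustration under assumptions: StyleInfo, StyleMap and resolveOutlineLevel
are made-up names, not what the filter actually uses, and following a chain
of parent styles (rather than just one referenced style) is my guess at what
is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical record of what was collected while reading the styles.
struct StyleInfo {
    std::string parent;  // name of the style this one references, "" if none
    int outlineLevel;    // 0 = not set, otherwise ODF level 1..9
};

typedef std::map<std::string, StyleInfo> StyleMap;

// Returns the outline level of the named style, or of the first style in
// its reference chain that has one; 0 if none is found.
int resolveOutlineLevel(const StyleMap &styles, std::string name)
{
    // The hop limit guards against cyclic style references.
    for (std::size_t hops = 0; hops <= styles.size() && !name.empty(); ++hops) {
        StyleMap::const_iterator it = styles.find(name);
        if (it == styles.end())
            return 0;
        assert(it->second.outlineLevel <= 9);
        if (it->second.outlineLevel > 0)
            return it->second.outlineLevel;
        name = it->second.parent;
    }
    return 0;
}
```

With the example above, a map entry "Heading1" carrying level 1 would make
resolveOutlineLevel(styles, "Heading1") return 1, and so would any style
that references Heading1 without setting a level itself.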

Now in the read_p method, replace someWriter->startElement("text:p") with
someWriter->startElement("text:h") if the paragraph style in use, or the
style it references, has outlineLvl set.
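
The decision in read_p boils down to something like the sketch below. It
builds the start tag as a plain string instead of going through the writer,
so paragraphElementName and paragraphStartTag are hypothetical stand-ins, and
0 is assumed to mean "no outline level found".

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: choose the ODF element for a paragraph.
// outlineLevel == 0 means neither the paragraph nor its style set a level.
const char *paragraphElementName(int outlineLevel)
{
    assert(outlineLevel >= 0);
    return outlineLevel > 0 ? "text:h" : "text:p";
}

// Hypothetical helper: the start tag the writer would end up producing.
// Headings also get the text:outline-level attribute.
std::string paragraphStartTag(int outlineLevel)
{
    std::string tag = std::string("<") + paragraphElementName(outlineLevel);
    if (outlineLevel > 0)
        tag += " text:outline-level=\"" + std::to_string(outlineLevel) + "\"";
    return tag + ">";
}
```

In the real filter the same choice would be made on the element name passed
to startElement, with the level added as an attribute on the heading.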

Hopefully this approach works.

-Lassi




On Sun, Jul 13, 2014 at 2:48 AM, Friedrich W. H. Kossebau <kossebau at kde.org>
wrote:

> Hi,
>
> it seems that the filters for DOCX (and DOC?) do not import headings as
> <text:h>, but only as <text:p> (so also without any text:outline-level
> attributes).
>
> Can be seen e.g. in the new "Navigation" docker, it does not show any
> structure of an imported DOCX file (other than e.g. in LO).
>
> (Found the problem actually with the generator plugin for Okular, which
> meanwhile also supports access to the structure, querying like the docker
> the outline-level property).
>
> I would like to fix this.
> So, dear DOC/DOCX import filter experts, where do I have to look exactly,
> to make headings being imported as <text:h> with the proper outline-level?
>
> Cheers
> Friedrich
>