Could you help me parse .DOC files?
Yuriy Kardapolov
clotofdarkness at hotmail.com
Thu Apr 14 11:55:28 BST 2011
Hello,
Project background:
I need to read .doc files in ASP.NET for our project (a converter).
I have downloaded Microsoft's documentation on the MS Word file format,
but it is very tangled and only describes the various MS Word structures.
I can read the compound file format (OLE2) and extract any stream from it, such as "WordDocument", "1Table", "0Table", etc.
I can get the text from the "WordDocument" stream. As far as I know, it holds all the text of the document.
I have also downloaded wvWare 2 but can't compile it.
What I want to know is how to parse .DOC files and get text formatting such as font name, color, size, boldness, etc.
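For illustration, a first step toward the formatting data could look like the sketch below. It is written in Python rather than .NET for brevity and is only a minimal sketch, not a full parser. It assumes the bytes of the "WordDocument" stream have already been extracted from the OLE2 container, and it reads the fixed header (FibBase) at the start of that stream to find out which table stream the document uses (the format documentation names the two candidates "1Table" and "0Table"). The character formatting runs are indexed from that table stream. The function name parse_fib_base and the returned dictionary keys are hypothetical, chosen for this example.

```python
import struct

FIB_MAGIC = 0xA5EC            # wIdent: marks a Word binary file
FLAG_ENCRYPTED = 0x0100       # fEncrypted bit in the FibBase flags word
FLAG_WHICH_TABLE = 0x0200     # fWhichTblStm bit in the FibBase flags word


def parse_fib_base(word_document: bytes) -> dict:
    """Read a few FibBase fields from the start of the WordDocument stream.

    Hypothetical helper: only the fields needed to pick the table stream
    are decoded; the rest of the FIB is ignored.
    """
    if len(word_document) < 12:
        raise ValueError("stream too short to contain a FibBase header")

    # Offset 0: wIdent (2 bytes), offset 2: nFib (2 bytes), little-endian.
    w_ident, n_fib = struct.unpack_from("<HH", word_document, 0)
    if w_ident != FIB_MAGIC:
        raise ValueError("not a Word binary document (bad wIdent)")

    # Offset 0x0A: 16-bit flags word holding fEncrypted and fWhichTblStm.
    (flags,) = struct.unpack_from("<H", word_document, 0x0A)

    return {
        "nFib": n_fib,
        "encrypted": bool(flags & FLAG_ENCRYPTED),
        "table_stream": "1Table" if flags & FLAG_WHICH_TABLE else "0Table",
    }
```

Calling parse_fib_base(data)["table_stream"] on the extracted stream bytes then tells you which table stream to open next. The remaining steps (locating the character property tables through the offsets stored later in the FIB) are not covered here.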
My question:
Could you advise me how to read text formatting? Which structures should I read for that in my .NET project?
Any advice or suggestions would be very helpful!
Thank you in advance,
Kardapolov Yuriy
>I'm not really working on wvWare anymore. Actually, the code has been copied
>into the Calligra office suite (http://www.calligra-suite.org) repository, and
>that is where people are really working on the filter. It might be better to
>contact them with your questions.
-Benjamin
On Thursday 14 April 2011 04:23:26 you wrote:
> Hello Benjamin Cail,
>
> My questions:
>
> 1) Is it possible to compile wvWare 2 in MS Visual C++ 6? Is there any
> alternative way to compile wvWare 2 on Windows where I would be able to
> debug and trace the code?
>
> 2) Could you advise me how to read text formatting? Which structures should
> I read for that?
>
> Any advice or suggestions would be very helpful!
>
> Thank you in advance,
> Kardapolov Yuriy