Could you help me with parsing .DOC files?
Yuriy Kardapolov
clotofdarkness at hotmail.com
Thu Apr 14 11:46:11 BST 2011
Hello,
Project background:
I need to read .doc files in ASP.NET for our project (a converter).
I have downloaded Microsoft's documentation on the MS Word file format.
However, the documentation is hard to follow and only describes the individual MS Word structures.
I can read the compound file format (OLE2) and extract any stream from it, such as "WordDocument", "1Table", "0Table", etc.
I can get the text from the "WordDocument" stream. As far as I know, it contains all of the document's text.
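To illustrate the starting point, here is a minimal sketch in Python (not .NET, and not code from the project) of how the header at the start of the "WordDocument" stream could be checked to find out which table stream is in use. It assumes the field layout given in Microsoft's [MS-DOC] specification: wIdent at offset 0, nFib at offset 2, and a flags word at offset 10 whose bit 0x0200 (fWhichTblStm) selects the stream named "1Table" rather than "0Table". The function and constant names are made up for this example.

```python
import struct

WORD_MAGIC = 0xA5EC        # wIdent value expected for Word binary files
F_WHICH_TBL_STM = 0x0200   # bit 9 of the flags word at offset 10


def read_fib_base(word_document: bytes) -> dict:
    """Parse a few header fields from the raw "WordDocument" stream bytes."""
    if len(word_document) < 12:
        raise ValueError("stream too short to contain a FIB header")
    # All fields are little-endian; wIdent and nFib are 16-bit values
    # at offsets 0 and 2.
    w_ident, n_fib = struct.unpack_from("<HH", word_document, 0)
    if w_ident != WORD_MAGIC:
        raise ValueError("not a Word binary document (bad wIdent)")
    # The 16-bit flags word sits at offset 10.
    (flags,) = struct.unpack_from("<H", word_document, 10)
    table_stream = "1Table" if flags & F_WHICH_TBL_STM else "0Table"
    return {"nFib": n_fib, "table_stream": table_stream}


if __name__ == "__main__":
    # Synthetic 12-byte header, built only to exercise the function:
    # wIdent, nFib, three unused 16-bit fields, then the flags word.
    header = struct.pack("<HHHHHH", 0xA5EC, 0x00C1, 0, 0, 0, 0x0200)
    print(read_fib_base(header))  # {'nFib': 193, 'table_stream': '1Table'}
```

In a real file the bytes would come from the OLE2 reader mentioned above instead of a hand-built header.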
I have also downloaded wvWare 2 but can't compile it.
What I want to know is how to parse .DOC files and get text formatting such as font name, color, size, boldness, etc.
My question:
Could you advise me on how to read text formatting? Which structures should I read for that in my .NET project?
Any advice or suggestions would be very helpful!
Thank you in advance,
Kardapolov Yuriy