Review Request: Make Calligra able to load foo.doc that really contains a docx file, and similar for xls and ppt.

Inge Wallin inge at lysator.liu.se
Mon May 2 14:47:03 BST 2011



> On May 2, 2011, 1:38 p.m., David Faure wrote:
> > Such hardcoded lists of mimetypes and extensions make me cringe :)
> > 
> > Did you first check what KMimeType::findByContent returns? Surely in many cases it can help, when findByUrl is fooled by a wrong extension.
> > Although it might not be able to distinguish between types of msoffice docs, for lack of good magic.
> > But at least it should be able to tell you easily whether a .doc file is OLE or not.

Yes, the table is very much based on what KMimeType::findByContent returns. I can't say that I really like it, but the only way to fix this the "right" way is to add much more data to the common mimetype databases. That would be the best solution, but it would take a long time until it was processed by FreeDesktop.org and even longer until it reached the Linux distributions (and Maemo/MeeGo even more so).
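
To illustrate the kind of table meant here, a minimal sketch in plain C++ follows. It is not the code from the patch: the function name correctedMimeType and the three entries are made up for illustration, and the sniffed type is assumed to come from something like KMimeType::findByContent.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical helper, not taken from the patch: given the file extension and
// the mimetype reported by content sniffing, return the mimetype that should
// be handed to the import filters.
std::string correctedMimeType(const std::string &extension,
                              const std::string &sniffedType)
{
    // Illustrative entries only; the actual table in the patch may differ.
    // Key: (extension, sniffed mimetype). Value: replacement mimetype.
    static const std::map<std::pair<std::string, std::string>, std::string> table = {
        {{"doc", "application/zip"},
         "application/vnd.openxmlformats-officedocument.wordprocessingml.document"},
        {{"xls", "application/zip"},
         "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"},
        {{"ppt", "application/zip"},
         "application/vnd.openxmlformats-officedocument.presentationml.presentation"},
    };

    const auto it = table.find(std::make_pair(extension, sniffedType));
    // No matching entry: trust what content sniffing reported.
    return it != table.end() ? it->second : sniffedType;
}
```

With such a table, a file named foo.doc that is sniffed as application/zip would be loaded as docx, while any combination without an entry keeps the sniffed type unchanged.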

If you think I should put more of the table into code, I could do so, but the table was the best way I could come up with. Regarding the hardcoded extensions that make you cringe: they made me do the same while writing it, but as far as I'm aware these are the only types of files that are mislabeled this way. Would magic like this make sense to add to KMimeType?


- Inge


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://git.reviewboard.kde.org/r/101271/#review3056
-----------------------------------------------------------


On May 2, 2011, 1:40 p.m., Inge Wallin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://git.reviewboard.kde.org/r/101271/
> -----------------------------------------------------------
> 
> (Updated May 2, 2011, 1:40 p.m.)
> 
> 
> Review request for Calligra and David Faure.
> 
> 
> Summary
> -------
> 
> Lately, a lot of documents have been cropping up that have names like foo.doc but that really are docx files inside. The same goes for xls/xlsx and ppt/pptx. This patch handles this case by not just using the name to determine the mimetype of the file while loading, but also looking at the contents (KMimeType::findByContent). It also introduces a replacement scheme to take care of reported mimetypes like application/zip instead of, say, docx.
> 
> 
> Diffs
> -----
> 
>   krita/plugins/filters/fastcolortransfer/fastcolortransfer.cpp fc94465 
>   krita/sdk/tests/filestest.h ef6f0f0 
>   krita/ui/kis_import_catcher.cc 8c2c42a 
>   libs/main/KoDocument.cpp 1ed2052 
>   libs/main/KoFilterManager.h fc7731c 
>   libs/main/KoFilterManager.cpp f840f69 
> 
> Diff: http://git.reviewboard.kde.org/r/101271/diff
> 
> 
> Testing
> -------
> 
> This patch is tested on all combinations of doc/docx, ppt/pptx and xls/xlsx as well as on files containing the actual format that the names suggest.
> 
> 
> Thanks,
> 
> Inge
> 
>
