Problems with mimetype recognition
Andras Mantia
amantia at kde.org
Wed Oct 1 21:04:22 BST 2003
On Wednesday 01 October 2003 22:46, David Faure wrote:
> On Wednesday 01 October 2003 10:26, Andras Mantia wrote:
> > Hi,
> >
> > Although we have a method to determine whether a file is text or not with
> > the help of mimetypes and the [X-KDE-Text] property, I still get reports
> > from users who say that their PHP files are not recognized as text. I got
> > an example file from them and was surprised to see that the mimetype
> > detection fails in KMimeType. In KMimeType::findFormatByFileContent the
> > mimetype is first queried with findByFileContent(), which in turn calls
> > KMimeMagic::self()->findFileType(). For the PHP file in question this call
> > returns "application/octet-stream", and for another one, which is
> > recognized as a text file, surprise: "text/x-c++-src". So even the files
> > that were recognized as text were recognized only by chance, as they were
> > not treated as PHP.
> >
> > This means that either the "magic code" is broken or the magic fields for
> > PHP are broken. Can someone with more knowledge take a look? If needed, I
> > can send the files in question.
>
> I just added a rule that looks for <?php at the beginning of the file.
> The problem is that we can only check at offset 0. The XDG mimetype standard
> suggests checking between offsets 0 and 64, but this isn't currently
> possible in our magic file, only by code.
Not so good, but may be acceptable.
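
For what it's worth, the 0..64 range check could be done by code fairly
easily. A minimal sketch in plain C++, assuming the start of the file has
already been read into a buffer; hasMagicWithin() is a made-up name, not an
existing KMimeMagic function, and std::string only stands in for that buffer:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of KMimeMagic): returns true if `magic`
// starts at any offset in [0, maxOffset] of `data`.
bool hasMagicWithin(const std::string &data, const std::string &magic,
                    std::size_t maxOffset = 64)
{
    assert(!magic.empty());
    if (data.size() < magic.size())
        return false;
    // Last offset at which `magic` still fits completely inside `data`.
    const std::size_t lastFit = data.size() - magic.size();
    const std::size_t last = std::min(maxOffset, lastFit);
    for (std::size_t off = 0; off <= last; ++off) {
        if (data.compare(off, magic.size(), magic) == 0)
            return true;
    }
    return false;
}
```

With this, hasMagicWithin(buffer, "<?php") would also accept a PHP file that
starts with a blank line or a few lines of HTML, which a rule fixed at
offset 0 misses.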
>
> > I think a workaround would be to use findByURL() instead of
> > findByFileContent() as that one first tries a match by the extension, but
> > this doesn't solve the problem that findByFileContent() returns the wrong
> > mimetype.
>
> Yes, you should definitely use findByURL().
> The amount of PHP files not named *.php must be very very small, IMHO.
I use findByURL() in my code, but it returns application/x-php, and from this I
can't figure out whether it's a text file or not (well, I could if I had a list
of text mimetypes...). This was the reason why the [X-KDE-Text] property and
findFormatByFileContent() were introduced. And findFormatByFileContent()
fails because it doesn't use findByURL(). I think I will add the findByURL()
call there as well, although in that case findFormatByFileContent() no longer
looks for the format by content in every case. ;-)
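
Roughly what I have in mind, as a self-contained sketch rather than the real
KMimeType code. All names here (mimeByName, mimeByContent, isTextMime,
findFormat) and the small tables are made up for illustration; they only
stand in for findByURL(), findByFileContent() and the [X-KDE-Text] lookup:

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Mimetype name plus the "is text" flag that [X-KDE-Text] would provide.
struct Format {
    std::string mimeType;
    bool text;
};

static const char *const kDefaultMime = "application/octet-stream";

// Stand-in for findByURL(): match by file name extension only.
std::string mimeByName(const std::string &fileName)
{
    static const std::map<std::string, std::string> byExt = {
        {".php", "application/x-php"},
        {".txt", "text/plain"},
    };
    const std::string::size_type dot = fileName.rfind('.');
    if (dot == std::string::npos)
        return kDefaultMime;
    const auto it = byExt.find(fileName.substr(dot));
    return it != byExt.end() ? it->second : kDefaultMime;
}

// Stand-in for findByFileContent(): a single magic rule at offset 0.
std::string mimeByContent(const std::string &content)
{
    return content.compare(0, 5, "<?php") == 0 ? "application/x-php"
                                               : kDefaultMime;
}

// Stand-in for the [X-KDE-Text] property: text/* plus flagged mimetypes.
bool isTextMime(const std::string &mime)
{
    static const std::set<std::string> flagged = {"application/x-php"};
    return mime.compare(0, 5, "text/") == 0 || flagged.count(mime) > 0;
}

// Try the name first and fall back to the content only if that fails.
Format findFormat(const std::string &fileName, const std::string &content)
{
    std::string mime = mimeByName(fileName);
    if (mime == kDefaultMime)
        mime = mimeByContent(content);
    return {mime, isTextMime(mime)};
}
```

So a file named index.php that starts with plain HTML would still come back
as application/x-php with the text flag set, while the content is only
sniffed for files whose name gives no answer.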
Andras
>
> --
> David FAURE, faure at kde.org, sponsored by Trolltech to work on KDE,
> Konqueror (http://www.konqueror.org), and KOffice (http://www.koffice.org).
--
Quanta Plus developer - http://quanta.sourceforge.net
K Desktop Environment - http://www.kde.org