[Kde-pim] decodeRFC2047String

Ingo Klöcker kloecker at kde.org
Sun Aug 23 18:19:25 BST 2009


On Sunday 23 August 2009, Martin Koller wrote:
> On Sunday 23 August 2009, Thomas McGuire wrote:
> > Hi,
> >
> > On Saturday 22 August 2009 23:26:25 you wrote:
> > > trying to fix a drag'n drop bug from a mail URL into an
> > > AddresseeLineEdit (e.g. drop a mailto url with an umlaut to the
> > > identity dialog in kmail/advanced tab/reply to or BCC field -
> > > same problem as bug 138725) I wanted to use a decodeRFC2047String
> > > method.
> > > But there are some of them ... :-(
> > >
> > > There is one in KMime, and one in KIMAP.
> > > Which is the preferred one ? Why are there 2 at all ?
> > > (kmail uses the one from KMime)
> > >
> > > Also, the latter in KIMAP seems to have a bug: when I drag/drop a
> > > mailto url with an umlaut in it, e.g. "Götz" <goetz at gmail.com>
> > > KIMAP's decode function snips out the Götz and returns ""
> > > <goetz at gmail.com>
> >
> > RFC2047 encoded strings are not allowed to have umlauts, the
> > umlauts need to be encoded (which is the whole point of RFC2047).
> > So KIMAP's error handling in that case is not the best. Does KMime
> > do something better?
>
> Misunderstanding. What gets dragged is already RFC2047 encoded, but
> the patch is about decoding it back to a QString which can then be
> displayed correctly.
>
> > In the patch, I'm sceptical of the "address.toUtf8()" part, because
> > the RFC2047-encoded string should contain only ASCII characters,
> > and never anything else.

"Be strict when sending and tolerant when receiving." (RFC 1958 - 
Architectural Principles of the Internet)

The decodeRFC2047() method we used to have in KMail followed this 
principle by allowing non-ASCII characters in the input. Those 
non-ASCII characters were left untouched. In particular, this 
decodeRFC2047() method was idempotent, i.e.
decodeRFC2047( s ) == decodeRFC2047( decodeRFC2047( s ) ) for all 
QString s. Moreover, decodeRFC2047() could be applied to the content of 
the message headers without checking whether they are actually RFC2047 
encoded (optimistic decoding).
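
To illustrate the tolerant behaviour described above, here is a minimal
standalone sketch in plain C++. It is not the old KMail code and not the
KMime or KIMAP API: decodeTolerant() and its helpers are made-up names,
the input is assumed to be UTF-8 bytes in a std::string instead of a
QString, and only the utf-8 and iso-8859-1 charsets are handled. Anything
that is not a well-formed encoded-word, including raw non-ASCII bytes, is
copied through unchanged. Dropping whitespace between adjacent
encoded-words, which RFC 2047 asks for, is left out to keep it short.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// All names below are hypothetical; this is not KMime or KIMAP code.
static int hexValue(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

// "Q" encoding: '_' means space, "=XX" is one byte written in hex.
static std::string decodeQ(const std::string &in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] == '_') {
            out += ' ';
        } else if (in[i] == '=' && i + 2 < in.size() &&
                   hexValue(in[i + 1]) >= 0 && hexValue(in[i + 2]) >= 0) {
            out += static_cast<char>(hexValue(in[i + 1]) * 16 + hexValue(in[i + 2]));
            i += 2;
        } else {
            out += in[i];  // tolerant: keep anything malformed as-is
        }
    }
    return out;
}

// "B" encoding: plain base64; padding and unknown characters are skipped.
static std::string decodeB(const std::string &in) {
    static const std::string alphabet =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    unsigned buffer = 0;
    int bits = 0;
    for (char c : in) {
        const std::size_t pos = alphabet.find(c);
        if (pos == std::string::npos) continue;
        buffer = (buffer << 6) | static_cast<unsigned>(pos);
        bits += 6;
        if (bits >= 8) {
            bits -= 8;
            out += static_cast<char>((buffer >> bits) & 0xFF);
        }
    }
    return out;
}

// Convert ISO-8859-1 bytes to UTF-8 (every byte >= 0x80 becomes two bytes).
static std::string latin1ToUtf8(const std::string &in) {
    std::string out;
    for (unsigned char c : in) {
        if (c < 0x80) {
            out += static_cast<char>(c);
        } else {
            out += static_cast<char>(0xC0 | (c >> 6));
            out += static_cast<char>(0x80 | (c & 0x3F));
        }
    }
    return out;
}

// Decode every well-formed "=?charset?enc?text?=" and copy the rest verbatim.
std::string decodeTolerant(const std::string &in) {
    const std::size_t npos = std::string::npos;
    std::string out;
    std::size_t pos = 0;
    while (pos < in.size()) {
        const std::size_t start = in.find("=?", pos);
        if (start == npos) break;
        const std::size_t q1 = in.find('?', start + 2);
        const std::size_t q2 = (q1 == npos) ? npos : in.find('?', q1 + 1);
        const std::size_t end = (q2 == npos) ? npos : in.find("?=", q2 + 1);
        bool ok = false;
        std::string decoded;
        if (end != npos && q2 == q1 + 2) {
            std::string charset = in.substr(start + 2, q1 - start - 2);
            for (char &c : charset)
                c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
            const int enc = std::toupper(static_cast<unsigned char>(in[q1 + 1]));
            const bool latin1 = (charset == "iso-8859-1");
            if ((enc == 'Q' || enc == 'B') && (latin1 || charset == "utf-8")) {
                const std::string text = in.substr(q2 + 1, end - q2 - 1);
                decoded = (enc == 'Q') ? decodeQ(text) : decodeB(text);
                if (latin1) decoded = latin1ToUtf8(decoded);
                ok = true;
            }
        }
        if (ok) {
            out.append(in, pos, start - pos);
            out += decoded;
            pos = end + 2;
        } else {
            out.append(in, pos, start + 2 - pos);  // not an encoded-word: keep "=?"
            pos = start + 2;
        }
    }
    out.append(in, pos, npos);
    return out;
}
```

With this sketch, =?utf-8?Q?G=C3=B6tz?= <goetz@gmail.com> decodes to
Götz <goetz@gmail.com>, while an already decoded "Götz" <goetz@gmail.com>
passes through untouched, so applying the function a second time to
either result changes nothing.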


> > It would fail if the charset of the string 
> > would be Japanese sjis, instead of Utf8 (although both are wrong).
>
> You're right. I was misled by the implementation inside kmime, which
> calls at one point decodeRFC2047String( src, usedCS, "utf-8", false
> );

In the case of a mailto URL it is safe to assume that umlauts have been 
encoded properly. In the patch the line
+          address = KUrl::fromPercentEncoding( u.path().toLatin1() );
should be
+          address = u.path();
since KUrl::path() already returns the decoded path.
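
Why the extra decoding step is harmful and not just redundant can be
shown with a small standalone sketch. percentDecode() below is a made-up
stand-in written in plain C++, not the KUrl implementation; it only
assumes the usual "%XX" rule. Decoding a string that has already been
decoded turns a literal '%' followed by two hex digits into a different
character.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of KUrl or Qt.
static int hexDigit(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

// Replace every "%XX" with the byte it encodes; copy everything else.
std::string percentDecode(const std::string &in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] == '%' && i + 2 < in.size() &&
            hexDigit(in[i + 1]) >= 0 && hexDigit(in[i + 2]) >= 0) {
            out += static_cast<char>(hexDigit(in[i + 1]) * 16 + hexDigit(in[i + 2]));
            i += 2;
        } else {
            out += in[i];
        }
    }
    return out;
}
```

For the encoded path a%2541@example.com, one pass gives the intended
a%41@example.com, whereas a second pass gives aA@example.com.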

Now the question is whether all applications apply RFC 2047 encoding 
on the addresses they put into a mailto URL. Our safest bet would be to 
write a variant of KMime::decodeRFC2047String() accepting a QString as 
input and allowing/ignoring any non-ASCII characters (see above).

Then the patch could be reduced to
-          contents.append( (*it).path() );
+          contents.append( KMime::decodeRFC2047String( (*it).path() ) );

BTW, there is another minor error in AddresseeLineEdit::dropEvent(). If 
the dropped URL is not a mailto URL, a comma is still appended to the 
content of the line edit. The lines
  if ( !contents.isEmpty() ) {
    contents.append( ", " );
  }
should be moved just before the above line appending the dropped 
address.
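
The intended order can be sketched as follows. This is a simplified
stand-in in plain C++, not the actual dropEvent() code:
appendDroppedAddresses() is a made-up name, plain strings replace KUrl,
and the scheme check is reduced to a prefix comparison. The point is only
that the separator is added after the URL has been accepted as a mailto
URL.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the loop in AddresseeLineEdit::dropEvent().
std::string appendDroppedAddresses(std::string contents,
                                   const std::vector<std::string> &urls) {
    const std::string scheme = "mailto:";
    for (const std::string &url : urls) {
        // Skip anything that is not a mailto URL before touching contents.
        if (url.compare(0, scheme.size(), scheme) != 0) {
            continue;
        }
        // Only now add the separator, directly before the address itself.
        if (!contents.empty()) {
            contents.append(", ");
        }
        contents.append(url.substr(scheme.size()));
    }
    return contents;
}
```

Dropping a single non-mailto URL onto a line edit that already contains
an address then leaves the content unchanged instead of adding a
trailing ", ".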


Regards,
Ingo