Review Request 114219: Do not encode QString to QByteArray and cast it back to QString. This causes problems when there are Unicode characters in ${HOME}

Yichao Yu yyc1992 at gmail.com
Sat Nov 30 02:13:12 GMT 2013



> On Nov. 29, 2013, 7:45 p.m., Thomas Lübking wrote:
> > kcontrol/krdb/krdb.cpp, line 102
> > <http://git.reviewboard.kde.org/r/114219/diff/1/?file=221836#file221836line102>
> >
> >     QFile::encodeName() seems to be equivalent to QString::toLocal8Bit(), and QFile::decodeName() to QString::fromLocal8Bit().
> >     
> >     I don't think one can just drop one of them; whether transcoding is required probably depends on what is done to the string in the interim.
> >     
> >     If anything, "KToolInvocation::klauncher()->setLaunchEnv()" would perform a second decode, so it probably depends on what that does.
> >     
> >     Did the reporter determine "locale charmap" in the bug?
> >     
> >     ---
> >     
> >     Secret world domination plan:
> >     ------------------------------
> >     #1: classified
> >     #2: classified
> >     #3: force ASCII as global standard
> >     #4: classified
> >     #5: classified
> >     #6: classified
> >     #7: classified
> >     #8: classified
> >     #9: classified
> >     #a: classified
> 
> Yichao Yu wrote:
>     encodeName/toLocal8Bit encodes a Unicode string into a C-string/byte-array representation, and decodeName/fromLocal8Bit does the reverse.
>     
>     The proper decoding is already done by QFile::decodeName above, so the QString already holds the right Unicode string.
>     
>     Basically, QString is not a wrapper around an arbitrary C string/byte array but a wrapper around a Unicode string. Whatever is done to a QString before or after should therefore assume it is a valid Unicode string, independent of whatever encoding (UTF-8 in the case of D-Bus) is needed afterward.
>     
>     Encoding to a byte array and casting it back can only introduce a wrong encoding in the second conversion, and it will not affect what is done in setLaunchEnv.
>
> 
> Yichao Yu wrote:
>     Or, in other words, QString has no encoding (well, by which I mean the internal encoding is transparent to the user); only byte arrays and C strings have an encoding.
>
> 
> Thomas Lübking wrote:
>     According to the API docs, QString(QByteArray) actually behaves differently in Qt4 and Qt5 (fromAscii -> fromUtf8), but the caller should not encode anyway, because setLaunchEnv() already does so:
>     
>     void KLauncher::setLaunchEnv(const QString &name, const QString &value)
>     {
>     #ifndef USE_KPROCESS_FOR_KIOSLAVES
>        klauncher_header request_header;
>        QByteArray requestData;
>        requestData.append(name.toLocal8Bit()).append('\0').append(value.toLocal8Bit()).append('\0');
>     
>     Also, QString(QByteArray) is obviously problematic by itself because of the apparent Qt4/Qt5 "incompatibility".
> 

I guess you can also put it this way (setLaunchEnv has toLocal8Bit inside), although I still think the simplest way is to remember QString -- encode --> QByteArray, QByteArray -- decode --> QString and to always do the necessary explicit conversion.

That's why I hate hate hate this constructor (and I've already fixed 3-4 bugs in KDE caused by it). It might actually be helpful to compile KDE with it commented out and replace everything with explicit conversions...
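
To make the failure mode concrete, here is a minimal sketch in plain standard C++ (no Qt). It only models the behaviour discussed above and is not the actual krdb or Qt code: UnicodeString stands in for QString, Bytes for QByteArray, and encodeUtf8/decodeUtf8/decodeLatin1 are hypothetical helpers. It assumes a UTF-8 locale (so toLocal8Bit would produce UTF-8) and that the implicit Qt4 cast decodes the bytes as Latin-1, which as far as I know is the fromAscii default when no C-string codec is set.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Stand-in for QString: a sequence of Unicode code points, no byte encoding.
using UnicodeString = std::u32string;
// Stand-in for QByteArray: raw bytes whose meaning depends on an encoding.
using Bytes = std::string;

// Encode code points as UTF-8 (models toLocal8Bit() in a UTF-8 locale).
Bytes encodeUtf8(const UnicodeString &s) {
    Bytes out;
    for (char32_t c : s) {
        if (c < 0x80) {
            out += static_cast<char>(c);
        } else if (c < 0x800) {
            out += static_cast<char>(0xC0 | (c >> 6));
            out += static_cast<char>(0x80 | (c & 0x3F));
        } else if (c < 0x10000) {
            out += static_cast<char>(0xE0 | (c >> 12));
            out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (c & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (c >> 18));
            out += static_cast<char>(0x80 | ((c >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (c & 0x3F));
        }
    }
    return out;
}

// Decode well-formed UTF-8 (models fromLocal8Bit() in a UTF-8 locale).
// This sketch assumes valid input and does no error recovery.
UnicodeString decodeUtf8(const Bytes &b) {
    UnicodeString out;
    std::size_t i = 0;
    while (i < b.size()) {
        const unsigned char lead = static_cast<unsigned char>(b[i]);
        // Number of continuation bytes that follow the lead byte.
        const std::size_t extra = lead < 0x80 ? 0 : lead < 0xE0 ? 1 : lead < 0xF0 ? 2 : 3;
        assert(i + extra < b.size());
        char32_t c = extra == 0 ? lead : (lead & (0x3F >> extra));
        for (std::size_t k = 1; k <= extra; ++k)
            c = (c << 6) | (static_cast<unsigned char>(b[i + k]) & 0x3F);
        out += c;
        i += extra + 1;
    }
    return out;
}

// Decode as Latin-1: every byte becomes one code point (models the
// assumed behaviour of the implicit QString(QByteArray) cast in Qt4).
UnicodeString decodeLatin1(const Bytes &b) {
    UnicodeString out;
    for (unsigned char byte : b)
        out += static_cast<char32_t>(byte);
    return out;
}
```

With a home directory such as "/home/ü", encodeUtf8 yields the two bytes C3 BC for "ü". Decoding those with decodeUtf8 gives back the original string, while decodeLatin1 turns them into the two characters "Ã¼", which is the kind of corruption an encode followed by an implicit cast can produce. Dropping the extra encode step avoids the question of which decoder the cast uses.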


- Yichao


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://git.reviewboard.kde.org/r/114219/#review44857
-----------------------------------------------------------


On Nov. 29, 2013, 7:26 p.m., Yichao Yu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://git.reviewboard.kde.org/r/114219/
> -----------------------------------------------------------
> 
> (Updated Nov. 29, 2013, 7:26 p.m.)
> 
> 
> Review request for kde-workspace, David Faure, Martin Gräßlin, and Hugo Pereira Da Costa.
> 
> 
> Bugs: 327919
>     http://bugs.kde.org/show_bug.cgi?id=327919
> 
> 
> Repository: kde-workspace
> 
> 
> Description
> -------
> 
> list.join already returns a QString, so there is no need to encode it and cast it back to QString again.
> 
> P.S. For a patch that applies to both KDE4 and KF5 (master for kde-workspace, frameworks for kdelibs?), how should I submit the review request? Should I list both in the branch field or submit two review requests? (Often the patch cannot be applied directly to both because of context or file path changes.)
> 
> 
> Diffs
> -----
> 
>   kcontrol/krdb/krdb.cpp 92d84e9 
> 
> Diff: http://git.reviewboard.kde.org/r/114219/diff/
> 
> 
> Testing
> -------
> 
> Compiles.
> Waiting for bug reporter's test.
> 
> 
> Thanks,
> 
> Yichao Yu
> 
>
