[Kde-bindings] [Bug 209046] New: krosspython: UTF-8 python strings are encoded as ASCII when converted to QString

Daniel Calviño Sánchez danxuliu at gmail.com
Wed Sep 30 22:38:45 UTC 2009


https://bugs.kde.org/show_bug.cgi?id=209046

           Summary: krosspython: UTF-8 python strings are encoded as ASCII
                    when converted to QString
           Product: bindings
           Version: unspecified
          Platform: Compiled Sources
        OS/Version: Linux
            Status: NEW
          Severity: normal
          Priority: NOR
         Component: general
        AssignedTo: kde-bindings at kde.org
        ReportedBy: danxuliu at gmail.com


Version:            (using Devel)
Compiler:          gcc 4.3.2 
OS:                Linux
Installed from:    Compiled sources

First of all, I have little Python knowledge, so maybe I'm doing something
wrong and this is not a real bug. I apologize if that's the case.

In Python scripts encoded in UTF-8, when a string is passed to a C++ method,
the QString received by the method is built as if the UTF-8 bytes were ASCII.
That is, UTF-8 characters that need more than 1 byte aren't stored as a single
character in the QString, but as 1 character for each byte. For example, a
"ñ" in Python would be stored as "Ã±" in the QString.

This happens not only with hardcoded strings, but also with UTF-8 strings
passed from a QString to Python and back again. That means that translated
strings obtained from Kross::TranslationModule
(http://api.kde.org/4.x-api/kdelibs-apidocs/kross/html/classKross_1_1TranslationModule.html)
in the scripts aren't correctly converted to QString.

Note, however, that UTF-8 QStrings are correctly passed to Python. The problem
only appears when passing a UTF-8 string from Python to C++.
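
As a rough illustration of the byte-per-character effect described above, the
following sketch reproduces it in plain Python, without Kross. It is written
for Python 3, while the report itself concerns Python 2, and it assumes that
the faulty conversion behaves like a Latin-1 decode of the UTF-8 bytes; that
is a guess at the mechanism, not the actual krosspython code.

```python
# Python 3 sketch; an assumption-laden model, not the krosspython code.
original = "\u00f1"                      # the character "ñ"
utf8_bytes = original.encode("utf-8")    # b'\xc3\xb1', two bytes

# Hypothetical stand-in for the faulty conversion: one character per byte.
misread = utf8_bytes.decode("latin-1")

# Decoding the same bytes as UTF-8 gives back the single intended character.
correct = utf8_bytes.decode("utf-8")

# ascii() keeps the output printable on terminals without UTF-8 support.
print(len(original), len(utf8_bytes), len(misread))
print(ascii(misread), ascii(correct))
```

If the model holds, the bytes themselves survive the trip and only their
interpretation is wrong, which would explain why the QString ends up with two
characters where the Python string had one.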

In the next comments I'm going to post a unit test showing the problem and a
workaround for it.

Oh, and although I suppose that it is not related, at least on my system the
test in kdelibs/kross/tests/unittest.py:65 fails:
self.assert_( self.object1.func_qstring_qstring(unicode("abcdef")) == "abcdef" )

Printing self.object1.func_qstring_qstring(unicode("abcdef")), I saw that it
only printed the first two characters, so I debugged it and found in
kdebindings/python/krosspython/pythonvariant.h:225 that the string is set with
s.setUtf16( (quint16*)t, sizeof(t) / 4 ), where t is a Py_UNICODE*. sizeof(t)
seems to always be 8 (presumably the size of the pointer itself rather than of
the string), so only the first two characters in the string are taken into
account. I think that it is not related to the other bug, and I suppose that
this happens on other systems too, not only on mine, but you can judge better
than me :)
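
The arithmetic behind the truncation can be mimicked in Python, under the
assumption that sizeof(t) yields the size of a pointer and not the length of
the string. In this sketch ctypes.sizeof(ctypes.c_void_p) stands in for
sizeof(t), and truncated_length is a hypothetical helper that exists only for
illustration; none of this is the krosspython code itself.

```python
import ctypes


def truncated_length(pointer_size: int) -> int:
    """Length passed to setUtf16 if sizeof(t) is the pointer size."""
    return pointer_size // 4


# Size of a pointer on the platform running this script (assumed to
# correspond to sizeof(t) for a Py_UNICODE* in the C++ code).
pointer_size = ctypes.sizeof(ctypes.c_void_p)

text = "abcdef"
taken = text[:truncated_length(pointer_size)]

print(pointer_size, repr(taken))
```

With an 8-byte pointer the computed length is 2 regardless of the string, which
matches the two characters seen in the failing test. With a 4-byte pointer it
would be 1.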
