Standard gettext PO format

Nicolas Goutte nicolasg at snafu.de
Fri Dec 30 19:36:11 GMT 2005


On Friday 30 December 2005 18:27, Chusslove Illich wrote:
> >> [: Chusslove Illich :]
> >> [...] the patch is attached as a reference, but it certainly
> >> needs some explanations from my side, as MO format itself has
> >> compatibility hacks.
> >
> > [: Nicolas Goutte :]
> > if (trans_full[i] == '\000')
> >        trans_full[i] = '\004';
> >
> > Why do you expect a NUL character in the middle of a UTF-8 string?
>
> Those are the hacks I referred to. For compatibility when plural forms were
> introduced, translated forms (as well as msgid and msgid_plural) are
> stored in the catalog delimited by nulls. Catalog lookup will give the
> pointer to the first plural form and the total length of the entry (over
> all forms). So, if the strlen and the returned total length differ, we know
> those are actually plural forms.
>
> Then, we need to replace nulls with some sane delimiter, basically the same
> story as presently with newlines, but something much more robust. For this
> same purpose, in Gettext \004 was chosen as context separator (msgctxt
> and msgid in the catalog are delimited by it).
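
To make the check described in the quote concrete, here is a minimal sketch
in plain C++. It is not the code from the patch: the helper names
has_plural_forms and replace_nul_delimiters are hypothetical, and it assumes
that the total length excludes the final terminating NUL and that the entry
is NUL-terminated at that offset.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper (not from the patch): an entry holds several
// NUL-delimited forms when strlen() stops before the total length.
// Assumes entry[total_len] is the final terminating NUL.
bool has_plural_forms(const char *entry, std::size_t total_len)
{
    return std::strlen(entry) < total_len;
}

// Hypothetical helper: copy the whole entry, embedded NULs included,
// and replace each NUL with \004, as the quoted snippet does in place.
std::string replace_nul_delimiters(const char *entry, std::size_t total_len)
{
    std::string joined(entry, total_len);
    for (std::size_t i = 0; i < joined.size(); ++i) {
        if (joined[i] == '\000')
            joined[i] = '\004';
    }
    return joined;
}
```

For an entry such as "one\0many" with a total length of 8, strlen gives 3,
so the entry would be treated as plural forms.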

Good, then I think that the function should not return a QString in that case.

I assume that you do not want to return a QStringList, so I suppose that it
could return a QByteArray (with the NUL characters intact), and the function
that needs to process the result could then search for the NUL characters.

That would make for clean code, without hacks.
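
As a rough sketch of that idea, the caller could split the raw bytes on NUL
itself. To keep the example free of Qt, std::string stands in for QByteArray
here, and split_forms is a hypothetical name. (With Qt 4 the same thing
could presumably be done with QByteArray::split('\0'), but I have not tried
that against this code.)

```cpp
#include <string>
#include <vector>

// Hypothetical helper: split a raw catalog entry on its embedded NUL
// characters. The std::string must have been built with an explicit
// length so that the embedded NULs are part of it.
std::vector<std::string> split_forms(const std::string &raw)
{
    std::vector<std::string> forms;
    std::string::size_type start = 0;
    while (true) {
        const std::string::size_type pos = raw.find('\0', start);
        if (pos == std::string::npos) {
            // Last (or only) form: everything after the previous NUL.
            forms.push_back(raw.substr(start));
            break;
        }
        forms.push_back(raw.substr(start, pos - start));
        start = pos + 1;
    }
    return forms;
}
```

An entry without any NUL comes back as a single form, so singular and plural
entries can go through the same path and no replacement character is needed.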

>
> > (Especially passing \004 to QString is perhaps a bad idea as I do not
> > know if QString supports the conversion of control characters.)
>
> Hm, it works as it is. I think the point of using \004 in Gettext was that
> it is benign enough.

That will fail the day somebody does another nice trick using \004. (At least
it does not seem to be one of the standard escapes like \b, \r, \n and such.)

(Sorry, I have seen too much code where this kind of hack has backfired. So
I am a little "allergic" to such hacks.)

>
> > Well, I suppose that you will probably have fewer problems by tweaking the
> > current code.
>
> I don't seem to have many problems with Gettext either, especially if we
> let it handle language resolution. And it gets an awful lot of other code
> deleted :)

Well, I do not know. I can only offer ideas to avoid the problems that we
encounter (or that we are likely to encounter).

>
(...)

Have a nice day!




