void String::copyFromUTF16(...) - Comments

Wed Jan 22 20:34:08 UTC 2014

Hi Kageyu

Well, I guess I'm really confused because it was/is my understanding that:
On a UTF16 string:
s[0] == 0xfeff == Big Endian regardless of system byte order.
s[0] == 0xfffe == Little Endian regardless of system byte order.

Thanks

-Enjoy
fh : )_~

----- Original Message -----
From: Tsuda Kageyu <tsuda.kageyu at gmail.com>
To: Festus Hagen <festushagenlists at yahoo.com>; taglib-devel at kde.org
Cc: 
Sent: Wednesday, January 22, 2014 2:56 PM
Subject: Re: void String::copyFromUTF16(...) - Comments

Hi Festus.

I said:
>If you read a little-endian BOM as a 16-bit integer in a little-endian 
>system, it will be 0xfeff.
>
>That's what this line checks.
>> if(length >= 1 && s[0] == 0xfeff)

If s[0] == 0xfeff, it means you are reading a UTF-16LE string on a 
little-endian system or a UTF-16BE one on a big-endian system. The 
comment "Same as CPU endian. No need to swap bytes." describes that 
situation.
If the byte order of the UTF-16 string differs from the CPU's byte 
order, the BOM will appear byte-swapped and s[0] will be 0xfffe.
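
To make the point concrete, here is a minimal sketch of that check. It is 
not the actual TagLib code: BomCheck, checkBom(), loadNative() and 
cpuIsLittleEndian() are hypothetical names used only for illustration. It 
assumes the string has already been loaded as 16-bit integers in the CPU's 
native byte order.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical names for illustration only; this is not TagLib's code.
enum class BomCheck { SameAsCpu, OppositeOfCpu, NoBom };

// Classify the first 16-bit unit of a UTF-16 buffer as seen by this CPU.
BomCheck checkBom(const std::uint16_t *s, std::size_t length)
{
  if(length >= 1 && s[0] == 0xfeff)
    return BomCheck::SameAsCpu;      // Same as CPU endian. No need to swap bytes.
  if(length >= 1 && s[0] == 0xfffe)
    return BomCheck::OppositeOfCpu;  // BOM looks swapped. Every unit needs a swap.
  return BomCheck::NoBom;            // No BOM. The caller has to pick a default.
}

// Load two raw bytes, in file order, into a 16-bit integer the way the CPU does.
std::uint16_t loadNative(unsigned char first, unsigned char second)
{
  const unsigned char bytes[2] = { first, second };
  std::uint16_t value;
  std::memcpy(&value, bytes, sizeof(value));
  return value;
}

// True when the CPU stores the low-order byte first.
bool cpuIsLittleEndian()
{
  return loadNative(0x01, 0x00) == 0x0001;
}
```

On a little-endian system, loadNative(0xff, 0xfe) (the UTF-16LE BOM bytes 
in file order) gives 0xfeff, and loadNative(0xfe, 0xff) (the UTF-16BE BOM 
bytes) gives 0xfffe. On a big-endian system the two results are reversed. 
So the value of s[0] tells you whether the string matches the CPU, not 
which encoding it is in absolute terms.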

Kageyu.