RFC: Encoding of filenames [long]
Waldo Bastian
bastian at kde.org
Thu Jun 5 16:20:38 BST 2003
On Thursday 05 June 2003 14:21, Thiago Macieira wrote:
> Adding to all that, there's the URL problem. URLs are supposed to be 8-bit
> encoded and, as far as the current standards go (from what I can tell),
> UTF-8. I managed to resolve the domain part of the issue -- I hope --, but
> Konqueror still fails the two tests shown in bug #55177. The major problem
> with those is that the encoding is NOT backwards compatible with many sites
> out there that use non-encoded URIs. By being compliant, I'm sure we'll get
> a lot of bug reports that Konqueror doesn't load the right images or go to
> the right sites.
The recommendation has the following note:
"Note. Some older user agents trivially process URIs in HTML using the bytes
of the character encoding in which the document was received. Some older HTML
documents rely on this practice and break when transcoded. User agents that
want to handle these older documents should, on receiving a URI containing
characters outside the legal set, first use the conversion based on UTF-8.
Only if the resulting URI does not resolve should they try constructing a URI
based on the bytes of the character encoding in which the document was
received."
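The fallback order described in that note can be sketched roughly as follows. This is only an illustration in Python, not Konqueror's actual code: the names `candidate_uris` and `resolve_with_fallback`, the `resolves` callback, and the choice of which characters to leave unescaped are all assumptions made for the example.

```python
from urllib.parse import quote

# Assumption for this sketch: leave URI reserved/unreserved characters and
# existing "%" escapes alone; percent-escape everything else.
SAFE = "/:?#[]@!$&'()*+,;=%-._~"


def candidate_uris(raw_uri, document_encoding):
    """Return escaped forms of raw_uri in the order the note says to try them."""
    # First choice: escape the bytes of the UTF-8 form.
    candidates = [quote(raw_uri, safe=SAFE, encoding="utf-8")]
    # Fallback: escape the bytes of the encoding the document arrived in.
    try:
        legacy = quote(raw_uri, safe=SAFE, encoding=document_encoding,
                       errors="strict")
    except UnicodeEncodeError:
        legacy = None  # the URI cannot be expressed in that encoding
    if legacy is not None and legacy not in candidates:
        candidates.append(legacy)
    return candidates


def resolve_with_fallback(raw_uri, document_encoding, resolves):
    """Try each candidate in order; `resolves` is a caller-supplied check."""
    for uri in candidate_uris(raw_uri, document_encoding):
        if resolves(uri):
            return uri
    return None
```

For a link to `café.png` in an ISO-8859-1 page, the first candidate carries the UTF-8 bytes and the second the Latin-1 byte, so a site that relies on the older practice is only reached on the second attempt.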
I think the W3C is well aware that using only UTF-8 in such cases would break a
zillion sites. I would like to hear about some real-world sites that actually
depend on the UTF-8 behavior.
Cheers,
Waldo
- --
bastian at kde.org -=|[ SuSE, The Linux Desktop Experts ]|=- bastian at suse.com
More information about the kde-core-devel
mailing list