[WebKit-devel] [Bug 287690] New: KWebkitPart does not apply correct locale encoding settings on some pages with CJK characters.

moriramar at gmail.com moriramar at gmail.com
Sun Nov 27 16:11:54 UTC 2011


https://bugs.kde.org/show_bug.cgi?id=287690

           Summary: KWebkitPart does not apply correct locale encoding
                    settings on some pages with CJK characters.
           Product: kwebkitpart
           Version: unspecified
          Platform: Gentoo Packages
        OS/Version: Linux
            Status: UNCONFIRMED
          Severity: normal
          Priority: NOR
         Component: general
        AssignedTo: webkit-devel at kde.org
        ReportedBy: moriramar at gmail.com


Version:           unspecified (using KDE 4.7.2) 
OS:                Linux

When I open some pages with both simplified Chinese characters and traditional
Chinese characters, some characters are not displayed correctly. Pages
containing both Chinese characters and Japanese characters might cause this
problem as well.

Personal guess:
These pages might be encoded in GBK or GB18030 (which cover more characters),
while KWebkitPart might decode them as GB2312 (which is generally considered a
subset of GBK).
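
A minimal sketch of this guess in Python, using only the standard codecs. The
sample characters are illustrative assumptions and are not taken from the page
in question. It shows that a character available in GBK but not in GB2312
decodes to a replacement character (and possibly a stray ASCII byte) when the
bytes are read as GB2312, while GBK and GB18030 decode it correctly. Whether
KWebkitPart actually picks GB2312 is not confirmed here.

```python
# Illustrative sample characters (assumed, not taken from the reported page).
SIMPLIFIED = "测"   # present in GB2312, GBK and GB18030
TRADITIONAL = "測"  # present in GBK and GB18030, absent from GB2312


def decode_with(data: bytes, codec: str) -> str:
    """Decode leniently, substituting U+FFFD for bytes the codec rejects."""
    return data.decode(codec, errors="replace")


# A GB2312 character keeps the same byte sequence under GBK (subset relation).
same_bytes = SIMPLIFIED.encode("gb2312") == SIMPLIFIED.encode("gbk")

# Bytes of a GBK-only character, decoded under each candidate encoding.
gbk_bytes = TRADITIONAL.encode("gbk")
as_gb2312 = decode_with(gbk_bytes, "gb2312")
as_gbk = decode_with(gbk_bytes, "gbk")
as_gb18030 = decode_with(gbk_bytes, "gb18030")

if __name__ == "__main__":
    print("GB2312 bytes identical under GBK:", same_bytes)
    print("decoded as GB2312: ", repr(as_gb2312))
    print("decoded as GBK:    ", repr(as_gbk))
    print("decoded as GB18030:", repr(as_gb18030))
```

Because the second byte of a GBK sequence can fall in the ASCII range, a
GB2312 decoder may emit a replacement character followed by an ordinary letter.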

Reproducible: Always

Steps to Reproduce:
1. Install a font covering CJK characters. Any of Bitstream Cyberbit, WenQuanYi
Zen Hei, WenQuanYi Micro Hei or Droid will do.
2. Make sure the zh_CN.GBK, zh_CN.GB2312, zh_CN.GB18030 and zh_CN.UTF-8 locales
are available on the system.
3. Open Konqueror 4.7.2 and enable WebKit mode.
4. Go to http://www.acfun.tv/v/ac265957/ (the page may load slowly).

Actual Results:  
In the bold title line at the top of the page content, a black box with a white
question mark appears. In the next line, there are two black boxes separated by
a "W" character, followed by an "o" character.
Selecting any of the GB* encodings under "View >> Encoding >> Simplified
Chinese" does not solve the problem.
Opening pages of this kind sometimes crashes Konqueror.

Expected Results:  
Neither the black boxes nor the stray "W" and "o" characters should appear in
these two lines.
KHTML displays this page correctly when the encoding is set to "Simplified
Chinese >> GBK" or "Simplified Chinese >> GB18030", and can serve as a
reference.

Portage 2.1.10.38 (hardened/linux/x86/desktop, gcc-4.5.3, glibc-2.13-r4,
3.0.4-hardened-r5 i686)
=================================================================
System uname:
Linux-3.0.4-hardened-r5-i686-AMD_Athlon-tm-_II_Neo_K345_Dual-Core_Processor-with-gentoo-2.1
Timestamp of tree: Sat, 26 Nov 2011 16:30:01 +0000
app-shells/bash:          4.2_p10
dev-lang/python:          2.7.2-r3, 3.2.2
dev-util/cmake:           2.8.6-r1
dev-util/pkgconfig:       0.26
sys-apps/baselayout:      2.1
sys-apps/openrc:          0.9.4
sys-apps/sandbox:         2.5
sys-devel/autoconf:       2.68
sys-devel/automake:       1.10.3, 1.11.1-r1
sys-devel/binutils:       2.21.1-r1
sys-devel/gcc:            4.5.3-r1
sys-devel/gcc-config:     1.4.1-r1
sys-devel/libtool:        2.4-r4
sys-devel/make:           3.82-r3
sys-kernel/linux-headers: 2.6.39 (virtual/os-headers)
sys-libs/glibc:           2.13-r4
Repositories: gentoo gentoo-zh gentoo-haskell science kde sunrise local
ACCEPT_KEYWORDS="x86 ~x86"
ACCEPT_LICENSE="* -@EULA skype-eula"
CBUILD="i686-pc-linux-gnu"
CFLAGS="-march=i686 -O2 -pipe -fomit-frame-pointer"
CHOST="i686-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/share/config /usr/share/gnupg/qualified.txt
/usr/share/openvpn/easy-rsa"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/fonts/fonts.conf
/etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo
/etc/texmf/language.dat.d /etc/texmf/language.def.d /etc/texmf/updmap.d
/etc/texmf/web2c"
CXXFLAGS="-march=i686 -O2 -pipe -fomit-frame-pointer"
DISTDIR="/var/pkg/dist"
EMERGE_DEFAULT_OPTS="--keep-going y --with-bdeps y"
FEATURES="assume-digests binpkg-logs distlocks ebuild-locks fixlafiles news
parallel-fetch protect-owned sandbox sfperms strict unknown-features-warn
unmerge-logs unmerge-orphans userfetch"
FFLAGS=""
GENTOO_MIRRORS="http://mirrors.163.com/gentoo"
LANG="en_GB.UTF-8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
LINGUAS="zh_TW zh af ak am ar as as_IN ast az be be_BY bg bn bn_BD bn_IN bo br
brx bs ca ca_XV ca@valencia crh cs csb cy da de de_FR dgo dz ee el en en_CA
en_GB en_US en_ZA eo es es_AR es_CL es_CR es_ES es_LA es_MX et et_EE eu fa fi
fil fo fr fr_CA fy fy_NL ga ga_IE gd gl gu gu_IN he hi hi_IN hne hr hsb hu hy
hy_AM ia id is it ja ka kk km kn kn_IN ko ko_KR kok ks ku ky la lb lg lo lt lv
mai me mk ml ml_IN mn mni mr mr_IN ms mt my nb nb_NO nds ne nl nn nn_NO no nr
ns nso oc om or or_IN pa pa_IN pap pl ps pt pt_BR pt_PT rm ro ru rw sa_IN sat
sd se sh sh_YU son si sk sl sq sr sr@ijekavian sr@ijekavianlatin sr@latin
sr@Latn sr_CS ss st sv sv_SE sw sw_TZ ta ta_IN ta_LK te te_IN tg th ti ti_ER tk
tl tn tr ts ug uk ur_IN ur_PK uz uz@cyrillic ve vi wa xh zh_CN zh_HK zu"
MAKEOPTS="-j2"
PKGDIR="/var/pkg/bin"
PORTAGE_COMPRESS=""
PORTAGE_COMPRESS_FLAGS=""
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress
--force --whole-file --delete --stats --timeout=180 --exclude=/distfiles
--exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/var/pkg/portage"
PORTDIR_OVERLAY="/var/pkg/gentoo-zh /var/pkg/haskell /var/pkg/science
/var/pkg/kde /var/pkg/sunrise /var/pkg/usr"
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="X a52 aac acl acpi avahi bash-completion berkdb bluetooth branding bzip2
cairo cdda cdr cjk cli consolekit cracklib crypt cups cxx dbus djvu dri dts dvd
dvdr emboss encode exif fam ffmpeg firefox flac fontconfig gdbm gdu gif gpm
gstreamer hardened iconv ipv6 jpeg jpeg2k kde lame lcms ldap libnotify mad mms
mmx mmxext mng modules mp3 mp4 mpeg msn mudflap ncurses nls nptl nptlonly ogg
opengl openmp pam pango pax_kernel pcre pdf pic png policykit ppds pppd
pulseaudio qt3support qt4 readline samba sdl semantic-desktop session spell
sqlite ssl startup-notification svg sysfs syslog taglib tcpd threads tiff
truetype udev unicode upnp urandom usb v4l vaapi vim-syntax vorbis wifi x264
x86 xcb xcomposite xml xorg xulrunner xv xvid xvmc zlib" ALSA_CARDS="ali5451
als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1 emu10k1x ens1370
ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident
usb-audio via82xx via82xx-modem ymfpci" ALSA_PCM_PLUGINS="adpcm alaw asym copy
dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear
meter mmap_emul mulaw multi null plug rate route share shm softvol"
APACHE2_MODULES="actions alias auth_basic authn_alias authn_anon authn_dbm
authn_default authn_file authz_dbm authz_default authz_groupfile authz_host
authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir
disk_cache env expires ext_filter file_cache filter headers include info
log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling
status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="kexi words
flow plan stage tables krita karbon braindump" CAMERAS="ptp2"
COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog"
DRACUT_MODULES="crypt crypt-gpg syslog" ELIBC="glibc" INPUT_DEVICES="acecad
aiptek elographics evdev fpit hyperpen joystick keyboard mouse mutouch penmount
synaptics wacom" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk
hd44780 lb216 lcdm001 mtxorb ncurses text" USERLAND="GNU" VIDEO_CARDS="apm ark
ast chips cirrus epson geode glint i128 i740 intel mach64 mga neomagic nouveau
r128 radeon rendition s3 s3virge savage siliconmotion sis sisusb tdfx tga
trident tseng v4l vesa via"

-- 
Configure bugmail: https://bugs.kde.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

