[rkward-devel] character encoding on windows

meik michalke Meik.Michalke at uni-duesseldorf.de
Fri May 13 14:20:03 UTC 2011


hi,

thanks for the quick reply.

On Friday, 13 May 2011, 14:11:08, Thomas Friedrichsmeier wrote:
> well, encoding issues tend to confuse me, badly.

me too... why can't everyone just agree on UTF-8 and be happy with it...

> - To narrow down the problem, it's always a good idea to try in plain R, or
> in this case, in the plain Rgui.exe (also included in the installation
> bundle).

ok, did that, it gets me the same messed-up results. so it's an R issue.

> - Writing R Extensions has a section on encoding issues:
> http://cran.r-project.org/doc/manuals/R-exts.html#Encoding-issues . I'm not
> really sure what to make of that, but I guess the best hint in there is to
> play with Encoding() (and possibly with enc2native()), and to check the
> "Encoding" field in the package's DESCRIPTION file.

yes, i've looked into that, too. the package already has "Encoding: UTF-8" 
set, Encoding() gives me lots of "unknown"s, enc2native() doesn't change 
anything and enc2utf8() even makes things look worse.

perhaps trying iconv() is the one option left.
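
a minimal sketch of what that could look like. the sample string and the
latin1 target are only assumptions for illustration, i don't know yet which
encoding the strings really arrive in. comparing raw bytes instead of the
printed text at least shows what is actually stored:

```r
# hypothetical sample; the \u escapes make R mark the literal as UTF-8
x <- "gr\u00fc\u00dfe"

# inspect the declared encoding and the raw bytes actually stored
enc_x   <- Encoding(x)
bytes_x <- charToRaw(x)

# convert explicitly instead of relying on the locale
# (latin1 is an assumed target, not a known requirement)
x_latin1     <- iconv(x, from = "UTF-8", to = "latin1")
bytes_latin1 <- charToRaw(x_latin1)

# converting back should restore the original UTF-8 bytes
x_back <- iconv(x_latin1, from = "latin1", to = "UTF-8")

# sub = "byte" keeps unconvertible input visible instead of returning NA
x_ascii <- iconv(x, from = "UTF-8", to = "ASCII", sub = "byte")
```

if the bytes already differ from what is expected before any conversion, the
damage happened earlier than the iconv() step.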

> - I did not understand how exactly you pass text to / from the external
> commands.

they're directly read from files by the external command; on windows, this 
command is usually "perl C:/TreeTagger/cmd/tokenize.pl <file>".
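
since the command reads the files itself, the bytes on disk are what counts.
a rough sketch, assuming the files were written from R (the temp file and the
sample text are made up, and whether the tokenizer really expects UTF-8 is
also an assumption), that forces UTF-8 on disk and marks the text as UTF-8
when reading it back:

```r
# hypothetical sample text and temp file, for illustration only
txt <- "gr\u00fc\u00dfe"
tmp <- tempfile(fileext = ".txt")

# write the UTF-8 bytes as they are, without re-encoding to the native locale
con <- file(tmp, open = "wb")
writeLines(enc2utf8(txt), con, useBytes = TRUE)
close(con)

# read the file back and declare the lines to be UTF-8
back <- readLines(tmp, encoding = "UTF-8")

# the bytes should have survived the round trip unchanged
same_bytes <- identical(charToRaw(back), charToRaw(txt))

unlink(tmp)
```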


best regards :: m.eik

-- 
  dipl. psych. meik michalke
  abt. für diagnostik und differentielle psychologie
  institut für experimentelle psychologie
  heinrich-heine-universität düsseldorf