<br><br><div><span class="gmail_quote">2007/2/1, Thomas Friedrichsmeier <<a href="mailto:thomas.friedrichsmeier@ruhr-uni-bochum.de" target="_blank" onclick="return top.js.OpenExtLink(window,event,this)">thomas.friedrichsmeier@ruhr-uni-bochum.de
</a>>:</span><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
On Thursday 01 February 2007 18:28, you wrote:<br>> I have tested the mechanism with a greek sample file. If the headers are<br>> not greek (1.sav) the file is imported. If the headers are greek I take the<br>> following error message:
<br>><br>> Error in read.spss("/home/user/Desktop/12.sav", to.data.frame = TRUE, :<br>> error reading system-file header<br>> In addition: Warning message:<br>> /home/user/Desktop/12.sav: position 0: Variable name begins with invalid
<br>> character<br><br>This is deep inside the read.spss() C code. I'm afraid, there is nothing we<br>can do about this from the RKWard side of affairs (the conversion mechanism I<br>added simply converts all strings in the created R object, it can't access
<br>the lower levels of reading the file itself).</blockquote><div><br>Well, I am afraid this lies deep inside R itself rather than in read.spss, as Peter Dalgaard wrote, and it is part of the general incompatibility between encodings. I ran some tests with R on Windows: I imported the SPSS files, saved the workspace, and tried to open it from R on Linux. Here are the outcomes: first the attempts to import the SPSS files on Linux, then on Windows.
<br><br>The following files are small examples used below:<br><a href="http://users.forthnet.gr/the/isoumpasis/data/1.sav" target="_blank" onclick="return top.js.OpenExtLink(window,event,this)">http://users.forthnet.gr/the/isoumpasis/data/1.sav
</a><br><a href="http://users.forthnet.gr/the/isoumpasis/data/12.sav" target="_blank" onclick="return top.js.OpenExtLink(window,event,this)">
http://users.forthnet.gr/the/isoumpasis/data/12.sav</a><br>
<a href="http://users.forthnet.gr/the/isoumpasis/data/12.RData" target="_blank" onclick="return top.js.OpenExtLink(window,event,this)">http://users.forthnet.gr/the/isoumpasis/data/12.RData</a><br><br>The first file has English value labels and can be read:
<br>> read.spss("~/Desktop/1.sav")
<br>$VAR1<br> [1] "\xf3\xf0\xdf\xf4\xe9 " "\xf3\xf0\xdf\xf4\xe9 "
<br> [3] "\xf3\xf0\xdf\xf4\xe9 " "\xf3\xf0\xdf\xf4\xe9 " <br> [5] "\xf3\xf0\xdf\xf4\xe9 " "\xe3\xf1\xe1\xf6\xe5\xdf\xef "<br> [7] "\xe3\xf1\xe1\xf6\xe5\xdf\xef " "\xe3\xf1\xe1\xf6\xe5\xdf\xef "
<br> [9] "\xe3\xf1\xe1\xf6\xe5\xdf\xef " "\xf3\xf0\xdf\xf4\xe9 " <br>[11] "\xe3\xf1\xe1\xf6\xe5\xdf\xef "<br><br>$VAR2<br> [1] 5 6 7 7 5 7 3 5 6 7 8<br><br>attr(,"label.table")<br>attr(,"label.table")$VAR1<br>NULL<br><br>attr(,"label.table")$VAR2<br>NULL<br><br>So we can convert this.<br><br>In file
12.sav the value labels are Greek. The problem is that this file cannot be read.
<br><br>> read.spss("~/Desktop/12.sav")<br>Error in read.spss("~/Desktop/12.sav") : error reading system-file header<br>In addition: Warning message:<br>~/Desktop/12.sav: position 0: Variable name begins with invalid character
<br><br>I also tried use.value.labels=FALSE and got the same message.<br><br>> read.spss("~/Desktop/12.sav", use.value.labels=FALSE)<br>Error in read.spss("~/Desktop/12.sav", use.value.labels = FALSE) :
<br> error reading system-file header<br>In addition: Warning message:<br>~/Desktop/12.sav: position 0: Variable name begins with invalid character <br>
<br>
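For the file that can be read, converting the strings afterwards could look roughly like this. It is only a minimal sketch, assuming the bytes printed above are windows-1253 (which iconv calls "CP1253") and that the session wants UTF-8; convert_strings is a hypothetical helper name, not part of R or RKWard.<br>

```r
# Two of the byte strings printed above, rebuilt from their raw bytes
# (assumed to be windows-1253).
raw_strings <- c(
  rawToChar(as.raw(c(0xf3, 0xf0, 0xdf, 0xf4, 0xe9, 0x20))),
  rawToChar(as.raw(c(0xe3, 0xf1, 0xe1, 0xf6, 0xe5, 0xdf, 0xef, 0x20)))
)

# Hypothetical helper: re-encode every character element of a list and
# leave the other elements alone. Note that lapply() keeps the names but
# drops other attributes such as "label.table".
convert_strings <- function(x, from = "CP1253", to = "UTF-8") {
  lapply(x, function(col) {
    if (is.character(col)) iconv(col, from = from, to = to) else col
  })
}

# Small stand-in for the imported object (not the real file contents).
dat <- list(VAR1 = raw_strings, VAR2 = c(5, 6))
converted <- convert_strings(dat)
```

This only helps once read.spss() has returned an object; it cannot do anything about a file that fails while the header is being read.<br>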
The encoding of the SPSS files is <span>windows-1253 (Greek). </span><br><br>Here is the Windows part.<br>
<br>
I imported the files into R on Windows without any problem and saved the workspace as 12.RData.<br>
<br>And now the import on Linux.<br><br>
I loaded the file in R on Linux:<br>
> load('/home/igoutsou/Desktop/12.RData')<br>
<br>All seems fine. Now I have two data sets.<br>
> ls()<br>
[1] "sav11" "sav12"<br>
<br>
sav11 comes from 1.sav and I get the same results:<br>
> sav11<br>
$VAR1<br>
[1] "\xf3\xf0\xdf\xf4\xe9 " "\xf3\xf0\xdf\xf4\xe9 " <br>
[3] "\xf3\xf0\xdf\xf4\xe9 " "\xf3\xf0\xdf\xf4\xe9 " <br>
[5] "\xf3\xf0\xdf\xf4\xe9 " "\xe3\xf1\xe1\xf6\xe5\xdf\xef "<br>
[7] "\xe3\xf1\xe1\xf6\xe5\xdf\xef " "\xe3\xf1\xe1\xf6\xe5\xdf\xef "<br>
[9] "\xe3\xf1\xe1\xf6\xe5\xdf\xef " "\xf3\xf0\xdf\xf4\xe9 " <br>
[11] "\xe3\xf1\xe1\xf6\xe5\xdf\xef "<br>
<br>
$VAR2<br>
[1] 5 6 7 7 5 7 3 5 6 7 8<br>
<br>
attr(,"label.table")<br>
attr(,"label.table")$VAR1<br>
NULL<br>
<br>
attr(,"label.table")$VAR2<br>
NULL<br>
<br>
sav12 comes from 12.sav and I cannot display it:<br>
> sav12<br>
Error: invalid multibyte string </div><br></div><br>
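One way to look into such an object without printing it might be the following. This is a rough sketch under a loud assumption: that the variable names are what holds the windows-1253 bytes and triggers the error. The object below is a hypothetical stand-in, not the real sav12.<br>

```r
# Hypothetical stand-in for sav12: a list whose name holds windows-1253
# bytes (assumption: the Greek variable names are the invalid strings).
greek_name <- rawToChar(as.raw(c(0xf3, 0xf0, 0xdf, 0xf4, 0xe9)))
obj <- list(c(5, 6, 7))
names(obj) <- greek_name

# Check the names without printing the object itself.
ok <- validUTF8(names(obj))   # FALSE where a name is not valid UTF-8

# Re-encode only the offending names; "CP1253" is iconv's name for
# windows-1253.
fixed <- obj
names(fixed)[!ok] <- iconv(names(obj)[!ok], from = "CP1253", to = "UTF-8")
```

If the names turn out to be valid already, the problem sits elsewhere in the object and this sketch changes nothing.<br>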
So if it is as I think, I do not know whether the R developers have a way to fix this, or, if it can be fixed, whether they would want to. Does it make sense to ask about this or to report it as a bug? From the answers to Thomas's question on the r-help list, I believe they do not consider this a bug. It is just the way it works.
<br>