[rkward-devel] [rkward-cvs] SF.net SVN: rkward:[4063] trunk/rkward/rkward/plugins/00saveload/import

Stefan Rödiger stefan_roediger at gmx.de
Fri Dec 2 16:36:00 UTC 2011


On Friday 02 December 2011 10:31:50 Thomas Friedrichsmeier wrote:
> Hi Stefan,
> 

Hi Thomas,

> On Friday 02 December 2011, sjar at users.sourceforge.net wrote:
> > first working plugin of XLS/XLSX import for RKWard
> 
> nice! I did not test on an actual xls file, yet. But some first comments
> below:
> > + has a rather slow import speed (somewhat CPU hungry): XLS -> perl
> > script -> read.x -> data.frame # e.g., reading 3000 lines and 256 columns
> > took 100% CPU usage, circa 4 min and 350 MB on my machine (2x Intel Atom
> > N270 clocked at 1333 MHz) # therefore no multiple sheet import
> > is/will be implemented
> 
> Wow, that really is slow. A warning for large data sets would be nice,
> indeed.
> 
> > + find a way to make the Perl path user definable (i.e. if it is located
> > in bizarre places) and as such persistent (@ Thomas hope you know what I
> > mean)
> 
> I see what you mean, but I'm not sure how best to address this yet. I've
> added a feature request for the time being:
>
> https://sourceforge.net/tracker/?func=detail&aid=3448066&group_id=50231&atid=459010
>
> If you have further ideas on this, perhaps add them there.
> 

I think this might be useful on other occasions too.
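
A minimal sketch of how a user-definable, persistent Perl path could be
resolved on the R side. The option name "rkward.perl.path" and the helper
find_perl() are hypothetical, used here for illustration only; neither is an
existing RKWard setting or function.

```r
# Hypothetical helper: decide which Perl executable to use for the XLS import.
# "rkward.perl.path" is an assumed option name, not an existing RKWard setting.
find_perl <- function(option_name = "rkward.perl.path") {
  # 1. A path the user stored explicitly, e.g. via options() in ~/.Rprofile,
  #    which is what would make the setting persistent across sessions.
  user_path <- getOption(option_name, default = "")
  if (is.character(user_path) && length(user_path) == 1 &&
      nzchar(user_path) && file.exists(user_path)) {
    return(user_path)
  }
  # 2. Otherwise fall back to whatever "perl" is found on the search path.
  found <- unname(Sys.which("perl"))
  if (!is.na(found) && nzchar(found)) {
    return(found)
  }
  # 3. Nothing usable: return NA so the caller can warn the user.
  NA_character_
}
```

If the plugin relies on read.xls() from the gdata package (an assumption on
my side), the result could be handed over through that function's "perl"
argument.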

> > + test on MS Windows
> 
> Just a thought: On Windows, there is the package xlsReadWrite, which might
> be less problematic than perl on that platform. Unfortunately, it has a
> rather different function signature for read.xls(), so it may not
> be feasible to support both in one plugin.

To be honest, I was not aware of that package. So I did a bit more research,
and it seems that there are more solutions. I really want to avoid a fully
featured solution for such an import. Installation and usage must be as
simple as possible and not prone to errors (e.g., perl missing). Maybe it is
worth evaluating all the possibilities and listing some pros and cons. The
"xlsx" package seems to be a nice candidate. Its change log indicates support
not only for XLSX (2007+) but also for the other versions except 95. Beyond
this, writing XLSX (with formatting) seems to be supported too. However, I
see no reason to implement such functionality. The "xlsx" package seems to be
cross-platform, with dependencies on other packages (‘xlsxjars’, ‘rJava’) and
on Java (which might even be better than Perl, since it is present on most
machines). The Java dependency might in turn cause some issues (version, ...,
location). My attempt to install it has failed so far.

> 
> Some small suggestions:
> 
> I would suggest to move the "sheet" option directly below the file option,
> and to remove the separate <frame> around it. It's pretty much part of the
> definition of what should be imported, IMO.

OK, I agree.

> 
> For "skip", a default value of 0, instead of -1 could be used to indicate
> "no skipping". Perhaps "fill" should default to checked for Excel files?
> "stripwhite" should be relabeled to "Strip white space", not "values",
> IMO.

okay, I'll fix that.

> 
> Regards
> Thomas

Regards
Stefan



