[rkward-devel] Potential Data editor bug

Thomas Friedrichsmeier thomas.friedrichsmeier at ruhr-uni-bochum.de
Sun Sep 26 17:46:30 UTC 2010


Hi,

On Sunday 26 September 2010, Prasenjit Kapat wrote:
> Recent updates to the data editor do not display the column number,
> which, to me is helpful at times. So, why not internally always use
> the column number and use the column names only for displaying?
> Although transformations like sort etc.. may not be implementable
> using just column numbers, but we may be able to implement a layer in
> between and let R handle the duplicated columns however it wants to..
> 
> What I am trying to say is that on the C++ end deal with only column
> numbers. Is that possible keeping in mind that we want to provide an
> interface for transformations in future?

I suppose it would be possible. Plugins (including sort) would continue to use 
column names instead of numbers, but the core of the editor could probably be 
made to read and write data based on numbers rather than names, without too 
much trouble.

There is a bit of danger associated with that, however: If, for whatever 
reason, the representation in the editor and the data in R get out of sync (*), 
then relying on the column names is a lot safer. Trying to modify a 
non-existent column name will - depending on what is done - either cause an 
error, or create that column. In contrast, writing to the wrong column number 
can easily damage the user's data...
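
The difference in failure modes can be sketched in a few lines of C++. This is 
only an illustration, not RKWard code: Table, writeByName and writeByIndex are 
hypothetical names, and the table is just a vector of named columns. The 
by-name write below reports an error for an unknown name; the other possible 
behaviour (creating the column) is not modelled.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a data.frame: an ordered list of named columns.
struct Column {
    std::string name;
    std::vector<double> values;
};
using Table = std::vector<Column>;

// Write by name: fails visibly if the name (or row) does not exist.
bool writeByName(Table &table, const std::string &name,
                 std::size_t row, double value) {
    for (Column &col : table) {
        if (col.name == name) {
            if (row >= col.values.size()) return false;
            col.values[row] = value;
            return true;
        }
    }
    return false;  // unknown name: nothing is modified
}

// Write by position: succeeds whenever the index is in range, even if the
// column at that position is no longer the one the caller had in mind.
bool writeByIndex(Table &table, std::size_t index,
                  std::size_t row, double value) {
    if (index >= table.size() || row >= table[index].values.size()) {
        return false;
    }
    table[index].values[row] = value;
    return true;
}
```

If column "b" at index 1 is removed behind the editor's back, a stale 
writeByName(t, "b", ...) returns false and touches nothing, while a stale 
writeByIndex(t, 1, ...) succeeds and overwrites whichever column has moved 
into position 1.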

Thus, I am rather reluctant to do this. I would assume that duplicate column 
names are a rare exception, and so we can afford not to support them 100%. Do 
you think they are important enough as a use-case to make a better effort?

Displaying column numbers in the GUI should be no problem. Where would you 
like to see them?

Regards
Thomas

(*): Usually, this should not happen, but it could, e.g. in this 
scenario:
1. User is currently running a time-consuming command in R. This command will 
insert a column into a data.frame which is currently also being edited.
2. While the command is still running, user edits a column in that data.frame, 
to the right of the insert position. This will generate an R command of the 
form data[[column]][row] <- value.
3. The command from 1 inserts the column, causing the editor to update its 
representation at that point.
4. The command from 2 will still be run, affecting the wrong column.
It may be possible to deal with this, somehow, but it is not going to be 
trivial.
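
The four steps above can be replayed in a small C++ sketch. Everything here is 
hypothetical (PendingEdit, insertColumn, applyByIndex and applyByName are 
made-up names, not editor code), and indices are 0-based, unlike R. The edit 
is recorded before the insert and applied after it, once addressed by number 
and once by name.

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct Column {
    std::string name;
    std::vector<double> values;
};
using Table = std::vector<Column>;

// A cell edit as recorded at the moment the user types (step 2).
// Both addressing schemes are stored so that they can be compared.
struct PendingEdit {
    std::size_t columnIndex;
    std::string columnName;
    std::size_t row;
    double value;
};

// Effect of the long-running command (steps 1 and 3): insert a new,
// zero-filled column at the given position.
void insertColumn(Table &table, std::size_t position, const std::string &name) {
    std::size_t rows = table.empty() ? 0 : table.front().values.size();
    table.insert(table.begin() + static_cast<std::ptrdiff_t>(position),
                 Column{name, std::vector<double>(rows, 0.0)});
}

// Step 4, addressed by number. Returns the name of the column that was
// actually modified, or an empty string if nothing was written.
std::string applyByIndex(Table &table, const PendingEdit &edit) {
    if (edit.columnIndex >= table.size()) return "";
    Column &col = table[edit.columnIndex];
    if (edit.row >= col.values.size()) return "";
    col.values[edit.row] = edit.value;
    return col.name;
}

// Step 4, addressed by name. Same return convention as applyByIndex.
std::string applyByName(Table &table, const PendingEdit &edit) {
    for (Column &col : table) {
        if (col.name == edit.columnName && edit.row < col.values.size()) {
            col.values[edit.row] = edit.value;
            return col.name;
        }
    }
    return "";
}
```

With columns x and y, an edit recorded for y (index 1), and a column then 
inserted at position 0, applyByIndex modifies x, whereas applyByName still 
modifies y.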