[Kst] Re: Data Wizard issues (long post !)

Fri Jun 25 23:47:50 CEST 2004

For the hurried (it seems I can only produce awfully long posts !): here's a
digest of my suggestions:
1) for multiple input files: valid fields in the datawizard's 2nd page ==
variables present in all selected files is a BAD idea
2) ASCII options: why not put them in an "ASCII" subpage of the Settings
dialog? They can then be read by the datasource code from the corresponding
key(s) in the kstrc file.
3) parsing variable names in an ASCII file is easy (a simple QRegExp + split()
covers 99% of the "formats") and makes your life so much better!
4) a dialog in the datawizard allowing direct and precise placement of curves in
windows/plots could be feasible (based on gaiw's "Definition" tab, simplified).
------
For the not so hurried: you may read on for details... :-)

cbn wrote:
> MULTIPLE FILES:
> Currently, the kst command line can take a list of files.  
> (eg, kst -x 1 -y 2 *.dat). 
> The data wizard does not, but this is a fairly straightforward modification I
> think....
Good! There is hope, then.

> On the second page, the list of valid fields which can be checked are the
> ones which are in common to all files which have been selected.
CAUTION: that's the approach I had taken originally for gaiw because I thought
people would use the multi-file feature to compare files with the same variables
for different experiments. 
But that is (by far) not always the case: very often different files contain
different data that the user still wants to plot together (common reasons for
having different files include different sampling rates, or data coming from
different equipment, possibly with different formats).
For gaiw, I ended up having two "modes": the "Common variables" mode, which
works the way you describe and is useful for quick comparisons, and the "All
variables" mode, which just lists a concatenation of all the variables along
with the file they originate from. As I hadn't planned it in the first place,
the implementation (especially the way curves are added to plots according to
the active mode) is not very elegant. I still think having these two modes,
with either a toggle button or a combobox to choose between them, is nice. But
if only one were to be implemented in the first place, I think it'd have to be
the "All variables" mode, as it gives more possibilities, while the other is
blocking for files containing different data. (One difference: gaiw comes
upstream from Grace and can't add curves to an existing plot in Grace, which
makes it really blocking in that case, whereas kst's "internal" wizard can be
invoked more than once.)
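
To make the two modes concrete, here is a minimal sketch in Python (not kst or
gaiw code). The function names, the file names and the dict-of-lists layout are
all made up for illustration:

```python
def common_variables(fields_by_file):
    """'Common variables' mode: fields present in every selected file.

    The order of the first file is kept so the list looks familiar.
    """
    files = list(fields_by_file)
    if not files:
        return []
    shared = set(fields_by_file[files[0]])
    for name in files[1:]:
        shared &= set(fields_by_file[name])
    return [field for field in fields_by_file[files[0]] if field in shared]


def all_variables(fields_by_file):
    """'All variables' mode: every field, tagged with its file of origin."""
    return [(name, field)
            for name, fields in fields_by_file.items()
            for field in fields]


# Hypothetical files with different sampling rates and different contents.
fields_by_file = {
    "fast.dat": ["time", "pressure", "flow"],
    "slow.dat": ["time", "temperature"],
}
```

With these two files the "Common variables" mode only offers "time", which is
exactly the blocking case described above, while the "All variables" mode
offers all five (file, variable) pairs.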

> I would guess not for the next release, but probably this summer.
Well, the sooner the better :-) Or should I say the summer the better ?

> GENERALIZED ASCII INPUT OPTIONS
> -Currently kst data sources have no optional parameters, and no place for 
> their UI.  We could consider changing this if we come up with compelling 
> reasons.
> -In the current ascii data source, 
> 	comment lines (eg, starting with #, //, !, c, ;) are ignored
> 	one can skip an un-commented header by changing the starting frame 
ALL ASCII data files I have contain at least a comment line with variable names.
I have so many different contents that I can't imagine having only the data in
them, not metadata like variable names. Sometimes, I have hundreds of vars in
just one file. Column numbers are just not usable ! I find this pretty
compelling... besides, I've just imagined something which sounds quite easy to
me: add a configuration item in kst's Settings dialog, which is only read from
the rc file via a call to KSettings (or whatever it's called in KDE, I only know
the Qt variant) in the datasource code. I see no problem having a config dialog
in that place for some datasources. For ASCII, we'd put there options like
number of header lines to skip, line number to read var names from (option),
line number to read units from (option).
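
As a rough sketch of what those three options could look like once read from
the config, here is a small Python illustration. The class name, the option
names and the helper are hypothetical, and how they would be stored in kstrc is
left open:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AsciiOptions:
    """Hypothetical ASCII datasource options (names are made up)."""
    header_lines: int = 0              # number of header lines to skip
    names_line: Optional[int] = None   # 1-based line holding variable names
    units_line: Optional[int] = None   # 1-based line holding units


def read_header(lines, opts):
    """Return (names line, units line, data lines) according to opts.

    The names and units lines are returned raw (only stripped of surrounding
    whitespace); splitting them into fields is a separate step.
    """
    def pick(number):
        if number is None or number < 1 or number > len(lines):
            return None
        return lines[number - 1].strip()

    return pick(opts.names_line), pick(opts.units_line), lines[opts.header_lines:]
```

Both line numbers are optional, so a file with no usable header still works
with the defaults and falls back to column numbers.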

> -columns are only referred to by their column number - there is no provision to
> parse a header for better names.  This would be an exceedingly format 
> specific operation!
WRONG: from experience (gaiw once again) you cover 99% of people's needs using a
split() call on the corresponding line with a QRegExp containing the following
name separators: one or more spaces, or one tab, or one semicolon, or one pipe,
or one comma.
For the remaining 1%, add a "custom separator" lineedit in the Settings page,
and you're done! 
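
Here is the idea sketched with Python's re module instead of QRegExp, just to
show how little is needed. The function name and the custom_separator parameter
are made up, and dropping empty entries is my assumption about the desired
behaviour:

```python
import re

# One or more spaces, or one tab, or one semicolon, or one pipe, or one comma.
DEFAULT_SEPARATORS = r" +|\t|;|\||,"


def split_names(line, custom_separator=None):
    """Split a header line into variable names.

    custom_separator stands for the "custom separator" lineedit: it is
    escaped and tried before the default separators.
    """
    pattern = DEFAULT_SEPARATORS
    if custom_separator:
        pattern = re.escape(custom_separator) + "|" + pattern
    # Drop empty entries, e.g. those produced by a comma followed by a space.
    return [name for name in re.split(pattern, line) if name]
```

For example, split_names("time  pressure\tflow;temp|rpm,load") gives the six
names back, and split_names("a::b", "::") handles a custom separator.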

> PLOT PLACEMENT
> We currently have several options in an attempt to anticipate what people
> will really want.  The data wizard should be '1 rule for all vectors'. [...]  
> I really want the datawizard to be optimized for the common case, and leave 
> the other options to the more specific dialogs.
I think you're right that:
1) the plot dialog is good enough that it can still be used for more operations
than in Grace's case
2) the datawizard should be optimized for the common case.
But still, I believe something gaiw-like is very easy to use and not so hard to
implement (basically, you only need to be able to query the list of windows, and
for each of them the list of plots with their curves). I'll try to make a mockup
with Qt Designer...
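
To show how little information such a dialog needs, here is a small Python
sketch. The nested-dict layout and all names are hypothetical and stand in for
whatever kst offers to list windows, plots and curves:

```python
def placement_targets(windows):
    """Flatten {window: {plot: [curves]}} into (window, plot, curves) rows.

    Each row is one place where the wizard could put a new curve.
    """
    return [(window, plot, list(curves))
            for window, plots in windows.items()
            for plot, curves in plots.items()]


# Hypothetical session: two windows, three plots, one of them still empty.
windows = {
    "W1": {"P1": ["pressure"], "P2": []},
    "W2": {"P1": ["temperature"]},
}
```

The dialog would then only have to display these rows and let the user pick
one per selected variable.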

> Thoughts?
Some as you've just seen :-) I hope they'll be helpful.

Best regards, 

Nicolas