[Kst] kst2 datasource API comments

Peter Kümmel syntheticpp at gmx.net
Thu Mar 18 19:16:29 CET 2010


Barth Netterfield wrote:
> On March 18, 2010 10:17:00 am Peter Kümmel wrote:
>> Reading datasource.h I had following ideas for DataSource:
> 
>> - most static functions are for handling the plugins. This
>>   code could be split out into another, new class:
>>  DataSourcePluginManager
> 
> What would this help?  Make datasource less intimidating?  Something else I'm 
> not thinking of?  I have no objection to the change, but I would like to 
> understand it better.

Yes, it's only a kind of cleanup: don't put too much into one class. When there
is a separate class, the class name by itself is some sort of documentation.
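
A minimal sketch of such a split, written with the standard library instead
of Qt so that it stands alone. Only the name DataSourcePluginManager comes
from the proposal; Plugin, registerPlugin() and findPluginFor() are
hypothetical names for illustration and are not the existing kst2 API.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical plugin description: a name plus a check whether the
// plugin can read a given file.
struct Plugin {
    std::string name;
    std::function<bool(const std::string&)> understands;
};

// All plugin bookkeeping lives here instead of in static members of DataSource.
class DataSourcePluginManager {
public:
    void registerPlugin(Plugin plugin) {
        std::string key = plugin.name;  // copy the key before moving the plugin
        m_plugins[key] = std::move(plugin);
    }

    // Returns the name of the first plugin (in alphabetical order) that
    // understands the file, or an empty string if none does.
    std::string findPluginFor(const std::string& filename) const {
        for (const auto& entry : m_plugins) {
            if (entry.second.understands(filename)) {
                return entry.first;
            }
        }
        return std::string();
    }

private:
    std::map<std::string, Plugin> m_plugins;
};

// DataSource itself is then left with reading data only.
class DataSource {
public:
    virtual ~DataSource() = default;
};
```

With this layout, somebody looking for the plugin handling knows where to
look just from the class name.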

> 
>> - as Nicolas suggested I would add metadata to each data and would
>>  introduce a simple struct, something like
>>
>>     struct DataSource::Info{
>>        QMap<QString, double> scalar;
>>        QMap<QString, QString> strings;
>>     };
>>
>>   and then
>>     Info sourceInfo();
>>     Info fieldInfo(const QString&);
>>     Info matrixInfo(const QString&);
> 
> Yes...
> 
> When I wrote the metadata stuff, I thought that readFieldScalars would be 
> called very often, with the idea that field scalars might end up being dynamic 
> (maybe the data source would provide metadata on a field like 'data quality' or 
> 'compression rate' or some such thing that could change if the file were being 
> written to.).  For this reason, I tried to make updates as cheap as possible.  
> However, datavectors don't take advantage of this, and dynamic meta-data is 
> not supported.  Nor do I think it likely that we would actually ever want it.  
> For example, both of the things, above, I mentioned are more data-source level 
> numbers, and would be plain scalars.

If such dynamic metadata is ever needed, it may be worth adding a new type
of primitive for it.

> 
> This is a long way to say there is no reason we couldn't do what you are 
> suggesting: unify fieldScalars, readFieldScalars, fieldStrings, readFieldStrings 
> into fieldInfo.  It would be cleaner!
> 
> Only potential caveat which the current implementation also doesn't address: 
> currently only fields (vectors) and Matrixes can have meta-data, and meta-data 
> can only be scalars and strings.  Question: should this be more general?  
> 
> 1) Should it be possible to have vectors or matrices as metadata (eg, a color 
> table as a pair of vectors as metadata to a matrix?)
> 
> 2) Should strings and scalars be able to have metaData. (eg, a unit string as 
> metadata to a scalar).
> 
> This leaves us with 16 combinations, unless we add more primitives.
> -we can add them one at a time, as data sources appear that support them
> or
> -we can add them explicitly now
> or
> -we can do something general now

I would put it this way: "The DataSource API is for loading any primitive,
each of which can have metadata". If we put the metadata into the Info struct,
then we could start with strings and scalars, and later add new kinds of
metadata by simply adding a new member variable to Info.
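
A rough sketch of how that could look. Assumptions: std::map and std::string
stand in for QMap and QString so the example compiles without Qt, the
'vectors' member is only an example of a later extension, and InMemorySource
with its setFieldInfo() is a hypothetical toy class, not a proposed kst2 class.

```cpp
#include <map>
#include <string>
#include <vector>

// Metadata attached to a source or to one of its primitives.
// std::map and std::string stand in for QMap and QString.
struct Info {
    std::map<std::string, double> scalars;
    std::map<std::string, std::string> strings;
    // Hypothetical later extension: vector-valued metadata such as a color
    // table. Callers that only read scalars and strings are unaffected.
    std::map<std::string, std::vector<double>> vectors;
};

// Toy source that hands out one Info per field name (illustration only).
class InMemorySource {
public:
    void setFieldInfo(const std::string& field, const Info& info) {
        m_fieldInfo[field] = info;
    }

    // Unknown fields yield an empty Info rather than an error.
    Info fieldInfo(const std::string& field) const {
        auto it = m_fieldInfo.find(field);
        if (it == m_fieldInfo.end()) {
            return Info();
        }
        return it->second;
    }

private:
    std::map<std::string, Info> m_fieldInfo;
};
```

The point is that fieldInfo() keeps its signature when a new kind of
metadata is added; only Info grows.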

> 
> 
>> - Are there files which store fields and matrices? If not,
>>   I would introduce two classes which inherit from DataSource,
>>   one for vector data and one for matrices. The info and data
>>   lists become:
> 
> Yes.  QImage currently does.  FITS does will.

OK, then only one class for reading any primitive.


> 
> Data sources can support any subset of primitives.
> 
>>     QStringList dataList(); // vectors or matrices
>>     Info dataInfo(const QString&);
>>
>> - somehow we should split out the update stuff, and implement it
>>   in such a way that concrete DataSource implementations do not
>>   need to know anything about the update mechanism.
> 
> Yes.  Data sources shouldn't have to call internalUpdate - It was an oversight 
> in the design :-(

We should start another thread about the update. I have the impression it is very
broken at the moment, e.g. there is no dependency tracking.

> 
>> - make all functions which must be implemented pure virtual
> 
> Not sure how many that is.  What *must* be implemented?  You could have a data 
> source that only supports scalars.

The idea was to make it explicit what a data source implementation is responsible for.
When some primitive is not supported, a dummy must be implemented. Here we could also
add some metadata describing the supported data types.
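
As an illustration only, here is a small standalone sketch of that idea. The
PrimitiveType flags, supportedTypes(), supports() and ScalarOnlySource are
hypothetical names, and std::vector<std::string> stands in for QStringList.

```cpp
#include <string>
#include <vector>

// Hypothetical flags describing which primitives a source can deliver.
enum PrimitiveType { Scalar = 1, String = 2, Vector = 4, Matrix = 8 };

class DataSource {
public:
    virtual ~DataSource() = default;

    // Pure virtual: every implementation has to state what it supports...
    virtual int supportedTypes() const = 0;
    // ...and has to provide each list, even if only as an empty dummy.
    virtual std::vector<std::string> scalarList() const = 0;
    virtual std::vector<std::string> vectorList() const = 0;

    bool supports(PrimitiveType type) const {
        return (supportedTypes() & type) != 0;
    }
};

// A source that only provides scalars.
class ScalarOnlySource : public DataSource {
public:
    int supportedTypes() const override { return Scalar; }
    std::vector<std::string> scalarList() const override { return {"temperature"}; }
    std::vector<std::string> vectorList() const override { return {}; }  // dummy
};
```

The compiler then rejects an implementation that forgets one of the
functions, and callers can ask supports(Vector) before requesting a list.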

> 
>> - most of the static functions of DataSourcePluginInterface should
>>   be part of the concrete data source implementations: understands(),
>>   supportsTime(), ... are essentially functions of the datasource
>>   not of the plugin infrastructure.
> 
> I'm sort of murky on this - George wrote this stuff - but isn't the point of 
> these that a concrete data source doesn't have to be created in order to 
> determine if it 'understands' a particular file, or to get its field list?

I think 'understands' and reading the data must be located in the same class,
because the two functions are strongly related. And when the ctor is cheap, code
like 'MyDataSource().understands(filename)' would not hurt.
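
Roughly what I have in mind, as a sketch under assumptions: AsciiSource, its
".txt" check, open() and filename() are made up for the example, and
std::string stands in for QString.

```cpp
#include <string>

// Hypothetical source for files ending in ".txt".
class AsciiSource {
public:
    // Cheap constructor: no file is opened and nothing is read here.
    AsciiSource() = default;

    // Lives next to the reading code, so both share one notion of the format.
    bool understands(const std::string& filename) const {
        const std::string suffix = ".txt";
        return filename.size() >= suffix.size() &&
               filename.compare(filename.size() - suffix.size(),
                                suffix.size(), suffix) == 0;
    }

    // The expensive work is deferred until a file is actually opened.
    bool open(const std::string& filename) {
        if (!understands(filename)) {
            return false;
        }
        m_filename = filename;
        return true;
    }

    const std::string& filename() const { return m_filename; }

private:
    std::string m_filename;
};
```

Creating a temporary just to call understands() costs next to nothing as long
as the constructor stays this cheap.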


> 
>> - why do we use malloc? We have C++.
> 
> Probably because I learned C first (and when I did, malloc was shiny and new).
> :-)  I have no strong feelings about either keeping them or removing them.

OK, when I stumble on mallocs I'll replace them.
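
The kind of replacement I mean, as a generic sketch and not taken from the
kst sources: makeRamp() and sum() are hypothetical helpers, and std::vector
is just one option (a QVector would be used the same way).

```cpp
#include <cstddef>
#include <vector>

// Before (C style):
//   double* buf = (double*)malloc(n * sizeof(double)); ... free(buf);
// After: the vector owns the memory and releases it automatically.
std::vector<double> makeRamp(std::size_t n) {
    std::vector<double> buf(n);  // zero-initialized, unlike malloc
    for (std::size_t i = 0; i < n; ++i) {
        buf[i] = static_cast<double>(i);
    }
    return buf;  // no manual free() needed by the caller
}

// Code that still expects a raw pointer can be fed via data().
double sum(const double* values, std::size_t n) {
    double total = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        total += values[i];
    }
    return total;
}
```

Existing functions that take a double* can stay as they are for now and be
called with v.data() and v.size().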

> 
> cbn
> 

BTW, could we use 'vector' instead of 'field' in the API?
And it's not clear to me what 'frame' is good for: a vector
of error bars?

Peter


