[Kst] Re: AsciiSource: new defaults, Kst's atof

Barth Netterfield netterfield at astro.utoronto.ca
Mon Jan 24 21:01:28 CET 2011


Great work.  Huge speedup!

On Mon, Jan 24, 2011 at 2:08 PM, Peter Kuemmel <syntheticpp at gmx.net> wrote:

> Attached are the benchmark results with values for Linux.
> The test file was a 310 MB gyrodata file, and I always loaded
> column three only.
>

Can you explain what the various cases in the table are?  Are these all
fixed width columns, or are they variable width?  Or did you show both
cases?


> I found that the atof function which we already use on Windows
> by default is also faster on Linux. Therefore I think we should
> also use it on Linux, especially since our numbers aren't that
> complicated to parse.
>

Good.  Can it parse scientific notation?  What does it do about '.' vs ','
(I haven't looked...)
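
To make those two questions concrete, here is a minimal sketch of what a
hand-rolled parser has to handle. This is NOT Kst's atof; parseNumber and
its decimalSeparator parameter are hypothetical names chosen for
illustration, and the sketch assumes well-formed input with no overflow
or NaN/Inf handling.

```cpp
#include <cmath>

// Hypothetical sketch, not Kst's actual atof: parses an optional sign,
// digits, an optional fractional part, and an optional e/E exponent.
// The decimal separator is a parameter, so '.' vs ',' is the caller's choice.
double parseNumber(const char* p, char decimalSeparator = '.')
{
    // Skip leading blanks.
    while (*p == ' ' || *p == '\t') ++p;

    bool negative = false;
    if (*p == '-' || *p == '+') {
        negative = (*p == '-');
        ++p;
    }

    // Accumulate all mantissa digits as one integer-valued double and
    // track the power of ten that restores the decimal point.
    double value = 0.0;
    int exponent = 0;
    while (*p >= '0' && *p <= '9') {
        value = value * 10.0 + (*p - '0');
        ++p;
    }
    if (*p == decimalSeparator) {
        ++p;
        while (*p >= '0' && *p <= '9') {
            value = value * 10.0 + (*p - '0');
            --exponent;
            ++p;
        }
    }

    // Scientific notation: e or E, optional sign, digits.
    if (*p == 'e' || *p == 'E') {
        ++p;
        bool expNegative = false;
        if (*p == '-' || *p == '+') {
            expNegative = (*p == '-');
            ++p;
        }
        int e = 0;
        while (*p >= '0' && *p <= '9') {
            e = e * 10 + (*p - '0');
            ++p;
        }
        exponent += expNegative ? -e : e;
    }

    // Divide for negative exponents so that values such as 15/10 stay exact.
    if (exponent < 0)
        value /= std::pow(10.0, -exponent);
    else
        value *= std::pow(10.0, exponent);

    return negative ? -value : value;
}
```

With this shape, parseNumber("1.5e3") and parseNumber("-2,25", ',') both
work, but the separator has to be known up front; a ',' separator also
rules out ',' as a column delimiter.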

> Additionally, we should change the default comment delimiter as
> Nicolas already suggested. Then a normal user who often uses the
> default settings would see a speedup on Linux by a factor of 5 by
> simply updating to Kst 2.0.3; on Windows it is about a factor of 2-3.
>

Is the proposal to have '#' as the default comment delimiter?  The speedup
is from only having one?


> But the speedup is only for the pure data loading.
> The internalDataSourceUpdate is still very slow; counting the
> rows and looking for comments is now slower than reading the data!
> This makes no sense, so we should also optimize internalDataSourceUpdate
> before we release 2.0.3.
>

Yes.  It should be far faster.


> Do we support comments which are anywhere in the data, or is it
> enough to only support complete lines as comments, i.e. lines which
> start with the comment delimiter?
>

Well... we should probably support whitespace before a comment at the
beginning of a line, but a correctly formed ASCII file will have the same
number of columns for every line, so if there is a comment later in a line
other than the first line it will either be after the last column, or will
be a syntax error in the file.

So: check for comment characters anywhere in the first line when choosing
the number of columns.
After that, only check at the beginning of the line (up to the first
non-whitespace character).
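
A rough sketch of that rule, assuming a single-character comment delimiter
and whitespace-separated columns. The names isCommentLine and countColumns
are made up for illustration and are not existing AsciiSource functions.

```cpp
#include <cctype>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: a line counts as a comment only if its first
// non-whitespace character is the delimiter. The scan stops at the first
// non-whitespace character, so the rest of the line is never inspected.
bool isCommentLine(const std::string& line, char delim)
{
    for (char c : line) {
        if (c == delim)
            return true;
        if (!std::isspace(static_cast<unsigned char>(c)))
            return false;
    }
    return false; // empty or whitespace-only line
}

// Hypothetical helper, meant for the first data line only: count the
// whitespace-separated fields, ignoring everything from the first
// delimiter onward.
std::size_t countColumns(const std::string& firstLine, char delim)
{
    // find() returns npos when there is no delimiter, and substr(0, npos)
    // then yields the whole line.
    std::istringstream fields(firstLine.substr(0, firstLine.find(delim)));
    std::string field;
    std::size_t n = 0;
    while (fields >> field)
        ++n;
    return n;
}
```

The point of the split is cost: the full-line search happens once, and
every later line needs only a scan over its leading whitespace.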


> Peter


-- 
C. Barth Netterfield
University of Toronto
416-845-0946