[Kst] Real-time data display without files ?

Mon Apr 24 08:07:39 CEST 2006

On Sunday 23 April 2006 10:20, Dirk Eddelbuettel wrote:
> | Yes, you can build your own plugin to do that; however, none of the
> | *existing* plugins shipped with kst do that.  Eventually (perhaps quite
>
> Yes, but how would I get it loaded / started? I saw no command-line or menu
> option for it ...  Can you give me a pointer as to where to start digging?

In principle, you could write and install a datasource to do this.

> | quickly) your datasource would consume a large amount of memory to hold
> | the past values that you are reading (until a restart of kst or
> | what-have-you), but that's really your only obstacle.
>
> Just to be sure: isn't that the same with file-based data? If kst read
> 40mb, it 'has' 40mb.

Kst requires that data sources provide random access (so that you can
move back in time, etc.).  You could, as policy, assert that your data source
only keeps around the previous N minutes of data, and that going back before
that would return NaNs or something like that.  The stdin data source (which
you use below) doesn't do this: instead it keeps all the data that reaches kst
from when you start until you quit.  So you can eventually hit a memory limit,
sooner or later depending on your data rate.
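
As a rough illustration of the "keep only recent data, return NaNs for
anything older" policy, here is a minimal Python sketch.  It is not kst code
(a real kst data source is a plugin with its own interface); the class name
BoundedHistory and its methods are made up for this example, and it bounds the
history by sample count rather than by minutes to keep things short.

```python
import math
from collections import deque


class BoundedHistory:
    """Keep only the newest `capacity` samples, but answer reads for any index.

    Hypothetical helper for illustration; it is not part of kst.
    """

    def __init__(self, capacity):
        self.samples = deque(maxlen=capacity)  # oldest samples fall off the left
        self.total = 0                         # number of samples ever appended

    def append(self, value):
        self.samples.append(value)
        self.total += 1

    def read(self, start, count):
        """Return `count` values from absolute index `start`.

        Indices that were already dropped (or not yet written) yield NaN.
        """
        first_kept = self.total - len(self.samples)
        out = []
        for i in range(start, start + count):
            if first_kept <= i < self.total:
                out.append(self.samples[i - first_kept])
            else:
                out.append(math.nan)
        return out


if __name__ == "__main__":
    history = BoundedHistory(capacity=3)
    for v in [10.0, 11.0, 12.0, 13.0, 14.0]:
        history.append(v)
    # The two oldest samples were evicted, so they come back as NaN.
    print(history.read(0, 5))
```

Memory use then stays fixed at `capacity` samples no matter how long the
stream runs, at the cost of losing the ability to scroll back further.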

> It shouldn't be a problem, our data tends to be in 'skinny vector' form: lots
> of rows, few columns. If it grows to a few dozen mb it is still way less
> than my emacs or firefox session :)
>
> | > My data can come in at maybe 200 to 300 updates per second at peak. I
> | > also
>
> I forgot to stress that the median is more like 10 to 20 to 30 updates a
> second. Which should be doable. During peaks we can easily drop a few
> updates.  So the file-based approach has a lot going for it in terms of
> testing and experimentation.

Usually we have kst updating itself every 250ms, reading whatever data has 
been written to the data source since the last update.  This yields 4Hz plot 
updates, but no missed data...

> And while out running earlier this morning, it occurred to me that kst,
> with the dual focus on the command-line, is probably well behaved in the
> Unix tradition and takes stdin as a filename -- and indeed, the
> following simple R/shell script generates suitable data with C times and
> random gaussian noise:
>
> edd at basebud:~> cat /tmp/kstdemo.sh
> #!/bin/sh
>
> cat <<EOF | R --slave
> options('digits'=14)
> options('digits.secs'=4)
> while(1) {
>    cat(as.numeric(Sys.time()), 100+3*rnorm(1),'\n')
>    Sys.sleep(0.005)
> }
> EOF
>
> [ options() sets the needed display granularity for the C time, and
> activates the display of milliseconds -- that last feature needs R 2.3.0,
> due tomorrow, or any of the recent pre-releases -- and we add a 5
> millisecond sleep to not drown kst ]
>
> which we can pipe into kst via a simple
>
> edd at basebud:~> /tmp/kstdemo.sh | kst -x 1 -y 2 -
>
> Very sleek!!  All that is left is to tell kst to take X values as C time
> and to show them in a suitable format, as you kindly reminded me yesterday.
>
> So all I have to do at work is to replace kstdemo.sh with a command-line
> tool that listens to our data --- easy as I just wrote one of those for
> other purposes.  Very nice -- pluggable Unix toolchains save the day.

Just keep in mind that all data hitting kst will be cached, eventually filling 
up virtual memory.  I prefer to explicitly cache to disk with the logger, and 
re-read from kst:

edd at basebud:~> /tmp/kstdemo.sh > tmp.dat &
edd at basebud:~> kst -x 1 -y 2 tmp.dat

The file system will make sure that you never actually read from disk unless
you go back before what is buffered, so the performance is going to be
basically indistinguishable.
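
For what it's worth, the logger side of this can be very small.  The sketch
below is a Python stand-in for the kstdemo.sh generator quoted above, not
anything shipped with kst: it writes "time value" lines and flushes after
every line, so a reader following the file sees new samples promptly.  The
function names, the sample count, and the file name in the usage note are
assumptions made for illustration.

```python
import random
import sys
import time


def format_sample(timestamp, value):
    """One whitespace-separated line: C time in column 1, value in column 2."""
    return f"{timestamp:.4f} {value:.6f}\n"


def log_samples(out, count, delay=0.005):
    """Write `count` noisy samples to `out`, flushing after each line."""
    for _ in range(count):
        # Gaussian noise around 100 with standard deviation 3, as in the R demo.
        out.write(format_sample(time.time(), random.gauss(100.0, 3.0)))
        out.flush()  # push the line out of the process buffer right away
        time.sleep(delay)


if __name__ == "__main__":
    # A short run; a real logger would loop for as long as data arrives.
    log_samples(sys.stdout, count=5)
```

Saved as, say, logger.py (a hypothetical name), it would be run with its
output redirected to tmp.dat, and kst pointed at that file as shown above.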

> | > may want to view and relate several such streams.  As a start, I should
> | > be able to plug into that data pretty easily via code that listens to
> | > our data streams and emits csv data (for which I may just have to
> | > reformat the time stamp to get suitable 'ctime' seconds for kst).
> | >
> | > However, for efficiency, could I just 'listen' to data that is not
> | > coming via files -- without re-architecting kst?  It just doesn't seem
> | > right to divert all that data to file only to read it back later.  (I
> | > should note that we already have existing logging and capturing
> | > mechanisms.)
> |
> | As you state, you already log and capture your data.  The 'standard' kst
> | (as in it's how kst was originally intended) way of doing things is to
> | have kst read the data from the logs.  It is true that you are diverting
> | "all of that data to file only to read it back later," but 'later' may
> | be milliseconds, not seconds (or longer), provided the logger flushes
>
> It may matter how many milliseconds I am behind. Shaving the 'write to
> file, read from file' cycle seems like an obvious improvement. Stdin is a
> good alternative as we're down to 'format/print, read, parse'. Passing
> binary data around could improve this. I'll have to see if I need it.

Kst can read a couple of very lightweight binary formats designed for this.
If you decide to go binary, let us know and we can describe them.

Remember, on each update, kst will only be reading the unread data; at a 20 Hz
acquisition rate and only a few vectors, I would guess this will be no
trouble.  We read dozens of channels at a 100 Hz/channel acquisition rate.

> | the file it's writing at regular (as fast as you want) intervals.  As
> | long as kst can read the type of file that the 'logger' generates, kst
> | can plot the data 'real time' with new data appearing on the plot 'as
> | soon' as it is captured and written by the logger, where 'as soon' is
> | defined as with an on order 100 milliseconds delay.
>
> I take it the file reader is 'smart' and has a file pointer to not re-read
> everything from the beginning?  Or is that handled inside kst via the
> KstObject::UPDATE vs KstObject::NO_CHANGE and the plugin has to figure out
> how to be efficient?

Kst only asks the data source for data that it hasn't read yet.
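
To make the idea concrete, here is one way a reader can avoid re-reading a
growing file: remember the byte offset of the last complete line and start
from there on the next update.  This is only a Python sketch of the general
technique, not how kst's data sources are actually implemented; the class
name TailReader and its method are invented for the example.

```python
class TailReader:
    """Return only the lines appended to a file since the previous call.

    Hypothetical illustration of incremental reading; not kst code.
    """

    def __init__(self, path):
        self.path = path
        self.offset = 0  # byte position just past the last complete line read

    def read_new(self):
        with open(self.path, "rb") as f:
            f.seek(self.offset)
            chunk = f.read()
        # Stop at the last newline so a half-written line waits for next time.
        end = chunk.rfind(b"\n") + 1
        self.offset += end
        return chunk[:end].decode().splitlines()


if __name__ == "__main__":
    import os
    import tempfile

    with tempfile.NamedTemporaryFile("w", suffix=".dat", delete=False) as log:
        log.write("1.0 100.2\n2.0 99.7\n3.0 10")  # last line still incomplete
        log.flush()
        reader = TailReader(log.name)
        print(reader.read_new())  # the two complete lines
        log.write("1.3\n")        # the logger finishes the third line
        log.flush()
        print(reader.read_new())  # only the line completed since the last call
    os.remove(log.name)
```

Calling read_new() on a timer (every 250 ms, for instance) gives the
behaviour described earlier: each update costs only as much as the data that
arrived since the previous one.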

> | Of course, if you don't want to log all data, then this forces you to
> | log some data you don't need, but you can easily delete it at a later
> | time.  However, versus the method where the kst datasource reads the
> | captured data itself into memory, it gives you the added benefit that
> | you may look at data that occurred before that kst instance was started,
> | including just before.
> |
> | Whichever route you choose, let us know how you're doing, and we'll help
> | out wherever we can.  Good luck.
>
> Thanks a bunch -- you've already been very helpful.
>
> Regards, Dirk

What is your data?

cbn