[Nepomuk] survey on Nepomuk usage statistics

Laura Dragan aprilush at gmail.com
Mon Aug 23 17:25:07 CEST 2010


Hi,

I'm putting together a usage survey for Nepomuk, trying to find out
more about how it is used at large. This email has 2 requests in it :)
1. if you use Nepomuk, please volunteer to participate in the survey
2. please comment on the survey idea itself, as it is described below.
I'm particularly interested in whether you see any other kinds of data
that could be collected at the same time and that would bring out more
details and aspects of the usage.

The way I imagine it, the survey will have two parts: the first is
the data collection and the second is a questionnaire for the
participants. I hope to find about 20 people willing to participate,
the only condition being that they already use Nepomuk in their
everyday work and fun.

They would have to install and run a Nepomuk service (it is almost
done and the source will be available soon). The service has two
functions: it does an initial scan of the Nepomuk system every time it
starts and logs the status, and it logs the changes to the RDF store
as they happen. Either of the two functions can be turned off at any
time, but to gather as much useful data as possible both need to be
on. The output of the service is saved in text format in a folder. At
the end of the period, the files in this folder will be analysed.
After the service is installed, it will do its collecting without
bugging the user. However, there might be a slowdown of Nepomuk
during the initial scan.

I'd like to collect data for a period of 2-3 weeks or a month, the
longer the better.

The results gathered by the study would be public; however, I'll do my
best to anonymise the data that is published, which would consist
mostly of graphs and statistics. Participants will get the chance to
review the material before it is published, and can withdraw from the
study at any time. They will be asked to send the output files for
analysis by email, as the service does not send anything on its own.

At the end of the study, the participants will be asked to answer a
questionnaire about their experience using Nepomuk.

If you wish to participate, please send me an email.
If you know people who might be interested, please forward this email.

Thanks,
Laura

----------

Details about data gathered by the service (the list is not final yet):

1. initial scan - every time the system starts
  1.1 Nepomuk services - which are running, which are installed, which
are started by default
  1.2 Nepomuk storage - what backend is used
  1.3 data - how many triples are in the repository, which ontologies
are loaded, how many distinct types there are (and which), how many
distinct instances of each, how many distinct properties there are
(and which), and how many times each is used
  1.4 Akonadi - are any Nepomuk feeders running
  1.5 Strigi - is Strigi running, which folders are indexed, which are excluded
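
To make the counts in 1.3 concrete, here is a rough Python sketch of
the kind of summary that could be computed. It is not the service's
code: it assumes the statements are already available as plain
(subject, property, object) tuples, and the summarize() function and
the prefixed names in the example are made up for illustration.

```python
from collections import Counter

RDF_TYPE = "rdf:type"  # assumed spelling of the type property in this sketch


def summarize(triples):
    """Compute repository-level counts from (subject, property, object) tuples."""
    statements = set(triples)  # count each distinct statement once
    instances_by_type = {}
    property_usage = Counter()
    for subject, prop, obj in statements:
        property_usage[prop] += 1
        if prop == RDF_TYPE:
            # remember which distinct resources carry each type
            instances_by_type.setdefault(obj, set()).add(subject)
    return {
        "triple_count": len(statements),
        "instances_per_type": {t: len(s) for t, s in instances_by_type.items()},
        "property_usage": dict(property_usage),
    }


# Hypothetical example data, not taken from a real repository.
example = [
    ("file1", "rdf:type", "nfo:FileDataObject"),
    ("file2", "rdf:type", "nfo:FileDataObject"),
    ("tag1", "rdf:type", "nao:Tag"),
    ("file1", "nao:hasTag", "tag1"),
]
summary = summarize(example)
```

Only aggregate numbers like these would need to leave the machine for
the statistics, not the resources themselves.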

2. change monitoring
  when the data in the repository is changed, the service logs the
triples by the following pattern:
  - $date: [+-] [ a $type ] <$property> "some_value"
  - $date: [+-] [ a $type ] <$property> [ a $objecttype ]
  - the $ variables are replaced by actual values
  - if the statement is added, there is a + sign in front; if it is
removed, there is a - sign in front
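
As an illustration of how such log lines could be analysed later,
here is a small Python sketch of a parser for the pattern above. It
rests on assumptions: the leading "- " in the pattern is taken to be
a list bullet and not part of the log line, the date format and the
prefixed names in the sample are invented, and parse_line() and
property_changes() are hypothetical helpers, not part of the service.

```python
import re
from collections import Counter

# $date: [+-] [ a $type ] <$property> "some_value" | [ a $objecttype ]
LINE_RE = re.compile(
    r'^(?P<date>.+?): (?P<sign>[+-]) '
    r'\[ a (?P<subject_type>\S+) \] '
    r'<(?P<property>[^>]+)> '
    r'(?:"(?P<literal>.*)"|\[ a (?P<object_type>\S+) \])$'
)


def parse_line(line):
    """Return a dict describing one logged change, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["added"] = record.pop("sign") == "+"  # True for +, False for -
    return record


def property_changes(lines):
    """Count additions and removals per property, keyed by (property, added)."""
    counts = Counter()
    for line in lines:
        record = parse_line(line)
        if record is not None:
            counts[(record["property"], record["added"])] += 1
    return counts


# Invented sample lines; the real date format may differ.
sample = [
    '2010-08-23T17:25:07: + [ a nfo:FileDataObject ] <nao:hasTag> [ a nao:Tag ]',
    '2010-08-23T17:26:10: + [ a nao:Tag ] <nao:prefLabel> "holiday"',
    '2010-08-23T17:30:00: - [ a nfo:FileDataObject ] <nao:hasTag> [ a nao:Tag ]',
]
changes = property_changes(sample)
```

Lines that do not fit the pattern are skipped, so a change in the log
format would show up as missing counts and not as an error.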

