Telemetry Policy

Jaroslaw Staniek staniek at kde.org
Fri Aug 18 10:23:49 BST 2017


On 17 August 2017 at 16:19, Volker Krause <vkrause at kde.org> wrote:

> On Wednesday, 16 August 2017 20:35:59 CEST Jaroslaw Staniek wrote:
> > On 16 August 2017 at 18:56, Volker Krause <vkrause at kde.org> wrote:
> > > On Wednesday, 16 August 2017 15:23:07 CEST Jaroslaw Staniek wrote:
> > > > On 16 August 2017 at 14:13, Volker Krause <vkrause at kde.org> wrote:
> [...]
> > > > In addition maybe distributors can sometimes make the decision based
> > > > on opinions from given subprojects.
> > > > For example the option would be pre-set to ON in KEXI's installer for
> > > > Mac and Windows itself and for Linux AppImages, not in the source
> > > > code.
> > > > Just saying, KEXI has not yet switched to the new framework :)
> > >
> > > The policy we are discussing here is (and is supposed to be)
> > > independent of the implementation. And that's not just theoretical,
> > > Kexi is one prominent case for an alternative implementation, and the
> > > Krita GSoC also seems to contain some alternative server code for
> > > example. So input in particular from those teams matters a lot for me,
> > > as this policy in its current form would affect them too.
> > >
> > > And a policy we only adhere in code and work around in the end by
> > > putting on a distributor hat (which we can in many places, as your
> > > examples show) isn't really helping, I'd much rather have it reflect
> > > what we actually do :)
> > >
> > > From having read the code of both, I think the only possible points of
> > > conflict with the policy draft might be:
> > > - opt-in
> >
> > Source code has 100% of opt-in (grep for
> > 'areas(KexiUserFeedbackAgent::NoAreas)'). Anyone is free to change this
> > default and create distribution under his name and I understand this will
> > not be a "distribution by KDE".
>
> Great, so I just misread both Krita's and Kexi's requirements here and we
> don't have a problem :)
>

In the case of KEXI, the idea from the very beginning was also to keep some
distros happy (so they do not need to patch the source).


> > > - hosted on KDE infrastructure
> >
> > My assumption: As KEXI is an open-core+whatever-license-for-plugins
> > architecture, ultimately the telemetry information from KEXI users would
> > be better hosted by KDE. Any extra information retrieved by plugins (if
> > that even exists) can be hosted elsewhere but and this is a
> > responsibility of plugin developers.
>
> Yep, I'd say 3rd party addons are encouraged to follow the same policy,
> just like distributors, but we have no way of actually enforcing it.
>
> > > - Kexi seems to (optionally?) contain a unique identifier
> >
> > This is mostly related to cases when any kind of cloud storage is used.
> > These cases involve unique accounts already so users can be identified
> > very well even without having telemetry functionality.
> >
> > KEXI installations limited to open-core, used away from a cloud, do not
> > need identifiers.
> > However I understand that identifiers, independent of network or host ID
> > (basically a random-generated QUuids) are useful for even basic telemetry
> > needs. Without them it's easy to abuse the system using any kind of bots
> > to trick us that e.g. 99% of sessions happen on KDE 1.0 or that given
> > Linux distro has 90% of the global market :)
>
> Vandalism is a potential problem indeed (did you actually have issues with
> that on Kexi btw? if so, what counter-measures did you apply?). However I
> don't see how a UUID is helping here, the bot could just as well generate
> UUIDs for each submission?
>

UIDs indeed can't help against bots that are too clever, but e.g. semi-evil
use cases such as executing apps in batch mode can be caught. I've mostly
encountered logs coming from test machines, including my own, so I probably
should not have used the term 'bots', but (as unrealistic as it sounds) real
bots can be created.


> > Similarly app projects may need the IDs to answer question about most and
> > least used features. Most used as in "most users found it, understood it
> > and use it", not "most usage reports has been delivered for it (maybe
> > coming from a single user -- maybe even my very own co-developer). There
> > are many other examples probably already discussed.
>
> Sure this gets easier with unique ids, but it's not impossible without
> them. After all the goal here isn't to make our lives easier, but to
> agree on something that is acceptable for our users. And yes, that might
> imply more work and/or less accurate data.
>

My assumption when I started with telemetry was that we need an adequate
level of precision. Assuming no logs are fabricated, interesting questions
are for example: how many users actually run supported software and how
many run outdated versions? Not how many executions there are per given
period of time, because it may be that old software is executed by a few
users very frequently for some reason, e.g. because 3-year-old software
crashes on an old OS every minute and a restart is needed :)

How can we know that without unique (anonymous) identification?
Using extra fields such as OS+Desktop type/version would indeed be a form
of cheap UID.
But I would say that disclosing OS+Desktop type/version for that purpose
discloses more than the anonymous random UID does.
In Bugzilla and on the mailing lists we ask for all this information
anyway, and (at least I) do not like supporting anonymous users since I am
not anonymous.
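
The difference between counting executions and counting users can be
sketched as follows. This is only an illustration in Python, not KEXI
code; the record layout, the UID strings and the version numbers are all
made up.

```python
from collections import Counter

# Hypothetical telemetry records: (anonymous_uid, app_version).
# Each tuple stands for one reported session; all values are invented.
reports = [
    ("uid-a", "2.9"),  # one user on an old release that keeps restarting
    ("uid-a", "2.9"),
    ("uid-a", "2.9"),
    ("uid-a", "2.9"),
    ("uid-b", "3.0"),
    ("uid-c", "3.0"),
    ("uid-d", "3.0"),
]


def sessions_per_version(records):
    """Count raw session reports per version (works without any UID)."""
    return Counter(version for _uid, version in records)


def users_per_version(records):
    """Count distinct anonymous UIDs per version."""
    distinct = {(uid, version) for uid, version in records}
    return Counter(version for _uid, version in distinct)


sessions = sessions_per_version(reports)
users = users_per_version(reports)
print("sessions per version:", dict(sessions))
print("users per version:", dict(users))
```

In this made-up data the old version has the most sessions but only a
single user, which is exactly the case that per-execution counts hide.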

BTW, it's worth a reminder: the UID is not even a hash of any host and user
info, it's a random number. I do admit that a "hash of host and user info"
would be even better, as it allows recreating the UID after e.g. the OS has
been reinstalled or a new account created. But I do not use hashing for KEXI
anyway.
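
To make the two options concrete, here is a small Python sketch (again
not KEXI code). The function names, the SHA-256 choice and the way host
and user are combined are assumptions for illustration only.

```python
import hashlib
import uuid


def random_uid():
    """Random identifier, unrelated to host or user; differs on every call."""
    return str(uuid.uuid4())


def hashed_uid(hostname, username):
    """Identifier derived from host and user info (assumed scheme).

    The same inputs always give the same result, so the value could be
    recreated after a reinstall, provided host and user names are unchanged.
    """
    data = (hostname + "\n" + username).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


# A random UID is lost when the stored value is lost.
before_reinstall = random_uid()
after_reinstall = random_uid()
print("random UIDs equal:", before_reinstall == after_reinstall)

# A hashed UID comes back as long as the inputs are the same.
print("hashed UIDs equal:",
      hashed_uid("myhost", "me") == hashed_uid("myhost", "me"))
```

The random variant carries no information about the machine at all, which
is the property described above; the hashed variant trades that for
stability across reinstalls.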


> > Thus I would see the Anonymity is covered by KEXI's approach except that
> > it offers opt-in tracking of unique user for unique installations. KEXI
> > currently does not track unique installations at all until the user
> > agrees for any telemetry (the
> > KexiUserFeedbackAgent::AnonymousIdentificationArea value). This is
> > required by nature of stats computed (and abuses mentioned above are the
> > reason).
> >
> > Is this a big deal? We're close to philosophy area here.
>
> Correct, this is about the philosophy behind our products :) And one very
> core part of that happens to be privacy.
>
> That basically leaves the question: do we want to additionally allow the
> opt-in use of unique identifiers?
>

I would say yes. For example, I see no reason to reject from KDE any
(inter)network software that has a concept of accounts. Our Phabricator,
forums and Bugzilla are examples of that. So is our very own Akademy
registration software, especially if it's "our" code base. All of them
operate with unique IDs. Even more: some software discloses user-visible
strings (e.g. user names on the forums).
I think the key is to require that apps, no matter what type, ask users for
agreement precisely and clearly. And do not scare them.


> > Before designing the stats engine I guessed: not more than installing an
> > email app or buying a SIM card and starting to use them; they allow me to
> > send email or make a call using protocols that disclose quite a bit about
> > me.
>
> Sure, but that is also where we can differentiate. Just because other
> applications weren't designed with privacy in mind doesn't mean we should
> follow their example IMHO.
>

Well, "weren't designed with privacy in mind" sounds a bit strong. Each
application has a desired *level* of privacy, hopefully defined, maybe at
design time. I can imagine a storage for KEXI that uses a public GitHub
account and repos in exchange for being a free-as-in-beer cloud solution.

An example: our KDE software runs on hardware that does not assure privacy
(it emits signals that someone can easily decipher). "The apps weren't
designed with privacy in mind because they should block non-open CPUs" --
someone with high enough expectations could easily say. "But we don't care,
we still want to develop them and see them used" -- we say, that's not our
level.

For most folks, the confidentiality that email and a SIM card offer is OK.

Thanks for the notes and for working on the stuff, Volker.


> Regards,
> Volker
>
> > I would respect
> > users that disagree with that but they are unlikely to become KEXI users.
> > For in my book anonymous users less likely receive support from me in the
> > MLs or forums.
> >
> > > Regards,
> > > Volker
> > >
> > > > > Seeing yesterday's blog from the Krita team (
> > > > > https://akapust1n.github.io/2017-08-15-sixth-blog-gsoc-2017/),
> > > > > I'd particularly be interested in their view on this.
> > > > >
> > > > > Regards,
> > > > > Volker
> > > > >
> > > > > > On Sun, Aug 13, 2017 at 3:18 AM, Christian Loosli
> > > > > >
> > > > > > <christian.loosli at fuchsnet.ch> wrote:
> > > > > > > Hi,
> > > > > > >
> > > > > > > thank you very much for this work, sounds great!
> > > > > > >
> > > > > > > Only point I have: maybe make sure that the opt-in / default
> > > > > > > settings are not only mandatory for application developers,
> > > > > > > but also for packagers / distributions.
> > > > > > >
> > > > > > > Some distributions have rather questionable views on privacy
> > > > > > > and by default sent information to third parties, so I would
> > > > > > > feel much more safe if they weren't allowed (in theory) to
> > > > > > > flick the switch in their package by default to "on" either.
> > > > > > >
> > > > > > > Kind regards,
> > > > > > >
> > > > > > > Christian
>
>
>


-- 
regards, Jaroslaw Staniek

KDE:
: A world-wide network of software engineers, artists, writers, translators
: and facilitators committed to Free Software development - http://kde.org
Calligra Suite:
: A graphic art and office suite - http://calligra.org
Kexi:
: A visual database apps builder - http://calligra.org/kexi
Qt Certified Specialist:
: http://www.linkedin.com/in/jstaniek