Telemetry Policy

Martin Flöser mgraesslin at kde.org
Mon Aug 14 16:06:24 BST 2017


On 2017-08-14 14:17, Volker Krause wrote:
> On Sunday, 13 August 2017 12:56:27 CEST Martin Flöser wrote:
>> On 2017-08-13 11:47, Volker Krause wrote:
>> > Hi,
>> >
>> > during the KUserFeedback BoF at Akademy there was quite some interest
>> > in
>> > collecting telemetry data in KDE applications. But before actually
>> > implementing that we agreed to define the rules under which we would
>> > want to
>> > do that. I've tried to put the input we collected during Akademy into
>> > proper
>> > wording below. What do you think? Did I miss anything?
>> 
>> To me it looks good!
>> 
>> I have some additional requests:
>>   * the collected data must be made available to the public (mostly
>> thinking of research institutes here)
> 
> This has come up before, not in the context of 3rd parties like 
> research
> organisations, but for transparency towards our users.
> 
> There is a practical limitation of making raw data available live, as 
> that
> would create a publicly readable and writable system, with similar 
> abuse
> potential as e.g. pastebin. But I don't think that is the requirement 
> you have
> in mind here, it's more about sharing the raw data after 
> review/eventually,
> right?

Yes. I certainly don't want to give public edit functionality.

> 
> In the currently envisioned setup anyone with a KDE contributor account 
> would
> have access, so the remaining questions would be about the 
> practicalities and
> processes to review and release the data to the general public I think.
> 
>>   * data must be made available under a CC license (CC0?)
> 
> Interesting point, I hadn't thought about that yet :) Can we even 
> license the
> data, as we didn't create it? Do we need to ask our users to license 
> their
> telemetry contributions?

Yes, we can license the data. We are building up a database, so
copyright as it applies to databases covers it. As you are German, I'll
point to the German Wikipedia article:
https://de.wikipedia.org/wiki/Datenbankwerk

> 
>>   * maybe allow the user to delete the dataset again (difficult as 
>> that
>> conflicts with making the data public and would require authentication
>> which is the opposite to anonymity).
> 
> As discussed on kde-core-devel a while ago, I think this would be 
> doable
> technically, without compromising anonymity. The server would generate 
> a
> unique unpredictable token for each submitted sample and return that to 
> the
> client. The client collects those and can use them as part of a 
> deletion
> request.
> 
> However, this does only work as long as we have full control over the 
> data, we
> can't recall data that has already been extracted from our systems. So 
> I think
> this conflicts with the first two requirements you mentioned. How do we 
> want to
> resolve that?

Yes, I am aware that these are contradicting requirements. Of course
it's not possible to delete data after it's published, but maybe we have
situations where a user submits data, then thinks about it half an hour
later and decides "no, I don't want to share". So if it's technically
possible, it would be nice to have.
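
For illustration, the token scheme Volker describes above could be
sketched roughly as follows. This is only a minimal in-memory sketch, not
how KUserFeedback actually works: the names SampleStore, Client, submit,
delete and withdraw_all are hypothetical, and a real server would persist
the samples and speak a network protocol. The point is that the token is
random and unrelated to the user or the sample content, so holding it
proves ownership of a sample without identifying anyone.

```python
import secrets


class SampleStore:
    """Hypothetical in-memory stand-in for the telemetry server."""

    def __init__(self):
        self._samples = {}  # maps deletion token -> submitted sample

    def submit(self, sample: dict) -> str:
        # 32 random bytes, URL-safe encoded. The token is not derived
        # from the sample or the sender, so it cannot identify a user.
        token = secrets.token_urlsafe(32)
        self._samples[token] = sample
        return token

    def delete(self, token: str) -> bool:
        # True if a sample was removed, False if the token is unknown.
        return self._samples.pop(token, None) is not None

    def count(self) -> int:
        return len(self._samples)


class Client:
    """Hypothetical client that remembers the tokens it was given."""

    def __init__(self, server: SampleStore):
        self._server = server
        self.tokens = []

    def send(self, sample: dict) -> None:
        self.tokens.append(self._server.submit(sample))

    def withdraw_all(self) -> int:
        # Ask the server to delete every sample this client submitted.
        deleted = sum(1 for t in self.tokens if self._server.delete(t))
        self.tokens.clear()
        return deleted
```

As noted above, this only helps while the data is still under our
control; once a sample has been published, a deletion request can no
longer recall it.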

Cheers
Martin

> 
> Regards,
> Volker
> 
>> > # Telemetry Policy Draft
>> >
>> > Application telemetry data can be a valuable tool for tailoring our
>> > products
>> > to the needs of our users. The following rules define how KDE collects
>> > and
>> > uses such application telemetry data. As privacy is of utmost
>> > importance to
>> > us, the general rule of thumb is to err on the side of caution here.
>> > Privacy
>> > always trumps any need for telemetry data, no matter how legitimate.
>> >
>> > These rules apply to all products released by KDE.
>> >
>> > ## Transparency
>> >
>> > We provide detailed information about the data that is going to be
>> > shared, in
>> > a way that:
>> > - is easy to understand
>> > - is precise and complete
>> > - is available locally without network connectivity
>> >
>> > Any changes or additions to the telemetry functionality of an
>> > application will
>> > be highlighted in the corresponding release announcement.
>> >
>> > ## Control
>> >
>> > We give the user full control over what data they want to share with
>> > KDE. In
>> > particular:
>> > - application telemetry is always opt-in, that is off by default
>> > - application telemetry settings can be changed at any time, and are
>> > provided
>> > as prominently in the application interface as other application settings
>> > - applications honor system-wide telemetry settings where they exist
>> > (global
>> > "kill switch")
>> > - we provide detailed documentation about how to control the
>> > application
>> > telemetry system
>> >
>> > In order to ensure control over the data after it has been shared with
>> > KDE,
>> > applications will only transmit this data to KDE servers, that is
>> > servers
>> > under the full control of the KDE sysadmin team.
>> >
>> > We will provide a designated contact point for users who have concerns
>> > about
>> > the data they have shared with KDE. While we are willing to delete data
>> > a user
>> > no longer wants to have shared, it should be understood that the below
>> > rules
>> > are designed to make identification of data of a specific user
>> > impossible, and
>> > thus a deletion request impractical.
>> >
>> > ## Anonymity
>> >
>> > We do not transmit data that could be used to identify a specific user.
>> > In
>> > particular:
>> > - we will not use any unique device, installation or user id
>> > - data is stripped of any unnecessary detail and downsampled
>> > appropriately
>> > before sharing to avoid fingerprinting
>> > - network addresses (which are exposed inevitably as part of the data
>> > transmission) are not stored together with the telemetry data, and must
>> > only
>> > be stored or used to the extent necessary for abuse counter-measures
>> >
>> > ## Minimalism
>> >
>> > We only track the bare minimum of data necessary to answer specific
>> > questions,
>> > we do not collect data preemptively or for exploratory research. In
>> > particular, this means:
>> > - collected data must have a clear purpose
>> > - data is downsampled to the maximum extent possible at the source
>> > - relevant correlations between individual bits of data should be
>> > computed at
>> > the source whenever possible
>> > - data collection is stopped once the corresponding question has been
>> > answered
>> >
>> > ## Privacy
>> >
>> > We will never transmit anything containing user content, or even just
>> > hints at
>> > possible user content such as file names, URLs, etc.
>> >
>> > We will only ever track:
>> > - system information that is specific to the installation/environment,
>> > but
>> > independent of how the application/machine/installation is actually
>> > used
>> > - statistical usage data of an installation/application
>> >
>> > ## Compliance
>> >
>> > KDE only releases products capable of acquiring telemetry data if
>> > compliance
>> > with these rules has been established by a public review on
>> > [kde-core-devel|
>> > kde-community]@kde.org from at least two reviewers. The review has to
>> > be
>> > repeated for every release if changes have been made to how/what data
>> > is
>> > collected.
>> >
>> > Received data is regularly reviewed for violations of these rules, in
>> > particular for data that is prone to fingerprinting. Should such
>> > violations be
>> > found, the affected data will be deleted, and data recording will be
>> > suspended
>> > until compliance with these rules has been established again. In order
>> > to
>> > enable reviewing of the data, every KDE contributor with a developer
>> > account
>> > will have access to all telemetry data gathered by any KDE product.
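
To make the "downsampled at the source" rule from the Anonymity and
Minimalism sections of the quoted draft more concrete, here is a rough
sketch of what a client might do before sharing anything. The function
names and the bucket boundaries are assumptions chosen for illustration,
not part of the draft or of any existing KDE code. The idea is that exact
values (which can act as a fingerprint) are replaced by coarse ranges
before they ever leave the machine.

```python
def bucket_ram_mib(ram_mib: int) -> str:
    """Map an exact RAM size to a coarse range (boundaries are assumed)."""
    bounds = [1024, 2048, 4096, 8192, 16384]
    lower = 0
    for upper in bounds:
        if ram_mib < upper:
            return f"{lower}-{upper} MiB"
        lower = upper
    return f">={bounds[-1]} MiB"


def coarse_version(version: str) -> str:
    """Keep only the major.minor part of a dotted version string."""
    return ".".join(version.split(".")[:2])


def downsample(raw: dict) -> dict:
    """Build the sample to share from exact local values.

    Only the two whitelisted fields are emitted; anything else in
    `raw` is dropped rather than passed through.
    """
    return {
        "ram": bucket_ram_mib(raw["ram_mib"]),
        "version": coarse_version(raw["version"]),
    }
```

For example, downsample({"ram_mib": 3900, "version": "5.10.4"}) would
share only the range "2048-4096 MiB" and the version "5.10".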


