Telemetry Policy

Ben Cooksley bcooksley at kde.org
Mon Aug 14 10:53:17 BST 2017


On Sun, Aug 13, 2017 at 9:47 PM, Volker Krause <vkrause at kde.org> wrote:
> Hi,

Hi Volker,

>
> during the KUserFeedback BoF at Akademy there was quite some interest in
> collecting telemetry data in KDE applications. But before actually
> implementing that we agreed to define the rules under which we would want to
> do that. I've tried to put the input we collected during Akademy into proper
> wording below. What do you think? Did I miss anything?
>
> Regards,
> Volker
>
>
> # Telemetry Policy Draft
>
> Application telemetry data can be a valuable tool for tailoring our products
> to the needs of our users. The following rules define how KDE collects and
> uses such application telemetry data. As privacy is of utmost importance to
> us, the general rule of thumb is to err on the side of caution here. Privacy
> always trumps any need for telemetry data, no matter how legitimate.
>
> These rules apply to all products released by KDE.
>
> ## Transparency
>
> We provide detailed information about the data that is going to be shared, in
> a way that:
> - is easy to understand
> - is precise and complete
> - is available locally without network connectivity
>
> Any changes or additions to the telemetry functionality of an application will
> be highlighted in the corresponding release announcement.
>
> ## Control
>
> We give the user full control over what data they want to share with KDE. In
> particular:
> - application telemetry is always opt-in, that is off by default
> - application telemetry settings can be changed at any time, and are provided
> as prominently in the application interface as other application settings
> - applications honor system-wide telemetry settings where they exist (global
> "kill switch")
> - we provide detailed documentation about how to control the application
> telemetry system
>
> In order to ensure control over the data after it has been shared with KDE,
> applications will only transmit this data to KDE servers, that is servers
> under the full control of the KDE sysadmin team.
>
> We will provide a designated contact point for users who have concerns about
> the data they have shared with KDE. While we are willing to delete data a user
> no longer wants to have shared, it should be understood that the below rules
> are designed to make identification of data of a specific user impossible, and
> thus a deletion request impractical.

Can we change "impractical" to "effectively impossible" here please?

>
> ## Anonymity
>
> We do not transmit data that could be used to identify a specific user. In
> particular:
> - we will not use any unique device, installation or user id
> - data is stripped of any unnecessary detail and downsampled appropriately
> before sharing to avoid fingerprinting
> - network addresses (which are exposed inevitably as part of the data
> transmission) are not stored together with the telemetry data, and must only
> be stored or used to the extent necessary for abuse counter-measures

I'm wary that people might jump on the network addresses bit here.

Can we please mention that all records that contain network addresses
and other similar information would be stored in such a form that they
could not be associated with telemetry records?

As for the logs: since there are other uses for them, I'd prefer that
we widen this to also allow them to be kept so that we can maintain
the proper and effective operation of the telemetry system and other
associated services. How long we retain those logs should be entirely
at our discretion, and indefinite if need be.
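
To illustrate the separation I have in mind, here is a minimal sketch.
It is Python only for brevity, and every name in it (SubmissionStores,
handle_submission, the field names) is hypothetical rather than an
existing KDE service. The idea is that the telemetry record and the
log entry share no field that could be used to join them: the payload
keeps only a coarse day, the log keeps only address and time. On a
low-traffic day the date alone could still narrow things down, so
treat this as an outline and not a guarantee.

```python
import json
import secrets
from datetime import datetime, timezone


class SubmissionStores:
    """Hypothetical in-memory stand-ins for two separate backends."""

    def __init__(self):
        self.telemetry = []   # payloads only, no network metadata
        self.access_log = []  # network metadata only, no payload


def handle_submission(stores, remote_addr, received_at, body):
    """Store one submission so the two records share no joinable field."""
    payload = json.loads(body)

    # Telemetry record: payload plus a coarse day only. No address,
    # no precise timestamp, no request id.
    record = {"day": received_at.date().isoformat(), "data": payload}
    # Insert at a random position so that list order does not mirror
    # the order of the access log.
    position = secrets.randbelow(len(stores.telemetry) + 1)
    stores.telemetry.insert(position, record)

    # Access log entry: what abuse handling and operations need,
    # and nothing about the submitted content.
    stores.access_log.append(
        {"addr": remote_addr, "time": received_at.isoformat()}
    )


# Example: one submission from a documentation-range address.
stores = SubmissionStores()
handle_submission(
    stores,
    "192.0.2.10",
    datetime(2017, 8, 14, 9, 53, tzinfo=timezone.utc),
    '{"screens": 2}',
)
```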

>
> ## Minimalism
>
> We only track the bare minimum of data necessary to answer specific questions,
> we do not collect data preemptively or for exploratory research. In
> particular, this means:
> - collected data must have a clear purpose
> - data is downsampled to the maximum extent possible at the source
> - relevant correlations between individual bits of data should be computed at
> the source whenever possible
> - data collection is stopped once the corresponding question has been answered
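
To make the downsampling point concrete, here is a rough sketch of
bucketing a value on the client before anything is transmitted. It is
Python only for brevity; the function name and the bucket bounds are
made up for illustration and are not something KUserFeedback is known
to provide.

```python
def downsample(value, bounds):
    """Map a raw value to a coarse bucket label.

    bounds are ascending upper limits; a label "4-8" means 4 <= value < 8.
    """
    lower = 0
    for upper in bounds:
        if value < upper:
            return f"{lower}-{upper}"
        lower = upper
    return f"{lower}+"


# Hypothetical buckets for installed memory in GiB.
RAM_BOUNDS = (2, 4, 8, 16)

# Only the bucket label would be shared, never the exact amount.
report = {"ram_gib": downsample(6, RAM_BOUNDS)}
```

Wide buckets make fingerprinting harder, at the cost of less precise
answers, so the bounds would need to be chosen per question.
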
>
> ## Privacy
>
> We will never transmit anything containing user content, or even just hints at
> possible user content such as file names, URLs, etc.
>
> We will only ever track:
> - system information that is specific to the installation/environment, but
> independent of how the application/machine/installation is actually used
> - statistical usage data of an installation/application
>
> ## Compliance
>
> KDE only releases products capable of acquiring telemetry data if compliance
> with these rules has been established by a public review on [kde-core-devel|
> kde-community]@kde.org from at least two reviewers. The review has to be
> repeated for every release if changes have been made to how/what data is
> collected.
>
> Received data is regularly reviewed for violations of these rules, in
> particular for data that is prone to fingerprinting. Should such violations be
> found, the affected data will be deleted, and data recording will be suspended
> until compliance with these rules has been established again. In order to
> enable reviewing of the data, every KDE contributor with a developer account
> will have access to all telemetry data gathered by any KDE product.

I've got two technical notes here:

1) All products should fetch details on where to submit telemetry data
from an online configuration file similar to
https://autoconfig.kde.org/ocs/providers.xml

This would give us the capacity to version the telemetry server api,
and potentially even "kill" telemetry submissions from older
application versions if needed.

2) No software product should use the QNetworkAccessManager family of
classes, due to known defects in its operation within some versions of
Qt which cause infrastructure problems.
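
As a rough sketch of what I mean in note 1: the client fetches a small
configuration file, looks up the entry for its own API version, and
only submits if that version is still enabled. It is Python only for
brevity, and the URL, the XML schema and the function names below are
all assumptions for illustration. No such file exists yet, and its
format need not resemble providers.xml.

```python
import xml.etree.ElementTree as ET
from urllib.request import urlopen

# Hypothetical location; a real deployment would use a KDE-hosted URL.
CONFIG_URL = "https://example.org/telemetry/config.xml"

# Hypothetical schema: one <endpoint> element per server API version.
EXAMPLE_CONFIG = """
<telemetry>
  <endpoint api="1" enabled="false"/>
  <endpoint api="2" enabled="true"
            url="https://example.org/telemetry/v2/submit"/>
</telemetry>
"""


def fetch_config(url=CONFIG_URL, timeout=10):
    """Download the configuration file and return its raw bytes."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def submission_url(config_xml, client_api):
    """Return the submit URL for this client's API version, or None."""
    root = ET.fromstring(config_xml)
    for endpoint in root.iter("endpoint"):
        if endpoint.get("api") == str(client_api):
            if endpoint.get("enabled") == "true":
                return endpoint.get("url")
            return None  # this version has been switched off
    return None  # unknown version: do not submit anything


# An old client speaking API 1 gets None and stays silent.
old_client_target = submission_url(EXAMPLE_CONFIG, 1)
new_client_target = submission_url(EXAMPLE_CONFIG, 2)
```

Treating "unknown version" and "fetch failed" as "do not submit" keeps
the failure mode on the side of sending nothing.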

Cheers,
Ben


