GDPR - what does it mean for us?

Ben Cooksley bcooksley at
Sun Apr 1 20:19:52 UTC 2018

On Mon, Apr 2, 2018 at 3:33 AM, Luigi Toscano <luigi.toscano at> wrote:
> Lydia Pintscher wrote:
>> Hey folks,
>> As you might have heard new EU privacy regulations are going to come
>> into effect on May 25th. I wanted to start this thread to see if there
>> is anything we still need to do and to make sure we're all aware of it
>> for future activities.
>> I found
>> helpful in understanding what it all means.
>> What I am taking away from it for us so far:
>> * Don't track without consent unless required by law. Consent needs to
>> be asked for in plain understandable language. Collecting data without
>> being able to track the person seems ok (and relevant for our
>> telemetry efforts).
>> * When we're asking for data we need to have a reason for it and we
>> need to make it explicit what we're using the data for.
>> * People can ask for all the data we have about them and can ask for
>> it to be deleted. This means we need to make it less of a pita to do
>> this for sysadmin. And we need a list of all the places where we hold
>> data. Sysadmin: do you have that already?

Just about everything that is backed by a database fits into this
category I'd say, as the database is usually used to store user
account information (username and email address at the minimum,
sometimes more).

Going through our DNS zone file, plus what I can think of off the top
of my head that would be:
- Identity itself
- Bugzilla
- The Wikis
- Phabricator
- All CMS installations (Drupal and Wordpress)
- Forum
- CiviCRM
- Nextcloud
- Etherpad (Notes)
- Reviewboard
- Season of KDE (
- Limesurvey (
- Mailing List archives (

Given that the data collected by Piwik is anonymised to the extent
that it's only usable to track user activity in aggregate, it
shouldn't be subject to GDPR regulation.

There is also the donations tracker on, which I believe
records the Paypal transaction ID, along with the donating user's
email address. I would imagine we're legally required to retain a
certain amount of information on donations we receive for Tax and
Anti-Money Laundering purposes however, so I'd expect that to be
carved out of GDPR.

>> * Penalties for non-compliance are potentially _severe_.
>> What's still fuzzy to me:
>> * What is considered data in this context? A user profile that the
>> user himself created? A machine-generated user profile based on
>> actions the user took for advertising etc? A post on a forum? Probably
>> all of the above.

I would expect the content submitted by users (postings themselves,
edits to wiki pages, etc) to fall under copyright law, and not be
subject to GDPR as it doesn't actually concern people's personal
information.

The metadata on those postings or edits however, being the link back
to their user profile, would be subject to it. In the case of the
Wikis, Bugzilla and Forum we can resolve that by just merging those
accounts into a single "anonymous" user account when we get such
requests.

Reviewboard is scheduled to be turned into a static copy, which means
purging anything other than one or two items posted by a person will
be a nightmare. We may end up having to just not have an historical
archive of Reviewboard to avoid this issue altogether. Given the
pushback I got when we first tried to shut down Reviewboard I'd expect
people to be unhappy about that (the only workaround to that is to
keep it running indefinitely, which means more maintenance for

I'd suggest that we disable commenting globally across all our
Wordpress and Drupal instances, and make all existing comments we hold
inaccessible to the public, as I've no idea how easy it is to clean
these up.

Phabricator is a bit more complicated. Upstream strongly advises
against deleting anything from the system, especially those items
which are heavily referenced like user accounts and repositories
(there is a tool to do it though, which gives you a massive "this is
unsupported" warning with red skull and crossbones and all). If
something breaks as a result of deleting stuff, upstream policy is you
get to keep the pieces. Deleting a user account will not delete
anything they posted (Phabricator will just say it was posted by
"Unknown User").

If it's permitted under GDPR, we could just remove the personal
information from the user account and give it a generic username
(deletedaccount001 or something). This would leave the log (metadata)
of actions the user took, along with anything they posted, in place.
It should be noted that even if we do delete a user account, the
underlying database would still have this information (the user PHID
will be there), and you could still reconstruct this log using the
Conduit API if you wanted (so hopefully purging personal information
is enough here).
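
To make the "anonymise in place" idea concrete, a minimal Python
sketch follows. The `Account` record, its field names and the
`anonymise` helper are hypothetical and are not Phabricator's schema
or API; the only point is that the opaque identifier (the PHID in
Phabricator's case) is kept so the log of actions still links up,
while the personal fields are blanked and the username becomes
generic.

```python
from dataclasses import dataclass, replace
from itertools import count

@dataclass(frozen=True)
class Account:
    phid: str       # opaque identifier, kept so existing references still resolve
    username: str
    real_name: str
    email: str

_sequence = count(1)  # hypothetical source of numbers for placeholder usernames

def anonymise(account: Account) -> Account:
    """Return a copy with personal fields blanked and a generic username."""
    placeholder = f"deletedaccount{next(_sequence):03d}"
    return replace(account, username=placeholder, real_name="", email="")

# Made-up example account; the PHID value is a placeholder, not a real one.
user = Account("PHID-USER-example", "jdoe", "Jane Doe", "jdoe@example.org")
scrubbed = anonymise(user)
print(scrubbed)
```

Whether keeping the identifier and the action log like this is
sufficient under GDPR is exactly the open question above.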

We'll likely need proper advice on the above one; however, it does
help that it's only a log of what they did after we anonymise the
profile.

> I'm not an expert on GDPR, but I see that there is confusion around the
> application of erasure to, for example, version control systems.
> IANAL and we probably need a lawyer and coordination with other FLOSS
> communities, but a quick search shows:

That was something I was trying to avoid thinking about, but is also
an issue yes.

On that note, where do things like the names and email addresses we
put in the copyright statements in all our source code fit? Removing
all of that would be nearly impossible (and does that mean the
requestor has to sign away all their IP rights over to us before we
can do a removal - or if they can keep them and someone has a legal
issue with that code in the future, what happens?)

If there is not an explicit carve-out for SCMs then someone should
send the European Commission a "congratulations for making open source
development illegal" card.

> Ciao
> --
> Luigi

