CI Outages
Ben Cooksley
bcooksley at kde.org
Fri May 1 11:02:20 BST 2026
Hi all,
Over the past 48 hours or so we've had two unfortunate incidents that
have significantly degraded the CI system.
The first part of this took place approximately 36 hours ago, when an
admin, in response to the announcement of the https://copy.fail/ Linux
kernel exploit, installed updates on one or more of our VM Runners. The
unfortunate side effect of this is that it also installed updates for
gitlab-runner and upgraded it to a newer version. As part of its work,
GitLab Runner requires the assistance of a helper binary within the VM, and
this helper should ideally be the same version as is deployed on the VM
runner hosts, or at the very least be a newer version.
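As a rough illustration of the version rule described above, here is a minimal Python sketch. It is not how GitLab Runner itself checks anything; the function names and the version strings in the usage note are hypothetical and chosen only for the example.

```python
def parse_version(text):
    """Turn a dotted version string such as '1.2.3' or 'v1.2.3' into a tuple of ints."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def helper_is_compatible(runner_version, helper_version):
    """Rule from the text: the helper is the same version as the runner, or newer."""
    return parse_version(helper_version) >= parse_version(runner_version)
```

For example, helper_is_compatible("2.0.0", "1.9.0") is False, which corresponds to the situation where the runner on the host was upgraded but the helper inside the VM image was not.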
In the case of this update, the way artifacts are captured changed
incompatibly, which is why we are seeing breakages related to an
unrecognised timeout parameter that causes CI jobs to fail outright.
The images that support the majority of our CI builds (Linux - Qt 6.11, Qt
6.12 and Qt 5.15, Android, Flatpak, Snap and AppImage) have been rebuilt
to include the newer GitLab Runner helper and those builds should now be
functional again.
Custom VM images utilised by Yocto, Buildstream, Neon and KDE Linux have
also been rebuilt and should be functional again.
Windows builds require a replacement base image as the GitLab Runner helper
is burned into the base image, and that replacement is currently being
uploaded.
Once uploaded, I'll rebuild the image that supports both general Windows CI
and Craft builds, which will restore those builds to working order as well.
For FreeBSD, we will need our custom package repository updated to include
the newer GitLab Runner helper.
This has been requested and should be completed in the next few days, so
those builds will remain broken for a bit longer I'm afraid.
The second incident involved a service outage of the builder that supports
Docker-based jobs. This was caused by maintenance carried out by our hoster
on the SAN that supports those hosts, and also caused service disruptions
to all Notary Service operations, WebSVN and Sentry.
This outage impacted us for approximately 12 hours and has now been
corrected, with all services fully returned to normal.
Apologies for the disruption caused by these incidents. It is most
regrettable and, in the case of the issue affecting VM builds, completely
avoidable.
Please let me know if you have any questions on the above.
Many thanks,
Ben
More information about the kde-core-devel
mailing list