CI congestion/starvation

Albert Astals Cid aacid at kde.org
Sat Mar 7 17:30:11 GMT 2026


On Saturday, 7 March 2026, at 14:41:51 (Central European Standard Time),
Alexander Semke wrote:
> On 01/03/26 01:31, Ben Cooksley wrote:
> > On Sun, Mar 1, 2026 at 10:46 AM Johnny Jazeix <jazeix at gmail.com> wrote:
> >     On Sat, 28 Feb 2026 at 22:24, Ingo Klöcker <kloecker at kde.org> wrote:
> >     > On Saturday, 28 February 2026 20:53:38 Central European Standard Time
> >     > Johnny Jazeix wrote:
> >     > > Hi,
> >     > > today we also have a lot of congestion. After discussion with Ben,
> >     > > it's due to a new Gear update which uses the resources of the CI for
> >     > > multiple hours.
> >     > > Would it be possible to spread the changes done to each repo during a
> >     > > full day (with sleeps between each git push) instead of doing them at
> >     > > once to let other projects use the CI?
> >     > 
> >     > You do realize that this would mean that the people who do our releases
> >     > would have to sit the full day in front of their computer?
> >     
> >     I don't know the exact process, but I guess the pushes are not all
> >     done manually but via a script?
> >     How often is there an error requiring human intervention? If the
> >     answer is never, the script can run in the background and the person
> >     can get on with their life?
> > 
> > Putting sleeps in between each push would make release preparation
> > activities quite difficult, as pushing the version bumps is just one
> > part of the process.
> > 
> >     > A Gear release happens once a month. I really don't think that's a big
> >     > problem. (Yes, there's also Plasma, but I think that's a lot fewer
> >     > projects, and Frameworks.) Just make sure that you don't plan a release
> >     > of a non-Gear project around the release date of Gear (or Plasma or
> >     > Frameworks). Marketing-wise it's better to avoid such a collision
> >     > anyway.
> >     
> >     You don't, but other people are impacted. Maybe we can run these heavy
> >     processes at a "better" time when fewer developers are active (I guess
> >     we can get stats from the CI usage)?
> > 
> > For the record, it took the CI nodes approximately 10 hours to work
> > through all of the builds (they're just finishing up now, having been
> > triggered at 2pm UTC).
> > That includes all the other builds they would normally service and also
> > received during that time.
> > 
> > During this time the CI nodes completed a total of 5,211 builds, with
> > the vast majority of these jobs completing either in a matter of
> > seconds (for the JSON/XML/etc validation jobs) or in the space of a
> > few minutes (for conventional CI and CD jobs).
> > 4,807 of those took less than 10 minutes (160 hours of CI time), 346
> > of them took between 10 and 25 minutes (85 hours of CI time) and 77 of
> > them took more than 25 minutes (55 hours of CI time) for a total of
> > 301 CI hours (difference of 1 hour due to rounding).
> > 
> > During this we had conventional Linux CI jobs that completed in under
> > a minute (which includes VM provisioning, cloning sources, unpacking
> > dependencies, configure, build, install, publishing build artifacts,
> > and running tests) as well as jobs for other OSes completing in 2-3
> > minutes.
> > 
> > In terms of optimisation, the CI jobs enabled for pim/pim-sieve-editor
> > need to be reviewed, as that repository is running jobs that are
> > inappropriate given its nature.
> > Those runs accounted for 2 hours of wasted CI time.
> > 
> > Data for all this is attached.
> 
> Today the waiting time on CI looks very long again. Looking at the
> attached statistics, I think more things should be reviewed and
> optimized.

As announced on this mailing list, we branched KDE Gear yesterday. This means
triggering jobs for 251 repositories, which are going to take a while to
process.

A bit of patience goes a long way.

CI has a 2-minute wait time at the moment (except the macOS builder, which is
a bit more backlogged).

Albert

> 
> 
> build_sphinx_app_docs for docs-kdenlive-org failed after 2h (timeout?)
> and looks to be expensive in general:
> 
> https://invent.kde.org/documentation/docs-kdenlive-org/-/jobs?kind=BUILD
> 
> 
> There are also multiple Qt5 builds (especially the expensive and failing
> builds for krita) - do we still need to support Qt5?
> 
> > That means it is actually not possible to make it non-disruptive, as
> > doing it at a different time would just be a means of favouring one
> > timezone (say EU) over others - it simply takes a significant amount
> > of time to rebuild the world (which is essentially what a Gear release
> > entails).
> 
> If we collect these statistics now for a couple of weeks/months,
> basically the data you attached in the previous email but also with the
> start times, we'll see the distributions across different days and time
> frames and would also be able to calculate the "degree of concurrency on
> CI" - this would allow us to move such peak loads and infrequent
> expensive builds into more idle time frames.





