Gitlab CI: failed unit tests vs. currently passing CI

Sun Jan 23 00:59:01 GMT 2022

On Sun, Jan 23, 2022 at 12:29 PM Albert Astals Cid <aacid at kde.org> wrote:

> El diumenge, 23 de gener de 2022, a les 0:09:23 (CET), Ben Cooksley va
> escriure:
> > On Sun, Jan 23, 2022 at 11:29 AM Albert Astals Cid <aacid at kde.org>
> wrote:
> >
> > > El dissabte, 22 de gener de 2022, a les 6:11:29 (CET), Ben Cooksley va
> > > escriure:
> > > > EXCLUDE_DEPRECATED_BEFORE_AND_ATOn Sat, Jan 22, 2022 at 1:31 PM
> Friedrich
> > > > W. H. Kossebau <kossebau at kde.org> wrote:
> > > >
> > > > > Hi,
> > > >
> > > >
> > > > > seems that Gitlab CI is currently configured to show the green
> > > "Success"
> > > > > checkmark for pipeline runs even if unit tests are failing.
> > > > >
> > > >
> > > > That is correct, only compilation or other internal failures cause
> the
> > > > build to show a failure result.
> > > >
> > > >
> > > > > Reasons seems to be that there Gitlab only knows Yay or Nay,
> without
> > > the
> > > > > warning state level known from Jenkins.
> > > > >
> > > >
> > > > Also correct.
> > > >
> > > >
> > > > > And given that quite some projects (sadly) maintain a few long-time
> > > > > failing
> > > > > unit tests, having the pipeline fail on unit tests seems to have
> been
> > > seen
> > > > > as
> > > > > too aggressive
> > > >
> > > >
> > > > Correct again.
> > > >
> > > >
> > > > >
> > > > >
> > > > > This of course harms the purpose of the unit tests, when their
> failures
> > > > > are
> > > > > only noticed weeks later, not e.g. at MR discussion time.
> > > > >
> > > >
> > > > Gitlab does note changes in the test suite as can currently be seen
> on
> > > > https://invent.kde.org/frameworks/kio/-/merge_requests/708
> > > > Quoting the page:  "Test summary contained 33 failed and 16 fixed
> test
> > > > results out of 205 total tests"
> > > >
> > > > It does the same thing for Code Quality - "Code quality scanning
> detected
> > > > 51 changes in merged results"
> > >
> > > Don't want to derail the confirmation, but those results are terrible,
> > > they always say things changed in places not touched by the code of
> the MR,
> > > any idea why?
> > >
> >
> > Unfortunately not - my only guess would be that cppcheck reports results
> > slightly differently which Gitlab has issues interpreting.
> >
> >
> > >
> > > >
> > > >
> > > > >
> > > > > Seeing how at least in KDE Frameworks first regressions sneaked in
> > > without
> > > > > someone noticing (nobody looks at logs when the surface shows a
> green
> > > > > checkmark, e.g. kcoreaddons, kwidgetsaddons, kio, purpose, krunner
> on
> > > > > openSUSE
> > > > > and possibly more have regressed in recent weeks, see
> build.kde.org)
> > > this
> > > > > should be something to deal with better, right?
> > > >
> > > >
> > > > > Bhushan gave two first ideas just now on #kde-sysadmin:
> > > > > > Well we can add a switch that repos can commit to saying test
> > > failure is
> > > > > build failure
> > > > > > Another alternative is we use bot to write a comment on MR
> > > > >
> > > > > IMHO, to give unit tests the purpose they have, we should by
> default to
> > > > > let
> > > > > test failures be build failures. And have projects opt out if they
> > > need to
> > > > > have some unit tests keep failing, instead of e.g. tagging them
> with
> > > > > expected
> > > > > failures or handling any special environment they run into on the
> CI.
> > > > >
> > > > > Your opinions?
> > > > >
> > > >
> > > > The switch will need to be around the other way i'm afraid as there
> are
> > > > simply too many projects with broken tests right now.
> > > > The best place for that switch will be in .kde-ci.yml.
> > > >
> > > > My only concern however would be abuse of this switch, much in the
> way
> > > that
> > > > certain projects abuse EXCLUDE_DEPRECATED_BEFORE_AND_AT.
> > > > The last thing we would want would be for people to flip this switch
> and
> > > > then leave their CI builds in a failing state - meaning that actual
> > > > compilation failures would be missed (and then lead to CI maintenance
> > > > issues)
> > > >
> > > > Thoughts on that?
> > >
> > > Test failing should mark the CI as failed, anything other than that
> > > doesn't make sense. The CI did fail marking it as passed is lying to
> > > ourselves.
> >
> >
> > > We can *still* merge failed MR with failed CI, the Merge button is just
> > > red, but it will work.
> > >
> >
> > There is a big difference between "this doesn't compile" (because someone
> > forgot to commit a header/etc, dependency change that isn't in place or
> > because of a platform specific issue) and "some tests failed".
> > What that encourages is for people to ignore the results from the CI
> system
> > - as they'll get used to ignoring the CI system saying something is
> failing.
>
> I disagree, "this doesn't compile" is usually super easy to fix once
> spotted, a failing test is usually more nuanced and catching it when it
> starts happening is paramount.
>

Except when it comes to non-Linux platforms where in some cases developers
are hostile to fixing issues - viewing those platforms as someone else's
problem.
One of the benefits we were expecting to get out of Gitlab CI was that
build failures on platforms like Windows wouldn't end up being merged and
would be fixed before it lands.

You are now wanting to remove that.

In that case I guess we should remove everything but Linux CI - because the
cost of maintaining it will be too high.

>
> >
> > While this is not such a big deal for Linux, it is a massive deal for the
> > smaller platforms that far less people run.
> >
> > Saying you can merge when the CI says it is failing is setting ourselves
> up
> > for failure.
>
> I don't see how hiding (because of the super low visibility they have)
> that the tests are failing, or even worse have started failing (status quo
> and as far as I understand your suggestion) is any better.
>

If a project has all passing tests and has a commitment to ensure they
remain passing, then I have no problem in principle with that project being
allowed to make failing tests mean a failing CI run.
Code should never be merged when the CI run fails though.

We should never normalise merging when the CI reports a failure - it makes
pre-merge CI pointless.

>
> Cheers,
>   Albert
>

Thanks,
Ben

>
> >
> >
> > > Maybe this red button will convince people to fix their tests. (one can
> > > hope, right?)
> > >
> > > Of course if we do this change it should happen after we've done that
> > > change that fixes the test failing because of however gitlab CI is
> set-up
> > > (you mentioned we had to wait for Jenkins to be disabled for that)
> > >
> > > Cheers,
> > >   Albert
> > >
> >
> > Regards,
> > Ben
> >
> >
> > >
> > > >
> > > >
> > > > >
> > > > > Cheers
> > > > > Friedrich
> > > > >
> > > >
> > > > Cheers,
> > > > Ben
> > > >
> > >
> > >
> > >
> > >
> > >
> >
>
>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.kde.org/pipermail/kde-frameworks-devel/attachments/20220123/743335cc/attachment.htm>