Gitlab CI: failed unit tests vs. currently passing CI

Sat Jan 22 23:29:19 GMT 2022

El diumenge, 23 de gener de 2022, a les 0:09:23 (CET), Ben Cooksley va escriure:
> On Sun, Jan 23, 2022 at 11:29 AM Albert Astals Cid <aacid at kde.org> wrote:
> 
> > El dissabte, 22 de gener de 2022, a les 6:11:29 (CET), Ben Cooksley va
> > escriure:
> > > EXCLUDE_DEPRECATED_BEFORE_AND_ATOn Sat, Jan 22, 2022 at 1:31 PM Friedrich
> > > W. H. Kossebau <kossebau at kde.org> wrote:
> > >
> > > > Hi,
> > >
> > >
> > > > seems that Gitlab CI is currently configured to show the green
> > "Success"
> > > > checkmark for pipeline runs even if unit tests are failing.
> > > >
> > >
> > > That is correct, only compilation or other internal failures cause the
> > > build to show a failure result.
> > >
> > >
> > > > Reasons seems to be that there Gitlab only knows Yay or Nay, without
> > the
> > > > warning state level known from Jenkins.
> > > >
> > >
> > > Also correct.
> > >
> > >
> > > > And given that quite some projects (sadly) maintain a few long-time
> > > > failing
> > > > unit tests, having the pipeline fail on unit tests seems to have been
> > seen
> > > > as
> > > > too aggressive
> > >
> > >
> > > Correct again.
> > >
> > >
> > > >
> > > >
> > > > This of course harms the purpose of the unit tests, when their failures
> > > > are
> > > > only noticed weeks later, not e.g. at MR discussion time.
> > > >
> > >
> > > Gitlab does note changes in the test suite as can currently be seen on
> > > https://invent.kde.org/frameworks/kio/-/merge_requests/708
> > > Quoting the page:  "Test summary contained 33 failed and 16 fixed test
> > > results out of 205 total tests"
> > >
> > > It does the same thing for Code Quality - "Code quality scanning detected
> > > 51 changes in merged results"
> >
> > Don't want to derail the confirmation, but those results are terrible,
> > they always say things changed in places not touched by the code of the MR,
> > any idea why?
> >
> 
> Unfortunately not - my only guess would be that cppcheck reports results
> slightly differently which Gitlab has issues interpreting.
> 
> 
> >
> > >
> > >
> > > >
> > > > Seeing how at least in KDE Frameworks first regressions sneaked in
> > without
> > > > someone noticing (nobody looks at logs when the surface shows a green
> > > > checkmark, e.g. kcoreaddons, kwidgetsaddons, kio, purpose, krunner on
> > > > openSUSE
> > > > and possibly more have regressed in recent weeks, see build.kde.org)
> > this
> > > > should be something to deal with better, right?
> > >
> > >
> > > > Bhushan gave two first ideas just now on #kde-sysadmin:
> > > > > Well we can add a switch that repos can commit to saying test
> > failure is
> > > > build failure
> > > > > Another alternative is we use bot to write a comment on MR
> > > >
> > > > IMHO, to give unit tests the purpose they have, we should by default to
> > > > let
> > > > test failures be build failures. And have projects opt out if they
> > need to
> > > > have some unit tests keep failing, instead of e.g. tagging them with
> > > > expected
> > > > failures or handling any special environment they run into on the CI.
> > > >
> > > > Your opinions?
> > > >
> > >
> > > The switch will need to be around the other way i'm afraid as there are
> > > simply too many projects with broken tests right now.
> > > The best place for that switch will be in .kde-ci.yml.
> > >
> > > My only concern however would be abuse of this switch, much in the way
> > that
> > > certain projects abuse EXCLUDE_DEPRECATED_BEFORE_AND_AT.
> > > The last thing we would want would be for people to flip this switch and
> > > then leave their CI builds in a failing state - meaning that actual
> > > compilation failures would be missed (and then lead to CI maintenance
> > > issues)
> > >
> > > Thoughts on that?
> >
> > Test failing should mark the CI as failed, anything other than that
> > doesn't make sense. The CI did fail marking it as passed is lying to
> > ourselves.
> 
> 
> > We can *still* merge failed MR with failed CI, the Merge button is just
> > red, but it will work.
> >
> 
> There is a big difference between "this doesn't compile" (because someone
> forgot to commit a header/etc, dependency change that isn't in place or
> because of a platform specific issue) and "some tests failed".
> What that encourages is for people to ignore the results from the CI system
> - as they'll get used to ignoring the CI system saying something is failing.

I disagree, "this doesn't compile" is usually super easy to fix once spotted, a failing test is usually more nuanced and catching it when it starts happening is paramount.

> 
> While this is not such a big deal for Linux, it is a massive deal for the
> smaller platforms that far less people run.
> 
> Saying you can merge when the CI says it is failing is setting ourselves up
> for failure.

I don't see how hiding (because of the super low visibility they have) that the tests are failing, or even worse have started failing (status quo and as far as I understand your suggestion) is any better.

Cheers,
  Albert

> 
> 
> > Maybe this red button will convince people to fix their tests. (one can
> > hope, right?)
> >
> > Of course if we do this change it should happen after we've done that
> > change that fixes the test failing because of however gitlab CI is set-up
> > (you mentioned we had to wait for Jenkins to be disabled for that)
> >
> > Cheers,
> >   Albert
> >
> 
> Regards,
> Ben
> 
> 
> >
> > >
> > >
> > > >
> > > > Cheers
> > > > Friedrich
> > > >
> > >
> > > Cheers,
> > > Ben
> > >
> >
> >
> >
> >
> >
>