Gitlab CI: failed unit tests vs. currently passing CI

Albert Astals Cid aacid at kde.org
Sun Jan 23 11:56:57 GMT 2022


El diumenge, 23 de gener de 2022, a les 1:59:01 (CET), Ben Cooksley va escriure:
> On Sun, Jan 23, 2022 at 12:29 PM Albert Astals Cid <aacid at kde.org> wrote:
> 
> > El diumenge, 23 de gener de 2022, a les 0:09:23 (CET), Ben Cooksley va
> > escriure:
> > > On Sun, Jan 23, 2022 at 11:29 AM Albert Astals Cid <aacid at kde.org>
> > wrote:
> > >
> > > > El dissabte, 22 de gener de 2022, a les 6:11:29 (CET), Ben Cooksley va
> > > > escriure:
> > > > > EXCLUDE_DEPRECATED_BEFORE_AND_ATOn Sat, Jan 22, 2022 at 1:31 PM
> > Friedrich
> > > > > W. H. Kossebau <kossebau at kde.org> wrote:
> > > > >
> > > > > > Hi,
> > > > >
> > > > >
> > > > > > seems that Gitlab CI is currently configured to show the green
> > > > "Success"
> > > > > > checkmark for pipeline runs even if unit tests are failing.
> > > > > >
> > > > >
> > > > > That is correct, only compilation or other internal failures cause
> > the
> > > > > build to show a failure result.
> > > > >
> > > > >
> > > > > > Reasons seems to be that there Gitlab only knows Yay or Nay,
> > without
> > > > the
> > > > > > warning state level known from Jenkins.
> > > > > >
> > > > >
> > > > > Also correct.
> > > > >
> > > > >
> > > > > > And given that quite some projects (sadly) maintain a few long-time
> > > > > > failing
> > > > > > unit tests, having the pipeline fail on unit tests seems to have
> > been
> > > > seen
> > > > > > as
> > > > > > too aggressive
> > > > >
> > > > >
> > > > > Correct again.
> > > > >
> > > > >
> > > > > >
> > > > > >
> > > > > > This of course harms the purpose of the unit tests, when their
> > failures
> > > > > > are
> > > > > > only noticed weeks later, not e.g. at MR discussion time.
> > > > > >
> > > > >
> > > > > Gitlab does note changes in the test suite as can currently be seen
> > on
> > > > > https://invent.kde.org/frameworks/kio/-/merge_requests/708
> > > > > Quoting the page:  "Test summary contained 33 failed and 16 fixed
> > test
> > > > > results out of 205 total tests"
> > > > >
> > > > > It does the same thing for Code Quality - "Code quality scanning
> > detected
> > > > > 51 changes in merged results"
> > > >
> > > > Don't want to derail the confirmation, but those results are terrible,
> > > > they always say things changed in places not touched by the code of
> > the MR,
> > > > any idea why?
> > > >
> > >
> > > Unfortunately not - my only guess would be that cppcheck reports results
> > > slightly differently which Gitlab has issues interpreting.
> > >
> > >
> > > >
> > > > >
> > > > >
> > > > > >
> > > > > > Seeing how at least in KDE Frameworks first regressions sneaked in
> > > > without
> > > > > > someone noticing (nobody looks at logs when the surface shows a
> > green
> > > > > > checkmark, e.g. kcoreaddons, kwidgetsaddons, kio, purpose, krunner
> > on
> > > > > > openSUSE
> > > > > > and possibly more have regressed in recent weeks, see
> > build.kde.org)
> > > > this
> > > > > > should be something to deal with better, right?
> > > > >
> > > > >
> > > > > > Bhushan gave two first ideas just now on #kde-sysadmin:
> > > > > > > Well we can add a switch that repos can commit to saying test
> > > > failure is
> > > > > > build failure
> > > > > > > Another alternative is we use bot to write a comment on MR
> > > > > >
> > > > > > IMHO, to give unit tests the purpose they have, we should by
> > default to
> > > > > > let
> > > > > > test failures be build failures. And have projects opt out if they
> > > > need to
> > > > > > have some unit tests keep failing, instead of e.g. tagging them
> > with
> > > > > > expected
> > > > > > failures or handling any special environment they run into on the
> > CI.
> > > > > >
> > > > > > Your opinions?
> > > > > >
> > > > >
> > > > > The switch will need to be around the other way i'm afraid as there
> > are
> > > > > simply too many projects with broken tests right now.
> > > > > The best place for that switch will be in .kde-ci.yml.
> > > > >
> > > > > My only concern however would be abuse of this switch, much in the
> > way
> > > > that
> > > > > certain projects abuse EXCLUDE_DEPRECATED_BEFORE_AND_AT.
> > > > > The last thing we would want would be for people to flip this switch
> > and
> > > > > then leave their CI builds in a failing state - meaning that actual
> > > > > compilation failures would be missed (and then lead to CI maintenance
> > > > > issues)
> > > > >
> > > > > Thoughts on that?
> > > >
> > > > Test failing should mark the CI as failed, anything other than that
> > > > doesn't make sense. The CI did fail marking it as passed is lying to
> > > > ourselves.
> > >
> > >
> > > > We can *still* merge failed MR with failed CI, the Merge button is just
> > > > red, but it will work.
> > > >
> > >
> > > There is a big difference between "this doesn't compile" (because someone
> > > forgot to commit a header/etc, dependency change that isn't in place or
> > > because of a platform specific issue) and "some tests failed".
> > > What that encourages is for people to ignore the results from the CI
> > system
> > > - as they'll get used to ignoring the CI system saying something is
> > failing.
> >
> > I disagree, "this doesn't compile" is usually super easy to fix once
> > spotted, a failing test is usually more nuanced and catching it when it
> > starts happening is paramount.
> >
> 
> Except when it comes to non-Linux platforms where in some cases developers
> are hostile to fixing issues - viewing those platforms as someone else's
> problem.
> One of the benefits we were expecting to get out of Gitlab CI was that
> build failures on platforms like Windows wouldn't end up being merged and
> would be fixed before it lands.
> 
> You are now wanting to remove that.

I can see where you're coming from, but just let it be clear that this is *not* what I want, what I want is more strictness "green means actually working (i.e. test pass), not just compiling" not less.

Cheers,
  Albert

> 
> In that case I guess we should remove everything but Linux CI - because the
> cost of maintaining it will be too high.
> 
> 
> >
> > >
> > > While this is not such a big deal for Linux, it is a massive deal for the
> > > smaller platforms that far less people run.
> > >
> > > Saying you can merge when the CI says it is failing is setting ourselves
> > up
> > > for failure.
> >
> > I don't see how hiding (because of the super low visibility they have)
> > that the tests are failing, or even worse have started failing (status quo
> > and as far as I understand your suggestion) is any better.
> >
> 
> If a project has all passing tests and has a commitment to ensure they
> remain passing, then I have no problem in principle with that project being
> allowed to make failing tests mean a failing CI run.
> Code should never be merged when the CI run fails though.
> 
> We should never normalise merging when the CI reports a failure - it makes
> pre-merge CI pointless.
> 
> 
> >
> > Cheers,
> >   Albert
> >
> 
> Thanks,
> Ben
> 
> 
> >
> > >
> > >
> > > > Maybe this red button will convince people to fix their tests. (one can
> > > > hope, right?)
> > > >
> > > > Of course if we do this change it should happen after we've done that
> > > > change that fixes the test failing because of however gitlab CI is
> > set-up
> > > > (you mentioned we had to wait for Jenkins to be disabled for that)
> > > >
> > > > Cheers,
> > > >   Albert
> > > >
> > >
> > > Regards,
> > > Ben
> > >
> > >
> > > >
> > > > >
> > > > >
> > > > > >
> > > > > > Cheers
> > > > > > Friedrich
> > > > > >
> > > > >
> > > > > Cheers,
> > > > > Ben
> > > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > >
> >
> >
> >
> >
> >
> 






More information about the Kde-frameworks-devel mailing list