Getting to 100% succeeding tests (for 2.9), or simply dropping them all?
Dmitry Kazakov
dimula73 at gmail.com
Thu Feb 5 07:12:37 GMT 2015
Hi, Friedrich!
Here are my notes about the tests in Krita.
1) Quite a lot of tests in Krita are based on comparing the result
against a reference QImage. These tests are really useful for catching
regressions and debugging whole subsystems, but they have a few drawbacks:
1.1) Reference .png files take a lot of space in the repository (current
solution: https://answers.launchpad.net/krita-ru/+faq/2670)
1.2) The rendered images depend on too many things, like the installed
libraries, their versions, and the CPU model (e.g. Vc, FFTW3, the CPU
capabilities found). This means a test may run fine on a developer's PC
but fail on Jenkins.
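A side note on 1.1: a minimal sketch of one way to keep the reference
images out of the main repository and fetch them at configure time
instead (the URL and file names here are made-up placeholders, not our
actual hosting):

# Hypothetical: download a reference image at configure time instead
# of committing it to the repository.
set(REF_IMAGE_URL "http://files.example.org/krita-test-data/blur_ref.png")
set(REF_IMAGE_PATH "${CMAKE_CURRENT_BINARY_DIR}/data/blur_ref.png")
if(NOT EXISTS ${REF_IMAGE_PATH})
    # In real use, add EXPECTED_HASH to guard against corrupt downloads.
    file(DOWNLOAD ${REF_IMAGE_URL} ${REF_IMAGE_PATH} STATUS download_status)
endif()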
2) A consequence of 1): from time to time we have to check manually what
exactly is wrong with a failing test: is it mere pixel drift or a real
problem?
3) I am firmly against disabling failing unit tests in the build system.
We have had quite a few cases where tests were simply forgotten and
rotted away after being disabled from the build, since we have no system
for keeping track of them. Spamming the (already overloaded) Bugzilla is
not a solution either.
4) Is it possible to add some tagging to unit tests? For example, in
CMake:

kde4_add_unit_test(KisDummiesFacadeTest
    TESTNAME krita-ui-KisDummiesFacadeTest
    TESTSET integration  # <---------------------------- special tag
    ${kis_dummies_facade_test_SRCS})

kde4_add_unit_test(KisZoomAndPanTest
    TESTNAME krita-ui-KisZoomAndPanTest
    TESTSET extended  # <---------------------------- special tag
    ${kis_zoom_and_pan_test_SRCS})
So that we could have several sets of tests:

make test
make test integration
make test extended

There is one important point: *all* the test sets should be compiled
whenever KDE_BUILD_TESTS is ON.
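For what it's worth, plain CTest already has a mechanism close to this:
test labels. A rough sketch of how the TESTSET tag above could be
implemented on top of it (reusing one of the real test names from above):

kde4_add_unit_test(KisDummiesFacadeTest
    TESTNAME krita-ui-KisDummiesFacadeTest
    ${kis_dummies_facade_test_SRCS})
# Attach a CTest label to the test registered under TESTNAME;
# CTest can then select or exclude tests by label.
set_property(TEST krita-ui-KisDummiesFacadeTest PROPERTY LABELS integration)

Running a set would then be "ctest -L integration", excluding it
"ctest -LE integration", and a plain "make test" would still run
everything.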
5) It would also be nice to be able to put different subtests of one
executable into different subsets, though I am not sure whether that is
doable.
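It might be doable with stock QTest already: a QTest executable accepts
test function names on its command line and then runs only those. So one
binary could be registered as several CTest entries; a sketch (the
executable and function names are made up):

# One executable, two test entries running different subsets of its
# test functions (the target and function names are hypothetical).
add_test(NAME krita-ui-KisZoomAndPanTest-quick
         COMMAND kiszoomandpantest testFastZoom)
add_test(NAME krita-ui-KisZoomAndPanTest-extended
         COMMAND kiszoomandpantest testFullCanvasPan)
set_property(TEST krita-ui-KisZoomAndPanTest-extended
             PROPERTY LABELS extended)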
As a conclusion:
If we had such a test tagging system implemented before the release, we
could just tag the failing ones with 'fix-later' or something similar.
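With labels that would be cheap, and the tagged tests would stay compiled
and visible instead of rotting away, which also addresses my point 3; a
sketch:

# Keep a known-broken test compiled and registered, but tagged:
set_property(TEST krita-ui-KisZoomAndPanTest PROPERTY LABELS fix-later)
# Daily runs then skip the tagged tests while they stay listed:
#   ctest -LE fix-later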
On Thu, Feb 5, 2015 at 9:11 AM, Friedrich W. H. Kossebau <kossebau at kde.org>
wrote:
> Hi,
>
> currently Calligra (2.9 & master) has 313 tests. Those tests could be
> used to catch regressions automatically (even better, CI runs them on
> every push, so we do not have to run them ourselves every time) and
> thus save time compared to users only starting to see problems after a
> release, reporting them incorrectly in the issue tracker, and devs then
> taking time to find the real problem and cause.
> They could also be useful during the port to Qt5/KF5, as they reassure
> us to a good degree that things have been moved in the right direction.
>
> Just, other than with a commit breaking the build, a change resulting
> in a test suddenly failing does not immediately pose a problem for
> everyone, so it seems easy to just ignore it (and fix it tomorrow,
> well, the other tomorrow, ah, next weekend perhaps).
>
> -> problem 1: no mechanism to enforce people to fix tests they broke
>
>
> CURRENT SITUATION
>
> Just... now I have to be the bad boy here and point to
> http://build.kde.org/job/calligra_stable/test/?width=800&height=600
> There are around ~40 tests failing, i.e. 13%.
>
> Which means 10 more failing tests than at the beginning of the 2.9
> branching, when it was ~30:
>
> http://build.kde.org/job/calligra_master/Variation=All,label=LINBUILDER/1293/
>
> (And the last build for master still visible right now on
> build.kde.org, from 26.11.2014, had only 26 of 314 failing:
>
> http://build.kde.org/job/calligra_master/Variation=All,label=LINBUILDER/1235/
> )
>
> Now tests do not come without a price: everyone waiting on the result
> of a Calligra CI build (or a local one) knows how much time they take,
> even if it is only the linking.
>
> -> problem 2: running the current tests takes a lot of time, too much
> time locally
>
>
> PROPOSAL A
>
> Given 10 more failing tests (but no added tests) since the branching
> of 2.9, where things actually should have become more stable and
> correct, we should ask ourselves who is actually looking at those
> tests. Anyone?
>
> So could we just get rid of them if no one is? :) That would save a
> very, very big amount of CPU cycles and hard disk space for everyone,
> including the CI. And also code that would need porting.
>
>
> PROPOSAL B
>
> You are about to hit your Reply button hard after reading proposal A,
> because you actually prefer tests? Actually I do as well, and the
> people who spent the effort to write, review and maintain all those
> tests surely did too.
>
> So how could we get back to using the tests as a first-class utility
> in our Calligra development? With, e.g., CI reporting STABLE (= no
> failing tests) builds every time?
>
> For fixing problem 2, we should separate the current tests into unit
> tests (those simply testing one thing while mocking the rest of the
> system as much as possible), integration tests, and other types of
> tests.
> And we should make sure that unit tests take less than seconds to run,
> so no one is stopped from using them as part of their workflow, e.g.
> before pushing their latest changes to the central repo. "make all
> test" should be a normal habit. And we can leave running all the
> longer-running tests to the CI, hey, that's what it is for.
> -> task T0: specify/document the different test types (Calligra
> wiki/build system)
> -> task T1: go through all the tests and mark which can be considered
> quickly runnable unit tests, integration tests, or other tests (a
> rough sketch of what that could look like follows below)
> Even with that test categorization, there is a number of currently
> failing tests that need fixing, ideally before the 2.9 release and the
> port. Some of them have been failing for ages (e.g. the diff between
>
> http://build.kde.org/job/calligra_master/Variation=All,label=LINBUILDER/1235/testReport/
> http://build.kde.org/job/calligra_stable/1829/testReport/
> )
> -> task T2: find all the long-time failing tests, disable them from
> the build, and possibly tag them as JJ (Junior Job) bugs to get them
> fixed
> -> task T3: list all the newly failing tests and let's all fix them
> ASAP
>
> Fixing problem 1 is a social problem: people need to be aware of the
> tests (I guess some might not be) and to value them.
> No idea whether the CI systems deployed in the future could enforce
> rejection of commits that break tests, but ideally people simply feel
> responsible for breaking tests, just like they feel responsible for
> breaking the normal build.
>
> Personally I see tasks T2 and T3 as something to be done first,
> ideally before the 2.9 release. Consider me subscribed to making that
> happen :)
>
> But first, please share your thoughts and feedback on this.
>
> Cheers
> Friedrich
--
Dmitry Kazakov