Getting to 100 % succeeding tests (for 2.9), or simply dropping them all?
Friedrich W. H. Kossebau
kossebau at kde.org
Thu Feb 5 06:11:46 UTC 2015
Hi,
currently Calligra (2.9 & master) has 313 tests. Those tests could be used to
automatically catch regressions (even better that CI runs them on every push,
so we do not have to run them ourselves every time) and thus save time
compared to users only starting to see problems after a release, reporting
them incorrectly in the issue tracker, and devs then taking time to find the
real problem and cause.
They could also be useful during the port to Qt5/KF5, as they give a good
degree of reassurance that things have moved in the right direction.
However, unlike a commit breaking the build, a change that makes a test
suddenly fail does not immediately pose a problem for everyone, so it seems
easy to just ignore it (and fix it tomorrow, well, the other tomorrow, ah,
next weekend perhaps).
-> problem 1: no mechanism to make people fix the tests they broke
CURRENT SITUATION
Just... now I have to be the bad boy here and point to
http://build.kde.org/job/calligra_stable/test/?width=800&height=600
There are ~40 tests failing, i.e. about 13 %.
That means 10 more failing tests than at the beginning of 2.9 branching, when
it was ~30:
http://build.kde.org/job/calligra_master/Variation=All,label=LINBUILDER/1293/
(And the last build for master still visible on build.kde.org right now, from
26.11.2014, had only 26 of 314 failing:
http://build.kde.org/job/calligra_master/Variation=All,label=LINBUILDER/1235/)
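To put the numbers above side by side, here is a minimal sketch of the arithmetic. The counts are the ones quoted in this mail ("~40" and "~30" are approximate), and the total of 313 at branching time is an assumption based on no tests having been added since; the function and variable names are made up for illustration.

```python
def failure_rate(failing: int, total: int) -> float:
    """Return the share of failing tests as a percentage."""
    return 100.0 * failing / total


# Counts as quoted in the mail; "~40" and "~30" are approximate.
stable_now = failure_rate(40, 313)
# Assumption: the total was already 313 when 2.9 was branched.
at_branching = failure_rate(30, 313)
master_nov_2014 = failure_rate(26, 314)

print(f"stable now:         {stable_now:.1f} %")
print(f"at 2.9 branching:   {at_branching:.1f} %")
print(f"master, 26.11.2014: {master_nov_2014:.1f} %")
```

So the failure rate went from roughly 8 % to roughly 13 % in a bit more than two months.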
Now, tests do not come without a price: everyone waiting on the result of a
Calligra CI build (or a local one) knows how much time they take, even if it
is only the linking.
-> problem 2: running current tests takes a lot of time, too much time locally
PROPOSAL A
Given 10 more failing tests (but no added tests) since the branching of 2.9,
where things should actually have gotten more stable and correct, we should
ask ourselves who is actually looking at those tests. Anyone?
So could we just get rid of them if no one is? :) That would save a very, very
big amount of CPU cycles and hard disk space for everyone, including CI. And
also code that would need porting.
PROPOSAL B
You are about to hit your Reply button hard after reading proposal A, because
you actually prefer tests? So do I, and the people who spent the effort to
write, review and maintain all those tests surely did as well.
So how could we get back to using the tests as a first-class utility in our
Calligra development? With e.g. CI reporting STABLE (= no failing tests)
builds every time?
To fix problem 2 we should separate the current tests into unit tests (i.e.
those testing just one thing while mocking the rest of the system as much as
possible), integration tests and other types of tests.
And make sure that the unit tests take no more than seconds to run, so no one
is kept from using them as part of their workflow, e.g. before pushing their
latest changes to the central repo. "make all test" should be a normal habit.
And leave all the longer-running tests to the CI; hey, that's what it is for.
-> task T0: specify/document different test types (Calligra wiki/build system)
-> task T1: go through all the tests and mark those tests which can be
considered quickly runnable unit tests, integration tests, other tests
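As a rough illustration of what T0/T1 could lead to, here is a minimal sketch of splitting a categorized test list into "run locally before pushing" and "leave to CI". Everything in it is hypothetical: the test names, the run times, the category strings and the 5-second budget are made up, and the real tagging would more likely live in the build system than in a script like this.

```python
from dataclasses import dataclass


@dataclass
class CaseInfo:
    name: str
    category: str   # "unit", "integration" or "other" (hypothetical labels)
    seconds: float  # measured run time of the test


# Hypothetical time budget for a single quickly runnable unit test.
UNIT_BUDGET_SECONDS = 5.0


def local_run_set(tests, budget=UNIT_BUDGET_SECONDS):
    """Names of tests cheap enough to run before every push."""
    return [t.name for t in tests
            if t.category == "unit" and t.seconds <= budget]


def ci_only_set(tests, budget=UNIT_BUDGET_SECONDS):
    """Names of all remaining tests, which only the CI runs."""
    local = set(local_run_set(tests, budget))
    return [t.name for t in tests if t.name not in local]


# Made-up example data, not actual Calligra tests or timings.
tests = [
    CaseInfo("TestPointStorage", "unit", 0.4),
    CaseInfo("TestLoadAndSave", "integration", 42.0),
    CaseInfo("TestSlowUnit", "unit", 30.0),
]

print(local_run_set(tests))
print(ci_only_set(tests))
```

A unit test that ends up in the CI-only set because it exceeds the budget (like the made-up TestSlowUnit) would be a candidate for either speeding up or recategorizing.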
Even with that test categorization, there are a number of currently failing
tests that need fixing. Ideally before the 2.9 release and the port. Some
of them have been failing for ages (e.g. diff between
http://build.kde.org/job/calligra_master/Variation=All,label=LINBUILDER/1235/testReport/
http://build.kde.org/job/calligra_stable/1829/testReport/
)
-> task T2: find all the long-time failing tests and disable them in the build
(possibly tag them as JJ bugs to get them fixed)
-> task T3: list all the newly failing tests and let's all fix them ASAP
As for problem 1, this is a social problem. People need to be aware of the
tests (I guess some might not be) and value those tests.
No idea if future CI systems deployed could enforce rejection of commits that
break tests, but ideally people simply feel responsible for breaking tests,
like they feel responsible for breaking the normal build.
Personally I see tasks T2 and T3 as something to be done first, best before
the 2.9 release. Count me in for making that happen :)
But first, your thoughts and feedback on this, please.
Cheers
Friedrich