From Fedora Project Wiki
Revision as of 21:30, 4 March 2011

Introduction

There is currently a desire to add more test coverage to AutoQA but we need to make a decision on the test tools to use. The following is my comparison of nose and py.test in the context of finding the better solution for AutoQA.

I (tflink) think that I have made it pretty clear which parts are my opinion and which are plain comparison (YMMV). Please let me know if I missed some important aspect or if I got something wrong (or just edit the page).

Source Code

What good is a comparison without code to go with it?

Comparison

While nose and py.test are similar tools, they do have their differences and we want to make the best choice for us. Based on the two proofs of concept I did, I will outline what I see as the advantages and disadvantages of both tools. For the purposes of this comparison, I will be talking about pytest 2.0.1 and nose 1.0.0.

Documentation

From what I saw, the documentation for pytest is far superior to the documentation for nose. The documentation for nose lacks examples and did not have enough detail for me to get some functions working (I had problems getting @with_setup to behave). Pytest, on the other hand, has more detailed documentation, examples and pointers to specific blog entries that detail some non-standard functions.

Test Detection

AutoQA is related to testing and we have some classes and functions that have “test” in the name. Since the default naming convention for tests in Python is anything with “test” in it, there are some false positives for tests in our code base. Both nose and pytest use test discovery, but the two specific approaches have different effects.

Test Detection in nose

Nose uses regular expressions in order to determine what is and is not a test. The exact regular expression used is easily configured from either the command line or a configuration file. This single regular expression is used for file, class, module and function detection. Short of writing custom plugins, this seems to be the only way to change the test detection mechanism.
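To illustrate the single-regex approach, here is a minimal sketch using only the standard `re` module. The default pattern below is my approximation of what nose uses and may differ between versions; the stricter pattern and all of the names being matched are hypothetical.

```python
import re

# Approximation of nose's default test-matching pattern (assumption: the
# exact default may differ between nose versions).
DEFAULT_MATCH = re.compile(r"(?:^|[\b_./-])[Tt]est")

# Hypothetical stricter pattern aimed only at test function names.
STRICT_MATCH = re.compile(r"^test_should_")


def looks_like_test(name, pattern=DEFAULT_MATCH):
    """Return True if the pattern is found anywhere in the name."""
    return pattern.search(name) is not None


# Hypothetical names: a real test, a helper that merely mentions "test",
# and an ordinary module name.
names = ["test_should_return_filename", "run_test_harness", "koji_utils"]
default_hits = [n for n in names if looks_like_test(n)]
strict_hits = [n for n in names if looks_like_test(n, STRICT_MATCH)]
```

With the default pattern the helper `run_test_harness` is picked up as a false positive. The stricter pattern removes it, but because the same expression is applied at every level, a module named `test_koji_utils` would no longer match either.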

When differentiating between unit tests and functional tests, it is pretty easy to tag tests with decorators and then use command line options to specify which tagged tests should be run.
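The decorator-based selection can be sketched as follows. The `attr` decorator here is a local stand-in written for illustration, not nose's own code; to my understanding nose ships a similar decorator in `nose.plugins.attrib` together with an `-a` command line option. The test names and the `select` helper are hypothetical.

```python
def attr(*flags, **values):
    """Stand-in decorator that tags a function with attributes."""
    def decorate(func):
        for flag in flags:
            setattr(func, flag, True)
        for key, value in values.items():
            setattr(func, key, value)
        return func
    return decorate


@attr("unit")
def test_parse_nvr():
    # Fast test with no external dependencies.
    assert "foo-1.0-1".rsplit("-", 2) == ["foo", "1.0", "1"]


@attr("functional")
def test_download_rpms():
    pass  # placeholder for a slower test that talks to external services


def select(tests, flag):
    """Keep only the tests carrying the given tag."""
    return [t for t in tests if getattr(t, flag, False)]


unit_tests = select([test_parse_nvr, test_download_rpms], "unit")
```

A runner that filters on these attributes only needs to know the tag name, so unit and functional tests can live side by side in the same module.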

Test Detection in pytest

From a user perspective, pytest relies on multiple glob patterns instead of a single regular expression when determining what is and is not a test. There are separate configuration options for detecting files, classes and functions, which makes it easier to change the naming convention for functions and gives better granularity for eliminating false positives without having to resort to complicated regular expressions and/or strict naming conventions.
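The effect of having one pattern per level can be sketched with the standard `fnmatch` module. This is not pytest's implementation, and the pattern values are hypothetical; pytest exposes comparable settings as the ini options `python_files`, `python_classes` and `python_functions`, whose exact matching rules depend on the pytest version.

```python
from fnmatch import fnmatchcase

# Hypothetical per-level patterns; the values are illustrative only.
PATTERNS = {
    "file": "test_*.py",
    "class": "Test*",
    "function": "should_*",
}


def is_test(kind, name):
    """Check a name against the pattern for its own level only."""
    return fnmatchcase(name, PATTERNS[kind])
```

Because the function pattern is independent of the file pattern, test functions can be named `should_*` while files keep the `test_*.py` convention, and a helper function that happens to contain “test” is not collected.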

It is not difficult to modify pytest to differentiate between unit and functional tests but the easiest way to do so still sets up all of the tests even if they aren't executed. This makes the test run as a whole very slow. I was able to get around the slowdown by using a different method for excluding tests based on filename but it is something to keep in mind.

Integration with unittest

This isn't a huge issue for us, seeing as we don't have a whole lot of existing tests, but historically nose has had better integration with unittest than pytest. There has been an effort to improve this in the newest release, and pytest now claims to have unittest support equivalent to that of nose.

Test Isolation

Both nose and pytest have at least some facilities for reverting in-test changes to sys modules. Pytest has better integration for doing and undoing changes to arbitrary modules on a per-test basis through the monkeypatch plugin.
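The idea behind per-test patching can be sketched with a small context manager. This is a simplified stand-in written with the standard library, not pytest's monkeypatch implementation; `patched` and `koji_settings` are hypothetical names used only for illustration.

```python
import contextlib
import types


@contextlib.contextmanager
def patched(target, name, value):
    """Temporarily replace target.<name>, restoring the original on exit."""
    original = getattr(target, name)
    setattr(target, name, value)
    try:
        yield
    finally:
        # Runs even if the test body raises, so later tests see the original.
        setattr(target, name, original)


# Hypothetical stand-in for a module attribute that the code under test reads.
koji_settings = types.SimpleNamespace(
    pkgurl="http://koji.fedoraproject.org/packages"
)


def test_uses_patched_url():
    with patched(koji_settings, "pkgurl", "http://localhost/packages"):
        assert koji_settings.pkgurl == "http://localhost/packages"
    # The change is undone once the block ends.
    assert koji_settings.pkgurl == "http://koji.fedoraproject.org/packages"
```

In pytest the same effect comes from requesting `monkeypatch` as a test argument and calling `monkeypatch.setattr(...)`, as in the failure output shown in the Output section; the undo step then happens automatically when the test finishes.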

Neither nose nor pytest would have any issues integrating with virtualenv and both are in PyPI and thus installable through pip with no additional work on our part.

Pytest does have better support for temp directory management than nose does. This would likely have a greater effect on functional testing than on unit testing, but it lends itself well to package downloads and any logfiles or output generated by external tools.

Customization and Extending

Both pytest and nose have good facilities for writing plugins so customization wouldn't be a huge issue for either.

Pytest does have several well-defined hooks to override some of its default behavior without writing plugins.

Output Aesthetics

I introduced the same failure to both proofs of concept in order to demonstrate the output on test failure. Personally, I prefer the output from pytest over nose.

nose

(test_env)[tflink@localhost autoqa-devel]$ nosetests lib/python/tests/
...F...........
======================================================================
FAIL: test_koji_utils.TestGetNvrRpms.test_should_return_filename
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/srv/code/autoqa-devel/test_env/lib/python2.7/site-packages/nose/case.py", line 187, in runTest
    self.test(*self.arg)
  File "/srv/code/autoqa-devel/lib/python/tests/test_koji_utils.py", line 83, in test_should_return_filename
    assert test_filename == [self.ref_filename]
AssertionError

----------------------------------------------------------------------
Ran 15 tests in 6.387s

FAILED (failures=1)


pytest

(test_env)[tflink@localhost autoqa-devel]$ py.test lib/python/tests/
=============================== test session starts ===============================
platform linux2 -- Python 2.7.0 -- pytest-2.0.1
collected 5 items 

lib/python/tests/test_koji_utils.py ...F.

==================================== FAILURES =====================================
______________________ TestGetNvrRpms.should_return_filename ______________________

self = <test_koji_utils.TestGetNvrRpms instance at 0x135e320>
monkeypatch = <_pytest.monkeypatch.monkeypatch instance at 0x135e758>

    def should_return_filename(self, monkeypatch):
        monkeypatch.setattr(self.testkoji, 'nvr_to_urls', self.nvr_to_urls)
        monkeypatch.setattr(self.testkoji, 'pkgurl',
          'http://koji.fedoraproject.org/packages')
    
        test_filename = self.testkoji.get_nvr_rpms(self.test_nvr, self.ref_dir)
    
>       assert test_filename == [self.ref_filename]
E       assert ['makeitfail-.../rpmdir/rpm1'] == ['/tmp/rpmdir/rpm1']
E         At index 0 diff: 'makeitfail-0.1-2.noarch' != '/tmp/rpmdir/rpm1'
E         Left contains more items, first extra item: '/tmp/rpmdir/rpm1'

lib/python/tests/test_koji_utils.py:96: AssertionError
======================= 1 failed, 4 passed in 0.10 seconds ========================


Other Testing Features

None of these features were used in the proofs of concept, but I thought that they were interesting and potentially useful enough to warrant a mention. Both features are from pytest; while I'm sure you can do similar things with nose, as far as I know it would involve custom code and/or plugins.

Test Parameterization

Py.test has an interesting feature called Test Parameterization where you can change some of the test inputs programmatically. The example they use is to swap out databases for different testing, but we might be able to use it for package sources or something similar.
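A minimal sketch of what this could look like for package sources follows. It assumes the `pytest_generate_tests` hook together with `metafunc.parametrize`, which comes from newer pytest releases and may not match the 2.0.1 API discussed here; the `PACKAGE_SOURCES` list and the test itself are hypothetical.

```python
# Hypothetical list of package sources to run each test against.
PACKAGE_SOURCES = ["koji", "bodhi"]


def pytest_generate_tests(metafunc):
    """Collection hook: generate one test call per package source."""
    # Only parametrize tests that actually ask for the argument.
    if "package_source" in metafunc.fixturenames:
        metafunc.parametrize("package_source", PACKAGE_SOURCES)


def test_source_is_known(package_source):
    # Runs once for each entry in PACKAGE_SOURCES.
    assert package_source in PACKAGE_SOURCES
```

The test body is written once and the list of inputs lives in a single place, so adding another source does not require touching the tests themselves.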

Application Specific Test Fixtures

Another built-in pytest feature of note is the ability to have fixtures specific to an application. The advantage of this is that you can consolidate complex application-specific setup code in one place instead of duplicating it across multiple test classes. I can see this also being useful in working with test resources (koji, bodhi, repositories, etc.).


Package Availability

Neither pytest 2.0.1 nor nose 1.0.0 is currently available in the Fedora repositories. One potential difficulty with pytest is related to its recent split from pylib. Since pytest is currently part of the pylib package in the Fedora repositories, any further upgrade would require that package to be split into two separate packages instead of a simple upgrade. I imagine that this will happen in the future, but it is one thing to consider.

Local Experience

Local is a relative term here, since most of us are in different parts of the world. As an example, Anaconda's test suite seems to be mostly written with unittest, using nose as a runner. While they seem to have used a slightly different strategy than what is proposed here, it would be easier to leverage their experience with testing frameworks if we used the same system they do.

Conclusion

After a detailed comparison of the two tools, I have come to the conclusion that both nose and py.test are capable of fulfilling our needs for self-testing in AutoQA. However, the point of this comparison was to select one of the two frameworks.

Considering the comparison above, I think that py.test would be the better choice for AutoQA. Py.test has better documentation, more detailed output on test failure, more customizability without resorting to custom plugins and better support for test isolation. While we would not be able to leverage as much local experience with py.test, better documentation should lead us towards finding solutions in that documentation instead of having to rely on the experience of others to find those solutions.