| Regression test suite for cairo. |
| |
| How to use cairo's test suite |
| ============================= |
| Using this test suite should be as simple as running: |
| |
| make test |
| |
| assuming that the cairo distribution in the directory above has been |
| configured and built. The test suite here goes through some effort to |
| run against the locally compiled library rather than any installed |
| version, but those efforts may fall short depending on the level of your |
| libtool madness. |
| |
| The results of the test suite run are summarized in an index.html |
| file which, when viewed in a web browser, makes it easy to see any |
| failed renderings alongside the corresponding reference image (and a |
| diff image as well). |
| |
| The test suite needs to be run before any code is committed and before |
| any release. See below for hints and rules governing the use of the suite. |
| |
| The test suite is built as a single binary, which allows you to choose |
| individual tests or categories of tests to run. For example, to run specific tests: |
| ./cairo-test-suite record-neg-extents-unbounded record-neg-extents-bounded |
| Or if you want to run all paint.* related tests you can use: |
| ./cairo-test-suite paint |
| Or if you want to check the current status of known failures: |
| ./cairo-test-suite XFAIL |
| Or to run a subset of tests, use the -k option to run only the tests |
| that include the given keyword: |
| ./cairo-test-suite -k downscale |
| The binary also permits controlling which backend is used via the |
| CAIRO_TEST_TARGET environment variable, so for instance: |
| CAIRO_TEST_TARGET=gl ./cairo-test-suite -k blur |
| This binary should be backwards-compatible with all library versions, |
| allowing you to compare current versus past behaviour for any test. |
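| |
| As a rough sketch, the keyword and backend selection described above can |
| be combined in a small shell helper that runs the same keyword once per |
| backend. The function name run_on_targets and the CAIRO_TEST_SUITE |
| override are hypothetical conveniences for this example; only the -k |
| option and CAIRO_TEST_TARGET come from the test suite itself. |

```shell
# Run the tests matching one keyword once per backend.
# CAIRO_TEST_SUITE is a hypothetical override for the binary's location;
# the test suite itself does not read it.
run_on_targets() {
    keyword=$1
    shift
    status=0
    for target in "$@"; do
        echo "== $target =="
        CAIRO_TEST_TARGET=$target \
            "${CAIRO_TEST_SUITE:-./cairo-test-suite}" -k "$keyword" || status=1
    done
    # Non-zero if any backend's run exited with an error.
    return $status
}

# Example: run_on_targets blur image pdf
```

| The helper keeps going after a failing backend and reports the failure |
| through its exit status, so one broken backend does not hide the rest. |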
| |
| Tailoring test runs |
| ------------------- |
| There are some mechanisms to limit the tests run during "make test". |
| These come in very handy when doing development, but should not be used |
| to circumvent the "pass" requirements listed below. |
| |
| make's TARGETS environment variable can be used to limit the backends when |
| running the tests. It should contain a space- or comma-separated list of |
| backends. The CAIRO_TESTS environment variable, which is a space- or |
| comma-separated list, can be used to limit the tests run. |
| For example: |
| |
| CAIRO_TESTS="zero-alpha" make test TARGETS=image,ps |
| |
| make's FORMAT variable can also be used to limit the content formats when |
| running the tests. It should contain a space- or comma-separated list of |
| content formats to test. |
| For example: |
| |
| CAIRO_TESTS="zero-alpha" make test TARGETS=image,ps FORMAT="rgb,rgba" |
| |
| Another very handy mechanism when trying to fix bugs is: |
| |
| make retest |
| |
| This will re-run the test suite, but only on tests that failed on the |
| last run. So this is a much faster way of checking if changes actually |
| fix bugs rather than running the entire test suite again. |
| |
| The test suite first compares the output from the current run against the |
| previous run in order to skip more expensive image comparisons. If you think |
| this is interfering with the results, you can clear the cached results using: |
| |
| make clean-caches |
| |
| Running tests under modified environments or tools |
| -------------------------------------------------- |
| To run tests under a tool like gdb, one can use the run target and |
| the TOOL variable. For example: |
| |
| CAIRO_TESTS=user-font make run TOOL=gdb TARGETS=pdf |
| |
| If you want to run under valgrind, there is a dedicated target that |
| also sets a bunch of useful valgrind options. Try: |
| |
| CAIRO_TESTS=user-font make check-valgrind |
| |
| To run tests under a modified environment, you can use the ENV |
| make variable. However, that environment will also affect the libtool |
| wrapper of the tests. To affect only the actual test binaries, pass |
| the environment settings as TOOL: |
| |
| CAIRO_TESTS=user-font make run TOOL="LD_PRELOAD=/path/to/something.so" |
| |
| Getting the elusive zero failures |
| --------------------------------- |
| It's generally been very difficult to achieve a test run with zero |
| failures. The difficulties stem from the various versions of the many |
| libraries that the test suite depends on, (it depends on a lot more |
| than cairo itself), as well as fonts and other system-specific |
| settings. If your system differs significantly from the system on |
| which the reference images were generated, then you will likely see |
| the test suite reporting "failures", (even if cairo is working just |
| fine). |
| |
| We are constantly working to reduce the number of variables that need |
| to be tweaked to get a clean run, (for example, by bundling fonts with |
| the test suite itself), and also working to more carefully document |
| the software configuration used to generate the reference images. |
| |
| Here are some of the relevant details: |
| |
| * Your system must have a copy of the DejaVu fonts; the sha1sums of |
| the versions used are listed in [...]. These are |
| "DejaVu Sans" (DejaVuSans.ttf) [e9831ee4fd2e1d0ac54508a548c6a449545eba3f]; |
| "DejaVu Sans Mono" (DejaVuSansMono.ttf) [25d854fbd0450a372615a26a8ef9a1024bd3efc6]; |
| "DejaVu Serif" (DejaVuSerif.ttf) [78a81850dc7883969042cf3d6dfd18eea7e43e2f]; |
| [the DejaVu fonts can be installed from the fonts-dejavu-core 2.34-1 Debian package] |
| and also |
| "Nimbus Sans L" (n019003l.pfb) |
| [which can be found in the gsfonts Debian package]. |
| |
| * Currently, you must be using a build of cairo using freetype |
| (cairo-ft) as the default font backend. Otherwise all tests |
| involving text are likely to fail. |
| |
| * To test the pdf backend, you will want the very latest version of |
| poppler as made available via git: |
| |
| git clone git://anongit.freedesktop.org/git/poppler/poppler |
| |
| As of this writing, no released version of poppler contains all |
| the fixes you will need to avoid false negatives from the test |
| suite. |
| |
| * To test the ps backend, you will need ghostscript version 9.06. |
| |
| * Testing the xlib backend is problematic since many X server |
| drivers have bugs that are exercised by the test suite. (Or, if |
| not actual bugs, differ slightly in their output in such a way |
| that the test suite will report errors.) This can be quite handy |
| if you want to debug an X server driver, but since most people |
| don't want to do that, another option is to run against a headless |
| X server that uses only software for all rendering. One such X |
| server is Xvfb which can be started like this: |
| |
| Xvfb -screen 0 1680x1024x24 -ac -nolisten tcp :2 |
| |
| after which the test suite can be run against it like so: |
| |
| DISPLAY=:2 make test |
| |
| We have been using Xvfb for testing cairo releases and ensuring |
| that all tests behave as expected with this X server. |
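| |
| One way to check an installed font against the checksums listed above is |
| a small wrapper around sha1sum. This is only a sketch: the function name |
| check_sha1 is made up for this example, and the font path in the usage |
| comment is an assumption, since font locations vary between systems. |

```shell
# Compare a file's SHA-1 digest with an expected value.
# Returns 0 on a match, 1 on a mismatch or if the file is missing.
check_sha1() {
    file=$1
    expected=$2
    if [ ! -f "$file" ]; then
        echo "missing: $file" >&2
        return 1
    fi
    # sha1sum prints "<digest>  <filename>"; keep only the digest.
    actual=$(sha1sum "$file" | cut -d' ' -f1)
    if [ "$actual" = "$expected" ]; then
        echo "ok: $file"
    else
        echo "MISMATCH: $file ($actual)" >&2
        return 1
    fi
}

# Example (the directory is an assumption, adjust it for your system):
# check_sha1 /usr/share/fonts/truetype/dejavu/DejaVuSans.ttf \
#     e9831ee4fd2e1d0ac54508a548c6a449545eba3f
```

| A mismatch does not necessarily mean cairo is broken, only that text |
| tests may differ from the reference images for reasons unrelated to |
| your changes. |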
| |
| What if I can't make my system match? |
| ------------------------------------- |
| For one reason or another, you may be unable to get a clean run of the |
| test suite even if cairo is working properly, (for example, you might |
| be on a system without freetype). In this case, it's still useful to |
| be able to determine if code changes you make to cairo result in any |
| regressions to the test suite. But it's hard to notice regressions if |
| there are many failures both before and after your changes. |
| |
| For this scenario, you can capture the output of a run of the test |
| suite before your changes, and then use the CAIRO_REF_DIR environment |
| variable to use that output as the reference images for a run after |
| your changes. The process looks like this: |
| |
| # Before code change there may be failures we don't care about |
| make test |
| |
| # Let's save those output images |
| mkdir /some/directory/ |
| cp -r test/output /some/directory/ |
| |
| # hack, hack, hack |
| |
| # Now to see if nothing changed: |
| CAIRO_REF_DIR=/some/directory/ make test |
| |
| Best practices for cairo developers |
| =================================== |
| If we all follow the guidelines below, then both the test suite and |
| cairo itself will stay much healthier, and we'll all have a lot more |
| fun hacking on cairo. |
| |
| Before committing |
| ----------------- |
| All tests should return a result of PASS or XFAIL. The XFAIL results |
| indicate known bugs. The final message should be one of the following: |
| |
| All XX tests behaved as expected (YY expected failures) |
| All XX tests passed |
| |
| If any tests have a status of FAIL, then the new code has caused a |
| regression error which should be fixed before the code is committed. |
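| |
| If you want to script this check, for example in a pre-commit hook, a |
| minimal sketch is to search the captured output for one of the two |
| messages above. The function name log_is_clean and the log file name in |
| the usage comment are hypothetical, and the pattern assumes the summary |
| appears in the log exactly as quoted above. |

```shell
# Read a test log on stdin; succeed only if it contains one of the two
# "clean" summary messages quoted above.
log_is_clean() {
    grep -Eq 'All [0-9]+ tests (passed|behaved as expected \([0-9]+ expected failures?\))'
}

# Example (run.log is a hypothetical file name):
# make test 2>&1 | tee run.log
# log_is_clean < run.log || echo "do not commit yet"
```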
| |
| When a new bug is found |
| ----------------------- |
| A new test case should be added by imitating the style of an existing |
| test. This means adding the following files: |
| |
| new-bug.c |
| reference/new-bug.ref.png |
| reference/new-bug.xfail.png |
| |
| Where new-bug.c is a minimal program to demonstrate the bug, following |
| the style of existing tests. The new-bug.ref.png image should contain |
| the desired result of new-bug.c if the bug were fixed while |
| new-bug.xfail.png contains the current results of the test. |
| |
| Makefile.sources should be edited by adding new-bug.c to test_sources. |
| And last but not least, don't forget to "git add" the new files. |
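| |
| Before committing, it can help to confirm that all three files are in |
| place. The following is a minimal sketch; the function name |
| missing_test_files is made up for this example, and it does not check |
| Makefile.sources or whether the files have been added to git. |

```shell
# Print each file that a new known-failure test still lacks.
# Prints nothing when the source file and both images are present.
# $1 is the test name, $2 the test directory (defaults to the current one).
missing_test_files() {
    name=$1
    dir=${2:-.}
    for f in "$name.c" "reference/$name.ref.png" "reference/$name.xfail.png"; do
        [ -f "$dir/$f" ] || echo "$f"
    done
}

# Example, run from the test directory:
# missing_test_files new-bug
```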
| |
| When a new feature is added |
| --------------------------- |
| It's important for the regression suite to keep pace with development |
| of the library. So a new test should be added for each new feature. |
| The work involved is similar to the work described above for new bugs. |
| The only distinction is that the test is expected to pass so it |
| should not need a new-bug.xfail.png file. |
| |
| While working on a test |
| ----------------------- |
| Before a bugfix or feature is ready, it may be useful to compare |
| output from different builds. For convenience, you can set |
| CAIRO_REF_DIR to point at a previous test directory, relative |
| to the current test directory, and any previous output will be |
| used by preference as reference images. |
| |
| When a bug is fixed |
| ------------------- |
| The fix should be verified by running the test suite which should |
| result in an "unexpected pass" for the test of interest. Rejoice as |
| appropriate, then remove the relevant xfail.png file from git. |
| |
| Before releasing |
| ---------------- |
| All tests should return a result of PASS for all supported (those enabled by |
| default) backends, meaning all known bugs are fixed, resulting in the happy |
| message: |
| |
| All XX tests passed |
| |
| Some notes on limitations in poppler |
| ==================================== |
| One of the difficulties of our current test infrastructure is that we |
| rely on external tools to convert cairo's vector output (PDF, |
| PostScript, and SVG), into an image that can be used for the image |
| comparison. This means that any bugs in that conversion tool will |
| result in false negatives in the test suite. |
| |
| We've identified several such bugs in the poppler library which is |
| used to convert PDF to an image. This is particularly discouraging |
| because 1) poppler is free software that will be used by *many* cairo |
| users, and 2) poppler calls into cairo for its rendering so it should |
| be able to do a 100% faithful conversion. |
| |
| So we have an interest in ensuring that these poppler bugs get fixed |
| sooner rather than later. As such, we're trying to be good citizens by |
| reporting all such poppler bugs that we identify to the poppler |
| bugzilla. Here's a tracking bug explaining the situation: |
| |
| Poppler does not yet handle everything in the cairo test suite |
| https://bugs.freedesktop.org/show_bug.cgi?id=12143 |
| |
| Here's the rule: If a cairo-pdf test reports a failure, but viewing |
| the resulting PDF file with acroread suggests that the PDF itself is |
| correct, then there's likely a bug in poppler. In this case, we can |
| simply report the poppler bug, (making it block 12143 above), post the |
| PDF result from the test suite, and list the bug in this file. Once |
| we've done this, we can capture poppler's buggy output as a |
| pdf-specific reference image (as reference/*.xfail.png) so that the |
| test suite will regard the test as passing, (and we'll ensure there |
| is no regression). |
| |
| Once the poppler bug gets fixed, the test suite will start reporting a |
| false negative again, and this will be easy to fix by simply removing |
| the pdf-specific reference image. |
| |
| Here are the reported poppler bugs and the tests they affect: |
| |
| [Newest was closed in 2009.] |