Tuesday, 27 October 2015

Why are these tests so damn slow? Part 2

In a previous post I discussed the first of two approaches to tackling a slow-running CI build, namely to attack the source of the problem and try to figure out why there are so many tests that take so long to run. In this post I'll discuss the second approach, which is essentially one of brute force.

If, after making the effort to really optimise your tests, speed them up individually, and remove any duplication, you are still not satisfied with the performance, then perhaps this approach can help you. Here are some ideas of where to start:
  1. Improve the hardware of the machine on which you are running the tests. This may sound obvious but it's commonly overlooked, especially when using a cloud based Jenkins server which a team might assume is highly specced. I remember being frustrated by this point on one particular project when the operations team refused to add more hardware. Suffice to say that they didn't like my suggestion of instead running Jenkins on an old laptop that was lying around, even though it would have been faster!
  2. Run your tests in separate phases. It's very common to have unit and functional test phases, but for really slow-running builds you might want to consider adding more. For example, if your project has functional tests that test a CMS in the browser and a REST API, you could set these up to run in CI as separate Jenkins jobs. One benefit of this is that if you know you've only changed the front-end CMS code, once the CMS tests pass you should have a high level of confidence that your code hasn't broken anything, without necessarily waiting for the REST API tests to finish.
  3. Parallelise. Running unit tests in parallel is fairly common, and Maven makes it pretty easy. Unfortunately, on a long-running CI build unit tests are not normally the biggest problem, and we need to look at functional tests instead.
Let's look at point 3 in more detail now.

The problem you might have with parallelising your functional tests is that in many cases they run on one server but exercise a system that's deployed on another.

This was the case on a recent project, where we had a Jenkins master, a Jenkins slave that ran the functional tests, and a server to which the system under test was deployed on every build. If you want to parallelise here it suddenly becomes very expensive. Consider the case where you want to split your tests in two. You'll need:

  1. Another Jenkins slave to run the second set of tests
  2. Another server on which to deploy the system. We can't use the same instance of the system under test as then tests may interfere with each other. Other servers that this system uses may also need to be duplicated (e.g. database).
  3. Some way of orchestrating the process.
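
As a rough illustration of what the "split your tests" step could look like, here is a minimal Python sketch that divides tests into shards of roughly equal running time, using a simple greedy rule (longest test first, always onto the lightest shard). The test names and timings are made up for the example, and split_into_shards is a hypothetical helper, not something from the project described here.

```python
import heapq


def split_into_shards(durations, shard_count):
    """Greedy split: longest test first, each onto the currently lightest shard."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    # Heap entries are (total_seconds, shard_index); the lightest shard pops first.
    heap = [(0.0, index) for index in range(shard_count)]
    heapq.heapify(heap)
    shards = [[] for _ in range(shard_count)]
    # Sort by duration (descending), then by name so the result is deterministic.
    ordered = sorted(durations.items(), key=lambda item: (-item[1], item[0]))
    for name, seconds in ordered:
        total, index = heapq.heappop(heap)
        shards[index].append(name)
        heapq.heappush(heap, (total + seconds, index))
    return shards


# Hypothetical functional tests and their running times in seconds.
durations = {
    "CmsLoginTest": 120,
    "CmsPublishTest": 300,
    "ApiOrdersTest": 240,
    "ApiUsersTest": 90,
    "ApiSearchTest": 150,
}

shards = split_into_shards(durations, 2)
for index, shard in enumerate(shards):
    print(index, shard, sum(durations[name] for name in shard))
```

Each shard would then be handed to its own Jenkins slave, pointed at its own copy of the system under test. The greedy rule isn't optimal, but it is usually close enough, and it needs nothing beyond the timings from a previous build.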

This quickly becomes a headache, especially when you decide to split your tests again, and again. However, I've had great success using Docker to remedy this particular problem, as it allows you to very quickly create images and run containers. Hopefully I'll talk more about it in another post.
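
To give a flavour of why containers help, here is one possible way to give each test shard its own isolated copy of the system and its database. It is a sketch under assumptions, not the setup from the project above: the image names, the container port 8080, and the naming scheme are all hypothetical. The function only builds the docker command lines, it doesn't run them.

```python
def docker_commands_for_shard(shard_index, app_image, db_image, base_port=8080):
    """Build the docker commands for one shard's private environment.

    Assumptions: the app listens on container port 8080 and reaches its
    database over a per-shard Docker network.
    """
    suffix = f"shard{shard_index}"
    network = f"ci-{suffix}"
    # Each shard gets its own host port so the environments don't collide.
    host_port = base_port + shard_index
    return [
        ["docker", "network", "create", network],
        ["docker", "run", "-d", "--name", f"db-{suffix}",
         "--network", network, db_image],
        ["docker", "run", "-d", "--name", f"app-{suffix}",
         "--network", network, "-p", f"{host_port}:8080", app_image],
    ]


# Hypothetical image names, purely for illustration.
for shard_index in range(2):
    for command in docker_commands_for_shard(shard_index, "example/app", "example/db"):
        print(" ".join(command))
```

Because every shard has its own network, container names and host port, tests in one shard can't interfere with another, and adding a third or fourth shard is just a bigger range rather than another pair of servers.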
