Intelligent Testing Framework or How to Avoid Running Every Test Every Time
Posted by AlexBacon on 2007/1/17 23:29:55 (3163 reads)

Background

Anyone working on any large TDD (Test Driven Development) project knows the drill:

- Make changes to code
- Update with latest version of code base
- Fix inconsistencies and conflicts
- Start running tests
- Make cup of tea / have lunch / go home – depending on how far into the project you are
- Fix one or two test failures – fingers crossed you have not screwed up the other tests
- Try to check in code
- Swear loudly because someone has made changes to the code base that conflict with your changes
- Repeat ad nauseam

OK, I am exaggerating a bit, but full end-to-end system tests can take a while on a large system, particularly if you have to reset the system to a known state before each test. If you are trying to get a release out regularly, getting the build ready for the next release means banning commits of new code and spending days of pain and torment fixing all the build issues.

Dividing the tests into groups (e.g. using TestNG) and splitting the build into separate projects can help, but in the world of Spring, AOP, proxies, ORM technology and so on, it is very hard to tell which group of tests you need to run to validate a code change.

The solution given here (which has been successfully implemented) goes a long way towards eliminating these problems. It uses the JDK 5 Instrumentation technology to record at run time which tests touch which classes – AND conversely which tests you need to run for each class change. Before anyone screams – What about new classes? What about files that are not classes? I am well aware of these issues – but this technique will find the correct tests 90% of the time – and I rely on the build server to catch the remaining 10%. It certainly beats either not running the tests or waiting a long time for the tests to complete in their entirety.

Technology

JUnit was used as the underlying test framework which supports both individual Tests and Test Suites (groups of tests run together).

JDK 5 officially introduced Instrumentation into the Java world. Instrumentation allows classes to be changed as they are loaded, for a variety of reasons: to add code that collects coverage statistics, to remove unneeded log4j messages (http://surguy.net/articles/removing-log-messages.xml), to collect performance statistics and so on.

The instrumentation is performed by code in a JAR file termed an 'Agent'. The location of the agent JAR is passed in as a JVM argument: -javaagent:jarpath[=options]. The Premain-Class attribute in the JAR's manifest specifies which class to use as the Agent.

On start up the JVM calls the 'premain' method of the Agent passing in an Instrumentation object with which the Agent can register an instance of ClassFileTransformer (typically itself):


// Called by the JVM before the application's main method runs.
public static void premain(String options, Instrumentation instrumentation) {
    instrumentation.addTransformer(new MyTransformer());
}

Any classes that are loaded by the JVM will now pass through the transform method on MyTransformer before being available to the application.
Unfortunately the JVM specification does not specify a set of tools to perform the bytecode transformations. I chose ASM for performance, although there are other frameworks that are simpler to use, e.g. BCEL.
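To illustrate the transform hook on its own, here is a minimal sketch of a transformer that uses only the JDK and performs no bytecode rewriting. It simply notes which project classes pass through it and returns null, which tells the JVM to keep the original bytecode. The class name ClassNameLoggingTransformer, the packagePrefix parameter and the seenClasses method are hypothetical names chosen for this example and are not part of the original implementation, which rewrote the bytecode with ASM.

```java
import java.lang.instrument.ClassFileTransformer;
import java.security.ProtectionDomain;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical example: observes class loading but does not modify any bytecode.
class ClassNameLoggingTransformer implements ClassFileTransformer {

    // Internal names (e.g. "com/example/Foo") of the project classes seen so far.
    private final Set<String> seen = ConcurrentHashMap.newKeySet();

    // Only classes whose internal name starts with this prefix are noted.
    private final String packagePrefix;

    ClassNameLoggingTransformer(String packagePrefix) {
        this.packagePrefix = packagePrefix;
    }

    @Override
    public byte[] transform(ClassLoader loader,
                            String className,
                            Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain,
                            byte[] classfileBuffer) {
        // className may be null for some dynamically generated classes.
        if (className != null && className.startsWith(packagePrefix)) {
            seen.add(className);
        }
        // Returning null means "no transformation performed".
        return null;
    }

    Set<String> seenClasses() {
        return seen;
    }
}
```

Note that the JVM passes class names in internal form, with '/' rather than '.' as the package separator, so a prefix filter needs to be written as "com/example/" rather than "com.example.".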


Design

Run the tests using JUnit test suites. There are two variants of each Test Suite: a 'Non Intelligent' version, which runs all the tests (this is used on the build server), and an 'Intelligent' version, which checks whether there is any stored dependency data and uses it to work out which tests to run.
Each Test Suite has an associated Agent Dataset that is persisted between each run containing:

● Which classes are used by which test
● The failed tests from the last run
● The time of the last run
The Agent Dataset starts out empty before the Test Suite is first run.
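The three items in the dataset map naturally onto a small serializable class. The following is a sketch of one possible shape, not the original code: the class name AgentDataset and its field and method names are assumptions made for illustration.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical shape for the persisted Agent Dataset; all names are illustrative.
class AgentDataset implements Serializable {
    private static final long serialVersionUID = 1L;

    // Class name -> names of the tests that touched that class.
    final Map<String, Set<String>> testsByClass = new HashMap<>();

    // Names of the tests that failed in the last run.
    final Set<String> failedTests = new HashSet<>();

    // Time of the last run in milliseconds since the epoch; 0 means "never run".
    long lastRunMillis = 0L;

    // Record that the given test touched the given class.
    void addDependency(String className, String testName) {
        testsByClass.computeIfAbsent(className, k -> new HashSet<>()).add(testName);
    }

    // True for a dataset that has not yet been populated by any run.
    boolean isEmpty() {
        return testsByClass.isEmpty() && failedTests.isEmpty() && lastRunMillis == 0L;
    }
}
```

Keying the map by class rather than by test makes the later lookup ("this class changed, which tests touched it?") a single map access.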

Non Intelligent Test Suite Version Workflow:
1. Run all the tests in the test suite

Intelligent Test Suite Version Workflow:
1. Load the Agent Dataset from the last run. If there is none then proceed as per Non Intelligent Version.
2. Inject the old dependency database into the Agent (otherwise we will lose a lot of the dependency information as only a subset of the tests are being run).
3. Create a new set of tests to run. Add all the previously failed tests to it. Iterate through the project classes within the dataset. For each project class – compare the modification date of the class file against the timestamp within the dataset (i.e. when the tests were last run) – if the class file has been modified then add all the tests that touched that class file to the set of tests to run.
4. Run only the tests that are required
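Step 3 of the Intelligent workflow can be sketched as a single function. This is an illustration under assumptions rather than the original helper: TestSelector, selectTests and the lastModified lookup are hypothetical names, and the lookup is passed in as a function so that the sketch does not depend on where the class files live.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.ToLongFunction;

// Hypothetical helper that works out which tests need to be re-run.
class TestSelector {

    /**
     * @param testsByClass     class name -> tests that touched it (from the last run)
     * @param previouslyFailed tests that failed in the last run
     * @param lastRunMillis    timestamp stored in the dataset
     * @param lastModified     returns the modification time of a class file, in millis
     */
    static Set<String> selectTests(Map<String, Set<String>> testsByClass,
                                   Set<String> previouslyFailed,
                                   long lastRunMillis,
                                   ToLongFunction<String> lastModified) {
        // Previously failed tests are always re-run.
        Set<String> toRun = new TreeSet<>(previouslyFailed);
        for (Map.Entry<String, Set<String>> entry : testsByClass.entrySet()) {
            // A class file modified since the last run invalidates every test that touched it.
            if (lastModified.applyAsLong(entry.getKey()) > lastRunMillis) {
                toRun.addAll(entry.getValue());
            }
        }
        return toRun;
    }
}
```

In a real project the lastModified function would typically map the class name to its .class file and call File.lastModified() on it. That method returns 0 for a missing file, so a deleted class would not trigger any tests in this sketch.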

Common Workflow (for both Test Suite versions):
1. Instrument all the methods on all the classes within the project to add in a static call back to the 'record' method on the Agent passing the name of the containing class.
2. At the beginning of each test make a static call to the Agent to specify which test is currently being run.
3. Use the 'record' method on the Agent to log which classes have been used by which tests.
4. At the end of the test suite save the data collected by the Agent together with a timestamp and a list of failed tests.
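The recording side of this workflow (steps 2 and 3) needs very little code. Below is a minimal sketch, assuming a single test runs at a time; the class name RecordingAgent and the dependencies accessor are hypothetical, while startTest and record follow the method names used in this article.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the Agent's recording logic; assumes tests run one at a time.
class RecordingAgent {

    // Class name -> names of the tests that touched that class.
    private static final Map<String, Set<String>> testsByClass = new ConcurrentHashMap<>();

    // The test currently running, or null when no test has started yet.
    private static volatile String currentTest;

    // Called at the beginning of each test.
    static void startTest(String testName) {
        currentTest = testName;
    }

    // Called from every instrumented method with the name of its containing class.
    static void record(String className) {
        String test = currentTest;
        if (test == null) {
            return; // code running outside any test is ignored
        }
        testsByClass.computeIfAbsent(className, k -> ConcurrentHashMap.newKeySet()).add(test);
    }

    static Map<String, Set<String>> dependencies() {
        return testsByClass;
    }
}
```

Because record is called on every method entry, it has to be cheap. Using a set per class means repeated calls from the same test add nothing after the first.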

Implementation

The Agent consisted of the following components:
● Dependency Database
● Getters and Setters for the Dependency Database
● A 'premain' method to register an instance of the Agent as a transformer
● A transformer which uses ASM to add a static callback to 'record' on the Agent passing the name of the calling class.
● A 'record' method to record which classes were touched by which tests into the Dependency Database
● A 'startTest' method to tell the Agent which test is currently running

In addition there is a helper class to save and load the Agent Dataset and perform the calculation to determine which tests need to be run.
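The save and load half of such a helper could be as small as the sketch below. It assumes plain Java serialization as the storage format, which this article does not specify, and the names DatasetStore, save and load are hypothetical.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical persistence helper; assumes Java serialization as the storage format.
class DatasetStore {

    // Write the dataset to the given file, replacing any previous contents.
    static void save(Serializable dataset, Path file) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(dataset);
        }
    }

    // Read the dataset back, or return null if there is no data from a previous run.
    static Object load(Path file) throws IOException, ClassNotFoundException {
        if (!Files.exists(file)) {
            return null;
        }
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return in.readObject();
        }
    }
}
```

Returning null when the file is missing gives the Intelligent suite an easy way to detect a first run and fall back to running every test.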

Due to issues with the coverage tool (Emma) there is also an Ant task to instrument the classes before Emma does its own instrumentation. This is more an issue with Emma than with anything else.

This took about 4 days to implement for someone who had not touched ASM before. The lack of a standard toolset and of documentation and examples made this harder than it needed to be. For example, JDK 5 uses attributes within the class file to store information for generics, whereas the sample code I was using did not handle attributes. This was not a hard issue to fix, but the lack of documentation made it hard to work out what was happening.

Conclusion

I believe this is an extremely elegant solution in that it is almost completely orthogonal to the rest of the project code. The change to the existing code is minimal: just a few extra test suites, a call to 'startTest' on the Agent, and a call to 'saveAgentData' on the helper class at the end of the suite. Using this technique we achieved up to a 10-fold performance improvement in running test suites.
As a way to make TDD more productive and less painful I heartily recommend it.
