
Faster Builds via Concurrency

Recently I have been looking for ways to make a Java build run faster. This is something I seem to do at least once a year, typically because the application’s production code base and automated test suite both grow over time. The build had already been split into multiple phases, with the slower integration tests running in a later phase. My efforts this time focused on the initial build phase that developers run before pushing their changes to the rest of the team. Speed matters for a routine process like this, which is executed many times a day: ideally the build would run instantaneously, telling the developer immediately whether their change is good to go.

So how could this build be made faster? It bothered me that despite having modern multi-core CPUs, the build still executed as a single serial process. Surely massive speed improvements could be obtained by having the build run concurrently over multiple threads or processes?

Parallel Checks

Since the build script used ANT, my first attempted optimization was to use ANT's parallel task. I mapped out the major activities that the build needed to perform and the dependencies between them (which were already defined within the ANT build script). I then arranged for as much of the work as possible to run in parallel, which ended up requiring a four-level nesting of parallel and sequential tasks. The resulting build failed to work properly: it refused to run the tasks in the lowest-level nested parallel task. So I simplified the build down to a single parallel task. This worked most of the time, although on rare occasions the build would hang for no apparent reason. The build time did improve significantly, however, with just under a 50% reduction in the time required. Here is a simplified version of the ANT code:

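<target name="parallel-build" depends="compile-unit-tests">
	<!-- unit-test, findbugs and pmd are macros; each sets a property if its check fails -->
	<parallel failonany="true">
		<sequential>
			<unit-test/>
		</sequential>
		<sequential>
			<findbugs/>
		</sequential>
		<sequential>
			<pmd/>
		</sequential>
		<sequential>
			<antcall target="compile-integration-tests"/>
		</sequential>
	</parallel>
	<!-- runs after all parallel branches have finished and summarizes the results -->
	<antcall target="parallel-build-result"/>
</target>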

One interesting problem I ran into concerned the frequent scenario where a developer runs this build on their workstation prior to check-in. I wanted the build to summarize the results of the various checks (unit tests, FindBugs, and PMD) at the end, and to ensure that all the checks had been performed instead of failing when the first problem was found. My initial use of the antcall task within each sequential task proved problematic, as ANT prevents any property settings from being passed back from the child antcall. So I switched these invocations to ANT macros, which let me set a property for each check that the final parallel-build-result target can use to provide a summary report. Here is a simplified version of the ANT code to accomplish this:

<target name="parallel-build-result" 
	depends="unit-test-result, pmd-result, findbugs-result"/>

<target name="unit-test-result" if="unit.tests.failed">
	<antcall target="unit-test-html-report"/>
	<echo message="Unit tests failed!"/>
</target>

<!-- 
pmd-result and findbugs-result targets omitted 
as they are very similar to the unit-test-result target
-->
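
The idea of running every check, recording each outcome, and reporting only at the end is not specific to ANT. As a rough illustration, here is a minimal Java sketch of the same pattern. It is not the build's actual code: the class name CheckSummary, the runAll method, and the stand-in checks in main are all hypothetical, and the real checks would be the unit tests, FindBugs, and PMD.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch: run all checks concurrently, then summarize. */
final class CheckSummary {

    /** Runs every check and returns name -> passed, never stopping at the first failure. */
    static Map<String, Boolean> runAll(Map<String, Callable<Boolean>> checks) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, checks.size()));
        try {
            // Submit everything first so the checks overlap in time.
            Map<String, Future<Boolean>> pending = new LinkedHashMap<>();
            for (Map.Entry<String, Callable<Boolean>> check : checks.entrySet()) {
                pending.put(check.getKey(), pool.submit(check.getValue()));
            }
            // Then wait for each result, treating a crash as a failed check.
            Map<String, Boolean> results = new LinkedHashMap<>();
            for (Map.Entry<String, Future<Boolean>> entry : pending.entrySet()) {
                boolean passed;
                try {
                    passed = entry.getValue().get();
                } catch (ExecutionException crashed) {
                    passed = false;
                } catch (InterruptedException interrupted) {
                    Thread.currentThread().interrupt();
                    passed = false;
                }
                results.put(entry.getKey(), passed);
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Stand-in checks; the outcomes here are made up for the demo.
        Map<String, Callable<Boolean>> checks = new LinkedHashMap<>();
        checks.put("unit tests", () -> true);
        checks.put("FindBugs", () -> false);
        checks.put("PMD", () -> { throw new IllegalStateException("tool crashed"); });
        runAll(checks).forEach((name, passed) ->
                System.out.println(name + ": " + (passed ? "passed" : "FAILED")));
    }
}
```

The map of results plays the role of the per-check ANT properties: nothing is reported until every check has finished.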
 

Parallel Tests

Analyzing the performance of this new parallel build quickly revealed that it was the automated unit tests that were the bottleneck. Since unit tests are designed to run independently from one another, I was optimistic that I could run the unit tests in a highly parallel fashion and obtain further significant reductions in build time. My expectation was that it should be possible to accomplish this by simple changes to the build script without having to change individual unit tests or manually partition the tests into multiple suites.

When I investigated the options for accomplishing this via ANT and JUnit, I discovered, to my surprise, that this was not explicitly supported. The easiest option was running multiple tests within a single test class concurrently, which is not what I wanted and required modifying existing test code. I found experimental support in JUnit for running tests in parallel, but this was not easily invoked via ANT.

Having hit a brick wall, I considered other technologies. I discovered that TestNG (test framework alternative to JUnit), Maven (build tool), and Gradle (build tool) all provide built-in support for running test suites in parallel. Since I have played with Gradle in the past and like it as an up-and-coming build tool, I decided to give it a try.

The challenge with Gradle was getting it set up and working to replicate enough of the pre-existing ANT build in order to run the tests. Once the tests were running, making them run in parallel was extremely simple, as the following Gradle code snippet shows:

test {
	maxParallelForks = 3
}

However, the performance gains turned out to be far poorer than I was expecting. I observed an improvement of 20%, and only when maxParallelForks was set to 3. At other values, such as 2 or 4, performance was the same as or worse than the non-concurrent version. I ran these tests on a multi-core workstation, so CPU was not the limiting factor. What was going on? Why didn't I see something much closer to a threefold speedup?

Further investigation revealed that others have encountered similar results when running unit tests. Running I/O-heavy tests, such as integration-style tests or web tests, in parallel can yield significant performance improvements because, while one test is stalled waiting for I/O, others can still run. Unit tests, however, typically have no such I/O waits.

Any parallelization effort must factor in the additional effort to launch new concurrent executions. In the case of Gradle, this means spinning up a new Java virtual machine and loading all the necessary classes. This is a lot of additional work duplicated across the parallel executions, which reduces the efficiency of parallelization. (Other technologies, such as Maven, use threads in the same VM to parallelize testing and thus have less start-up overhead, but a much higher risk of contention.) In the case of the unit test suite I was working with, we use a shared Spring test context for some of the tests. As a singleton test fixture, it is set up only once for all the tests when they run sequentially, but when running concurrently in multiple VMs it needs to be initialized in each VM. While our unit test suite is fairly large (~5000 tests), it is also quite fast, so all the start-up overhead quickly dwarfs the speed increase from running in parallel. Given all this, I am fortunate to have seen any performance improvement at all.
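
To make the effect of start-up overhead concrete, here is a back-of-the-envelope model as a small Java sketch. It assumes that each fork pays the full start-up cost (VM launch, class loading, shared fixture), that the forks start side by side, and that the test work divides evenly. The class name ForkOverheadModel and the timings in main are invented for illustration and are not measurements from my build. The model also ignores contention between forks, so it cannot explain why some fork counts were slower than the serial run.

```java
/** Hypothetical model of forked test time; all figures are illustrative assumptions. */
final class ForkOverheadModel {

    /**
     * Estimated wall-clock seconds for the suite split evenly across forks.
     * The start-up cost is paid in every fork, so it is not divided by the fork count.
     */
    static double estimateSeconds(double testSeconds, double startupSeconds, int forks) {
        if (forks < 1) {
            throw new IllegalArgumentException("forks must be at least 1");
        }
        return startupSeconds + testSeconds / forks;
    }

    /** Speedup relative to a single fork under the same assumptions. */
    static double speedup(double testSeconds, double startupSeconds, int forks) {
        return estimateSeconds(testSeconds, startupSeconds, 1)
                / estimateSeconds(testSeconds, startupSeconds, forks);
    }

    public static void main(String[] args) {
        double testSeconds = 30.0;     // assumed time spent actually running tests
        double startupSeconds = 20.0;  // assumed VM, class loading, and fixture cost
        for (int forks = 1; forks <= 4; forks++) {
            System.out.printf("forks=%d estimated=%.1fs speedup=%.2fx%n",
                    forks,
                    estimateSeconds(testSeconds, startupSeconds, forks),
                    speedup(testSeconds, startupSeconds, forks));
        }
    }
}
```

With these made-up numbers, three forks give a speedup of well under 2x rather than 3x, because only the test-running portion shrinks. The larger the fixed start-up share, the flatter the curve.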

Conclusion

Overall I am not satisfied with the current level of support for concurrency in Java build and test technologies. Some of the features I would like to see are:

  • Build scripts already specify dependencies between tasks, so the build tool should be able to run unrelated tasks in parallel without a developer having to spell it out explicitly.
  • All tools involved in builds, such as compilers, static code analysis tools, and test frameworks, should take advantage of concurrency wherever possible in their own work.
  • Testing frameworks should support running test suites concurrently via a combination of multiple threads and multiple VMs, to allow developers control over the level of separation vs overhead.
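
As a rough illustration of the first wish above, here is a minimal Java sketch of how declared dependencies alone can drive parallel execution: each target starts as soon as the targets it depends on have finished, and unrelated targets overlap automatically. This is not how ANT or Gradle is implemented. The DependencyScheduler class and its methods are hypothetical, the target names merely echo the ANT example, and the sketch has no cycle detection or failure handling.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

/** Hypothetical sketch: run targets in parallel using only their declared dependencies. */
final class DependencyScheduler {
    private final Map<String, Runnable> actions = new LinkedHashMap<>();
    private final Map<String, List<String>> dependsOn = new LinkedHashMap<>();
    private final Map<String, CompletableFuture<Void>> scheduled = new HashMap<>();

    /** Declares a target, its action, and the targets that must finish first. */
    void target(String name, Runnable action, String... dependencies) {
        actions.put(name, action);
        dependsOn.put(name, Arrays.asList(dependencies));
    }

    /** Returns a future that completes once the target and all its dependencies have run. */
    CompletableFuture<Void> schedule(String name) {
        CompletableFuture<Void> existing = scheduled.get(name);
        if (existing != null) {
            return existing; // already scheduled, so shared dependencies run only once
        }
        Runnable action = actions.get(name);
        if (action == null) {
            throw new IllegalArgumentException("Unknown target: " + name);
        }
        // Schedule dependencies first (assumes the graph has no cycles).
        CompletableFuture<?>[] prerequisites = dependsOn.get(name).stream()
                .map(this::schedule)
                .toArray(CompletableFuture[]::new);
        CompletableFuture<Void> future =
                CompletableFuture.allOf(prerequisites).thenRunAsync(action);
        scheduled.put(name, future);
        return future;
    }

    public static void main(String[] args) {
        DependencyScheduler build = new DependencyScheduler();
        build.target("compile-unit-tests", () -> System.out.println("compile-unit-tests"));
        build.target("unit-test", () -> System.out.println("unit-test"), "compile-unit-tests");
        build.target("findbugs", () -> System.out.println("findbugs"), "compile-unit-tests");
        build.target("pmd", () -> System.out.println("pmd"), "compile-unit-tests");
        build.target("parallel-build-result", () -> System.out.println("parallel-build-result"),
                "unit-test", "findbugs", "pmd");
        // The three checks may print in any order; the result target always prints last.
        build.schedule("parallel-build-result").join();
    }
}
```

Nothing in the target declarations says "parallel": the overlap falls out of the dependency graph, which is the behaviour I would like build tools to provide by default.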

Given the trend towards more cores rather than faster clock speeds, I believe the use of concurrency in builds is inevitable.

