Continuous Integration

The macro process of object-oriented development is one of "continuous integration." ... At regular intervals, the process of "continuous integration" yields executable releases that grow in functionality at every release. ... It is through these milestones that management can measure progress and quality, and hence anticipate, identify, and then actively attach risks on an ongoing basis. -- GradyBooch, Object-Oriented Analysis and Design with Applications, 2nd ed.

Conventional code management systems use techniques like check-in/check-out to help you be sure you're working on the current version when you edit something. Development teams use CodeOwnership to minimize conflicts among people editing. The longer engineers hold on to modules, the more important it is to minimize conflicts.

What if engineers didn't hold on to modules for more than a moment? What if they made their (correct) change, and presto! everyone's computer instantly had that version of the module?

You wouldn't ever have IntegrationHell, because the system would always be integrated. You wouldn't need CodeOwnership, because there wouldn't be any conflicts to worry about.

Note: You shouldn't release unless the code works. Therefore to have ContinuousIntegration, you should have RelentlessTesting. And if your testing process takes days (and some legitimately may), then ContinuousIntegration may not work well for you since much work must go on in parallel with the testing. Even in such a case, however, it depends on the probability that the class will be edited again before testing completes.

This is not to say that you cannot try ContinuousIntegration without extensive testing. It's just strongly not recommended.

"ContinuousIntegration" is a slogan, not a description. It is deliberately overstated to emphasize that DailyBuilds are not enough.

That said, there are things that make integration hard, and things that make it easier --

MultipleCodeLines?. Doctor, it hurts when I do this. Don't. OneCodeLine?. Always.

Check in conflicts. Smaller pieces reduce the probability of conflicts, so make methods the grain size of CM. -- KentBeck

And yet the description is of "continual integration". The two words are often confused with each other.

Continual means a discrete action/event repeated endlessly, such as the sound made by a playing card stuck in bicycle spokes.

Continuous means a single event that proceeds without even a momentary interruption into the indefinite future, such as the sound made by a stream of water flowing downhill.

Isn't the waterstream sound continual, only with a small and unpredictable interval?

So "continuous integration" is impossible; only "continual integration" is physically possible.

But this can't work...

Please enter here the things that make you think this won't work. Interrogation will help us answer how we make it work.

Having lots of people working on the same class can only lead to chaos, unreliability, and the warm heat death of the universe.

That turns out not to be the case. You have complete UnitTests to test that class, and all the other classes with which it may collaborate. No code is released unless all the UnitTests work.

The guy who edits my class may not be as good as I am.

Partner with him. Both of you will learn something. Besides, what do you mean by "my class"?

Most developers think they are better - how is this possible? ;-) Fine grained objects avoid collisions

What if I'm changing the class and you are, and our changes conflict?

This happens very rarely in practice, especially if you have lots of small classes (which you should). When it does, the person releasing is responsible for resolving conflicts. Our code manager tells us when you are releasing a version which is not based directly on the current release. This means someone slipped in ahead of you. Browse changes, sit with them if need be, resolve and release.

What if you release from your machine and I release from mine and we step on each other's changes?

It hurts when you do that. Don't do that. Consider having a SingleReleasePoint. Go to it, load your changes, test, release. This makes releasing single-threaded, which makes it quite simple to detect conflicts.

Single release machines are good. Although we allow developers to version and release code that they have tested, we don't update a config map and create a package until the code has been tested on a single machine, integrated with all of the other changes. That way, if all is not well, we can recover by reloading the previous map. The new map is created on the same single machine. -- Matt Stephenson

ChryslerComprehensiveCompensation engineers release their code frequently, generally daily. Everyone loads the config frequently, generally more than once a day. The code you are working on is the official released version, all the time.

This works because we use RelentlessTesting to ensure that there are no regressions. Our UnitTests tell us right away if we have broken anything. We run a subset of our AcceptanceTests, called CoffeeTalk, as well. We push code to production weekly or more often. Our AcceptanceTests are run before push to make the final check that we're OK.

The result is that we can go very fast. And we avoid IntegrationHell. See ContinuousIntegrationRelentlessTesting. -- RonJeffries

ContinuousIntegrationWithBigProjects? OR WhenContinuousIntegrationWorks? OR WhenContinuousIntegrationDoesntWork?

Jim, many thanks for your contribution here. You know tons more about BIG projects than I do (or ever hope to). I've tried three times even to formulate questions and responses and not having much luck. More thought needed. I put a couple interleaved requests for clarification below, please check 'em and respond.

I think we may need to split the ContinuousIntegration issues and the CodeOwnership issues apart to tease out the contexts. -- Ron

I've worked both on projects where ContinuousIntegration works and on ones where it fails. The success of this technique has something to do with the complexity of the version space. I also think it may be rooted in an assumption that the most interesting versions are temporal and fade with time, as deltas (user-owned temporal changes) do in a delta/MR/MR list/release hierarchy.

Yes ... the fundamental assumption of CI is that there's only one interesting version, the current one.

If you have a simple delivery stream that easily maps to the source stream, I can see it working fine. But consider 5 different projects working from the same source (where a project corresponds to a corporate revenue stream working on its own schedule). The source can be configured to produce any one of several images. But there are more than 5 images. Because of staged testing (individual, subsystem, integration, and acceptance) there are 4 major versions of each of these 5 images. Each of these versions may contain different changes. And in each of those major versions, there may be slightly different images for each of three processor types. It actually gets more complicated because of separately marketable features (of which there are thousands). There is a possibility of having hundreds of virtual hands in a piece of code at any given time.

When you say hundreds of virtual hands, do you mean "if you don't have ownership" or "even if you do have ownership"?

Such systems are real; I know of many of them. They solve their problems either with code ownership or with onerous methodological techniques that are unbelievably cumbersome--the latter quickly discover CodeOwnership.

To convince you that this doesn't easily lend itself to an architectural solution would take more space than I care to spend here. The simple answer is that feature intrinsically cuts across function and structure, so any simple geometric partitioning is hopeless. Or you can think of AspectOrientedProgramming as an analogy (or metaphor, or something) for the problem, where aspects correspond to features or to deltas that don't fade in time.

It's the fact that feature cuts across function that drives us away from CodeOwnership. If we want to go fast, the person doing the feature needs to add all the function, no matter what class he needs to touch. Then, to avoid complex integration problems, we integrate as close to continuously as possible.

The problem with ContinuousIntegration is manifest in the second paragraph above where the phrase "my version" appears. This is a particularly vicious form of ownership that fails in way analogous to the way Ron sees CodeOwnership failing. Not all versions are owned by individuals. Once they're put back into the version management system, they belong to the project. And because the "project" in this case may be many conventional projects, that must deliver different versions of the same code, ContinuousIntegration of versions doesn't work.

You seem to be saying in the above that the project owns a class, not a code owner. So do we. Editors just pass through: change, test, release. But you are coming out in favor of CodeOwnership here? What am I missing?

I do see that for each moment in time where code goes to customer, you need to fork that class for updates, so the class has N "live" versions. And I see that I'd probably want one developer to determine for all N whether and how his particular change applies to that version. I don't see that the next change needs the same developer (CodeOwnership), and I don't see why version N.2 shouldn't be released to the code base right away (ContinuousIntegration). Please help.

There are places ContinuousIntegration works. I accept that claim. Now, accept my claim that there are also places it doesn't work. And let's differentiate the contexts.

Totally in agreement with teasing out where and why it won't work. You know the big problem space far better than I! See also my new Note in the top part. Thanks -- RonJeffries

-- JimCoplien

I don't see this one as controversial - I've expected daily updates for years. Actually I used to have a backup system which ran nightly and only backed up key parts of the network server, so anything not put under VCS by the end of the day was not backed up. It wasn't acceptable for programmers to have masses of community code that only existed on their machines. It makes me feel uncomfortable just thinking about it.

Maybe it's worth noting that people put stuff under VCS, and other people updated, when it fitted their personal work schedule. In other words I don't see your changes until I want to. It's daily rather than truly continuous. This avoids some of the feeling of quicksand you can get when the ground changes from under you unexpectedly. You have a stable environment for as many hours as you need to nail down your bug.

I tried getting people to send out email to the group when they put stuff back, which both notified that a change had been made and also described briefly what it did. The other guys could then go and play with the new features, give them a bit of ad-hoc testing and say, "That's cool" or "That sucks". The email messages were archived to form a permanent record of daily progress. I don't think this really worked so it got dropped. It might be worth trying again if the group were larger. -- DaveHarris

Dave, can I ask you to explore a bit what you mean by "I don't see your stuff until I want to"?

We did some really cool editor technology here several years back called colored delta editing; it's similar to work done in the same era at IBM by VincentKruskal?. It allowed you to apply "version filters" so you could work in the context of any version point, line, plane, hyperplane or hyper*plane in version space. (Different versions had different colors, and you could apply "color filters". It also supported truly concurrent editing and real-time change notification, as well as other goodies.)

If you work in the context of *all* the versions and see them all the time, you get nothing done but trying to anticipate how to accommodate the stuff flying at you.

If you ignore the changes, then you end playing catch-up when the changes do become visible.

The editor technology we used experimentally minimized the physical dependency problem, but all the deployed systems had the problems of physical dependencies (which happens when there is a contradiction about whether to include or exclude different sets of lines under the influence of pairs of deltas with conflicting insertion/deletion information). This problem becomes even worse when you have physical dependencies--because then you must take the other person's changes.

The most serious problems related to logical dependencies. Even though you don't change my module, you interact with it in subtly different ways. Or you change both something that calls me and something that I call in a consistent way, but if I depend on the semantics instead of just passing the results through, I'm screwed. You can't automatically detect these things: it takes architectural dialogue.

So you don't even get a chance to ever see how the other person's changes affect your code, unless you're looking at all the changes in the system all the time.

If there is code ownership, you can at least go to the person in charge of the module and ask if they are affected. A knowledgable owner can give a quick answer most of the time. In absence of a code owner, everyone has to look at everything. It's rare that everyone knows enough about everything to render snap opinions like this.

You're damned if you do and you're damned if you don't. -- JimCoplien

Ah, I'm beginning to see some things. Help me here, Jim, because we're clearly not damned over here.

In Smalltalk, methods are maybe 8 lines long, and each one is edited alone. Most changes are extensions and by their nature cannot negatively impact anything. Even if they are real changes, in ExtremeProgramming, we have UnitTests that check whether (a) class X works, and (b) all other classes work in the context of class X. The UnitTests run in under 5 minutes, checking everything. (We are only testing around 1000 classes, with only about 20,000 individual checks, but, well, we all know Smalltalk is slow. C++ would probably be a lot faster. ;-> )

This means that when any developer changes a class, she can immediately tell whether she has broken anything in that class, or elsewhere.

In your last riff above, Jim, you refer to people having to "ask if they are affected", and something about "depending on the semantics". What are the things in your world that can't be determined by testing (and yet which would break the application if done)? I would think that if it could break the app, it could be put into a test.

Thanks -- RonJeffries

It sounds like my environment was less complex than Jim's. We were able to give each programmer a complete local copy of the code, so he was never forced to update it by either physical or logical dependencies. Although updating was an all-or-nothing affair; if you took one file you'd usually have to take them all, and people were encouraged to update each file before they first changed it so that they were working on the latest version of it. If cascading updates seemed likely to cause a problem they could skip that step and just merge later.

You're damned if you do and you're damned if you don't. - Quite so. Subtle interactions between work done by different programmers is a problem regardless of ContinuousIntegration. Maybe your question is more about how to write UnitTests for this stuff.

I do think ContinuousIntegration means you are more likely to discover the problem early. If it doesn't work today and it did work yesterday, you can use the VCS to revert to yesterday's code base, add your changes back, look at the logs to see what other people have been doing, go and talk to them about it. They are more likely to remember what they did yesterday then if you wait a week before integrating. With a daily update, you are less likely to get two simultaneous problems interacting.

So there are lots of advantages to integrating often. On the other hand, I need to be able to focus on fixing the bugs I just put in, and only those. One thing at a time. I need to eliminate the extraneous changes created by other people.

The way I manage this is to get my stuff into a decent state before accepting an update. Then I know the new bugs are due to other people's changes and I can concentrate on what they've done. One thing at a time. I alternate between doing real work and merging in the work done by other people.

Updates are frequent, but not truly continuous. They are not allowed to interrupt the real work - nothing is. Real work requires periods of unbroken concentration, between half an hour and a few hours. The exact time varies; when I'm done, I'm done, and then I can look around, see what everyone else is up to, update my code base, read email or whatever. This point will be reached at different times for different programmers, but every programmer should reach it sometime during each day. Frequent, but at a time of his choosing.

Another issue is that an update is fairly time consuming. Say 20 minutes in my (C++) environment? For that reason alone you wouldn't want to do it every hour. You can gain a little bit by postponing several updates and doing them all at once, especially if a single file changes in each update. But you don't gain much. I'd expect DailyIntegration? provides the best balance of forces for quite a fair range of team sizes.

I've rambled; I hope this helps anyway. -- DaveHarris

Does help. I'm putting this down here 'cause I'm not ready to put it above.

How frequently should you integrate?

Don't integrate in mid-task. It can only hurt you then. (actually, it can be okay here if you have a checkpoint+rollback facility in your VCS. See SteveBerczuk's PrivateVersioning pattern)
Do integrate after you complete each task, if practical. This minimizes conflict resolution next time.
If integration takes longer than a coffee break, or requires manual intervention, you'll be tempted to do it less frequently. Ideally, fix the integration process. If you don't, you'll increase conflicts.

Even in C3, ContinuousIntegration isn't continuous. We do try to integrate our individual workspaces after every task, and that can be a very short time, never less often than daily. The idea, however, is that (a) you never want to edit over anyone else's edits, therefore you want to be integrated, and (b) you don't want other guys' edits to get in your way as you work, and (c) the longer you wait to integrate, the harder it is. -- RonJeffries

Like Jim, I too have worked both on projects where ContinuousIntegration works and on ones where it fails. In both sets of cases, although CodeOwnership certainly affected things positively much of the time, it was not required for success. The success of this technique most assuredly has something to do with the complexity of the version space (as Cope noted earlier). More on some of that in a minute.

There are a number of issues arising from the sentiment "What if they made their (correct) change, and presto! everyone's computer instantly had that version of the module?"

The above is probably not really what you want, nor what you are truly doing. If it was, all you need is to have all of you make your changes in the very same workspace (with some versioning and concurrent editing facilities). Even if you managed to work on separate files all the time, this is still not what you usually want to happen. The problem is that others don't get to choose when they are ready to deal with your code. You may make some changes which impact the coding task I'm working on, even if it doesn't touch the same files, and now you've just broken my code. I was perhaps in the middle of some important train of thought and now I do not have the option of finishing my flow first, and then dealing with your changes.

You folks may be doing "daily" or even "hourly" integration (or even more frequently), but it's not quite the same thing as continuous integration because it's still in discrete chunks, and you still retain the choice over precisely when to integrate (pull); its not thrust upon you while your changes are in a volatile state (push). If you have ever used ClearCase and had people selecting the LATEST versions on the codeline while checking them in and out at the same time, then you will run into this problem (courtesy of its Virtual File System). It happens far more often than you would think. And it hurts when it bites you.

Fortunately, you aren't doing true continuous integration, more like "Quantum Integration." The integrated change-quanta are very small and are done very frequently; but it's not instantaneous, and the developer controls when it happens in their workspace (which is a crucial difference).

As for the context in which I've seen ContinuousIntegration succeed... Like Jim said, the version-space is very constrained in terms of the dimensionality exploited. Almost all codelines are along a single dimension, that of delivery (functionality to be delivered in a release - at varying levels of scale. e.g., fix, feature, major/minor release, or patch). And there aren't that many parallel codelines going on at the same time either, even though it's almost all in one dimension. Anywhere from 1-3 codelines is the average (a "recent maintenance" line, a development line for the next release, and possibly early/overlapping development for the subsequent release). I don't think I've ever seen it work well with more than 4 concurrent codelines, or with more than 2-3 dimensions of the version-space being employed at the same time.

-- BradAppleton

All code in every ExtremeTask? has the same owner. Luck is not involved. Clearly a typical change affects at least one method. (Exceptions for change to a class definition with no methods, or const definitions if your language has them. Principle applies anyway.) Release all those methods changed. It is sufficient to release no others. If you release more methods than you change, there is a higher probability of conflict. So don't do that.

Some code managers can't manage below the file level. Is it possible to put one method per file? If not, try one class per file and accept more conflicts. It won't be better, but it might be the best you can do. -- RonJeffries

Many thanks for the clarification! So the logical granule is a single method/function and the physical granule might have to be force-fitted to make the mapping one-to-one. The file-per-function approach is/was very common in C, and in other languages for this very reason. I imagine good factoring and merciless refactoring help ensure a one-to-one mapping between methods and logical-change-tasks (maybe not just of the code, but of the work that creates & modifies the code).

The VcapsProject has had much success with the practices of ContinuousIntegration. A couple of key factors we found were required to be successful:

UnitTests are an absolute gotta-have. No one releases unless they run at 100% !! If they don't, your changes are not going in.
Everyone has CollectiveCodeOwnership for all of the code. That is, each time a pair programmers touch the code, it should get improved, refactored, and simplified
There must be a SingleReleasePoint. NO ONE releases from their own workstation.
As Ron (and others) have stated above, INTEGRATE OFTEN !! The longer you wait to integrate, the more pain you will create.
ContinuousIntegration (as a slogan) is done in our environment (both for VisualWorks code, and GemStone code) in a matter of minutes, usually seconds! Developers integrate multiple times per day in practice. If it takes too long to integrate your changes, then the task unit of work is too big! The pair is responsible for creating their own IntegrationHell, or heaven, whichever they prefer...

A counterpoint to some of the arguments against ContinuousIntegration noted above:

Simply implementing one or two ExtremeProgramming practices without adopting the entire mental model of ExP, is like a jigsaw puzzle with missing pieces (per DonWells). Of course, you can't see the whole picture, and you can find numerous reasons why any one practice by itself seems flawed, won't work for you, and difficult for you to succeed. The components of ExtremeProgramming practices should be seen as an entire methodology (Systems Thinking), each piece facilitates and is facilitated by another. -- JeanineDeGuzman

I understood much of the above to be trying to flesh out the context in which ContinuousIntegration does and doesnt work, not necessarily trying to argue against it. I know from firsthand experience that this practice is not specific to XP because Ive seen it used successfully many times in non-XP environments. Ive also seen it fail in many different environments (also non-XP). So I am confident that the keys to its success transcend XP, and arent specific to the rest of XP. The discussion above that point out potential flaws or weaknesses help serve to identify the forces which drive the pattern and the context in which it can succeed.

Questions about applying ContinuousIntegration

I'm currently advising in a company that has wants to split codelines because one department is sitting at the other end of the world. They are willing to accept IntegrationHell every few weeks. Does anyone have any experiences in getting ContinuousIntegration to work in such a setting?

I work for a company that has been going ContinuousIntegration for years, only we didn't know what it was called. Our VCS tools (wrappers around CVS written in Python) update our 'build tree' upon every release, and kick off a build of the changed code, thus immediately integrating the changes into our product.

Of course, the programmer is expected to have already tested his/her code before releasing it. Alas, UnitTesting is a new concept to us, and thus testing at the programmer's level is basically whatever the programmer thinks of. This has lead to an unfortunate laziness of 'the Testing Department will find any bugs I missed', which often just ain't so. -- MikeKent

As Ron (and others) have stated above, INTEGRATE OFTEN !! The longer you wait to integrate, the more pain you will create.

This is stated as fact, but it is often not true, especially if the system is designed with good interfaces and layering. In fact, if you find integration painful it's a huge smell and sign of excessive coupling. I've developed many features over months and then integrated with everything working fine. Sometimes there are minor tweaks, but more often than not it works. Sometimes it is a huge disaster, but not often enough to go to any extreme.

The usual disaster scenario is integrating separate release streams into the main line. These can't be continuously integrated because there are multiple releases being worked on at the same time. There is no such animal as a current release. There are lots of releases with features that can move in and out of releases because of feature completion/quality, system test time restrictions, and customer demands. Features are often separate from releases, especially if you must practice time boxing to meet schedules.

UnitTests are often not enough to let code back into the tree. It some products you must perform system tests to validate code. UnitTests don't cover CPU time, memory use, priority inversion, etc. These require system tests to flesh out in a real running system responding to actual usage patterns.

Hang on. Does that mean you run system tests on code that hasn't been versioned?

We can make install packages from private builds for tests. We also use feature branches that are completely tested and then merged into the release mainline after testing. Then the release mainline is retested. Then code is merged to any other release mainline that might be interested.

See ContinuousIntegrationPatterns, ContinuousIntegrationApplied, FrequentReleases, IncrementalIntegration, HorizontalStripes, IsXpMostContinuous, UseOneCodeLine, AntHill, CruiseControl, CruiseControlNet, IntegrationGuard, ContinuousIntegrationIsMoreThanVersionControl, ZopePythonContinuousIntegration, DamageControl, CostOfBranching, LuntBuild, TestAutoBuild, ParaBuild?

EditText of this page (last edited December 24, 2005)
FindPage by browsing or searching

This page mirrored in ExtremeProgrammingRoadmap as of April 29, 2006