Code wants to be simple. If you are aware of CodeSmells, and duplicate code is one of the strongest, and you react accordingly, your systems will get simpler. When I began working in this style, I had to give up the idea that I had the perfect vision of the system to which the system had to conform. Instead, I had to accept that I was only the vehicle for the system expressing its own desire for simplicity. My vision could shape initial direction, and my attention to the desires of the code could affect how quickly and how well the system found its desired shape, but the system is riding me much more than I am riding the system. -- KentBeck, feeling mystical, see MysticalProgramming
Refactoring is the moving of units of functionality from one place to another in your program. Refactoring has as a primary objective getting each piece of functionality to exist in exactly one place in the software. -- RonJeffries
It's not OAOO, and this comment probably ought to be somewhere else, but doesn't refactoring also cover replacing one piece of code with another, simpler piece of code that has the same external "appearance" and function?
OnceAndOnlyOnce is a profound concept, but difficult to apply. I've spent my entire professional life (25 years) learning how to apply it to programs. This page [many versions ago] ... was rewritten to make OnceAndOnlyOnce seem like a simple rule to apply, instead of a prime principle. OnceAndOnlyOnce is NOT easy! And it was wrong to refactor this page so that all hints of tension and disagreement are removed from it.
OnceAndOnlyOnce is not a pattern. A pattern is something you can teach someone to do in a fairly short amount of time. A day, usually. Perhaps a few weeks. But learning how to refactor classes to form a TemplateMethod does not help you see how to use XML to represent your user interfaces (a recent OnceAndOnlyOnce technique applied to Squeak), or how to make a good virtual machine. These are patterns; OnceAndOnlyOnce is not a pattern. OnceAndOnlyOnce is a principle. -- RalphJohnson
Well said. OnceAndOnlyOnce is not just a simple rule, but a (not the) core goal of all software design. It's why functions were invented. Remember that your program could have been written as a single long function using only ifs, whiles, and try/catch blocks for flow control, and primitives for all the data. Consider what that would look like. For "hello world", it's the default.
I once saw Beck declare two patches of almost completely different code to be "duplication", change them so that they WERE duplication, and then remove the newly inserted duplication to come up with something obviously better. -- RonJeffries, from the XpMailingList
In the slides for XpImmersion, RobertMartin mentions parallel inheritance hierarchies as an example of OnceAndOnlyOnce. I find it to be one of the hardest repetitions to refactor away, though. Does anyone have any hot tips? -- JohannesBrodwall
There are two ways to go. To remove the parallel: refactor either or both hierarchies until their members are congruent, then collapse pairwise. To remove duplication between the parallels: define distinct responsibilities refined by each hierarchy and relocate methods as appropriate. -- WardCunningham
This all became a little abstract for me. What about an example? Say I have an accounting and logistics system. I have an abstract class Account and an abstract class Transaction. Each transaction transfers something from one account to another. Now, I have several types of accounts: SecuritiesAccount, BondsAccount, MoneyAccount, etc. For each type of account there is a transaction: SecuritiesTransaction, BondsTransaction, MoneyTransaction, etc. Now what? -- JohannesBrodwall
To solve your dillema, try building the abstracts Transaction and Account. If you find something abstract enough to put there, you then might find something to reuse between different types of Accounts and Transactions. If not, it might be that those are just abstractions with no functional meaning for your context of usage. Another idea (some get it as an abuse, some as a smart oop move) is to move from hierarchy/inheritance design to aggregate/composition design, in your case, refactor to Transaction, Account and Transactionable (or Accountable) and descend from Transactionable to Money, Bond, etc. This way all your similarities (read code duplicates) should cut up to Transactionable, leaving Account and Transaction to deal only with the abstract behaviour/data that their names imply. The con is that you will have to invent notions to tie Transactionable to Account/Transaction together. Actually, that's the point where abstraction meets creation, but that's another story (see AbstractionAddiction, TooMuchAbstraction, ParallelInheritanceHierarchies). -- CosminApreutesei
A good example indeed. My experience on WyCash was that the prevailing domain classification exerted way too much influence over our initial hierarchies as it may have yours too. We couldn't merge the two hierarchies because Transaction and Account instances have different lifetimes. Instead, as per the second choice above, we focused on redistribution of responsibilities which turned out as follows.
Transactions -- long lived private factual information Accounts -- organizational structure related to reporting Calculators -- industry accepted analytic components Advancers -- mechanics of interpreting transactions
We were stunned at the discovery of advancers two years into maintenance. We never would have gotten this far without a long term commitment to aggressive refactoring. -- WardCunningham (See WhatIsAnAdvancer)
OnceAndOnlyOnce sounds like a nice principle but, when taken at face value, would lead to untested code. Even XP advocates stating each fact at least twice, preferably three times.
The three places where are fact is stated are:
This is a good point, Dave. I have another rule I use when information must be duplicated: When you must duplicate information make sure you will automatically detect if the duplicated information falls out of sync. Tests do this implicitly, and by definition. There are other common cases where the duplicated information is not self-testing, though. Assertions are a classic tool for handling this kind of problem. -- CurtisBartley
This point is valid only if the code is perfect. Otherwise the two types of test add information to the overall system, and hence are not duplicating knowledge. Also, to be fair, what you're critiquing here is not really OAOO (which is about code refactoring), but DontRepeatYourself, which is wider ranging. -- DavidThomas
The OAOO principle in XP refers specifically to the program. The program should express each idea once and only once - there should be no duplicate code.
Further, comparing the code:
square(NUMBER x) { return x*x; }with the test:
assertEquals(4, square(2)); assertEquals(9, square(3)); assertEquals(4, square(-2)); assertEquals(9, square(-3));we see no duplication of fact (though the test could be optimized OAOO-wise). As for duplication of tests of specific things, there's no inherent objection, except, of course, that you have to find and change all the places when a change is needed. -- RonJeffries
I like the idea of OAOO. However, much of my work is RDBMS based (Oracle), meaning that I write a fair bit of SQL. Sometimes, the simplest way to get the data you need is to throw everything into one big (possibly ugly) SQL statement. The database optimizer then works out the best way to get what you need. If the data changes, it does it a different way. The problem is that if I want to encode something once and only once, that means I have to break up these large SQL statements and do things procedurally instead (PL/SQL). This means that the code looks good but runs like a dog and all that money spent on the database and its optimizer is as good as wasted. Thoughts anyone? -- ChrisRimmer
Anything stopping you from building up the massive SQL statement from the decomposed procedures, and running the optimizer on that?
Yes. I think you've misunderstood the problem. The point about SQL is that you tell it what you want, not how to do it. These procedures actually return data, not a fragment of SQL. So there's no way to combine them to produce some big SQL statement.
The answer is, I believe, to use database views to encapsulate multi-table relationships. I had already pretty come to this view (pardon the pun) when I read this article by Martin Fowler:
http://www.martinfowler.com/articles/dblogic.html
-- ChrisRimmer
You won't get much direct benefit to performance by practicing OAOO. The benefit comes in reducing complexity and therefore increasing understanding/maintainability/extensibility. In the statement above, "one big (possibly ugly) SQL statement" can translate to one big complex, off-putting statement. It may run well, but the next guy that sees it, even if it is you one month from now, may find it sooo hard to understand that he shies away from it, or spends 5 times the effort in understanding and modifying it as would have been necessary for a few separate procedures. -- Jeff Santini
I have found that I usually get a big performance benefit from this practice; as the refactoring continues and code is isolated and shared, the processing pattern of higher level application layers becomes clearer, function pointers start coming into play (for those of us who did a lot of C and assembler) which makes these higher layers of code properly use polymorphic function/method definitions. Redundant decision making/branching is always reduced or eliminated by doing it. -- Grant Wesley Parks
I didn't expect to get a benefit to performance. I was looking for the kind of maintenance gains you talk about. However, what I don't want is a thousand-fold decrease in performance! This is quite possible in the situation I describe. See my comments above about the use of views. -- ChrisRimmer
I don't get it: you think views solve the problem of the execution plan or not? I think they do, with all the benefits of the OAOO. -- CosminApreutesei.
OAOO is once and only once of human input. If a copy can be regenerated without a human then it does not violate OAOO. Consequently CodeGeneration, for example, does not violate OAOO. Any automated duplication does not violate OAOO. For example, automatically generated program documentation does not violate OAOO even if it contains many duplicates of many pieces of code, as there are two different aspects - OAOO of human input (PrimaryInformation) and OAOO of computer generated (SecondaryInformation). -- AlekseyPavlichenko
OAOO of PrimaryInformation doesn't care about space or runtime performance efficiency; OAOO of human input is what helps produce better code.
OAOO of SecondaryInformation affects runtime execution performance or space, but doesn't affect programmer performance. For example, use of C++ templates can easily result in duplicated code, which is not a violation of OAOO because it is code that would not, under normal circumstances, be seen by (or even accessible to) a human.
See also DontRepeatYourself, RedundancyIsInertia, ExtremeNormalForm, WikiPagesAboutRefactoring, CopyAndPasteProgramming, DuplicationRefactoringThreshold
This page mirrored in ExtremeProgrammingRoadmap as of April 29, 2006