20. SUMMARY
• Violations are not visible to stakeholders
• Violations have a greater effect on code quality
• Remove a defect/violation only if doing so costs less
• Over a longer timeframe, avoiding/removing
violations dominates, resulting in higher code quality
• Over a shorter timeframe, avoiding/removing defects
dominates, resulting in lower code quality
• The timeframe of a startup should be short
• Rule #1: Always have a working system
• Rule #2: Do not attempt to future-proof your code
• Use the development stack to improve code quality
Editor's Notes
German WWII Tiger Tank at the Bovington Tank Museum in the UK.
You want an automated system to perform some behavior.
Software is the description of that behavior, using specifications (code) that can be processed by the automated system.
Software is an expression of what you want the automated system to do.
Thus, software is an expression of your intention.
== EXTRA ==
A software system is a collection of formal specifications.
A formal specification conforms to a formal specification language:
• well-defined limit on complexity
• unambiguous translation semantics (syntax)
• unambiguous execution semantics
Software is difficult because formal specification languages are cognitively dissonant from spoken languages.
(Many lawyers effectively “program” using spoken languages by attempting to eliminate ambiguity.)
A defect is an inconsistency between your software and your intention.
Defects may cause problems (symptoms) when executing the software:
• crash
• hang
• incorrect output
A symptom is indicative of one or more defects.
A symptom is not itself a defect.
Example: A hang is a symptom. Acquiring locks in an order that can cause a deadlock is a defect.
A violation is an inconsistency between your software and how an “ideal” developer would have written it.
A “violation of best practices”.
Each technology (language, library, etc.) has its own set of best practices.
Best practices are recommended constraints on the use of a technology, constraints that are not part of the definition of the technology.
Violations may cause problems when maintaining the software:
• confusion (you don’t know what the software does)
• misunderstanding (you think you know what the software does, but it does something else)
Example: Acquiring locks in an order that is inconsistent with their total ordering is a violation.
Unlike defects, which may have symptoms, violations are not visible to stakeholders.
== EXTRA ==
Violations may be subjective (and contentious), since they may simply be a matter of style; e.g., “the nested statement of an ‘if’ statement must be a block”.
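The lock example can be made concrete. The following is a minimal Python sketch, not taken from the talk: the `OrderedLock` class, its `rank` field, and the `acquire_in_order` helper are hypothetical names chosen for illustration. Each lock gets a rank (the total ordering), and the helper always acquires locks in ascending rank, so two threads that name the same locks in opposite orders cannot deadlock on them.

```python
import threading
from contextlib import ExitStack, contextmanager


class OrderedLock:
    """A lock with a fixed rank; the ranks define the total ordering (hypothetical helper)."""

    def __init__(self, rank):
        self.rank = rank
        self._lock = threading.Lock()

    def __enter__(self):
        self._lock.acquire()
        return self

    def __exit__(self, exc_type, exc, tb):
        self._lock.release()
        return False


@contextmanager
def acquire_in_order(*locks):
    """Acquire all locks in ascending rank, whatever order the caller listed them in."""
    with ExitStack() as stack:
        for lock in sorted(locks, key=lambda item: item.rank):
            stack.enter_context(lock)
        # ExitStack releases the locks in reverse order on exit.
        yield


accounts = OrderedLock(rank=1)
audit_log = OrderedLock(rank=2)
completed = []


def worker(name, first, second):
    # Each worker names the locks in a different order; the helper normalizes it.
    for _ in range(1000):
        with acquire_in_order(first, second):
            pass
    completed.append(name)


t1 = threading.Thread(target=worker, args=("t1", accounts, audit_log))
t2 = threading.Thread(target=worker, args=("t2", audit_log, accounts))
t1.start()
t2.start()
t1.join()
t2.join()
```

Code that bypasses the helper and acquires `audit_log` before `accounts` would be a violation even if it never happens to deadlock; if another thread takes the same locks in rank order, that same code is also a defect.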
Code quality is determined by the combined lack of defects and violations (maximize the green area).
A defect is not necessarily a violation, and vice versa.
Code quality is considered worse when the combined severity of defects and violations is higher.
Code quality is considered better when the combined severity of defects and violations is lower.
Code quality is determined by the severity of defects and violations, not the number of defects and violations.
The severity of a defect or violation corresponds to its maximum potential cost, not its actual cost.
== EXTRA ==
Messy software can be free of defects (but probably isn’t).
Buggy software can be free of violations (but probably isn’t).
Virtually no software system is free of defects.
Virtually no software system is free of violations.
Pseudo-functions:
• quantities are not all independent
• quantities are subjective estimates
Following processes that help to avoid introducing defects has an operational cost.
Example: TDD.
Following processes that help to avoid introducing violations has an operational cost.
Example: Adhering to naming conventions.
Processes that help to avoid introducing violations theoretically also help to avoid introducing defects; e.g., “define a total ordering for locks, and acquire locks according to the total ordering”.
Removing a defect has a cost, which can be high. (Think Dr. House.)
• detect the symptom
• diagnose the probable causes of the symptom
• identify the defect (or defects)
• remove the defect (or defects)
Removing a violation has a cost.
Removing a defect or a violation is a code modification.
Any code modification can potentially introduce defects and violations.
The presence of violations increases the probability of introducing defects and violations.
The presence of a defect has a cost only if it causes a problem when executing the software.
The presence of a violation has a cost only if it causes a problem when maintaining the software.
A symptom of a defect has a cost.
== EXTRA ==
c(S_d) = ∑ p(s_i) * c(s_i), each s_i caused by d
c_a(D_v) = ∑ p_a(d_i) * c(d_i), each d_i introduced due to v
c_a(V_v) = ∑ p_a(v_i) * c(v_i), each v_i introduced due to v
c_r(D_D) = ∑ p_r(d_i) * c(d_i), each d_i avoided
c_r(D_V) = ∑ p_r(d_i) * c(d_i), each d_i avoided
c_r(V_V) = ∑ p_r(v_i) * c(v_i), each v_i avoided
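Every pseudo-function in these notes has the same shape: a sum of probability times cost. A minimal Python sketch of that shape follows. The function name `expected_cost` and all of the numbers are made up for illustration, and, as the notes say, the inputs are subjective estimates rather than measurements.

```python
def expected_cost(outcomes):
    """Sum probability * cost over (probability, cost) pairs."""
    return sum(probability * cost for probability, cost in outcomes)


# Hypothetical symptoms of one defect: (probability, cost in engineer-hours).
symptoms_of_defect = [
    (0.10, 40.0),   # occasional crash
    (0.02, 200.0),  # rare data corruption
]

cost_of_defect = expected_cost(symptoms_of_defect)
```

The same function applies to defects or violations introduced by a violation, and to defects or violations avoided by a process; only the list of outcomes changes.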
Everything has a cost.
• following processes that help to avoid introducing defects/violations in the code
• removing defects/violations from the code
• leaving defects/violations in the code
Costs are ultimately:
• the engineering effort required to avoid introducing defects/violations
• the engineering effort required to remove defects/violations
• the problems caused by the symptoms of defects
Removing a defect or a violation is a code modification.
Any code modification can potentially introduce defects and violations.
The presence of violations increases the probability of introducing defects and violations.
The maximum potential cost of a defect is limited by the worst combination of problems that can be caused by the symptoms of the defect; this cost can be high.
The maximum potential cost of a violation is theoretically unlimited (though intuitively moderated), since the number of times the relevant code can be modified is unlimited.
Consequently, violations have a disproportionate effect on code quality.
However, the actual cost of a violation is limited by an engineering timeframe, which limits the number of times the relevant code can be modified.
Engineering is the maximizing of net value over a timeframe with respect to building and maintaining something.
Question: Should a startup be targeting a longer timeframe or a shorter timeframe?
Processes to avoid introducing defects should be followed if the cost of the processes is less than the cost of defects that would otherwise be introduced.
A defect should be removed if the cost of removing the defect is less than the cost of the defect.
Over a longer timeframe, the cost of a defect is lower.
• the cost of the defect diminishes with respect to cumulative functionality
Over a shorter timeframe, the cost of a defect is higher.
• the cost of the defect is high with respect to initial functionality
Processes to avoid introducing violations should be followed if the cost of the processes is less than the cost of violations that would otherwise be introduced.
• When learning a technology, the cost of the processes can be high.
A violation should be removed if the cost of removing the violation is less than the cost of the violation.
Over a longer timeframe, the cost of a violation is higher.
• usually remove violations
• more likely to follow processes to avoid introducing violations
Over a shorter timeframe, the cost of a violation is lower, even though it has the same effect on code quality.
• seldom remove violations
• less likely to follow processes to avoid introducing violations
Over a longer timeframe, violations are more of a consideration than defects. This results in higher code quality.
Over a shorter timeframe, defects are more of a consideration than violations, even though this results in lower code quality because violations have a disproportionate effect on code quality.
== EXTRA ==
Architecting, on the other hand, is the maximizing of experience with respect to building and maintaining something.
• Architecting: Determines what should be built in order to maximize experience.
• Engineering: Determines how it should be built in order to maximize net value.
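The removal rule and the effect of the timeframe can be shown with a small worked example. This Python sketch rests on a loudly simplified assumption that is not in the notes: the cost of leaving a violation in place grows linearly with the number of times the relevant code is modified within the timeframe. The function names and every number are hypothetical.

```python
def violation_cost(modifications, p_problem, cost_per_problem):
    """Expected cost of leaving a violation in place (assumed linear model)."""
    return modifications * p_problem * cost_per_problem


def should_remove(removal_cost, cost_of_leaving):
    """Rule from the notes: remove only if removing costs less than leaving."""
    return removal_cost < cost_of_leaving


REMOVAL_COST = 6.0      # hypothetical, engineer-hours
P_PROBLEM = 0.2         # hypothetical chance that a modification trips over the violation
COST_PER_PROBLEM = 3.0  # hypothetical, engineer-hours

# Shorter timeframe: the relevant code is modified only a few times.
short_cost = violation_cost(4, P_PROBLEM, COST_PER_PROBLEM)
# Longer timeframe: the same code is modified many more times.
long_cost = violation_cost(40, P_PROBLEM, COST_PER_PROBLEM)

remove_in_short_timeframe = should_remove(REMOVAL_COST, short_cost)
remove_in_long_timeframe = should_remove(REMOVAL_COST, long_cost)
```

With these numbers the violation stays over the shorter timeframe and is removed over the longer one, even though the violation itself is identical in both cases.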
Working code corresponds to a lack of defects.
Clean code corresponds to a lack of violations.
It’s probably safe to assume the scale of both axes is logarithmic.
Avoiding defects/violations, and removing defects/violations, both contribute to engineering effort.
If you do not have a working system, you might as well not have a system.
Your stakeholders don’t care how hard you worked on almost finishing a feature.
Over-engineering occurs when the engineering effort goes beyond what is necessary for the timeframe, resulting in a suboptimal net value.
Return to the Tiger Tank.
• German WWII tanks were superior to Russian WWII tanks.
• German WWII tanks were over-engineered because they didn’t respect the timeframe; they could not be manufactured quickly enough, thus reducing their net value.
With respect to features:
• Concentrate on the immediate need.
• The later need will probably change and you will have reduced net value by wasting engineering effort.
• In particular, this is a problem with engineering for performance before testing the performance, or engineering for scale before needing to scale.
With respect to defects/violations:
• Concentrate on the shorter timeframe.
• The longer timeframe will probably change and you will have reduced net value by wasting engineering effort.
• Follow processes to avoid introducing defects/violations only if the cost of doing so is less than the cost of defects/violations that would otherwise be introduced.
• Remove a defect/violation only if the cost of doing so is less than the cost of the defect/violation.
• If your code works but your code quality is shit, get over it.
• If your code is pretty, you are spending too much time doing things that stakeholders don’t care about.
• Quit dicking around with making the code pretty, because you’re going to break it (see rule #1).
Problem: You don’t want your code quality to be shit (as shitty).
Eventually (hopefully), the cost of violations will go up as you consider longer timeframes.
Eventually (hopefully), you will be in a position to sell and your code quality will become part of your company’s valuation.
Solution: Improve code quality by using techniques that have low cost.
The amortized cost must be lower than the cost of violations, which is already low for the shorter timeframe.
• familiarity with best practices
• discipline
• automation with low configuration and maintenance costs
Add infrastructure to your development stack in order to encourage discipline and to automate.
Know and understand the best practices for relevant technologies.
Your familiarity with the best practices of a technology should be proportional to your use of the technology.
Learn and apply best practices as you move forward.
Do not go backward to apply newly learned best practices.
Sometimes it is cheaper to hire.
Be disciplined about following a software development process.
A software development process aids coordination among developers and helps to eliminate confusion and consequential mistakes.
If you do not currently follow a software development process, start. Now.
Integrate the software development process with your business process.
I recommend Scrum for development teams of 5-7 people.
I recommend a lean version of Scrum for development teams of 3-5 people.
I recommend “whatever works” for development teams of 1-2 people.
The best way to improve your code quality is to use someone else’s code.
Prefer an existing technology when available.
• check the license
• check the community
The time to find+learn+integrate an existing technology is usually less than building it yourself.
Even if the time is greater, an existing technology is likely to be more robust and will thus have a lower net cost.
Clean up your mess.
Apply best practices to existing code while writing related new code.
Large changes can be implemented piecemeal, or in rare cases using an experimental source branch (with the help of version control).
Confidence in refactoring is bolstered by tests; the more tests, the higher the confidence.
Immediately kill a new hire who proposes sweeping changes to the code.
== EXTRA ==
A rewrite of working software is almost never necessary.
A rewrite is necessary only when the average cost of a modification is a net loss.
If you do not currently use a version control system, start. Now.
Version control facilitates rolling back unwanted modifications.
Version control aids coordination among developers and helps to eliminate confusion and consequential mistakes.
Prefer a hosted solution:
• outstanding availability (even Pingdom uses GitHub; can even work offline with Git)
• better security and privacy than you can probably maintain
• transparent backups
I recommend GitHub. Start by splurging for a low-end paid account.
Perform analyses of your code.
Many freeware tools exist, particularly in the Java space.
• If you write Java, you should at least use FindBugs.
Static analysis (analyzing the formal specifications that comprise the software system):
• prefer tools that attempt to find serious violations (that is, violations that are likely defects or are likely to cause defects)
• style checkers can be difficult to configure and may produce many false positives
Dynamic analysis (analyzing the processing of the formal specifications that comprise the software system):
• testing suite (at least unit testing and regression testing)
• testing code coverage
• concurrency profiling
• performance profiling (especially for mobile apps)
Automate your build.
Build management also automates analyses that can find defects and serious violations.
Local builds (using tools such as make, Ant, or Maven):
• include “fast” analyses
• exclude “slow” analyses
• require before commit to version control
Remote builds (continuous integration, using tools such as CruiseControl, Hudson, or Continuum):
• include all analyses
• require after commit to version control
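The split between local and remote builds amounts to a small selection rule. The Python sketch below illustrates it under stated assumptions: the `Analysis` record, the `analyses_for` function, and the choice of which analyses count as “slow” are all hypothetical, and a real project would express this in its build tool's own configuration instead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Analysis:
    name: str
    slow: bool  # which analyses are "slow" is a per-project judgment call


# Hypothetical catalog; the fast/slow labels are assumptions for illustration.
ANALYSES = [
    Analysis("unit tests", slow=False),
    Analysis("static analysis", slow=False),
    Analysis("regression tests", slow=True),
    Analysis("code coverage", slow=True),
]


def analyses_for(build):
    """Return the names of the analyses to run for a 'local' or 'remote' build."""
    if build == "local":
        # Run before commit: fast analyses only, so developers actually run them.
        return [a.name for a in ANALYSES if not a.slow]
    if build == "remote":
        # Run after commit on the integration server: everything.
        return [a.name for a in ANALYSES]
    raise ValueError(f"unknown build kind: {build!r}")


local_checks = analyses_for("local")
remote_checks = analyses_for("remote")
```

Keeping the local set fast is what makes “require before commit” a cheap discipline; the remote build then catches whatever the slow analyses would have found.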
Be disciplined about recording symptoms of defects and other issues.
Issue tracking centralizes knowledge of problems and even requests.
Issue tracking aids coordination among all personnel and helps to eliminate confusion and consequential mistakes.
Train non-development personnel to use the issue tracking system.
Limit the issue tracking system to the software system (and related artifacts, like user documentation).
• The issue tracking system is not the place to record that the build server is too slow.
Integrate the issue tracking system with the planning tool of your software process, if possible.