Small Business Programming

Learn how to be a wildly successful small business programmer

With great power comes great responsibility

I recently got into a discussion in the comments section of someone else’s blog where I argued that many software developers are overly confident in their abilities. I further argued that this overconfidence leads to the kind of behavior that would be considered professional malpractice if a lawyer, doctor, or professional engineer did something similar in their respective field.

A fellow software developer expressed an opposing view. He made the following points:

  • only a small percentage of software can cause as much damage as medical or legal malpractice, and that software is already highly regulated
  • if we stop people from pursuing their interests it will stifle innovation, which he was very much against

I hear variations on this view quite often and I think it is worth exploring.

Software as a force for good

Software has enabled our modern world. We can communicate with anyone virtually anywhere in the world for free or very low cost. It puts the world’s knowledge at our fingertips. It multiplies our productivity, manages our supply chains, opens up new markets, and keeps track of our money. We use software to discover new science, improve our health, and fight disease. Plus, software reduces the cost of just about everything.

And we, the software developers, are right in the middle of it and it’s glorious!

But we do have some problems

Let me paint a picture for you:

  • the industry average is 15-50 errors per KLOC for delivered software (that’s one error every 20-67 lines of code on average!)1
  • only 29% of projects completed in 2015 were considered successful (on time, on budget, with a satisfactory result). 19% of projects outright failed and the rest were somewhere in between2
  • software failures cost hundreds of billions of dollars EACH YEAR3
  • 90% of mobile health and finance apps are vulnerable to critical security risks4
  • consumer routers. Need I say more?5

Do you see what I’m getting at here? As a profession, we’re not exactly knocking it out of the park on every at bat.

Software developed for unregulated environments can’t cause that much damage. Really?

I don’t think that’s a defensible position. We (software developers) are creating plenty of harm in unregulated industries through our mistakes and negligent practices.

While our software probably isn’t directly responsible for killing people very often, I have no doubt we are indirectly responsible for countless deaths. Plus we enable property damage, theft of personal data and intellectual property, financial ruin, lost productivity, privacy violations, voyeurism, stalking, blackmail, discrimination, loss of reputation, interference in elections, espionage, and all kinds of criminal enterprises.

I can come up with links if you don’t believe me, or you can just take a look at the thousands and thousands of entries in the ACM’s Risks Digest database; even a quick skim of recent issues turns up plenty of examples.

I purposely chose examples from unregulated industries to illustrate a point. We don’t have to build drones with guns mounted on them or faulty autopilots that fly planes into mountains to hurt people and cause serious damage.

I know we kind of expect software to be terrible but keep in mind that none of these things are supposed to happen if we are doing our jobs correctly!

Evading responsibility

I expect that someone will want to split hairs in the comments about email not being secure and about it not being the programmers’ fault that someone broke into a real estate agent’s email account and impersonated him because his password was “password123456”. And that might be true if you’re looking at an individual developer. But we (software developers) know how people are using email, and we know better than anyone that it’s not secure, yet we’re doing very little about it.

We can also consider another plausible scenario. Perhaps the real estate agent created an account on some harmless website. Perhaps the website didn’t store its users’ passwords securely. Further imagine that a copy of the website’s database ended up in the wrong hands, and the criminals either read the clear text passwords straight from the database or broke the unsalted MD5 hashes and recovered the password that the real estate agent used for all of his accounts.

Here, again, we know people re-use passwords and we should know better than to store them in clear text or as unsalted hashes, even if our website doesn’t contain particularly sensitive information. But this could never happen, right?
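To make the fix concrete, here is a minimal sketch of salted, iterated password hashing using only Python’s standard library. The 16-byte salt and 600,000 PBKDF2 iterations are illustrative choices, not a universal recommendation:

```python
import hashlib
import hmac
import os

def hash_password(password: str) -> bytes:
    # A random per-user salt defeats precomputed (rainbow table) attacks,
    # and PBKDF2's iteration count slows down brute-force attempts.
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 600_000)
    return salt + digest  # store the salt alongside the hash

def verify_password(password: str, stored: bytes) -> bool:
    salt, digest = stored[:16], stored[16:]
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 600_000)
    return hmac.compare_digest(candidate, digest)  # constant-time comparison
```

Even this sketch is strictly better than an unsalted MD5 hash: identical passwords no longer produce identical database rows, and each guess costs the attacker 600,000 hash iterations instead of one.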

The software we create is powerful, which means it can also be dangerous. So, you need to be thinking about security and the unintended consequences of any software you produce. Security isn’t just for sensitive projects.

Stifling innovation?

The claim here is that I somehow want to stop new people and ideas from entering the field and stifle innovation. I haven’t proposed any actual action and I have no power in the industry so I’d say that my power to stifle innovation is pretty minimal.

But let’s say I did propose something for the sake of argument. Maybe you can’t work on software where you’d have access to personal information if you’ve been convicted of identity theft or computer crimes. Is that an unreasonable rule where innovation is concerned?

How about this one: if you want to work with money or personal information in a system that’s connected to the Internet, you have to pass a simple test. Passing the test would show you have a basic grasp of security principles (TLS, password hashing, SQL injection, cross-site scripting, maybe that it’s a bad idea to post your private encryption keys on your blog, etc.). And let’s throw in some ethics while we’re at it. Unreasonable?
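One of the principles such a test might cover, SQL injection, fits in a few lines. This is a hedged sketch using Python’s built-in sqlite3 module; the users table and its columns are invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, email TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'alice@example.com')")

def find_user_unsafe(name: str):
    # VULNERABLE: user input is spliced into the SQL string, so an input
    # like "' OR '1'='1" changes the meaning of the query itself.
    return conn.execute(f"SELECT * FROM users WHERE name = '{name}'").fetchall()

def find_user_safe(name: str):
    # Safe: a placeholder lets the driver treat the input purely as data.
    return conn.execute("SELECT * FROM users WHERE name = ?", (name,)).fetchall()
```

A developer who can explain why the first function returns every row when fed `' OR '1'='1` has the basic grasp the test would be looking for.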

I can’t think of any reason why a person capable of genuine innovation in any area of software development would have any trouble demonstrating his or her basic competency as a programmer. Nor do I believe a reasonable testing process would stifle innovation.

Unleashing innovation

What if keeping certain people out of software development increases innovation? We know there are huge differences between the productivity of the best and the worst developers–5-28 times by some estimates6.

So what if we had some licensing mechanism to keep the worst of the worst from becoming professional software developers? Instead of having them bounce from job to job causing chaos wherever they go, maybe we could create our own version of the bar exam, but set the bar really low. The mechanism and details aren’t important right now; just imagine if you never had to deal with another net-negative-producing programmer for the rest of your career. How much more could you accomplish?

Wrapping up

Pretty much every jurisdiction in the world requires people to demonstrate basic competency to drive a car. While cars are incredibly useful machines, they can also be incredibly dangerous. Ensuring the basic competency of all drivers is in the public interest. Likewise, I’d argue that ensuring the basic competency of computer programmers is also in the public interest for similar reasons.

Whether you agree with my view or not, licensing software developers is not going to happen any time soon. So, go build the next amazing thing! Please just keep in mind that any software worth building can also cause harm. So, I hope you’ll skim the Risks Digest and maybe think about the specific risks of your software and how you can minimize them.

With great power comes great responsibility.

What do you think? Agree or disagree? I’d love to hear your thoughts.


  1. Code Complete (Steve McConnell) page 610. If you have a 100,000 line application it will contain 1,500-5,000 errors! 
  2. Standish Group 2015 Chaos Report 
  3. The estimates are all over the map but they are all HUGE. See Software failures cost $1.1 trillion in 2016 and Worldwide cost of IT failure: $3 trillion 
  4. 90% of mobile health and finance apps are vulnerable to critical security risks 
  5. The quality of consumer routers is brutal in just about every way imaginable. And people can do bad things to you if they break into your router, really bad things 
  6. Facts and Fallacies of Software Engineering (Robert Glass) Fact 2. But I’d actually argue that the spread is infinite. I’ve met programmers who couldn’t solve a toy question in a screening interview. So their productivity is going to be zero (or negative), which makes the spread infinite. 

Worthless software: risks and prevention

We (software developers) write astounding amounts of worthless software. I find it hard to fathom just how much junk we are producing. As someone who spends a fair bit of time thinking and writing about effectiveness and how to be a 10x programmer, I believe we have lots of room for improvement here. In this post, I’m going to examine the problem of worthless software and what you can do about it.

How much worthless software are we producing?

Tons. The Standish Group 2015 Chaos Report concluded that only 29% of projects completed in 2015 were considered successful (on time, on budget, with a satisfactory result). 19% of projects outright failed and the rest were somewhere in between. Ouch. But project failures aren’t the only kind of worthless software.

Features users don’t use or care about

None of the Chaos Report numbers take into account all the stupid, useless, rarely used, and unused features in successful projects. (I’m looking at you, Clippy.) But seriously, look through the menus in Microsoft Word or Excel. Do you even know what half of those things do? I don’t. Can you imagine being the Microsoft developer who maintains the kerning feature in Word? It’s hard to get a good handle on the number of features that fall into this category but I feel confident saying that it is well above 25%.

Features that cost too much

And then there are the successful features within a successful project that cost way too much to deliver and maintain. I don’t have a statistic for you on these costly software engineering failures but I think we can agree this is the rule more often than the exception.

Unneeded code

Plus, you have all kinds of code that doesn’t need to be in the project at all. I’m talking about dead code, automated tests for said dead code, duplicated code, over-engineered code, piles of code that can be replaced by one or two standard library calls, etc.
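As a tiny illustration of that last category, here is a hypothetical hand-rolled word counter next to the one standard-library call that replaces it (Python’s collections.Counter); the function names are invented for this example:

```python
from collections import Counter

def word_counts_hand_rolled(words):
    # Dead weight: a loop, a branch, and a unit test to maintain forever...
    counts = {}
    for word in words:
        if word in counts:
            counts[word] += 1
        else:
            counts[word] = 1
    return counts

def word_counts(words):
    # ...replaced by one standard-library call that says the same thing.
    return dict(Counter(words))
```

Every line you delete this way is a line nobody has to read, review, test, or port ever again.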

Nonviable business model

We must also count all the software produced by failed startups. The vast majority of that software dies with the startup that wrote it.

But it gets worse, so much worse

You don’t just pay for worthless software when you create it. You pay for it in each phase of the development process forever. So you might look at a feature that you wrote that turned out to be worthless and think it took 100 hours to implement, no big deal, right? Wrong.

Measuring the true cost of worthless software

If your team is like my team, you record your time spent directly building, testing, integrating, and releasing a feature but you don’t count the cost of:

  • the initial discussion and investigation of the feature
  • all the time you spend talking about it in meetings
  • reporting on its progress
  • reading the code when maintaining other parts of the system
  • coupling other parts of the system to the feature (while you attempt to keep your software DRY)
  • people leaving your team because your software is a bloated mess and a nightmare to work on
  • hiring new programmers
  • bringing new programmers up to speed
  • upgrading your code to deal with deprecated libraries, language features, or APIs
  • keeping your code styled and formatted like the rest of your project
  • tracking, isolating, and fixing defects
  • extending the code
  • waiting for its tests to run a thousand times a day when you are working on other features
  • trying to maintain and evolve your architecture
  • increasing the attack surface of your system to malicious users
  • supporting the feature until the end of time (almost always happens) or porting it to the next version of the software (frequently happens) or eventually removing the feature (almost never happens)

Complexity

As if all this stuff isn’t bad enough, software complexity scales non-linearly. So, after you reach a certain level of complexity, every worthless feature you add to your project makes it significantly harder to work on the rest of your software for the life of the project. Let that sink in for a minute.
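One hedged way to see why complexity scales non-linearly: if every feature can potentially interact with every other feature, the number of pairs you have to reason about grows quadratically, not linearly. A toy calculation:

```python
def potential_interactions(n_features: int) -> int:
    # Each pair of features is a potential interaction to reason about:
    # n * (n - 1) / 2 pairs, which grows quadratically with n.
    return n_features * (n_features - 1) // 2

# Doubling the feature count from 10 to 20 more than quadruples the
# number of pairs: 45 -> 190.
```

Real systems are neither this simple nor this uniform, but the direction of the effect is the same: each worthless feature taxes every other feature in the project.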

Opportunity cost

And the final kick in the groin is the opportunity cost of your worthless software. What could you have done instead of pursuing and maintaining this worthless software?

So, what’s the total lifetime and opportunity cost of building, supporting, and maintaining that “100 hour” feature? Who knows–nobody keeps track. It’s almost certainly more costly than anybody would be comfortable admitting.

How to keep worthless software out of your project

I hope I’ve sufficiently scared the crap out of you. Keeping worthless software out of your projects is a major engineering challenge. Software is so easy to change compared to hardware or physical things like buildings or cars that people find it impossible to resist the temptation to change it. So, you can’t expect your stakeholders to stop making change requests because that’s not going to happen. What you need is a system for dealing with change requests and filtering out the ones that will lead to worthless software.

This is a huge leverage point. If you can consistently filter out ideas for worthless software from your project before you build them, you are so much better off than trying to figure out how to create software more efficiently. And think about how much time you devote to that.

I don’t have a silver bullet here. But I do have some ideas for how you can keep worthless software from creeping into the various stages of your development process.

Talk to your stakeholders about the total cost of ownership of worthless software

Your first step is just to talk about the problem of worthless software and the total cost of ownership with your stakeholders. Look at your project and identify the worthless software in it. How bad is it? In which development stage did you introduce most of the worthless software in your project? To what extent is your progress on high priority features slowed down by worthless software? How much effort are you expending to maintain this software?

Can you make a business case that it would be cheaper to remove some of the worthless software from your project than it would be to continue supporting and maintaining it?

Spike and stabilize

Instead of bringing every feature to 100% production quality before you release it, in this pattern you produce a sort of working prototype of a feature and get it in front of your users as fast as you can. Then you gather feedback from your real users to determine whether the feature should be killed right there or receive additional investment to make it more complete and production quality. Dan North describes spike and stabilize a million times better than I can so just check out his talk on the subject.

Always work on the highest priority tasks

You must advocate for working on only the highest priority tasks at all times. Add your change requests to your backlog but make sure the truly important stuff is always at the top. If someone pushes something potentially worthless to the top of the backlog, don’t argue against it directly. Instead, pick the task you believe is genuinely the highest priority and make your case for why it should come first. If that’s not enough, you can “go negative” on the task you think will lead to worthless software and make a case for why it shouldn’t be at the top of your backlog. You won’t always get your way but if you are discussing things openly and honestly, you should keep the majority of the bad ideas out of your project.

Use your lack of staff to question every requirement

Almost all software projects are understaffed or behind schedule. And if you are in this situation, you can use it to your advantage to trim the “fat”.

Just because the overall task is important doesn’t mean that every single part of it is also important. So you need to question every requirement. It’s way too easy for anyone to write a few words into a requirements document that turn into weeks or months of work or completely thwart your architecture.

Find the “fat” and trim it wherever you can. And if you can’t trim it, at least schedule it in the later stories. With a little luck, your stakeholders will change their minds and drop or deprioritize the stories that include the “fat” before you implement them.

This strategy also gives you time to come up with good arguments for why the “fat” shouldn’t be in the project if those stories do eventually rise to the top of your backlog.

“Fat” in this case spans all types of requirements, not just stuff your users can see. Really think about whether you need to add logging and email alerts for a new feature you’re not even sure anyone will use yet. Does it really need to be that fast at the beginning? Documentation? A full suite of automated unit tests?

We’re trying to make “question every requirement” into an official step in our development process right now because we see it as such a high leverage point.

Use vertical slicing

We often split our features into multiple stories to align more closely with our optimal pull request size. We work to implement the high uncertainty and “must have” requirements in the earlier stories and work towards the “nice to have” requirements in subsequent stories. And it’s shocking how often one of two things happens. Either we completely drop the “nice to have” stories because something more important comes up and our efforts are redirected elsewhere. Or, we learn something by deploying the early stories that radically changes how we want to approach the later stories.

Either way, it helps us avoid adding worthless software to our project. And it will work for you too.

KISS and YAGNI

KISS means “keep it simple, stupid” and YAGNI means “you aren’t going to need it.” You should burn these acronyms into your brain. Every line of code you add to your project is a liability. So, if you’re adding code “just in case”, or because the books told you to use the observer pattern here, or for any reason other than that you need it right now, I urge you to reconsider. It’s almost always pure downside.
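As a contrived illustration (the order-placing functions here are invented, not from any real code base), compare an observer registry added “just in case” with the direct call that is all the current requirement needs:

```python
# "Just in case" version: a whole observer registry for what is, today,
# exactly one fixed listener. More code, more tests, more surface area.
class OrderEvents:
    def __init__(self):
        self._listeners = []

    def subscribe(self, listener):
        self._listeners.append(listener)

    def notify(self, order_id):
        for listener in self._listeners:
            listener(order_id)

# What's actually needed right now: one direct call. If a second listener
# ever materializes, refactoring to an observer then is cheap and informed.
def place_order(order_id, send_confirmation_email):
    send_confirmation_email(order_id)
    return order_id
```

The simple version does the same job with a fraction of the code, and nothing about it prevents you from introducing the pattern later, when a real need appears.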

Similarly, you should ignore most automation, infrastructure, and scaling concerns if you’re just starting out. Address them as they become inadequate, not before.

Wrapping up

Because you can add worthless software to your project any stage of the development process, you need multiple strategies to combat it. It can be difficult to maintain this level of vigilance, but the cost of supporting and maintaining worthless software is so high that you can expend significant effort up front to keep that stuff out of your project and still end up ahead.

How do you keep worthless software out of your projects? I’d love to hear your thoughts.

How to make time to repay your technical debt

There are hundreds, if not thousands, of articles about how to pay down your technical debt and most of them miss step one. They don’t tell you how to make the time to repay your technical debt. It’s unlikely that you’ll be able to convince management to let you stop working on features and bug fixes long enough to truly make a difference. And since it’s always hard to quantify the impact of repaying technical debt, it rarely gets to the top of anyone’s backlog. Well, I’m going to show you how to get around this problem.

Stop making messes

In my last post, Technical debt: we need better communication, not better metaphors, I made the distinction between technical debt and crappy code. I also talked about my team’s decision to stop letting crappy code into our project. It feels slower at first but that’s almost certainly not the case. It’s just that we never saw all the unplanned work that didn’t happen because we stopped pushing crappy code into production.

So, your first step is to stop making things worse. Use technical debt strategically–and probably sparingly unless you are working for a startup–and stop allowing crappy code into your project because crappy code has no upside.

You shouldn’t repay your technical debt in some projects

You may find it unprofitable to repay your technical debt in certain projects or in projects at certain phases of their life cycle. This includes projects that are:

  • nearing end of life
  • throwaway prototypes
  • built for a short life
  • never going to be changed again
  • really, really small (any competent developer can probably figure out what’s going on in a 100 line script)

If this sounds like your project, keep reading anyway. This post is about getting more efficient so you can repay your technical debt. But you could just as easily use your new efficiency to deliver more features.

How to make time to repay your technical debt

Okay, let me give you an overview of how this all fits together and then I’ll spend the rest of this post filling in the details.

Overview

If your company and your team are interested in paying down technical debt–really interested, not just paying lip service to the idea–then making time to repay your technical debt isn’t too difficult. Just kidding, it’s pretty hard. That’s why so few teams are able to do it. But here’s the thing: most teams are just drowning in waste and inefficiency. So you almost certainly have plenty of room to improve.

You can use your efficiency gains to repay your technical debt. Just make a deal with management where you agree that any efficiency you pick up gets redirected to repaying your technical debt or gaining more efficiency. If nobody’s buying that, you can try pitching it as a two month experiment. Experiments are less threatening than “change” or “new policies.”

An example

Let’s say you are delivering 30 story points per sprint of features and bug fixes with a “no crappy code” policy in effect. You promise to keep doing that. But if you have extra time because you become more efficient, then you get to use those, say, 5 extra story points to repay your technical debt or clean up messes or invest in more efficiency.

That’s the most extreme scenario, where management won’t let you slow down feature releases. But almost every team can convince management to devote some time to cleanups. So let’s say you start at 30 story points and management says you can use 5 of those points to repay your technical debt, and after a few sprints you get more efficient and pick up another 5 story points. That’s probably a realistic starting point.

So apply your 10 story points to your highest priority ideas at the beginning of each sprint. Let’s ignore the details for now but just imagine you have a prioritized list of high value ideas at the ready.

If you invest carefully, you should have even more time to deliver your agreed upon features and bug fixes while you continue to invest in cleanups and efficiency. Repeat this every sprint. Measure your progress as you go and correct your course as you learn where to find the easy wins.

What’s in it for you?

Make no mistake–repaying your technical debt is a long term project. But the benefits are worth your effort:

  • developers who learn how to pay down technical debt and clean up messes without extra resources are very valuable
  • crappy code is a major source of low morale and developer dissatisfaction so cleaning up your code base should make your life more pleasant

How to find your waste and inefficiency

Every team is different but I’ve never heard of a team that couldn’t improve. Because this is such a big topic and I don’t know what tools and techniques your team is using, I’m just going to enumerate ideas that I think are useful. You can use these ideas as a starting point for your conversations. Pick one or two things that your team thinks will help you the most and work on them until they are automatic. Then repeat. Doing everything all at once is a recipe for chaos and burnout.

How to eliminate your waste and inefficiency

Some of the simplest things have the best payback. Let’s look at a few ideas:

  • establish a shared sense of urgency. Figure out why you–as a team–want to become more efficient. Talk about it often. Write it on a white board for everyone to see.
  • brainstorm initiatives with your team, rank them by likely payback, and work from the top of the list
  • do your refactoring/efficiency stories at the beginning of each sprint (if you leave them to the end of your sprint you’ll be much less likely to have time for them)
  • adopt some kind of agile process with short iterations (Scrum, Kanban, etc.)
  • manage your code with a VCS like Git or Subversion
  • equip everyone on your team with the best IDE money can buy, a fast computer, and ample screen real estate
  • decide how the code in your project will be formatted, set up your IDE to format your code to the project standard with a pre-commit hook, and never format your code by hand again (you can use a language standard instead of arguing about your code formatting – for example, PHP has PSR-1 and PSR-2, and Python has PEP 8)
  • do self and peer code reviews (hour-for-hour code reviews find more defects than testing but they also increase understanding and spread knowledge)
  • automate part of your code review with static analyzers, linters, and by turning up your compiler/interpreter warnings as high as you can without overwhelming yourselves. Let your tools find as many errors as possible so you can focus your code reviews on the stuff computers cannot catch
  • only work on the highest priority stories and do as few of them at a time as you can (minimize multi-tasking)
  • stop wasting time working on low priority stories
  • stop talking about the details of things you might work on in a month or two (half of what you talk about won’t be done or will change so much that it’s a waste of time to talk about it now)
  • stop adding stories to the backlog if you have no intention of working on them in the next two sprints (it’s a waste of time)
  • stop adding “nice to have” features to your stories–stick to the essentials (remember YAGNI: you aren’t going to need it)
  • stop wasting time in meetings and stand-ups. Be prepared, be on time, get through your agenda, and get back to work as soon as you can
  • don’t get bogged down in disagreements. If you can’t agree on an initiative, dump it and move on.
  • stop interrupting your co-workers when they are working (knocking people out of flow is very costly)
  • don’t work overtime unless it’s absolutely necessary (avoid burnout)
  • get management to commit to not interrupting your sprint unless it’s an emergency
  • if there’s an easy way to increase the productivity of a team member, consider investing a little time to make that a reality
  • do your retrospective meeting, review how things are going, and figure out how to make them go better/faster
  • find your optimal pull request size (it matters more than you might think)
  • communicate your progress to your stakeholders, get people excited about what you’re doing
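
On the static-analysis point above: a linter is just a program that reads code and flags suspicious patterns. Here is a toy check, written with Python’s standard ast module, that flags functions with long parameter lists (the max_args threshold of 5 is an arbitrary choice for illustration):

```python
import ast

def find_long_parameter_lists(source: str, max_args: int = 5):
    # A toy static check: flag any function whose positional parameter
    # count exceeds max_args. Real linters (pylint, flake8, etc.) bundle
    # hundreds of checks in this general shape.
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            n_args = len(node.args.args)
            if n_args > max_args:
                findings.append((node.name, n_args))
    return findings
```

The point is that checks like this run in milliseconds on every commit, which frees your human code reviews to focus on design and correctness instead of mechanical nitpicks.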

Measuring productivity is a mistake

Depending on who you believe, measuring individual developer productivity is somewhere between impossible and very, very difficult. If you do go this route, you’ll soon find your developers gaming the system. Limit your measurements to team velocity and defects in production. Look for a trend over months. And don’t place too much importance on the numbers or people will game them.

Instead of trying to measure productivity, you should look for and eliminate waste. You’ll see waste everywhere once you start looking for it. If you somehow get your team to the point where you can’t find any more waste, you’re amazing! Write a project management book, start your own consultancy, and go on the speaking circuit because we need to know how you did it (I’m not kidding).

Agile is based on bad assumptions but you can improve it

Agile borrowed ideas from manufacturing like Lean and Six Sigma. Lean came out of Toyota and focuses on the elimination of waste. And Six Sigma came out of Motorola and focuses on the minimization of variation. Agile adopted these ideas because they were so successful in manufacturing and we wanted the same success in software development. However, we failed to see that the fundamental assumptions underlying both Lean and Six Sigma don’t hold in software development because software development is nothing like manufacturing!

But I have hope. I recently read “The Principles of Product Development Flow: Second Generation Lean Product Development” (Donald Reinertsen). This book blew my mind. Reinertsen’s insights are so obvious that I can’t believe we haven’t figured this out yet. My team is working on implementing his ideas right now.

So if you want efficiency, read Reinertsen’s book. Or you can get a taste of his ideas in this video.

Wrapping up

Technical debt and crappy code are everywhere. And while many articles and books can help you figure out how to tackle it, they generally don’t help you find the time to repay your technical debt.

Your first step is to stop allowing crappy code into your project. You can make time for code reviews, unit testing, and other QA activities with the time you save not fixing the defects you didn’t release into production.

Your next step is to steer some of your team’s effort toward increasing efficiency. I’ve listed over 20 ideas to get you thinking about eliminating waste and inefficiency. And once you’re done with the easy stuff you can dive into Reinertsen’s book and work on the harder stuff. You can use your increased efficiency to become even more efficient, repay your technical debt, or clean up crappy code. No matter which combination of the above you choose, if you follow this plan you’ll still be available to continue releasing new features and improvements. And, maybe, you’ll enjoy your job more too.

Have you worked on a project that successfully paid down a pile of technical debt? How’d you find the time to do it? I’d love to hear about your experiences.

Technical debt: we need better communication, not better metaphors

Technical debt as a metaphor is not serving our profession well. It was meant to help us talk to business people and make better decisions about our software projects. But it has largely been a failure. Part of the problem is that business people aren’t afraid of future interest payments.

In business school, I learned about the power of leverage: what it is, how to get it, how to measure it, and how to manage it. Debt is a tool. My fellow business school grads are comfortable with leverage and debt. So when the programmers come to the business people complaining about technical debt, the business people are unconcerned. And why would they be? The metaphor is misleading; technical debt is only superficially similar to financial debt.

What’s at stake

I’ve noticed that business people tend to be better at convincing programmers to damage the long term health of their code base for short term gains than programmers are at convincing business people about the importance of having healthy software with few messes and low levels of technical debt.

This is really unfortunate for three reasons:

  1. once you allow your project to turn into a mess, it’s really difficult to clean it up (in many cases it might be impossible)
  2. the decision to take a shortcut or make a mess for short term gain is very often sub-optimal in the long term. If the business people truly understood the risks and costs of these shortcuts, they would likely make better choices
  3. trying to maintain and extend a software project full of crappy code is frustrating, soul-crushing work. Many programmers quit their jobs to get away from low quality code–it’s that serious

No matter how much technical debt your project has, you can always make it worse. You’ll always face both internal and external pressures to take a shortcut here, avoid a refactoring there, and release that new feature without automated tests.

Therefore, we need to do a better job of defending the code to the business people. It’s their job to advocate for the business objectives and it’s our job, as programmers, to advocate for the code. Don’t believe me? Read this: Software Engineering Code of Ethics and Professional Practice

How to advocate for the code

You need to educate stakeholders in your company about the software development process. Your job is not done until you are sure they understand the trade-offs and risks involved in taking shortcuts. I’m going to use the remainder of this post to share my tactics.

Find a hook to get business people interested in how your company creates and maintains software

You need to get the business people involved in your process. The more they internalize your struggles and challenges, the better.

Appeal to efficiency

You’d be hard pressed to find a business person who didn’t want to discuss an opportunity to have the software developers deliver:

  • more features, more quickly
  • fewer defects
  • faster response time to requests for changes
  • more predictable release schedules
  • lower development costs
  • lower programmer turnover

Your general thrust here could be that your company is missing out on high value opportunities. And that you’d like some advice from the business people on how you can capitalize on those opportunities and make things run more smoothly/efficiently/effectively.

Appeal to risk avoidance

Depending on your audience, you might find another approach to be more persuasive. Business people are trained to manage risk and you can use that to your advantage by framing the situation as one of managing increasing risks.

Here are some risks of making messes in your code:

  • competitors release superior software
  • defects soar
  • costs soar
  • unable to meet regulatory requirements
  • unable to add new features
  • progress grinds to a halt
  • data loss/corruption/breaches
  • lawsuits/fines
  • lose customers
  • bad publicity
  • lose use of the software completely
  • good programmers leave the project
  • and here’s the biggie: risks interact in unpredictable, non-linear ways

One of these two strategies will likely get the attention of your business people.

Get them involved in your meetings

The best thing my team ever did was adopt Scrum. Our product owner really ‘got it’ once he attended a few of our sprint meetings. He also became a captive audience for an hour or so per week during these meetings. I took full advantage of the opportunity to increase his understanding whenever I could.

These meetings make abstract ideas about ‘going slow’ or ‘bad code’ more concrete. For example, we look at all the stories where we missed our estimates by a wide margin at our retrospective and talk about why that happened. It’s one thing for the business people to generally know that the programmers think the code is bad. But it’s quite another thing to listen to a programmer take 20 minutes to explain all the reasons why a simple change to the system took 40 hours (10 times the estimate). When that happens sprint after sprint, the business people can’t help but buy into it. I find it especially helpful to get them involved in the root cause analysis and have them help formulate solutions to the problems.
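The estimate-miss review described above can be sketched in a few lines. The story names, hours, and the 3x threshold below are all made up for illustration:

```python
# Flag stories whose actual effort blew past the estimate by a wide margin,
# so the retrospective can dig into why. All data here is invented.

def estimate_misses(stories, factor=3.0):
    """Return stories where actual hours hit `factor` times the estimate or more."""
    return [
        (name, est, actual)
        for name, est, actual in stories
        if actual >= factor * est
    ]

sprint = [
    ("Add export button", 4, 40),   # 10x the estimate: worth 20 minutes of discussion
    ("Fix typo in invoice", 1, 1),
    ("New report filter", 8, 12),
]

for name, est, actual in estimate_misses(sprint):
    print(f"{name}: estimated {est}h, took {actual}h ({actual / est:.0f}x)")
```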

The meetings also allow the business people to get a sense of the team. How do the programmers handle themselves? Are they competent? Are they complainers? How’s the morale? Are they lazy? Do they care about the business? The answers to these questions aren’t nearly as important as allowing the business people to reach the inevitable conclusion that ‘this programming thing’ is harder than it appears from the outside.

Make a distinction between technical debt and crappy code

Technical debt has expanded from its initial definition to include almost any undesirable quality of code. At my workplace, we are resisting this trend. We make a distinction between technical debt and crappy code.

Technical debt:

is a deliberate decision to take a shortcut to achieve a short term goal at the expense of an unknowable long term cost.

Crappy code:

is any other kind of shortcut that is not technical debt.

It’s our policy to only allow technical debt into the code base as a result of a considered decision made by the business people and the programmers. Our software is pretty mature so we only allow technical debt into the code base to deal with emergencies. For example, if we discover a critical defect in production, we might ‘hack’ in a solution to resolve the problem immediately and call that technical debt. And then we’ll make a story to fix it properly later.

Whereas we allow a limited amount of technical debt into our code base, we have a zero-tolerance policy for crappy code. We made this policy after a long series of discussions and root cause analyses. We decided that the only reasonable way to move forward with our project was to stop allowing crappy code into the code base. To that end, all code is reviewed and it doesn’t get merged with master until it meets our quality objectives (see our code review checklist). If that means a change takes ten times longer than we originally estimated, so be it.

Of course, it’s not as black and white as that. Programmers make judgment calls every day. But our talk, actions, and behaviors are all aligned towards the production of zero lines of crappy code. We’ve made it part of our culture.

Show them the research on the cost of crappy code

Business people generally have no idea how bad crappy code is for the health of their software. So, I’ve made it my job to educate them.

I lent my copy of “The Software Project Survival Guide” (Steve McConnell) to two senior people in my company. This book is getting dated but the first 5 chapters are absolutely worth their weight in gold.

I show anyone who will pay attention to me page 4 of “Clean Code” (Robert C. Martin). On this page Uncle Bob describes the total cost of owning a mess. And I especially love pointing to the graph at the bottom of the page (reproduced below) and asking them where they think we are on it.

Developer productivity vs. time

I also show people my copy of “Facts and Fallacies of Software Engineering” (Robert Glass) and use it to stimulate software engineering discussions.

I advocate aggressively for our code, our team, and our sanity. I’m really trying to get three points across:

  1. developing software is hard, expensive, and prone to outright failure so we should take it seriously
  2. as a profession, we have some idea about which practices influence the risks, schedule, quality, and cost of software development
  3. we should adopt the practices associated with the outcomes we want

Show them examples of crappy code in your project

It’s pretty easy to convince people that crappy code is bad in theory. But it’s another thing to convince them that they own crappy code.

I’ve gone as far as literally emailing screenshots from my IDE to our senior people. I found one email I wrote last year that had screenshots showing:

  • all the static analysis errors for one of our projects. It showed over a thousand serious warnings and errors and thousands of weak warnings and notices.
  • the static analysis errors for a single 3,189 line file containing 5 languages and over 500 warnings
  • all the errors and warnings for a single file flagged down the right side of the editor window. It was 95% covered with warning indicators

I wrote that email to:

  • make our code quality problems visible
  • show why our code is so expensive to change
  • and advocate for not making the situation worse

Quality and speed are not opposites

I love this one. Steve McConnell published an article titled Software Quality at Top Speed. He argues that:

In software, higher quality (in the form of lower defect rates) and reduced development time go hand in hand.

This article is well argued, written by an expert, and it is backed by research, which is clearly referenced in the footnotes. Read it. Bookmark it. And realize that it is a powerful weapon in your arsenal to combat suggestions that your team would go faster if they’d just stop doing so much QA or upstream planning.

Bend over backwards to make the business successful

All your efforts will be for nothing if the business people think–even for a second–that the best interests of the business are not your top priority. You can’t just say that you aren’t going to accept any technical debt from now on no matter what. You can’t say you are going to do things “the right way” from now on and they’ll just have to get over it.

You’re all on the same team. And your objective is to make as much money for your company as possible. So when the business people say they need a new feature in 5 weeks and you think it will take 20 weeks, you (collectively) have a problem that needs solving.

And you, as the software expert, have plenty to contribute. Why do they need this feature? Do they need the whole feature all at once? Could it be broken into phases and you just deliver the first phase in 5 weeks? Could you change the spec to make the feature easier to implement? Could you build a completely different feature in five weeks that delivers the same sales and profit impact? Could you integrate with an API and cut your development effort? Could you deliver a minimal version of the feature in five weeks and follow up immediately with an enhanced version? Could you outsource some of the work?

There are lots of possibilities besides hacking something together and kicking it out the door. And if you’ve been laying a good foundation with the other steps, perhaps you can keep things from coming to that.

Act like a professional at all times

You need to act like a professional if you want to be treated like a professional. Whining and complaining are not the behavior of professionals. But that’s not all there is to it. You need a plan to manage your software development process effectively and deliver results. You need to make honest estimates. You need to hold yourself accountable for the outcomes of your team. You need to be truthful and avoid exaggeration.

For example, I avoid blanket statements about our code base. Some code is bad, some isn’t so bad, and some is actually pretty good. So I never say that we can’t make a change. But I will say that this change would require substantial modification of that really ugly module without tests I was telling you about last week. The original author is long gone and we don’t really know how it works. I would say this change would take many months of effort; would you like me to work up a detailed estimate?

In a case like this, my estimate would include a range that communicates the uncertainty involved in a change of this magnitude. My estimate for this change might be 3-8 person months with 85% confidence. That means I think there is an 85% chance that we could deliver this change in 3-8 person months and a 15% chance the actual effort falls outside that range on either end.
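The range above comes from judgment, not a formula, but a three-point (PERT) estimate is one common way to build a ranged estimate like this. The person-month inputs below are hypothetical:

```python
# Classic three-point (PERT) estimation: weight the most likely case heavily,
# and derive a spread from the optimistic/pessimistic gap. Inputs are invented.

def pert(optimistic, likely, pessimistic):
    """Return (expected, std_dev) under the standard PERT weighting."""
    expected = (optimistic + 4 * likely + pessimistic) / 6
    std_dev = (pessimistic - optimistic) / 6
    return expected, std_dev

expected, sd = pert(3, 5, 9)          # person-months
low, high = expected - sd, expected + sd
print(f"expected {expected:.1f} pm, range {low:.1f}-{high:.1f} pm")
```

Widening the pessimistic input widens the range, which is exactly the honesty you want in an estimate for a poorly understood module.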

And no, I can’t do better than that. That’s the estimate. We can create another estimate if we modify the scope of the change but I always give my best estimate the first time. And I don’t allow myself to be bullied into lowering it.

The real wins come when you understand where the business people are trying to go and you can make suggestions and plans to help them get there faster or at least avoid the land mines along the way.

When all else fails, you might have to quit

Sometimes you just won’t be able to convince the people you work with that anything other than hack on top of hack on top of hack is possible. They don’t care about software engineering research. They don’t care about best practices. They don’t care if you have to deal with crappy code. They don’t care if your life is miserable. They don’t want to talk about how you develop software. They just want results, and the cheaper, the better.

If you find yourself in this situation, you have some decisions to make. Let me just say two things. First, there are companies that actually care about the quality of their software and the morale of their developers. And second, high quality software developers are in huge demand so you shouldn’t have too much difficulty finding another job if you wanted to make a change.

Wrapping up

When programmers add crappy code to a project with a long life expectancy, both research and hard-won experience show they are almost certainly taking on more cost and risk than they realize. You’ll face all kinds of internal and external pressures to deliver features faster. But always remember it’s your job to advocate for the best interest of your code and your business.

A code review checklist prevents stupid mistakes

My team uses a code review checklist to prevent stupid mistakes from causing us problems and wasting time. In this post, I want to share the reasons we decided to implement a code review checklist, what’s on our checklist, how we created, use, and improve our checklist, and how it’s improved our effectiveness.

Why create a code review checklist?

My team is on a mission to increase our effectiveness. We’re following the Theory of Constraints model where we identify our constraint and find ways of overcoming it.

When we looked at the flow of stories across our sprint board, we immediately zeroed in on our review process as our bottleneck. Stories ended up in our “reopened” status (failed code review) more often than in our “done” status. And when we tracked why our stories failed code review, we found all kinds of reasons related to the quality of the code itself. But we were surprised to find just how often we made a stupid mistake. Examples include forgetting to run the unit tests or missing a requirement. In fact, “stupid mistakes” caused the vast majority of our failed code reviews.

But we’d also occasionally end up with a defect in production when we missed a step or performed an ineffective code review. It’s really embarrassing to tell your boss that you took down the website because you messed up something simple.

I had read The Checklist Manifesto: How to Get Things Right (Atul Gawande) a few years ago and immediately recognized that this would be an excellent opportunity to use a checklist.

Benefits of a code review checklist

Checklists are a great way to ensure you cover all the steps in a complicated task, like a code review. Instead of having to remember what to look for and review the code, you can just review the code and trust the checklist to ensure you cover all the important points. As long as you actually use your code review checklist (and the checklist is well-constructed), you should catch the vast majority of stupid mistakes.

What’s on our code review checklist?

Our code review checklist has evolved over time. Here’s what my personal version looks like right now.

DO-CONFIRM

Code Review Checklist:
1. scope – story is high priority, small, minimize creep, no stray changes, off-task changes added to backlog
2. works correctly – specification is correct and complete, implementation follows specification, testing plan created, unit tests created and/or updated, master merged into branch, all changes tested, edge cases covered, cannot find a way to break code, cannot find any ways these changes break some other part of the system, all tasks ‘done’, ZERO KNOWN DEFECTS
3. defensiveness – all inputs to public methods validated, fails loudly if used incorrectly, all return codes checked, security
4. easy to read and understand – appropriate abstraction and problem decomposition, minimum interface exposed, information hiding, command-query principle, good naming, meaningful documentation and comments, fully refactored (use judgement with existing code)
5. style and layout – all inspections pass, code formatter run, no smelly code styling, line length, styling consistent with project guidelines
6. final considerations – YOU FULLY UNDERSTAND THE CODE AND THE IMPACT OF THE CHANGES YOU HAVE MADE AND IT… is unbreakable, is actually done, will pass code review, would make you proud to present your changes to other programmers in public, is easy to review, correct branch, no stray code, schema changes documented, master merged into branch, unit tests pass, manual testing plan complete and executed, all changes committed and pushed, pull request created and reviewed, Jira updated

My team’s official checklist doesn’t include step 6. That’s my own personal reminder of things that I find important to check at the very end of the code review.

Some people might say that our checklist contains way too much detail. That’s probably a valid complaint. However, this is the checklist our team developed and it works for us.

How we created our code review checklist

We created our checklist by looking at the steps in our code review process and the problems we were having getting pull requests to pass review. Then we gathered all those details into an initial code review checklist. We organized and refined our initial checklist over several weeks.

Once we were happy with our code review checklist we converted it into a Google form. We also included some additional fields for data that we wanted to capture about our code reviews such as:

  • the name of the author and reviewer
  • how long the code review took
  • the outcome (pass/fail)

These fields are important because they give us the feedback we need to monitor the success of our code review improvement program.

Here’s what the Google form looks like:
Google form for capturing the results of code reviews

How we use our code review checklist

The author of a pull request does a self-review on his code using the code review checklist. He corrects any issues he catches and then releases his pull request for review.

The reviewer uses the Google form version of the checklist to guide the review and capture a summary of the outcome. We typically complete the Google form in under two minutes. We attach specific and detailed feedback for the author about the code itself in BitBucket.

The Google form stores the results of our code reviews in a Google spreadsheet. We look at the data every two weeks as part of our retrospective meeting where we evaluate the effectiveness of our code review process and see how it’s changing over time.
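The retrospective analysis can be sketched against an export of that spreadsheet. The column names and the rows below are hypothetical stand-ins for the real data:

```python
# Read a CSV export of the review log and compute the pass rate and average
# review time. The columns and data are invented for illustration.
import csv
import io

review_log = """author,reviewer,minutes,outcome
alice,bob,25,pass
carol,alice,70,fail
bob,carol,15,pass
alice,carol,40,pass
"""

rows = list(csv.DictReader(io.StringIO(review_log)))
passes = sum(1 for r in rows if r["outcome"] == "pass")
avg_minutes = sum(int(r["minutes"]) for r in rows) / len(rows)
print(f"pass rate: {passes / len(rows):.0%}, avg review: {avg_minutes:.1f} min")
```

Tracking these two numbers sprint over sprint is enough to see whether a checklist change is actually helping.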

We’ve found self-review to be just as important as the peer review. It’s surprising how many issues you catch when you step back from the details and subject your code to a checklist. Our checklist is especially good at catching things we forget to do. For example, everyone occasionally forgets to implement a requirement. Without a checklist to remind us, we find it hard to see what’s missing from a pull request.

How we improve our code review checklist

Our code review checklist is a living document. We review it periodically and add or remove issues as necessary. We also encourage programmers to keep their own version of the code review checklist. Personalized checklists contain reminders that are important only to the person who wrote them (like section 6 is for me – see above).

Results

We couldn’t be happier with the results of our code review checklist. We’ve made a serious dent in the number of code reviews that fail for stupid reasons. Every failed code review costs us a serious amount of time. And code reviews that fail for stupid reasons–like a missed requirement–are an extremely preventable form of waste. Furthermore, we conduct our code reviews more quickly and the process is more consistent because we use a checklist. We’ve significantly increased our effectiveness.

We also use our code review checklist to identify automation opportunities. When we first started doing code reviews we had lots of issues with code formatting, styling, naming, method complexity, etc. We quickly realized that these issues were costing us lots of time and that they would be worth standardizing/automating. So we configured a code formatting tool and various static analyzers. They run automatically right in our IDE. Automating those steps gave us a huge boost in effectiveness–really outstanding.

These days our checklist is showing us the high cost of our manual testing procedures. So we’re putting more effort into testing automation in cases where we think there’s a healthy ROI.

Further reading

There are tons of resources for programmers who want to start using checklists. But I’ll just mention a few here:

Wrapping up

Checklists are a proven way to increase the effectiveness of your code reviews. It took us maybe a dozen hours to get our initial version of our code review checklist up and running. And that tiny investment pays us dividends every day.

Do you use a code review checklist? Share your thoughts in the comments.

Optimal pull request size

Every team has an optimal pull request size, it’s likely much smaller than you think, and making your pull requests your optimal size will improve the performance of your team. I’m going to convince you that these three statements are true by the end of this post.

What does the research say?

A large body of research has shown that code reviews are the most cost-effective way of finding defects in code1. But most of the value of a code review comes from the first reviewer in the first hour2.

How much code can my team review in an hour?

My team tracks how much time we spend on each code review. I compared those times to the number of lines changed in our pull requests to produce this scatter plot3.

Relationship between pull request size and review time

The data is pretty messy but if we want a 90% chance of completing a code review in an hour, we probably need to cap our pull requests at around 200 lines of code.
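Deriving a cap like that from messy data can be sketched as follows: for each candidate cap, check what fraction of the reviews at or below that size finished within an hour. The sample data below is synthetic, not my team’s actual numbers:

```python
# For each candidate size cap, compute the empirical probability that a review
# of a PR at or below that size finished within the time limit. Data is synthetic.

def largest_cap(reviews, caps, limit_min=60, target=0.9):
    """Largest cap where >= `target` of reviews sized <= cap fit within `limit_min`."""
    best = None
    for cap in sorted(caps):
        minutes = [m for lines, m in reviews if lines <= cap]
        if minutes and sum(m <= limit_min for m in minutes) / len(minutes) >= target:
            best = cap
    return best

# (lines changed, review minutes) pairs
reviews = [(50, 20), (120, 45), (180, 55), (210, 65), (300, 90),
           (150, 30), (90, 25), (220, 75), (160, 50), (240, 110)]
print(largest_cap(reviews, caps=[100, 150, 200, 250, 300]))
```

With this synthetic data the answer comes out at 200 lines; your own data will put the cap wherever your team’s review speed puts it.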

Finding your optimal pull request size

This is really just an optimization problem. You have a cost associated with creating, reviewing, managing, and merging a pull request, which we can think of as a transaction cost. But you also have a cost associated with letting the pull request remain as work in progress, which we can think of as a holding cost. I’ll define these costs in more detail in a minute. I just want to establish that you are trying to find the pull request size that minimizes the sum of your costs (the lowest point on the blue line below).

optimal pull request size
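That U-shaped curve can be captured with a toy cost model. The constants below are invented; the point is the shape of the curve, not the specific numbers:

```python
# Toy model: per-line cost = fixed transaction cost spread over the PR, plus a
# holding cost that grows with PR size. All constants are made up.
import math

T = 200.0   # fixed cost (effort-minutes) to create/review/merge/deploy one PR
H = 0.05    # holding cost per line, growing with PR size

def cost_per_line(size):
    return T / size + H * size

# The minimum of T/s + H*s sits at s = sqrt(T/H)
optimal = math.sqrt(T / H)
print(f"optimal PR size ~ {optimal:.0f} lines")
```

Notice that shrinking the transaction cost T moves the optimum to the left, which is the virtuous cycle this post comes back to: get efficient at processing pull requests and smaller ones become worthwhile.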

Pull request transaction costs

It takes time and effort to:

  • break stories into bite sized chunks
  • make them independent or schedule them so they don’t block each other
  • then actually make the changes required
  • create a testing plan
  • test the changes
  • create a pull request
  • review it
  • merge it
  • deploy it to production (we do continuous delivery)

If you did this for every line you changed, you would be at the far left of the graph above. Not good. But making larger pull requests creates costs of its own.

Pull request holding costs

The larger the pull request, the:

  • longer it takes for the reviewer to find a big enough slot of time to address it (and the less motivated the reviewer will be to review it)
  • less chance it will pass code review on the first attempt and therefore require multiple reviews
  • longer it will take to review
  • more chance the review will exceed an hour (or whatever your personal limit is for effective code reviews) and become ineffective
  • more chance it will grow larger still and need even more review
  • bigger the risk that the author made a bad assumption and will have to redo much more work
  • bigger the risk that the requirements will change before the code is released
  • more demoralizing it becomes to work on this code (multiple reviews of the same code sucks for everyone)
  • more it sucks to have to re-review and re-test the whole pull request if some little part of it fails
  • less certainty we have over when the code will be merged
  • more chance of a merge conflict
  • more chance it will block other stories in the sprint and derail the schedule
  • longer it will take to get the code into production and start delivering business value

If someone submitted a 10,000 line pull request in our project I can guarantee that all of these points would come into play. But our data and experience show that much smaller pull requests are still incurring huge holding costs.

Why would smaller pull requests be better?

Smaller batch sizes increase flow, according to Donald Reinertsen in his book: “The Principles of Product Development Flow.”

Specifically:

  • Reducing batch size reduces cycle time
  • Reducing cycle time reduces variability in flow
  • Reducing batch sizes accelerates feedback
  • Reducing batch sizes reduces risk
  • Reducing batch sizes reduces overhead
  • Large batch sizes reduce efficiency
  • Large batch sizes inherently lower motivation and urgency
  • Large batch sizes cause exponential cost and schedule growth
  • Large batches lead to even larger batches
  • The entire batch is limited by its worst element

Applying lean manufacturing concepts to pull requests

Some very bright people at Toyota invented the Toyota Production System, which includes single-piece flow. And they used this system to take over the auto industry. This is a similar idea but Reinertsen tweaked it for product development. The key to making this really work is to lower your transaction costs. If you become really efficient at processing pull requests, your overall cost curve will shift down. And your optimal batch size will shift to the left (get smaller). It’s a virtuous cycle.

You want to get stories moving across your sprint board very rapidly. And you want to get new code into production as quickly as possible. You don’t want pull requests stalled in review for days at a time. And you really don’t want failed pull requests and all the rework that goes with them.

Still not convinced that the optimal pull request size is small?

Let’s do some math. The more code in your pull request, the more chance that you’ve introduced a defect. I know software defect rates vary widely but let’s use the often quoted 10-50 defects/kloc for released commercial software. But we are talking about a code review–not released software–so let’s assume that your defect rate is 50 defects/kloc. That’s one defect every 20 lines of code on average4.

Good code reviews catch 60-90%1 of the defects in the code so let’s pick a middle value and say that you can catch 75% of them. That means you should find one defect for every 27 lines of code on average.

Smaller pull requests contain fewer defects

A 200 line pull request should contain 10 defects and you should expect to find 7.5 of them. That means it’s extremely unlikely that it will pass code review on your first attempt. Now let’s redo the math with a 50 line pull request. There should be 2.5 defects in this pull request and you should expect to find 1.9 of them. You’ve got a much better chance of passing code review on the first try with a smaller pull request.
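The back-of-the-napkin arithmetic above, spelled out with the same assumed rates (50 defects/kloc, 75% caught in review):

```python
# Expected defects in a PR, and how many a good review should find,
# using the assumed rates from this section.
defect_rate = 50 / 1000   # one defect every 20 lines
catch_rate = 0.75         # review catches ~75% => one find every ~27 lines

for pr_size in (200, 50):
    expected = pr_size * defect_rate
    found = expected * catch_rate
    print(f"{pr_size}-line PR: ~{expected:g} expected defects, ~{found:g} found in review")
```

Finding 7 or 8 defects means a near-certain failed review for the 200-line PR; finding 1 or 2 gives the 50-line PR a fighting chance of passing on the first attempt.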

When your big pull request fails code review, you have to fix the whole pull request, re-review the whole pull request, re-test the whole pull request, and hope that nobody has introduced a new error along the way. When your small pull request fails code review, you still have to do all the same steps. But it will go much more quickly because you are dealing with much less code.

What’s wrong with the way my team handles pull requests?

My team is struggling with our pull requests and our problems are all related to what I like to call “mega pull requests”. Mega pull requests are pull requests that have gotten out of control and have become difficult to review and deploy. We recognize them by their large diffs, multitude of comments, multiple reviews, many days spent in review, and many hours spent trying to get them into the ‘done’ column.

How a mega pull request is born

Suppose my colleague creates a 200 line pull request. I see it in Jira and just from the title I know it’s going to need my full attention. So I schedule it in a big block of time tomorrow and plan to tackle it all at once. I review it and find a couple of problems. I write my comments in BitBucket and bounce it back to the author. He’s moved onto something else so it takes him until the next morning to review my comments, change the code, update the testing plan, retest the branch, and put it back in review.

We repeat this cycle once or twice while we deal with other stories, meetings, unplanned work, and other interruptions. By now, this pull request has been in review for the better part of a week and now it might have merge conflicts. If so, the author fixes them, retests, and sends it back to review. At some point I eventually get all the tests in the testing plan to pass, I approve the pull request, merge into master, and move on to the next story.

We probably have one pull request like this per sprint5. But on rare occasions, we’ve had pull requests bounce several times over several weeks. I’m not joking. Mega pull requests kill our productivity.

What’s helped

We’ve tried all kinds of things to prevent the formation of mega pull requests but our ideas have mostly been ineffective. The only things that helped a little were:

  • automating our code styling and formatting with a commit hook so it’s never an issue
  • doing a refactoring story in front of the main story
  • avoiding stray changes in the main story no matter how ugly the code – we just put it in the backlog and make a comment in the code with the Jira story ID
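The first idea above (automating styling with a commit hook) might look something like this sketch. It assumes a Python code base formatted with black; swap in your own formatter. Save it as `.git/hooks/pre-commit`, make it executable, and end the script with `sys.exit(run_hook())`:

```python
# Sketch of a pre-commit hook that blocks commits containing unformatted
# Python files, so styling never becomes a code review issue. Assumes black.
import subprocess

def python_only(listing: str) -> list:
    """Keep only the Python files from `git diff --cached --name-only` output."""
    return [f for f in listing.splitlines() if f.endswith(".py")]

def run_hook() -> int:
    staged = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True, text=True, check=True,
    ).stdout
    files = python_only(staged)
    if not files:
        return 0
    # `black --check` exits non-zero if any file would be reformatted
    return subprocess.run(["black", "--check", *files]).returncode
```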

We think we’ve still got room to improve so we’re interested in finding our optimal pull request size.

How do we find our optimal pull request size?

We conduct experiments. Our current policy is to only create a pull request that will take less than an hour to review. But we suspect our velocity will improve if we create smaller pull requests. So we’ll try a 45 minute limit for two months, examine the data, set a new target, and repeat the experiment until we find our optimal pull request size.

Wrapping up

Every team has an optimal pull request size, it’s likely much smaller than you think, and making your pull requests your optimal size will improve the performance of your team. You can find your optimal pull request size by doing experiments and measuring the change in your velocity.

I think I’ve made a pretty strong case for smaller pull requests. But I’d love to hear what works for your team. Have you experimented with the size of your pull requests? Have you found your optimal pull request size? How big are they and how has your velocity changed since you found it?


  1. Fagan ME (1976) Design and code inspections to reduce errors in program development. IBM Syst J 15: 182-211. 
  2. Cohen J (2010) Modern code review. Oram A, Wilson G, editors. Making software: what really works, and why we believe it. Sebastopol (California): O’Reilly. pp. 329-336. 
  3. I removed spurious data like file renames, automatically generated files, and changes caused by running a formatting tool over the code. I also removed pull requests that didn’t pass review. 
  4. This is just back-of-the-napkin math. You can use a wide range of defect and detection rates and still reach the conclusion that large code changes should contain detectable errors that cause your code review to fail. Please don’t clobber me in the comments about the exact numbers I’ve used here. 
  5. We have a ‘zero known defects’ policy, so code goes back and forth in code review until we think it has zero defects and meets our other quality objectives. We’re trying to improve the quality of our code base and repay our technical debt. This will change our optimal pull request size compared to a team that has a ‘ship it and we’ll fix it in maintenance’ policy. 

Is Uncle Bob serious?

Robert C. Martin (Uncle Bob) has been banging on the “software professionalism” drum for years and I’ve been nodding my head with every beat. As a profession, I believe software developers need to up their game big time. However, I’ve become concerned about Uncle Bob’s approach. I reached my breaking point the other day when I read his blog post titled Tools are not the Answer.

He took issue with a recent Atlantic article titled The Coming Software Apocalypse. Let me see if I can summarize the theses of these two articles.

The Atlantic:

We are writing more and more software for safety-critical applications and the software has become so complex that programmers are unable to exhaustively test or comprehend all the possible inputs, states, and interactions that the software can experience. We are attempting to build systems that are beyond our ability to intellectually manage.

We need new ways of helping software developers write software that functions correctly (and is safe) in the face of all this complexity. The current methods of producing safety-critical software are especially dangerous to society because when software contains defects we can’t observe them in the same way we can observe that a tire is flat–they’re invisible.

Uncle Bob:

The cause:

  1. Too many programmer (sic) take sloppy short-cuts under schedule pressure.
  2. Too many other programmers think it’s fine, and provide cover.

And the obvious solution:

  1. Raise the level of software discipline and professionalism.
  2. Never make excuses for sloppy work.

Does Uncle Bob’s argument even pass the sniff test?

Safety-critical software systems, which are the topic of the Atlantic article, are held to shockingly high quality standards. The kind of requirements analysis, planning, design, coding, testing, documentation, verification, and regulatory compliance that goes into these systems is miles beyond what any normal organization would consider for an e-commerce website or mobile app, for example.

Read They Write the Right Stuff and tell me if you think Uncle Bob’s on the right track (note this article was written 21 years ago and the state-of-the-art has advanced significantly). Does it sound like the NASA programmers just need more discipline and professionalism coupled with never making excuses for sloppy work?

What does an expert in safety-critical systems from MIT have to say?

Dr Nancy Leveson was quoted several times in the Atlantic article but Uncle Bob completely ignored those parts.

So let’s review an excerpt from one of her talks:

I’ve been doing this for thirty-six years. I’ve read hundreds of accident reports and many of them have software in them. And every someone (sic) that software was related, it was a requirements problem. It was not a coding problem. So that’s the first really important thing. Everybody’s working on coding and testing and they’re not working on the requirements, which is the problem. (emphasis added)

She can’t say it much clearer than that. Did I mention that she’s an expert? Did I mention that she works on all kinds of important projects, including classified government weapons programs?

How about Dr. John C. Knight?

In his paper Safety Critical Systems: Challenges and Directions, Dr Knight describes many challenges of building safety-critical systems but developer discipline and professionalism are not among them. This is as close as he gets:

Development time and effort for safety-critical systems are so extreme with present technology that building the systems that will be demanded in the future will not be possible in many cases. Any new software technology in this field must address both the cost and time issues. The challenge here is daunting because a reduction of a few percent is not going to make much of an impact. Something like an order of magnitude is required.

Developing safety-critical systems is extremely slow, which adds to cost. But QA practices virtually ensure delivered software functions as specified in the requirements. Uncle Bob could possibly argue that some projects are slow because the developers on those projects are undisciplined and unprofessional. But a claim like that requires evidence and Uncle Bob offers none.

Yes, tools are part of the answer (but not the whole answer)

My goodness, we need more and better tools. When I first started programming, all I had was a text editor with basic syntax highlighting. I used to FTP into the production server to upload my code and run it; I didn’t have a development environment.

Better tools have helped me become a better programmer

Later I moved to Eclipse and thought I was stupid for not doing this sooner. Eclipse caught all kinds of errors I missed with the basic text editor. It just highlighted them like a misspelled word in a word processor–brilliant.

A couple of years later I adopted Subversion as my VCS and I thought I was stupid for not doing this sooner. I could see all the history for my project, I could make changes and revert them. It was awesome.

Ditto for:

  • code reviews/pull requests/Jira
  • advanced IDEs with integrated static analysis, automated refactoring tools, automatic code formatting, and unit tests that run at the push of a button
  • Git/Bitbucket/GitHub
  • TDD
  • property-based testing (QuickCheck family)
  • virtual machines
  • frameworks
  • open source libraries

It’s been nearly twenty years since I started programming and my tools have changed significantly in that time. I can only imagine how the tools that become available in the next twenty years will change how we write and deliver code.

Let’s look at some possibilities.

Better static analyzers

My static analyzers still don’t understand my code and can only pick up simple mistakes. They flag tons of false positives. They can be slow on large code bases. And I’d love it if I had just one static analyzer that did everything I wanted instead of 4-5. It’s also time-consuming to write custom rules. There’s plenty of room for improvement there.

Correct by construction techniques

Then there are “correct by construction” techniques. I watched this video. He had me at “a provable absence of runtime errors.” So I got a book on SPARK (a subset of Ada) and started learning. Wow, you might be able to write highly reliable and correct software in SPARK, but it’s going to be a slow process (aka expensive).

Is this the future? I don’t know, but maybe if it were easier to program in SPARK it might have a better chance in safety-critical software circles. It would also be interesting if someone developed formal method capabilities for my favorite programming language that were accurate and easy to use. “No need to write tests for this module, the prover says it’s mathematically sound,” yes please.

Software to track each requirement to the code that implements it and the tests that prove that it was implemented correctly

I watched a video where the presenter was talking about the difficulty her team has with tracking thousands of requirements to specific code and test cases and back for regulatory compliance purposes in safety-critical systems. The task became much more difficult as they tried to keep everything in sync while the requirements, tests, and code changed as the project progressed. That team and every team like them needs better tools. And, eventually, I’d love to see that kind of thing built into the IDE for my favorite programming language, if it was easy to use.

Formal specification languages/model checkers

Then there are formal specification languages to consider. The Atlantic article mentions TLA+ but there are others. Now imagine that these languages were easy to use. Imagine that you had a tool that could help you construct a formal specification in an iterative way, coaching you along to make sure you covered every case. And when you were done, you could get it to generate some or all of the code for you. Plus, if you got stuck you could just find the answer on StackOverflow. Cool? Hell, yes!

And more…

I’m sure we can brainstorm dozens of new or improved tools in the comments that would help us write better, more correct code at a lower cost.

Why increased discipline and professionalism are not the answer

The fundamental problem is that even the brightest among us don’t have the intellectual capacity to understand and reason about all the things that could happen in the complex interacting systems we are trying to build. It’s not an issue of discipline or professionalism. These systems can exhibit emergent behavior, or behave correctly but in ways their designers never foresaw.

That’s why Dr Leveson’s book is so important. Instead of trying to figure out all those states and behaviors we “just” have to specify the states and behaviors that are not safe and prevent the software from getting into those states. Well, it’s more complicated than that but that’s a part of it.

Conclusion

I’m all for increasing software professionalism and discipline but Uncle Bob’s wrong about how to prevent “The Coming Software Apocalypse” in safety-critical software systems. Experts in the field don’t rank programmer professionalism and discipline anywhere near the top of their priorities for preventing losses.

More programmer discipline and professionalism can’t hurt but we also need ways of taming complexity, better tools, ways to increase our productivity, ways to reason about emergent behavior, research on what actually works for developing safety-critical software systems, new and better techniques for all aspects of the software development process, especially better ways of getting the requirements right, and so much more.

I know there are tons of programmers churning out low-quality code. But organizations building safety-critical systems have processes in place to prevent the vast majority of that code from making it into their systems. So if the software apocalypse comes to pass you can be pretty sure it won’t be because some programmer thought he could take a short-cut and get away with it.

What do you think? Agree or disagree? I’d love to hear your thoughts.

Additional resources

Here’s a video of Uncle Bob’s software professionalism talk: https://youtu.be/BSaAMQVq01E

Nancy Leveson’s book Engineering a Safer World is so important that she released it in its entirety for free: https://www.dropbox.com/s/dwl3782mc6fcjih/8179.pdf?dl=0

Excellent video on safety-critical systems: https://youtu.be/E0igfLcilSk

Excellent video on “correct by construction” techniques: https://youtu.be/03mUs5NlT6U

How to impress your boss AND earn major karma points

Your boss likely has a to-do list with projects she would like to get done but can’t. She would jump at the chance to do these things if she could just pass them off to someone competent and manage them at arm’s length. These projects add value if they are done well (and your boss doesn’t have to babysit them) but they probably aren’t critical to the success of your company.

My adventures in to-do list outsourcing

My boss and I had a to-do list like this. Over the years we’ve tried various strategies to get those jobs done. Unfortunately, our failures outnumbered our successes.

For example, I once hired a guy from Pakistan through one of those freelancer websites to crop and resize a couple of hundred photos for our website. Easy, right? Not so much. He quickly edited the photos and sent them to me, and about half of them had the wrong aspect ratio. We emailed back and forth and he assured me that he understood his mistake and would correct it. The next day I got the photos back and they were still wrong. So I spent two hours chatting with him, trying to communicate aspect ratios across a significant language barrier. He eventually got it. But by that time I could have just done the job myself with less frustration and zero expense.

We also tried temp workers for some manual labor at our warehouse. Some of the temps were good workers but most were not. And one guy was so out of shape that we had to keep telling him to slow down because we were afraid he was going to have a heart attack.

I once hired a stay-at-home mom to work for me part time when I owned my business. That arrangement worked for the better part of a year. But she eventually decided to focus on other things and quit.

Good help can be hard to find, as they say.

The chronically ill as an untapped labor resource

Yup, you read that right. I know this is a little out of left field but hear me out.

I have a chronic illness and I’ve been working from home for more than a decade. I can only work part-time and if my employer insisted I work from the office I would have to quit.

Many people in the support group for my illness found themselves in that exact situation. They couldn’t convince their employers to let them work from home or negotiate reduced hours, so they lost their jobs. And it’s not just people with my illness: there are millions of highly skilled, highly educated people out there sitting idle because they can’t handle a 40 hour work week in an office or at a job site.

I bet you know somebody like this. Maybe this person is a relative or a friend of a friend or a former co-worker. Is there any chance this person would be a fit for one or more of the projects on your boss’s to-do list?

Here’s what you’re potentially getting:

  • a highly educated/skilled/experienced person
  • available for odd jobs on demand
  • reasonable compensation expectations
  • might have to work remotely
  • might not be able to commit to a rigid schedule
  • potentially available for a long-term, low intensity work relationship
  • might have to work at their own pace

Things to consider

Some ground rules:

  • protect your company first and foremost. Don’t even consider hiring someone out of pity or because “it’s the right thing to do”
  • look for projects that either help with your constraint or free up time for you or your boss to work on your constraint
  • only pitch a relationship to your boss if it will be win-win. Just because you know someone who needs money doesn’t mean that you need to be the person to help them get it
  • select appropriate projects for this kind of work. Prefer tasks that are:
    • easily done remotely
    • not time-critical
    • light on training and supervision
    • easy to verify as done correctly
  • do your due diligence. People with chronic illnesses can be thieves, be crappy workers, have drug problems, etc. just like any healthy person
  • be fair with compensation. People with chronic illnesses can be desperate for money. And while you might be able to get someone to agree to unfair compensation, I hope you’ll pay people fairly in the hopes of cultivating a profitable long term relationship
  • consider starting with a short trial period where either party can walk away if things aren’t working out

Final thoughts

So that’s my pitch. People with chronic illnesses are an untapped resource you can use to outsource small projects and allow you to focus on the important aspects of your job. They won’t be a fit for every task. But if you keep your eyes open you might just find the perfect task for the perfect person, which will impress your boss AND earn you some major karma points.

Tell me about your experiences. Have you hired someone with a disability or chronic illness?  How did it go? If you’ve never done it, would you consider it?

What you should know about the Effective Executive: part 6

This post is part of a series on The Effective Executive (by Peter F. Drucker). You can find the first post here. In this post I’m going to tackle chapter 7: effective decisions.

Chapter 7: effective decisions

This is the second chapter on decision-making in The Effective Executive. Click here to read my post on chapter 6: the elements of decision-making if you haven’t already read it.

In the real world any decision worth making is going to be messy.  You won’t find any 100% right answers for most problems. Instead, you’ll be faced with various possible alternatives that rank differently on different criteria. Much of the information that you’d like to have to help you make the decision will not be available.

Yet you need to make a decision because decision-making is part of your job. In chapter 7 Drucker gives us some advice for making responsible decisions using good judgement.

Some decisions I’m facing

Let me give you some examples of some decisions I’m facing:

  • How can we produce high quality software faster?
  • Would hiring another programmer increase our overall productivity enough to offset the cost of having another programmer?
  • Manual testing through the UI is really slow and error-prone. But automated testing through the UI won’t be easy to do with our code base. Many of the paths we need to test require code changes to get the system into the correct state to run the tests (example: non-admin IP address while the shopping cart is off). Plus we want to reskin our website in the near future so that would likely break many automated UI tests we write. What’s the “best” course of action?
  • We have lots of low quality legacy code. Should we spend time refactoring our code to increase its quality? Should we refactor our code to a framework? Or should we keep moving ahead with new features and ignore the old code whenever possible?

I learned this 7 step decision-making process in university. It seems like a reasonable process. Each step is clear and understandable. But if you try to apply this decision-making process to any of the questions I posed above, you quickly run into trouble.

The normal decision-making processes are problematic

How can we “gather relevant information” for my question: “should we continue with manual testing or move to automated UI testing?”

The more you look at the “gather relevant information” step, the more problematic it becomes. There isn’t going to be a study that conclusively settles the question of whether companies in our exact situation should automate their UI testing. And it is impractical to do the same project once each way with different teams to see which is better.

Everything has a context

“Facts” are meaningless without criteria of relevancy. Even if you could gather a bunch of “facts” about the decision you’re facing, they won’t necessarily share your context. For example, just because x other companies automated their UI testing and felt that it was the right thing to do, that isn’t evidence that you should move to automated UI testing.

Does adopting automated UI testing still make sense if your company is:

  • going to lose a big source of income if you can’t fix a major defect in your software before the end of the month?
  • racing to release a product before you run out of cash?
  • planning to replace your software in two months?
  • understaffed to the point of not being able to keep your existing software functioning?

Context matters.

Drucker’s approach

Drucker cautions us against using the usual decision-making approaches for all of the reasons listed above and more. Instead he asks us to start with opinions.

Gather opinions

Opinions are natural and usually plentiful. And if we think of opinions as untested hypotheses about reality, then we should ask: what would we expect to see in the real world if this hypothesis were correct? Drucker wants us to seek opinions from knowledgeable people and stakeholders. But we should also make them responsible for explaining what factual findings we should expect to see in the real world if their opinions/hypotheses are correct.

For example, if Bill says that automated UI testing will “save us tons of time,” he should be able to direct us to several companies similar to our own that made the switch to automated UI testing and measured specific benefits. Or maybe Bill can point us to high quality studies that have repeatedly found automated UI testing provides ‘x’ specific benefits.

“Facebook or Google or NASA or whoever does it so we should do it too” isn’t good enough. “It’s the right thing to do” or “It will payback eventually” aren’t good evidence either.

Choose a measure

We need a new measure to evaluate these opinions. This is where context is important. We can think of many possible measures for the question of “automated UI testing”:

  • how will automated UI testing fit in with our overall objectives and software development processes?
  • do we have a better chance of meeting our project deadline if we do manual or automated UI testing?
  • how will automated UI testing affect our story points completed per sprint?
  • how many sprints will it take to break even on the installation, setup, and learning of UI test automation software?
  • can we get a better return on investment with an alternate quality practice such as code reviews or automated testing one level below the UI?
  • how much does automated UI testing reduce the risk of us introducing a critical bug into our production environment?
  • what tasks must we forgo to free up time to develop an automated UI testing capability?

Do you see where this is going? These are judgment calls. Would you prefer to go slower now, while you learn automated UI testing, for the chance to possibly go faster in the future? How bad is it if a defect slips by your manual testing? How often does that happen? What’s your risk tolerance? Will your team embrace automated UI testing? How important is your next deadline?

Coming up with an appropriate measure is hard. Drucker recommends we gather several good candidate measures, calculate them all, and see which one works the best using our experience and judgment as a guide.
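As a sketch of what “calculate them all” might look like, here is the break-even measure from the list above worked through in Python. Every number is a made-up assumption; plug in your own estimates.

```python
# Back-of-the-napkin sketch of one candidate measure: sprints to break
# even on UI test automation. All numbers are invented assumptions.

setup_cost_hours = 120               # install, configure, learn the tooling
per_sprint_authoring_hours = 10      # writing and maintaining automated tests
per_sprint_manual_hours_saved = 25   # manual regression passes avoided

net_savings_per_sprint = per_sprint_manual_hours_saved - per_sprint_authoring_hours
sprints_to_break_even = setup_cost_hours / net_savings_per_sprint

print(f"net savings: {net_savings_per_sprint} hours/sprint")
print(f"break even after {sprints_to_break_even:.0f} sprints")
```

Even a crude model like this forces the hidden assumptions (setup cost, maintenance burden, hours actually saved) out into the open where they can be debated.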

Generate conflict

Drucker trained as a lawyer and you can see that coming through in his adversarial approach to decision-making. He writes:

Decisions of the kind the executive has to make are not made well by acclamation. They are made well only if based on the clash of conflicting views, the dialogue between different points of view, the choice between different judgments. The first rule in decision making is that one does not make a decision unless there is disagreement.

Okay, this is probably not a programmer’s natural inclination when faced with a decision but it contains a certain logic.

What if Bob took the “let’s automate UI testing” position and Sally took the “let’s keep using manual UI testing” position and you sat as the judge in a mock trial? When each person presents their case, they will also be cross-examined by the other side and by you, the judge. During this exercise you will expose the strengths and weaknesses of all the arguments. If you do this well, the chance that you miss a major factor in the decision will be negligible.

Drucker has three reasons for his insistence on disagreement:

  1. It’s the only safeguard you have against being led astray by office politics, manipulation, and hidden agendas.
  2. Disagreement can provide you with alternatives to a decision. Without alternatives, you aren’t really making a decision at all because you have nothing to choose between.
  3. Disagreement is needed to stimulate the imagination. In all nontrivial situations, you need creative solutions to new problems. Here’s more from Drucker:

Disagreement, especially if forced to be reasoned, thought through, and documented, is the most effective stimulus we know.

Commit and take action

Finally, Drucker encourages us to commit to a decision and execute it.

I once saw a documentary on how the US Marine Corps trains its leaders to make decisions under life-and-death circumstances. They call it “The 70% Solution”:

…when faced with the likely prospect of failure amidst a sea of uncertain, vague, and contradictory information, most people are extremely hesitant to make a decision. We tend to forget that the enemy is also facing a similar information shortfall. Understanding the factors that degrade our decision-making ability on the battlefield and realizing that they will never be absent are absolutely vital to relevant decisions in conflict. As leaders, we must guard against waiting for a perfect sight picture, which may never come, leading to inaction.

Drucker also warns us against hedging and half-measures:

…the effective decision-maker either acts or he doesn’t act. He does not take half-action. This is the one thing that is always wrong, and the one sure way not to satisfy the minimum specifications, minimum boundary condition.

Don’t delay the decision under the guise of “more study.” Unless you have a good reason to believe that additional information will significantly improve the quality of your decision, you should make it and move on. Avoid the temptation to delay and hedge, even if the decision will not be popular or tasteful. You are paid to make decisions, not be popular.

Further learning

High quality software engineering studies are few and far between. But the following books have the best research I’ve seen:

  • Code Complete: A Practical Handbook of Software Construction, Second Edition (Steve McConnell)
  • Facts and Fallacies of Software Engineering (Robert Glass)
  • Making Software: What Really Works, and Why We Believe It (Andy Oram and Greg Wilson)

These books might help you make decisions about the problems you’re facing. Or, at the very least, steer you away from doing something completely stupid. I highly recommend these books to all programmers.

Watch this video for a depressing take on the primitive state of computer “science”: https://youtu.be/Q7wyA2lbPaU. There is very little, if any, evidence in computer “science” (hence the sarcastic quotes) that rises to the level of “smoking causes cancer” or “global warming is real.” We’re basically making things up as we go along. And ignoring or misapplying the scant evidence we do have. This is the reality of our young profession.

Wrapping up

Making effective decisions is hard because the real world is messy. Yet most programmers make important decisions without decision-making training. Drucker recommends we do the following things to help increase the soundness of our decisions:

  • gather opinions from knowledgeable people along with testable conditions you should see in the world if their opinions are true
  • treat each opinion as a hypothesis to be tested
  • generate conflict
  • develop and choose an appropriate measure to evaluate your alternatives
  • commit and take action even when it’s hard or distasteful

What you should know about the Effective Executive: part 5

This post is part of a series on The Effective Executive (by Peter F. Drucker). You can find the first post here. In this post I’m going to tackle chapter 6: the elements of decision-making.

Chapter 6: the elements of decision-making

I had a hard time getting my head around this chapter. I’ve read it several times over the years and I’ve always glossed over it with the takeaway that effective executives make important decisions by following a decision-making process. Beyond that, my eyes kind of glazed over. I’ve encountered so many decision making processes over the years and they all have failed me in one way or another.

But since I committed myself to writing about this chapter in this post, I dove into it and I finally think I understand what Drucker was trying to tell us.

Overview

Effective executives make decisions of consequence for their organizations. They do not make a great many decisions, but they follow a five-step process that establishes a rule or policy addressing the issue at the highest possible conceptual level. A decision is not considered complete until it has “degenerated” into someone’s job and has a feedback mechanism.

Step 1: is this a generic situation or an exception?

The first step is to determine which of four situations you are in:

1. truly generic – all the “problems” are just symptoms of the underlying condition

Until you address the underlying condition, the symptoms will not improve. For example, I had a situation where my VM was crashing randomly, I was getting weird reboots, and my Git repo became corrupted. At first they all looked like unrelated problems. I wasn’t able to diagnose or eliminate them until I realized that they might be linked. I tested my RAM and learned it was faulty. That was the underlying condition tying those seemingly unrelated events together. Once I replaced my RAM (and repaired my repo), my problems went away.

2. a unique event for your company but it is actually generic in business

Examples might include deciding to migrate to the cloud, handling a merger and integrating multiple IT systems, or responding to a data breach. These might be unique events in the history of your company but they are quite common events in business.

3. a truly exceptional and unique event

This is a truly unique event without comparison in the history of your business or any business. Drucker mentions the thalidomide birth defects tragedy as an example. A parallel in programming might be the complexities and challenges of deploying self-driving car software. There’s no real template for the legal, ethical, social, and technological challenges of releasing that kind of software.

4. an early manifestation of a new generic problem

If you deploy a machine-learning-backed pricing bot, you might start to see whole new classes of problems such as:

  • pricing bias/discrimination (illegal pricing strategies adopted by your bot)
  • price manipulation by competing bots (bots manipulating your bot)
  • sub-optimal pricing (your bot fails to outperform humans)

Bad things might even happen to you if a major competitor adopts a pricing bot. You might suddenly find your pricing strategy is ineffective even though nothing has changed inside your company.

An example

You have inevitably struggled to read code other programmers have written. For example, people write overly complicated code or write in an unfamiliar style or with unfamiliar formatting. Lots of teams descend into endless arguments over the acceptable naming standard, white space formatting, programming paradigms, language features, etc. And the whole tabs vs spaces debate is alive and well.

Drucker would encourage us to solve this problem at the highest level possible. In PHP you can simply adopt the PSR standards. And you can even set up your IDE to automatically format your code to that standard. At my company we use a modified PSR standard. Our IDEs automatically format our code before every commit. And we have a couple of static analysis tools that flag things we don’t want to see in our code base.

Anything you can do to avoid negotiating every detail of how your code looks or the language features you use is a win in my book.

Summary

Drucker argues that you must handle truly unique events (situation 3) on a case-by-case basis. Everything else requires a generic solution, a rule, a policy, or a principle. Once you have established the right principle, you can handle all manifestations of the generic situation pragmatically.

Step 2: develop a clear specification of what the decision must accomplish

How are you going to identify a solution that satisfies all your needs if you don’t know the boundary conditions of the solution? The short answer is that you can’t. Unless you have a clear understanding of the boundaries of an acceptable solution, you won’t be able to tell when a plan will solve the problem.

An example

This makes sense. If your boss says he wants your website to be faster but doesn’t give you any other guidance, how will you know which actions will satisfy “faster” in your boss’s eyes? At a minimum, you probably need targets for average and worst-case response times. You may also need to get your boss to set a time or budget limit on the project.

Without knowing the boundary conditions of an acceptable solution, your solution space is unacceptably large. Should you rewrite your website in C? Add a caching layer? Dump your ORM and write raw SQL queries? Optimize your images? Serve your images from a CDN? Split your database from your web server? You won’t have any idea which, if any, of these solutions will solve the “make the website faster” problem.
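To make this concrete, here is a hypothetical sketch of what written-down boundary conditions buy you: once “faster” is expressed as explicit targets, any proposed fix can be checked against them. The target numbers and the samples are made up for illustration; in practice the samples would come from your monitoring tool.

```php
<?php
// Hypothetical boundary conditions for "make the website faster".
$targets = [
    'avg_ms' => 200, // average response time must be at or under this
    'p95_ms' => 500, // 95th percentile (near worst case) must be under this
];

// Illustrative measured response times in milliseconds.
$samplesMs = [120, 180, 240, 310, 150, 480, 200];

sort($samplesMs);
$avg = array_sum($samplesMs) / count($samplesMs);
$p95 = $samplesMs[(int) floor(0.95 * (count($samplesMs) - 1))];

// Now "faster" is a yes/no question instead of a matter of opinion.
$avgOk = $avg <= $targets['avg_ms'];
$p95Ok = $p95 <= $targets['p95_ms'];
```

With the boundary conditions written down like this, you can tell whether caching, a CDN, or raw SQL actually moved you inside them, instead of guessing.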

Step 3: stick with what’s right rather than what’s acceptable

I can’t say this any better than Drucker:

One has to start out with what is right rather than what is acceptable (let alone who is right) precisely because one always has to compromise in the end. But if one does not know what is right to satisfy the specifications and boundary conditions, one cannot distinguish between the right compromise and the wrong compromise–and will end up making the wrong compromise.

Need I say more?

Step 4: convert the decision into action

Here’s more Drucker:

In fact, no decision has been made unless carrying it out in specific steps has become someone’s work assignment and responsibility. Until then, there are only good intentions.

How often has your boss said “From now on we are going to do x” only to have that decision last a couple of weeks (at most)? How often has some policy or initiative you’ve tried to implement followed that same path?

Drucker has some advice for us. Converting a decision into action requires we answer the following questions:

  • who has to know about the decision?
  • what action has to be taken?
  • who has to take it?
  • what does the action have to be so that the people who have to do it can do it?
    • the people assigned have to be capable of the task
    • make sure measurement, standards for accomplishment, and incentives are changed simultaneously; otherwise people get caught in paralyzing internal emotional conflicts

If you can’t answer these questions, you might be in trouble.

Step 5: build feedback into the system

Drucker stresses that effective executives know they must go back and look at the results for themselves. It’s not enough to make a decision and communicate it to the appropriate people. You need to actually go look at the results yourself because that’s the only way to get reliable feedback.

It’s also the only way to see if the assumptions underlying the decision are still valid. Failure to perform this step is the typical reason people persist in a course of action long after it has ceased to be appropriate or even rational.

An example

I worked as a solo programmer for many years. But at one point we needed more programming than I could provide, so we hired a second programmer. I gave my new co-worker a few assignments and sent him on his way. Big mistake. We tried again, but this time I gave him more explicit instructions and the opportunity to ask many more questions about the task. He still missed the mark. It took us a long time to make code reviews our policy, but they were the only way we could ensure that all the new code we wrote met our standards. Code reviews are the programming equivalent of “going to look for oneself.”

Now that we’ve made code reviews part of our culture, we are trying to do more unit testing, which is a great form of fast, automated feedback.
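As a minimal illustration of that kind of feedback (plain `assert` calls standing in for a real framework like PHPUnit, and a made-up function for the sake of the example):

```php
<?php
// Hypothetical example of unit-test-style feedback. In a real PHP project
// you would use PHPUnit, but the principle is the same: every commit gets
// an automatic "go look for yourself" on the code's behavior.

function applyDiscount(float $price, float $percent): float
{
    if ($percent < 0.0 || $percent > 100.0) {
        throw new InvalidArgumentException('percent must be between 0 and 100');
    }
    return round($price * (1.0 - $percent / 100.0), 2);
}

// These checks run in milliseconds and fail loudly when behavior drifts.
assert(applyDiscount(100.00, 10.0) === 90.00);
assert(applyDiscount(19.99, 0.0) === 19.99);
```

Unlike a code review, this feedback arrives in seconds and never gets tired or skips a case it already checked last week.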

Wrapping up

I’ve never practiced the decision making process Drucker advocates in this chapter. I find Drucker’s advice to be practical in a lot of ways but it somehow doesn’t fit with how my mind works. I don’t naturally try to figure out if I’m faced with a symptom of a generic situation or an exception, even if I can see the value in solving a problem at the highest conceptual level. The “5 whys” analysis technique fits my brain better.

I also get the feeling that Drucker envisioned people using his framework to solve bigger problems than I typically face. Drucker’s examples are from situations faced by famous CEOs, military generals, and American presidents.

So your mileage may vary. Read the chapter if Drucker’s decision making process sounds interesting to you. Or skip it and know that all the other chapters are still very applicable to small business programmers.
