Wednesday, December 12, 2007

Pair Programming

I thought that I would take on the highly controversial topic of "Pair Programming". This article gives a brief overview of Extreme Programming (XP) and addresses our experience with pair programming.

Why Extreme Programming?
Scrum is the most popular agile methodology used by game developers today. Its practices are simple and can be used with cross-functional game teams. However, for all the great practices of Scrum, it has no engineering practices to speak of, by design. While this is fine to start with, many teams using agile soon find that their original engineering practices have a hard time keeping up with the changing requirements. These teams may turn to the practices of Extreme Programming (XP) for help.

XP was a methodology developed after Scrum which adopted many of the Scrum practices. Iterations, backlogs and customer reviews, though slightly different, aren't hard to integrate into Scrum teams used to these concepts.

XP introduces new practices for programmers. Among them are Test Driven Development (TDD) and pair programming. This article focuses on both, with most attention on pair programming, since it is the most controversial and misunderstood XP practice.
First I'll start with the benefits of TDD, which will help explain the reasons behind pair programming.

The Benefit of Test Driven Development
The major benefit of XP is derived from the TDD practices. There are lots of great articles about TDD out there, so I'll just give an overview. TDD practices consist of writing a number of unit tests for every function you introduce. Each unit test will exercise the function being written by passing in data and testing what the function returns or changes. An example of this would be a function that sets the health of the player in a game. If your game design defines 100 as full health and 0 as dead, your unit tests would set and test valid and min/max parameters and check to make sure those values were assigned by the function. Other unit tests would try to assign invalid numbers (above 100 or below 0) and test to make sure the function handled those bad values correctly.
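As a rough illustration of the health example, here is a minimal sketch in Python using the standard unittest module. The Player class, the set_health name and the choice to clamp out-of-range values are all assumptions made for this sketch; your game might reject bad values instead.

```python
import unittest

MIN_HEALTH = 0    # dead
MAX_HEALTH = 100  # full health


class Player:
    """Hypothetical player class, used only for this illustration."""

    def __init__(self):
        self.health = MAX_HEALTH

    def set_health(self, value):
        # Assumption: out-of-range values are clamped into the valid range.
        # A game could just as reasonably raise an error instead.
        self.health = max(MIN_HEALTH, min(MAX_HEALTH, value))


class SetHealthTests(unittest.TestCase):
    def setUp(self):
        self.player = Player()

    def test_valid_value_is_assigned(self):
        self.player.set_health(42)
        self.assertEqual(self.player.health, 42)

    def test_minimum_value_is_assigned(self):
        self.player.set_health(MIN_HEALTH)
        self.assertEqual(self.player.health, MIN_HEALTH)

    def test_maximum_value_is_assigned(self):
        self.player.set_health(MAX_HEALTH)
        self.assertEqual(self.player.health, MAX_HEALTH)

    def test_value_above_maximum_is_clamped(self):
        self.player.set_health(250)
        self.assertEqual(self.player.health, MAX_HEALTH)

    def test_value_below_minimum_is_clamped(self):
        self.player.set_health(-5)
        self.assertEqual(self.player.health, MIN_HEALTH)


def run_set_health_tests():
    """Run the tests above and return the unittest result object."""
    suite = unittest.TestLoader().loadTestsFromTestCase(SetHealthTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling run_set_health_tests() runs all five tests. The first three cover the valid and min/max cases, and the last two cover the invalid values.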

If you follow the strict TDD practices, the tests are written before you write the logic of the function that will allow those tests to pass, so the tests will actually fail at first. I can't say that many of us do this all the time, but it does create better results so it's something to encourage.

Once all the tests pass, the code is checked in. Code is checked in quite frequently when the team is using TDD, sometimes every hour or two per pair. This requires a server that takes all the changes as they are checked in and runs all the unit tests to ensure that no changes have broken the code. This server is called a Continuous Integration Server (CIS). By running all the unit tests, it catches over 90% of the problems that commits usually cause. When a build passes all the unit tests, the CIS lets everyone know that it is OK to synchronize. When a submission breaks a unit test, the CIS lets everyone know that the build is broken. It then becomes the team's job to fix that problem. Since the programmer who checked in the error is easy to identify, they are usually the one who gets to fix the problem.
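The core of what the CIS does on each check-in can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular CI product works; run_build, clamp_health and the two check functions are hypothetical names, and clamp_health is deliberately buggy so that one check fails.

```python
def clamp_health(value):
    """Code under test. Deliberately buggy: the lower bound is missing."""
    return min(100, value)


def check_upper_bound():
    assert clamp_health(250) == 100


def check_lower_bound():
    assert clamp_health(-5) == 0


def run_build(author, checks):
    """Run every check and report whether it is safe to synchronize."""
    failed = []
    for check in checks:
        try:
            check()
        except AssertionError:
            # Record the failure and keep going so the report is complete.
            failed.append(check.__name__)
    if not failed:
        return f"Build OK: {len(checks)} tests passed, safe to synchronize"
    return f"Build BROKEN by {author}'s check-in: " + ", ".join(failed)
```

With these definitions, run_build("alice", [check_upper_bound]) reports a good build, while run_build("bob", [check_upper_bound, check_lower_bound]) names bob's check-in and the failing check.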

There are a number of very big benefits that this provides to the team and project. As a project grows, the number of unit tests grows into the thousands. Running on automated servers, these thousands of tests continue to catch a great many errors that would otherwise have to be caught by people. These tests also create a safety net for refactoring and other large changes that an emerging design will require.

Catching bugs as quickly as possible allows those bugs to be fixed as quickly and inexpensively as possible. The code which created the bug is still fresh in the mind of the programmer who wrote it so they, or another programmer assigned to fix the bug, won't have to relearn the code months after it was written. There is also the danger that in fixing an old bug, the programmer changes the behavior of the system in a way that produces other bugs. By catching these problems early, we prevent future code from becoming dependent on a flawed foundation.

There are additional benefits of TDD. One of them is that unit tests create living documentation of the code. The unit tests are usually named to describe the behavior they are testing. When you look at all the unit tests for a particular function, you get a firm idea of the purpose and behavior of that function. This is far better than documentation that is written separately and is difficult to maintain. In fact there are utilities which will extract the unit test names for functions and produce nice documents describing all your functions for you.
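To give a feel for how test names can double as documentation, here is a small sketch using Python's standard unittest module. The describe_tests helper and the test names are made up for illustration (the test bodies are left empty because only the names matter here), and real documentation utilities will do considerably more.

```python
import unittest


class SetHealthTests(unittest.TestCase):
    # Test bodies omitted: only the names are used in this sketch.
    def test_full_health_is_100(self):
        pass

    def test_zero_health_means_dead(self):
        pass

    def test_value_above_100_is_clamped(self):
        pass


def describe_tests(test_case_class):
    """Turn the test method names of a TestCase class into readable lines."""
    # getTestCaseNames returns the test method names in sorted order.
    names = unittest.TestLoader().getTestCaseNames(test_case_class)
    lines = [test_case_class.__name__ + ":"]
    for name in names:
        readable = name[len("test_"):].replace("_", " ")
        lines.append("  - " + readable)
    return "\n".join(lines)
```

Printing describe_tests(SetHealthTests) lists each behavior in plain words, such as "value above 100 is clamped", under the name of the test class.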

One of XP's philosophical foundations is that the programmers create the absolute minimal amount of functionality needed to deliver what the customers request every iteration. For example, the customer wants to see one AI character walking around the environment. Most programmers will want to architect an AI management system that will handle dozens of AI characters because they "know that the game will need it sometime in the future". With XP, you don't do this. You write the code as if one AI character is the only one that will be needed. When the customer asks for more than one AI character in a future iteration, you might then introduce an AI manager and refactor the original code. Although this might represent more work overall, you will usually produce a better AI manager this way.
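The AI example might play out roughly like the sketch below. Everything in it (AICharacter, AIManager, WALK_SPEED, one-dimensional movement) is a hypothetical simplification chosen to keep the example short, not a recommended engine design.

```python
WALK_SPEED = 1.0  # hypothetical walking speed, in units per second


class AICharacter:
    """Iteration 1: the only thing the customer asked for is one walker."""

    def __init__(self, x=0.0):
        self.x = x

    def update(self, dt):
        # Walk along a single axis; enough for the first demo.
        self.x += WALK_SPEED * dt


class AIManager:
    """Later iteration: introduced only once several characters are requested.

    AICharacter is reused unchanged, and its existing unit tests still
    guard the walking behavior during the refactoring.
    """

    def __init__(self):
        self.characters = []

    def add(self, character):
        self.characters.append(character)

    def update(self, dt):
        for character in self.characters:
            character.update(dt)
```

In the first iteration the game loop would call update on a single AICharacter directly. The manager only appears when there is a real requirement for it, so its design is shaped by what the game actually needs.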

A major benefit of TDD is that it supports this "constant refactoring" of the code base, because the unit tests tell you immediately when a change breaks existing behavior. There are a number of reasons to work this way:
  • Systems created from refactoring often match their requirements more closely.
  • Refactored code has a much higher quality.
Personally I don't fully agree with the purist approach that XP programmers should always do the absolute minimum. I believe that knowledge and experience factor into how much architecture should be pre-planned. It’s very easy to plan too much ahead and write code "the right way the first time" based on that plan, but I believe there is a sweet spot that you can find between the two.

TDD is very useful and in fact is not a difficult practice for programmers to adopt. In my experience, if a programmer tries TDD for a while, the practice of writing unit tests becomes second nature. The practice of refactoring takes longer to adjust to. Programmers resist refactoring unless it is necessary. This resistance reinforces the mindset of writing code "the right way, the first time", which leads to a more brittle codebase that cannot support iteration as easily.

Pair Programming
This brings us to the practice of pair programming. Pair programming can be thought of as a continual peer review. Two programmers sit at a workstation. One types in code while the other watches them and provides input on the problem they are both solving.
This practice creates a great deal of fear at first. These are some of the concerns:
  • "Our programmers will get half the work done"
  • "I do my best work when I am focused and not interrupted"
  • "Code ownership will be destroyed, which is bad"

You can use studies and statistics to argue these fears are unfounded, but we're dealing with people's personal workspace and work habits here, so things can get rather emotional and subjective on the topic of pair programming.

There are benefits from pair programming to the team:
  • Spreads knowledge
  • Assures that you'll get the best out of TDD
  • Eliminates many bottlenecks caused by code ownership
  • Creates good standards and practices automatically
  • Focuses programmers on programming
Spreading knowledge
Pair programming isn't about one person typing and the other watching. It's more of an ongoing conversation about the problem the pair are trying to solve and the best way to solve it. Given one problem and two separate programmers, you'll often produce two separate results. If you were to compare these results, you might find that each solution had strengths and weaknesses. This is because the knowledge of each programmer does not entirely overlap with the other's. The dialog that occurs with pair programming helps to share knowledge and experience widely and quickly.

While this is good for experienced programmers, it is an outstanding benefit for bringing new programmers up to speed and mentoring entry level programmers. Pairing will bring a new programmer up to speed in half the time and will eliminate many of the bad coding practices they may be bringing to your shop.

Assures that you'll get the best out of TDD
TDD requires that comprehensive tests are written for every function. This task is made easier by pairing. First, it's in our nature to occasionally slack off on writing the tests. From time to time, the partner will remind you to write the proper test or take over the keyboard if you are not fully motivated. Secondly, it's common to have one programmer write the tests and the other write the function that will pass the tests. Although this doesn't become a competition between the two, it almost always ensures better test coverage. When the same programmer writes both the test and the function, their assumptions may keep them from considering the entire range of tests. As the saying goes, "two heads are better than one". This definitely applies to TDD.

Eliminates many bottlenecks caused by code ownership
How many times have you been concerned about a key programmer leaving the company in mid project or getting hit by the proverbial bus that seems to be driving around hunting down good programmers? Pairing solves some of this by having two programmers on each problem at all times. Even if you are lucky enough not to lose key programmers, they are often too busy with other tasks to instantly solve every critical problem that comes along. Pairing smooths out many such constraints.

Creates good standards and practices automatically
How many times have you discovered in Alpha that one of your programmers has written thousands of lines of poor quality code that you depend on? We generally try to solve this problem by defining "coding standards" and conducting peer reviews. Coding standards are often hard to enforce and are usually ignored over time. Peer reviews of code are a great practice, but they usually suffer from not being applied consistently and are often too late. Think of pair programming as a "continuous peer review" practice. It catches many bad coding practices very early. As the pairs mix, a company coding standard will emerge that is improved daily. It doesn't need to be written down because it is documented in the code and in the heads of every programmer.

Focuses programmers on programming
When programmers start pairing, they usually discover that the first few days are exhausting. The reason is that they do nothing but focus on the problem the entire day. Mail isn't read at the pair station. The web isn't surfed. We have shared email stations set up for when a programmer wants to take a break and catch up on mail. You never realize how much of a distraction those things were until you pair. Not everyone agrees this is good, but for those that just want to be at work for eight hours it is generally seen as a positive thing.

Pairing: All or Nothing?
We don’t enforce 100% pair programming, but we have done it long enough that it is pretty much second nature. If we abandoned pair programming I would want to make sure that we were still seeing the benefits above.

Pair Programming Problems
There are some problems to watch out for with pair programming:
  • Pair chemistry
  • Pairing very junior people with very senior people
  • Hiring issues
Pair chemistry
Some programmers do not make good pairs. The chemistry does not work and you cannot force it. For us the pairs are usually self-selecting, so things work out. In rarer cases some programmers can't pair with anyone. Any sufficiently large team that is switching over to a pair programming practice will have some programmers that cannot pair. It's OK to make exceptions for these people to program outside of pairs, but they will still need a peer review of their work before they commit. They usually will do some pairing as time goes by and may even switch over to do it all the time. You just need to give them time and not force it.

Pairing very junior people with very senior people
Pairing the most experienced programmers with the least experienced is not ideal. You'll usually have the senior programmer doing all the work at a pace which the junior programmer cannot keep up with. Matching junior level programmers with mid-level programmers is better.

Hiring issues
You want to make sure that every programming candidate for hire knows that they are interviewing for a job that includes XP practices and what that entails. We include a one hour pair programming exercise with each candidate in the later stages of our hiring process. This does two things. It is a great tool for evaluating how well the candidate will do in a pair situation where communication is critical. It also gives the candidate an exposure to what they are in for if they accept an offer. A small percentage of candidates will admit that pairing is not for them after this and opt out of consideration for the job. This is best for them and for you.

Measurable Improvements?
I'm often asked about how much XP has improved productivity. We haven’t measured the changes in productivity, mainly because we implemented XP during a transition to a new engine and set of consoles. It was very clear that the pace of new features slowed down at the start. This corresponds to measurements outside our industry, which have shown that a pair of programmers using TDD is about 1.5 times as fast as one programmer working alone. The additional benefits clearly put XP/TDD well above the break-even level of productivity, especially for games:
  • 90+% stability in new builds at all times. This adds to the productivity of designers and artists.
  • Post-alpha debugging needs are vastly reduced. We spent a lot more time tuning than debugging.
  • Iterative methods like Scrum tend to cause more bugs due to the higher level of change. TDD helps address that.

Introducing XP
The thing that worked for us in introducing XP was to have one team take it on to either prove the benefits and dispel the fears or to show there was no value. The team was composed of some of the more open-minded programmers who were also very influential opinion holders in the company. After their initial success, everyone else wanted to try XP and prove that they could be successful with it as well. It’s worth noting that our programmers still write unit tests and refactor on their own personal programming projects.

Bottom Line
TDD is the most important beneficial element of XP, but I’m not convinced you get the full benefits without pair programming. Pairing continues to be a touchy subject among programmers and there are a lot of opinions about it from people that haven't tried it (or tried it for only a few days). All I can say is that it does work under some circumstances, but I can't give you solid figures about how much things like stability, code quality and innovation improved since we switched. As with any practice, you need to experiment with it, modify it for your needs and keep your eye on the principles you establish for your own products and teams.


malik said...


great article, I just wanted to know which unit testing framework and CIS do you use or recommend?

Anonymous said...

Hi ishaq,

here is my wild guess:


Plus, the following is a good addin:
UnitTest++ GuiRunner

malik said...

Thanks Kim, I did some searching too, and it turns out that UnitTest++ would be the testing framework of the choice, I also downloaded the GUI Runner from your site (didn't understand the content of the site though, may be that won't hinder me from using it, right?).

Clinton Keith said...

Kay Kim is correct on both counts.

Anonymous said...

Interesting article. One thing I noticed: in the section on people's gut reactions to pair programming, you cite "studies and statistics to prove these fears are ungrounded." However, following that link, the article doesn't mention pair programming at all; it's all about TDD. Additionally, the word "pair" only shows up in two of that article's references. In one of these, the results were an increase in effort of 17% (bad), but a reduction in residual defect quantity of 36% (good). The other one shows a decrease in quality of 25-45%, with no comment on improved productivity.

Most of the other references have titles that imply they were just looking at TDD, not pair programming or other XP principles. I don't think that reference supports pair programming at all. I'm willing to consider it, but do you have any evidence that pair programming doesn't have the negative effect on productivity that it intuitively would seem to?

Clinton Keith said...


Two of those studies reference pair programming (10 & 18). The point of the paragraph was not that "I" am using them to prove these fears ungrounded but that "You could use" them to prove it. The point was that the problem was not in the proof, but the subjective issues with pairing. I changed that to say "you could use [the studies] to argue".

The metrics on any of this are shaky. That's why I say in the bottom paragraph that "I’m not convinced you get the full benefits without pair programming". We continue to try variations of the practices and measure the resulting stability of the builds. This all takes time.