Wednesday, September 12, 2007

Book Review: Implementing Lean Software Development

Implementing Lean Software Development: From Concept to Cash

We've recently been applying some lean principles to game production. The results have been great, and I have been returning to this book time and time again.

Lean
I've been reading about Lean for several years. It doesn't have as well-defined a set of practices as Scrum or XP. Instead, it has a very solid set of principles that can drive a wide range of practices across any type of product development or manufacturing effort. This makes it particularly appealing for game development. Developing a game involves not only product development but also a production line effort to create the content for 8-20 hours of gameplay.

I’ve enjoyed reading anything that Mary and Tom Poppendieck write or say about Lean. Their writing is dense with value but easy to read. Their lean books tie in all the principles with great examples and historical perspective.

Overview

I’ll go over the main sections of the book. Although the book addresses software development, there is a lot there for people who want to work with content creation as well (producers, art and design leads, etc).


History of Lean

The book starts with the historical background of Lean in manufacturing: how the efficiencies created in the United States to ramp up war production met the philosophy of Sakichi Toyoda and his descendants as their family business evolved into Toyota Motors. The resulting Toyota Production System created a “just-in-time” workflow and “autonomation”, or “automation with a human touch”. Toyota has extended this system to its product development side as well. The result is that Toyota, using its development and production systems, has become the world’s most successful car manufacturer.


Many of Toyota’s methods, which were dubbed “Lean” in 1990, have been copied worldwide. There have been a number of books about the "Toyota Production System".


Principles

The chapter starts with a definition of principles: “principles are underlying truths that don’t change over time or space, while practices are the application of principles to a particular situation”.

Any book on “Implementing Lean” has to start with the principles. Mary and Tom define seven:
  • Eliminate Waste - Not only work done wrong, but work that shouldn't have been done at all.
  • Build Quality In - Don't fix it at the end. Build quality in as you create features.
  • Create Knowledge - In your organization, about your product and your market.
  • Defer Commitment - Make irreversible decisions as late as you can. Keep your options open.
  • Deliver Fast - Give your customers something quickly. Continually iterate.
  • Respect People - The people building the product know best how to improve the process. Create a culture where continual improvement comes from everyone.
  • Optimize the Whole - Local optimization actually makes the whole less effective. Use tools such as "Value Stream Maps" to visualize how everything flows and optimize there.
From there the book's chapters focus on the following key aspects of implementing Lean Software Development. I'll briefly describe each.

Value

How do you align your development progress and decision making closely with the value you are delivering to your customer? This chapter has a number of stories about companies and products that continually succeed at delivering value. Towards the end of the chapter it touches on leadership, team and development principles for doing this.


Waste

This chapter is one of the core chapters of the book. It covers the seven wastes as defined by the Toyota Production System (and their parallels in software development) and introduces "Value Stream Maps" which are an effective tool in mapping out your process and identifying waste.


You'd think that "waste" is pretty easy to define, but the seven wastes show that it can come in many forms. Focusing on waste and eliminating it is the core of Lean.


Speed
Speed is about delivering fast. It is defined by the absence of waste. If you practice Scrum, this chapter is an especially good read to see how the underlying theories of Scrum and Lean share the same roots.

People

The book Peopleware by Tom DeMarco and Timothy Lister is one of the bibles that anyone working on games should read. This chapter reflects much of the wisdom and common sense of Peopleware as applied to Lean.

Knowledge

One of the most valuable assets that you have is knowledge:

  • Knowledge of your technology
  • Knowledge of your tools and process
  • Knowledge of your market
  • Knowledge of what you are making
The question of "how we build knowledge in a company" is a big one, especially for a game development company: How do we make decisions about features? How do we decide which technology to pursue? This chapter is focused more on specific examples and practices that have worked for other companies. The key is finding your own practices for your organization. Anyone familiar with the principles of XP (Extreme Programming) will recognize the principles expressed in this chapter as they apply to building knowledge.

Quality

How is quality built into a lean development effort? It's built by addressing it continually using an iterative process and by being disciplined in doing the extra work required.


This chapter starts with the example of the Polaris Submarine Project. The Sputnik launch convinced the US that the Soviet submarine missile threat was imminent. As a result, the US successfully moved the deadline for building its first missile-carrying submarine from 9 years to 2.5 years. How was this done? By abandoning the big plan, using vertical slices, and iteratively prioritizing the features that the customer (the US) valued the most. Although the Admiral in charge of the project knew nothing about Lean, he applied many of the same principles to the project. Unfortunately these lessons were lost on most subsequent defense projects and are still not applied today.


The chapter discusses some of the Japanese practices that have been adopted for quality and how team discipline is critical to making sure that quality is part of the daily effort. Once again, as you read about this, you see the strong roots of practices in Scrum and XP.


Partners
This chapter addresses how the lean organization works with partner organizations (such as Boeing and its suppliers). The information here is interesting when you apply it to how the game industry is using more and more outsourcing. The best point made is that the value stream needs to be simplified as much as possible before outsourcing parts of it (or all of it). It seems as though the game industry outsources a lot of work that is done the wrong way. If you address the value stream first, you may not need to outsource at all.

Journey

The final chapter summarizes the book, takes a look at some of the other initiatives that have been used (Six Sigma and the Theory of Constraints) and ends with some solid advice on how to start using Lean.


Recommendation

I'd highly recommend this book to anyone who has been using agile to develop a video game. The principles of Lean are directly applicable to all aspects of development and production.


As we explore fitting game development under the agile umbrella, we find that different practices have different coverage. Scrum has a great set of multidisciplinary practices that cover preproduction very well, but may not have complete coverage for production. XP practices are great for all of the development cycle, but only for the programmers on the team. Lean-inspired practices for preproduction and production hold the potential of covering much of what we may lack in content production. This makes sense given Lean's production roots at Toyota.

Wednesday, September 05, 2007

Sprint Burndown Charts for Feature Teams

The Sprint Burndown chart is one of the most useful tools in Scrum. It helps make the progress of the work being done against the Sprint Backlog visible to the team and others.

The team can use the backlog to estimate if they are going to meet the goals based on the velocity (slope) of the Burndown and react as early as possible if there is a threat to meeting their commitment.
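As a rough illustration of reading the slope, the sketch below projects when the remaining hours would reach zero if the average velocity so far held. The function name, the dictionary keys and the idea of recording one "hours remaining" figure per day are assumptions made for this example, not part of Scrum itself.

```python
def project_burndown(remaining_by_day, sprint_days):
    """Project the finish day from a list of 'hours remaining' readings.

    remaining_by_day[0] is the reading at the start of the Sprint and each
    later entry is the reading at the end of that day (hypothetical layout).
    """
    days_elapsed = len(remaining_by_day) - 1
    if days_elapsed < 1:
        raise ValueError("need the starting value plus at least one daily reading")

    # Velocity is the average number of hours burned per day so far.
    burned = remaining_by_day[0] - remaining_by_day[-1]
    velocity = burned / days_elapsed

    if velocity <= 0:
        # The backlog is flat or growing, so no finish day can be projected.
        return {"velocity": velocity, "projected_finish_day": None, "on_track": False}

    projected_finish_day = days_elapsed + remaining_by_day[-1] / velocity
    return {
        "velocity": velocity,
        "projected_finish_day": projected_finish_day,
        "on_track": projected_finish_day <= sprint_days,
    }
```

A straight-line projection like this is only a conversation starter. It says nothing about which discipline is behind or whether the backlog itself is still growing.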

A number of issues have come up with Burndown charts as they are used for Feature Teams, however. I've been thinking about the Burndown charts and would like to share some of the reasoning behind tweaking a core practice.

The teams are cross-disciplined
Early in its life, Scrum was mainly practiced by IT teams that were dominated by programmers, which influenced the standard practices. For example, if a team is made up of C++ programmers and the Burndown is showing that they are falling behind, maybe due to problems with one story, then the entire team can pitch in and help each other reach the goal. Maybe they all come in on a Saturday and work together.

For game teams, especially feature teams, we have a much wider range of disciplines within each team. Cross-functional teams can lack the all-for-one and one-for-all benefit that comes more easily to teams of one discipline. Take the example of a Feature Team that has a single character artist who is falling behind in their work. The Burndown probably won't show this clearly well in advance. How does the team respond to being behind on character models? If the artist comes in on Saturday to catch up, does the entire team come in and do nothing, or take the day off? Neither solution fosters the best team spirit.

Dependencies can also affect the progress of a story through the hand-off of work. If there is a chain of tasks that occur after the character modeler has completed their work, then a delay in one task can shift the entire story and endanger the Sprint. Again, this is a problem that is not usually seen on teams of just character modelers or programmers.

Another problem is that artists and designers work better with time-boxed estimates than with completion estimates. The difference is that a time-boxed estimate is the limit on how long the designer or artist will work on something, while a completion estimate is a more definitive estimate that is better applied to programming tasks. Artists can iterate, refine and polish a piece of art for as long as they like. It's often a subjective judgment to say it's good enough. Programmers have a somewhat easier time determining completeness. Code either works or it does not. We usually don't have the urge to refine code forever until it is "beautiful enough" (there are exceptions though!). Should this difference be reflected in a Burndown differently?

Sprint Task Backlogs Change during the Sprint
The stories that the team committed to completing don't change, but teams are allowed to change their task backlogs during a Sprint. We've found that the Backlog task list can grow up to 50% depending on how well the mechanic is understood. We tried to spend more time in Sprint planning to eliminate the uncertainty, but it produced worse results at the reviews by limiting flexibility within the team to explore and discover. We tried defining a great deal of detail within the "conditions of satisfaction" for the stories, but they didn't define the qualitative bar we were seeking; "Finding the fun" involves uncertainty even at the Sprint granularity. So teams adopted Product Owners as Pigs to shrink product decision loops down to the minimum necessary.

The obvious problem with so much flux in the backlog is that the Burndown chart projections don't reflect backlog progress as transparently as they should. Fortunately there is still some predictability from Sprint to Sprint. A team that estimated 250 hours of work and successfully finished 500 within their Sprint can safely commit to 250 hours the next Sprint and expect the same expansion. Still, having a good day-to-day measure of progress is desirable.
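The arithmetic behind that rule of thumb can be sketched as follows. The function names are hypothetical, and the sketch assumes the expansion ratio observed in one Sprint carries over to the next, which is the same assumption the paragraph above makes.

```python
def expansion_factor(estimated_hours, completed_hours):
    """Ratio of the work actually finished to the work estimated at planning."""
    if estimated_hours <= 0:
        raise ValueError("estimated_hours must be positive")
    return completed_hours / estimated_hours


def safe_commitment(capacity_hours, factor):
    """Hours to commit to at planning so the expanded backlog still fits.

    capacity_hours is the total the team can really finish in a Sprint.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    return capacity_hours / factor


# The example from the text: 250 hours estimated, 500 hours finished.
factor = expansion_factor(250, 500)
next_commitment = safe_commitment(500, factor)
```

With the numbers from the text, the factor is 2.0 and the next commitment comes out at 250 hours, matching the paragraph above.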

The Value is in the Stories and not the Backlog
With so much change among a cross-disciplined team, the connection between a Burndown chart hitting zero hours and "doneness" in the stories becomes a lot fuzzier. In fact there is a real danger of teams becoming too fixated on the Burndown and not enough on "doneness". Traditionally you can look at a Burndown chart and read into the issues of the team. A consistent velocity on the Burndown (straight line from start to end) is usually a good indication that the team is too focused on keeping a constant velocity through the unconscious manipulation of their task estimates. Nature abhors a straight Burndown. This causes problems which are apparent at the review:
  • The team shows consistently lower velocity in value added to the feature/game.
  • The technical/design debt of the team can grow. Bugs, loose ends, etc are in full view.
As a customer I tell the teams "I don't care about you completing all the stories so much as seeing the team nail the higher priority ones". Great games are the ones that have a few great features rather than many mediocre ones. I would rather see a team blow us away with a few of the top stories and leave the bottom ones for the next Sprint than complete them all to the "letter of the law, but not the spirit". When you tell a team to keep going back to the same story (or minor variants of it) in an attempt to find the fun, then the team is not taking proper ownership of the feature. A great feature is going to take the time it needs to be fully developed. It's difficult to predict.

Creativity = Fun = Success
What makes working in the game industry so great is that value in the product is built day-to-day through the creativity of everyone on the team. Hmmm…what makes the product better is what makes working on the product so much fun? Give me more of that! Somehow writing "make it REALLY fun" as a condition of satisfaction isn't enough. It needs to happen at the team level.

An Ideal Feature Team Burndown - Does it exist?
So what can a team do to address these issues? As Ken Schwaber says "Scrum is about making things visible so that you can make common sense decisions".
So what are we trying to make visible?
  • Progress of each story towards being "done".
  • Dependencies and handoffs within the team.
  • When we are running short of time (as early as possible) to finish everything.
Suggestion - Visualizing Time-boxed Workflow
Perhaps a story is to finish a section of a level. Rather than detail each task out, the team will create time-box estimates of the stages that the production pipeline requires:
  • Initial concept - 1 day
  • Design modular pass - 3 days
  • Art pass - 3 days
  • Design tuning - 2 days
  • Art fine pass - 1 day
Time-boxing turns out to be a more useful tool for this kind of work (see above). When a pipeline of time-boxed work exists for a story, it would be great for the team to see where they are in the time-box versus where they should be based on remaining time.
How would we visualize this? The simplest thing would be to show:
  • What stage are we currently working in for a story?
  • How are we tracking for total time and stage time?
This could be shown on the task-board next to the stories. The boxes would be scaled for each stage (in calendar days) to match the two timelines on the bottom (one for calendar timeline, one for actual progress time). The focus of the daily Scrum would not be going around the room and asking the three questions, but starting at the highest story and asking about the completeness goal of each story. The answers to the three questions would naturally emerge from discussing progress at the story level.

Based on the Daily Scrum reporting, the progress token would be moved over. If the progress was slipping behind schedule it would become apparent each day and discussed. In the figure above, this is the case. Progress is behind schedule (current time).
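A minimal sketch of the bookkeeping behind such a progress token is shown below, using the stage list from the level example above. The function names and the idea of recording "days of progress within the current stage" at the Daily Scrum are assumptions made for this illustration.

```python
# Time-boxed stages for the example story, in pipeline order (days).
STAGES = [
    ("Initial concept", 1),
    ("Design modular pass", 3),
    ("Art pass", 3),
    ("Design tuning", 2),
    ("Art fine pass", 1),
]


def expected_stage(stages, elapsed_days):
    """Stage the calendar says the story should be in after elapsed_days."""
    boundary = 0
    for name, days in stages:
        boundary += days
        if elapsed_days < boundary:
            return name
    return stages[-1][0]  # past the end of the pipeline


def progress_days(stages, current_stage, days_done_in_stage):
    """Position of the progress token, in days of time-boxed work completed."""
    total = 0
    for name, days in stages:
        if name == current_stage:
            # Progress within a stage cannot exceed that stage's time-box.
            return total + min(days_done_in_stage, days)
        total += days
    raise ValueError(f"unknown stage: {current_stage}")


def slip_days(stages, elapsed_days, current_stage, days_done_in_stage):
    """Positive when progress is behind the calendar, negative when ahead."""
    return elapsed_days - progress_days(stages, current_stage, days_done_in_stage)
```

For instance, if five calendar days have passed and the team reports two days of progress in "Design modular pass", the calendar expects the story to be in "Art pass" and `slip_days` reports two days behind.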
The problem I see with this is that it is not as effective for stories that don't have such a pipeline of time-boxed work. Do we need to have some form of burndown for every story? If so, do we lose some value in a single Sprint? Is there some way we can still have a single chart to show the big picture progress of the Sprint?
Conclusion
The only way to find out is to try it out at the team level and evolve it. Retrospectives are great places to discuss these things and tweak them further. Ongoing practice refinement like this is critical to getting the most out of Scrum.

Monday, September 03, 2007

Principles and Practices

From Mary and Tom Poppendieck:

“Principles are underlying truths that don’t change over time or space, while practices are the application of principles to a particular situation”.

Agile Game Development preserves the principles of agile while seeking better practices to apply them.

Success or failure with agile seems to be based on how the principles are maintained.

Saturday, September 01, 2007

A Case for Small Teams and Longer Development Cycles

Great Games are Hard to Make

We were having a discussion the other day about how it seems that most great games which come out go through some form of crisis or independence from deadlines. One development story was about MechWarrior 2. The team had struggled to find the gameplay to allow funding to continue, but at one point the game was canceled by an executive on a Friday. The team struggled to finish a pass at gameplay over the weekend and fortunately succeeded in overturning the decision Monday morning.

It seems that every breakthrough game had a similar chaotic history or had a commitment by the developers to “release it when it’s ‘done’” (which was longer than the publisher or public liked).

No publisher in the world is going to sign up for a “release it when it’s ‘done’” development deal with anyone besides an Id Software or some other similar developer who can share the risk and who has a proven track record. It’s reasonable considering the risk, but what potential great titles like MW2 didn’t make it?

The Publisher Business Model Rules All

Publishers are like any other large business. They have shareholders that expect returns on their investment and a board of directors that answers directly to those shareholders. The board demands a business plan from their executives that shows perhaps a five year plan with lots of promise of profit. To create this plan, the executives develop a portfolio of game products with an estimated profit/loss (P&L) statement for each. Obviously each future game in the portfolio has a positive P&L or it wouldn’t be in the portfolio. Unfortunately a large portion of games released do not make a profit. We are still a hit driven industry, but hits can be a bit difficult to predict. Hit franchises give you the best chance (Madden, WoW, etc), but many franchises fade over time.

The portfolio should show a good mix of proven franchise titles and riskier titles that are potential new franchises (new licenses or new intellectual properties (IP)). The problem is that riskier titles won’t get a big financial bet. In order to get a positive P&L you have to compare the predicted sales with the cost of development (and marketing, etc). No one in their right mind would predict huge sales for a brand new license or IP. So to keep the P&L positive, the dev cost has to be low.
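To make the reasoning concrete, here is a deliberately simplified sketch of that comparison. The function name, the parameters and every number in the usage lines are invented for illustration; a real P&L has many more terms than predicted sales, development cost and marketing.

```python
def max_dev_budget(predicted_units, net_revenue_per_unit, marketing_cost):
    """Largest development cost that still leaves the projected P&L at break-even.

    Simplified model (an assumption for this sketch):
        P&L = predicted_units * net_revenue_per_unit - marketing_cost - dev_cost
    """
    return predicted_units * net_revenue_per_unit - marketing_cost


# Purely hypothetical figures: a cautious forecast for a new IP and a
# much larger forecast for a proven franchise, with the same marketing spend.
new_ip_budget = max_dev_budget(200_000, 20.0, 1_500_000)
franchise_budget = max_dev_budget(2_000_000, 20.0, 1_500_000)
```

Under these made-up numbers, the cautious sales forecast for the new IP caps its development budget at a small fraction of what the franchise title can justify, which is the squeeze the paragraph above describes.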

Developing the Riskier New Franchises

So there are a number of new, riskier titles in every publisher’s portfolio. Unfortunately they are often cast in the same date-driven mold as the proven franchise development projects. They need to be treated differently.

Risky new titles should be developed with the P&L in mind, not the portfolio. The portfolio is a wish list used to predict the market and the marketing flow. It’s a poor tool to apply to new titles. As with design documents, if anyone takes the time to look back a few years at the original portfolio plans, they bear little resemblance to what really happened.

To maximize the units sold and minimize the cost of development, development of these titles needs to pursue an effort that maximizes efficiency, is quality focused and minimizes risk to the publisher.

Maximizing Efficiency

Research has shown that there is a “sweet spot” in the number of people working on a task; with fewer or more people than this sweet spot, the cost of development rises. This is the basis of The Mythical Man-Month and of Scrum's guidance on team size. Brooks's Law is especially true of the game development industry, where adding more people to a game development project in pre-production not only has little effect on the ship date but can actually slow down the team through communication overhead and loss of focus.

There are many ways to minimize production time (outsourcing, reuse, etc) by improving efficiency and partitioning, but pre-production is a highly creative and iterative process that can be slowed down by applying the same “fixes”.

Better games can be made by having small creative (inexpensive) teams iterating over the possible gameplay mechanics.

Maximizing Quality

The first factor in quality is the team. Talent and teamwork are the keys to success. How do you know whether the team can deliver? Iterate. Have the team demonstrate their progress every few weeks. Participate in the direction that the game is taking. Marketing should spend time in reviews and planning as well.

Reducing Risk

This approach fits well with the kill gates approach of seeding many ideas and allowing only the best to fully grow. This sounds harsh to some, but it’s better to kill a bad game early than to spend years working on it only to see it fail in the market. Also, the impending cancellation of your project can be a great motivation for the team to prove the game's value, as with MechWarrior 2.

Incentive to Take Risks

Based on the current business model, it's difficult for the developer and the publisher to adopt a pre-production model that allows for quick cancellation if the game is going nowhere. Developers want predictable, long-term cash flow. Publishers want to keep the portfolio flow going and want a sense of security from a big plan.

However, examples of this model exist (see below). The important thing is to preserve the relationship and pursue ideas with a talented team.

Where does Schedule fit in?

At some point in time, it should be apparent what the core game is and what it is going to take to finish it. At this point the production side of the project can ramp up and a low-risk gold date can be predicted.

Production can’t be ramped up overnight. So every release should have an evaluation of the project in terms of the conditions for production that remain to be met. The team should be able to give a release or two advance notice (~ 3-6 months) of when production can begin. Longer notice can be given with less certainty.

Where games took 2 years to develop, they now might take longer. Teams may save some of the time by entering production with truly shippable mechanics rather than wasting time reworking assets in light of changing mechanics. It should be a lot less expensive and produce better results.

The main risk of this approach is rushing the production trigger. It’s hard enough not to fix a launch date two years in advance. Asking for a year or less commitment to the launch date may be impossible for marketing up front, but it’s what is often done in light of project delays.

Not a New Idea

None of this is new. This approach dovetails with agile very well though. As more developers adopt agile practices it will make more sense to develop games with this approach.

As a member of the Nintendo Ultra-64 Dream Team, Angel Studios was exposed to this model in the mid-90s. Nintendo would discuss a game idea with us and ask us to "find the fun". They'd give us enough money to operate for three months and then come back at the end of that time to see what we found (occasionally Miyamoto would visit!). Usually we discussed the results and the high level direction for the next three months. Occasionally, if we couldn't "find the fun", the game would be canceled and an entirely new game would be started. If the game lived long enough Nintendo would tell us to finish it in six months! Unfortunately we weren't very iterative in our development approach at Angel, so it wasn't an ideal fit.