Tuesday, March 03, 2020

Managing Risk on Video Games

Back in the early nineties, I had the privilege of working at a game studio occasionally visited by Shigeru Miyamoto. Miyamoto arrived every few months to play the game we were developing for Nintendo. He didn't care about a schedule or budget. He only wished to know if we had "found the fun" yet.

Finding the fun is one of the most significant areas of uncertainty in a new game's development. Shipping a game that isn't fun is always a considerable risk.

Risk is the impact on your plan caused by uncertainty. Uncertainty is also an essential part of game development. Think of all the great games you've ever played. Did many of them do something that you'd never seen before? For every successful game that does something new, there are many more that embraced uncertainty and failed. This was key to Miyamoto's approach: find the fun or fail fast.

Many other stakeholders attempt to avoid risk by coming up with ways to minimize uncertainty.  Often this is done through:
  • Detailed design documents that try to answer every design question upfront in an attempt to reduce scope uncertainty.
  • Comprehensive schedules that identify work to be done, in an effort to minimize schedule and cost uncertainty.
Embracing Risk

A detailed set of practices for embracing risk is too long to describe here, but is detailed in the next edition of "Agile Game Development."

Here is an overall view of how risk can be managed:

The steps are:
  • Identify risk. There are a number of activities that can identify a broad set of risks based on what you know is uncertain and what might surprise you.
  • Classify areas of uncertainty into those we can plan for and those we can't. Knowing which is which helps us figure out how to handle them. For example, you can plan to port to a new platform, but you can't really know whether the game will be fun until it is.
  • Prioritize. Risks have different levels of likelihood and impact on our game.  We need to sort those out and prioritize which we deal with first.
  • Find a root cause. Most risks have root causes, like purchasing that piece of middleware that was "supposed to be ported" by a specific date. By identifying root causes, we can come up with better ways of avoiding the impact of that risk, if it comes true.
  • Identify a trigger condition. What is the earliest we can know whether a risk has materialized? In the case of the undercooked middleware, we might have a trigger that says, "it fails to do so-and-so on our target by this date." Triggers should be testable and binary.
  • Create a mitigation strategy. What are you going to do if the risk triggers?  Are you going to buy the source code license to the middleware and port it yourself? Having a plan in place helps sell this approach and resolve risks.
  • Evaluate your triggers regularly. Make this a part of backlog refinement. If a risk is triggered, it'll probably change your next sprint, and the backlog refinement is where that is best handled.
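
The steps above can be sketched as a small risk register. This is a minimal illustration, not the practice described in the book: the `Risk` fields, the likelihood-times-impact score, and the example entries and dates are all assumptions made for the sake of the sketch.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Risk:
    """One entry in a hypothetical risk register."""
    name: str
    likelihood: float   # assumed scale: 0.0 (never) to 1.0 (certain)
    impact: int         # assumed scale: 1 (minor) to 5 (ships late or not at all)
    root_cause: str
    trigger: str        # a testable, yes-or-no condition
    trigger_date: date  # earliest date the trigger can be evaluated
    mitigation: str
    triggered: bool = False

    @property
    def exposure(self) -> float:
        # Assumed scoring rule: likelihood multiplied by impact.
        return self.likelihood * self.impact


def prioritize(risks):
    """Return risks sorted from highest to lowest exposure."""
    return sorted(risks, key=lambda r: r.exposure, reverse=True)


def due_for_review(risks, today):
    """Return untriggered risks whose trigger can be evaluated by today."""
    return [r for r in risks if not r.triggered and r.trigger_date <= today]


# Made-up example entries.
risks = [
    Risk(
        name="Middleware port slips",
        likelihood=0.5,
        impact=4,
        root_cause="Vendor promised a port by a specific date",
        trigger="It fails to do so-and-so on our target by this date",
        trigger_date=date(2020, 4, 1),
        mitigation="Buy the source license and port it ourselves",
    ),
    Risk(
        name="Core loop is not fun",
        likelihood=0.3,
        impact=5,
        root_cause="New mechanic nobody has played yet",
        trigger="Playtesters do not ask for a second session",
        trigger_date=date(2020, 6, 1),
        mitigation="Fall back to the proven mechanic",
    ),
]

for risk in prioritize(risks):
    print(f"{risk.exposure:.1f}  {risk.name}: {risk.mitigation}")
```

Calling `due_for_review(risks, today)` during backlog refinement gives the short list of triggers to check that day.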
Make Stakeholders Your Partners in Risk Management
One of the best tips I received from a former boss of mine was, "Don't come to me with problems. Come to me with solutions."

Sometimes the hardest thing to do is admit to your boss or publisher that you "don't know." Going to them with a list of risks is even harder. That's why the mitigation strategy described above is valuable. It attaches a solution to every prioritized risk. It might take a development cycle to prove the value, but I've found that with most well-developed risk management lists, at least 20% of the risks come true. Because those risks are triggered with enough time to solve them, the value becomes apparent.

A publisher once accused my risk management list of being a CYA ("Cover Your Ass") document. I agreed with him that it was partly that, but added that it covered his as well, since everyone has a boss that they answer to.

Where to Start
  • Brainstorm risk. A favorite practice of mine is called a "PreMortem" (see the GearUp book). Gather all potential risks and prioritize them through the mapping practice described above.
  • Come up with triggers and mitigation plans for the higher-priority risks.
  • Set aside a regular time to identify new threats, evaluate the triggers, and retire any risks that have been mitigated or have otherwise been resolved.
How to Master It
  • Document and share the risk mitigation plan with your stakeholders.  Involve them in the regular evaluations.
  • Reduce your existing planning practices to move away from "documenting away uncertainty." This is a "cultural security blanket" that may take a few development cycles to wean stakeholders off of.
Learn More
  • The second edition of "Agile Game Development". As mentioned, there will be a lot more about this approach in the next edition of the book, coming out in summer.
  • Gear Up, 2nd edition. Over 100 practices for team, game, and development improvements you can immediately implement. On Amazon and LeanPub.
  • Me. I teach courses on improving game development, including integrating debt management into your existing process. Visit and contact me.

Tuesday, February 18, 2020

Ending Video Game Death Marches - #1 Managing Debt

We’ve all experienced it: A sink stacked high with dirty dishes. You rarely have time or incentive to tackle them. Usually, you’re in a rush to do it before a visitor arrives or when you run out of clean dishes for your next meal.

Eventually, most of us learn that washing the dishes once a day (or at least throwing them in the dishwasher) is a better approach. It takes a little discipline to get into the habit, similar to flossing your teeth every day, but it’s for the greater good.

The same principle applies to game development. We often let the crud in our games pile up. We call this crud debt. We often push off paying that debt until it’s an emergency, and that usually leads to crunch, and lots of crunch sucks.

Why Debt contributes to Crunch (getting stuck between a rock and a hard place)
Debt is unfinished work, whether it’s bugs that need fixing, stand-in art, or untuned mechanics. It’s called debt because, like financial debt, it has an interest rate whose payback grows over time. Debt piles up —usually inside some tracking software—where it stays, growing until some point in development, commonly called alpha (the “rock”), when teams dive in to address it. The problem is that alpha is close to a ship/deployment date that is fixed (the “hard place”). Teams quickly discover that there is not enough time to address all that debt, and management decides it’s time to crunch.
It Often Gets Worse 
Ironically, managers usually react to crunch, low quality, and missed deadlines by demanding more detailed task planning. This tends to squeeze out the slack that should be used to address emergent debt. That slack is essential; when was the last time you estimated the amount of work it took to fix a bug that hadn’t been found?

Setting Aside Time to Manage Debt Works, but it’s Not as Easy as You Might Think
It’s easy to see that if you have less debt, there’s less reason to crunch because of it. So what’s so hard about managing debt?
There are two significant reasons that we fail at managing debt. First, it seemingly slows down development. Practices like unit testing, creating automated test tools, and making sure new features introduce minimal debt take extra time upfront, but in the long run, they save a lot of time. However, it’s often hard to convince a stakeholder that doing the extra work now is a lot less expensive than doing it months from now.
Second, we developers can be lazy. Yes, I said it. Managing debt, such as optimizing polygon usage, replacing stand-in audio, or refactoring code that has gotten “crufty” is a lot less fun than tossing in something new. Like washing your dishes daily, it’s less fun than making them dirty, but it’s a discipline that needs to be built up to avoid the piles in the sink.

Where to Start
Debt is best managed by setting aside time to eliminate levels of it. You can start by establishing an agreed-upon time that’s set aside to address debt. For example:
  • Every day, eliminate any problems that cause the game to crash or otherwise be unplayable.
  • Every Sprint, make sure the game is demo-able to internal stakeholders. This could mean that the frame rate is stable and fast, and players can have fun.
  • Every Release or milestone, make the game worthy of showing the outside world. It could be missing key features, like a marketing demo, but what is there is polished.
How to Master It
  • Build test automation. The Gear Up book describes several useful ways automation can help.
  • Metrics. There’s a lot of debate about whether tools that measure code quality, like unit test coverage, help. I feel they do. If you can measure something useful, you can improve it, but beware of the metric becoming the goal.
  • Educate. Try pair programming. For example, hold a “Wednesday Pizza Talk” (Gear Up) to educate developers about practices to reduce debt.
Learn More

Monday, February 10, 2020

Six Signs your Game is in Trouble

We all want to make great games and not suffer making them, but sometimes it doesn't work out that way. Below is a list of 6 typical signs that the game you're working on is in trouble.

1. Your bug database is growing out of control
I'm not a fan of bug databases to begin with. They are often rugs to sweep dirt under, and that dirt gets more expensive to clean over time. All that debt has to be paid off, and it's often paid off with crunch and compromise. We should be spending time at the end fine-tuning the experience of the game, not making it barely shippable.

2. You Don't See the Big Picture
Often, especially on large games, individual developers don't understand the game they are working on. It's all in the head of some lead designer, who might even be in a different city. Without a shared vision, moving your game forward becomes like pulling a wagon with a hundred harnessed cats. Chaos ensues.

3. Gantt Charts
Detailed Gantt charts often just serve as project management theater. A complex, graphical chart can often placate a publisher. Not that it's terrible to think about the complexities of work and dependencies, but these artifacts, adopted from manufacturing, aren't well suited for creative work. For one, they don't lend themselves to change very well. The one thing I always look out for is a Gantt chart that magically slopes downward off in the future, when some sudden burst of productivity is forecast, which brings us to the next sign.

4. Wishful Thinking
I'm an optimist, but it usually takes the form of "yes, I'm positive something will go wrong!" Projects should embrace risk. It's more important than task management. Problems do not solve themselves and that reassuring management reserve set aside for problems will probably be gone by the time it's needed. If the game is not fun and on track now, it's unlikely it will suddenly become so "someday."

5. Building Final Content Based on Unproven Technology
This is probably the most expensive one. Technical promises and their schedules are often not worth the electrons used to store them in the project management tool. Even console manufacturers are guilty of this (who remembers the "Emotion Engine"?). If you are creating final, shippable content using budgets (graphics, physics, AI, etc.) that are beyond what your current engine can do, it's a good sign you'll be redoing all that work, in crunch, again.

6. Management "Tells"
As with poker, there are some tells that managers show, which are signs that there is trouble.

  • Mandated crunch. Scheduled crunch often means more is coming. It's panic time.
  • Moving people between projects to speed things up. This always slows things down. You know why.
  • Sudden micromanagement. I'm all for management being engaged daily with developers and communicating more, but when it instantly ramps up, they're worried.

This might sound a bit negative, but game teams get into trouble all the time, and this list could be much more extensive. Identifying the signs early on is the first step in solving them.

Well-proven solutions to these problems exist, but they are not easy. They involve changing approaches to how we think of game development and stakeholders. We need to focus on the game first and the project second. That requires courage from leadership and developers.

Friday, January 24, 2020

Solving Large Team Dependencies

Simulated Annealing for a Travelling Salesman

We've all seen it. The larger a game team, the more dependencies between developers and teams emerge to slow development down to a crawl.

The problem of dependencies is a complex one. Problems of this kind are called "NP-complete" problems, usually solved only by time-consuming brute-force approaches. So forget about an easily managed solution.

The best approach has its analogy in computer science, called "simulated annealing," a technique where you start with an approximate solution, add a bit of change from time to time, and see if it improves the solution. The GIF above shows simulated annealing as applied to the classic Travelling Salesman Problem. Instead of thinking of cities (groups of dots) and paths between them (lines), think of developers (dots), teams (groups), and dependencies (lines). Over time, as teams and individuals within teams experiment with ways to reduce dependencies, you see those inter-team dependencies reduce.

For large teams, those changes are often to team makeup and the formation of the Product Backlog. The goal isn't to eliminate inter-team dependencies, but to move as many as you can within the individual cross-functional teams (Scrum-sized, 5-9 developers). Within those teams, they build accountability and better practices to address debt and reduce the cost of dependencies.

As you experiment with more self-contained teams, organize the Product Backlog to reflect those teams and to depend on fewer "future integrations," and build better development practices, dependencies will slowly diminish over time.
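
To make the annealing analogy concrete, here is a toy sketch in Python. It is only an illustration of the technique, not a tool for reorganizing a studio: the developer names, the dependency list, the cost function (a simple count of dependencies that cross team lines), and the cooling schedule are all assumptions. Each step swaps two developers between teams, keeps the swap if it reduces cross-team dependencies, and occasionally keeps a worse swap early on, so the search doesn't get stuck.

```python
import math
import random


def cross_team_dependencies(team_of, dependencies):
    """Count dependencies whose two developers sit on different teams."""
    return sum(1 for a, b in dependencies if team_of[a] != team_of[b])


def anneal_teams(team_of, dependencies, steps=5000, start_temp=2.0, seed=0):
    """Swap developers between teams to reduce cross-team dependencies.

    Swapping (rather than moving) keeps every team the same size.
    """
    rng = random.Random(seed)
    current = dict(team_of)
    cost = cross_team_dependencies(current, dependencies)
    best, best_cost = dict(current), cost
    devs = list(current)

    for step in range(steps):
        # Assumed linear cooling; the tiny offset avoids division by zero.
        temp = start_temp * (1 - step / steps) + 1e-9
        a, b = rng.sample(devs, 2)
        if current[a] == current[b]:
            continue  # same team, swapping changes nothing
        current[a], current[b] = current[b], current[a]
        new_cost = cross_team_dependencies(current, dependencies)
        delta = new_cost - cost
        # Always accept improvements; accept a worse swap with a
        # probability that shrinks as the temperature drops.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            cost = new_cost
            if cost < best_cost:
                best, best_cost = dict(current), cost
        else:
            current[a], current[b] = current[b], current[a]  # undo the swap

    return best, best_cost


# Made-up starting point: two teams of three with tangled dependencies.
start_teams = {"ana": "A", "ben": "A", "cy": "A",
               "dee": "B", "eli": "B", "fay": "B"}
deps = [("ana", "dee"), ("ana", "eli"), ("dee", "eli"),
        ("ben", "cy"), ("ben", "fay"), ("cy", "fay")]

best, best_cost = anneal_teams(start_teams, deps)
print("before:", cross_team_dependencies(start_teams, deps))
print("after: ", best_cost, best)
```

On a real project, the "swaps" are experiments that teams choose for themselves in retrospectives, and the count of cross-team dependencies is the kind of measure worth tracking from release to release.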

Practices to Try

  • Instead of creating a detailed release plan every few months, just define the major epics and let teams reform around the epics they'll take on and refine the release plan themselves.
  • Identify dependent specialties in the release plan between each team instead of tasking out dependencies. Often this indicates where you don't have enough specialists or where there is an opportunity to cross-train someone and spread some skill.
  • Talk about them in retrospectives. Encourage the team to come up with solutions.
  • Measure inter-team dependencies. If you don't measure it, it's harder to improve it.
  • Find ways to visualize dependencies. Program boards that use strings to identify inter-team dependencies are useful. A board that looks like the one below should horrify anyone.

This is not a good "after" shot IMO

Dependencies on large games are a huge anchor that slows development down in a very opaque way. Your focus should be on those and less on tracking individual effort.