TDD project summaries (KIXEYE & EA)

I recently spent two years at KIXEYE, building a test engineering group and integrating automated testing into every game team. Significant improvements in development time, scalability and operational cost/quality were directly attributed to the load-testing work I led, driven by metrics for the engineering, QA, producer and executive teams.
After the initial success wave of KIXEYE, operational costs skyrocketed and server failures during peak revenue hours became commonplace; the infrastructure and development processes simply weren’t scaling. The KIXEYE executive team issued a priority one mandate to implement load testing tools and test-driven processes for all titles.
To staff the testing team, we set up intern programs with several schools and hired the best interns out of each batch into our QE group, where we trained them in load testing, application metrics and how to work with each game team. We then embedded these QE engineers into each game team, working with the game teams to establish priorities and continuing to run daily scrums for mentoring purposes and fine-tuning direction.

This approach overcame each game team’s strong initial resistance to load testing: engineering decisions and launch decisions became largely driven by results from the automated tests and embedded performance metrics probes. We then grew the work into performance testing for mobile and web clients.
A quite interesting project for me was Robbie; a friendly front-end to the Maxis automated testing systems. After the success of automated testing in The Sims Online, building a second-generation system, integrated into all teams, became a top studio priority. We first tackled build stability, performance and finally content regression for the studio’s next major release: The Sims 2.0. By embedding myself into the Sims 2.0 team, we found tremendous opportunities for turning test and measure tools into task accelerators for the content team, and by aggregating performance statistics into an executive level dashboard we were able to help focus engineering resources and provide clear cross-team communication. For example, the studio head had been able to start a non-confrontational, very directed conversation on performance, as shown in this simple email: “I see on the dashboard that your frame rate is not where it should be at this point in your project. How are you planning to address this risk in the remaining time before launch?”

Advertisements

“I feel the need, the need for speed!” the value proposition for rapid II cycles and metrics-driven development

Introduction:
The key to success in building any game is how quickly a team can see a game in action, quickly evaluate the state of the players and the state of the game, and quickly react to the new information.
The key to success in online games is to quickly repeat the above evolutionary cycle, over and over again, without significantly damaging what attracted and retained your existing customers, or significantly damaging your platform and brand, while adding material to attract and retain new customers and finding new ways to make money, all while players continuously do the unexpected and vultures show up to pick at your established customer base. All of these constraints are difficult individual problems. Combined, they present an incredibly difficult problem.
The barrier to success is achieving the required development speed, visibility and flexibility in the multiple-dimension complexity of online games. New, continually shifting problems require more than just new solutions, they need a new way to solve them, over and over again. And with the development speed, scale, complexity and longevity of online games, your business needs game production tools, metrics and agility more than ever before.
Introduction, 1.1: A Darwinian shift in what games are requires a similar shift in how we build them.
Online games are the most intriguing shift in media since we first learned to scratch on cave walls. A product oriented industry is migrating to service business models that rent persistent, continually evolving virtual worlds, populated by studios and customers alike. You’re trying to attract and retain long-term customers, not point-of-sale customers. A service-oriented business has different cost and revenue models than packaged goods. Further, players require the flexibility to explore their own paths, build relationships and expand the shared virtual world. The more bonds players have to your game and to your quality of service, the longer you retain customers and the greater your revenue and the better your chances of expanding your customer base. These shifts in the relationship between the players of games and the producers of games change how we think, what our business goals and priorities are, and how we build software.
Online games are a combination of long-term experiments that seek and field innovative approaches to solve complex, continually changing problems:
• expanding the virtual world and the supporting infrastructure;
• exploring new types of fun;
• improving how you attract and retain players;
• improving your revenue models and lowering your recurring expenses.

That’s a lot of stuff to get right, quickly, over and over again, especially with the high quality of service required to attract and retain long-term customers.
• Your software and your processes must be designed for a continual, incremental, cost-effective growth pattern, with rapid prototyping to continually shape the growth direction and continual pushes of new content to paying customers.
• The heart of an iterative development process is the pace of iterations and your ability to see actionable results of iterations.
• Adding metrics to your incremental development cycle provides guidance in your experiments, visibility into your sources/sinks of money, visibility/predictability into your schedules and visibility into the speed, agility and stability of your platform and your processes.
Metrics are not a silver bullet in and of themselves [reference: Frederick Brooks, “No Silver Bullet”]. To meet the demanding requirements of online games, you need an integrated set of development approaches that, in aggregate, lower the cost and risk of fielding an online game service.
• Metrics point out anomalies for you to investigate or find high level, aggregated patterns of data too complex to absorb individually. Automating this triage / aggregation phase is essential to a rapid experimentation/reaction model: finding the problem area is the slowest part of debugging large, complex systems, such as gameplay, player behavior or nondeterministic servers. You need speed of analysis.
• If you can’t quickly react to what you’ve observed, you’ll be more frustrated than successful. You need speed of execution.
• To achieve these levels of agility and speed, a backlog and burn down charts just won’t do: you need your software, your tools and your processes to have ease of observation and ease of change built-in at the genetic level. You need more than process agility, you need system agility.
• Breaking things slows you down and complex, continually evolving software is easy to break. Your experimentation capacity is effectively zero when your software doesn’t work. You’re also forced to squander resources and time on something that worked just fine yesterday and you risk burning your customers with unreliable service. You need production and operational stability.
• Incorporating your long-term business success metrics into your day-to-day decision-making keeps you on the right track, and helps decide which answer to a complex question pushes the ball forward: a completely reaction driven process tends to thrash, not advance. Similarly, critical but less visible infrastructure features fall by the wayside until they are very visibly in the way of forward progress. You need simultaneous focus on both long-term goals and short-term milestones.
• Finally, continual process improvement is a hard people problem. Once the basic infrastructure and processes are in place, people tend to focus on other, more visible problems, such as gameplay. Unless infrastructure is measured and tracked against at least at least at least timeshifting up a short-term needs and long-term business goals, the quality, speed and stability of your infrastructure tends to fall below the radar, until it is visibly and painfully in the way of production and/or customers. In fact, QOS and speed of content refresh are critical game features in your online service business; they need the same level of attention and resources as any other feature, if not more so. And even then, human nature is unwilling to change how work is done in the middle of a project. You must be able to measure reality and adapt your software, your development processes over and over again. You need facts and willpower.

Purple’s Three Laws of Analytics
1. What gets measured, gets done. Getting the right thing done is an entirely separate problem.
2. Individual metrics aren’t worth spit. Context is everything.
3. Change is tough. Without willpower and a continual, evolving path that is easier to use than not to use, you won’t change.

Iterative Innovation, Part Two: the Value Proposition for a comprehensive metrics program

Is a concerted metrics program necessary to achieve a rapid II cycle? Or will some spot checks, when needed, suffice? Why is measuring anything more than basic player behavior important?

Raise your hand if you’ve been haunted by some variant of this question: “Is all of this metrics stuff really necessary for success, or is this a post-launch luxury that we can do without, at least for now? Other online games have been built without it, we have a strong team, we already know what our users want, and our publisher is pushing heavily on game features, not tools. What’s the difference?”

People react to how their success is measured. To generate the correct behavior, you need to measure the right things. If everybody keeps chanting “game features, game features, game features”, then game features take priority even when project schedule, cycle rates, engineering efficiency, system stability and infrastructure all suffer. Without automated, accurate metrics, it is difficult to evaluate such priorities. All measures become “probably” based, and the tendency is to assume the optimistic side of probably, especially when you’re running late, as everyone always is. This leads to sacrificing less visible, less quantifiable problems, such as cost of frequent build failures, to show more progress on game features: the primary factor on how success is measured.

This in turn results in slow, inefficient development until the system breaks down so badly that cycle times and QOS become barriers to forward progress on anything. Slap some bandages on the worst of the spurting arteries, then straight back to game features, because now you’re out of time, too, Right?

All features, game play and game service, need to be evaluated by their impact on CA/CS/CR/LVT, and on the II cycle rate. See also: [The Probably Problem and Case Study (QA test suites)]. People tend to measure what is easy to measure, not necessarily what will generate the correct behavior. Gameplay features are easily seen, easily compared to the last version, and are definitely required for success. But a successful online game also requires many service-based features, such as QOS, quick response times, and cost of service per customer. When fuzzy definitions of infrastructure requirements such as “good enough service” or “good enough cycle speed/stability” exist, or they are not easily observed, they will continue to receive less attention than game features until something breaks.

To make infrastructure as important as game features, make infrastructure features as visible, fresh and trackable as gameplay features, and tie those infrastructure features to your core business success metrics: CA, CS, CR, LTV. Game teams live in the here and now: if you can’t visibly see the impact of work, right now, and how the project has advanced, it is going to get less priority than game features. Similarly, you need to tie specific quality levels to project milestones, and expose your true overheads, particularly dead build rate, go-back costs and generally things that get in the way of people trying to get their jobs done each day. And of course online games are a recurring service business, not build once, ship one once, sell once and move on. Online means customer retention, speed of new content development, operating costs etc become a big part of the decision cycle: things with a high recurring cost for a years/long service quickly show their value just from a $bottom-line perspective. But when you fold in the added creative edge on polish and innovation, that’s where TDD really pays off for the engineering, production and operations teams. To change the behavior, you have to expose the importance and progress on all components you are working on, right now, to the same level of “I can see that it’s important” and “I can see we’ve made progress this day/week” and “I can see that we’ve moved forward against what everyone has agreed is important.” Further, you have to have the strength of will to continually push for improvement on this, and you need the software/environment to expose the change/growth you need to fill the void of “thank Ghod, we’ve made progress today”.

When asked “do we really need a lot of metrics?”, start by reframing the argument. The correct question is “how does lack of metrics hurt our chances of success in a risky business market?”

• Historically and culturally, game features have been considered much more important than tools like metrics, or even a stable development environment.

• To succeed in an online game service, you need to change the mindset of your team. Metrics, development speed and stability are not luxuries, or something that can be deferred until “we’ve caught up on the important stuff”, where “important” is ultimately defined by how you’re measured and rewarded by money, social status and peer approval, but you can never catch up on the important stuff enough so you can do something not easily visible right now: it’s Catch-22. A stable, fast development cycle and the ability to see inside the myriad black boxes you’re creating are not luxuries: they are required, mission-critical features when building and running an online game service!

• Your team needs metrics because you’re in a service business, and need to understand your customers in order to attract more customers, retain them longer, increase revenue and decrease recurring costs.

• Additionally, your team needs fast visibility into experiments, schedule progress, and where the problems are in code, gameplay, production and monetization models. You also need a fast reaction time both for experimentation and live operational problems.

• Great game play features are a necessary but not sufficient condition for success in a continually evolving online game service. In an online game, stability, speed at runtime, speed of new content production and low recurring costs are also necessary conditions for success: they have a tremendous impact on the player experience and your ability to grow your business.

o A rapid IIC is also a necessary but not sufficient condition for success. Without a structured, evolutionary process to effectively utilize speed, your wheels may be spinning very quickly, but your game might not be advancing at all.

• You can’t achieve all of these customer and business requirements without a strong development environment. That means that scaffolding, metrics, automation, testing tools, content tools and agile architectures/processes are all mission-critical features of your online gaming service. Without them, you might survive development and launch, but don’t bet your lunch money on it: there are more unsuccessful online games than successful online games. What you can count on is slow, inefficient growth of your game once you’ve launched, and continual, chaotic firefighting to stay running at all, let alone turning a tidy profit.

• Instead, if you prioritize cycle speed, visibility and stability from day one, you’ll get a faster, more effective development cycle, a stronger customer experience and lower recurring costs. Bet your lunch money on this instead.

• A Measure/Change/Measure approach is far more efficient, agile, stable and predictable than the traditional Guess/Change/Hope process. And by having a fast cycle time, and staged comb filters in your decision cycle, you don’t have to worry if MCM is slower than GCH. Finally, if push comes to shove, the exact same infrastructure you need for MCM also allows GCH!

• An fast, incremental evolution strategy, driven by metrics, lets you start small and then iterate yourself a hit game.

• Also required for success is a low cost, high turnaround model for new content and affordable, scalable recurring costs.

• MCM is essentially the classic scientific method, sped up for online games. If it was good enough for Isaac Newton and good enough to kick-start the Age of Reason, it’s a good enough place to start with, as we improve how we build, optimize and expand games, it just needs to be faster and easier. Automation and rough, back of the envelope math are the keys here; you’re not looking for the final answer, you just want to know where your people should focus their valuable time.

• Describe what improvements to tools/process/architecture you need to do, and what will be the impact? There’s no need to get excessively accurate or precise in this analysis: that takes a lot of time and attaches the focus of people on the details, not the overall effect, and you’re not likely to have very accurate metrics of a chaotic process in the first place. For the first pass of optimizing gameplay, game service and production processes, high/middle/low buckets are all you need. o Case study: Jade linker

• Production / Platform metrics automatically show you which of your recurring costs are not affordable and what is the cost / speed for generating new content, while Player metrics show you the direction for new content.

• The customer experience is just as important, if not more so, as game features: the customer experience has a very strong impact on your primary business metrics, CA, CS, CR and LTV. [Insert WOW quote] o Deferring the tools that directly impact the customer experience is a dangerous decision, given the impact on your primary business metrics. o Further, the same tools that control the customer experience also control the developer experience. The same iteration speed and stability you need for a strong customer experience also act as force multipliers for your development team Then proselytize the message, repeatedly, across the entire team: the faster you can iterate and the more visibility you have, the more you can innovate, polish, and grow.

• The more visibility you have into your production processes, development environment and software architecture, the more you can accelerate your Iterative Innovation cycle (II cycle).

• The earlier you can measure your production processes, implementation variants and player’s reactions to your changes, the more likely you are to find a successful path and avoid schedule-busting dead ends.

• Production metrics in particular are critical for a rapid II cycle. Identifying and removing bottlenecks in your production processes not only speeds up your cycle time, your production processes increase in stability, efficiency and predictability.

• Metrics help you quickly triage problems, which is often the slowest part of problem-solving; metrics help you evaluate your changes on the fly and metrics help you decide when your current task is done.

• Changing how a team works is hard. If you can’t clearly show cause and effect, you’ve got an uphill battle. When you can quantify what changes brings what benefits, it’s no longer a battle at all. Increasing speed and flexibility without hurting stability and predictability has market value, making it much easier to change how people work.

• My advice: get some early adopters going, on things that are demonstrably better with metrics than without, because changing how people work is tough, even when it’s easier… Your mantra: a poor iteration rate, poor visibility and poor stability slap handcuffs on your entire team, increases your cost and schedule risks and trashes your reaction time when things go awry. Don’t do that.

You should have a minimal metrics system up and running by the time the first client connects to the first server, and/or when you first start prototyping game play. You should be using metrics as part of your task description and task completion steps from this stage of development forward. It is much easier to establish metrics-driven development early than it is to establish MMD late, and it sets the tone for the team from day one: metrics are not something that can be pushed off to launch, metrics are how your company does business.

Metrical Metaphors (1 of N)

You can live without metrics in the same way that you can live without bottled water, in a land where Montezuma still seeks his revenge.

It will be cheaper than buying bottled water, at first. It’ll also be very distracting. It’ll also be painful, and you’d be working in a squalid, smelly mess each and every day. That’s the good news.

The bad news is that you’ll not get much effective work done each day as you run back and forth, eternally trying to avoid a Chernobyl event in an essentially chaotic system. Your schedule will become longer and unpredictable. Your recurring costs for cleaning and maintenance will skyrocket; your friends will make fun of you.

Sure, you’ll live. And sure, you saved short-term money by not buying bottled water, that ‘unnecessary frill.’ You’re already used to unpredictable schedules and your friends already make fun of you, so who cares?

But do you want to increase time, money and risks by guessing blindly at what needs doing next, all the while working under the distraction of smelly pants and suspiciously stained chairs?

Give your team a break. Metrics can be used to focus your efforts, quickly find problems, help clean up the production processes and add predictability.

You’ll get more done in less time.