Iterative Innovation, Part Two: the Value Proposition for a comprehensive metrics program

Is a concerted metrics program necessary to achieve a rapid II cycle? Or will some spot checks, when needed, suffice? Why is measuring anything more than basic player behavior important?

Raise your hand if you’ve been haunted by some variant of this question: “Is all of this metrics stuff really necessary for success, or is this a post-launch luxury that we can do without, at least for now? Other online games have been built without it, we have a strong team, we already know what our users want, and our publisher is pushing heavily on game features, not tools. What’s the difference?”

People react to how their success is measured. To generate the correct behavior, you need to measure the right things. If everybody keeps chanting “game features, game features, game features”, then game features take priority even when project schedule, cycle rates, engineering efficiency, system stability and infrastructure all suffer. Without automated, accurate metrics, it is difficult to evaluate such priorities. All measures become “probably” based, and the tendency is to assume the optimistic side of probably, especially when you’re running late, as everyone always is. This leads to sacrificing less visible, less quantifiable problems, such as the cost of frequent build failures, to show more progress on game features: the primary factor in how success is measured.

This in turn results in slow, inefficient development until the system breaks down so badly that cycle times and QOS become barriers to forward progress on anything. Slap some bandages on the worst of the spurting arteries, then straight back to game features, because now you’re out of time, too, right?

All features, game play and game service, need to be evaluated by their impact on CA/CS/CR/LTV, and on the II cycle rate. See also: [The Probably Problem and Case Study (QA test suites)]. People tend to measure what is easy to measure, not necessarily what will generate the correct behavior. Gameplay features are easily seen, easily compared to the last version, and are definitely required for success. But a successful online game also requires many service-based features, such as QOS, quick response times, and a low cost of service per customer. When infrastructure requirements are defined only in fuzzy terms such as “good enough service” or “good enough cycle speed/stability”, or are not easily observed, they will continue to receive less attention than game features until something breaks.

To make infrastructure as important as game features, make infrastructure features as visible, fresh and trackable as gameplay features, and tie those infrastructure features to your core business success metrics: CA, CS, CR, LTV. Game teams live in the here and now: if you can’t see the impact of a piece of work right now, and how it has advanced the project, it is going to get less priority than game features. Similarly, you need to tie specific quality levels to project milestones and expose your true overheads, particularly dead build rate, go-back costs and, more generally, the things that get in the way of people trying to get their jobs done each day.

And of course online games are a recurring service business, not build once, ship once, sell once and move on. Online means that customer retention, speed of new content development, operating costs and so on become a big part of the decision cycle: anything with a high recurring cost over a years-long service quickly shows its value from a bottom-line perspective alone. But when you fold in the added creative edge on polish and innovation, that’s where TDD really pays off for the engineering, production and operations teams.

To change the behavior, you have to expose the importance of, and progress on, all the components you are working on, right now, to the same level of “I can see that it’s important”, “I can see we’ve made progress this day/week” and “I can see that we’ve moved forward against what everyone has agreed is important.” Further, you have to have the strength of will to keep pushing for improvement on this, and you need the software and environment to expose that change and growth, so that infrastructure work can also deliver the daily “thank Ghod, we’ve made progress today”.
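One way to make an overhead such as dead build rate as visible as a gameplay feature is to compute it automatically from build records. The following is a minimal sketch, not a prescribed tool: the `Build` record, its field names and the sample numbers are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Build:
    """One build attempt (hypothetical record; field names are assumptions)."""
    build_id: str
    succeeded: bool
    people_blocked: int  # people who could not work while the build was dead
    hours_dead: float    # hours until a good build was available again


def dead_build_rate(builds):
    """Fraction of builds that failed; 0.0 for an empty history."""
    if not builds:
        return 0.0
    dead = sum(1 for b in builds if not b.succeeded)
    return dead / len(builds)


def person_hours_lost(builds):
    """Rough cost of dead builds, in person-hours."""
    return sum(b.people_blocked * b.hours_dead
               for b in builds if not b.succeeded)


# Illustrative data only.
week = [
    Build("b101", True, 0, 0.0),
    Build("b102", False, 20, 1.5),
    Build("b103", True, 0, 0.0),
    Build("b104", False, 12, 0.5),
]

print(f"dead build rate: {dead_build_rate(week):.0%}")
print(f"person-hours lost: {person_hours_lost(week):.1f}")
```

Posting numbers like these next to the feature list each week is one plausible way to give infrastructure the same "I can see we’ve made progress" visibility that gameplay features get for free.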

When asked “do we really need a lot of metrics?”, start by reframing the argument. The correct question is “how does lack of metrics hurt our chances of success in a risky business market?”

• Historically and culturally, game features have been considered much more important than tools like metrics, or even a stable development environment.

• To succeed in an online game service, you need to change the mindset of your team. Metrics, development speed and stability are not luxuries, or something that can be deferred until “we’ve caught up on the important stuff”. “Important” is ultimately defined by how you’re measured and rewarded, in money, social status and peer approval, and you can never catch up on the important stuff enough to free time for something that is not easily visible right now: it’s a Catch-22. A stable, fast development cycle and the ability to see inside the myriad black boxes you’re creating are not luxuries: they are required, mission-critical features when building and running an online game service!

• Your team needs metrics because you’re in a service business, and need to understand your customers in order to attract more customers, retain them longer, increase revenue and decrease recurring costs.

• Additionally, your team needs fast visibility into experiments, schedule progress, and where the problems are in code, gameplay, production and monetization models. You also need a fast reaction time both for experimentation and live operational problems.

• Great game play features are a necessary but not sufficient condition for success in a continually evolving online game service. In an online game, stability, speed at runtime, speed of new content production and low recurring costs are also necessary conditions for success: they have a tremendous impact on the player experience and your ability to grow your business.

o A rapid II cycle is also a necessary but not sufficient condition for success. Without a structured, evolutionary process to effectively utilize speed, your wheels may be spinning very quickly, but your game might not be advancing at all.

• You can’t achieve all of these customer and business requirements without a strong development environment. That means that scaffolding, metrics, automation, testing tools, content tools and agile architectures/processes are all mission-critical features of your online gaming service. Without them, you might survive development and launch, but don’t bet your lunch money on it: there are more unsuccessful online games than successful online games. What you can count on is slow, inefficient growth of your game once you’ve launched, and continual, chaotic firefighting to stay running at all, let alone to turn a tidy profit.

• Instead, if you prioritize cycle speed, visibility and stability from day one, you’ll get a faster, more effective development cycle, a stronger customer experience and lower recurring costs. Bet your lunch money on this instead.

• A Measure/Change/Measure approach is far more efficient, agile, stable and predictable than the traditional Guess/Change/Hope process. And by having a fast cycle time, and staged comb filters in your decision cycle, you don’t have to worry if MCM is slower than GCH. Finally, if push comes to shove, the exact same infrastructure you need for MCM also allows GCH!

• A fast, incremental evolution strategy, driven by metrics, lets you start small and then iterate your way to a hit game.

• Also required for success is a low cost, high turnaround model for new content and affordable, scalable recurring costs.

• MCM is essentially the classic scientific method, sped up for online games. If it was good enough for Isaac Newton and good enough to kick-start the Age of Reason, it’s a good enough place to start as we improve how we build, optimize and expand games. It just needs to be faster and easier. Automation and rough, back-of-the-envelope math are the keys here; you’re not looking for the final answer, you just want to know where your people should focus their valuable time.

• Describe what improvements to tools/process/architecture you need to make, and what their impact will be. There’s no need to get excessively accurate or precise in this analysis: that takes a lot of time and fixes people’s attention on the details, not the overall effect, and you’re not likely to have very accurate metrics of a chaotic process in the first place. For the first pass of optimizing gameplay, game service and production processes, high/middle/low buckets are all you need.

o Case study: Jade linker

• Production / Platform metrics automatically show you which of your recurring costs are not affordable and what is the cost / speed for generating new content, while Player metrics show you the direction for new content.

• The customer experience is just as important as game features, if not more so: the customer experience has a very strong impact on your primary business metrics, CA, CS, CR and LTV. [Insert WOW quote]

o Deferring the tools that directly impact the customer experience is a dangerous decision, given the impact on your primary business metrics.

o Further, the same tools that control the customer experience also control the developer experience. The same iteration speed and stability you need for a strong customer experience also act as force multipliers for your development team.

Then proselytize the message, repeatedly, across the entire team: the faster you can iterate and the more visibility you have, the more you can innovate, polish, and grow.

• The more visibility you have into your production processes, development environment and software architecture, the more you can accelerate your Iterative Innovation cycle (II cycle).

• The earlier you can measure your production processes, implementation variants and players’ reactions to your changes, the more likely you are to find a successful path and avoid schedule-busting dead ends.

• Production metrics in particular are critical for a rapid II cycle. Identifying and removing bottlenecks in your production processes not only speeds up your cycle time but also makes those processes more stable, efficient and predictable.

• Metrics help you quickly triage problems, which is often the slowest part of problem-solving; metrics help you evaluate your changes on the fly and metrics help you decide when your current task is done.

• Changing how a team works is hard. If you can’t clearly show cause and effect, you’ve got an uphill battle. When you can quantify which changes bring which benefits, it’s no longer a battle at all. Increasing speed and flexibility without hurting stability and predictability has market value, making it much easier to change how people work.

• My advice: get some early adopters going, on things that are demonstrably better with metrics than without, because changing how people work is tough, even when it’s easier… Your mantra: a poor iteration rate, poor visibility and poor stability slap handcuffs on your entire team, increase your cost and schedule risks and trash your reaction time when things go awry. Don’t do that.
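The back-of-the-envelope triage described in the list above (rough high/middle/low buckets rather than precise figures) can be sketched in a few lines. This is an illustration only: the candidate names, the estimates and the bucket thresholds are invented for the example, and scoring by estimated team hours saved per week is an assumption rather than a method taken from this article.

```python
def bucket(hours_saved_per_week, high=40.0, low=8.0):
    """Coarse impact bucket; the thresholds are arbitrary assumptions."""
    if hours_saved_per_week >= high:
        return "high"
    if hours_saved_per_week >= low:
        return "middle"
    return "low"


def triage(candidates):
    """Sort improvement candidates into high/middle/low buckets.

    candidates maps a name to a rough estimate:
    (minutes saved per use, uses per person per week, people affected).
    """
    result = {"high": [], "middle": [], "low": []}
    for name, (minutes, uses, people) in candidates.items():
        hours = minutes * uses * people / 60.0  # team hours saved per week
        result[bucket(hours)].append(name)
    return result


# Hypothetical candidates with guessed numbers.
candidates = {
    "faster asset import": (5, 50, 30),   # 125 h/week
    "automated smoke test": (30, 5, 4),   # 10 h/week
    "nicer log colours": (1, 10, 6),      # 1 h/week
}

print(triage(candidates))
```

The estimates only need to be good enough to put a candidate in the right bucket; if two guesses land in the same bucket, refining them further is probably not worth the time at this stage.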

You should have a minimal metrics system up and running by the time the first client connects to the first server, and/or when you first start prototyping game play. You should be using metrics as part of your task description and task completion steps from this stage of development forward. It is much easier to establish metrics-driven development (MDD) early than it is to establish MDD late, and it sets the tone for the team from day one: metrics are not something that can be pushed off to launch, metrics are how your company does business.
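A minimal metrics system does not need to be elaborate. The sketch below shows one possible starting point, assuming a single process and in-memory storage: the `Metrics` class, the metric names and the `handle_login` function are all hypothetical and stand in for whatever your client and server actually do.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class Metrics:
    """Tiny in-process recorder (hypothetical; not a real library)."""

    def __init__(self):
        self.counts = defaultdict(int)
        self.timings = defaultdict(list)

    def incr(self, name, n=1):
        """Count an event."""
        self.counts[name] += n

    @contextmanager
    def timer(self, name):
        """Record how long the wrapped block takes, in seconds."""
        start = time.perf_counter()
        try:
            yield
        finally:
            self.timings[name].append(time.perf_counter() - start)

    def summary(self):
        """Flatten counters and timing statistics into one dict."""
        out = dict(self.counts)
        for name, samples in self.timings.items():
            out[f"{name}.count"] = len(samples)
            out[f"{name}.avg_s"] = sum(samples) / len(samples)
        return out


metrics = Metrics()


def handle_login(user):
    """Placeholder for real server work, instrumented from day one."""
    with metrics.timer("login"):
        metrics.incr("login.attempts")
        # Real login logic would go here.


for user in ("ann", "bob", "cy"):
    handle_login(user)

print(metrics.summary())
```

Starting this small keeps the cost of adoption low; the recorder can later be swapped for something persistent and aggregated without changing the call sites.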
