CO2 & 30 Trillion Horses (a thought experiment)

This piece is about a YouTube video that claimed to have proof that climate change is a myth. It accuses Greens of only quoting small samples of time to show climate change. It also covers the ‘500 scientists’ letter/scam to the UN. 

Let’s see if we can agree on what time period is relevant. Here is the past 1,000 years of the two most important variables: temperature & CO2. It captures the rise of the industrial revolution (the key forcing factor) and is long enough that local-minima problems don’t exist. And here is something further back, based on ice cores. And here is one plotting human population against CO2 levels. Sediment data takes us even further back.

Carbon emissions and human population, 1751–2013. Data from CDIAC; graphic by James P. Galasyn.


They clearly show long-term trends that break from the normal shifts in climate. They also show the root of the problem: exponential population growth, and the importance of the industrial revolution.

Then let’s try a thought experiment on the climate impact of the industrial revolution. There are 1.4 Billion cars in the world. Assume an average of 100 horsepower. That’s 140 Billion horse-equivalents pushing new CO2 into the atmosphere, every day. The 7.5 Billion people in the world consume over 21 Trillion kilowatt-hours each year; that’s about 27 Trillion more horse-hours of work each year, pumping CO2 from power plants. There are over 50,000 large merchant ships worldwide, the biggest at around 100,000 horsepower. Planes: 39,000 more, at thousands of horsepower per plane.
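Here is the back-of-envelope arithmetic, as a quick sketch. It uses only the rough figures quoted above (1.4 billion cars, 100 hp each, 21 trillion kWh/year, and so on) plus the standard conversion of 1 hp ≈ 0.7457 kW; none of these are precise measurements.

```python
# Back-of-envelope "horses" arithmetic from the figures quoted above.
HP_PER_KW = 1 / 0.7457                # one mechanical horsepower is ~0.7457 kW

cars_hp = 1.4e9 * 100                 # 1.4 billion cars at ~100 hp each
# -> about 140 billion horse-equivalents

kwh_per_year = 21e12                  # rough global electricity consumption
powerplant_horse_hours = kwh_per_year * HP_PER_KW
# -> roughly 28 trillion horsepower-hours of generation per year

ships_hp = 50_000 * 100_000           # 50,000 ships, upper bound of ~100,000 hp each

print(f"cars: {cars_hp:.2e} hp")
print(f"power plants: {powerplant_horse_hours:.2e} hp-hours/yr")
```

Rounded any way you like, the totals stay in the tens of trillions of horse-hours per year, which is the point of the thought experiment.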

So you’re telling me that adding over 30 Trillion horses to our environment, for hundreds of years, will not have an impact?

Let’s try shaving with Occam’s razor. Clearly, somebody is lying on the topic of climate change. But who, and why? The video you forwarded uses the “conspiracy to overthrow democratic governments” theory: that there is a worldwide shadow group manipulating events to scare people into a global government. Surely, if such a group existed, it could find an easier way? And nobody, anywhere in the world, has squealed on them yet?

On the other side, we have billionaires and massive corporations who have their entire fortunes tied up in fossil fuels. If renewables take off, they are no longer the richest people on the planet. They provably hire places like the Heartland Institute to pump out misdirection and outright lies to introduce Doubt, just like the same group did for the Tobacco industry about the risks of smoking.

Which group do you think is more likely to be lying? A world conspiracy of scientists forging data in a plot to take over the world, exposed only by a plucky group of oil companies and billionaires? Really?

Comparing the science on the two sides:
Over 30,000 climate scientists who work at the highest level in the field say climate change is a serious problem.
The most recent counter (a letter sent to the UN): 500 “scientists and professionals from related fields” say it is not.

So let’s look deeper at the two groups. The climate scientists who say “serious problem” had to pass the highest bar in research papers: publishing their work in peer-reviewed journals.

The “not a problem” petition has no bar. Of the 500 signatories, only a handful have a background in climate science; the majority are writers, engineers and geologists with no direct expertise in the field. Note that this argument says “I’m right because scientists agree with me.” Why then would you believe 500 “scientists” over the more than 30,000 scientists who say “increased CO2 equals increased heat”?

The UN letter is headed by G. Berkhout, an acoustics engineer from the oil and gas industry. Of the visible signers: C. Monckton, a prominent denier, has a degree in classics, not science. R. Berger is a geologist from the tar-sands and shale side of the oil business. Terry Dunleavy is a former journalist, also with no scientific qualifications. Viv Forbes is a geologist from the coal industry.

Their letter is embarrassingly bad. It basically hopes the reader doesn’t know how Google, or basic logic, works. Claiming that palm-oil plantations destroying orangutan habitat means renewables are bad is just stupid when you remember that The Whole Point Is To Stop Burning Things That Generate CO2! The typical ‘windmills kill birds’ line forgets to mention that Oil Kills More.

CO2 is a very stable molecule, so the effects are cumulative over hundreds of years. Some excess CO2 does get absorbed beyond the normal CO2/oxygen cycle, but note that a warming ocean can hold less CO2. CO2 is also a proven greenhouse gas: it retains heat. Humans ‘only’ add about 29 gigatons of CO2 per year, versus the 750 gigatons moving through the carbon cycle each year, but the thing is, there is nowhere for most of it to go! So 29 GT per year adds up pretty quickly: one hundred years’ worth is 2,900 GT! And we increase the amount of CO2 we produce each year, so the curve is actually worse than that. You can run an experiment yourself to show that CO2 retains heat. Yes, the sun and orbital conditions provide new heat, but because CO2 retains heat, we keep more heat than we used to, and it all adds up…
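A minimal sketch of that accumulation arithmetic. The flat total is straight from the figures above; the ~1.5%/yr growth rate is an illustrative assumption, only there to show why a rising emissions curve is worse than the flat one.

```python
# Why "29 GT per year adds up": cumulative CO2 over a century,
# flat vs. growing at an assumed ~1.5%/yr (illustrative only).
ANNUAL_GT = 29
YEARS = 100

flat_total = ANNUAL_GT * YEARS                           # the text's 2,900 GT
growing_total = sum(ANNUAL_GT * 1.015**y for y in range(YEARS))

print(flat_total)            # 2900
print(round(growing_total))  # well over twice the flat total
```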

You are right in that models are dependent on data, and thus are more decision aids than facts. The thing is, all major models have been back-validated (also called hindcasting), where you plug in the initial conditions from history, then simulate forward. You get something pretty close, which is the best you can do with a model of chaotic systems. I used to work for DARPA in simulation, so I am always skeptical of models too 😉 But if you run the models without the additional C02 factor, you get something wildly different. So clearly CO2 is important.
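To illustrate what hindcasting means in practice, here is a deliberately trivial toy. The CO2 series, the 3 °C-per-doubling sensitivity and the logarithmic response are all illustrative assumptions, nothing like a real climate model; the point is only the workflow: run the “model” from historical inputs, once with the CO2 term and once without, and see which run reproduces a warming trend.

```python
import math

# Toy hindcast sketch: compare a run with the CO2 forcing term to one without.
co2_ppm = [290, 300, 311, 325, 354, 390, 410]   # rough historical samples (assumed)
SENSITIVITY = 3.0                                # assumed degrees C per CO2 doubling

def simulate(ppm, use_co2=True):
    """Predicted warming relative to the start of the series."""
    if not use_co2:
        return 0.0      # the toy's "natural forcings only" baseline
    return SENSITIVITY * math.log2(ppm / co2_ppm[0])

with_co2 = [simulate(p) for p in co2_ppm]
without_co2 = [simulate(p, use_co2=False) for p in co2_ppm]

print(round(with_co2[-1], 2))   # ~1.5 degrees of simulated warming
print(without_co2[-1])          # 0.0: the no-CO2 run shows no trend at all
```

Real hindcasts do this with full physical models and observed forcings, but the same pattern holds: drop the CO2 term and the simulated trend diverges wildly from history.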

You are also right in that it is hard to buck the trend in journals. But there have been 50 years of open debate and analysis. In the 1970s there were 42 peer-reviewed papers suggesting a warming trend, and 7 predicting a cooling trend. The most recent literature survey, covering the past ten years, shows 13,950 peer-reviewed papers and a 99.99% chance of humans being the causal factor (24 dissenting papers). That’s 33,690 scientists putting their reputations on the line, not just staying silent out of fear.

I would argue that an open debate period of 50 years is enough to come to a working conclusion.

Here’s a longer one: 400,000 years, to capture the Milankovitch cycles and other naturally occurring conditions.





Windmills vs Oil: health issues, bird kill, rare earth elements, lithium batteries & cradle-to-grave cost/benefit

There is a valid question on windmills affecting health. So far, nothing definitive has been found. The interesting question is how wind stacks up against oil for health issues.

“It’s not that we don’t believe that people aren’t feeling well or aren’t sleeping well,” […] “What we don’t know is how that is related to presence or absence of a wind turbine.”

My wife, for example, totally hates the sight of wind turbines! I think they look cool in the distance, but are imposing up close.
…/can-wind-turbines-make-you-sick/

“[A negative placebo] effect may be driving people’s reported problems is backed up by a 2014 study that pointed out that health complaints are more common in areas with the most negative publicity about the alleged harmful effects of turbines. A large-scale population survey in the Netherlands found that reports of stress and sleep disturbance were more common in areas where the turbines were visible.”

“We were not against the turbines before they went in [but after] we were dizzy, had vertigo like you wouldn’t believe,” […] One theory from residents as to why these effects don’t show up in the studies is that the Vermont mountains funnel the sound in a way that the flatlands of the Midwest do not. Others say some people may just be more susceptible than others to the inaudible noise, like sea sickness.”

“The closer the respondents lived to wind turbines, the lower they ranked the quality of life of their environment. The original study found no link between sound levels and these quality-of-life ratings. Though because there is no baseline data for the sample, […] it’s difficult to distinguish whether respondents were dissatisfied before the wind turbines were installed.”

“But it does suggest that there’s something other than sound itself that influences those perceptions,”

Fossil fuels: 52,000 premature deaths per year, costing nearly $5 billion

Workers are 4x more likely to get hurt than the average US worker. Coal particulates: linked to the 4 major forms of cancer in the US, and countless respiratory problems. “Circulatory, respiratory, central nervous system, musculoskeletal, gastrointestinal and urogenital issues, as well as birth defects are elevated in coal-mining regions.”

[fracking] can introduce unsafe levels of naturally occurring toxins, radioactive materials, and toxic heavy metals into drinking water. Fracking also increases toxic smog composed of volatile organic compounds (VOCs, or hazardous air pollutants). A study published in the journal Science Advances of over 1.1 million births in Pennsylvania over nearly a decade found that women who lived within two miles of a fracking site were more likely to give birth to low-weight babies.

“Chemicals used in the natural gas extraction process and stored in open-air waste pits are hazardous to the lungs, liver, kidneys, blood, and brain.

Exposure to toxic chemicals like benzene and n-hexane from crude oil spills can cause chronic mental, physical, and physiological health effects in local residents and cleanup workers.”

“12.6 million Americans are exposed daily to toxic air pollution from active oil and gas wells and from transport and processing facilities. These include benzene (linked to childhood leukemia and blood disorders) and formaldehyde (a cancer-causing chemical). A booming fracking industry will bring that pollution to more backyards, despite mounting evidence of the practice’s serious health impacts. Mining operations are no better, especially for the miners themselves, generating toxic airborne particulate matter.”

Sulfur, mercury, car smog on a hot day; I could go on. But refusing to use wind in our energy mix because some people, not all, get headaches and can’t sleep is a fixable problem: don’t site turbines too near people. The far, far bigger health problems of fossil fuels have no such fix. Fold in the health effects of global warming and, yikes, do fossil fuels look just terrible.
…/can-wind-turbines-make-you-sick/
…/the-localized-health…/


Windmills vs oil: birds & resource footprint

Rare earth elements: used 20x more by the oil industry than by wind! And much of the extraction industry, including oil, also concentrates radioactive waste 🙂

“Rare earths were in a list of 35 minerals deemed critical to U.S. security and economic prosperity”

  • Protected by the Trump administration, and thus exempted from the trade war
  • Used in oil production, medical equipment, military gear and consumer electronics

Global REE consumption by energy application per year:

  • Fluid Cracking Catalysts: 35,000 (used in the oil industry)
  • Wind Turbine Magnets: 2,000 (the one they are whining about)
  • Transportation, Diesel Additives: 200 (oil)
  • Transportation, Hybrid Vehicles: 9,300 (oil/electric)
  • Transportation, Catalytic Converters: 5,100 (oil)

Bird Kill: oil kills 6x more each year from waste pits, spills and collisions (even more if you count air pollution deaths), and one research paper ranked turbine kill “biologically insignificant” at a major wind farm on a major migration route.

“TENORM is NORM in which the concentration or availability of radionuclides has been enhanced by anthropogenic activities such as mining or utilization. Examples include the scale buildup in piping used in oil production that contains elevated concentrations of radium and its decay products, or fly ash from burning coal.” [and REE]

I get why Trump exempted REEs from the trade war: they are used in a lot of important things, not just wind, like laptops, military gear and medical equipment. But that’s why I’d rather he subsidize the US producer; long-term, the world needs a 2nd source of this critical resource, which in turn pressures China to clean up.

This is an interesting paper on the topic: it covers the China and US REE mines, radiation from mining in general and China REE radiation specifically, and uses of REEs overall.


“Of Bayan Obo’s thorium content, 96%–98% ends up in solid waste, 0.1%–0.5% leaves in exhaust gas, and 0.6%–2.0% goes to liquid effluents.”

So corralling the solid waste really shouldn’t be a problem, but because environmental rules are barely enforced in China, they have a waste-water problem. This is what lets them undercut the US REE mine; the US requires cleanup.


Another attempt by the oil industry to smear green energy. The rebuttal:

  • Ninety percent of a turbine’s parts can be recycled or sold
  • A US company has fielded a process to recycle such blades and is ramping up capacity
  • In other countries that have used wind turbines for decades, recycling blades is required
  • A typical wind project repays its production carbon footprint in six months or less, providing 20 years of zero-emission energy.
  • Compare and contrast recyclable blades with the tens of billions of taxpayer dollars needed to clean up abandoned wells in Alberta alone. Add in TENORM radioactive waste from oil production, the “largest form of avoidable radiation in America.” Flavor with the $65B to clean up the Gulf oil spill.
  • And if somebody brings up bird kill: yup, it happens. Summary: fewer birds die from wind than from oil!
  • One paper cites 6,800 deaths out of a migration population of 69,000,000 at one major US wind farm, a rate considered “biologically insignificant”.
  • Canada:
      • Wind: 3,060 bird deaths
      • Oil production: 18,661 bird deaths
  • US:
      • Wind: between 140,438 and 234,012 bird deaths
      • Oil pits: between 500,000 and 750,000 bird deaths
  • Oil spills: 500,000
  • Power lines: electrocute over 900,000
  • Cats: 1,400,000
  • American skyscrapers: 600,000,000
  • Oil-based air pollution: uncounted but estimated in the millions (birds are unusually susceptible to carbon monoxide, sulfur and particulates).

So, yup, it sucks to be a bird. But they are actually better off with wind than oil…

Windmill and e-car cradle to grave costs

You’re right on some points, but put them in the context of building gas-powered cars vs e-cars, and of cradle-to-grave costs, which for e-cars are about half those of gas cars.

1) Yes, e-cars require production, just like gas cars and everything else in the world. Their production footprint is only slightly heavier than that of gas-powered cars (and dropping), and the resources used to make e-cars are used in everything else too.

Note also that the car industry is betting heavily on e-cars, as by 2025, they expect them to be cheaper to build than a gas-powered car (which also means a lower production footprint). They are already cheaper to operate, so the projected consumer demand is high (‘look honey! It costs less, and we don’t have to buy gas anymore!’). They’re ramping up big time; about 1/4 of the lineup for some major makers will be electric or hybrid. In ten years, 50% of the lineup.

2) The whole point is *not* to burn coal anymore. Yes, e-cars consume power, but if renewables are used, it is a big net win in resource consumption, not to mention carbon production. And given solar and wind now beat out coal for new production plants, and sometimes even gas plants, one can expect much of the grid moving this way. Given they don’t require fuel, oil/coal just can’t compete as the raw technology continues to improve and production economies of scale kick in. You just need an automated factory and you can churn out as many solar panels as you’d like. India just cancelled two huge new coal plants because the solar bid came in cheaper!

3) Yes, recycling costs are still too high. But they are coming down, driven by the strong demand for lithium batteries, which are the same ones used in your cell phone and everything else. Interestingly, lithium batteries are less toxic than most other batteries, but recycling is tough because lithium is such a reactive metal.

4) The cradle-to-grave numbers are quite good: half those of gas cars, even including battery production and today’s coal/oil-heavy power mix. So they will only get better as the tech improves and more of the grid converts to the cheaper-anyway renewable plants.

5) Wind turbines: they typically pay back their carbon production debt in 6 months, then provide 20+ years of emissions-free service.
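The payback arithmetic behind that claim is easy to check; a quick sketch using the figures above:

```python
# A turbine that repays its production carbon debt in ~6 months
# and then serves ~20 years (figures from the text).
PAYBACK_MONTHS = 6
SERVICE_MONTHS = 20 * 12

net_clean_fraction = 1 - PAYBACK_MONTHS / SERVICE_MONTHS
print(net_clean_fraction)   # -> 0.975: ~97.5% of its life is net emission-free
```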

How ‘ethical’ is Alberta oil?

A fake “personal note” claimed that their local tour guide in Prague learned in school that “Alberta is the only place you can buy ethical oil”.


“Ethical oil” is a poorly executed front for oil producers. They specialize in demonizing foreigners while hiding their ties to oil. Their sole goal is spreading disinformation on oil sands, climate change and where Canada gets its oil. They try to look like a grassroots movement, but are actually astroturf. See also: their “an open letter to Albertans” scam.

The US is the world’s biggest producer of oil, so why would ‘Prague’ consider Alberta the only source? Norway, Scotland, etc. This alone should tell you it is fake news, not a ‘student’.

But let’s take it at face value for a thought experiment: how ethical is Alberta oil?

1) Most of Alberta’s conventional oil is exhausted, so really these guys are shilling for the tar sands, which are dirtier to produce and to burn than any other source.

  • Simply saying that production emissions have been lowered is a deliberate attempt to mislead and doesn’t change the fact they are still the worst.
  • Claiming that ‘they are a fraction’ is also an attempt to deceive. CO2 is cumulative, so Canada’s share is actually much larger, as we’ve been cranking longer. We’re also the 7th worst of the industrialized countries, so, like, yay…

2) Abandoning oil cleanup commitments is not very ethical.

  • Well north of $50B of taxpayer money is required to clean up existing abandoned wells, and $100B or more for the tar sands.
  • Oil companies have walked away from their obligations.
  • Tar sands are touted as ‘clean’ yet somehow have over one trillion liters of toxic waste ponds and no plan to deal with them!

“Despite years of public promises from officials that the tailings ponds would shrink and go away, they are growing. And in the meantime, troubling gaps are opening in the oversight system meant to ensure the oilpatch cleans up its mess. Alberta has collected only $1 billion from companies to help remediate tailings — a problem that is now estimated to cost about 100 times that.”

3) Unethical oil companies are refusing to pay their taxes, hurting small towns across Alberta. The amount owed has doubled over the past year, to $173M. 

4) And how big a problem is Canada buying foreign oil anyways?

  • Canada buys light, sweet crude from outside the country, as most Eastern refineries are not set up to process bitumen from the tar sands. With peak oil in ten years, and a terrible market right now, it just doesn’t make sense to spend over 5 years and $15B to $20B on upgrades and pipelines just in time for the market to start shrinking. Alberta’s bitumen will be hit first in that shrinking market, as bitumen is more expensive to produce, to ship, and to process.
  • 54% from the US, and 11% from Saudi Arabia. Why doesn’t ‘ethical oil’ complain about the US? Because they are harder to demonize than Arabs…
  • Canada has imported less oil each year for over a decade, except a spike in 2015.
  • Quebec, which can process bitumen, gets 44% of its oil from Alberta (via pipeline 9).

They are really just trying to get federal funding for a pipeline to a coast to try and reach the overseas oil market by making people angry about ‘them dirty foreigners’ because they can’t make a business case.

The harsh reality for Alberta is that the expensive, high-polluting tar-sands projects are no longer a solid long-term investment. Funds like BlackRock ($7 Trillion) no longer invest in oil, so even if the oil market were good, capital investment would still be hard to find. Tar-sands projects in particular are vulnerable; they are not very profitable without $90/barrel prices. Today’s oil price? $31/barrel!
…/us-oil-production…/index.html
…/orphan-wells-alberta-aldp-aer-1…
…/oilsands-waste-is-collected…
…/the-alberta…/



Why were so many oil projects in Canada cancelled?

This pro-oil propaganda raises a fair point, but several facts were left out.

1) Since 2015, Quebec went from 8% Alberta oil to 44%, via the reversal of pipeline 9. That’s part of why Energy East was no longer worthwhile.

2) Trudeau stepped in with $4.5B to keep the Trans Mountain pipeline going and ground broke last year.

3) It’s hard to find capital for major oil projects right now. A lot of the above projects were proposed when oil was tickling $100/barrel. You can’t get half that now. And some of the biggest investment groups, like the $7 Trillion fund that BlackRock runs, no longer invest in fossil fuel projects.

  •  Oil project profits pay out over a 30 to 50 year period.
  •  Oil companies predict peak oil in ten years: no new growth.
  •  Growth in public concern for climate issues is risky for long projects.

 The competition from renewables also looks bad for oil in the long run.

  •   India cancelled two huge coal plants: the solar price came in lower.
  •   Europe is moving away from fossil fuels as fast as they can.
  •   China is building both fossil fuel and massive solar farms.
  •   Electric cars are projected to be mass-market cheap in 5 years.
  •   America is even money to go all in on climate policy in Nov.

4) Teck didn’t get cancelled; they walked away, for all the above reasons. Instead, they bought a solar plant. Their press release is quoted below; they stated very clearly that they support climate change mitigation, and Alberta just doesn’t do that hippy shit

“Our involvement with SunMine is part of our commitment to taking action on climate change, advancing renewable energy development, and supporting the global transition to a low-carbon economy,” said Don Lindsay, President and CEO. “SunMine will help us gain firsthand experience with solar power generation as we advance the use of solar power at other operations.”

You can’t get clearer than that to see which way Teck is moving, and you can expect more.

[Image: “Cancelled / Stalled Oil & Gas Projects in Canada Since 2015” (Canada Action): Pacific Northwest LNG $36 billion; Aurora LNG $28 billion; WCC LNG $25 billion; Teck Frontier $20.6 billion; LNG $18 billion; Mackenzie Valley NG $16.1 billion; Prince Rupert LNG $16 billion (closed); Energy East $15.7 billion; Grassy Point LNG; Goldboro LNG $10 billion; Northern Gateway $7.9 billion; Carmon Creek Sands; Aspen Oil Sands $2.6 billion; Dunkirk SAGD $2.4 billion. Total: $211.3 billion.]

The Alberta Grasshoppers And The Norway Ants

An open letter to all the recent Albertan open letters on pipelines out east, and the Albertan grasshoppers who wrote them.

This is not about Ottawa. It is not about Saudi oil being ‘less ethical’ than Albertan oil. It is not even about climate change.

It is about the effect of cheap shale oil and cheap electric cars on the long-term value of Alberta heavy crude. And it is about the carefree Albertan grasshopper and the plan-ahead ants from Alaska and Norway.

The harsh reality is that electric cars and renewable energy power plants are becoming cheaper than oil, fast. Car manufacturers worldwide are planning to switch in about five years to a predominately electric lineup; that is when they think electric cars will be cheaper to both build and run. In about ten years, oil demand will peak, then sharply fall off, because people will switch to the cheaper, cleaner energy source. And Alberta is not yet ready for this incredibly predictable event.

We all knew the Oil Boom days would eventually end. I remember the bumper stickers in the 80’s saying “Please God, let there be another oil boom. I promise not to piss it all away again.” And yet we did.

“Oil is great!” said the Albertan grasshopper, “and people will always need it!”
“But sooner or later” said the ants, “things will change, so let’s all save some money for then.”

Even though Lougheed set up the Heritage Fund in the 70s, it was quickly gutted, by Alberta conservatives, to keep taxes low and spending high. If we had followed Norway’s model, or even the more conservative Alaskan model, we could laugh at the current oil price crunch. That’s One Trillion dollars that the Norway ants saved up, and Sixty Billion dollars for the Alaskan ants. The Albertan grasshopper has $16 billion.

If we had invested in renewables, we could be Canada’s leader in the next wave of energy. If we had invested in carbon sequestration technology, we could have stretched out the appeal of oil further. Instead, we doubled down on the oil sands, let corporations extract billions in profits and basically pretended the boom days would last forever.

So who’s to blame? Ottawa didn’t cause the shale revolution that flooded US pipelines with better, cheaper oil than our heavy crude. Ottawa didn’t cause people to protest new pipelines in the USA or Canada. If you voted for the conservative policies of lower taxes and lower savings, you caused the problem. If you opted for oil as Alberta’s only economic base, and let oil companies leave with billions of dollars, you caused the problem. And if your anger really forces you to leave Canada, remember you’ll still have to get Canada’s buy-in on a pipeline to either coast, so I’m not sure what your plan is here?

The fact is that the tens of billions of dollars and five or more years to create a pipeline out east and a refinery to process oil sands just doesn’t make business sense for a product that peaks in ten years. And the biggest refinery out east has already said they’d keep buying Saudi oil; it is cheaper to buy, cheaper to process, and doesn’t require billions in refinery upgrades.

If your goal is really about getting more oil to the world market, please drop the Saudi argument and be honest about it. Then check the world market for heavy crude. Asia prefers Venezuelan oil, as it is simply cheaper. The US is drinking the success of shale and has only limited interest in heavy crude. Europe is going renewable faster than anyone. The reason the BC Kitimat Clean pipeline/refinery/tankers project is now seeking a federal handout is that no business wants to invest in something so risky with such a short window for return on investment.

So what to do about it? Oil is so useful it will have a long lifetime past peak demand, but primarily in long-haul transit and derivatives, and cars will take a long time to age out everywhere. Plan around that: lower sulfur emissions and stretch out the maritime oil market. Lower the cost of oil sand processing.

But really, this time, start prepping for the end of the oil boom. Diversify the economy. Please.

And if you really have to get angry at someone, look in the mirror.

One last note: climate change. It doesn’t matter whether you believe in it; the fact is most non-oil people now do; over 60% of Americans are now concerned. This means more restrictions globally on oil, coming soon to a market near you. So myth or not, plan for a harsher world view of Alberta oil, which has the heaviest carbon footprint of any oil.

And a side note to the marketing agency coordinating this fraud of an “open letter” campaign. Have your shills do less copy&paste; it is way too easy to tell these letters are all coming from a single source.

My Automated Testing Trail and My Executable Use Cases Approach

I’ve been using automated testing as a development-acceleration tool since 1989, when it saved me a lot of grief as a programmer in the super-computing world. I moved my first team to my form of test-driven development in 1990. Since then I’ve tried to advance my testing strategies with each new software project, and as part of my self-development plan I’ve given two new automated-testing lectures each year, for ten years.

My focus is not just lowering the cost of quality for the user via automated testing, but also improving developer quality by better infrastructure, less waste and faster iteration loops.

I’ve been one of the major leaders in bringing automated testing and metrics-driven development into the gaming industry, but not just from a quality perspective. I follow the Lean school of thought; if you attack the quality improvement problem by improving the production processes, you end up with both higher quality -and- faster development times.

    • A summary presentation of my automated testing approach in games
    • Automated metrics collection and aggregation is an under-served portion of the automated testing problem
    • I co-authored an MMO Engineering textbook, writing the chapters on automated testing and metrics aggregation for online games
    • Overall, I’ve done a dozen industry lectures on accelerating production via automated testing, metrics & architecture
    • As part of my personal growth process, I’ve done at least one talk on a new aspect of automated testing for over a decade
    • At EA, I revolutionized the testing process for The Sims franchise and helped kickstart other testing projects in other studios. We created one of the first fully automated build/deploy/test/measure pipelines in the game industry (2001). My approach changed the game’s architecture to support easy automated testing, which allowed us to support load testing, regression testing and CI/engineering tests via a single test system, and for some games, via a single test client
    • My auto-test approach differs from most: I test and measure at the player experience level, and modify the code architecture to be more testable. This radically lowers the cost of testing and increases malleability as the product shifts over time, and supports the huge amount of iterative development required in interactive systems
  • Before games, I was also responsible for some of the earliest advances in automated testing, and I’ve iteratively improved my techniques with every project since 1989. Specifically, I’ve designed and built testing tools for engineering speed, performance testing in super-computing, and functional/compatibility testing across ranges of super-computing and clustered computing options. In 1990, I created one of the first test-driven development approaches: I had all engineers on the team writing tests (in my custom harness) before writing their code; all code had to pass before checking in, and we also had one of the earliest nightly build systems that ran unit tests, full system tests and performance tests each night. I also designed the load testing system for the HLA RTI 2.0 (the military standard networking engine for distributed virtual worlds used in training simulations) when I was a DARPA contractor in Advanced Distributed Simulation and tightly-coupled clustered computing.
  • My long-term goal is to increase innovation by taking cost, risk and time out of the problems in building interactive systems.
  • This is a test plan (and simplistic functional testing code sample) I did for Blizzard. They described it as the best test plan they had ever seen.
  • My current work in Lean Game Factories is based heavily on my custom automated testing approach for interactive systems. We’ve built a continual deployment pipeline that does the usual unit/functional testing, but also performance testing, on devices and at load, for each code check-in. By tickling the system under test in different ways, we’ve managed to support every part of the game team, in different ways:
    • Game designers and monetization teams: a decision aid tool in early analysis (player bots that play through all the content, every night, with automated metrics aggregation on balancing data)
    • Engineering: performance testing (client and server)
    • Upper Management: prediction of progress
    • Daily Management: automated collection of Kaizen-style Waste and Friction metrics (essentially automated Production Efficiency Metrics, including heatmaps of defects and change rates per code module, trended over time, as well as common failures or slow tools that interfere with production)

I can (and do) talk all day about how to improve automated testing and expand the use cases into all aspects of production. But I’ll stop here for now 😉

Top five metrics mistakes in games

Here are the top five mistakes I’ve observed when a project tries to implement a metrics program. These are generalities extracted from multiple observations and, as such, are intended to provide rule-of-thumb guidance, not rules chiseled in stone. On-the-ground conditions in any given project may require a metrics solution tailored to its specific needs.
Note that some very important metrics usually get up and running without much risk. For example, channeling user behavior metrics into the game design group is such an obvious mission-critical task that it will usually happen even if the game designer has to buy an SQL textbook. Thus few user behavior metrics appear in the five most common mistakes below.

Top five mistakes in metrics
One: No application of metrics in task assignments and project management.
Two: Failing to measure factors that affect team efficiency and delivery schedules.
Three: Raw, singleton data and manual distribution don’t work. You must automate the entire collection, aggregation and distribution cycle.
Four: Not having senior engineers involved in the architectural analysis, implementation design and growth of your metrics system will either cripple or kill your metrics project.
Five: Not using metrics generated via repeatable automated tests at the front end of the production pipeline to prevent defects from moving further down the production line.

ONE: No application of metrics in task assignments and project management.
a)    Without a measurable goal, it is unclear when a particular task is considered done, or rather, done well enough. The developer has little incentive to do more than get the task done in the minimum amount of time: people respond to the way their performance is measured. Completeness, stability, performance, scalability and other critical factors tend not to be addressed unless they are part of the Measures of Success for a given task, or until they become a serious problem. This can result in a very high go-back cost: the time spent fixing defects in a module or in other, connected modules. To paraphrase one senior MMO engineer, “using metrics in my task allowed me to significantly improve performance and remove some bottlenecks. But my question is: why would I ever use metrics again, unless it is out of the goodness of my heart? My manager did not specify anything beyond getting the feature to work; not how well it worked, or how stable it needed to be. So if I spend time improving my module via metrics, I have, in my manager’s eyes, achieved less work that week: I could’ve left my first task alone and gotten other tasks done instead.”
b)    Metrics also help to accurately focus staff on real problems, not perceived problems. For example, if a system is failing to scale, there are two paths to follow. The common approach is to gather the senior engineers together, have them argue for a while about what might be causing the problem, then implement one of the educated guesses and hope to get lucky. The other path is to place some metrics probes in the system that is failing to scale and then run a test. With the resultant metrics, it is usually much easier to find where the problem is, implement a solution, and rerun the tests to see if the scalability numbers have improved.
c)    Before we had implemented an effective metrics system on TSO, engineers were tasked mostly by educated guessing: we had no way to observe what was going on inside our game and were thus trying to debug a large-scale, nondeterministic black box, with very little time remaining. Once we had effective metrics, server engineers were tasked mostly via metrics coming out of automated scale testing. Our production rate soared.
d)    Aggregated data also provides an easy, excellent focusing tool. A Crash Aggregator can pull crash frequencies and locations per build to provide the number of crashes at specific code-file and line-number locations. Prioritization then becomes quite simple. If you know that bug 33 crashed 88 times in build 99, you know that it is a more critical fix than bug 1 that crashed once in build 99.
e)    Lack of metrics-driven task assignment is particularly deadly in the iterative, highly agile world of game production, which has some pretty deep behaviors burned into how things are done. Further, agile development is sometimes the pretext for programmers to continually change what they want to build, on the fly. Gold-plated toilets in a two-story outhouse are often the result… Customer-driven metrics, task-driven metrics and team efficiency metrics are good antidotes for keeping teams focused.
f)    The risk of building necessarily partial implementations on the fly is that “something” is working by the due date, but it has only a fuzzy possibility of being correct. Further, the go-back costs are not accounted for in the schedule and thus become “found work” that adds unexpected time to the schedule. Of course, many features are experimental in nature: they may shift radically or may not make it into the final game, so it makes sense to build as little as possible until the system requirements are understood. This is still very addressable via metrics: as part of such experimental tasks, simply define the Measures of Success for the initial task as “implement core functionality only” and address the rest later.
g)    Example: when building an inventory system, you need to deliver enough of that system so that some play testing can be done, but you don’t need to cover all edge conditions upfront. Instead, you define and build only the core functionality and deal with edge conditions later, when you will actually have firmer knowledge of how the system is used and what it is expected to do. Using Inventory as an example, the core functionality is simply <add item; remove item; show current items>. Completions of such features are easily tested and measured, and are thus easy to keep stable in the build and in gameplay. Similarly, once the final inventory requirements are known, the measurable conditions of “ready for alpha” or “ready for launch” are easy to define. In this case, the final acceptance metrics would be something like: with 30 items allowed in the inventory, delete one item, then test and measure that the inventory count goes down by one; verify that the item has actually been removed (from the user’s perspective); verify that all other items are still in the inventory; verify that adding a 31st item does not damage the existing inventory items and that an appropriate error message is given; verify that deleting a nonexistent item returns failure and leaves all existing items intact; verify that deleting a nonexistent item from an empty inventory returns failure, and that adding a real item to that potentially corrupted empty inventory still works afterwards; etc.
h)    Metrics allow tying production and operation actions to the big three business metrics: cost of customer acquisition, cost of customer support and cost of customer retention. And if you can quantify an improvement you want to make in the game and track how it affects the big three business metrics, you can do what you need to do: no fuss, no muss.
i)    Finally, without project management using task completion metrics, identifying the current state of game completion and projecting long-term milestones are at best exercises in wishful thinking. This tends to result in projects that inch closer and closer to their launch date, with little actual idea of what will happen then, or even if the game will be complete by then. With early, accurate measures of completion, actions can be taken early enough to improve projects at risk: adding staff, cutting features or pushing back the release date. Without early, accurate measures, by the time the problem is detected it is too late to do anything about it.
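The inventory example from (g) is concrete enough to sketch as code. Below is a hypothetical, minimal Inventory with only the core functionality, plus the acceptance checks described above; the 30-item capacity and the item names are illustrative, not from any real game.

```python
# Minimal hypothetical inventory: core functionality only (add/remove/show).
class Inventory:
    CAPACITY = 30

    def __init__(self):
        self.items = []

    def add(self, item):
        if len(self.items) >= self.CAPACITY:
            return False  # appropriate error instead of corrupting state
        self.items.append(item)
        return True

    def remove(self, item):
        if item not in self.items:
            return False
        self.items.remove(item)
        return True

    def show(self):
        return list(self.items)

# The acceptance checks, straight from the prose above.
inv = Inventory()
for i in range(30):
    assert inv.add(f"item{i}")
assert inv.remove("item0")
assert "item0" not in inv.show()   # removed from the user's perspective
assert len(inv.show()) == 29       # count went down by exactly one
assert inv.add("extra")            # back up to 30 items
assert not inv.add("one_too_many") # 31st item rejected...
assert len(inv.show()) == 30       # ...without damaging existing items
assert not inv.remove("ghost")     # deleting a nonexistent item fails...
assert len(inv.show()) == 30       # ...and existing items are intact
empty = Inventory()
assert not empty.remove("ghost")   # fails on an empty inventory too
assert empty.add("real_item")      # and adding afterwards still works
```

Because every check is mechanical, this exact sequence can run on every build, which is what keeps the feature stable as the game shifts.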

TWO: Failing to measure factors that affect team efficiency and delivery schedules.
a)    Large teams building large, complex systems can be crippled by small problems, brittle code, non-scalable tools and lack of development stability. Even if individual developer efficiency drops by only 10%, a 100-person team takes a serious hit in the amount of work done, each and every week.
b)    Some such factors are build failure rate, build download time, build completion time, game load time, components with a high go-back cost, time from build start until the build is downloaded to QA, time until pass/fail data reaches production, server downtime, etc. These and other critical-path tasks not only slow production; they are also mission-critical problems in operations.
c)    Measuring bottlenecks in your content production pipeline can point to places where automation could be added to speed up production; if server stability, database incompatibility or broken builds are recurring bottlenecks, the engineering team then has an actionable task that will measurably improve content production. On TSO, we found that such bottlenecks, despite being widely known as a problem, were not tagged as priority problems to solve! The management team was under tremendous pressure to build features and add content, and assigning resources to fix a fuzzily defined artist-tool issue instead of putting more pixels on the screen is a hard sell. So the problems were always dismissed: “oh, the build probably doesn’t fail often anyway”, “it probably doesn’t affect the team very much when it does”, or “oh, we probably won’t have another Perforce failure; we must’ve found them all by now”. But when we quantified the number of build failures in a week, multiplied by the size of the team and by how long it took people to resume forward motion, stabilizing the build became a top-priority problem. Lost team efficiency via a poor production environment is one of my favorite metrics. It has always resulted in tool improvements and a faster, more stable production cycle, one that makes it easier to project delivery times for large-scale systems. In a TSO postmortem, the senior development director stated that “[stabilizing the build] saved us.”
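The arithmetic that shifts priorities here is simple enough to sketch. All numbers below are invented for illustration (the real TSO figures aren’t given above); the point is that failures × team size × recovery time turns a “fuzzy tool issue” into a concrete weekly cost.

```python
# Back-of-the-envelope cost of an unstable build (all numbers hypothetical).
failures_per_week = 5       # broken builds hitting the team each week
team_size = 100             # people blocked by a broken build
hours_to_resume = 0.5       # average time to notice, wait, and resume work

person_hours_lost = failures_per_week * team_size * hours_to_resume
work_weeks_lost = person_hours_lost / 40.0  # assuming a 40-hour work week

print(person_hours_lost, work_weeks_lost)
# 250.0 person-hours lost: over six full work-weeks of output, every week
```

Once the number is that large, assigning an engineer to stabilize the build stops being a hard sell.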

THREE: Raw, singleton data and manual distribution don’t work. You must automate the entire collection, aggregation and distribution cycle.
a)    Building a series of one-off metric systems that do not support the entire metrics collection/aggregation/distribution cycle is a path to duplicative dooms. One-off systems quickly rot, which is why you can find so many dead ones littering your code base; people hack together what they need for the moment and then they are done with it. And when you next need a number, you’re back at square one: the old hacks are dead so you hack in a new metrics ‘system’.
b)    One-off tools do not generally support correlation and aggregation across multiple databases, nor do they generally have team-wide distribution built in or sophisticated visualization systems.
c)    One-off systems generate only a specific type of report, must be run by hand whenever the data is needed, and delivery to others is by whim or by e-mail. In other words, the data is not actionable. To be actionable, a metrics report must capture specific data points before a given task is started and report any changes in those data points after the task has been completed. Such reports are “breadcrumbs” that quickly lead the developer to the problem and show when the problem is solved.
d)    A team-wide Metrics Dashboard helps improve the efficiency of developers by supplying real-time views into the most common and the most critical reports. This also helps improve the efficiency of your build masters and senior engineers, who are continually distracted by questions such as “where’s my stuff in the build pipeline?” or “why is this <thingy> broken?”
e)    Lack of automation in a metrics system means somebody is going to have to continually do a lot of data aggregation and communication tasks. This generally leads to people working with what they know: a simple, one-of-a-kind spreadsheet that is built and then discarded, or some incredibly complex spreadsheet built by someone who then enters massive amounts of data by hand and e-mails the results. Sounds very real-time and accurate; a tool that people would love to use, right?
f)    Your metrics system also needs to support calibration: you use the results in critical business decisions and you need to know that the numbers are accurate. Nondeterministic factors can be removed by, for example, running hundreds of tests, aggregating the runs, eliminating the outliers and averaging the middle third of the results. This is a typical function that the report builder tool needs to support.
g)    Using metrics in multiple areas also helps to prevent code rot: the system is always in use, and therefore will be kept up to date. Further, a stable, feature-rich metrics system reduces the incentive for engineers to create one-off metric systems, thus preventing duplicative, wasted work. Finally, if the metrics system is on the production/operations critical path, not only will it remain active, it will be continually grown by the people using it in day-to-day tasks.
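The calibration pass described in (f) might look like this as code: aggregate many runs, drop the outliers, and average the middle third of the sorted results. The sample data is made up; a real system would pull the runs from the test database.

```python
# One possible calibration function: average the middle third of sorted runs,
# discarding the noisy high and low outliers.
def middle_third_mean(samples):
    s = sorted(samples)
    third = len(s) // 3
    middle = s[third:len(s) - third]   # keep only the central values
    return sum(middle) / len(middle)

# Hypothetical timing samples (ms) from repeated runs; two are clearly noise.
runs = [102, 99, 230, 101, 98, 100, 35, 103, 97]
print(middle_third_mean(runs))  # 100.0 -- the outliers (35, 230) are ignored
```

A plain average of those runs would be badly skewed by the two outliers; the trimmed mean is stable run over run, which is what makes the number trustworthy in a business decision.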

FOUR: Not having senior engineers involved in architectural analysis, implementation design and growth of your metrics system will either cripple or kill your metrics project.
a)    A metrics system capable of supporting a large-scale online game is a complex system in and of itself. A poor metrics tool will be a hard sell into a production team that has gotten along without metrics before, or the tool could be integrated into a project and then crack at the seams as the software and customer base scale up. Examples of tasks that are beyond junior engineers to complete without guidance: tailoring the system to meet on-the-fly priority requests, deciding which key metrics to capture, making the system flexible enough for easy addition of new metrics and rapid aggregation/calibration of new reports, making the system scalable, and building an easy user interface for the complex aggregation/calibration/new-report functions.
b)    In other words, the design and implementation of a mission-critical tool usually falls down the programmer pecking order to the people least likely to make the correct decisions: to correctly implement a real-time report creation or viewing tool on a massively scaled metrics database that imports data from multiple external sources, or to correctly aggregate data from multiple, radically different databases.
c)    Example: correlating game data with data from CS or social networks can produce a profit/trouble ratio for customers, or detect bots and hackers. One could correlate game features to network costs and suggest game changes to lower network costs, or network changes to strengthen gameplay. One could easily find the players that generate the highest revenue with the lowest hassle. One could easily expand the detection of a single hacker into finding the other hackers associated with them, or even different hackers using the same basic patterns. You could also correlate which game features create more and stronger social building blocks, and thus broader social networks. If you know most of your friends from playing an online game, that shared experience strengthens the customer retention factor.
d)    Failing to collect “metrics on metrics”, which let you see how the team is using your metrics system: which features are popular, who uses what, and what the response time is for users creating or viewing a report.
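The profit/trouble ratio from (c) is the kind of cross-database aggregation a junior one-off tool rarely gets right. As a sketch, assume hypothetical per-player revenue from the game database and ticket counts from the CS database (all names and numbers invented):

```python
# Join revenue (game DB) with support tickets (CS DB); all data hypothetical.
revenue = {"p1": 120.0, "p2": 5.0, "p3": 50.0}   # revenue per player
tickets = {"p1": 1, "p2": 8, "p3": 0}            # CS tickets filed per player

# Profit/trouble ratio: high revenue with low hassle scores highest.
# The +1 keeps zero-ticket players from dividing by zero.
ratio = {p: revenue[p] / (1 + tickets.get(p, 0)) for p in revenue}

best = max(ratio, key=ratio.get)  # the player worth the most per unit of hassle
print(best, ratio[best])          # p1 60.0
```

In a real system the two dictionaries would come from radically different databases with different keys and update cadences, which is exactly why senior engineers need to own the correlation layer.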

FIVE: Not using metrics generated via repeatable automated tests at the front end of the production pipeline to prevent defects from moving further down the production line.
a)    The earlier you detect a defect, the better.
b)    The further you let a bug go down the production pipeline, the more expensive and time-consuming it is. Bug verification, bug assignment, bug replication, bug tracking, bug fixing and fix verification generate expensive noise that hinders already busy people.
c)    The more you allow defects into your build, the more you affect the productivity of the entire team! One half hour of a junior engineer can drop in a build bomb that freezes your team for the hours it takes to find, fix and create a new build.
d)    Even worse, tracking down hard problems often requires your top technical people, who could otherwise be building useful systems! I measured a few such bugs on TSO: one little problem in the build consumed about 30 hours from five of the most expensive people on the team.
e)    Using metrics generated by repeatable automated tests before check-in catches buggy code and prevents it from burning team-wide time.
f)    Many of the most valuable metrics in an online game can only be accurately produced via repeatable automated tests. Failure to integrate your metrics system with an automated testing system will at worst kill your project and at best cost you time and money that you might not have.
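A pre-check-in gate built on such metrics can be sketched in a few lines. The test names, metric values and budgets below are hypothetical; a real gate would run the actual automated test suite and compare its measurements against stored baselines.

```python
# Sketch of a pre-check-in gate: run repeatable automated tests, collect
# their metrics, and block the check-in if anything exceeds its budget.
def run_gate(tests, baselines):
    """tests: {name: callable returning a metric}; baselines: {name: worst allowed}.
    Metrics here are 'lower is better' (e.g. load time, frame time)."""
    failures = []
    for name, test in tests.items():
        metric = test()
        if metric > baselines[name]:
            failures.append((name, metric, baselines[name]))
    return failures  # an empty list means the check-in may proceed

# Hypothetical measurements from this candidate change:
tests = {"load_time_ms": lambda: 410, "frame_time_ms": lambda: 16}
baselines = {"load_time_ms": 500, "frame_time_ms": 17}

assert run_gate(tests, baselines) == []  # within budget: check-in allowed
```

If a regression does slip through the thresholds, the failure tuple is itself a breadcrumb: it names the metric, the measured value and the budget, so the developer knows exactly what to fix before the broken build ever reaches the team.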

Innovation Factories

Innovation requires a tremendous amount of iterative experimentation to capitalize on infrequent glimpses into a new, potential future. Doing this at scale is prohibitively slow and expensive, leading to a vicious cycle where cost and risk greatly limit our ability to expand into new fields.

Video games are a perfect case study for this problem. True innovation occurs rarely, while most of the industry flounders around, squabbling over increasingly smaller shares of the market with increasingly larger numbers of competitors. Then, when a new niche is opened up via either new game play models, new business models or new technology, the clone wars immediately start and profit/growth decline. Innovation keeps you ahead of the curve, but is harder to schedule and harder to fund.

An Innovation Factory approach is proposed, where automation is heavily utilized to lower both the cost and risk of innovation. Further, said automation is capable of taking innovative prototypes directly into the marketplace, bypassing the productization phase without risking scaling and quality control issues. Finally, an Innovation Factory is not just a way to get new products to market: it also allows rapid, iterative adjustment to true market conditions, and scaling to meet market demand.

Online games and Virtual Reality applications are excellent candidates for Innovation Factories. Much of the Production and Operations work is highly susceptible to reusable automation techniques, lowering the cost/risk of development and also lowering the recurring costs of running highly complex distributed systems with high quality control requirements, high content refresh rates and low operating cost requirements. The scale of the development team, the scale of the potential user base and the scale of the application complexity all greatly limit innovation opportunities for most content creators: there is a direct relationship between iterative innovation at scale and the cost/risk/time of new products.

AAA products require AAA production techniques, currently only available to dominant market players, such as EA and Ubisoft.

  • Architectural support for iteration and Automation of the content creation, live operations and testing processes are useful in and of themselves.
  • When coupled with real-time Analytics of Players, Production and Performance, the whole becomes greater than the sum of the parts.
  • The result: a transformative leap in effective creativity, coupled with the ability to take rapid prototypes directly into the marketplace, all at radically reduced cost/risk/schedule factors.

Why is this not done already?

The market opportunity here is deceptively simple. The death rate of projects and studios limits retention of painfully learned lessons on how to scale development without crippling innovation. Every project starts from scratch, with a new group of people, and it is always three months late, from day one. Senior engineers who learn these lessons get frustrated and leave the industry, to be replaced by young programmers who have dreamed only of making games, not scalable software and processes, and thus the vicious cycle repeats itself. Couple that with the lack of corporate memory caused by the churn rate of projects and studios. Add the fact that to make this work, you need a deep understanding of all aspects of the experimentation process and mindset: how to construct complex code/content that needs to shift direction almost daily, how to test rapidly and cheaply, at scale, how to cheaply field live operations of brittle prototypes, and how to modify them on the fly, quickly enough to react to shifts in the ecosystem. Factor in that most of the people with the background and talent to pull these challenges off are only in games to work on gameplay features, not infrastructure. Top it off with the credibility and business/communication skills required to convince funding agents that investing in invisible infrastructure from the start is more valuable than pure feature work, and it becomes clear why this is not done every day…


  • Quality and speed, at scale
  • Innovation at scale
  • Accelerating the experimentation rate drives both innovation in play mechanics and the highly iterative polish of the user experience so essential to success
  • Take Innovation directly to the market: put your rapid prototypes directly into the live market, continuing to iterate on design with one hand and on cost/scale/stability with the other

We need a massive shift in the mindset of how we build complex, interactive systems. A shift capable of fostering innovation at both the grass-roots level and the large-corporation level. But it is not enough to simply come up with new ideas; we also need a way to take new concepts directly, and with quality, to market, then quickly and iteratively improve them against real-world conditions.

What has worked before that we can copy from? Darwin-Driven Development!

  • Evolution rocks! It always finds a good solution to the current problem, and a way to adapt to shifts in the market/ecosystem. But who has millions of years to get to market?
  • So we accelerate the huge random search called evolution
    • Automation to speed each step
    • Embedded metrics to help prune branches early and react to on-the-ground conditions
    • More automation, performance-test centric, to allow a rapid change rate without killing the code/product
  • Guided evolution, not random evolution, accelerated via automation!
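The loop above can be sketched as a toy program: mutate candidates, score each with an embedded metric, and prune everything but the best each generation. The fitness function is a stand-in for real player/performance metrics, and every parameter below is invented for illustration.

```python
# Toy sketch of guided (not random) evolution, accelerated via automation:
# each generation, automation produces candidates, an embedded metric scores
# them, and pruning keeps only the best.
import random

def evolve(seed, fitness, generations=50, children=8, rng_seed=1):
    rng = random.Random(rng_seed)  # fixed seed: repeatable, like a good test
    best = seed
    for _ in range(generations):
        candidates = [best + rng.uniform(-1.0, 1.0) for _ in range(children)]
        best = max(candidates + [best], key=fitness)  # metric-guided pruning
    return best

# Hypothetical metric: designs closer to 10.0 score higher.
result = evolve(seed=0.0, fitness=lambda x: -abs(x - 10.0))
# result converges close to 10.0 in a few dozen generations
```

The guidance is the point: a blind random search over the same budget would wander, while the metric prunes bad branches early, which is exactly what the embedded metrics in the bullets above are for.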



Metrical Mistake: Andre Iguodala gets robbed of the NBA 6th Man Award

Metrical Mistake: Andre Iguodala just got robbed of the NBA 6th Man Award. Voters clearly looked only at the easy-to-measure, sexy but weak metric: points scored. I would argue that by publishing these bad metrics, we give kids the wrong idea of which skills to focus on in their training.
1) Metrics are like fire: a powerful servant, but a dangerous master.
2) Singleton metrics aren’t worth spit. Use groups of metrics, ones that tie directly to the core goal: in this case, winning games.

Jamal Crawford, of the Los Angeles Clippers, took the award for the third time, which you might think meant he was quite valuable.

By scoring a lot, in bursts, he is ranked highly. But his defense is so bad, his team is actually better off without him on the floor. Yes, Crawford scores a lot of points, but he lets the other team score even more!

Iggy was Finals MVP last year for the 2015 NBA champions, the Golden State Warriors. That’s a pretty solid indicator of value. The award is decided by a panel of nine high-end media members, after their detailed watching of and commenting on every game in the series. Experts with solid data: the qualitative view. Another qualitative view: Dre is a key part of the so-called Death Squad, the top-ranked lineup in the NBA. When these five guys are on the floor, they completely dominate. Everyone. Like, at a shocking level. The coaches put these guys on the floor together any time the game is on the line, and they deliver at the best rate in the conference. Jamal gets on the floor, and his team gets worse! So why did Crawford win? He’s a chucker: he physically can’t touch the ball without shooting it.

Let’s add a quantitative view to help out here: measured player actions and their impact on the overall game score. Then we compare a few different sets, and check for noise. Like, if a player you just know isn’t that great comes up rated in the top ten in the X composite metric, you know there is a flaw somewhere.
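For illustration, a tiny composite with invented weights and invented stat lines (not real NBA data) shows how a scoring-only view and a composite tied to winning can disagree:

```python
# Toy composite rating: weight several stats by their contribution to winning.
# Weights and player numbers below are invented for illustration only.
def composite(stats, weights):
    return sum(stats[k] * w for k, w in weights.items())

weights = {"scoring": 1.0, "defense": 1.5, "assists": 0.8, "turnovers": -1.0}
player_a = {"scoring": 14, "defense": 9, "assists": 4, "turnovers": 1}  # strong defender
player_b = {"scoring": 20, "defense": 2, "assists": 2, "turnovers": 3}  # chucker

# Points alone favor player_b (20 vs 14), but the winning-tied composite
# favors player_a (29.7 vs 21.6).
assert composite(player_a, weights) > composite(player_b, weights)
```

This is the singleton-metric trap in miniature: any one stat can be gamed or misleading, while a group of metrics tied to the core goal is much harder to fool.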

Example: one willowy center was getting killer PER numbers, one of the best aggregated stats. He was an anomaly: excellent numbers, but only in very specific scenarios and with very specific people around him. You knew he was an anomaly because he was a bench player who got limited burn; way less burn than any sane coach would give a normal player with such a high PER. Always validate your hard data with qualitative views.

So let’s look at Iggy and Jamal with several advanced stats, drawn from

Wow. Iggy outranks Jamal in almost every advanced stat! And in the regular stats, Jamal is ahead only on scoring-related statistics. Not shooting accuracy, not rebounding, not assists, not turnovers. Chuckers like Jamal are very overvalued as a result of people only seeing the easy metrics.

A final sample: Jamal’s Defensive Real Plus-Minus is the lowest of anyone on his team, and is ranked 454th in the league. Andre’s rank in the league: 6th. And let’s not get into hustle stats 😉

More details:

[update] Whee! Somebody did a similar analysis; let’s shift the voting patterns for awards!


A writer’s retreat weekend in Amsterdam

A very good writer’s retreat weekend in Amsterdam! The train ride is 5 hours, but it is such a smooth user experience that I got a lot done on the way here, and I’m more willing to travel via train than air. Then a nice evening of lingering over garlic mushrooms and Irish coffee: making notes on some interesting software development problems at the office, and some preliminary notes for a new talk on rapid iteration.

You have no idea how hard it is to find a coffee shop in Amsterdam that actually sells, you know, actual coffee. With caffeine. But by a spectacular coincidence, they’re playing two of my favorite writing albums: I’m jamming with Miles Davis on Kind of Blue and bopping with Bob Marley’s best! So I guess I have to stay for a bit, right?

Three out of three hits on the random, walking-around food scene in Amsterdam! Wrapping up the weekend with a droolingly good ribeye steak from Argentina: a fantastic flavor experience after a few months in Germany, where the beef just brings the wrong type of tears to the eye. I am writing up more notes on scalability and rapid iteration in game production. This, I am beginning to feel, will be a most interesting talk: the 25-minute limit is making me really think about the core messages and the delivery framework, which is producing some interesting, clarifying moments.