I have a small request to the gaming industry, and the gaming community in general: please quit thinking that Kills/Deaths is actually a statistic that is worth a damn.

[ Time to blow away some dust from the game blog, again! More rambling will be underway, as usual. This blog entry goes in the category of often rambled but surprisingly unblogged topics. =) ]

I thought this was a pretty interesting notion to bring up now that Halo 4 was released. While Halo: Reach statistics revel in simplicity, Halo 4 presents a completely different kind of point system that rewards other actions in the game besides just kills. The bottom line is, people need to start thinking more about completing game objectives than raw mathematics.

For some obscure reason, there’s still a whole lot of video games and a whole lot of game communities that treat K/D as a valid statistic. It’s presented in many video games as an easy-to-comprehend statistic that is supposed to compare your skills against other people. What could be easier to comprehend than this? Take your kills, divide them by number of deaths. You get a number. Bigger is better. Video game developers keep putting this statistic in the games because people are used to seeing it, and it’s very easy to implement.

The truth, of course, is that it’s absolutely terrible for comparing anything.

Allow me to enumerate:

Simple mathematics won’t cut it

I think this goes without saying: People should be extremely sceptical of very, very complicated things that are supposedly solved with grade-school mathematics.

Statistics is a huge branch of mathematics. It deals with a whole lot of mindbogglingly complicated and tedious stuff. You can be a perfectly competent video game programmer if you know a little bit of advanced mathematics, but you’ll soon find out that most of the math game developers hack together involves a whole bunch of trigonometry. Statistics just isn’t something the bosses will be interested in. Our intrepid developers will just open the elementary statistics books and go “hmmmmmm, I suppose I could write some of this stuff.”

And I have to say one thing: I’ve done little games. I’ve done graphics hacking. Some people think my stuff is a little bit neat. And I have no fucking clue about statistics. If someone asks, I’m going to admit that.

So it’s obviously very, very tempting to stick a very elementary form of statistical thingymabob in a video game. People will trust you. You’re a highly acclaimed, highly paid programming god working for a large video game company, after all. People everywhere will trust you to pick the best possible statistical algorithm for player comparisons. Your boss will trust you to do your work properly. …and then you refuse to acknowledge that K/D is bullshit and try to pass it off as a legitimate statistical tool. What will happen? If you answered “your boss will put thumbs up and gamers will praise your wisdom”, congratulations!

If you want to know why this is hard, try this article: How Not To Sort By Average Rating. You have a programming problem that sounds very simple: you have items that have positive and negative ratings, and you need to sort them by average ratings. It’s just a matter of subtracting negative votes from positive ones, right? A simple solution looks the easiest. And of course, it’s completely wrong. Correct answer? “Lower bound of Wilson score confidence interval for a Bernoulli parameter.” Hmmmm, hadn’t heard of that one, if you allow a little bit of an understatement.
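For the curious, here is roughly what that mouthful looks like in code. This is only a minimal Python sketch of the Wilson lower bound as I understand it; the function name, the default z value of 1.96 (roughly 95% confidence) and the sample vote counts are my own assumptions for illustration, not anything taken from a real game or rating service.

```python
import math


def wilson_lower_bound(positive, total, z=1.96):
    """Lower bound of the Wilson score interval for a Bernoulli parameter.

    positive: number of positive ratings (hypothetical input)
    total:    number of ratings overall
    z:        normal quantile; 1.96 is roughly 95% confidence (assumed default)
    """
    if total == 0:
        return 0.0  # no votes, so we claim nothing
    p_hat = positive / total
    z2 = z * z
    centre = p_hat + z2 / (2 * total)
    margin = z * math.sqrt((p_hat * (1 - p_hat) + z2 / (4 * total)) / total)
    return (centre - margin) / (1 + z2 / total)


# Made-up example: one lone positive vote versus 90 positive votes out of 100.
lone_vote = wilson_lower_bound(1, 1)
many_votes = wilson_lower_bound(90, 100)
print(lone_vote < many_votes)
```

The thing to notice is that a single perfect vote ranks below a large pile of mostly positive votes, because the bound accounts for how little evidence one vote really is. Plain subtraction or a plain average has no way to express that.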

Of course, I’m not blaming programmers. Game programmers are quite smart, and definitely capable of turning mathematical problems into algorithms and onward to reusable code libraries. If someone devises a more sensible rating algorithm, and allows it to be used in other games, that’s bound to be a good thing.

The only remaining roadblock here is the gamer community. I think the real reason why K/D is so popular is that it’s easily understood with grade-school math and it doesn’t need explanation for most people - mind you, this may be its only good feature, and not even a redeeming one. There are several statistical methods that are much better suited for game ranking. Microsoft Research developed TrueSkill, which in itself is an evolution of earlier methodologies. The only problem with TrueSkill is that it’s mathematically a bit more complicated and it’s partially proprietary and unpublished, so people don’t really know how it works.

If you’re rated in TrueSkill games, your rating is a number between 0 and 50. Unfortunately, people have some sort of an irrational trust toward concepts they cannot fully comprehend. Right now, HaloTracker puts my Halo: Reach Team Slayer rating unofficially at 22. My thoughts are something along the lines of “hum, so I’m kind of a mid-range player, huh.” The K/D fans’ thought is “okay, 22. What the fuck does it really mean? I don’t trust random numbers unless I can see where they come from.”

But the point of mathematics, in this case, isn’t simplicity. It’s choosing the right tool for the job. TrueSkill or any other game rating system may be mathematically heavier, but it does the job better than an overly simple system.

Halo 4 also puts more effort into scoring. It has a leveling system, which simply tracks time investment and isn’t particularly informative about actual skill (though if you see a high-level player, the chances are they actually know which button fires the weapons, and several other neat tricks that will probably turn out quite useful on the battlefield.) I can only hope that the actual skill tracking system that was promised earlier will be rolled out at some point.

Statistics that need interpretation won’t cut it

What the hell is K/D, anyway?

This is a deceptively simple question, right? Kills divided by deaths. Nothing unclear about that.

Which kills? Which deaths?

What actually counts as a kill?

You could start breaking this down by looking at very minor differences in Halo: Reach and Halo 4 mechanics. Weapons work slightly differently. Some old weapons aren’t there. (I can’t remember if Brute Spiker is in H4. Good riddance if it isn’t.) New weapons have shown up. (If you give me a stickydet and railgun, I’m a very happy non-camper. =) There are slight differences in how different weapons behave. (Gravity Hammer feels less clunky. Grifball is going to be sweet.)

You’ll notice that assists work very differently in Halo 4. In Reach, assists needed a horrifying amount of health damage and regularly failed to register when they should have, but H4 is more generous and more sensible about them. Your KA/D is going to look very different.

The bottom line is that your Halo 4 statistics will look different.

Then you can start looking at the individual gametypes. Perhaps it’s best to take a diversion to my crappy statistics.

My overall matchmaking Halo: Reach K/D is 0.73. I’m not sure if that’s in any way significant. I tend to play Team Slayer a lot, and HaloTracker thinks my K/D for that particular playlist is 0.82. Some might say that that is quite passable, considering I play there pretty seriously. Then there’s Grifball. Guess what? 0.82 here too. I put no effort into actually being good at this gametype, I just go goof off and murder a lot of people with the gravity hammer. They should make jumping illegal. The only thing jumping is good for is that people die a little bit slower. I guess I have a strategy for this gametype, too: “Murder everyone as hard as possible. Then someone who’s right behind me can toss the ball in.”

As said, using some more intricate statistical methodologies (*ahem* TrueSkill) HaloTracker ranks me somewhere in the mid-range. I have absolutely no idea how representative the HaloTracker userbase is compared to the Halo populace as a whole; you might make the assumption that HaloTracker is biased toward the smaller set of Halo players who actually give a damn about the statistics. Or, it’s possible that I’m actually worse: Halo Waypoint issued me a Reach BPR of 35 (short for “Battle… Point… Rating thingy”, a value between 0-100). Of course, no one knows or seems to particularly care exactly how BPR is calculated. Perhaps Waypoint will introduce some publicised rating methods later on. A casual observer might start noticing some doubt toward multiple contradictory rating systems in the air.

My personal feeling is that as far as my skills go, I’m somewhere in the “probably fairly passable” category - Newbies die by my hand, I can consistently get passable results in most games and I have some eye for the levels and various situations arising from them. I’m not useless. I don’t claim to be able to survive too well in really high-stakes games, but I do pretty passably in Matchmaking. It’s pretty odd that I need to describe my capabilities in a subjective manner in an article about rating systems.

Of course, all these statistics look completely different to your statistics in every other shooter on the planet. K/D ratings simply aren’t comparable between games.

What I’m getting at is that your overall K/D in any particular game looks quite different from your K/D for individual game types, to say nothing of what it looks like in other games.
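To put the same thing in numbers, here is a small Python sketch of how an overall K/D gets pooled out of per-playlist tallies. The playlist names and every kill and death count below are invented purely for illustration (they are not anyone’s real statistics), and treating zero deaths as one death is just an assumed convention to avoid dividing by zero.

```python
def kd(kills, deaths):
    """Plain kills-divided-by-deaths; zero deaths is treated as one (assumed convention)."""
    return kills / max(deaths, 1)


# Hypothetical (kills, deaths) tallies per playlist. All numbers are made up.
playlists = {
    "Slayer": (900, 1000),
    "Objective": (200, 500),
}

per_playlist = {name: kd(k, d) for name, (k, d) in playlists.items()}

# The "overall" figure simply pools every kill and every death together.
total_kills = sum(k for k, _ in playlists.values())
total_deaths = sum(d for _, d in playlists.values())
overall = kd(total_kills, total_deaths)

for name, ratio in per_playlist.items():
    print(f"{name}: {ratio:.2f}")
print(f"Overall: {overall:.2f}")
```

The pooled figure lands somewhere between the two playlists, weighted by how much time went into each, so it describes neither of them. Play more of the objective playlist and the overall number drops, even though nothing about the player’s shooting has changed.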

In other words, no matter how you cut it, your overall K/D is useless and requires further interpretation. The whole purpose of presenting this statistic was that it’s supposed to be a value that is easy to understand and easy to compare. It bloody well shouldn’t require any further interpretation.

A simple example of the interpretation:

The objective of Slayer games is to either reach the specific number of points before the other team, or to have the bigger score when the time ends.

I’m not saying this at the risk of sounding obvious. I’m just asking you to figure out how K/D actually relates to the above statement.

Of course, the immediate objection is “why, K/D totally relates to this, because Kills contribute to your score and Deaths contribute to your opponent’s score.”

Yes. But everyone forgets one thing: Slayer games are a team-based race to a specific goal. You have an objective, which is to get kills, as a team, as fast as humanly possible. Also, you need to avoid committing any betrayals or suicides, as those do not contribute to your score. The betrayals and suicides alone throw the thing off a little bit, so it’s not entirely a zero-sum game. But what really throws the thing off balance is that it’s a team game - your team may have players who just fail to keep themselves alive, and there’s nothing you can do about it. That absolutely doesn’t mean you can’t win, though. People have survived pretty horrifying scenarios.

In short, K/D doesn’t measure the team’s capabilities. It only measures the end result which may or may not be related to the overall performance of the team. Does a good K/D make you a good team player? Why, we could start looking at this, but we’d need some interpretation of the data… argh!

Throwing K/D further into disarray is the time dimension. Are we measuring kills and deaths over lifetime? That’s a dubious statistic, especially if you consider the situation of having multiple games - people start playing games, get better, stop playing them, move on to other games, start playing new games with a little bit less trembly hands, and so on. K/D over specific events? Changes of K/D over a period of time? Huh, that sounds more useful - but again, I thought the point of this tool was to be simple and easy to use, and now we’re piling more interpretations on top of it.
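As a rough illustration of the lifetime problem, here is a Python sketch comparing a lifetime K/D with a K/D over only the most recent matches. The match history is entirely made up, and the function names and the window size are hypothetical choices of mine, not something any stats site is known to use.

```python
def lifetime_kd(matches):
    """K/D pooled over every match; zero deaths is treated as one (assumed convention)."""
    kills = sum(k for k, _ in matches)
    deaths = sum(d for _, d in matches)
    return kills / max(deaths, 1)


def recent_kd(matches, window):
    """K/D over only the last `window` matches (window must be a positive integer)."""
    return lifetime_kd(matches[-window:])


# Hypothetical (kills, deaths) per match, oldest first: a player who slowly improves.
matches = [(4, 12), (5, 11), (6, 10), (9, 9), (12, 8), (14, 7)]

print(f"Lifetime: {lifetime_kd(matches):.2f}")
print(f"Last 3:   {recent_kd(matches, 3):.2f}")
```

With these invented numbers the lifetime figure still sits below 1.0 while the recent figure is well above it, so the two tell opposite stories about the same player. Picking the window is, of course, yet another interpretation.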

Achieving objectives means bloodspatter on your precious K/D record

I’m a bit of a loonie when it comes to my playstyle. I absolutely love doing hilarious crap that crinkly-foreheaded people who take the game seriously refuse to do. Death just doesn’t seem like that much of a deal if Halo drops half a screen-width of medals to go with the kill. I die with my boots on. Sometimes pants on my head.

And I realise that there are players who absolutely don’t want to do this.

This is the real danger of staring too much at K/D. I have absolutely no problems if people just say “but I like crinkly-foreheaded, high-paced shit where everyone plays seriously.” That’s a fine choice, and I can appreciate that. But the problem is that there are people who would rather say “Oh, I’d totally play objective gametypes and other random shit, but I won’t, because that will wreck my K/D”.

That way lies madness.

As I demonstrated above, it’s not that difficult to actually go to HaloTracker and look up your individual playlist details. If you want your K/D, you know where to find it.

But if people say that they don’t want to play certain gametypes because that screws up their statistics, maybe that’s a pretty big indication that the statistical methodology used isn’t really that good to begin with.

Why would anyone get attached to statistics that can be manipulated, especially if by “manipulation” you mean “doing stuff you’d normally not do”? As I said, I don’t know much about statistics, but even I’m aware that people are able to see trends and rank things while not allowing obvious outliers to skew the results too much.

And you don’t need to be much of a statistician to know that you’re not supposed to mix two completely different sets of data together anyway. These are effectively completely separate types of games. This probably doesn’t come as a surprise. As I said above, relating even the basic premise of Slayer to K/D statistics can be problematic. The objective gametypes throw everything out of the window, because they no longer measure your killing ability, just your ability to do stuff while being fired upon.

I’d much rather do stuff while being fired upon than worry about bloodspatter on my K/D record. People may keep saying that firing upon people is the point of the game, but goddamn it, someone’s got to do the job around here.

Allow me to share the proudest moment of me trying to play MLG. Obviously, those guys are good at shooting other people and shooting at people while objectives are being completed, I’ll definitely give you that. Shame that in this particular game, their ability to get any and all flag-related shit done was a bit suspicious, though. Not as good as mine. I got flag-related shit done.

In short, fuck K/D, just have fun and do awesome stuff. If you absolutely have to use rankings, there are so many better ways to rank players. Don’t put too much faith in simplistic ratings like this.

Especially don’t allow statistics to keep you down. I’ve noticed that there’s nothing more depressing than staring at game statistics that aren’t going up. Try actually playing the game, they just might get better. Who knows, maybe you’ll even get some flag-related shit done. It is a thankless, grueling, shitty job, but someone’s got to do it. And it feels so good.

And a free hint: Halo 4 actually rewards you points for flag-related shit.

If the supposed upcoming rating system can somehow rate your ability to do something I’m afraid to repeat at the risk of running a joke into the ground, the future is indeed very bright.