IFComp: Postmortem – Bradley C. Buchanan

IFComp 2010 is over and the results have been posted. Congratulations to Aotearoa, Rogue of the Multiverse, and One Eye Open!

I’m always curious about the results of the competition, and how my own votes stacked up against the community opinion. So I grabbed the competition results, and my own votes, and did some decidedly unscientific number crunching, and created the chart above. Let’s discuss!

The Competition Results

My general feeling during the competition was that the quality of the entries was above average this year. I’m not sure what ‘average’ is in the comp, but 16 of 26 games scored over 5.0, and half scored over 6.0. Taking a look at the entries, they can be roughly placed in three or four groups: Three games below 4.0 (the tail), ten games between 4.0 and 6.0 (the middle-runners) and the rest above 6.0 (the head of the pack). The winner (Aotearoa) might also be said to be in its own group, a whole half-point above second place. Interestingly, the top game in the middle-runners, Leadlight, also wins the golden banana of discord with a standard deviation of 2.29, so it might slip into the upper group as well. All told, I think there were a lot of good games in this comp, or at least an unusually small number of bad ones.

Define “Good”

As I reviewed these games I ended up thinking about what makes me think a game is good. Must I enjoy a game for it to deserve a good rating? Must it attempt to do something new and different to be worthy of a top spot? And I think for me a game can be pretty well measured on three axes: Ambition, playability, and fun.

Ambition: Does the game do something new or unusual? Does it have a highly detailed or historically accurate setting? Is it attempting to evoke a specific/difficult emotion in the player? Does it do something different and useful, technically?
Playability: Do the implementation, writing/grammar, and puzzle design hinder the experience, stay out of the way, or improve the experience? Do I have to guess verbs? Can I play without consulting hints? Does the text make me weep tears of joy or tears of grammatical agony?
Fun: Do I enjoy the experience this game creates? Is it interesting or intriguing? Is it funny? Does it make sense to me (or to its target audience)? Am I engaged in the story? Do I want to know what happens next?

Games in the top half of the results tend to have at least two of these traits going for them. I would argue that Aotearoa is strong in all three. I think The Blind House was ambitious and playable, but not very fun because I couldn’t figure out what was going on. On the other hand, Flight of the Hummingbird was playable and fun, but not very ambitious due to both the thin plot and unoriginal use of the superhero trope. Gris et Jaune was fun and ambitious, but its playability suffered because there weren’t enough clues about what to do next.

I think next year I will try scoring games on these three axes, and see if I get more interesting results. Also, what would it look like to judge other games on these axes? I’ve always thought it was a little silly that “Presentation,” “Graphics,” and “Sound” are three of the five categories IGN uses to rate video games. Don’t they all pretty much affect us the same way, compared to the game’s overall design and fun factor?

My Own Ratings

…But that’s not how I rated games this year. I used my letter-grade system from last year, and while I was able to distribute the games pretty well into a rank order, I felt like my system let me down a little bit and it was hard to place some of the games this year. Then if you look at my scores on the chart, it’s clear that my opinions were off pretty drastically from the final scores.

Let me take a moment to explain that chart. I used the full range of scores (1-10) to rate games this year. I like having more granular control to say “I like this one more than that one” and since the final scores are averages, that works just fine with other people who think all games are fives or sixes. But since we all review on different scales, it’s a little tough to compare my scores to the competition averages; fully half of my scores were outside the standard deviation.

Therefore, I (roughly and unscientifically) normalized my scores to a range of 2-8 for this chart, to give them about the same spread as the competition results. (That is, what you’re seeing here is ({my score} / 10) * 6 + 2.) It’s imperfect, but it’s closer, and only four of my adjusted scores are outside the standard deviation. (This is cheating, which tells you how far from the average I really was.)
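
For anyone who wants to reproduce the adjustment, here is a minimal sketch of that formula in Python. The function and variable names are my own placeholders, not anything from the chart's actual spreadsheet, and the three raw scores are the ones quoted in this post. Note that, taken literally, the formula sends a raw 1 to 2.6 rather than 2, so the low end of the "2-8" range is approximate.

```python
def normalize(raw_score: float) -> float:
    """Rescale a raw 1-10 score with the post's formula: (score / 10) * 6 + 2."""
    return (raw_score / 10) * 6 + 2


# Raw scores mentioned in this post; the dictionary itself is illustrative.
raw_scores = {
    "Rogue of the Multiverse": 6,
    "Sons of the Cherry": 8,
    "R": 5,
}

# Round to two places only to hide floating-point noise in the display.
adjusted = {title: round(normalize(score), 2) for title, score in raw_scores.items()}

for title, score in adjusted.items():
    print(f"{title}: {score}")
```

So a raw 6 for "Rogue of the Multiverse" lands at 5.6 on the chart, and a raw 10 would land at exactly 8.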

Most notable among the games that I had a very different opinion of is our silver-medalist, “Rogue of the Multiverse.” I gave this game a 6, which on my scale this year translates as “a good game needing more polish or more content.” While the game was fun, I was hoping for a less linear experience; it’s entirely possible that I would have given this game a higher score had its title been “Test Subject of the Multiverse” or anything less misleading. My bad. Other reviewers seem especially taken by the NPC Dr. Sliss, whom I found mildly entertaining but never enthralling. Oh well. I also had an unusually bad experience with “Divis Mortis,” which seems to be something of a fluke, a combination of my lack of persistence and a poorly worded description early in the game.

More disturbing to me on a personal level is that I found a couple of poorly rated games much more enjoyable than others did. I gave “Sons of the Cherry” a whopping 8 for a fantastic premise, pretty good writing and the illusion of agency on my first playthrough. I felt like the game was exactly what it set out to be, and hitting one’s mark is no small task. Others lambasted the game for its anachronisms and lack of actual agency, and I’ll confess that it probably would have dropped to 5 or 6 had I realized how little effect my choices were having. I also gave “R” a 5 because, in spite of its somewhat archaic design, it made good use of its sparse text and classic style. Others were less forgiving and the game dropped into the tail section of the rankings.

Why Compare?

Does it matter that my ratings did not look like the competition results? As a reviewer, not really. The diversity of experiences and opinions is good for the competition, and it’s nice to think that my own experience is as valid as any other.

As a game designer, though, I do want my impressions to line up with the majority vote. Knowing what I enjoy is great, but knowing what others enjoy is key to making a well-received game. An idea that I’ve come across both in The Art of Game Design and a recent Escapist video is that the number one skill for a game designer is listening. Listening to your audience, listening to your team, listening to your own reactions and to the world around you is key to being a good designer.

When I saw that my own ratings were way off the comp average, I knew that I needed to do some reflection. In what ways do I disagree with the body of voting participants? Let’s list them (adjusted deviation beyond 0.5):

Games I overrated: Death Off the Cuff, Flight of the Hummingbird, The Warbler’s Nest, Oxygen, Pen and Paint, The Lost Sheep, Sons of the Cherry, R
Games I underrated: Rogue of the Multiverse, One Eye Open, The Blind House, Divis Mortis, East Grove Hills
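
The sorting into those two lists can be sketched roughly as follows. This assumes "adjusted deviation beyond 0.5" means the adjusted score differs from the competition average by more than half a point in either direction; that reading, the function name, and the sample numbers are all assumptions for illustration. The sample games are made-up placeholders, not real competition data.

```python
THRESHOLD = 0.5  # assumed cutoff, per "adjusted deviation beyond 0.5"


def classify(adjusted_score: float, comp_average: float,
             threshold: float = THRESHOLD) -> str:
    """Label a game by how far an adjusted score sits from the comp average."""
    difference = adjusted_score - comp_average
    if difference > threshold:
        return "overrated"
    if difference < -threshold:
        return "underrated"
    return "in line"


# Placeholder (adjusted score, comp average) pairs; NOT actual IFComp results.
sample = {
    "Hypothetical Game A": (6.8, 4.0),
    "Hypothetical Game B": (5.6, 7.0),
    "Hypothetical Game C": (5.0, 5.2),
}

for title, (mine, average) in sample.items():
    print(f"{title}: {classify(mine, average)}")
```

A symmetric cutoff keeps the comparison simple; a per-game cutoff based on each game's standard deviation would be another reasonable choice.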

Quickly glancing over these lists, you can tell that I prefer short, light games to longer, heavier works. Going back to my three axes, I would say that I undervalue ambition, and probably overvalue playability a bit. The games that I overrated are frequently short and stick to tried-and-true IF techniques. They have very simple, pointed stories. The games I underrated were all unusual in some way: Rogue had a unique NPC and action scenes, One Eye Open has a very detailed backstory and a well-designed flashback mechanic, and Divis Mortis had an ambitious story and setting with a unique player perspective.

Lesson: When I work on a game, I need to not shy away from letting the player do some thinking. The voters don’t want to be spoon-fed, they want to figure things out on their own. Longer games would be good too; voters want to make more than one decision during my game.

The voters are also very tired of the same old IF tropes, and want to experience something new and surprising (I think this is especially true of the blogger-reviewers). Having only been interested in IF for a few years, I’m not as familiar with the road well-traveled and I’m more forgiving of an oft-used setting, and several of the games I overrated have oft-used settings: Murder mystery, superhero, sci-fi, Myst-alike, pirate. The only genre that I underrated (twice!) was “horror in a hospital,” which is apparently better-received than I expected.

Lesson: In every case, do your market research! I didn’t realize that some of these genres are overused; as a designer, it is my responsibility to find out what’s been released before, and make sure my game is something new (or at least better). And if possible, make my game difficult to place in a genre. Or set it in a horrifying hospital.

Being a Beta Tester

This is the first year I’ve done any beta-testing for comp games, and I was pleased to work with Hannes Schueller on “Ninja’s Fate” and Timothy Peers on “Heated” (though the latter forgot to credit his testers). I’m less pleased by how these games ranked, and a little bit paranoid that it reflects on me as a beta-tester. I promise I went and read the articles on how to be a good tester. I provided complete (and nicely formatted) annotated transcripts. I sang and burned pants and generally tried to break things. It was, on the whole, a good experience seeing an author’s work in progress and talking to them about how it should be tuned. I thought I was very helpful.

And for all that, they placed 20th and 21st. Looking back, I think that regardless of polishing and tuning, the fundamental concept of each was its stumbling block.

“Ninja’s Fate” was the most interesting experience. I had never heard of Paul Allen Panks before working with Schueller, which was probably for the best from a testing perspective. Apparently Panks produced many games, none of which were very good. “Ninja’s Fate” celebrates Panks’ passion for the medium both by setting the game in a museum dedicated to him, and by emulating his (sometimes questionable) design. I did question several design decisions, and in each case Schueller could point out a similar choice by Panks. It really is a fitting tribute, but I’m unsurprised that it didn’t rank well in the competition. Does this sort of game belong in IFComp? I really don’t know.

Edit: Hannes Schueller has posted a Ninja’s Fate website explaining the aforementioned design decisions. I’d also like to thank him for his kind comments about his testers.

“Heated” I don’t want to talk about too much, as I’m not sure what Peers would like to share himself. Let’s just say that the version I tested was a much longer, more difficult game. There was a lot of work to be done to really polish it, and it seems Peers carefully shortened the game rather than release a longer, less polished work into the comp. I applaud the decision (see the overwhelmingly negative reaction to unfinished work year after year), but it doesn’t seem to have done him any favors in the rankings; the cut has caused reviewers to point out an abrupt ending, and to place the game in the unfortunate “my apartment” category that it may have escaped otherwise. I advised such a cut at one point in my notes, and feel a little guilty about it; Timothy, I’m sorry if I advised you poorly. We’ll never know how the longer version would have ranked, but I hope you release a director’s cut so that the community can see the huge amount of work that didn’t get released in the comp.

So as a beta tester, when someone comes to me with a “tribute” game, or a game with a “my apartment” component or feel, do I tell them that their concept is not comp-worthy and they need to start from scratch, or do I help them make the best of the game they come to me with? I think I still lean toward the latter; there’s no reason people shouldn’t take a concept they like and try to do it really well. I’m curious what everyone else thinks, though.

Until Next Time…

That’s how my ratings stacked up in this competition. You can still see my final rankings back at the original comp post. The next competition on my radar is Mozilla Labs’ Game On 2010, which I won’t be as involved in but I like to keep my ears open. Good job, IFComp authors! Thanks for your games!
