The one thing that’s wrong with gamification

Games are made of rules. Rules structure our choices within the game, and it’s not a game if there’s no trade-off. So games require loaded choices: you trade one thing for another, and you can’t have it all. The things we call “gamified” consistently fail to do this.

What trade-off did you choose in order to get your Foursquare badge? What rules of the Foursquare game gave meaning to that choice, other than physical limits on how many times you can press “Check In” on your smartphone?

If something is really a game, the badges you receive are beside the point. And if something isn’t a game, you can “pointsify” it easily enough, but you’ve skipped all the delicate balancing of rules that makes choices meaningful, so you still don’t have a game.

As much as I want to strike the word ‘gamification’ from the universe, we’re probably close to the moment when people figure out that building games is more about building a matrix of choices than about building a cabinet for badges, high scores, and so on.

Not simplicity vs. obviousness, but past vs. future

I’m no designer, but now that I’ve been thinking through a few concrete design problems in a project that I hope can change how we listen to music, I’ve come to realize a few things.

When John Gruber and Federico Viticci talk about simplicity vs. obviousness and discovery vs. frustration, I think they’re describing a sticky, problematic period of flux in UI and UX design (one that will continue to repeat itself, forever). Between any user’s current capacities and a perfect understanding of how their device and its apps work lies a learning curve: they have to become acquainted with a new way of interacting with a device. Consider the delicate art of learning a musical instrument. It is certainly awkward and definitely frustrating; it takes forever and puts many new demands on you and your muscles. But you only have to look at how painful and monotonous violin practice is to understand how good a job today’s designers are doing at cluing you in to their inventions. In Viticci’s terms, they’re really good at helping you discover what their app does – but the frustration comes later, and inevitably.

So from where does this frustration stem? There’s a pretty intractable problem with all digital interfaces: once you establish them, every single user who currently uses your app/OS/hardware device becomes a reason not to change it, because change might confuse them. Jaron Lanier calls it lock-in, I think. And it’s a big problem – but not for the first version. It takes time, and you’ll encounter it soon enough. For example, when Word changes something, it messes with people’s minds. Engineers think: “Better leave it as it is (but we’ll just add this little thing here).” Then the problems keep compounding. Soon, you’ll be desperately flipping through tabs and menus and all sorts of crud that has been added ostensibly for your benefit.

I guess the thing that becomes apparent, while digging and scrolling and mouse-hunting for some long-lost feature in Excel, is that designing a familiar, obvious, non-frustrating user interface is only non-frustrating and obvious at first. But as features creep in, as versions iterate, it becomes more and more apparent that the obvious thing is not a permanent thing, because the obvious thing was only obvious thanks to some other thing that had been around previously. Microsoft Word started as a digital version of the printed page. Its functions are, to this day, organized primarily around printing manuscripts and papers and whatever else you might want to get your printer to spit out. But a lot of what we do with text on computers has nothing to do with printing. I haven’t printed anything since graduating university. When I need to make a note, I use Evernote. When I need to store a file, I put it in my Dropbox. But when I need to get someone at work to edit an article, I use Word. And then I e-mail it across the internet to a blogger, who then painstakingly reformats it for her blog. If that sounds odd – as though this Word document is just a container for things that could be handled far better – then you might be right. Technically speaking, Word’s .docx files are closer to .zip files than anything else.
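That last claim is easy to check yourself. Here’s a minimal Python sketch using only the standard zipfile module; rather than assume you have a real document lying around, it builds a tiny .docx-style archive in memory and lists its parts (the entry names mirror the real .docx layout, but the XML contents here are placeholders):

```python
import io
import zipfile

# A .docx file is really a ZIP archive of XML parts. Build a minimal
# docx-like archive in memory, with placeholder XML, to show this.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as docx:
    docx.writestr("[Content_Types].xml", "<Types/>")
    docx.writestr("word/document.xml", "<w:document/>")

# Any real .docx opens the same way: zipfile.ZipFile("report.docx")
with zipfile.ZipFile(buffer) as docx:
    print(docx.namelist())  # ['[Content_Types].xml', 'word/document.xml']
```

Rename a real .docx to .zip and your archive tool will open it just as happily.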

When designers take old, familiar designs and build a new experience on top of them (whether superficially and visually, as in Apple’s Find My Friends iOS app, or more fundamentally, as with Word), it’s called skeuomorphism. But no matter the implementation, skeuomorphism eventually cracks under the stress of the future, which is really digital and not at all like the concrete artifacts that preceded it. All of this stuff is just… well, it’s bunk. It’s old. It couldn’t guess what we were going to do in the future. Even the stuff we made digitally over these last couple of decades is now obsolete. Next time you find a song in your iTunes library by searching for it, think about why there’s a “library” in the first place, or why it’s even organized by album. Think about what people will think when touch is suddenly the “old way of doing things.”

If I could guess what we would be doing in the future, I would guess that we’d be doing the same tasks we are now: we’ll still be writing, and editing, and sharing information. We’ll still be listening to music, discovering, sharing. We’ll still be taking photos, sprucing them up, and sending them to our friends and family.

So what will all that look like? It will always look different, it will always be changing, we will always be frustrated, and there will always be a hint of the past in there somewhere (whether you call it skeuomorphism or ‘familiarity’ or whatever). But we ought to know that our frustration isn’t just with the app that is trying to make our experience better. We ought to know that we will be frustrated because whatever is developed to respond to the needs of the end user will always lag behind what is possible. The frustration we feel is our old practices stumbling on the new, unusual terrain opened up by technological progress, and that is a feature of the interactive landscape that will only keep growing. We must always be learning in the future. I mean, we should always be learning anyway, but while there’s not much you can hide behind now, there will be even less familiarity and stability in the future. Worse still, for every new frontier opened up by technology (and let’s keep that term as broad as we can), a new expectation is levied – and sometimes the demands are far greater than the demand to learn the newest app, operating system, or device. Sometimes, health care or financial decisions ride on our ability to work within the interfaces and technologies offered to us. Soon, a lot more may be expected.

It is up to designers to make these demands as moral, as clear, and as attainable as possible. Even if new experiences might be frustrating (and I think that’s just what learning feels like, most of the time), they need not be damningly inaccessible to all but the youngest and the brightest. If you’re anything like me, you want people to use what you’re making not simply for the sake of growing your ego, but because you really and truly want to make something better. Let’s make sure we remember that ‘making something better’ doesn’t just mean making something more useful, with more features, and so on. It means making things that are responsible for, and respectful of, what the humans that use it will feel when using it.

Why I picked Codecademy over Udacity to learn how to code

I’ve done the first few lessons on both Codecademy and Udacity, and I’ve finally settled on one: Codecademy. Why?

  • Codecademy teaches JavaScript, and Udacity teaches Python. Python’s great and all, but I’m interested in building stuff for the web, and JavaScript’s pretty good at that from what I gather. It’s not like I can’t learn Python later, but why learn something I won’t use?
  • I can go at my own pace – and Udacity was just too damn slow. I found myself skipping through videos to get to the relevant parts, and it’s not because CS profs aren’t good at presenting. The problem is that video is bad for skimming, and skimming is what gets a person to the relevant content. With Codecademy, it’s a lot easier to just jot down notes and then go back over them later to bolster your knowledge.
  • Udacity rewards learning for the sake of learning – but I want to learn for the sake of doing. The sense of accomplishment you get from completing a Udacity assignment is tied directly to the fact that you’re getting marked on it. There are no marks in Codecademy, so its importance depends on what I do with it, which is a fairly well-defined objective for me at this point.

Have you learned how to program recently? Any tricks for getting right to the core of what you want to do? Any resources you’d like to share? I’d love to hear your thoughts in the comments below.

Edit: as Mike points out in the comments, Codecademy is not Code Academy – so if you’re in Chicago and want some real live and awesome instruction, head on over to Code Academy to see what’s up.

The Klout Experiment: one week later

On Sunday, January 22nd, I presented on Klout (see this post for the Slideshare), and got my audience to participate in an experiment with me. We were trying to figure out the difference that a retweet, a mention, and a follow make on a user’s Klout score, by doing those actions on three different users. We RT’d this, mentioned this, and followed this account (and if you’re in Halifax, you should be following them already!). I followed the changes in their numbers over the following week, and man. It was not what I expected.

My hypotheses:
  1. RTs would lead to the biggest increase in Klout, and would primarily affect the True Reach submetric. My rationale was that since True Reach gives an absolute number of people exposed to the content you create, a large number of RTs should produce a large increase.
  2. MTs would contribute much smaller increases in Klout, and would primarily affect the Amplification submetric, since Amplification measures the likelihood of people responding (in any way) to your content.
  3. Follows would contribute the least to Klout score increases, and would primarily affect the Network Influence submetric. I based this claim on something I did at work: I unfollowed a large number of inactive accounts and accounts with low Klout scores (<20), and the next day there was a huge spike in my Network Influence (which then returned to normal the following day, but oh well).

Experiment design:
I found three accounts whose entire Klout score derived from Twitter and who shared roughly the same Klout score (47-49). I also checked that their True Reach (1300-1900), Amplification (5-15), and Network Influence (27-31) metrics were all similar enough. Finally, I asked the audience to follow all the accounts we tested, so that I could gather more data to compare the effects of each individual action and get more generalizable results (because if Network Influence changed differently for one account than for another despite the same number of new followers, then the mechanics of Klout would require more investigation).

Experiment 1. True Reach/Retweets: I verified 13 retweets of this tweet, and 11 follows of the tweeter, whattheklout. The initial score of 47.17 Klout (K) decreased every day afterwards, and then plummeted on January 26th. Why? Because subject #1’s True Reach had been revised. As in: even the past data had been re-written. When I first measured True Reach on January 21st, it measured 1363. On January 23rd, it measured 1340. But when I looked back at the data a week later, those same days were displayed as 746 and 733, respectively. Keep reading for what this means for Klout and the analysis of Klout scores.

This is whattheklout's True Reach, which no longer displays the same values for True Reach that I measured on January 21 and 23.

Experiment 2. Amplification/Mentions: I verified 7 RTs, 2 mentions, and 0 follows for this tweet and this tweeter, beer47. Klout score has been fairly consistent with the pre-test score (from 48.44 K, it hasn’t varied by more than 0.5 K). Amplification stayed constant, and True Reach grew by 65 the day after the experiment. No real indication of a strong correlation between RTs, mentions, and Klout’s metrics, though.

Experiment 3. Network Influence/Followers: I verified 3 follows of @hfxtraffic. Klout score has been increasing gradually (from 49.17 K on January 21 to 49.89 K on January 27), as has Network Influence, which means Experiment 3 was the only one to show a correlation between Klout score, Network Influence, and a growing follower count. However, causality hasn’t been established: the likely cause of the higher scores is that Halifax had bad weather around this time, which tends to drive more people to follow the account.

Conclusions:
1. The experiments failed, probably because I picked people with abnormally high Klout scores. Since only the top 5% of Klout users have scores above 50, it takes a huge number of RTs and mentions in a given day for Klout to register anything different. For instance, even the account we gave the most RTs to – whattheklout – normally gets 3 RTs and 6 mentions a day, so 12 RTs in a single day may not have been enough for Klout’s algorithm to call statistically significant. Beer47 gets an average of 26 RTs a day, which means our 7 RTs would hardly have changed the score at all. It’s also likely that Klout discounts moderate-to-strong activity that doesn’t sustain itself: since this experiment produced one big but short-lived jump in measurable actions, any otherwise significant changes would probably have been discounted by Klout’s algorithm, which seems to emphasize consistency. It would be interesting to redo this experiment with Klout scores of 30.

2. There’s something really weird going on with True Reach. Every single account we experimented on had its True Reach score change completely after I measured it. Although the whattheklout account had the biggest changes in True Reach (it was retroactively chopped in half), every account I measured had its previously reported True Reach change. So check your True Reach today: what is it? A week from now, if you look at the same day on the graph, I’ll bet you it will be different.

I have some guesses about why this happens: if Klout is spidering through the social networks you’re on, it might take a couple of days before it sees exactly who has been exposed to which of your content. That means it would have to retroactively change scores to reflect your real True Reach. Why the delay for True Reach only? It’s probably the hardest thing to measure: while Amplification just requires that Klout look at the actions generated from your tweets, and Network Influence requires that Klout look at the Klout scores of the people you follow and who follow you, True Reach seems to need a pretty much exhaustive mapping of all activity on a given social network for Klout to tell how many people have been exposed to your content. There’s also the possibility that Klout is deliberately masking or changing True Reach numbers to make reverse engineering impossible, which brings us to the next issue…

3. It’s hard to analyze data when Klout changes it retroactively. If True Reach can change without us doing anything, then it’s really hard to assess causality. The causal line from the action an experimenter does to the result in the Klout score is hugely obfuscated, and so you’re no longer sure if it’s Klout’s algorithm making some crazy retroactive changes, or if it’s your actions that have pushed the historical data to change (despite it already having been reported). This whole experiment needs to be redesigned once the mechanics of True Reach are uncovered. The solution is not obvious, but I’m sure as I analyze more Klout accounts, I’ll have a better grasp on what drives specific improvements in the submetrics, and what the relationship is between the submetrics and the main Klout score. Until then, well, let’s keep doing Interwebs Science. Huzzah.
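If I redo this, step one is to automate catching those retroactive revisions: record a dated snapshot of the full True Reach history each day, then diff later snapshots against earlier ones. Here’s a minimal Python sketch of that idea; the function name is my own, and the snapshot dictionaries just encode the numbers reported above (1363 and 1340 later re-displayed as 746 and 733):

```python
def find_revisions(old_snapshot, new_snapshot):
    """Return {date: (old_value, new_value)} for every historical
    date whose reported metric changed between two snapshots."""
    return {
        date: (old_snapshot[date], new_snapshot[date])
        for date in old_snapshot
        if date in new_snapshot and new_snapshot[date] != old_snapshot[date]
    }

# True Reach as reported on Jan 23 vs. the same dates re-read a week later
jan23_view = {"2012-01-21": 1363, "2012-01-23": 1340}
jan28_view = {"2012-01-21": 746, "2012-01-23": 733, "2012-01-27": 710}

print(find_revisions(jan23_view, jan28_view))
# {'2012-01-21': (1363, 746), '2012-01-23': (1340, 733)}
```

With a log like this, at least you can separate what your actions changed from what Klout quietly re-wrote after the fact.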

Ferris Bueller’s day off with Honda

Watch the sneak peek first:

Doesn’t that bring back all sorts of nostalgic happy feelings for you? Doesn’t a Ferris Bueller Super Bowl commercial seem like a great idea right now? Here’s what came of it:

No joke is made better by repeating it (re: The Hangover 2). The longer commercial misses every single mark, and for a CR-V? Seriously? It’s kind of hard to imagine Ferris Bueller’s Day Off being all that awesome if he borrowed his mom’s SUV instead of a Ferrari 250 GT California. All that nostalgia fails, for me anyway, to coalesce into something worth watching for any reason other than to see how much they bastardized the original. Weakest point: either the valet saying “Bueller? Bueller?” or Broderick hiding behind the panda.

I’m not sure what category of marketing this fits under, but if they aspired to make branded content, they failed the basic test: make something original and compelling that resonates with people.