The Klout Experiment: one week later

On Sunday, January 22nd, I presented on Klout (see this post for the Slideshare), and got my audience to participate in an experiment with me. We were trying to figure out the difference that a retweet, a mention, and a follow make on a user’s Klout score, by doing those actions on three different users. We RT’d this, mentioned this, and followed this account (and if you’re in Halifax, you should be following them already!). I followed the changes in their numbers over the following week, and man. It was not what I expected.

Hypotheses:

  1. RTs would lead to the biggest increase in Klout, and would primarily affect the True Reach submetric. My rationale is that since True Reach gives an absolute number of people that are exposed to the content you create, there would be a large increase if a large number of RTs were made.
  2. Mentions would contribute much smaller increases in Klout, and would primarily affect the Amplification submetric, since Amplification measures the likelihood of people responding (in any way) to your content.
  3. Follows would contribute the least to Klout score increases, and would primarily affect the Network Influence submetric. I based this claim on something I noticed at work: when I unfollowed a large number of inactive accounts and accounts with low Klout scores (<20), the next day there was a huge spike in my Network Influence (which returned to normal the following day, but oh well).

Experiment design:
I found three accounts whose Klout scores derived entirely from Twitter and fell in the same range (47-49). I also checked that their True Reach (1300-1900), Amplification (5-15), and Network Influence (27-31) submetrics were all similar enough. Finally, I asked the audience to follow all the accounts we tested, so that I could have more data to compare the effects of each individual action and get more generalizable results (because if it turned out that Network Influence changed differently for one account than for another despite the same number of new followers, then some more investigation would be required into the mechanics of Klout).
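The account-matching criteria above can be sketched as a simple filter. The data and field names here are my own illustration; Klout does not expose accounts in this form:

```python
# Filter candidate accounts down to ones matching the experiment's criteria.
# All account data and field names are hypothetical, for illustration only.
def matches_criteria(account):
    return (47 <= account["klout"] <= 49
            and 1300 <= account["true_reach"] <= 1900
            and 5 <= account["amplification"] <= 15
            and 27 <= account["network_influence"] <= 31)

candidates = [
    {"name": "whattheklout", "klout": 47.17, "true_reach": 1363,
     "amplification": 9, "network_influence": 29},
    {"name": "too_popular", "klout": 62.0, "true_reach": 9000,
     "amplification": 40, "network_influence": 55},
]

# Only accounts inside every submetric band qualify as test subjects.
subjects = [a for a in candidates if matches_criteria(a)]
```

The point of matching on every submetric, not just the headline score, is that two accounts with the same Klout score can get there in very different ways, which would confound the comparison.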

Results:
Experiment 1. True Reach/Retweets: I verified 13 retweets of this tweet, and 11 follows of the tweeter, whattheklout. The initial score of 47.17 Klout (K) decreased every day afterwards, and then plummeted on January 26th. Why? Because subject #1’s True Reach had been revised. As in: even the past data had been rewritten. When I first measured True Reach on January 21st, it was 1363. On January 23rd, it was 1340. But when I looked back at the data a week later, those same days were displayed as 746 and 733, respectively. Keep reading for what this means for Klout and the analysis of Klout scores.

This is whattheklout's True Reach, which no longer displays the same values for True Reach that I measured on January 21 and 23.

Experiment 2. Amplification/Mentions: I verified 7 RTs, 2 mentions, and 0 follows for this tweet and this tweeter, beer47. Klout score has been fairly consistent with the pre-test score (from 48.44 K, it hasn’t varied more than 0.5 K). Amplification stayed constant, and True Reach grew by 65 the day after the experiment. No real indication of a strong correlation between RTs, mentions, and Klout’s metrics, though.

Experiment 3. Network Influence/Followers: I verified 3 follows of @hfxtraffic. Klout score has been increasing gradually (from 49.17 K on January 21 to 49.89 K on January 27), as has Network Influence, which makes Experiment 3 the only one to show a correlation between Klout, Network Influence, and increased follower counts. However, causality hasn’t been established: the likely cause of the higher scores is that Halifax had bad weather around this time, which seems to lead to more people following the account.

Analysis:

1. The experiments failed, probably because I picked out people with abnormally high Klout scores. Since only the top 5% of Klout users have scores above 50, these accounts require a huge number of RTs and mentions in a given day for Klout to notice something different. For instance, even the account that we gave the most RTs to – whattheklout – normally gets 3 RTs and 6 mentions a day, which means that 12 RTs in a single day may not have been enough for Klout’s algorithm to call statistically significant. Beer47 gets an average of 26 RTs a day, which means that our 7 RTs would hardly have changed the score at all. It’s also likely that Klout discounts moderate-to-strong activity that doesn’t sustain itself: since this experiment was all about one big but short-lived jump in measurable actions, even a change large enough to register would probably have been smoothed out by Klout’s algorithm, which seems to emphasize consistency. It would be interesting to redo this experiment with Klout scores of 30.
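As a rough sanity check on the "statistically significant" intuition: under a naive Poisson model of daily RT counts (my assumption; Klout’s actual test, if any, is unknown), 12 RTs in one day against a 3-RTs/day baseline is actually very unlikely by chance, whereas beer47’s 7 extra RTs on top of 26/day are unremarkable. So if Klout ignored the whattheklout spike, it is more likely smoothing over short bursts than failing to detect them:

```python
import math

def poisson_sf(k, lam):
    """P(X >= k) for X ~ Poisson(lam): chance of seeing k or more events."""
    # Survival function = 1 minus the summed pmf of all counts below k.
    return 1.0 - sum(math.exp(-lam) * lam**i / math.factorial(i)
                     for i in range(k))

# whattheklout averages ~3 RTs/day; the experiment produced 12 in one day.
p_spike = poisson_sf(12, 3.0)       # tiny: very unlikely as random noise

# beer47 averages ~26 RTs/day; 7 extra RTs (33 total) is ordinary variation.
p_beer = poisson_sf(33, 26.0)       # roughly 0.1: unremarkable
```

This lines up with the analysis above: the whattheklout burst was big enough to be noticeable, so an algorithm emphasizing consistency would have to be deliberately discounting one-day jumps.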

2. There’s something really weird going on with True Reach. Every single account we experimented on had its True Reach score change completely after I measured it. Although the whattheklout account had the biggest changes in True Reach (it was retroactively chopped in half), every account I measured had its previously reported True Reach change. So check your True Reach today: what is it? A week from now, if you look at the same day on the graph, I’ll bet you that it will be different.
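These retroactive revisions are easy to catch if you snapshot the numbers yourself and diff the snapshots later. A minimal sketch: the values are the ones I recorded for whattheklout, but the snapshot-and-diff mechanism is mine, not anything Klout provides:

```python
# Compare two snapshots of a date -> True Reach history and report any
# dates whose previously reported value was rewritten.
def find_revisions(old_snapshot, new_snapshot):
    """Return {date: (old_value, new_value)} for every rewritten date."""
    return {day: (old_snapshot[day], new_snapshot[day])
            for day in old_snapshot
            if day in new_snapshot and old_snapshot[day] != new_snapshot[day]}

# True Reach values recorded for whattheklout, before and after the revision.
early_snapshot = {"2012-01-21": 1363, "2012-01-23": 1340}
later_snapshot = {"2012-01-21": 746, "2012-01-23": 733}

revisions = find_revisions(early_snapshot, later_snapshot)
```

Run daily, something like this would show exactly when the history gets rewritten, instead of discovering it by accident a week later.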

I have some guesses about why this happens: if Klout is spidering through the social networks you’re on, it might take a couple of days before it sees exactly who has been exposed to what content of yours. This means it would have to retroactively change scores to reflect your real True Reach. Why the delay for only True Reach? It’s probably the most challenging thing to measure: while Amplification just requires that Klout look at the actions generated from your tweets, and Network Influence requires that Klout look at the Klout scores of the people you follow and who follow you, True Reach seems to need a pretty much exhaustive mapping of all activity on a given social network before Klout can tell how many people have been exposed to your content. There’s also the possibility that Klout is deliberately masking or changing True Reach numbers to make reverse engineering impossible, which brings us to the next issue…

3. It’s hard to analyze data when Klout changes it retroactively. If True Reach can change without us doing anything, then it’s really hard to assess causality. The causal line from the action an experimenter does to the result in the Klout score is hugely obfuscated, and so you’re no longer sure if it’s Klout’s algorithm making some crazy retroactive changes, or if it’s your actions that have pushed the historical data to change (despite it already having been reported). This whole experiment needs to be redesigned once the mechanics of True Reach are uncovered. The solution is not obvious, but I’m sure as I analyze more Klout accounts, I’ll have a better grasp on what drives specific improvements in the submetrics, and what the relationship is between the submetrics and the main Klout score. Until then, well, let’s keep doing Interwebs Science. Huzzah.
