Archive for November, 2008
As I’ve mentioned elsewhere, my goal with Twitterank was to experiment with a slightly different way of quantifying Twitter users: specifically, using @ replies. Quite a few people have asked why we can’t just use follow counts or the number of posts. My answer has been something along the lines of “there are other signals”, which is another way of saying that if we use less obvious signals, we might get less obvious (and potentially better) results. Let me illustrate with a few examples…
Earlier today, I got a tweet from @erikvb who asked: “how is it that your secwet alrowithm calculates 224.941 for @google if they don’t have any tweets at all?” Indeed, @google is not following anyone, only has a few hundred followers, and has no tweets. Yet, it’s ranked in the top 20. Weird, huh?
Of course, if you look at all the @ replies that @google receives, you can see that the Twitterank algorithm is behaving as expected. Now, this is weird and surprising, but does that mean Twitterank is “wrong”? I’d say “no”. In fact, it’s doing exactly what it’s supposed to do: it’s telling us something we might not have otherwise known. It’s telling us (or rather, twitterers everywhere are telling us) how influential @google could be if it were a real twitterer with real tweets. It’s like going to a crowded party and pinpointing the influential individuals without looking at them directly or hearing them say anything. It’s quite powerful stuff.
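To make the "@ replies as a signal" idea concrete, here is a minimal Python sketch that just counts how often each user is mentioned by other people. This is an illustration only, not the actual Twitterank algorithm (which isn't spelled out here); the function name `count_mentions`, the regular expression, and the sample tweets are all made up for the example.

```python
import re
from collections import Counter

# Hypothetical pattern: an "@name" that is not part of an email address.
MENTION = re.compile(r"(?<!\w)@(\w+)")

def count_mentions(tweets):
    """Count incoming @ mentions per user, ignoring self-mentions.

    `tweets` is an iterable of (author, text) pairs.
    """
    counts = Counter()
    for author, text in tweets:
        for name in MENTION.findall(text):
            name = name.lower()
            if name != author.lower():
                counts[name] += 1
    return counts

# Made-up sample data, not real tweets.
sample = [
    ("alice", "@google when are you going to start tweeting?"),
    ("bob", "Agree with @alice, @google should say something"),
    ("carol", "email me at carol@example.com"),
]

counts = count_mentions(sample)
print(counts.most_common())
```

Note that an account shows up in the counts even if it never appears as an author, which is the @google situation in miniature: it can collect mentions without ever tweeting.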
Here’s another example I stumbled across. Some of you may know (or know of/about) @caterina, who is probably most famous as the co-founder of Flickr. She has close to 3000 followers on Twitter, and you might think someone like her would be influential in the Twitterverse. As it turns out, her score is currently 59.35 (72.79 percentile, 1.2% confidence), which is high, but not that high for someone with so many followers. It actually makes sense if you look at her tweets. She’s obviously not a heavy Twitter user, and her tweets are somewhat cryptic, at least to a casual passerby. But how would she stack up against @joetheplumber? He only has 530 followers and 44 updates. Take a guess, then go find out. Were you surprised?
All this is mildly amusing, but Twitterank itself actually isn’t very interesting. Someone’s twitterank is metadata, not data. It’s a spice, not a main course. So, Twitterank still needs to find a main course to spice up, but that’ll come later.
The original version of Twitterank required a user name and password in order to make an authenticated web service request to Twitter and retrieve data that isn’t otherwise accessible. Understandably, many people were reluctant to enter their passwords on a new third-party site, and so far, only a small fraction of twitterers have been twitteranked.
The good news is, I have enough twitteranks at this point that I can start calculating twitteranks for most Twitter users, using only publicly accessible data and previously calculated twitteranks. In other words, you can now get any Twitter user’s twitterank without any passwords. Take, for example, @ev, one of the founders of Twitter. He’s so far been cautious enough to not get a twitterank (or he probably doesn’t care), but wouldn’t you want to know what his score would be?
Now you can find out. Check out these screen shots:
Notice that under the score, it says “(30.04% confidence)”. Since the version of the Twitterank algorithm which works without a password uses limited public data, there’s a certain amount of guessing involved. Generally speaking, the higher the confidence, the closer the score should be to what Twitterank would calculate if it had access to the user’s password and could retrieve more data.
In any case, head over to twitterank and start looking up the twitteranks of all your friends who’ve resisted the urge so far. Make sure you let them know how they score too :-)
I’m about to go to bed, because I really really need sleep. Today was pretty quiet, fortunately, and by tomorrow, most of Twitterdom will have forgotten about Twitterank. Thank goodness for short attention spans. So this evening, all you’re getting is a redesign (thanks @redct), and a new Tweet for those of you with scores in the upper 80th percentile (BTW, the latest version says “XX% of twitterers” instead of “XX% of you”, because the latter wording makes your ego seem more inflated than it should be).
Also, I experimented with a Twitterank algorithm that uses an API that doesn’t require authentication, but some crucial information is lost in that case, and the scores are 1) low and 2) less meaningful, which is no good. I have another idea I’ll try tomorrow, but for now it’s back to the old algorithm.
If I were to do this over again, I probably would’ve done 2 things differently. Firstly, I would’ve come up with an algorithm that works with Twitter’s Search API, which doesn’t require authentication. Secondly, I would’ve handled posting to twitter differently, which would’ve sacrificed virality, but after today, I’m not so sure insane rapid growth is really worth it. So… sorry, Twitterverse. I’ll do better next time. Promise.
Also, one last thought before I go to bed (damn, it’s almost 6am here): it doesn’t seem like Twitter rate-limited me, even though I’m certain I went over the limit of 100 requests per 60 minutes (by orders of magnitude, in fact). I’m not sure if that’s a bug or a feature since I haven’t heard a peep from them… but if anyone at Twitter is listening: thanks guys!
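For anyone who would rather stay under a limit like that than test it, here is a minimal Python sketch of a client-side sliding-window limiter. It's an illustration under assumptions, not what Twitterank does: the class name `SlidingWindowLimiter` and its parameters are made up, and the only numbers taken from above are 100 requests per 60 minutes.

```python
import time
from collections import deque

class SlidingWindowLimiter:
    """Allow at most `max_requests` calls in any `window_seconds` span."""

    def __init__(self, max_requests=100, window_seconds=3600.0,
                 clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock          # injectable so it can be tested
        self._stamps = deque()      # times of the requests still in the window

    def allow(self):
        """Return True and record the request if it fits in the window."""
        now = self.clock()
        # Forget requests that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window_seconds:
            self._stamps.popleft()
        if len(self._stamps) < self.max_requests:
            self._stamps.append(now)
            return True
        return False

limiter = SlidingWindowLimiter()
if limiter.allow():
    print("ok to send a request")
else:
    print("over the limit, wait a while")
```

Calling `allow()` before each request and skipping (or delaying) the request when it returns `False` keeps the client under the limit without relying on the server to enforce it.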
You can now also see the top 50 users with the highest Twitterank scores. What does it mean? Are they any good? It’s hard to say.
… is up: Twitterank Creator Speaks
The main thing I wanted to get at was that this was a small, casual hack that blew up. The Twitterank algorithm isn’t particularly exciting, and it clearly needs more work. When I made the thing, asking for passwords seemed suboptimal, but not a huge deal since I figured most people would just be turned away by it. We live and learn, I guess.
Anyway, like most internet fads, Twitterank will also eventually fade away. But before that happens, something might come out of it. Or not. We’ll see.
I’ll never convince everybody that this is not a phishing site, but one thing I can do is to try and make it better. So, I’m staying up late to bring you new features. Yay!
Where do I stand?
I got a ton of feedback asking “What’s the number mean? Is this good or bad?” That’s my bad. I used the word “rank” but didn’t actually give you a ranking. That’s changed now. Under your score, you’ll see text that says “aprox. XXX percentile”. For example @t_rank currently has a score of 86.41, and “aprox. 83.12 percentile”. That means t_rank has a score higher than approximately 83.12% of twitterers that Twitterank knows about. This ranking is pretty fluid, and will change depending on your score and the distribution of scores in the system.
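To make the percentile idea concrete, here is a minimal Python sketch. It's an illustration only, not the actual Twitterank code: the function name `percentile_rank` and the sample scores are made up, and it assumes "higher than" means strictly higher.

```python
from bisect import bisect_left

def percentile_rank(score, all_scores):
    """Percentage of known scores that are strictly lower than `score`."""
    if not all_scores:
        raise ValueError("need at least one known score")
    ordered = sorted(all_scores)
    # bisect_left returns how many entries are strictly less than `score`.
    below = bisect_left(ordered, score)
    return 100.0 * below / len(ordered)

# Made-up scores standing in for "twitterers that Twitterank knows about".
known_scores = [10.0, 20.0, 30.0, 40.0, 50.0]
print(percentile_rank(40.0, known_scores))
```

Because the result depends on the whole list of known scores, the same score can land at a different percentile as more users are added, which is why the ranking is fluid.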