Archive for November, 2008
As I’ve mentioned elsewhere, my goal with Twitterank was to experiment with a slightly different way of quantifying Twitter users, specifically, using @ replies. Quite a few people have asked why we can’t just use follow counts or the number of posts. My answer has been something along the lines of “there are other signals”, which is another way to say, if we use less obvious signals, we might get less obvious (and potentially better) results. Let me illustrate with a few examples…
Earlier today, I got a tweet from @erikvb who asked: “how is it that your secwet alrowithm calculates 224.941 for @google if they don’t have any tweets at all?” Indeed, @google is not following anyone, only has a few hundred followers, and has no tweets. Yet, it’s ranked in the top 20. Weird, huh?
Of course, if you look at all the @ replies that @google receives, you can see that the Twitterank algorithm is behaving as expected. Now, this is weird and surprising, but does that mean Twitterank is “wrong”? I’d say “no”. In fact, it’s doing exactly what it’s supposed to do: it’s telling us something we might not have otherwise known. It’s telling us (or rather, twitterers everywhere are telling us) how influential @google could be if it were a real twitterer with real tweets. It’s like going to a crowded party and pinpointing the influential individuals without looking at them directly or hearing them say anything. It’s quite powerful stuff.
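To make the “look at the @ replies an account receives” idea concrete, here is a minimal sketch that counts incoming @ mentions per user from a handful of made-up tweets. It is not the Twitterank algorithm, which is not published in this post; the function name, the sample tweets, and the choice to count each mentioning tweet once are all assumptions for illustration.

```python
import re
from collections import Counter

# Matches @name when the "@" is not preceded by a word character,
# so that something like an email address is not counted as a mention.
MENTION_RE = re.compile(r"(?<!\w)@(\w+)")


def count_incoming_mentions(tweets):
    """Count how many tweets by other users mention each screen name.

    `tweets` is an iterable of (author, text) pairs. This is a hypothetical
    helper for illustration only, not part of any Twitter API.
    """
    counts = Counter()
    for author, text in tweets:
        # Lowercase names and de-duplicate within a single tweet.
        mentioned = {name.lower() for name in MENTION_RE.findall(text)}
        # Ignore people mentioning themselves.
        mentioned.discard(author.lower())
        counts.update(mentioned)
    return counts


# Made-up sample data: @google never tweets, yet still receives mentions.
sample = [
    ("alice", "@google when will you start tweeting?"),
    ("bob", "hey @google, love the new feature. cc @alice"),
    ("google_fan", "@Google @google please follow back"),
]
incoming = count_incoming_mentions(sample)
```

An account can rack up incoming mentions without posting anything itself, which is the kind of signal that follower counts and post counts would miss.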
Here’s another example I stumbled across. Some of you may know (or know of) @caterina, who is probably most famous as the co-founder of Flickr. She has close to 3000 followers on Twitter, and you might think someone like her would be influential in the Twitterverse. As it turns out, her score is currently 59.35 (72.79 percentile, 1.2% confidence), which is high, but not especially high for someone with so many followers. It actually makes sense if you look at her tweets. She’s obviously not a heavy Twitter user, and her tweets are somewhat cryptic, at least to a casual passerby. But how would she stack up against @joetheplumber? He only has 530 followers and 44 updates. Take a guess, then go find out. Were you surprised?
All this is mildly amusing, but Twitterank itself actually isn’t very interesting. Someone’s twitterank is metadata, not data. It’s a spice, not a main course. So, Twitterank still needs to find a main course to spice up, but that’ll come later.
The original version of Twitterank required a user name and password in order to make an authenticated web service request to Twitter and retrieve data that isn’t otherwise accessible. Understandably, many people were reluctant to enter their passwords in a new 3rd party site, and so far, only a small fraction of twitterers have been twitteranked.
The good news is that I now have enough twitteranks to start calculating twitteranks for most Twitter users, using only publicly accessible data and previously calculated twitteranks. In other words, you can now get any Twitter user’s twitterank without any passwords. Take, for example, @ev, one of the founders of Twitter. So far he’s been cautious enough not to get a twitterank (or he probably doesn’t care), but wouldn’t you want to know what his score would be?
Now you can find out. Check out these screen shots:
Notice that under the score, it says “(30.04% confidence)”. The password-free version of the Twitterank algorithm uses only limited public data, so there’s a certain amount of guessing involved. Generally speaking, the higher the confidence, the closer the score should be to what Twitterank would calculate if it had access to the user’s password and could retrieve more data.
In any case, head over to twitterank and start looking up the twitteranks of all your friends who’ve resisted the urge so far. Make sure you let them know how they score too :-)
I’m about to go to bed, because I really really need sleep. Today was pretty quiet, fortunately, and by tomorrow, most of Twitterdom will have forgotten about Twitterank. Thank goodness for short attention spans. So this evening, all you’re getting is a redesign (thanks @redct), and a new Tweet for those of you with scores in the upper 80th percentile (BTW, the latest version says “XX% of twitterers” instead of “XX% of you”, because the latter wording makes your ego seem more inflated than it should be).
Also, I experimented with a Twitterank algorithm that uses an API that doesn’t require authentication, but some crucial information is lost in that case, and the scores are 1) low and 2) less meaningful, which is no good. I have another idea I’ll try tomorrow, but for now it’s back to the old algorithm.
If I were to do this over again, I probably would’ve done 2 things differently. Firstly, I would’ve come up with an algorithm that works with Twitter’s Search API, which doesn’t require authentication. Secondly, I would’ve handled posting to twitter differently, which would’ve sacrificed virality, but after today, I’m not so sure insane rapid growth is really worth it. So… sorry, Twitterverse. I’ll do better next time. Promise.
Also, one last thought before I go to bed (damn, it’s almost 6am here): it doesn’t seem like Twitter rate-limited me, even though I’m certain I went over the limit of 100 requests per 60 minutes (by orders of magnitude, in fact). I’m not sure if that’s a bug or a feature since I haven’t heard a peep from them… but if anyone at Twitter is listening: thanks guys!
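For anyone who would rather stay under a limit like that than hope for the best, here is a minimal sketch of a client-side sliding-window throttle. The 100-requests-per-60-minutes figure comes from the post; the class name, its parameters, and the injectable clock are assumptions made for illustration, and none of this is Twitter’s API or Twitterank’s actual code.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` calls in any `window_seconds` span.

    Hypothetical helper: the defaults mirror the limit quoted in the post.
    """

    def __init__(self, max_requests=100, window_seconds=3600,
                 clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock      # injectable so the logic can be tested
        self._stamps = deque()   # timestamps of recently allowed requests

    def try_acquire(self):
        """Return True and record the request if it fits in the window."""
        now = self._clock()
        # Forget requests that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window_seconds:
            self._stamps.popleft()
        if len(self._stamps) < self.max_requests:
            self._stamps.append(now)
            return True
        return False
```

A caller would check `limiter.try_acquire()` before each web service request and wait (or queue the work) whenever it returns `False`.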
You can now also see the top 50 users with the highest Twitterank scores. What does it mean? Are they any good? It’s hard to say.
… is up: Twitterank Creator Speaks
The main thing I wanted to get at was that this was a small, casual hack that blew up. The Twitterank algorithm isn’t particularly exciting, and it clearly needs more work. When I made the thing, asking for passwords seemed suboptimal, but not a huge deal, since I figured most people would just be turned away by it. We live and learn, I guess.
Anyway, like most internet fads, Twitterank will also eventually fade away. But before that happens, something might come out of it. Or not. We’ll see.
I’ll never convince everybody that this is not a phishing site, but one thing I can do is to try and make it better. So, I’m staying up late to bring you new features. Yay!
You can see your (and other people’s) Twitterank by creating URLs like this:
Where do I stand?
I got a ton of feedback asking “What’s the number mean? Is this good or bad?” That’s my bad. I used the word “rank” but didn’t actually give you a ranking. That’s changed now. Under your score, you’ll see text that says “aprox. XXX percentile”. For example @t_rank currently has a score of 86.41, and “aprox. 83.12 percentile”. That means t_rank has a score higher than approximately 83.12% of twitterers that Twitterank knows about. This ranking is pretty fluid, and will change depending on your score and the distribution of scores in the system.
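The percentile described above can be sketched in a few lines. This is only one plausible reading of “higher than approximately XX% of twitterers that Twitterank knows about”; the function name and the sample scores are made up, and the site may well compute it differently.

```python
from bisect import bisect_left


def percentile_rank(score, known_scores):
    """Percentage of known scores that are strictly lower than `score`.

    Hypothetical helper for illustration; not Twitterank's actual code.
    """
    ordered = sorted(known_scores)
    if not ordered:
        raise ValueError("need at least one known score")
    # bisect_left gives the number of entries strictly below `score`.
    below = bisect_left(ordered, score)
    return 100.0 * below / len(ordered)


# Made-up scores standing in for "twitterers that Twitterank knows about".
known = [10.0, 20.0, 30.0, 40.0, 50.0]
rank = percentile_rank(40.0, known)  # 3 of 5 scores are lower
```

Because the result depends on every score in the system, the same score can land at a different percentile as more users are added, which is why the ranking is fluid.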
Thanks to everyone who posted a comment to the previous entry. Since it seemed like there were common questions, I’ll just answer them all in a single post.
Are you a phishing site? Are you going to steal my account? Etc., etc.
No, I am not a phisher. I don’t even store your password. Your password gets used once to calculate your Twitterank, and is never stored on disk or any other permanent storage device. Having said that, people do need to be more careful about giving away their account information. I’m not evil, but the next guy might be.
How can we verify that you aren’t storing our password?
I don’t have a good answer, but I’d be happy to do whatever I can to help convince people that this isn’t a phishing operation. I know that the people who know me will vouch for my character and integrity, but I’m also open to showing a trusted 3rd party the innards of the system. If you have any suggestions, please leave a comment.
Why do you need my password to begin with?
There’s some data that I use (but don’t store) to calculate your Twitterank. There are ways for Twitter to make that data available without requiring you to give out your password to 3rd party sites (Facebook, Yahoo! and others have such systems), but Twitter doesn’t yet offer those options to developers. As soon as Twitter adds more secure authentication mechanisms, I’ll switch to them (give or take the few days it’ll take to change the code).
My score is x… is that good?
There’s no good or bad, per se, but a higher score means you are more active and prominent on Twitter. As you and your friends use Twitter more you should see your scores increase.
What kind of “ranking” is this?
I’m sorry the name is misleading. It’s not really a “ranking” even though the name implies it. The name is an homage to PageRank, an algorithm developed by Google’s founders that is considered to have been part of the secret to their search engine’s success.
Please delete my account from your system.
Again, I don’t have your password, so I can’t abuse your account. But if you really really want to be removed, leave a comment and I’ll delete you from the system. (But please understand that I’m doing this for fun, and have a day job, so I can’t promise to respond instantly. I’ll do my best though.)
I’m setting up this blog in case anyone wants to follow future developments of Twitterank. Right now, Twitterank is spreading like wildfire through the Twitterverse, and that’s according to plan*. A lot of people seem to be asking “um, what’s the point?” The “point”, I assure you, is yet to come. Most of the interesting things can’t be done until lots of people have Twitteranks, but lots of people are indeed getting Twitteranks as we speak (about one a second, for a total of about 6500 so far).
Also, you can follow Twitterank on twitter: t_rank
*Yeah, most of my previous plans never actually went according to plan, so I can’t say I was really prepared for this one. I probably wouldn’t have done this on DreamHost if I’d known things would go so well…