If you are thinking "What's this?", see Whose Sounds in the Cellar.
If you are wondering "How did I get here?", use the back button of your browser.
If you are asking "What Sounds?", read on.

More on Technorati's Desi Blog Ranking

Today is/was Singapore's 40th Independence Day. Other than fireworks and flags, it meant that it was a holiday. Sitting at home and reading all your comments, I decided to revisit the blog rankings.

My main aim was to clean up the code so it could potentially be useful elsewhere. So I rewrote it to be extensible (geeks: OOPSified it), a bit faster and 99% automatic. I still have to upload the results to the server. :-)

I thought it wouldn't take more than an hour but it took most of the day. You see, I went and harvested all the links from indianbloggers and sambharmafia, and then it turned out that some bloggers write in Hindi and Tamil! Well, I knew that happens, but I just didn't anticipate it in this context. My program couldn't handle non-English data!

This being the first time I've had to deal with language encoding issues, most of the day was spent researching Unicode and coaxing my parsers to respect encodings. At the end of this learning exercise, I present to you the better, bigger and badder Desi Blog Rankings!
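For the curious, here is a minimal sketch in Python of what "respecting encodings" can look like when decoding a fetched page. It is an illustration only, not the code behind the rankings: the function name decode_page and the fallback order (declared charset, then UTF-8, then Latin-1) are assumptions.

```python
from typing import Optional


def decode_page(raw: bytes, declared: Optional[str] = None) -> str:
    """Decode fetched bytes into text.

    Tries the charset the server or page declared (if any), then UTF-8.
    Falls back to Latin-1, which maps every byte to a character and so
    never raises, at the cost of possibly producing the wrong characters.
    """
    for encoding in (declared, "utf-8"):
        if not encoding:
            continue
        try:
            return raw.decode(encoding)
        except (UnicodeDecodeError, LookupError):
            # UnicodeDecodeError: bytes are not valid in this encoding.
            # LookupError: the declared charset name is unknown to Python.
            continue
    return raw.decode("latin-1")
```

The point of the fallback chain is that a Hindi or Tamil page declared as UTF-8 decodes cleanly, while a mislabelled page still yields some text instead of crashing the whole update.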

These are built from this list of blog URLs. As of now, it has nearly four hundred URLs. Adding new blogs is just a matter of appending to this file and running an update.

The URLs in the file have to be exactly the blog's URL, not an entry in the blog or the domain root. Only when we query Technorati with the correct URL does it send back the correct information, viz. the blog's name, rank, links, etc. With an incorrect query URL, Technorati returns just the inbound links. So it is vital that the blog URL be accurate.
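A rough way to catch the most obvious mistakes before querying is to flag URLs that look like individual entries. The sketch below is Python and purely heuristic: the name looks_like_entry and the patterns it checks (dated paths, .html/.php file names, numeric post ids in the query string) are assumptions, not rules from Technorati. It cannot tell a domain root from a blog's real URL, and it will wrongly flag a blog whose home page really is something like index.php, so treat a hit as "check this by hand".

```python
import re
from urllib.parse import urlsplit

# Hypothetical hints that a URL points at a single entry rather than the blog.
ENTRY_HINTS = re.compile(
    r"/\d{4}/\d{2}/"          # dated archive path such as /2005/08/
    r"|\.(html?|php)$"        # path ending in a page file name
    r"|[?&](p|id)=\d+"        # numeric post id in the query string
)


def looks_like_entry(url: str) -> bool:
    """Return True if the URL resembles a blog entry, not a blog's front page."""
    parts = urlsplit(url.strip())
    target = parts.path
    if parts.query:
        target += "?" + parts.query
    return bool(ENTRY_HINTS.search(target))
```

Running this over the URL file and printing the flagged lines gives a short list to review manually.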

Even with an accurate URL, Technorati likes to act naughty and we sometimes get incorrect results. For example, the query for India Uncut always fails although the web-based query works just fine. I can't figure out why; I've filed a bug with Technorati, let's see what comes out of it.

Since not all queries yield a cosmos rank, we can't take that as an accurate quantitative indication of popularity. In general, inbound blogs and inbound links are reported correctly, so they are the best measures at the moment.
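To make that concrete, here is a small Python sketch of ordering blogs by inbound blogs and then inbound links, while tolerating a missing rank. The BlogStats record, its field names and the sample numbers are all made up for illustration; they are not the actual data structures or results.

```python
from typing import List, NamedTuple, Optional


class BlogStats(NamedTuple):
    name: str
    inbound_blogs: int
    inbound_links: int
    rank: Optional[int]  # None when the query returned no rank


def order_by_popularity(blogs: List[BlogStats]) -> List[BlogStats]:
    """Sort by inbound blogs, then inbound links (both descending).

    The rank is deliberately ignored because it is not always available;
    the blog name breaks any remaining ties so the order is stable.
    """
    return sorted(
        blogs,
        key=lambda b: (-b.inbound_blogs, -b.inbound_links, b.name.lower()),
    )


# Hypothetical sample data.
sample = [
    BlogStats("Alpha", 10, 40, None),
    BlogStats("Beta", 25, 30, 1200),
    BlogStats("Gamma", 10, 55, 90000),
]
ordered_names = [b.name for b in order_by_popularity(sample)]
```

A blog with no rank ("Alpha" here) still gets a sensible position, because only the two counts that are reported reliably take part in the ordering.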

Also, I am not scraping sites to look for links to RSS/Atom files. If Technorati returns a link, I include it. So don't ask me to manually add RSS links.

As far as I am concerned, this ranking stuff is feature complete. I'll probably add detection and marking of inactive blogs but other than that, I can't think of what else could be done.

Now go blog about it and make it worth my while. ;-)

techtalk | 10 comments | permalink | 10.08.2005 00:08 SGT


Re: More on Technorati's Desi Blog Ranking
Kaps wrote on Wed, 10 Aug 2005 01:50

Great effort. As you said, there seems to be some bug, as a few blogs (including mine) have the number of incoming and outgoing links but no Cosmos Rank attached. Would the Technorati support team have an answer for this as well? Can the output be generated in ranking order instead of alphabetical order? This might be a more digestible format for the blogging community. Thanks for taking up this cause.


    Re: Re: More on Technorati's Desi Blog Ranking
    antrix wrote on Wed, 10 Aug 2005 10:26

    I can't do much about Technorati's bugs. Anyway, rank or no rank, just a list of active blogs will be useful. A sort of simpler, more open, blogstreet india.

    The output is sortable on the fly, just click on the column headers. But you are right, on page load, it should be initially sorted by incoming blogs/links. I am trying to figure out how to do this.. I hate Javascript :-((


Re: More on Technorati's Desi Blog Ranking
Ravages wrote on Wed, 10 Aug 2005 16:35

Neat list, but are there only 400 blogs? Ought to be definitely more...

I was wondering...if you could build in a system that also checks to see trackbacks...that is a far better yardstick to measure popularity. A good blogger will inspire other bloggers to take up the thread and talk about it on their own blogs. Ravikiran, Amit Varma and a few others score on this...


    Re: Re: More on Technorati's Desi Blog Ranking
    deepak wrote on Wed, 10 Aug 2005 21:15

    For all practical purposes, trackback == inbound links.

    Are they a better measure of popularity? I don't think so. A one-off good/interesting/topical blog entry might generate great buzz and a flurry of trackbacks. Doesn't mean all those folks linking to that entry are regularly going to follow the blog.

    You would have to track the breadth and frequency of trackbacks, a much harder problem.


      Re: Re: Re: More on Technorati's Desi Blog Ranking
      Ravages wrote on Wed, 10 Aug 2005 22:46

      "You would have to track the breadth and frequency of trackbacks, a much harder problem."

      Hmmm...true. But wouldn't that enable the really good, popular bloggers to get a better rank? As things stand now, inbound links alone are used, and that is not really trustworthy, is it? My brother has a blog, and so do his classmates. Now, 60 of them read his blog and he reads 60 other blogs, and they each link to each other. That's easily 59 links and takes him right to the top of blogstreet listings. But is he popular enough?


        Re: Re: Re: Re: More on Technorati's Desi Blog Ranking
        deepak wrote on Wed, 10 Aug 2005 23:40

        In a web searching context, what you've described is known as a link farm. True, they can bias the ranking.

        Now if I were Blogstreet, i.e. had commercial motives, I would write algos to detect and penalize link farms just like Google and Yahoo do. Since I am not, this is it :-)

        If you find that the technorati generated list is a good guide to finding popular and new blogs, I think my goal would be achieved because that is what I am working towards.


          Re: Re: Re: Re: Re: More on Technorati's Desi Blog Ranking
          Ravages wrote on Thu, 11 Aug 2005 09:53

          "If you find that the technorati generated list is a good guide to finding popular and new blogs,"

          True, I found a few blogs I had read once upon a time and lost track of since. And a few new ones. Thanks. BTW - Great job with all the coding and stuff


Re: More on Technorati's Desi Blog Ranking
divya wrote on Thu, 11 Aug 2005 16:04

Whatever happened to poor blogstreet india?


    Re: Re: More on Technorati's Desi Blog Ranking
    deepak wrote on Thu, 11 Aug 2005 18:25

    Yes, whatever happened? :) I have some thoughts, will write in a future post.


Re: More on Technorati's Desi Blog Ranking
Krishna wrote on Fri, 12 Aug 2005 17:32

Doesn't BlogPulse come up with rankings based on similar criteria? I wonder how different the BlogPulse rankings are from this one.



 