Clay Shirky has a very interesting piece about power laws. He explains that, as with everything else, some blogs get far more attention than others; in fact, the 2nd-place blog has 1/2 the value of the 1st-place blog, the 3rd-place blog 1/3, and so on in a 1/n sort of fashion. If you plot this power law distribution, you find that two-thirds of the blogs are "below average," and that this sort of unequal distribution of attention is natural if you think about the way the system works.
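To make the 1/n idea concrete, here is a minimal sketch, assuming an idealized distribution in which the blog at rank n gets exactly 1/n of the top blog's attention. The function name and the blog count of 100 are illustrative assumptions, not Clay's data, so the exact fraction will not match his figure; the point is only that a majority of blogs land below the mean.

```python
def fraction_below_average(num_blogs: int) -> float:
    """Share of blogs whose attention falls below the mean under a pure 1/n law."""
    # Hypothetical idealization: the blog at rank n gets 1/n of the top blog's attention.
    attention = [1.0 / rank for rank in range(1, num_blogs + 1)]
    mean = sum(attention) / num_blogs
    below = sum(1 for value in attention if value < mean)
    return below / num_blogs


if __name__ == "__main__":
    # 100 blogs is an arbitrary illustrative size, not a figure from the article.
    print(f"{fraction_below_average(100):.0%} of blogs are below average")
```

The few blogs at the top pull the mean well above what a typical blog gets, which is why "below average" describes most of the list.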

Dave protests and says that blogs are different.

Dave Winer
To get an idea of what I'm talking about, skim Clay's article. How many of the weblogs he mentions have you heard of? I found that most of them were strange to me. So if we're hitting a scaling wall, why are these blogs becoming popular, even dominant, without any of us knowing about them? If we were all on a mail list together, believe me, we'd know the names of the people who dominate.
So I am reading Steven Johnson's book Emergence: The Connected Lives of Ants, Brains, Cities, and Software to prepare for an 8,000-word article I have to write for Illume on the future of information. I've been thinking about just this issue for the last month. I think that connecting the discussion about emergence with this issue is key to understanding how blogs are different.
Steven Johnson - Emergence
The technologies behind the Internet--everything from micro-processors in each Web server to the open-ended protocols that govern the data itself--have been brilliantly engineered to handle dramatic increases in scale, but they are indifferent, if not down-right hostile, to the task of creating higher-level order. There is, of course a neurological equivalent of the Web's ratio of growth to order, but it's nothing you'd want to emulate. It's called a brain tumor.
by definition, no page on the Web knows who's pointing back.
Self-organizing systems use feedback to bootstrap themselves into a more orderly structure. And given the Web's feedback-intolerant, one-way linking, there's no way for the network to learn as it grows, which is why it's now so dependent on search engines to rein in its natural chaos.
So as the former Chairman of Infoseek Japan, I used to think about this power law and tried to figure out ways to get EVERYONE on the net to hit the Infoseek top page. We were able to route a significant amount of the Net's traffic through portals because the web pages weren't self-organizing into anything intelligent enough to sort themselves out.

Blogs are different. Although the search engines and metaindexes are useful, they are no longer the first place you go. I read my RSS news feeds before I go searching on a portal for news. As Dave says, I don't know most of the blogs on the top 100 list and I don't care. We are organized into more intelligent communities, and although there is a power law of sorts with respect to blogs that get a lot of attention, there are many local peaks. I think it looks much more like clusters of blogs with interconnections between communities. It is a lot like a "strength of weak ties" sort of map.

I'm going to focus on this for my paper. Any references to things I should read or any comments would be very helpful. Sorry to use you all as my editorial support team for my writing all of the time. ;-)


All that was old is new again :). These are probably not the sort of links you are thinking of, but the central theorist in this area might be a guy named Karl Deutsch. He suggested that you could map communities by watching the flows of communications. Influenced by the push toward cybernetic systems at the time, he was also particularly interested in the sort of communication that changed structures and organization, that allowed the social organism to control itself.

Ithiel de Sola Pool took off from here and wrote about how technologies that let you talk back are fundamentally more democratic (a "good thing") while broadcasting systems led to authoritarian systems of control ("Technologies of Freedom" from the early 80s).

Even if you've already run across these guys, they deserve a re-read within the current context. Trackbacks, for example, are the perfect example of this kind of communication that is intended to "steer": a feedback system that can allow for the sort of emergent systems Johnson's book talks about (I write, not having read the book yet :).

Blogs are different from the rest of the web, in ways that specifically contradict Johnson's description. Blogs are beginning to enable "feedback-tolerant, two-way linking," which means they could be the first step to a network that *does* learn as it grows.

As Alex noted in his response to Clay's essay, the interesting work is in understanding what's happening in the linking itself, not so much in creating statistical models that show yet-another-power-law-distribution.

Your title for this post struck a nerve for me, because I'm becoming increasingly uncomfortable with 2D representations of the time-shifting, multi-faceted space of blogs. Not that I have a great solution, mind you, but I'm not seeing a good fit in what's currently being used.

You might want to check out David Reed's discussion about Reed's Law, which describes how the value of groups scales on the Internet. A number of his papers are available here:

David is also one of my partners on the Croquet project.

Thanks for the pointers. Just saw another good analysis by Ross Mayfield who talks a lot about networks and relationships.

Although the search engines and metaindexes are useful, they are no longer the first place you go. I read my RSS news feeds before I go searching on a portal for news. As Dave says, I don't know most of the blogs on the top 100 list and I don't care.

The only reason I know Doc Searls is a high-profile blogger is because everyone keeps talking about him being one. Out of my ninety feeds, links to him show up about as often as links to anyone else.

I guess now that I'm reading nearly a hundred unique blogs, the statistic is spread out a bit; people don't really all post about the same thing, and so there's this nice spread. If Doc posts something interesting, I can watch his comments ripple outwards and stop. If Scoble posts something interesting, it ripples in totally different directions.

Recently I added your blog, because you talked about things I like to hear about; since then, I've gotten some really interesting outlinks that I've never seen before, and I've added a couple to my feed list.

I guess I'm missing the power relationships between them, or the popularity, or whatever. In my RSS reader, y'all are each a voice, and have indistinct, sometimes overlapping communities surrounding you. I don't really notice when Doc has a hundred people linking to him and I don't have any (do I? I don't think so, heh) because it's not really how I'm accustomed to observing communities.

Toph reads on the spur of the moment and misses most of the A-list, but not all. Lots of off-list, good sites, and not too many dogs are on the blogroll. First heard of Columbia from Adam Greenfield, v-2 Organisation, and recorded changing situation 2003/02/01. On Google News at the time, there was an article from the Washington Post that gave the incorrect impression that all was A-OK. Anyway, recommend research blogspace for a traumatic date in history, for shifts in power patterns. Also consider graph distribution annually from longest possible historical view, the early Justin Hall days, forward.

I'd love to see what you come up with in fusing Clay's ideas with Emergence -- I barely mentioned power laws at all in the book, and ever since I've been mulling over the connection as well. Liz is absolutely right that the blogosphere is starting to develop two-way linking, which has all sorts of interesting secondary effects. It has been interesting to see how many of the meta-blog tools created since Emergence came out have been devoted to creating one form of two-way linking or another (TrackBack, Technorati, various Google hacks, etc.).

I look forward to seeing what you come up with!


Clay Shirky
You know the answer as well as I do: you use 2 dimensions of representation if you have two dimensions of data. So if you are doing rank by traffic, anything _other_ than 2D is the wrong answer.

If on the other hand you don't want to find the hubs but the neighborhoods or some other multi-D metric, you use a different kind of graph.

So one's reaction to that article comes down to two things. First, do you think rank by traffic is important? I do; Winer and Powers, among others, do not.

Second, and this is what I didn't address but it looks like it's what you're going after, what other modes of representation capture other important features?

Probably the best book out there on the subject is Duncan Watts's "Six Degrees."

Hey Clay.

A few thoughts...

You're not ranking by traffic; you're ranking by links, I think. Right? These are different metrics, although maybe you get similar results.

I think that looking at rank by traffic is useful for some things, but I think the reaction you're getting from bloggers is that it misses the reason why we are blogging and makes it too easy for critics to say it's the same as all of the past "Web Revolutions." It doesn't help that you don't have your own blog. ;-)

I truly believe that with blogs we may be seeing one of the best examples of emergence on the Web, and what I would like to see is some way to bring out the difference between blogs and past systems.

Also, as Dave Winer says, the Technorati top 100 ranking is not as important to me as WHO is linking to me. When I was running Infoseek, all I cared about was HOW MANY page views we were getting. Sure, I brag about my page views to people who don't blog because that's a metric they understand, but the really interesting stuff is going on at a higher level, I think.

So... How do we capture the next higher level of order? Well, that's what I hoped you might have an answer for. I think there are ways to look at this subjectively.

One way might be to track a meme through blogspace. See how an idea like your article gets picked up and quoted, and where it ends up. Map that and you have one space. Each meme is like a tracer. Some communities will pick up certain ideas, while others will not. You can find the weak ties between two communities as these memes make their way across networks. MANY memes will end up being very local, and SOME will end up on EVERY blog. But I don't know how to do this.
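A very rough sketch of what such a tracer might look like, assuming you already had, for one meme, a record of which blog picked it up from which other blog and a community label for each blog. All of the names and data below are hypothetical placeholders, and getting real sighting data and community labels is the hard part that this does not solve.

```python
from collections import defaultdict

# Hypothetical sightings of one meme: (blog, community, blog it was picked up from).
# The first entry has no source because it is where the meme started.
SIGHTINGS = [
    ("blog_a", "theory", None),
    ("blog_b", "emergence", "blog_a"),
    ("blog_c", "emergence", "blog_b"),
    ("blog_d", "a_list", "blog_a"),
    ("blog_e", "a_list", "blog_d"),
]


def trace_meme(sightings):
    """Return (blogs reached per community, hops that cross communities)."""
    community_of = {blog: community for blog, community, _ in sightings}
    reached = defaultdict(list)
    bridges = []
    for blog, community, source in sightings:
        reached[community].append(blog)
        # A hop whose two ends sit in different communities is a candidate weak tie.
        if source is not None and community_of[source] != community:
            bridges.append((source, blog))
    return dict(reached), bridges


if __name__ == "__main__":
    reached, bridges = trace_meme(SIGHTINGS)
    print("communities reached:", reached)
    print("candidate weak ties:", bridges)
```

Running many memes through something like this and counting how often each cross-community hop shows up would give one subjective map of the space, with very local memes touching a single community and widespread ones touching all of them.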

I think that "weak ties" may be a key phrase. Links across networks probably have a lot of value. Another place to look might be how Google does page rankings. I think there are some good papers about this. But I think, again, single-dimension rankings don't capture the higher-level order here.

How do scientists measure emergent behavior in ants and so on?

Doc says it so much better than I do, something about what a blog is and what to expect.

I think the crucial element that modifies traditional power law behaviors is the lack of sustainable advantage to the incumbents. If a blog in the top 20% deteriorates in terms of content focus or freshness or other criteria, it will drop out of the 20%. Conversely, if a new entrant brings a fresh flavor to the fray, there are no barriers to it entering the 20%. In other words, unlike other marketplaces of ideas, the weblog "blogosphere" remains liquid.

I think blogs and their use or value display more chaotic behaviour than empirical theories might predict. Doesn't it depend more on the time and circumstances of the user, both reader and writer?

A book author can never know the consequences of publishing the content they create. The internet amplifies this lack of knowledge. I think at the moment the net and the blogosphere are systems too complex and fast-evolving to describe accurately.

Trying to quantify things like attention, trust, comprehension and such is inherently problematic. Something beautiful and powerful is emerging but it can't be known till after the fact.

