Joi Ito's Web

Joi Ito's conversation with the living web.

Recently in the Search Category

Google has just launched a "Usage Rights" category in their advanced search. It uses Creative Commons licenses to allow users to search for works which either "allow some form of re-use" or "can be freely modified, adapted or built upon". This is a great step forward and will hopefully increase the adoption of Creative Commons (CC).

On the other hand, I don't see CC mentioned on the page, and having only two choices is limiting, considering the various other licenses that people are likely to use. Yahoo advanced search already has two radio buttons instead, allowing you to choose "Find content I can use for commercial purposes" or "Find content I can modify, adapt, or build upon". This actually allows three choices (depending on how you count), and they have a CC logo and a link to an explanation.

I realize it takes a lot for Google to add this and I appreciate all of the work that went into getting this done. Yahoo and Google are both probably testing this feature to some extent. It would be great if you could all spread the word, try the service and give feedback to Yahoo and now Google so they continue to integrate Creative Commons into their offerings.

In addition to Google and Yahoo, there are many other services that have begun integrating Creative Commons. See the web page for more info.

Vint Cerf just left MCI to join Google. Congratulations Vint!

Interesting in the context of eBay buying Skype...

UPDATE: Google press release

Creative Commons: weblog
CC in Yahoo! Advanced Search

Yahoo! Search for Creative Commons is now part of the Yahoo! advanced search

Way to go Yahoo!

(Now close to 16 million pages linking to a CC license.)

Now it's official. Thank you Yahoo!!

BitTorrent have just launched a search service. It allows you to search for legal Torrents. Someone slashdotted the secret URL before it was launched and they moved it about an hour later. The amazing thing is that someone wrote a Firefox search extension in the hour it was up. ;-)

Anyway, this launch is official. It looks like they underestimated the interest and the site is really slow right now, but give it a try in a bit.

I would also like to disclose that I am in discussions with BitTorrent about joining their advisory board. It's not inked yet, but I thought I'd mention it since I've been blogging a lot about BitTorrent these days. And just to be clear, this is a recent development and I was not in such discussions during the jury process when we gave BitTorrent an award.

Cory @ Boing Boing Blog
Anti-Starbucks site doesn't use "Starbucks" in name

NPR sez, "'The Delocator' is a site that helps you find independent alternatives to Starbucks in your neighborhood. So why isn't it called the 'Starbucks Delocator'? Because the San Francisco Art Institute was too scared that Starbucks would come through with the corporate smack-down. Of course this renaming means the site won't show up in google when people search for 'Starbucks', and what's the point if people can't discover it? Carrie McLaren is out to change that: she's launched a google campaign to get people to link to it by its real name, the Starbucks Delocator. Take that chilling effects. Now, get your link on!"

Starbucks Delocator Link
(Thanks, NPR and Stay Free Daily!)

Take that!

UPDATE: You can use a company name in the domain name of a non-commercial "sucks" site. via ICANNWatch

UPDATE 2: See also EFF Deep Links for information on the ruling.

Google & Firefox == Evil & Annoying

I recently did a search at google for "radio shack". To my surprise, I received a cookie setting request from radio This had never happened before- and radio shack also happened to be a sponsored link. I did other searches, such as "ford", "sony", and even "girl scouts"- and each time, the top link requested a cookie to be set. Since Girl Scouts did not have a sponsored link- I realised it must only be the top link that sets a cookie. It turned out that Mozilla browsers (that includes firefox) and Google have both enabled prefetch- although it would seem that Google only recently enabled it- as this is a new occurrence. I always verify the setting of cookies- so this makes every google search into an annoying cookie refusal time waste. It would also seem that prefetching is turned on by default in firefox- and is very unintuitive to turn off.

So- for my friends that automatically accept cookies- you are now downloading a page and a cookie nearly every time you use google and firefox together.

And even though I never clicked on their link- and never wanted to visit their site- I'm downloading the top link of my search results to my harddrive every time I do a Google search.

This makes me dislike not only Mozilla- but Google as well.

There is a discussion about this on as well. This feels annoying and since it's more fun to pick on Google than Microsoft these days, I'm blogging it.
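For site owners on the receiving end of this, here is a minimal sketch of how a server might avoid setting cookies on prefetched requests. It assumes the browser marks prefetch requests with an `X-moz: prefetch` request header, as Mozilla's link prefetching documentation describes; the function names and the `session` cookie are hypothetical and not taken from any site mentioned above.

```python
def is_prefetch_request(request_headers: dict) -> bool:
    """Return True if the request looks like a Mozilla link prefetch.

    Assumption: the browser sends "X-moz: prefetch" on prefetch requests.
    Header names are compared case-insensitively.
    """
    normalized = {k.lower(): v.strip().lower() for k, v in request_headers.items()}
    return normalized.get("x-moz") == "prefetch"


def build_response_headers(request_headers: dict, session_id: str) -> dict:
    """Hypothetical helper: only set a cookie when a person actually visits."""
    headers = {"Content-Type": "text/html"}
    if not is_prefetch_request(request_headers):
        headers["Set-Cookie"] = f"session={session_id}"
    return headers


# A prefetched request gets no cookie; a normal visit does.
prefetched = build_response_headers({"X-Moz": "prefetch"}, "abc123")
visited = build_response_headers({"User-Agent": "Firefox"}, "abc123")
```

On the browser side, the relevant Firefox preference is `network.prefetch-next` in about:config, which can be set to false to turn prefetching off.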

New York Times
Growing Number of Lawsuits Could Hurt Google's Ad Revenue

PARIS, March 27 -


This month, Mr. Dariot triumphed in his year-and-a-half-old lawsuit against Google's French subsidiary, which has been ordered to pay him $97,000 in fines and legal costs.

Dariot and his travel companies, Luteciel and Viaticum, successfully challenged Google's practice of selling Internet advertising from rivals designed to appear with Web searches for his trademarked Web site name, Bourse des Vols, which means flight exchange.


Mr. Dariot's company is one of the first to win against Google; similar cases in the United States and Germany that challenged the search engine's use of keywords have failed.

But more companies are piling on. France is home to as many as 15 cases, according to lawyers involved.


In a recent California case, Norm Zada, the chief executive and founder of Perfect 10, a publisher of nude photographs and adult material based in Beverly Hills, said he started sending legal notices to Google about the unauthorized use of his images in 2001.

"After 16 notices, they said they couldn't do anything," Mr. Zada said.

Since then, he said, his attorney has issued a blizzard of 44 notices in the past two years that covered 9,000 unauthorized images. In January, he sued Google in United States District Court in Los Angeles.

Google is in an amazing position to be the target of tons of lawsuits that will set precedent for many important things for us on the Internet. I personally like that Google is pushing the envelope on fair use and other issues. For instance, I think Google Images "thumbnails" are no larger than 150x150 pixels. Because of this, I use 150x150 as my own "safe zone" for "fair use thumbnails". If someone sues me, at least I can point at Google. The other thing that Google, Yahoo and others are involved in is transborder lawsuits, which are a very interesting issue from an Internet governance point of view.

Maybe Google should get into the legal advisory business too. ;-)
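For anyone who wants to follow the same 150x150 "safe zone", here is a minimal sketch of the arithmetic for shrinking an image to fit inside that box while keeping its aspect ratio. The 150-pixel limit comes from my guess above about Google Images, not from any published rule, and the function name is hypothetical.

```python
def fit_within(width: int, height: int, limit: int = 150) -> tuple:
    """Scale (width, height) down so neither side exceeds `limit`.

    Images that already fit are returned unchanged. Integer arithmetic
    is used so the result never rounds up past the limit.
    """
    if width <= limit and height <= limit:
        return (width, height)
    if width >= height:
        # The wider side becomes exactly `limit`; the other scales with it.
        return (limit, max(1, height * limit // width))
    return (max(1, width * limit // height), limit)


landscape = fit_within(1600, 1200)   # (150, 112)
portrait = fit_within(300, 600)      # (75, 150)
small = fit_within(100, 80)          # (100, 80), already inside the box
```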

Sorry about the light blogging. Have been a bit swamped during my travels. For now I present to you... kittengate.

For some more serious comments on the issue, see the comments on this post.

Six Log
Support for nofollow

Recently, we’ve reached out to other blog tool vendors to try to coordinate information about comment spam techniques and behaviors. As part of these efforts, we’ve also begun to talk to search companies about enriching linking semantics to better indicate visitor-submitted content (like comments or TrackBacks).

The search team at Google approached us with the idea of flagging hyperlinks with a rel="nofollow" link attribute in order to alert their search spider that a particular link shouldn’t be factored into their PageRank calculations. The Yahoo and MSN search teams have also indicated they’d support this new spec, and we’ll be implementing and deploying this specification as quickly as possible across all of our platforms around the world.
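As a rough illustration of what flagging visitor-submitted links could look like, here is a minimal sketch that adds rel="nofollow" to every anchor tag in a comment's HTML. This is not Six Apart's implementation; the function name is hypothetical, and a regex is used only to keep the example short (a real tool would more likely use a proper HTML parser).

```python
import re

# Opening <a ...> tags, and an existing rel attribute inside one.
_A_TAG = re.compile(r'<a\b([^>]*)>', re.IGNORECASE)
_REL_ATTR = re.compile(r'\brel\s*=\s*(["\'])(.*?)\1', re.IGNORECASE)


def add_nofollow(comment_html: str) -> str:
    """Return comment_html with rel="nofollow" on every <a> tag."""

    def fix(match):
        attrs = match.group(1)
        rel = _REL_ATTR.search(attrs)
        if rel is None:
            # No rel attribute yet: append one.
            return '<a' + attrs + ' rel="nofollow">'
        values = rel.group(2).split()
        if 'nofollow' in (v.lower() for v in values):
            return match.group(0)  # already flagged, leave untouched
        values.append('nofollow')
        new_rel = 'rel="' + ' '.join(values) + '"'
        return '<a' + attrs[:rel.start()] + new_rel + attrs[rel.end():] + '>'

    return _A_TAG.sub(fix, comment_html)


flagged = add_nofollow('Nice post! <a href="http://example.com/">my site</a>')
```

Existing rel values are kept and nofollow is appended, so a link that already carries another relationship is not clobbered.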

This sounds like a good idea. Take a look at the whole post for more details, but your support would be greatly appreciated.

New York Times
Google Is Adding Major Libraries to Its Database


Google, the operator of the world's most popular Internet search service, plans to announce an agreement today with some of the nation's leading research libraries and Oxford University to begin converting their holdings into digital files that would be freely searchable over the Web.

It may be only a step on a long road toward the long-predicted global virtual library. But the collaboration of Google and research institutions that also include Harvard, the University of Michigan, Stanford and the New York Public Library is a major stride in an ambitious Internet effort by various parties. The goal is to expand the Web beyond its current valuable, if eclectic, body of material and create a digital card catalog and searchable library for the world's books, scholarly papers and special collections.

Harvard Pilot Project with Google

I just got a university-wide email regarding a pilot project that Harvard is starting with Google. It looks like Google will also be joining with other universities in this project, which will begin the work of digitizing, and in the case of public domain works providing public access to, the contents of the Harvard library system.

Sounds good. Now if only we can figure out a way to get more of the books, particularly those which are out of print, into the public domain.

I remember someone posting a graphic of how an idea spreads across blogs. The image had a "gray area" of instant messenger and email that couldn't be tracked as easily. I've asked a few people who remember seeing the post, but now no one can find it. Does anyone remember it and have the URL? It's amazing that we remember it, but can't find it or remember who posted it...

UPDATE: Found! Thanks tarek! Amazing. That was less than one hour after I posted this question. I had been googling for it for a day or so.

merkinofbaphomet posts on AnandTech that he just noticed that no abuse images show up on Google Images when you search for Abu Ghraib. The same search on Alta Vista produces a bunch of images.

I DO know that Google Images doesn't refresh their image database that frequently. Is it just that the images haven't made it into the database yet? Does anyone have more info on this? Can someone from Google shed some light?

via metafilter

If you're using OS X, you can now search for Creative Commons content using Sherlock! Just connect to sherlock://


If you don't have OS X, you can still use our search engine to find licensed content.

More info on the CC blog.

What is Google Print?

Google's mission is to organize the world's information and make it universally accessible and useful. Since a lot of the world's information isn't yet online, we're helping to get it there. Google Print puts the content of books where you can find it most easily; right in Google search results.

To use Google Print, just do searches on Google as you normally would. Whenever a book contains content that matches your search terms, we'll show links to that book in your search results. Click on the book title and you'll go to a "content page," where you can see the page containing your search terms and other information about the book. You can also search for other topics within the book. Click on the "Buy this Book" link and you'll go straight to a bookstore selling the book online.
If you're a book publisher and you'd like to have your books included in Google search results, look into the Google Print program for publishers.

Holy shit. Watch out Amazon, here they come!

via danah

UPDATE: It appears that people have known about this since last year and it has been on and off in test mode, but the official announcement was Oct 6th.

Technorati Hackathon, San Francisco: Wed Oct 6, 2004

A must for anyone interested in hacking on Technorati.

Suw Charman writes about Egogooglebombing. I sometimes accidentally do this to people with my moblog.

CNN has invited Technorati back to provide real time analysis of bloggers blogging about the Republican National Convention. Thanks CNN! More on Sifry's Alerts.

I remember when I was Chairman of Infoseek Japan, I would get a weekly list of the top 100 search words. I remember loving this list. You could watch trends and stuff, but mostly it made you realize just how sick people were. When I was around, the only US search term that beat adult content phrases was "Olympics" and the only Japanese query was "Tamagotchi" when it was all the rage.

Now uber-gadget-hacker Phillip Torrone has brought this experience to the street via the Search Engine Belt Buckle. It uses the SearchSpy service which shows real search queries and is provided by Dogpile, the metasearch engine.

I suppose this is slightly more useful than an RSS feed of my weight, but definitely harder to build.

I had a breakfast meeting with Professor Hirotaka Takeuchi about my doctorate program and I was taking notes in my moleskine notebook. I was jotting down just names and keywords and I think the professor thought it was a bit odd. I realized that taking notes with the intention of googling everything later is very different than taking complete notes. I had never noticed that I had started doing this.

Sifry's Alerts
Technorati and CNN

A few minutes ago CNN announced that Technorati will be providing real-time analysis of the political blogosphere at next week's Democratic National Convention. I will be on-site in CNN's convention broadcast center, along with Mary Hodder, and I'll be providing regular on-air commentary on what bloggers are saying about politics and the convention. And on Sunday, July 25, we'll launch a new section of our site for political coverage: This site will make it easy for bloggers, journalists, and anyone interested in politics to see the postings of the most linked-to political bloggers, to track the ideas with the fastest-growing buzz, and to monitor conversations in thousands of other political blogs. will link to this site, and we'll be updating the CNN site with the latest from the blogosphere.

Great news for us at Technorati and hats-off to CNN for taking this leap. Hopefully this will help people view blogging as a more "legitimate" source of news.

It's interesting to note that it was CNN which broke the big 3 TV network monopoly on news editorial by feeding local TV the raw video feeds, allowing them to edit the news themselves. Similarly, CNN providing bloggers the ability to reach the public directly may have an impact on the way media edits their news.

Obviously, an incentive to just be faster isn't better. I think we're going to get a chance to see whether Technorati authority management and the ability for blogs to fact check and manage news will be able to provide viewers of CNN with additional insight.

UPDATE: Here's the press release from CNN.

OK. I promise not to boast about every 1M blogs Technorati adds, but it's an opportunity to quote some interesting facts.

Sifry's Alerts
Technorati tracks 3 million blogs

On an average weekday, we're seeing over 15,000 new weblogs created per day. That means that a new weblog is created somewhere in the world every 5.8 seconds.

Of course, not all weblogs that are created are actively updated. Even though abandonment rates are high - our analyses show that about 45% of the weblogs we track have not had a post in over 3 months - we are still tracking a significant population of people who are posting each day. The number of conversations is increasing. We're seeing over 275,000 individual posts every day. That means that on average, more than 3 blogs are updated every second. The median time from when someone posts something to their weblog to when it is indexed and available for searches on Technorati is 7 minutes. And we're striving to handle the load. But to be perfectly frank, it isn't easy. We've had some bugs and some outages - and for that I am truly sorry. I don't think the service is fast enough or stable enough. So, stability and fast response time is job #1, over new features and product developments. It has to work, 100% of the time.
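The per-second figures in the quote are easy to check. Here is a minimal sketch of the arithmetic, using only the numbers quoted above (15,000 new weblogs a day, 275,000 posts a day, 3 million blogs tracked, about 45% without a post in over 3 months); the variable names are mine.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

new_blogs_per_day = 15_000
posts_per_day = 275_000
blogs_tracked = 3_000_000
inactive_percent = 45  # no post in over 3 months

# One new weblog roughly every 5.8 seconds.
seconds_per_new_blog = SECONDS_PER_DAY / new_blogs_per_day

# A little over 3 posts per second on average.
posts_per_second = posts_per_day / SECONDS_PER_DAY

# Blogs with at least one post in the last 3 months (integer arithmetic).
recently_active_blogs = blogs_tracked * (100 - inactive_percent) // 100

print(f"{seconds_per_new_blog:.2f} seconds per new weblog")
print(f"{posts_per_second:.2f} posts per second")
print(f"{recently_active_blogs:,} blogs posted in the last 3 months")
```

Both quoted claims hold up: 86,400 / 15,000 is about 5.8 seconds, and 275,000 / 86,400 is just over 3 posts per second.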


Nigritude Ultramarine. Here you go Anil.

Brin was no expert on international diplomacy. So he ordered a half-dozen books about Chinese history, business, and politics on and splurged on overnight shipping. He consulted with Schmidt, Page, and David Drummond, Google's general counsel and head of business development, then put in a call to tech industry doyenne Esther Dyson for advice and contacts. Google has no offices in China, so Brin enlisted go-betweens to get the message to Chinese authorities that Google would be very interested in working out a compromise to restore access. "We didn't want to do anything rash," Brin says. "The situation over there is more complex than I had imagined."

Four days later, Chinese authorities restored access to the site. How did that happen? For starters, the Chinese government was deluged with outcries from the nation's 46 million Internet users when access to Google was cut off. "Internet users in China are an apolitical crowd," says Xiao Qiang, executive director of New York-based Human Rights In China. "They tend to be people who are doing well, and they don't usually voice strong views. But this stepped into their digital freedom."

The quick workaround: Chinese authorities tweaked the national firewall, making the new Google China different from the site that was turned off. Today, Chinese who use Google to search on terms like "falun gong" or "human rights in china" receive a standard-looking results page. But when they click on any of the results, either their browsers are redirected to a blank or government-approved page, or their computers are blocked from accessing Google for an hour or two. "They have a new mechanism that can block the results of certain searches," Brin says. Did Google help China find or obtain the filtering technology? "We didn't make changes to our servers" is all he'll say.

Seth Finkelstein describes how Google self-censorship works. Also, Jonathan Zittrain and Benjamin Edelman of the Berkman Center for Internet & Society, Harvard Law School have a paper on Localized Google search result exclusions which is quite interesting.

I can understand from a business perspective why Google would do this, but whenever I bring this up with people they deny it or can't believe it.

Does anyone else have any more information on this?

PS This has nothing to do with trying to hurt Google or their IPO. I've been trying to figure this out for the last few weeks and have reached a dead end in my research so I'm trying to understand more. How companies like this work with governments and how this information is then disclosed is very important.

Google's S-1 is online. (Warning. Big file.)

via CNET

I've been messing around with A9, Amazon's search engine project. It integrates search inside the book, Alexa and the recommendation engine on Amazon so you can find web pages and Amazon will recommend other sites that you might like. Considering how "meta data savvy" Amazon is and how easily they can connect search to their core business, I can see A9 giving Google a pretty good run for their money.

Here's the "" on Amazon. I'm not sure whether I like the fact that they list my address and phone number. Also, I find the data on Alexa a bit sketchy. The traffic chart for my site is very noisy and doesn't track my actual traffic logs and it says my site is "very slow". (It isn't THAT slow is it?) I hope that Alexa gets better now that it's integrated into Amazon. On more famous sites like Boing Boing, they have ratings and reviews.

Christian Lindholm has some good thoughts on A9 on his blog.

Seth Godin
The New Google

Google changed their UI today. The scary thing is how wrong it feels. Obviously, the small changes aren't wrong, but the fact that you notice them is a testament to how spectacular the marketing of the "original" Google was.

The funny thing is... When I was chairman of Infoseek Japan, we would do user surveys every time we changed the UI, and almost EVERY time we did it just about 50% of the people hated the new UI. It was always a big let-down after spending so much time re-doing the UI. I think people get used to their tools and hate it when you muck with the design, even if it makes it better... at least unless it really sucked before the change.

New Technorati beta launches. New looks, new features. Go to to give it a whirl.

Yesterday I visited Google Japan's offices then later had dinner with Yajima-san, the CEO of Digital Advertising Consortium (DAC).

We talked a lot about the future of blogging as well as the good old days. Sato-san, who was recruited by Google to get the Tokyo office going, and Mr. Yajima both worked with me in the early days of getting Infoseek Japan going. We recruited Sato-san from Asatsu to start up Infoseek Japan inside of Digital Garage and Yajima-san was at Hakuhodo in charge of looking at the Internet advertising business.

My company Digital Garage had just lost our offer to do Yahoo Japan because Softbank invested in the parent company in the US and got to do Japan as part of the deal. Softbank offered to give us 1% of Yahoo Japan in exchange for helping them with Yahoo Japan, but we told them to take a hike. (In retrospect, maybe we should have done this deal.) Anyway, I shifted gears and we ended up with Infoseek. I was convinced that search engines were going to be the next big thing.

Softbank needed to get Yahoo Japan's business going so they joined forces with Dentsu, the biggest ad agency in Japan, to make an ad rep company called Cyber Communications Inc. (CCI). In response, we decided to set up a competing ad rep company. It turns out that the combined revenue of all of the non-Dentsu ad agencies is about equal to the total revenue of Dentsu. Hayashi-san, my partner at Digital Garage, and I gathered all of the other ad agencies together and put together the first consortium of its kind in Japan.

We spent close to six months explaining the concept of banner ads and ad impressions. I remember that we couldn't get the ad agencies to understand the notion of ad impressions and how ad prices should be set by page views and not page position. Yahoo was an easier sell because they framed their pitch in old-media terms, i.e., the "top page" will cost you X, sponsorship of section Y would cost you Z, etc. I remember explaining that we should be able to target ads based on what people are searching for and that eventually you would even be able to track click-through rates and disintermediate ad sales guys. No one believed me.

They did believe me enough to rally against Dentsu/Softbank and form DAC with us to sell Infoseek ads. The first year, our guys were in the market competing with Yahoo Japan, which had a clear head start, and we struggled to make a million dollars in sales. That was about seven years ago. Now DAC is public and I'm happy to hear they've got about $100M in revenue and are neck and neck with CCI.

Yesterday, we joked about how I was basically dreaming about Google AdSense and AdWords seven years ago. We also talked about how Steve Kirsch, the founder and Chairman of Infoseek, was right and the others were wrong. Steve wanted to keep working on the Infoseek search engine, but in the "portal days" Infoseek tried to become a media portal, hiring media people and eventually being acquired by Disney. Infoseek pursued a big media strategy and dropped its focus on search. It's not clear whether Infoseek would have been able to compete with Google, but if they had stayed "just a search engine" maybe they could have given Sergey and Larry a run for the money.

Anyway, seven years after I was getting all excited about search, the search engine has finally become an essential part of the Internet. Even Yahoo has built its own search engine. Too bad it's Google and not Infoseek. ;-p

Having said that, Infoseek Japan still exists and is the third largest portal in Japan after Yahoo and MSN. It is owned by Rakuten and I continue to actively advise the group. Infoseek Japan is a strong profitable portal business but alas, it uses Google for its search results. Considering the fact that most of the original search engine people are gone, I think that Sato-san, Yajima-san and I have probably been in the Internet search engine business longer than just about anyone else in the world... scary thought.

I'm sure everyone's seen this by now, but Yahoo has just rolled out their own search engine. I wonder what this will mean for Google. It sure does look a lot like Google, which I guess is good.

Sifry, "Blog this link." Cool Amazon hack. Dave whipped it together at 2am last night after a chat with John Battelle.

The cosmos that shows the people who have just blogged the link is here.

"Do you know about power laws? Well fuck it, I've got the data."

An interesting point David made was that there are a huge number of blogs with 4 links.

Ev writes about Google's new quicklink for whois. If you search for "whois [domain]" on Google you will get a link to a search on's Global Whois service for the domain name. I guess this is useful for people who won't touch a command line, but I don't think I'd ever use it. I will continue to open a terminal window and type "whois". That way I can continue to see whois spam too. ;-)

$ whois

Whois Server Version 1.3

Domain names in the .com and .net domains can now be registered
with many different competing registrars. Go to
for detailed information.


How many people who blog know that many blogs automatically send trackbacks or send pings to pinger sites like How many bloggers know that these pings trigger services like Technorati to include their posts in an index and that any mention of my blog in their private diary causes a link to their diary to show up in my sidebar within minutes? One of the things that some of us forget is that it's not all about attention. Most people want a little more attention than they get, but they usually want it from the right people and only when they feel like it. One of the problems of using the "big time bloggers" to design the technology is that we often forget that many people would rather NOT have their contexts collapsed.
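To make the mechanics concrete, here is a minimal sketch of what such a ping typically contains. It assumes the common `weblogUpdates.ping` XML-RPC convention, which takes just a blog name and a URL; the blog name and URL below are made up, and nothing is actually sent over the network.

```python
import xmlrpc.client

# Hypothetical diary details, for illustration only.
blog_name = "Example Diary"
blog_url = "http://example.com/blog/"

# Build the XML-RPC request body a blog tool would POST to a ping server.
# Assumption: the server follows the weblogUpdates.ping convention.
payload = xmlrpc.client.dumps((blog_name, blog_url),
                              methodname="weblogUpdates.ping")

# Decode it again to show exactly what the receiving service learns.
params, method = xmlrpc.client.loads(payload)
print(method)   # the method name
print(params)   # the blog name and URL
```

The whole message is little more than "this URL just changed", which is all an indexing service needs to come and fetch the new post. The author never sees any of this unless the tool tells them.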

I've recently had the experience of receiving inbound links from people who write very personal diaries. I struggled when trying to decide whether I should comment, link to them or otherwise shed attention on a conversation or monologue that appeared to be directed at someone other than me or my audience. A lot of people will say at this point that posting on the "world wide web" is publishing to the public and information wants to be free, yada yada... I would disagree. The tools are just not good enough yet. Live Journal has a feature that allows you to post entries that only your friends can see. I would love to be able to add special comments interspersed in my blog posts for only my close friends.

I know the point is to keep it as simple as possible, and I can already hear the arguments, but wouldn't it be useful if there was a way to manage your audience better on a blog by blog or a post by post basis? It might also make sense to be a bit more explicit to new bloggers/journalers about what the consequences of pinging/trackbacking are.

I remember a message board where activists were preparing to march in protest against the wiretap law in Japan. This message board showed up in search engine results. A well-meaning policeman dropped into the message board and mentioned that they might want to get a permit. The community was in flames about being "wiretapped". So this isn't a new problem. Just bigger. What technology actually does and what people expect it to do are very different so the "technically speaking" answer is not always the real answer. Also, the tensions caused by the technologies should be viewed as opportunities for the innovators.

If you go to google and search for "miserable failure" you get the bio of George W Bush. This is a bloggers' google bomb.

Newsday article on the topic

Thanks Kev for the clarification

David Sifry writes about growing pains at Technorati. He apologizes for the slow response, but assures us he's on the case.

When I mentioned on my post about ego-surfing Amazon that I wished I could see more context around where my name showed up in the books, Andy Baio pointed out in the comments that you could click on the page number and see the actual page. (Although RIO points out later that Amazon needs your credit card number before they let you do that.)

Anyway, I was looking at the various pages and found this picture taken by Philip Bailey of John C. Lilly with Barbara Lilly, Kazuo and me in The Scientist: A Metaphysical Autobiography by John Lilly. I'm sporting Anarchic Adjustment threads which were hip at the time and I was helping to distribute in Japan. If I remember correctly, they were having a conference about John C. Lilly's work in Tokyo. I remember lots of academics talking on and on about John Lilly and his work. When John was asked to make a comment at the end, he said, "you all know much more about me than I can remember so I don't have much to add. My forgetery is much bigger than my memory." I remember thinking that was very funny. John Lilly was a very smart and very funny man. I miss him.

It's kind of strange thinking about the path that this photo has taken. I remember Philip taking it, I think I remember seeing a print. Then it got published, printed, scanned, searched, downloaded and now blogged. I assume the copyright holder is Philip Bailey and I assume he doesn't mind me posting this.

PS: Philip, I can't seem to find your email address or your web page. If you see this, can you email me?

You've all probably read by now, but Amazon has added a feature that allows you to search the full text of over 120,000 books. Totally amazing. Now tell the truth everyone (so I don't look totally vain), how many of you have ego-surfed Amazon already? I searched for "joichi ito" and "joi ito". I got 8 results for "joichi ito" and 1 for "joi ito". The weird thing is that other than Timothy Leary's book and John C. Lilly's book, I have never heard of any of the other books. Also, the few books that I do know I'm mentioned in did not show up. I wonder if they are scanning books that don't sell well first. ;-) I DID find out that I have the honor of being in a "For Dummies" book.

Excerpt from page 170 of Digital Aboriginal
. . . voice to the radicals. Japanese information pioneer and digital artist Joichi Ito tells a great story about the CIA." An operative told . . .
How can I NOT buy this book to find out what they said about me. Ack!

The Financial Times
Google considers online IPO auction

By Richard Waters in San Francisco

Google is considering holding a massive online auction of shares early next year in an initial public offering that investment bankers predict could value the internet search-engine company at more than $15bn.

Holy cow. Does anyone have any more information on this?

I wonder if it's going to be a Dutch Auction IPO?

I just did a search on "Joi Ito" and got this as the first link. In fact, all the links on the page had a redirection component in the result links. Normally, Google gives the link to the website directly. Looks like they may now be starting to track clickthroughs. I repeated the search on a few other keywords, and I didn't find it again, so I guess it is one of those Google experiments.
Hey! Stop that!

Reading the comments on Rajesh's blog, it appears that Google does this regularly. I hope it's not personal, and I hope it doesn't have anything to do with Homeland Security. ;-p

To make my point here, I first have to admit that I often go to the Technorati top 100 page to see where my blog is ranked. I admit that it goes against my negative feelings about the power law, etc. and is a bit self-absorbed.

Anyway, when Technorati added Live Journal and surged in indexed blogs, my ranking dropped enormously and I was barely still on the list. Lately, I've slowly crawled back up. Recently I've been neck and neck with a blog called ":: i don't give a shit what you think :: ". It felt weird seeing, "I don't give a shit what you think Joi Ito's Web." ;-) I just noticed that I finally passed it. The funny thing about this list is that I seem to have (in my mind) a relationship to blogs close to me in ranking. I tend to read them to see who they are. I see some blogs slip down the list, and some others shoot up. I try to find what causes their rise and fall by looking at their Technorati cosmos.

Am I weird?

I was messing around surfing Google, trying to test something Dave Sifry told me about. The theory was that you actually couldn't get to all of the thousands of results Google says it has when you search for something. I searched for stuff and simply paged forward and found that in fact you did reach an end of search results rather quickly. The number varied so I tried it with "repeat search with omitted results." I found that you can get to exactly result number 999 and no results show after that. I felt like Jim Carrey in The Truman Show...

UPDATE: According to Adam Hill on IRC, it was on the Google Weblog too, but I can't seem to find the entry.

Russell Beattie, Jason Kottke, Cory and Aaron blog about the new Terms of Service for Google AdSense. AdSense is a way to allow you to sell advertising space on your own site to advertisers using Google AdWords. It's a cool service for blogs with lots of traffic to allow people to earn a little money.

The new Terms of Service allow Google to easily cancel your account and several people have complained that they have been shut down unfairly. More importantly, the new terms do not allow you to talk about AdSense in public or to contact any of the advertisers directly. This seems to go against Sergey Brin's Google rule #1: "Don't be Evil."

I think I remember Google saying at the beginning that they wouldn't do advertising... I guess I remember when Infoseek used to charge the user to search. ;-) Things change... But I hope Google realizes that stifling free speech about part of its service is evil and also stupid.

AP
Amazon invades Google's turf with Silicon Valley startup


(09-25) 17:13 PDT SAN FRANCISCO (AP) -- Amazon.com Inc. is invading Google's turf with a new online search engine company that hopes to pluck some of the profits pouring into the rapidly growing sector.

Seattle-based Amazon has dubbed its search startup "A9" and set up offices in Palo Alto, not far from Google's Mountain View headquarters. A9 hopes to launch in October with 30 employees and grow much larger as it develops a search engine that will be licensed to other Web sites, said spokeswoman Alison Diboll.

I wonder if this is going to be an html scraper or a web service/feed aggregator? I wonder if it will be the mother-of-all referral/affiliate marketing aggregators... but how can they do search and not compete with the business model of "all roads lead to Amazon"? I wonder if A9 means it's their 9th try...

In any event, writing a search engine from scratch at the dawn of real web services sounds like a lot of fun.

Via Google Weblog

Malach on #joiito was talking about surfing referral logs. I took a look at mine. It's pretty cool that I'm #1 when you google for "best headphones", but it's probably not such a good thing that I'm the 4th site when you google for "glock 23"...

In other "search news"...

Heard a rumor that Google Japan has moved into new digs in the posh Cerulean Tower with a Segway, massage chair, pool table, a lava lamp and everything. Congrats to all of the Infoseek Japan alum working at Google Japan. You've reached "search nirvana". I'm a bit envious. ;-)

Mitch Kapor and Tim O'Reilly are among advisory board members of Nutch, a new open source search engine project which will try to:

  • fetch several billion pages per month
  • maintain an index of these pages
  • search that index up to 1000 times per second
  • provide very high quality search results
  • operate at minimal cost
Sounds good to me!
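To get a feel for what those goals mean, here is a quick back-of-the-envelope calculation in Python. The project only says "several billion" pages per month, so the 3 billion figure and the 30-day month below are my own assumptions, not Nutch's numbers.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Assumptions for illustration only: "several billion" taken as 3 billion,
# and a month taken as 30 days.
pages_per_month = 3_000_000_000
days_per_month = 30

# Sustained fetch rate needed to hit the monthly crawl target.
fetch_rate = pages_per_month / (days_per_month * SECONDS_PER_DAY)

# Upper bound on daily queries if the 1000 searches/second peak were sustained.
queries_per_day = 1000 * SECONDS_PER_DAY

print(f"{fetch_rate:,.0f} pages fetched per second")
print(f"{queries_per_day:,} queries per day at the peak rate")
```

Under those assumptions the crawler has to pull down more than a thousand pages every second, around the clock, which makes "operate at minimal cost" the ambitious part.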

John Battelle at Business 2.0 says, "Watch Out, Google".

via Dave Winer at Scripting News.

Aaron Swartz@Google Weblog
Google now has a built in calculator that can tell you everything from 2+2 to speed of light in furlongs per fortnight to 2048 in binary.
Very cool!

Thanks for the link Alberto!
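If you want to check the calculator's answers yourself, the three examples are easy to reproduce in Python. This is just my own sketch using the standard definitions (a furlong is 660 feet, a fortnight is 14 days); it says nothing about how Google computes them.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458       # exact, by definition of the meter
METERS_PER_FURLONG = 660 * 0.3048          # 660 feet, at 0.3048 m per foot
SECONDS_PER_FORTNIGHT = 14 * 24 * 60 * 60  # 14 days

# 2+2
print(2 + 2)

# Speed of light in furlongs per fortnight (roughly 1.8 trillion).
c_furlongs_per_fortnight = (
    SPEED_OF_LIGHT_M_PER_S * SECONDS_PER_FORTNIGHT / METERS_PER_FURLONG
)
print(f"{c_furlongs_per_fortnight:.4e} furlongs per fortnight")

# 2048 in binary: a 1 followed by eleven 0s, since 2048 is 2 to the 11th.
binary_2048 = format(2048, "b")
print(binary_2048)
```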

I haven't really commented on the "should blogs be in Google search results" debate, but one random question. What is a blog? What's the technical difference (from the perspective of a search engine) between my blog and The Register? I don't see how you can "filter" blogs. You can obviously change the page ranking mechanism to give certain types of sites an advantage or disadvantage, but I don't see how you can filter blogs. My blog is just a bunch of html created by a content management system.

If more people think that the Google search results are poor because the top results are not "relevant", it means the ranking system is broken, not that something has to be "filtered". The whole point of a search engine is that it searches everything and finds the most relevant pages.

Doc just blogged about a thought I just had too. If the big print media put their archives online and made them crawlable and linkable, I bet their page rankings would go up. It's really the links between the archives of the blogs that give blogs so many links. The solution to googlewashing is probably more about getting other forms of journalism published in a more link-friendly way than filtering the blogs.

TouchGraph GoogleBrowser V1.01 is a cool Java tool to let you see your Google neighbors. Uses Google API. Reminds me a bit of the Blogstreet visual neighborhood.

Via Werblog