Joi Ito's Web

Joi Ito's conversation with the living web.

TinyPictures, a company that I'm an investor in and have been involved in for a long time, just released integration with Flickr. The Radar team are really good at mobile apps and have focused for a long time on the social/sharing/comment part of sharing photos. With Flickr integration, the Radar app has now become my client of choice for Flickr reading and commenting on my Blackberry and iPhone.

Gratz to the Radar team and tons of thanks to the Flickr folks for helping out.

See the Radar Blog post for more information.

Congratulations to the Science Commons team and everyone involved for this amazing project. It's great seeing the Creative Commons "architecture of openness" being applied in increasingly sophisticated and diverse ways. Sharing data is really hard and really important and this is a huge step in the right direction.

Sage - Open Access Data from Merck

Posted on: February 27, 2009 12:33 PM, by John Wilbanks

Big news today at the CHI Medicine Tri-Conference. Merck has pledged to donate a remarkable resource to the commons - a vast database of highly consistent data about the biology of disease, as well as software tools and other resources to use it. The resources come out of work done at the Rosetta branch of Merck (you might remember them as the company whose sale capped a boom in bioinformatics) and is at its root a network biology system. In use inside Rosetta/Merck last year alone it led directly to a ton of publications.

This is all going to happen through the establishment of a non-profit organization called Sage to serve as the guardian of the resources. It's not about making a quick data dump onto the web, however. Sage is going to take a while, during an "incubation period of three to five years during which new project data are generated, critical tools for building and mining disease models are developed and governing rules for sharing, accessing, and contributing to the platform are established."

This is complex content and it's going to take some ongoing work to expose everything in a usable way. But the resources are headed for the public domain, and will be a remarkable capacity builder for those who currently work without the best tools and data as a base for their science. Sage means that we are now on the path to a world in which scientists working on HIV in Brazilian non-profit research institutes (like my mother-in-law) will be able to use the same powerful computational disease biology tools as those inside Merck. I'm very much looking forward to living in that world.


Also, on the topic of Science Commons, I just realized I hadn't blogged about GreenXchange. See more about it on the Creative Commons blog.

We've been working with YouTube on this for a long time. Kudos to everyone who helped make this happen. It's very big news for us and a huge step forward.

YouTube just made an incredibly exciting announcement: it's testing an option that gives video owners the ability to allow downloads and share their work under Creative Commons licenses. The test is being launched with a handful of partners, including Stanford, Duke, UC Berkeley, UCLA, and UCTV.

We are always looking for ways to make it easier for you to find, watch, and share videos. Many of you have told us that you wanted to take your favorite videos offline. So we've started working with a few partners who want their videos shared universally and even enjoyed away from an Internet connection.

Many video creators on YouTube want their work to be seen far and wide. They don't mind sharing their work, provided that they get the proper credit. Using Creative Commons licenses, we're giving our partners and community more choices to make that happen. Creative Commons licenses permit people to reuse downloaded content under certain conditions.

Visit YouTube's blog for information. And if you're a partner who wants to participate, fill out the YouTube Downloads - Partner Interest form.

As I read The Black Swan by Nassim Nicholas Taleb and a draft of Joshua Ramo's new book, I notice a common theme in many of the good books that I'm reading. Most significant events are not predictable. "Education" and the notion that we actually understand the world cause us to be unprepared for the unpredictable. Science, which makes a great attempt at trying to make the world appear predictable, is really a rough approximation of things so that our simple minds can try to grasp the complex world around us. It also reminds me of Science in Action by Bruno Latour, which I wrote about years ago, which argues that scientific facts are really a product of a very social and political process and aren't really a kind of channeling of mother nature as they might appear to be.

In The Way of Zen, Alan Watts has a wonderful explanation of how western science and philosophy and words themselves take the unknowable "void" and turn it into "rigorous" and "understandable" abstractions of a world which can't really be described by science or words. In a way, everything we write or argue is a version of the "assume a frictionless surface" or, as Joshua says in his book, "imagine a spherical cow" jokes about physicists failing at describing solutions to real-world problems. All of our theories are very incomplete models of the real world, and really getting close to understanding the real world requires a kind of "unlearning" and a connection with the real world at an intuitive and "uneducated" level.

Immersion and mindfulness are really important ways to see things that you normally don't see. I think it was Thich Nhat Hanh who said that a monastery is not a good place to learn to meditate because anyone can meditate in a monastery. (This might have been the Dalai Lama... I can't find the reference right now.) It is through learning mindfulness and meditation when there is chaos, suffering and pressure, that we really learn.

In a way, part of the reason for my moving to the Middle East was that while I continue to learn in any environment, days that I spend in the US or Japan tend to be mostly similar to previous days and relatively predictable, pushing me towards the somewhat typical mode of feeling in control or knowledgeable about what's going on.

What I find fascinating (and stressful) is that every day I spend in the Middle East is completely full of surprises and pushes me closer and closer to the understanding that I really don't understand anything. Sort of the pure idiot mode. In a way, I've become more aware and much more mindful of everything. One effect of this is that I have less and less fear of the unpredictable and the unknown and unknowable.

I'm still really at the beginning of my immersion process, but chatting with everyone about my experiences in Dubai and reading some of the books that I brought with me helped me tie together some of these thoughts and reflect, so I thought I'd share. ;-)

On the blog confused of calcutta and open... there are thoughtful posts here and here about my comments regarding RDFa (a W3C recommendation about how to express license and other information about content in XHTML) in my talk at DLD. I agree with a lot in the posts, and the only defense I have is that it's very hard to present a nuanced argument in 30 minutes, especially to an audience mostly predisposed to disagree with you, and I had to drastically over-simplify what we're doing. I posted a comment on the blogs and it's stuck in moderation, so I decided to go ahead and convert the comment into a blog post.

Let me see if I can clarify a bit...

First of all, one of the risks we always run with Creative Commons is that people mistakenly believe that a CC license somehow trumps things like fair use. In no way does the license itself diminish fair use. For one thing, it can't. For another, we would never want it to. One of the things we struggle with is how to make this clearer. See our blog post about the Gatehouse Complaint for an example of our position on this.

In my talk at DLD, I'm guilty of over-simplifying what RDFa does and how it might "prevent piracy". I think the primary thing that it does is expose copyright information to users, creators, software and services so that things like attribution, how to make payment for rights and other things that users might want to do will be easier and lower-friction. My assumption is that most users will want to follow the law if that choice is simple. If you are easily able to search for works that you may use legally, it is more likely you will opt to use these first. The primary driver for "protection against piracy" is the crowding out of content that is illegally used by content that wants to be used.

I think this happens in open source and free software. I think Wikipedia is a very good example in the content space. You can use Wikipedia for free, legally, and it makes it less likely you'll need to illegally access similar content that is not free to use. The process that is currently managed by hand and a few scripts on Wikimedia Commons is a very good example of what could be made easier with RDFa. Currently, if you are writing an article and you need an image, you go to Wikimedia Commons and look for the image or you upload one if one doesn't exist. A large number of volunteers scrub uploads and protect the Wikimedia Commons against copyright violations. When you upload a work, there are several ways that you "prove" it is legitimate work for the commons. There is an uploader that will grab the metadata from Flickr, you can cite an email or other document and assert that the owner has licensed it under one of the free licenses, you can assert it is your own work and you are willing to license it under a free license, or you can make an argument that the use is fair use in one of their forms.

In this context, RDFa would let an uploaded object be checked easily for a free license. Such work would be immediately approved. Having your camera or Photoshop also be RDFa-aware would make things much easier for the uploaders who often stumble through the Wikimedia forms to upload their own content, only to have it erased because they filled out the licensing form incorrectly. If the work has no RDFa information pointing to a free license, or none that is verifiable, then you would need to go through some work to verify whether the work was actually free to use, with fair use being one of the possible grounds, just as we do now.
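To make the idea a bit more concrete, here is a rough sketch of what such a check might look like. It is only an illustration under several assumptions: it looks at nothing but rel="license" links in a fragment of XHTML, it ignores the rest of RDFa (such as the about attribute that says which work the license applies to), and the list of "free" license prefixes and the function names are made up for this example rather than taken from Wikimedia Commons or any Creative Commons tool.

```python
from html.parser import HTMLParser

# Hypothetical allow-list for this sketch: license URL prefixes (scheme removed)
# that we treat as "free". A real service would maintain its own policy.
FREE_LICENSE_PREFIXES = (
    "creativecommons.org/licenses/by/",
    "creativecommons.org/licenses/by-sa/",
    "creativecommons.org/publicdomain/",
)


class LicenseLinkParser(HTMLParser):
    """Collects the href of every <a> or <link> whose rel includes 'license'."""

    def __init__(self):
        super().__init__()
        self.license_urls = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attr_map = dict(attrs)
        # rel may hold several space-separated tokens, or be missing entirely.
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        href = attr_map.get("href")
        if "license" in rel_tokens and href:
            self.license_urls.append(href)


def find_license_urls(markup):
    """Return every license URL declared in the markup, in document order."""
    parser = LicenseLinkParser()
    parser.feed(markup)
    parser.close()
    return parser.license_urls


def is_free_license(url):
    """True if the URL, with its scheme removed, starts with an allowed prefix."""
    normalized = url.strip().lower()
    for scheme in ("https://", "http://"):
        if normalized.startswith(scheme):
            normalized = normalized[len(scheme):]
            break
    return normalized.startswith(FREE_LICENSE_PREFIXES)


def has_free_license(markup):
    """True if at least one declared license is on the allow-list."""
    return any(is_free_license(url) for url in find_license_urls(markup))


sample = (
    '<div about="photo.jpg">'
    '<a rel="license" href="http://creativecommons.org/licenses/by/3.0/">CC BY 3.0</a>'
    "</div>"
)
print(has_free_license(sample))
```

An upload with no license link, or with one that is not on the list, simply falls through to the manual review that volunteers do today. The sketch only reads what the markup claims; it does not verify that the claim is true.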

RDFa is not like DRM in that I do not think it will prevent people from making illegal or fair use copies or remixes, but that instead it will reinforce and make easy the use and verification of free licenses and content and eliminate one cluster of illegal users who would use free content if there were a choice.

Also, I would add that I don't think DRM will be successful. My point with respect to DRM is that those people who decide to ignore copyright notices in metadata are probably just as likely to use software to strip DRM. Exposing the underlying intent of the copyright owner will make "good behavior" easier since preventing bad behavior is nearly impossible. Also, it will make it clear when you will need to defend the use of a work under fair use or something similar and when you are within the scope of the copyright license.

I'm sure the rigidity of the implementation will vary just as the "validity" of the HTML on the web varies widely. Ultimately, I think the best implementations will help the user be aware of all of the copyright information and help preserve and annotate this information with the least friction.