RDFa and Creative Commons

Ito, Joichi

On the blog confused of calcutta and open... there are thoughtful posts here and here about my comments regarding RDFa (a W3C recommendation about how to express license and other information about content in XHTML) in my talk at DLD. I agree with a lot in the posts, and my only defense is that it's very hard to present a nuanced argument in 30 minutes, especially to an audience mostly predisposed to disagree with you, so I had to drastically over-simplify what we're doing. I posted a comment on the blogs and ~~it's stuck in moderation~~ so I decided to go ahead and convert the comment into a blog post.

Let me see if I can clarify a bit...

First of all, one of the risks we always run with Creative Commons is that people mistakenly believe a CC license somehow trumps things like fair use. In no way does the license itself diminish fair use. For one thing, it can't. For another, we would never want it to. One of the things we struggle with is how to make this clearer. See our blog post about the Gatehouse Complaint for an example of our position on this.

In my talk at DLD, I'm guilty of over-simplifying what RDFa does and how it might "prevent piracy". I think the primary thing that it does is expose copyright information to users, creators, software and services so that things like attribution, how to make payment for rights and other things that users might want to do will be easier and lower-friction. My assumption is that most users will want to follow the law if that choice is simple. If you are easily able to search for works that you may use legally, it is more likely you will opt to use these first. The primary driver for "protection against piracy" is the crowding out of content that is illegally used by content that wants to be used.

I think this happens in open source and free software. I think Wikipedia is a very good example in the content space. You can use Wikipedia for free, legally, and it makes it less likely you'll need to illegally access similar content that is not free to use. The process that is currently managed by hand and a few scripts on Wikimedia Commons is a very good example of what could be made easier with RDFa. Currently, if you are writing an article and you need an image, you go to Wikimedia Commons and look for the image or you upload one if one doesn't exist. A large number of volunteers scrub uploads and protect the Wikimedia Commons against copyright violations. When you upload a work, there are several ways that you "prove" it is a legitimate work for the commons. There is an uploader that will grab the metadata from Flickr, you can cite an email or other document and assert that the owner has licensed it under one of the free licenses, you can assert it is your own work and you are willing to license it under a free license, or you can make an argument that the use is fair use in one of their forms.

What RDFa would do in this context is let an uploaded object be checked easily for a free license. Such work would be immediately approved. Having your camera or Photoshop also be RDFa-aware would make it much easier for the uploaders, who often stumble through the Wikimedia forms to upload their own content only to have it erased because they filled out the licensing form incorrectly. If the work has no RDFa information with a free license, or none that can be verified, then you would need to go through some work to establish that the work was actually usable, fair use being one of the possible grounds, just as we do now.
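
To make the check described above a bit more concrete, here is a rough sketch of how software might look for a license assertion in a page. It relies on the common convention of marking a license link with rel="license", and it is not a full RDFa parser: it ignores the subject of the statement, prefixes, and everything else RDFa can express. The function names and the short list of "free" license prefixes are assumptions made up for illustration, not Wikimedia's actual acceptance policy.

```python
from html.parser import HTMLParser

# Hypothetical allowlist for illustration only; a real service would
# maintain its own list of licenses it accepts as "free".
FREE_LICENSE_PREFIXES = (
    "http://creativecommons.org/licenses/by/",
    "http://creativecommons.org/licenses/by-sa/",
)


class LicenseLinkParser(HTMLParser):
    """Collects the href of every element whose rel attribute includes 'license'."""

    def __init__(self):
        super().__init__()
        self.license_urls = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        # rel may hold several space-separated tokens, e.g. rel="license nofollow"
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        href = attr_map.get("href")
        if "license" in rel_tokens and href:
            self.license_urls.append(href)


def find_license_urls(markup):
    """Return all license URLs asserted in the markup, in document order."""
    parser = LicenseLinkParser()
    parser.feed(markup)
    parser.close()
    return parser.license_urls


def has_free_license(markup):
    """True if at least one asserted license matches the illustrative allowlist."""
    return any(
        url.startswith(FREE_LICENSE_PREFIXES)
        for url in find_license_urls(markup)
    )


sample = (
    '<div about="photo.jpg">Licensed under '
    '<a rel="license" href="http://creativecommons.org/licenses/by-sa/3.0/">'
    "CC BY-SA 3.0</a>.</div>"
)
print(find_license_urls(sample))
print(has_free_license(sample))
```

Note that the sketch only answers "is a free license asserted?". When nothing is asserted it returns False, which here means "fall back to manual review", not "this use is forbidden".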

RDFa is not like DRM in that I do not think it will prevent people from making illegal or fair use copies or remixes, but that instead it will reinforce and make easy the use and verification of free licenses and content and eliminate one cluster of illegal users who would use free content if there were a choice.

Also, I would add that I don't think DRM will be successful. My point with respect to DRM is that those people who decide to ignore copyright notices in metadata are probably just as likely to use software to strip DRM. Exposing the underlying intent of the copyright owner will make "good behavior" easier since preventing bad behavior is nearly impossible. Also, it will make it clear when you will need to defend the use of a work under fair use or something similar and when you are within the scope of the copyright license.

I'm sure the rigidity of the implementation will vary just as the "validity" of the HTML on the web varies widely. Ultimately, I think the best implementations will help the user be aware of all of the copyright information and help preserve and annotate this information with the least friction.

3 Comments

01. Don Park - Feb 02, 2009 - 00:24

Joi, there is a simple, albeit absurd at first glance, solution to address the key concern which is software telling users what they can't do.

If RDFa could only assert positives and never negatives, then RDFa aware software will know with certainty only what users can do, leaving the rest ambiguous enough to limit worst case software behavior to warnings.

02. karl - Feb 02, 2009 - 03:22

@don: RDFa is a language to manipulate vocabularies. Like any language, it can be abused, and people will abuse it. That's normal. It's part of the process (says the song). The important thing is to have a mechanism to declare things. Then the network effect and the community will find their own regulatory frameworks depending on the contexts.

03. Don Park - Feb 02, 2009 - 08:38

@karl: I understand the nature of RDFa as well as W3C (egad). My comment was to point out and apply to the key concern here the usefulness of intentional empty spaces in design, whether it's technical standards, music, or art.

I disagree that "the community will find their own...". It may. It may not. Chaos can be useful, like boiling hot water, but good ramen doesn't make itself.
