Aaron and Daiji Hirata talking about UTF-8
Had lunch with Aaron Swartz.
Aaron Swartz is a teenage writer, coder, and hacker. In 1999, he won the ArsDigita Prize for excellence in building non-commercial web sites. In 2000, he co-authored the RSS 1.0 specification, now used by thousands of sites to notify their readers of updates. In 2001, he joined the W3C's RDF Core Working Group which is developing the format for the Semantic Web. In 2002, he became the Metadata Advisor to the Creative Commons. The rest of the time, he works on a variety of other projects.
He has a Weblog and also runs the Google Weblog. Larry sent me email to make sure that we met while Aaron was visiting Japan. Thanks Larry!

So, we talked a lot about RSS. RSS 2.0 isn't truly XML compliant, but even one of the co-founders of XML, Tim Bray, uses Perl regexes to parse XML a lot of the time and doesn't bother with the formalities of running a true XML parser.

Now here's the dirty secret; most of it is machine-generated XML, and in most cases, I use the perl regexp engine to read and process it. I've even gone to the length of writing a prefilter to glue together tags that got split across multiple lines, just so I could do the regexp trick.
Well, if one of the founders of XML thinks that XML parsers are a pain, they probably are. Most RSS news feeders do not parse RSS as XML; they just clean it up, figure it out, and don't reject feeds that aren't XML compliant. I have a feeling that these standards committees, while very important, are starting to get away from the original spirit of the Internet: "keep it simple, make it work".
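To make the contrast concrete, here is a minimal Python sketch of the two approaches: a strict XML parse that rejects a malformed feed, and a regex scrape in the spirit of what Tim describes that pulls the titles out anyway. The feed fragment, the function names, and the regex are made up for illustration; this is not anyone's actual news reader code.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical feed fragment: the bare "&" means it is not well-formed XML.
BROKEN_FEED = """<rss version="2.0"><channel>
<item><title>Lunch & RSS</title></item>
<item><title>Second post</title></item>
</channel></rss>"""

# Non-greedy match so each <title>...</title> pair is captured separately.
TITLE_RE = re.compile(r"<title>(.*?)</title>", re.DOTALL)

def titles_strict(feed):
    """Parse as real XML; raises ET.ParseError if the feed is not well-formed."""
    root = ET.fromstring(feed)
    return [t.text for t in root.iter("title")]

def titles_lenient(feed):
    """Regex scrape: ignores well-formedness and just grabs <title> contents."""
    return [m.strip() for m in TITLE_RE.findall(feed)]

def titles(feed):
    """Try the strict parser first, then fall back to the regex scrape."""
    try:
        return titles_strict(feed)
    except ET.ParseError:
        return titles_lenient(feed)

print(titles(BROKEN_FEED))
```

The fallback is the "clean it up and figure it out" behavior: a feed with a stray ampersand still yields its titles instead of an error. The price is that the regex knows nothing about entities, CDATA, or attributes on the tag, which is exactly the kind of thing a real parser handles for you.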

Aaron is a no-bullshit guy who spent a lot of time with the W3C folks trying to get them to understand why RSS was so important. Well, I say, let's get on with it and just make it all work, even if it isn't formal XML. RSS is hot right now and wide adoption could revolutionize everything from digital cameras to DRM.

8 Comments

Link at "he works on a variety of other projects" doesn't seem to work. Consider checking . . .

Thanks Karl-Friedrich. Fixed.

Argh!
Yes, RSS is VERY important! Especially RSS 1.0! Why? Because it is RDF based! That said, it isn't RSS that is so important, it is RDF! See the forest for the trees! The true power of RSS 1.0 comes from the framework that RDF provides.

(for example, visit http://www.benhammersley.com/ and look into his RSS 1.0 feed... looky at all the possibilities!)

We neeeed standards to move forward now! Granted those standards need to be simple (something that RDF is arguably not).

The W3C's procedures are almost entirely open to the public (to a point). I strongly urge any and all to get involved! Read the specs (they aren't THAT hard to read, or at least to get a sense of what they imply), sign up for the public mailing lists, etc! Please get involved! :)

http://www.w3.org/
http://www.w3.org/Consortium/#public
http://lists.w3.org/Archives/Public/

Boris. I agree about the importance of standards, but they seem to be getting more and more unwieldy. I'm not really in a very good position to talk since I have never made a whole-hearted effort to penetrate the W3C or the IETF, but it seems to me from the sidelines that they are becoming shadows of many of the big huge standards committees that the Internet so elegantly did an end run around with TCP/IP vs. the CCITT stuff.

I like RSS because it works. How much of the stuff coming out of W3C and the XML stuff REALLY works? Do people REALLY use XML parsers? Does it REALLY make sense for me to convert my web page to XHTML?

I'm pretty open to being convinced otherwise and I WILL go and take a better look at the links you posted. I agree that reinventing the wheel is stupid, but as someone who worked on SGML before using HTML, I think that simplicity has almost infinite value if the complex stuff isn't getting adopted...

I do agree with you and with developers who bemoan the "complexity" of RDF and other W3C initiatives. My argument however is that the W3C is *proposing* standards, based on what it and its membership of engineers and developers, in all their experience, deem to be "what's needed". Let me reinforce this: they *propose*. They do not force it upon us.

As said, the Working Draft to Final Recommendation process is open for anyone to comment on. In fact, no W3C proposals go final without having EVERY single public inquiry addressed and responded to.

That said, if RDF, XML, XHTML prove to be really too unwieldy for the developers (which is entirely possible), they should let the W3C know, and not just turn their backs.

Where would the Internet be if engineers turned their backs on the IETF and their RFC system?

We have the W3C, a truly wonderful organisation and resource: let's make use of it! It is imperative! Granted, they could use some updated interaction processes... A streamlined RFC process. They need to keep in mind that unlike the IETF, many people who want to interact with them are non-engineers (and it requires a totally different manner of information presentation to interface with non-engineers/computer scientists... strange people they are... ;)

To wrap up, yes simplicity is a key ingredient. Consensus is also... and so is moving forward.

Funny, my thoughts turn now to the importance of simple and effective I/O to allow for a critical mass of usership in order to "allow" emergence... W3C streamlines its communications interfaces, more minds contribute and converse and decide, and boom.

I'm probably missing something here, but what's wrong with using XSLT/XPath to parse and/or filter RSS? That works fine, doesn't it?

Mike: What you are missing is that you'll have to write XSLT/XPath sheets for each flavor of RSS (0.91, 0.92, 1.0, 2.0) ...

The thing is, RSS 1.0 is the only flavor of RSS which follows the W3C's RDF framework, hence, you can be sure your XSLT/XPath will work with all RSS 1.0 files.
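As a rough illustration of the point about flavors, here is a minimal Python sketch that uses ElementTree paths rather than XSLT. It shows why one path does not cover both families: RSS 1.0 items live in the `http://purl.org/rss/1.0/` namespace as children of `rdf:RDF`, while RSS 0.91, 0.92, and 2.0 items are un-namespaced children of `channel`. The two feed snippets, the example.org URLs, and the `item_titles` helper are made up for the example.

```python
import xml.etree.ElementTree as ET

RSS10_NS = "http://purl.org/rss/1.0/"  # namespace defined by the RSS 1.0 spec

# Hypothetical RSS 2.0 feed: no namespace, items nested inside <channel>.
RSS20 = """<rss version="2.0"><channel><title>Feed</title>
<item><title>Hello from 2.0</title></item></channel></rss>"""

# Hypothetical RSS 1.0 feed: namespaced, items are siblings of <channel>.
RSS10 = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
 xmlns="http://purl.org/rss/1.0/">
<channel rdf:about="http://example.org/"><title>Feed</title></channel>
<item rdf:about="http://example.org/1"><title>Hello from 1.0</title></item>
</rdf:RDF>"""

def item_titles(feed):
    """Return item titles, choosing a path based on the root element."""
    root = ET.fromstring(feed)
    if root.tag == "rss":
        # RSS 0.91 / 0.92 / 2.0 layout
        path = "./channel/item/title"
    else:
        # RSS 1.0 layout: every element name carries the namespace
        path = "./{%s}item/{%s}title" % (RSS10_NS, RSS10_NS)
    return [t.text for t in root.findall(path)]

print(item_titles(RSS20))
print(item_titles(RSS10))
```

This only covers the namespaced versus un-namespaced split; the differences among 0.91, 0.92, and 2.0 themselves are not shown here.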

Also, it allows you to, saaay, insert FOAF info in your RSS feed.. or RDF:Bio information.. or Geo... or WebID ... All we need is the tools to do the reading and writing. All we need is one more "killer use" for RDF and it'll be on its way...

Hi Boris.

I see. Thanks very much for your response!

"All we need is one more "killer use" for RDF and it'll be on it's way"

Yeah. In fact, I'd say there's already a "killer need" for a universal metadata model.

Some people say blogs just add noise to the net. I disagree, although I concede that the argument could be made semi-cogently. As a counter, I would point out that ironically, blogs may have given rise to a technology which meets the need for a universally consumable data format that adds to the structure and organization of the net, thereby making it much more useful.

thanks again,
mike
