CC Complexity

I wanted to drop a simple link on my blog to show a Creative Commons license. Not so simple, as it turns out. I expected a parameter-based URL describing my license (e.g. “author=Scott&url=blah&…”), which would make the URL easy to construct and easy to understand. Instead, Creative Commons has chosen to overcomplicate the linking process, using RDF microformats to describe your license attributes and scraping your page to display your license.

Yes, that’s right, each license click-through triggers a scrape of your page by Creative Commons. Horrible. Absolutely horrible.
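
For contrast, here is a minimal sketch of the kind of parameter-based link I had in mind. The endpoint and the parameter names (`author`, `url`, `license`) are made up for illustration; nothing like this is offered by Creative Commons as far as I know.

```python
from urllib.parse import urlencode

# Hypothetical endpoint, used only to illustrate the idea.
BASE = "https://license.example/show"

def build_license_url(author, work_url, license_code):
    """Encode every license attribute directly in the query string."""
    params = {"author": author, "url": work_url, "license": license_code}
    return BASE + "?" + urlencode(params)

link = build_license_url("Scott", "http://blog.example/scott/", "by-sa/3.0/us")
print(link)
```

Everything the license page needs travels in the link itself, so nothing has to be fetched from my server when somebody clicks it.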

The only thing of interest for me is how they get it to work. My understanding of XMLNS and RDF is close to zero, but my initial glance tells me it’s anything but simple.

Here’s what I see.

This code:

<a rel="license" href=""><br />
<img alt="Creative Commons License" style="border-width:0" src="" /><br />
</a><br />
<br />This work by<br />
<a xmlns:cc="" href="/" property="cc:attributionName" rel="cc:attributionURL">Scott Manjourides</a> is licensed under a<br />
<a rel="license" href="">Creative Commons Attribution-Share Alike 3.0 United States License</a>.

Will trigger this request from your client (browser):

That request, in turn, causes a page request from Creative Commons to my server. The Apache log on my server proves it: - - [06/Mar/2008:10:18:41 -0800] "GET /scott/ HTTP/1.0" 200 37985 "-" "Python-urllib/1.16"

The result of this scrape request is a list of attributes describing my license:

({"morePermissionsAgent": "", "morePermissions": "", "attributionUrl": "http:\/\/\/scott\/", "morePermissionsDomain": "", "commercialLicense": "", "attributionName": "Scott Manjourides", "licenseUrl": "http:\/\/\/licenses\/by-sa\/3.0\/us\/", "allowAdvertising": false})
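
The response is ordinary JSON wrapped in parentheses, presumably so a script on the page can evaluate it directly. A small sketch of reading it in Python follows. The sample string is a shortened stand-in with a placeholder host, not the exact response above, and `parse_scrape_response` is a name I made up.

```python
import json

def parse_scrape_response(body):
    """Strip the surrounding parentheses, then decode the JSON inside."""
    text = body.strip()
    if text.startswith("(") and text.endswith(")"):
        text = text[1:-1]
    return json.loads(text)

# Shortened sample with a placeholder host (assumption, not the real data).
SAMPLE = (
    r'({"attributionUrl": "http:\/\/blog.example\/scott\/", '
    r'"attributionName": "Scott Manjourides", "allowAdvertising": false})'
)

info = parse_scrape_response(SAMPLE)
print(info["attributionName"], info["attributionUrl"])
```

The `\/` sequences are a legal JSON escape for a plain slash, so they disappear once the string is decoded.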

That’s all I can deduce at this point. I don’t know where the /scrape URI is generated, but it is coming from the client browser.
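
As a guess at what the scraper does with my page, here is a sketch that pulls the same attributes out of the markup using Python's standard `html.parser`. This is not Creative Commons' actual code. The class name and the sample URLs are hypothetical, and real RDFa processing is more involved than matching attribute values.

```python
from html.parser import HTMLParser

class LicenseAttrParser(HTMLParser):
    """Collect license-related attributes from <a> tags."""

    def __init__(self):
        super().__init__()
        self.license_urls = []
        self.attribution_url = None
        self.attribution_name = None
        self._in_name = False

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        a = dict(attrs)
        rel = (a.get("rel") or "").split()
        if "license" in rel:
            self.license_urls.append(a.get("href"))
        if "cc:attributionURL" in rel:
            self.attribution_url = a.get("href")
        if a.get("property") == "cc:attributionName":
            # The link text that follows is the attribution name.
            self._in_name = True
            self.attribution_name = ""

    def handle_data(self, data):
        if self._in_name:
            self.attribution_name += data

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_name = False

# Placeholder URLs; the real ones are not reproduced here.
SAMPLE = (
    '<a rel="license" href="http://license.example/by-sa/3.0/us/">License</a> '
    'This work by <a xmlns:cc="http://ns.example/cc#" '
    'href="http://blog.example/scott/" property="cc:attributionName" '
    'rel="cc:attributionURL">Scott Manjourides</a>'
)

parser = LicenseAttrParser()
parser.feed(SAMPLE)
parser.close()
print(parser.attribution_name, parser.attribution_url, parser.license_urls)
```

The `rel` and `property` values in the markup line up with the `licenseUrl`, `attributionUrl` and `attributionName` fields in the scrape result, which is presumably why the page has to be fetched at all.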

I’m no web genius, but this process seems way too complex to simply display a page describing my license.