Mar 14, 2007

Open Data 2007

The Open Data 2007 session yesterday at Reuters, organized in part by Seth Goldstein, was interesting, as others have reported.

And while much of the debate centered on the implications of open data in the web world, I was surprised by how much surprise there was about the extent to which our behaviors and actions are being tracked, and the resulting data manipulated and analyzed, for example to provide targeted advertising or to track "buzz data" (people from Tacoda, Right Media, AggregateKnowledge and BuzzMetrics all participated in this event).

Perhaps naively, I believed that most people building or involved in web applications knew the extent to which these actions are being analyzed. Indeed, with the rise of web services, I would have thought that the availability of services to ingest the stream of data individuals are producing, and then output that stream in new ways, was not in question. And the extent of the use of cookies in online advertising was surely not news.

My assumptions were proven wrong at the Open Data sessions, which raises a more fundamental question about our data that was not answered. At a certain level I subscribe to the Jerry Garcia theorem of open data - when Garcia was asked many times how the Grateful Dead could allow the open taping and trading of tapes of their shows, he replied:

"Once we're done with it, you can have it"

Once he produced the music, it could freely be had. The corollary here is that once information is produced in this digital medium, it can be utilized by others. The original value to the producer comes with that initial act of producing. Then our interconnected, mashed-up world takes over. (Of course, Garcia was not referring to others profiting from his music downstream, so maybe this theory falls apart here.)

Nevertheless, I think I am in the minority in subscribing to this view. But maybe it's important to consider nonetheless. For example, Roger Ehrenberg's Monitor110 is doing fascinating things monitoring a breadth of content across the web, and then applying analysis and a presentation layer to create actionable value from this information for the investment community. If, for example, I am writing a lot about advertising and education, and somehow Roger's engine picks that up and correlates it with other info he then sells to hedge funds interested in private innovation in those segments, what's his obligation to me? To let me know? To compensate me? To allow me to opt out of his collecting activities? And if he has some obligation, will that then stifle innovation? Do we even care about that as an outcome? The issues relate not only to rights ($$), but to perhaps more important ones such as notice and consent. And, of course, privacy.

And the same type of example applies to the services BuzzMetrics provides, or the data AggregateKnowledge collects.

Ultimately I think in this example he (or any other similar provider) has no obligation to me, based on the Garcia Theorem described above. The social compact, if you will, is that my interest in the content is in the act of producing it, and not downstream from there. It's a cost of living and producing digitally. And if someone can utilize that data for new services, maybe the greater good is served.

Or maybe not.