Open data

As part of her role at Bowker, Laura Dawson has been doing some interesting work on identifiers. That work led her to look at developments in the semantic web, where she is part of an effort to better describe books in ways that support search and discovery.

Laura recently sent me a link to 5 Star Open Data, a relatively simple web site dedicated to describing various levels of development in the provision of open data on the web. From least to most useful, the site lists characteristics for the five levels (verbatim):

  1. Make your stuff available on the Web (whatever format) under an open license
  2. Make it available as structured data (e.g., Excel instead of image scan of a table)
  3. Use non-proprietary formats (e.g., CSV instead of Excel)
  4. Use URIs to identify things, so that people can point at your stuff
  5. Link your data to other data to provide context

The various levels were proposed by Sir Tim Berners-Lee, who gets credit for bringing us the web. The site provides specific examples as well as advantages and disadvantages (generally cost and complexity) for preparing data that qualifies for each of the five levels.
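To make the upper levels more concrete, here is a minimal sketch, not taken from the site, of how a single book record might move from level three (a CSV row) to level four (a record with its own URI) and level five (a record that links to data published elsewhere). Everything specific in it is an assumption made for illustration: the sample record, the `example.org` URIs, and the helper names `to_four_star` and `to_five_star` are hypothetical. The `@id` key is borrowed from the JSON-LD convention for naming things with URIs, but the output is not meant to be a complete linked-data document.

```python
import csv
import io
import json

# Level 3: a non-proprietary CSV table (hypothetical sample record).
CSV_TEXT = "isbn,title,author\n9780000000002,An Example Book,A. N. Author\n"

# Hypothetical base URI; example.org is a domain reserved for illustration.
BASE_URI = "https://example.org/book/"


def to_four_star(row):
    """Level 4: give the record a URI so that others can point at it."""
    return {
        "@id": BASE_URI + row["isbn"],
        "title": row["title"],
        "author": row["author"],
    }


def to_five_star(record, author_uri):
    """Level 5: link the record to other data to provide context."""
    linked = dict(record)
    # Replace the plain author string with a link to a (hypothetical) author record.
    linked["author"] = {"@id": author_uri, "name": record["author"]}
    return linked


rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
four_star = to_four_star(rows[0])
five_star = to_five_star(four_star, "https://example.org/person/a-n-author")

print(json.dumps(five_star, indent=2))
```

The point of the sketch is the shape of the change rather than the particular format: at level four the book has an address of its own, and at level five its author is no longer just a string but a pointer to something that can be described elsewhere.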

This work focuses on open access, but it is interesting on several levels. It struck Laura that most publishers distributing documents today (whether as open data or not) are working at the first level: just put it on the web. Structured formats, open formats, the use of URIs and the linking of documents to other documents are all well beyond what is being done now.

Tim Berners-Lee has a pretty good track record of seeing the potential in a well-structured system. We might start thinking more seriously about linked data. At least Laura is already there.

About Brian O’Leary

Founder and principal of Magellan Media Consulting, Brian O’Leary helps enterprises with media and publishing components capitalize on the power of content. A veteran of more than 30 years in the publishing industry and a prolific content producer himself, Brian leverages the breadth and depth of his experience to deliver innovative content solutions.