RDFa

Linked data and staff contact pages

You may remember that I am interested in the extent to which we should use Semantic Web (or Linked Data) on the DCC web site.

Semantically richer PDF?

PDF is very important for the academic world, being the document format of choice for most journal publishers. Not everyone is happy about that, partly because reading page-oriented PDF documents on screen (especially that expletive-deleted double-column layout) can be a nightmare, but also because PDF documents can be a bit of a semantic desert. Yes, you can include links in modern PDFs, and yes, you can include some document or section metadata. But tagging the human-readable text with machine-readable elements remains difficult.

Science publishing, workflow, PDF and Text Mining

… or, A Semantic Web for science? It’s clear that the million or so scientific articles published each year contain lots of science. Most of that science is accessible to scientists in the relevant discipline. Some may be accessible to interested amateurs. Some may also be accessible (perhaps in a different sense) to robots that can extract science facts and data from articles, and record them in databases.

Novartis/Broad Institute Diabetes data

Graham Pryor spotted an item on the CARMEN blog, pointing to a Business Week article (from 2007, we later realised) about a commercial pharma company (Novartis) making research data from its Type 2 Diabetes studies available on the web. This seemed to me an interesting thing to explore (as a data person, not a genomics scientist), both for what it was, and for how they did it.

IJDC again

At the end of July I reported on the second issue of the International Journal of Digital Curation (IJDC), and asked some questions:

The DCC is funded by

Joint Information Systems Committee