Text mining? No, it's MINE!

Computers can rapidly scan through thousands of research papers to make useful connections, but work is being slowed by publishers' unease, according to Richard Van Noorden's article 'Trouble at the text mine' in Nature 483, 134-135 (08 March 2012) doi:10.1038/483134a.

While publishers are happy to sell their content to institutions, they generally do not want that same content crawled by text-mining software. Agreements are being hammered out on a one-to-one basis, which cannot scale in a world where many researchers want to mine data. A bottleneck is forming while the publishers make up their minds. Read how this plays out.

Cameron Neylon summed up what is probably many people's reaction in his post 'They.Just.Don't.Get.It.' As he says: "This permission seeking will fail because it cannot scale in the same way that the demand will."

One frustrated researcher has now e-mailed all the main science publishers for permission to mine their content. He will log their responses online in the hope of raising awareness of the problem.

Peter Murray-Rust has posted an update on the issue on his blog, where it has been bubbling away for a while.