Science in the open?

Open Source Science? Or Distributed Science?:
[Via Common Knowledge]

I was asked in an interview recently about “open source science” and it got me thinking about the ways that, in the “open” communities of practice, we frequently over-simplify the realities of how software like GNU/Linux actually came to be. Open Source refers to a software worldview. It’s about software development, not a universal truth that can be easily exported. And it’s well worth unpacking the worldview to understand it, and then to look at the realities of open source software as they map – or more frequently do not map – to science.

The foundations of open source software are relatively easy to track. In the beginning, there was free software and Richard Stallman. RMS didn’t just invent the GPL as a legal tool; he wrote crucial foundational software for writing software, notably the GNU Compiler Collection, the GNU Debugger, and the original Emacs. So from the beginning, there was not only a free legal tool, but tools for coding that were better than other systems at the time.

Simultaneously, we can see that the emergence of microcomputers and ubiquitous access to the internet expanded the number (and interconnectivity) of potential programmers. Suddenly there were tens of thousands of programmers with computers at home and at work. The explosion of the Web saw the creation of infrastructure like code repositories, version control systems, and coding communities. Thanks to object-orientation, software was also very amenable to being broken into defined, modular chunks and tasks. One coder could work on a kernel function, another on a user interface function, a third on an application, and they could be reasonably sure that as long as they all followed the standards, their work would snap together into the growing distribution. The phrase “open source” can serve as a shorthand for this kind of innovation, which we also see in Wikipedia and other community-built projects.

Open source, if we view it through a different lens, is really more about a distributed methodology for software development. The burden of creation is widely distributed across a massive community with more-or-less equal access to tools and systems. In this context, the role of the legal tool is more akin to an enzyme. It was an essential piece of a puzzle, but it was not the only piece. In fact, without the rest of the infrastructure (connectivity, tools, and people) the legal tool on its own would not have led us to GNU/Linux.


A really nice discussion of the differences between Open Source approaches in high tech and the need for Distributed Sources in science. There has been a lot of overlap between the development of new high-tech tools and the development of tools in biotech.

Open source has its place, but making a network serve the needs of science is something that needs careful thought. Still, it should be possible.

After all, the Web was created at CERN to deal with distributed science.
