Open Research Handbook: Open collaboration and citizen science

A practical guide to Open Research

About open collaboration and citizen science

Online computing and research tools allow the researcher to provide direct public access to the research process. Websites, wikis and blogs, online research environments, and citizen science platforms can all be used variously to document and publish the primary processes and materials of research, and enable direct participation in research activities by wider groups of users.

Many of these tools create the possibility of a new kind of research, which extends beyond the closed group to a wider public, and enables the research process to be co-creative, massively collaborative, and to evolve in response to critical feedback.

The basic model of online open collaborative research applies to all research domains, not just the sciences. More specialised tools may be available to experimental scientists, but platforms such as Zooniverse and general collaborative tools can be just as effective in arts and humanities or social sciences research.

Citizen science and open collaboration

At its most basic and universal, online open collaboration is built around generic online platforms such as blogs and wikis, which allow public access to and participation in research.

One of the foundational examples is the Polymath Project, started by Cambridge mathematician Tim Gowers in 2009. A blog-based application of the crowdsourcing principle to mathematical problems, it demonstrated that problems could be solved much more quickly and efficiently if they were published and worked on online, with multiple contributors each bringing their own pieces of the puzzle and working together to complete the picture. This approach to solving scientific problems is discussed in Michael Nielsen's TEDx talk, Open science now!

The website, blog and wiki continue to be powerful tools for engaging audiences and involving people in research. They are well-suited to managing straightforward interactions. But they are not purpose-built to support research processes and have some limitations:

  • They may be insufficiently dynamic or flexible to manage research workflows and complex collaborative interactions between multiple participants;
  • They may lack version control, past state recovery and information export features, making it difficult to maintain a record of the research process, which may be essential for authentication and replication of results;
  • They may lack key features such as central document storage, content management functions, and access controls, meaning they have to be used in conjunction with other services that provide essential components.

For these reasons online platforms have emerged and developed with specialised sets of features purpose-built to meet specific research requirements.

The concept of open notebook science was introduced in 2006 by the chemist Jean-Claude Bradley. It was explicitly related to the Open Source software model, and defined by the existence of 'a URL to a laboratory notebook that is freely available and indexed on common search engines. It does not necessarily have to look like a paper notebook but it is essential that all of the information available to the researchers to make their conclusions is equally available to the rest of the world.'

A wide variety of Electronic Lab Notebooks (ELNs) is available, from generic tools to those that are designed to work with specific types of experiment, scientific instrumentation or data types. Some of these ELNs will require local installation and/or local management, and may be offered as free/Open Source products or subscription services, but there are a number of services that are fully web-based and available free to individual users or groups. An excellent overview of products is provided by the Gurdon Institute at the University of Cambridge.

Unlike paper-based lab notebooks, ELNs can be used to make experimental documentation openly accessible in a structured and usable format, either by export into document formats, or, in the case of some online services, by providing direct public access. Most ELNs have been designed around the model of a closed research group or project team, and so may not provide efficient workflows for making information publicly accessible or for enabling open collaboration. For example, the ELN RSpace allows documents or notebooks to be shared with members of a lab group and other RSpace users, but does not provide open collaborative access. It integrates with popular cloud storage services such as Dropbox and OneDrive, and with the online collaboration tool Slack, but, unlike the Open Science Framework, it does not allow an entire project to be shared.

The collaborative protocol tool protocols.io applies the version control model of a code repository platform such as GitHub to the experimental protocol. Protocols can be collaboratively developed in a closed group, and then released in public versions, which are assigned DOIs so that they can be cited from related publications. The public versions can be directly commented, but also forked (i.e. cloned) and modified, allowing for iterative and version-controlled open development and refinement of experiments.

The Open Science Framework is a full lifecycle research management platform, run by the non-profit Center for Open Science. It provides:

  • dashboard-based project management functionality, with access controls for closed and public collaboration, version control features, and project analytics;
  • a central document store with file sharing and version control;
  • integrations with Box, Dropbox, Google Drive and Amazon Web Services for cloud storage and compute, with GitHub for code management, with figshare and Dataverse for data repositories, and with Mendeley for reference management;
  • a pre-registration function for publishing time-stamped study designs;
  • a preprint server for rapid communication of results.

OSF has established itself as a popular platform, particularly in the health and social and behavioural sciences. This is due both to its usability as a total research environment, and to the role of the Center for Open Science as a champion of Open Research, notably through high-profile interventions such as the Reproducibility Project undertaken by COS founder Brian Nosek and colleagues, and through its development and advocacy of solutions for more reproducible and efficient science, including study pre-registrations and the registered reports publication model.

Citizen science is defined in the OED as 'scientific work undertaken by members of the general public, often in collaboration with or under the direction of professional scientists and scientific institutions'. There is no reason why this model of research should be confined to the sciences, although it is here that it has become best established. We could expand our definition to include all disciplines, and speak of citizen research or citizen scholarship.

There has also been growing use of online technologies in support of citizen science and public engagement projects, such as those hosted by Zooniverse, where projects that have large amounts of data in need of human analysis can leverage the processing power of the online crowd. This model of research is particularly suitable for projects that require basic analysis or processing of large amounts of data which cannot be undertaken by computer, for example, identifying features or patterns in images, or transcribing images of hand-written texts.

Online citizen science projects can facilitate reproducibility, open up new avenues of research, and lead to new insights. The 'wisdom of the crowd' principle can be used to mitigate human error by taking an average of values and eliminating anomalous outliers. Citizen scientists can identify new features in data, or be inspired to ask new questions and formulate new ways of solving problems. For example, an American high school student who took part in the University's Solar Stormwatch project created an algorithm to trace storm fronts in heliospheric image data, which is now being actively developed in collaboration with our scientists. [case study reference]
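The 'wisdom of the crowd' principle mentioned above can be illustrated with a minimal sketch in Python. It is not the method used by Zooniverse or any specific project: the function name, the sample readings, and the choice of the median absolute deviation with a cut-off of 3 as the outlier rule are all assumptions made for illustration.

```python
from statistics import mean, median


def robust_crowd_estimate(values, threshold=3.0):
    """Average crowd-submitted values after dropping anomalous outliers.

    Hypothetical helper: a value counts as an outlier when its distance
    from the median exceeds `threshold` times the median absolute
    deviation (MAD). The rule and the default cut-off are assumptions.
    """
    values = list(values)
    if not values:
        raise ValueError("at least one value is required")

    centre = median(values)
    # MAD is a measure of spread that a few extreme values barely affect.
    mad = median([abs(v - centre) for v in values])

    kept = [v for v in values if abs(v - centre) <= threshold * mad]
    discarded = [v for v in values if abs(v - centre) > threshold * mad]
    return mean(kept), discarded


# Invented example: six volunteers measure the same feature,
# and one of them submits an obviously anomalous reading.
readings = [10.1, 9.8, 10.3, 10.0, 42.0, 9.9]
estimate, discarded = robust_crowd_estimate(readings)
print(f"estimate: {estimate:.2f}, discarded: {discarded}")
```

With these invented readings the anomalous value of 42.0 is set aside and the estimate is the mean of the remaining five. A real project would choose its outlier rule to suit the data, and for classification tasks (rather than numeric measurements) a majority vote would play the role of the average.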

Useful links