Why are we a FAIR organization?
21 June 2022
Rapid technical development means that neuroscience datasets are growing ever larger and more complex, which makes them harder to store, analyze, and share. However, if data are organized, well defined, and described in a standardized way, computational methods can help. Standards are needed at all levels: for data collection and description, for reporting methods and documenting workflows, and for describing and sharing the code used for data processing and analysis.
This is where FAIR comes in. The FAIR Principles grew out of the FORCE11 and INCF communities, which shared members with a strong interest in openness and the sharing of data and code. Our GB Chair, Maryann Martone, gives her personal reflections on the impact of the two communities on FAIR in this recent Comment in GigaScience.
The FAIR principles are guidelines for improving the Findability, Accessibility, Interoperability, and Reuse of digital assets such as research data. They emphasize machine-actionability for both data and metadata, making it possible to use computation to deal with large and complex datasets. The FAIR principles recommend using community standards: broadly accepted methods and formats that are commonly used in the field that generated the data. This way, the data are easier for others in the same field to share and reuse, and it is easier to develop software tools to read and process them.
Many subfields of neuroscience still lack community standards, but developing new standards is a complex process that needs community engagement, coordination and consensus. INCF’s mission is to further the use of community standards by helping the international neuroscience community come together to develop standards and best practices for their own research, and by endorsing and promoting the resulting standards and best practices to the wider community.
A brief summary of the FAIR principles:
Findable: The first step in (re)using data is to find them. Metadata and data should be easy to find for both humans and computers. Machine-readable metadata are essential for automatic discovery of datasets and services.
Accessible: Once users find the required data, they need to know how the data can be accessed, possibly including authentication and authorisation.
Interoperable: The data usually need to be integrated with other data. In addition, the data need to interoperate with applications or workflows for analysis, storage, and processing.
Reusable: The ultimate goal of FAIR is to optimize the reuse of data. To achieve this, metadata and data should be well described so that they can be replicated and/or combined in different settings.
To make data “easy to find” and “well described” for machines, we need technical solutions such as persistent identifiers (PIDs), controlled vocabularies, and taxonomies. More about those in future blog posts.
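To make the idea of machine-actionable metadata concrete, a dataset description can be a structured record that combines a persistent identifier with standard vocabulary terms. Below is a minimal sketch in Python using JSON-LD with schema.org terms; the DOI and keywords are illustrative placeholders, not real records.

```python
import json

# A minimal, machine-readable metadata record for a dataset, expressed as
# JSON-LD with schema.org terms. All identifiers below are placeholders
# used for illustration only, not real records.
metadata = {
    "@context": "https://schema.org",
    "@type": "Dataset",
    "name": "Example neuroscience dataset",
    # A persistent identifier (PID): here a placeholder DOI URL.
    "identifier": "https://doi.org/10.0000/placeholder",
    "license": "https://creativecommons.org/licenses/by/4.0/",
    # Terms from a controlled vocabulary would make these keywords
    # unambiguous to machines; free-text stand-ins are shown here.
    "keywords": ["neuroscience", "electrophysiology"],
}

# Because the record is structured rather than free text, software can
# discover, filter, and index it without human interpretation:
serialized = json.dumps(metadata, indent=2)
record = json.loads(serialized)
print(record["@type"], record["identifier"])
```

Dataset repositories embed records like this in their landing pages so that search engines and indexing services can harvest them automatically, which is what makes the data findable for machines as well as humans.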