INCF SIG on Neuroshapes: Open SHACL schemas for FAIR neuroscience data
This SIG aims to coordinate community efforts for the development of open, use case driven and shared validatable data models (schemas, vocabularies) to enable the FAIR principles (Findable, Accessible, Interoperable and Reusable) for basic, computational and clinical neuroscience (meta)data.
Sean Hill, Krembil Centre for Neuroinformatics, CAMH, Toronto, Canada
Andrew Davison, UNIC, CNRS, Gif sur Yvette, France
Mohameth François Sy, Blue Brain Project, EPFL, Geneva, Switzerland
About this SIG
The INCF Neuroshapes Special Interest Group (SIG) will coordinate community efforts for the development of open, use case driven and shared validatable data models (schemas, vocabularies) to enable the FAIR principles (Findable, Accessible, Interoperable and Reusable) for basic, computational and clinical neuroscience (meta)data. In addition, provenance information is used to define the context from which scientific data was generated, providing important information about the type of the data, its significance, quality and potential for integration and reuse.
The data models developed thus far include entities for electrophysiology, neuron morphology, brain atlases, and computational modeling. Future developments could include brain imaging, transcriptomic and clinical form data, as determined by the SIG membership and community interests.
The main goal is to promote:
- the use of standard semantic markups and linked data principles as ways to structure metadata and related data: the W3C RDF format is leveraged, specifically its developer-friendly JSON-LD serialization. The adoption of linked data principles and JSON-LD will ease federated access and discoverability of distributed neuroscience (meta)data over the web.
- the use of the W3C SHACL (Shapes Constraint Language) recommendation as a rich metadata schema language that is formal and expressive, interoperable, machine-readable, and domain-agnostic. With SHACL, (meta)data quality can be enforced through schemas and vocabularies, which are easily discoverable and searchable, rather than being fully encoded in procedural code. SHACL also provides key interoperability capabilities that support the evolution of standard data models and data longevity. It allows standard data models to be built incrementally, growing in semantics and sophistication.
- the reuse of existing schemas and semantic markups (like schema.org) and existing ontologies and controlled vocabularies (including NIFSTD - NIF Standard Ontologies).
- the use of the W3C PROV-O recommendation as a format to record (meta)data provenance: a SHACL version of the W3C PROV-O is created.
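To make the points above more concrete, here is a minimal sketch in Python of what a JSON-LD metadata record with PROV-O provenance could look like. It is an illustration under stated assumptions, not a Neuroshapes schema: the schema.org and PROV-O namespace IRIs and terms are real, but every identifier under example.org, the dataset name, the timestamp and the helper expand_term are hypothetical. The helper only handles simple prefix expansion and is not a full JSON-LD processor.

```python
import json

# Prefix map: real namespace IRIs for schema.org, W3C PROV-O and XML Schema.
CONTEXT = {
    "schema": "http://schema.org/",
    "prov": "http://www.w3.org/ns/prov#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}

# Hypothetical record: all example.org identifiers and values are made up.
record = {
    "@context": CONTEXT,
    "@id": "https://example.org/data/trace-001",
    "@type": ["prov:Entity", "schema:Dataset"],
    "schema:name": "Example electrophysiology recording",
    # Provenance: the activity that generated this entity.
    "prov:wasGeneratedBy": {
        "@id": "https://example.org/activities/recording-001",
        "@type": "prov:Activity",
        "prov:startedAtTime": {
            "@value": "2018-08-08T09:00:00Z",
            "@type": "xsd:dateTime",
        },
        "prov:wasAssociatedWith": {"@id": "https://example.org/agents/lab-a"},
    },
}


def expand_term(term, context):
    """Expand a compact IRI such as 'prov:Entity' using a prefix map.

    Simplified on purpose: keywords, absolute IRIs and unknown prefixes
    are returned unchanged.
    """
    prefix, sep, local = term.partition(":")
    if sep and prefix in context:
        return context[prefix] + local
    return term


# JSON-LD is plain JSON, so the standard library can serialize it.
serialized = json.dumps(record, indent=2)
```

Because the record is ordinary JSON, it can be stored and exchanged with existing web tooling, while the context lets a linked data consumer resolve each key to a globally unique IRI.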
The SIG also aims to build a community for the open, use case driven development not only of data models (schemas and vocabularies) and the tools around them, but also of guidelines for FAIR neuroscience (meta)data.
The produced data models (schemas and vocabularies) and tools are developed, maintained and shared through GitHub: github.com/INCF/neuroshapes. This Neuroshapes repository currently contains data models developed and adopted by the Human Brain Project community and the Blue Brain Project. They are tested using the Blue Brain Nexus Knowledge Graph. A web domain (shapes-registry.org) has been secured to offer a web interface for searching the schemas and providing persistent access to them.
Upcoming and recent meetings
The SIG met before Neuroinformatics 2018 in Montreal, on August 8.
This was the first meeting of the INCF Neuroshapes Special Interest Group. Its main goal was to present the goals and motivation of Neuroshapes to the neuroscience community at INCF NI2018, and to connect with other data sharing and open science initiatives, such as NIDM, to explore whether they could adopt the Neuroshapes approach to data models. Participants showed interest in using the W3C SHACL specification to complement existing data models with data validation capabilities. They were also interested in the ability to describe the expected properties of a dataset by means of schemas written with JSON-LD (semantic markup) and W3C SHACL.
The participants identified the need for a lightweight SHACL validator written in Python to speed up the adoption of Neuroshapes within the neuroscience community, even though many Java-based SHACL validators exist. The SIG welcomes contributions of supported data models (especially a SHACL version of BIDS) and Python tools.
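As a rough illustration of the idea behind such a tool, the sketch below checks a metadata record against declarative rules instead of hand-written procedural checks. It is not a SHACL implementation and does not reflect any Neuroshapes code: the names PropertyRule, check and TRACE_RULES, the property names and the sample records are all hypothetical, and the rules only loosely mimic the spirit of SHACL's minimum count, maximum count and datatype constraints on plain Python dictionaries.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class PropertyRule:
    """One declarative rule, loosely modelled on a SHACL property shape."""
    path: str                          # key expected in the record
    min_count: int = 0                 # minimum number of values
    max_count: Optional[int] = None    # maximum number of values, if any
    value_type: Optional[type] = None  # required Python type of each value


def values_of(record: Dict[str, Any], path: str) -> List[Any]:
    """Return the values of a key as a list (a missing key gives an empty list)."""
    value = record.get(path)
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def check(record: Dict[str, Any], rules: List[PropertyRule]) -> List[str]:
    """Return human-readable violations; an empty list means the record conforms."""
    problems = []
    for rule in rules:
        values = values_of(record, rule.path)
        if len(values) < rule.min_count:
            problems.append(
                f"{rule.path}: expected at least {rule.min_count} value(s), "
                f"found {len(values)}"
            )
        if rule.max_count is not None and len(values) > rule.max_count:
            problems.append(
                f"{rule.path}: expected at most {rule.max_count} value(s), "
                f"found {len(values)}"
            )
        if rule.value_type is not None:
            for value in values:
                if not isinstance(value, rule.value_type):
                    problems.append(
                        f"{rule.path}: {value!r} is not of type "
                        f"{rule.value_type.__name__}"
                    )
    return problems


# Hypothetical rules: exactly one textual name, at least one generating activity.
TRACE_RULES = [
    PropertyRule("name", min_count=1, max_count=1, value_type=str),
    PropertyRule("wasGeneratedBy", min_count=1, value_type=dict),
]

good = {
    "name": "Example recording",
    "wasGeneratedBy": {"@id": "https://example.org/activities/recording-001"},
}
bad = {"name": ["a", 42]}  # too many names, one non-string, no activity

good_report = check(good, TRACE_RULES)
bad_report = check(bad, TRACE_RULES)
```

Keeping the rules as data rather than code is the design point: they can be published, searched and reused in the same way as the schemas themselves, whereas a real validator would read them from SHACL shapes and operate on an RDF graph.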
A meeting is planned for the end of October or the beginning of November 2018.
Read the full meeting report here.
Blue Brain Project
Human Brain Project
Krembil Centre for Neuroinformatics
Sean Hill, Krembil Centre for Neuroinformatics, CAMH, Chair
Andrew Davison, CNRS, Human Brain Project, Chair
Anna-Kristin Kaufmann, EPFL, Blue Brain Project
Huanxiang Lu, EPFL, Blue Brain Project
Tom Gillespie, UCSD, Neuroscience Information Framework
Genrich Ivaska, EPFL, Blue Brain Project
Oliver Schmid, EPFL, Human Brain Project
Jean-Denis Courcol, EPFL, Blue Brain Project
Samuel Kerrien, EPFL, Blue Brain Project
Jeff Muller, EPFL, Human Brain Project
Mohameth François Sy, EPFL, Blue Brain Project, Co-Chair
Bogdan Roman, EPFL, Blue Brain Project
Pradeep Reddy Raamana, Rotman Research Institute