GSoC 2018 project ideas
We have been accepted as a mentoring organization again in 2018, with projects proposed and mentored by our Nodes and international community.
Below is our project ideas list, which is now final.
We have a discussion forum for mentors and interested students. To be added, email email@example.com with a description of which project you are interested in.
If you have general questions about INCF and our participation in GSoC, please contact us on firstname.lastname@example.org.
- 2018 Google Summer of Code webpages (see especially the timeline and the F.A.Q.)
- Google Summer of Code student guide
Proposals and ideas for potential INCF projects within Google Summer of Code
1. Dynamic GUI for designing and launching signal processing method workflows in a Hadoop Infrastructure
Description: Our current work is focused on development of an assistive system for motor impaired people using EEG event-related potential signal processing and machine learning (i.e. brain-computer interface) methods. The system collects brain data from a subject exposed to audio/video stimuli. The collected data are then processed by customized classifiers and feedback is provided. Depending on the classification result, a desired action can be performed, such as switching on a TV or opening a window blind.
Because large amounts of collected data are used to train time-consuming classifiers, we operate an Apache Hadoop server providing both data storage and an environment for running signal processing methods. In its current state, the system lacks a user-friendly, easy-to-use interface.
Aims: The aim for an applicant is to design and develop a flexible user interface able to generate user input forms according to parameters of individual signal processing methods.
We will provide a client and a server on which the signal processing methods are installed. The student is asked to implement a system that reads all input and output parameters of each method and generates a user interface tailored to that method. The generated interface lets the user fill in the method parameters and run the method on a selected dataset. The interface should be easy to regenerate when a new method is added to the library.
The student should also design and implement a format/language (XML, JSON) describing the GUI forms respecting the method parameters.
Finally, methods can be run in a workflow in which the result of one method is used as the input of the next. The system must check whether the output of one method matches the input of the next and, if so, display the connection as an arrow. Each workflow can also be represented in XML or JSON.
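To make the idea concrete, here is a minimal sketch of a JSON descriptor and the two operations described above: deriving form fields from a method's parameters and checking whether one method's output can feed another's input. The method names, parameter names, type labels and the descriptor layout are all hypothetical; the actual format is for the student to design.

```python
import json

# Hypothetical descriptors: method names, parameter names and types are
# illustrative and not taken from the project's actual method library.
METHODS_JSON = """
[
  {"name": "BandpassFilter",
   "inputs": [{"name": "signal", "type": "eeg_signal"},
              {"name": "low_hz", "type": "float", "default": 0.1},
              {"name": "high_hz", "type": "float", "default": 30.0}],
   "outputs": [{"name": "filtered", "type": "eeg_signal"}]},
  {"name": "ErpClassifier",
   "inputs": [{"name": "signal", "type": "eeg_signal"},
              {"name": "epochs", "type": "int", "default": 100}],
   "outputs": [{"name": "labels", "type": "label_list"}]}
]
"""

# Map a scalar parameter type to the form widget that should edit it.
WIDGET_FOR_TYPE = {"float": "number", "int": "number", "str": "text"}


def load_methods(text):
    """Parse the JSON descriptor list into a dict keyed by method name."""
    return {m["name"]: m for m in json.loads(text)}


def form_fields(method):
    """Describe one form field per scalar input; data inputs come from the workflow."""
    fields = []
    for param in method["inputs"]:
        widget = WIDGET_FOR_TYPE.get(param["type"])
        if widget is not None:
            fields.append({"label": param["name"], "widget": widget,
                           "value": param.get("default")})
    return fields


def connections(source, target):
    """Return (output, input) name pairs whose types match, i.e. drawable arrows."""
    return [(out["name"], inp["name"])
            for out in source["outputs"]
            for inp in target["inputs"]
            if out["type"] == inp["type"]]


methods = load_methods(METHODS_JSON)
print(form_fields(methods["BandpassFilter"]))
print(connections(methods["BandpassFilter"], methods["ErpClassifier"]))
print(connections(methods["ErpClassifier"], methods["BandpassFilter"]))
```

In this sketch the filter can be connected to the classifier but not the other way round, which is the kind of check a workflow editor could run before drawing an arrow.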
Skills: Java, Maven, JSON, XML, REST, GitHub, optionally Hadoop
Mentor: Petr Ježek (University of West Bohemia, Czech Republic, Czech National INCF Node)
2. The Virtual Brain projects
There are several modeling studies using brain network models which incorporate biologically realistic macroscopic connectivity (the so-called connectome) to understand the global dynamics observed in the healthy and diseased brain measured by different neuroimaging modalities such as fMRI, EEG and MEG. For this particular modelling approach in Computational Neuroscience, open source frameworks enabling the collaboration between researchers with different backgrounds are not widely available. The Virtual Brain is, so far, the only neuroinformatics project filling that place. Several open issues addressed by the following proposals involve
- Packaging (containers, cloud)
- UX design (concept, modernize)
2.1 Packaging TVB for the modern world
Description: TVB has, for the moment, distributed its packages either in the form of sources from Git repositories for developers, or a zip package per platform for end-users, and only recently through PyPI. This leaves much to be desired: in the scientific community, the use of the Anaconda distribution has made the Conda package manager popular. For Linux, a project called NeuroDebian seeks to package much of the available neuroscience software as Debian packages, which are then usable by many derivative distributions. Native launchers for the most common operating systems would be good to have. Lastly, for many situations, it is good practice to run software in an isolated environment, with tools such as Vagrant, Docker, Amazon Machine Image (AMI), etc. To address these possibilities, this proposal involves preparing new packaging scripts for one or ideally all of the above-mentioned options.
Expected Results: One or more of: packages for Conda and NeuroDebian, Vagrantfile, script for building a Docker image or AMI, native launchers for TVB Distribution.
Skills: Python, Bash & Unix command line, Debian packaging, virtual machines, containers.
Mentors: Lia Domide, Mihai Andrei
2.2 Visualize a large Connectome in 3D using HTML5
Description: Data visualization plays a crucial role in TVB's neuroinformatics platform, and a Structural Connectivity (connectome) is a core datatype, modelling full brain regions and their connections. The student needs to propose an interaction paradigm and implement a connectivity visualizer in the browser client of TVB. We need to easily display and interact with up to 1000 regions in a connectivity (a 1000 × 1000 adjacency matrix) in 2D and 3D. Both rendering performance and per-element interaction are important. User interactions such as rotate, zoom, move and edit edges are all necessary. The current implementation is documented here: http://docs.thevirtualbrain.org/manuals/UserGuide/UserGuide-UI_Connectivity.html#long-range-connectivity
Expected Results: Completely redo and improve a section of TVB front-end (Connectivity Cockpit) from UX design, down to implementation, web technologies and optimization for extremely large data structures.
Skills: HTML5, JS, CSS and Python are necessary; Experience in web development, SVG, WebGL, ReactJS is helpful.
Mentors: Paula Popa, Lia Domide
2.3 Monitor Sensors signal in 2D and 3D
Description: One major feature of TVB’s neuroinformatics platform is Timeseries analysis. Supporting empirical or simulated sources for signals, the platform already offers a 2D viewer where users can study Timeseries files. But is it enough? Think about taking it to the next level and creating a 3D viewer. This would be like looking at the patient’s personalized brain at the moment of a seizure and seeing the lead field potentials. The student needs to propose an interaction paradigm and implement it in the browser client of TVB. Rendering performance is important. User interactions such as rotate, zoom, move and play/pause movie are all necessary.
The viewers we already have:
Proof of concept:
Expected Results: Implement a new 3D viewer from UX design, down to implementation, web technologies.
Skills: HTML5, JS, CSS and Python are necessary; Experience in web development, SVG, WebGL, ReactJS is helpful.
Mentors: Paula Popa, Lia Domide
2.4 Reusable visualization tools for Jupyter
Description: TVB's web-based UI provides several very useful visualization tools, which are set up for full-screen use. As TVB is used in wider contexts (HBP collaboratory, Jupyter notebooks), it is important to ensure the relevant visualization tools are present. This project is to refactor the widgets in the TVB UI into reusable components which can be employed from a Jupyter notebook for use in the HBP collaboratory, while maintaining compatibility with the existing TVB framework. Tools are to be refactored, with the choice up to the student, in order of priority:
- anatomical visualization (surface, connectivity) (e.g. use XTK)
- time series viewer (e.g. use vispy)
- the phase plane tool
Use of WebGL (in particular Python/notebook-oriented GL tools) is encouraged, as it offers numerous interesting opportunities for optimization, e.g. XTK for anatomy, vispy for time series.
Expected results: A set of classes usable within a Jupyter notebook, for displaying common data objects via WebGL or WebGL-based libraries.
Skills: Familiarity with visualizing data with WebGL; familiarity with IPyWidgets & Jupyter would be helpful
Mentors: Marmaduke Woodman (@maedoc)
2.5 Reusable configuration UIs for Jupyter
Description: Similar to project 2.4, several form-based UIs are present in TVB's UI which should be reusable independently within a notebook context to allow for visual configuration of a simulator or analysis algorithm. This project is to refactor those form UIs into widgets usable from an IPython notebook, while maintaining compatibility with the existing TVB framework. The configuration pages are dynamically generated from metadata in the codebase:
- Simulator configuration
- Generic analysis config
Use of ipywidgets is recommended in order to maximize notebook compatibility.
Expected results: A set of IPyWidgets which can be connected to TVB objects, generate a configuration UI from the object's metadata, & configure them during use of a Jupyter notebook
Skills: Familiarity with class programming in Python; familiarity with IPyWidgets & Jupyter would be helpful
Mentors: Marmaduke Woodman (@maedoc)
2.6 Bringing Stan & TVB together for inference
Description: Stan is a state-of-the-art probabilistic programming language which allows for inference on complex statistical models, and has been used for a TVB prototype model for seizure propagation in Jirsa et al 2017 The Virtual Epileptic Patient: Individualized whole-brain models of epilepsy spread. Stan's algorithms allow for efficient exploration of a parameter space in addition to data fitting. This project is to translate essential algorithms from TVB to the Stan language for use in inference of TVB models on data, test against TVB results, and to provide a few examples of inference.
Expected results: A set of Stan files implementing essential TVB algorithms, with tests and examples.
Skills: Numerical/scientific programming, data science
Mentors: Marmaduke Woodman (@maedoc)
3. Modular Machine Learning and Classification Toolbox for ImageJ 3
Description: The Active Segmentation ImageJ plugin was developed in the scope of GSoC 2016 and 2017. The plugin provides a general-purpose environment that allows biologists and other domain experts to transparently use state-of-the-art techniques in machine learning to improve their image segmentation results. ImageJ is a public domain Java image processing program extensively used in life and material sciences. The program was designed with an open architecture that provides extensibility via plugins.
Motivation: We would like to expand the existing functionality of the Active Segmentation plugin to incorporate learning from entire images presented as instances. In GSoC 2017 we incorporated Zernike moments as a compound feature vector and used it to train the classifier.
Project idea: The present project can go in different directions: either to incorporate Legendre moments and parallelize their computation to further enrich the feature space, or to incorporate a Principal Component Analysis reduction scheme to reduce feature redundancy.
- Fix existing issues and bugs
- Improve the user interface for region of interest display
- Extend the metadata format
- Provide a reference implementation
Minimal set of deliverables:
- Requirement specification - Prepared by the candidate after understanding the functionality.
- System Design - Detailed plan for development of the plugin and test cases.
- Implementation and testing - Details of implementation and testing of the plugin.
Required skills: Experience with Java
Desired skills: experience with ImageJ and machine learning, preferably WEKA
Mentors: Dimiter Prodanov (email@example.com), INCF Belgian Node; (backup) Sumit Vohra, KU Leuven
4. Building a portable open pipeline to detect the hemodynamic response function at rest
Description: BIDS-Apps (http://bids-apps.neuroimaging.io/, http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1005209) are portable neuroimaging pipelines which read data stored in the BIDS format (which is meant to become the standard format for neuroimaging data sharing). Many neuroinformatics tools are adopting these pipelines, which help ensure robust and reproducible analyses. We developed a tool to retrieve the hemodynamic response function from resting state fMRI (https://github.com/guorongwu/rsHRF). The code is currently written in MATLAB. In order to increase its adoption and portability, we would like to translate it to Python and build a BIDS-App out of it.
Aims: The plan is to translate the existing MATLAB code to Python, and to build a BIDS-App out of it (see an example here https://github.com/BIDS-Apps/example). Ultimately this app will be tested and shared.
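As a rough illustration of the target shape, the sketch below shows a command-line entry point following the BIDS-Apps convention of three positional arguments (input dataset, output folder, analysis level) plus an optional list of participant labels. The description text, the `subject_dirs` helper and the example paths are hypothetical, and the HRF estimation itself is left as a placeholder.

```python
import argparse


def build_parser():
    """Command line following the BIDS-Apps convention of three positional arguments."""
    parser = argparse.ArgumentParser(
        description="Hypothetical rsHRF BIDS-App entry point (sketch only).")
    parser.add_argument("bids_dir", help="Root folder of the input BIDS dataset.")
    parser.add_argument("output_dir", help="Folder where results are written.")
    parser.add_argument("analysis_level", choices=["participant", "group"],
                        help="Level of the analysis to run.")
    parser.add_argument("--participant_label", nargs="+", default=[],
                        help="Subject labels to process, without the 'sub-' prefix.")
    return parser


def subject_dirs(bids_dir, labels):
    """Build the per-subject folder paths the analysis would visit (hypothetical helper)."""
    return ["{}/sub-{}".format(bids_dir.rstrip("/"), label) for label in labels]


# Example invocation with made-up paths, passed explicitly instead of sys.argv.
args = build_parser().parse_args(
    ["/data/study", "/data/out", "participant", "--participant_label", "01", "02"])
for path in subject_dirs(args.bids_dir, args.participant_label):
    # Placeholder: the translated HRF estimation would be called here.
    print("would process", path)
```

Wrapping such a script in a Docker image is what turns it into a portable app; the container details are outside this sketch.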
Skills: open source development experience with Python and GitHub. Experience with Docker containers is a plus, along with coursework in signal processing or biomedical data analysis. An interest in the underlying biological processes is most welcome. Good communication skills and familiarity with open science practices are expected.
Mentor: Daniele Marinazzo (firstname.lastname@example.org)
5. Building high-resolution 3D models of brain vasculature
Description: Our group has developed a set of tools to identify and extract vessels from ultra-high field high-resolution magnetic resonance images (MRI) of the human brain. However, since the vasculature is relatively thin and MRI represents data in volume elements (voxels, the 3D equivalent of pixels), it is difficult to generate accurate 3D models of the vasculature. Such models are important if we want to better understand how energy is supplied and used in our brain and how vascular health influences brain activity, learning, and aging.
In this project, we seek to develop 3D meshing and visualization techniques that are better adapted to the complexity of vascular structures. For instance, we will consider techniques that have been originally developed for rendering the brain's white-matter anatomical connections identified with diffusion tractography. Our software, Nighres, is currently running as a set of Java libraries and Python interfaces, and we will favor creating methods that work directly in Python and interact well with established Python-based neuroimaging toolboxes such as Nibabel, Nilearn, and/or Dipy.
With a Google Summer of Code project, we would like to explore the following topics:
- creating high-resolution smooth 3D mesh representations of extracted vessels in Python;
- using and extending fiber visualization models from existing Python packages;
- defining global shape representations of vascular trees from those models.
Skills: The student should be proficient in Python and have a background in Computer Graphics, Data Visualization, or Computer-aided Design. Basic knowledge of MR image processing and/or brain anatomy would be advantageous, as would a familiarity with Java and Python-based Neuroimaging tools and experience in Open Source software development.
Mentors: Pierre-Louis Bazin, Christopher Steele
6. High Resolution brain monitoring data API
Description: High-resolution ICU brain monitoring data from hospitals can be collected, cleaned and packaged as HDF5 files using the ICM+ software. Each file contains high-frequency signals from different probes connected to the patient.
Researchers doing analysis have no means of accessing the data in a granular way. They can, for example, search for and retrieve all data for a selected patient. However, they cannot search for only the arterial blood pressure recordings of a specified subgroup of patients within a selected time interval.
- Identify a suitable datastore for staging data and create scripts to load data into the platform
- Create a REST-API for accessing the data in different formats and sampling frequencies from the data repository
- Create an API client package in R.
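The kind of granular query described above can be sketched independently of any particular datastore. The example below assumes an in-memory list of recordings with hypothetical field names (it does not reflect the actual ICM+ HDF5 layout), filters by signal, patient subgroup and time interval, and lowers the sampling frequency by block averaging. A REST endpoint would expose the same parameters.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Recording:
    """One signal segment; field names are illustrative, not the ICM+ HDF5 layout."""
    patient_id: str
    signal: str          # e.g. "abp" for arterial blood pressure
    start_s: float       # time of the first sample, in seconds
    rate_hz: float       # sampling frequency
    samples: list


def query(recordings, signal, patient_ids, t_from, t_to):
    """Return (patient_id, times, values) for samples inside [t_from, t_to)."""
    results = []
    for rec in recordings:
        if rec.signal != signal or rec.patient_id not in patient_ids:
            continue
        times, values = [], []
        for i, value in enumerate(rec.samples):
            t = rec.start_s + i / rec.rate_hz
            if t_from <= t < t_to:
                times.append(t)
                values.append(value)
        if values:
            results.append((rec.patient_id, times, values))
    return results


def downsample(values, factor):
    """Reduce the sampling frequency by averaging consecutive blocks of `factor` samples."""
    return [mean(values[i:i + factor]) for i in range(0, len(values), factor)]


# Made-up values for illustration only.
store = [
    Recording("p1", "abp", 0.0, 2.0, [80, 82, 84, 86, 88, 90]),
    Recording("p1", "icp", 0.0, 2.0, [10, 11, 12, 13, 14, 15]),
    Recording("p2", "abp", 0.0, 2.0, [70, 72, 74, 76, 78, 80]),
]
hits = query(store, "abp", {"p1"}, 1.0, 3.0)
print(hits)
print(downsample(hits[0][2], 2))
```

A real implementation would push the filtering into the chosen datastore instead of scanning every sample in Python.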
Needed: Python, web services, R, HDF5
Desired: Worked with time-series data analysis, MongoDB
Lead: Visakh Muraleedharan, INCF
Co-Mentor: Manuel Cabeleira, University of Cambridge
7. Open source, cross simulator, large scale cortical models
Description: An increasing number of studies are using large scale network models incorporating realistic connectivity to understand information processing in cortical structures. High performance computational resources are becoming more widely available to computational neuroscientists for this type of modelling and general purpose, well tested simulation environments such as NEURON and NEST are widely used. New, well annotated experimental data and detailed compartmental models are becoming available from the large scale brain initiatives. However, the majority of neuronal models which have been published over the past number of years are only available in simulator specific formats, illustrating a subset of features associated with the original studies.
This work will involve converting a number of published large scale network models into open, simulator independent formats such as NeuroML and PyNN and testing them across multiple simulator implementations. They will be made freely available to the community through the Open Source Brain repository (http://www.opensourcebrain.org) for reuse, modification and extension.
Skills required: Python; XML; open source development; computational modelling experience.
Desirable: Java experience; a background in theoretical neuroscience and/or large scale modelling.
- Select a number of large scale cortical network models for the conversion & testing process.
- Convert network structure and cell/synaptic properties to NeuroML and/or PyNN. Where appropriate use the simulator independent specification in LEMS to specify cell/synapse dynamics & to allow mapping to simulators. Implementing extensions to PyNN, NeuroML or other tools may be required.
- Make models available on the Open Source Brain repository, along with documentation and references.
Mentor: Padraig Gleeson
Keywords: Python, XML, networks, modelling, simulation
8. Contextual Geometric Representations of Cultural Behavior
Description: Contextual Geometric Structures (CGSs) are an alternative approach to modeling intelligent behavior, and represent both structural and neurocognitive aspects of human culture. We can use this approach in two ways: 1) to enrich our understanding of general intelligence by providing a link between the empirical and contextual worlds, and 2) as an approximation of neural representations that result in culture-mediated intelligent behavior. While we can utilize the CGS approach to test hypotheses about individual and collective human behavior, its most informative use is in adding a cultural layer onto general intelligence.
We can use the CGS approach in two ways: 1) to enrich our understanding of general intelligence by providing a link between the neural processing, empirical observation, and contextual worlds, and 2) as an approximation of the cultural influences on intelligent behavior. While we can utilize the CGS approach to test hypotheses, its most informative use is in adding a cultural layer onto general intelligence. The model consists of an n-dimensional space that defines the phenomenology of specific cultural representations, and a connectionist network that provides a link to the empirical world.
CGSs provide a meta-model of neuronal processing by mapping complex concepts to human diversity. Populations of agents representing a wide range of distinct cultural representations can thus be contained inside a single concept space. Their overlap and independence provide information about the abilities of different cultural traditions to generalize with respect to the empirical world. However, unlike a general intelligence, cultures within a CGS would provide semantic context and intersubjective understanding within a problem domain. CGS agents and populations of agents could be particularly useful as a tool for human-machine interaction, particularly during interactions where deep meaning is useful in facilitating communication.
Aims: Goals for the coding period will be formalized in collaboration with the student, but include several outstanding issues: 1) transforming symbolic models and higher-level representations into executable code, 2) creating test scenarios for two test problems (biodiversity and culturomics), 3) establishing an analytical framework for performance metrics.
While currently developed as a set of mathematical and geometrical models, a long-term vision for CGS programs includes a means to approximate and predict phenomena in human behavior not easily explicable by rational decision-making models. For more information (conference presentation and technical papers), please see the following CGS project summaries: Project site on OSF Commons which includes papers and presentations (https://osf.io/ynffr/) and Explainer Videos on YouTube (https://www.youtube.com/playlist?list=PL4RJ4xCetB62KEKyVsyOo3pYTabvVCRVg).
Skills: Open source development experience with languages such as C++, Python. An interest in the underlying biological processes is essential, and applicants with strong abstract thinking abilities are preferred. Good communication skills and familiarity with open science practices are expected.
Mentor: Bradly Alicea (email@example.com)
9.1 Building a Neural Network Animation tool using Python and Blender
Description: Blender is a powerful 3D graphics and animation package with a Python interface that allows programmatic creation of graphics. This project proposes to develop a set of Python scripts that generate cutting-edge 3D graphics (and potentially animations) of neural networks using Blender. The finished product would be software that accepts NineML or NeuroML descriptions of neural networks and produces a high-quality 3D visual representation of the network. As an optional additional feature, the user could provide a file with time series data from a network simulation, either as voltage time series or spiking data, and the software would produce an entire sequence of visualizations that can be rendered into a movie.
Skills required: Python; experience with Blender and an understanding of neuronal network simulations would be helpful.
Mentors: Jamie Knight (J.C.Knight@sussex.ac.uk), and Thomas Nowotny (firstname.lastname@example.org)
9.2 A PyNN interface to GeNN
Description: PyNN is a Python-based framework for describing neuronal network models. It is widely used in the computational neuroscience and neuromorphic computing communities. The proposal is to develop a PyNN interface for GeNN so that users of PyNN will be able to benefit from accelerated GPU simulations with GeNN. Important aspects of this work will be a flexible design that allows for future changes in both PyNN and GeNN, good coverage of the entire PyNN model range and optimised data management between Python and the C/C++ based GeNN.
Skills required: Python, PyNN, C/C++; experience with neuronal network simulations and CUDA would be helpful.
Mentors: Jamie Knight (J.C.Knight@sussex.ac.uk), Michael Schmuker (email@example.com), and Thomas Nowotny (firstname.lastname@example.org)
10.1 Physics-based XML Model-building for the Mosaic Embryo
Description: The DevoWorm project is building a physics-based simulation of mosaic embryogenesis, with application to the nematode Caenorhabditis elegans. This initiative will focus on incorporating secondary data from nematodes and (for early development) other species such as sea squirts into an XML-based computational framework. The model-building will result in an XML specification of embryo physics that describes a developmental process of your choosing. If time permits, this specification will be used to build trees and networks that describe relationships between individual cells. This will provide the host organization with an informatics framework for understanding neural precursor cells and developing nervous systems.
Aims: The current focus is on XML-based model-building for representing mosaic development in the worm Caenorhabditis elegans. This model will ultimately be executable in either CompuCell3D (from CC3DML) or WormSim (NeuroML). Depending on their interests, the student might focus on either early development (developmental cell lineages in CC3DML) or later development (the transition from developmental cell lineage to terminally-differentiated connectome using NeuroML). The internship period will consist of two parts: building models of the function and physical interactions between cells (in XML), and building tree and graphical representations from these models (using tools such as GraphViz and Gephi). Your proposed timeline should start with the XML model and then move toward creating tree and network visualizations. Both the CC3DML and NeuroML options have a longer-term arc, and the applicant would be encouraged to contribute beyond the formal "Summer of Code". These models would fill a critical gap in the DevoWorm project, namely the ability to simulate the physical constraints and intercellular signaling potential within whole embryos among systems that exhibit deterministic cellular differentiation.
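The two-step arc (XML model first, tree representation second) can be sketched in a few lines. The XML layout below is invented for illustration and is neither CC3DML nor NeuroML; it encodes only the first divisions of the C. elegans embryo as nested cells, which are then exported as GraphViz DOT text.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML layout, invented for illustration; it is neither CC3DML nor NeuroML.
EMBRYO_XML = """
<embryo species="Caenorhabditis elegans">
  <cell name="P0">
    <cell name="AB">
      <cell name="ABa"/>
      <cell name="ABp"/>
    </cell>
    <cell name="P1">
      <cell name="EMS"/>
      <cell name="P2"/>
    </cell>
  </cell>
</embryo>
"""


def lineage_edges(element):
    """Walk nested <cell> elements and yield (mother, daughter) name pairs."""
    for child in element.findall("cell"):
        if element.tag == "cell":
            yield element.get("name"), child.get("name")
        yield from lineage_edges(child)


def to_dot(edges, graph_name="lineage"):
    """Serialize (mother, daughter) pairs as a GraphViz DOT digraph."""
    lines = ["digraph {} {{".format(graph_name)]
    for mother, daughter in edges:
        lines.append('    "{}" -> "{}";'.format(mother, daughter))
    lines.append("}")
    return "\n".join(lines)


root = ET.fromstring(EMBRYO_XML)
edges = list(lineage_edges(root))
print(to_dot(edges))
```

The printed text can be saved to a file and rendered with the GraphViz `dot` tool. A physics-based specification would attach further attributes to each cell element.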
Skills: open source development experience with languages such as XML, C++, and Python. An interest in the underlying biological processes is essential, and applicants with strong abstract thinking abilities are preferred. This involves a willingness to take a "whatever works, works" philosophy, and encourage critical thinking. Good communication skills and familiarity with open science practices are expected.
Mentor: Bradly Alicea (email@example.com)
10.2 Digital Morphogenesis: towards a k-D embryo
Description: The DevoWorm project is building a computational basis for decomposing the geometry and morphogenesis of developmental processes. During Google Summer of Code 2017, a methodology was developed for extracting data from high-resolution microscopy images of C. elegans embryos. In addition, previous analytical and computational work (https://devoworm.weebly.com/publications.html) has identified the potential for new ways to describe hidden complexity in the embryogenetic process.
A 4-D representation will be used to construct dynamic developmental cell lineage trees with spatial resolution. This will lead to novel data structures and algorithmic processes related to the developmental lineage tree and imagining biologically-plausible alternatives. We propose to expand upon this line of research by inferring developmental cell genealogies from the resulting numeric data. This will allow us to work towards a theoretical concept called the k-D embryo, which is a framework to explain emergent processes in embryogenesis and will enable scientific discovery at multiple temporal and spatial scales.
Aims: There are two parts to this project. The first part is to create a 4-D data structure using either a class-based or RDF framework. This is based on previous work within the DevoWorm group involving re-representing developmental lineage trees. Ultimately, we wish to access and explore specific sublineages of differentiating cells. The second part is to construct and visualize the k-D embryo itself. This will be done using a specific method called Voronoi treemaps, which allow for k-D partitioning of the data in a bottom-up fashion. Voronoi treemaps will be created using one of several candidate algorithms, from A* pathfinding to recurrent neural networks. Being able to construct and explore embryogenetic pathways in such an interactive manner enables the understanding and reconstruction of alternative phenotypes.
Taken collectively, the innovative data structure and decomposition techniques open up the opportunity to develop a number of artificial life applications. A final visualization will result from the database and tree-building exercise, and will allow users to explore multiple facets of the embryogenetic process.
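For the class-based option, a minimal sketch of a 4-D lineage structure might look as follows: each cell stores a 3-D position and a birth time, and the tree supports retrieving a sublineage or the set of cells present at a given time. The class name, field names, coordinates and times are placeholders rather than measured data or an existing DevoWorm API.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Cell:
    """A lineage tree node carrying 4-D data: (x, y, z) position plus birth time."""
    name: str
    position: Tuple[float, float, float]
    birth_min: float
    daughters: List["Cell"] = field(default_factory=list)

    def divide(self, *daughters):
        """Attach daughter cells and return self so calls can be nested."""
        self.daughters.extend(daughters)
        return self

    def find(self, name) -> Optional["Cell"]:
        """Depth-first search for the root of a sublineage."""
        if self.name == name:
            return self
        for daughter in self.daughters:
            hit = daughter.find(name)
            if hit is not None:
                return hit
        return None

    def sublineage(self):
        """Names of this cell and all of its descendants, in depth-first order."""
        names = [self.name]
        for daughter in self.daughters:
            names.extend(daughter.sublineage())
        return names

    def alive_at(self, t_min):
        """Cells present at time t_min: born by then and not yet replaced by daughters."""
        if self.birth_min > t_min:
            return []
        if not any(d.birth_min <= t_min for d in self.daughters):
            return [self.name]
        present = []
        for daughter in self.daughters:
            present.extend(daughter.alive_at(t_min))
        return present


# Placeholder coordinates and times, not measured data.
ab = Cell("AB", (0.3, 0.5, 0.5), 10.0).divide(
    Cell("ABa", (0.2, 0.4, 0.5), 25.0), Cell("ABp", (0.4, 0.6, 0.5), 25.0))
p1 = Cell("P1", (0.7, 0.5, 0.5), 10.0).divide(
    Cell("EMS", (0.6, 0.4, 0.5), 27.0), Cell("P2", (0.8, 0.5, 0.5), 27.0))
zygote = Cell("P0", (0.5, 0.5, 0.5), 0.0).divide(ab, p1)

print(zygote.find("AB").sublineage())
print(zygote.alive_at(26.0))
```

The positions of the cells returned by `alive_at` are the natural input for a spatial partitioning step such as a Voronoi treemap, which this sketch does not attempt.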
Skills: open source development experience with languages such as C++, Python, R. An interest in the underlying biological processes is essential, and applicants with strong abstract thinking abilities are preferred. Good communication skills and familiarity with open science practices are expected.
Mentor: Bradly Alicea (firstname.lastname@example.org)
10.3 Advanced Neuron Dynamics in WormSim
Aims: The current visualization of the C. elegans nervous system in WormSim represents its 302 neurons as spheres connected by lines in a “ball and stick” model laid out in the shape of the worm. It currently only shows connectivity between the neurons, without showing the simulated dynamics of the neurons. Since Geppetto 0.2.4, however, there is the potential to add several things to improve the neuronal view and make it easier to understand what is going on:
● Visualize more realistic neuronal 3D shapes (i.e. morphologies), reusing the shapes of neurons from http://browser.openworm.org
● Have multiple 3D canvases to show simultaneously both the body and the muscles of the worm and nervous system simulations
● Add Cytoscape.js-based widgets to show animated network connectivity & dynamics graphs
● Add dynamic plotly.js-based widgets to show animated 3D phase diagrams that collapse the activity of the network to one picture.
When this project is complete, the candidate would have added all of these to WormSim to enable a new release to the OpenWorm audience.
Desired: React, Three.js, Java, UI/UX, Computational neuroscience training
Mentors: Matteo Cantarelli (email@example.com), Giovanni Idili (firstname.lastname@example.org), Stephen Larson (email@example.com)
10.4 Mobile application to explore C. elegans nervous system dynamics
Description: The OpenWorm community has aggregated and curated C. elegans connectome data and computational models of the C. elegans neurons and muscles. At the same time technology has been built in the context of OpenWorm to visualize connectomes and detailed morphologies and furthermore run simulations of the C. elegans neuronal network.
OpenWorm released WormSim, which puts a simple version of the worm simulation online. However, the WormSim interface was only designed for use in a desktop browser. A mobile-ready version of these tools is highly desirable in order to enable researchers to access OpenWorm models and tools on tablets and modern phones.
Aims: A re-imagined mobile accessible web-based application based on existing OpenWorm technologies to visualize the C. elegans connectome including detailed morphologies and replay recordings of simulations of C. elegans network dynamics and body simulation:
● Build a mobile container for the C. elegans connectome / morphology browser
● Build a mobile container for replaying C. elegans neuronal simulations
Desired: Unity, React, Three.js, Java, UI/UX, Computational neuroscience training
Mentors: Matteo Cantarelli (firstname.lastname@example.org), backup Giovanni Idili (email@example.com)
10.5 Add support for Neurodata Without Borders 2.0 to Geppetto
Description: Geppetto is an open source platform used to build neuroscience applications. Geppetto is used today as the engine of Virtual Fly Brain, Open Source Brain, Patient H.M., NEURON-UI and WormSim. Neurodata Without Borders is a unified data format for cellular-based neurophysiology data, focused on the dynamics of groups of neurons measured under a large range of experimental conditions.
Geppetto already has proof-of-concept support for visualizing metadata and electrophysiology traces stored in NWB version 1. At Society for Neuroscience 2017, a new release of NWB was presented along with new SDKs to access the format.
Aims: The aim is to enable Java and/or Python Geppetto backends to read NWB version 2. This will enable every Geppetto-based application to integrate NWB files in its workflow, enabling visualization of simulated data alongside electrophysiology recordings.
● Extend pre-existing backend modules to incorporate the latest NWB SDK
● Improve pre-existing frontend interface to visualize new artefacts and allow the user to search and visualize the content of the NWB files
When this project is complete, the candidate will have added support for NWB 2 to Geppetto.
Desired: Java, React, UI/UX
Mentors: Matteo Cantarelli (firstname.lastname@example.org), backup Giovanni Idili (email@example.com)
11 Extended support for NIX file format in GIN
Description: The G-Node Data Infrastructure (GIN) services provide a platform for management and sharing of data in neuroscience. Inspired by GitHub, the platform uses a git/git-annex backend for versioning and sharing of scientific data, offering the power of a web based repository management service combined with a distributed file storage. It addresses the range of research data workflows starting from data analysis on the local workstation to remote collaboration and data publication. GIN also provides indexing services for convenient searching of data and metadata, including information in well-defined formats like the odML metadata format and the NIX format for scientific data.
In this project we want to enhance the GIN data management services by making use of specific features of the NIX format, such as the comprehensive organization of metadata and the representation of relationships between the data. This would materialize as a set of features on the GIN web frontend for extended search, visualization and exploration of data stored on GIN.
Aims: Outcomes of this project would be the ability to extract structural properties and metadata from files and to present and visualize the results as statistical summaries.
Mentors: The G-Node team
12 Increasing usability for Maxima
Description: Maxima is a system for the manipulation of symbolic and numerical expressions with a history of more than 40 years. The system is maintained by an active community of users and developers and is incorporated in other systems, such as Sage.
Maxima is open source and its functionality is on par with commercial systems, such as Maple and Mathematica. The main applications of the system are calculus, dynamical systems and matrix algebra, which can be used for example in the context of biophysical modelling.
The project can take one of two directions:
1. Package Manager for Maxima. This GSoC project will create software for Maxima to download and install packages from hosts such as GitHub. The package manager will track versions and dependencies and maintain a collection of installed packages. It will be possible to invoke the package manager from within a Maxima session or possibly through a stand-alone program as well. Some parts of a package manager for Maxima exist at present; this project will reuse and extend the existing parts or replace them as needed. The emphasis for this project will be on getting a simple package manager working on a variety of systems (Windows, Linux, Mac) and Lisp implementations. Additional functionality will be considered if time permits.
2. Jupyter interface. There is an existing Jupyter interface for Maxima, but it is extremely difficult to install. This GSoC project will take the code for the existing interface and modify or replace it as needed, so that it is easy to install and works out of the box on a variety of systems (Windows, Linux, Mac). The emphasis for this project will be on the installation problem and on getting basic functionality working in the user interface (text, math formulas, and plotting). Additional features in the user interface can be considered if time permits.
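For the package manager option, one core task is deriving an installation order from declared dependencies. The following is a minimal sketch in Python, for illustration only: the real project would more likely be written in Lisp or Maxima, and the package names and the `install_order` helper are hypothetical.

```python
def install_order(packages, target):
    """Return the packages needed for `target`, dependencies first.

    `packages` maps a package name to the list of names it depends on.
    Raises ValueError on an unknown package or a dependency cycle.
    """
    order = []            # final installation order
    done = set()          # packages already placed in `order`
    in_progress = set()   # packages on the current search path (cycle detection)

    def visit(name):
        if name in done:
            return
        if name not in packages:
            raise ValueError(f"unknown package: {name}")
        if name in in_progress:
            raise ValueError(f"dependency cycle involving: {name}")
        in_progress.add(name)
        for dep in packages[name]:
            visit(dep)
        in_progress.discard(name)
        done.add(name)
        order.append(name)

    visit(target)
    return order


# Hypothetical package index, for illustration only.
index = {
    "plotting-extras": ["linear-algebra", "strings"],
    "linear-algebra": ["strings"],
    "strings": [],
}
print(install_order(index, "plotting-extras"))
# prints ['strings', 'linear-algebra', 'plotting-extras']
```

Version constraints are left out of this sketch; a real package manager would also have to choose compatible versions before ordering the installation.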
Skills: A general background in computer science is enough, plus a willingness to start learning Lisp. Experience with functional programming languages such as Clojure or Haskell is an advantage.
Mentors: Dimiter Prodanov and Robert Dodier firstname.lastname@example.org
13 Improvements to the Brian simulator
13.1 Import NeuroML morphologies
Description: While Brian 2 is most commonly used to simulate networks of single-compartment neurons, it also offers support for multi-compartmental models of potentially complex morphologies (see documentation). Although it can already import morphologies in the SWC format (the format used for morphologies on neuromorpho.org), this only covers the morphologies and not the ion channels and their distributions across the neuron. In recent years, NeuroML has emerged as a common standard to describe detailed models in a simulator-independent way, and a significant number of models have been ported to this format and are available (see opensourcebrain.org).
The aim of this project is to:
- Implement support to import NeuroML morphologies from Brian 2
- Add convenient (semi-manual) access to other information stored in the NeuroML file, i.e. the LEMS definitions of the ion channels and their distribution
- Test and evaluate differences between simulations of NeuroML models in Brian 2 and other simulators (such as NEURON)
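As a rough illustration of the first step, the sketch below reads the segments of a NeuroML 2 morphology using only the Python standard library. The example cell and the `read_segments` helper are hypothetical, the sketch assumes the usual NeuroML 2 namespace and `segment`/`parent`/`distal` layout, and it does not show how the result would be mapped onto Brian 2's morphology classes.

```python
import xml.etree.ElementTree as ET

# Namespace used by NeuroML 2 documents (assumed here).
NML_NS = {"nml": "http://www.neuroml.org/schema/neuroml2"}

# Tiny hand-written example cell, for illustration only.
EXAMPLE = """<neuroml xmlns="http://www.neuroml.org/schema/neuroml2" id="example">
  <cell id="simple_cell">
    <morphology id="morph">
      <segment id="0" name="soma">
        <proximal x="0" y="0" z="0" diameter="10"/>
        <distal x="10" y="0" z="0" diameter="10"/>
      </segment>
      <segment id="1" name="dend">
        <parent segment="0"/>
        <distal x="60" y="0" z="0" diameter="2"/>
      </segment>
    </morphology>
  </cell>
</neuroml>"""


def read_segments(xml_text):
    """Return {segment id: info dict} for every segment in the document."""
    root = ET.fromstring(xml_text)
    segments = {}
    for seg in root.iterfind(".//nml:morphology/nml:segment", NML_NS):
        parent = seg.find("nml:parent", NML_NS)
        distal = seg.find("nml:distal", NML_NS)
        segments[int(seg.get("id"))] = {
            "name": seg.get("name"),
            # The root segment has no <parent> element.
            "parent": None if parent is None else int(parent.get("segment")),
            "distal": tuple(float(distal.get(k)) for k in ("x", "y", "z")),
            "diameter": float(distal.get("diameter")),
        }
    return segments


for seg_id, info in sorted(read_segments(EXAMPLE).items()):
    print(seg_id, info["name"], "parent:", info["parent"])
```

The parent links give the tree structure that a multi-compartmental morphology needs; ion channel definitions and their distributions live elsewhere in the file and are not touched here.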
Skills: Python programming, experience with computational neuroscience desirable, experience with XML-based formats helpful
13.2 Random numbers
Description: Random numbers are an important part of neural simulations, used when setting up a simulation (stochastic synaptic connectivity, random distribution of synaptic weights or delays, etc.) as well as during a simulation (simplified "Poisson neurons", stochastically fluctuating input conductances, unreliable synapses, etc.). Currently, Brian 2 generates random numbers using the well-established Mersenne Twister algorithm (the algorithm that is also used in the numpy library). The current system allows the generation of reproducible random numbers but has a few shortcomings:
- Random number generation is somewhat slow
- Random numbers are not reproducible across code generation targets and across different numbers of threads
We would like to improve the random number generation system, by:
- Introducing a general interface that allows switching to a different random number generator
- Integrating an existing counter-based random number generator (e.g. Random123, also used by the NEURON simulator) into Brian 2, and implementing random number calls in a way that makes random number streams reproducible independent of the code generation target and the number of threads.
- Evaluating and documenting the options for the user
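The key property of a counter-based generator is that each value depends only on a key and a counter, not on how many values were drawn before it. The sketch below illustrates this property only: it uses SHA-256 from the Python standard library as a stand-in for a real counter-based generator such as those in Random123, it is far slower than one, and the function names are hypothetical.

```python
import hashlib
import struct


def counter_uniform(seed, stream, counter):
    """Uniform float in [0, 1) determined only by (seed, stream, counter)."""
    key = struct.pack("<QQQ", seed, stream, counter)
    digest = hashlib.sha256(key).digest()
    # Keep the top 53 bits of the first 8 bytes: enough for a double's mantissa.
    (value,) = struct.unpack("<Q", digest[:8])
    return (value >> 11) / float(1 << 53)


def draw_for_neurons(seed, timestep, neuron_indices):
    """One random number per neuron, using the neuron index as the stream."""
    return {i: counter_uniform(seed, i, timestep) for i in neuron_indices}


# Drawing for all neurons at once, or in two chunks (as two threads might),
# gives exactly the same numbers.
all_at_once = draw_for_neurons(42, 7, range(8))
split = {}
for chunk in (range(0, 3), range(3, 8)):
    split.update(draw_for_neurons(42, 7, chunk))
assert all_at_once == split
```

Because no hidden state is advanced between calls, the same numbers come out regardless of the order in which neurons are processed or how the work is divided.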
Skills: Python and C programming, experience with using mathematical libraries desirable, experience with computational neuroscience helpful
13.3 Model encapsulation
Description: Brian's "standalone mode" gives the user maximal performance by converting the full model description into a set of C++ files and transparently compiling and executing these files. However, the resulting binaries lack the flexibility to adjust parameters without recompiling, thus negating some of the speed-up and limiting its use in model fitting. The aim of this project is to implement a new “encapsulation” mode for Brian replacing the existing standalone mode. In addition to the current standalone-binary, this mode will generate a C++ package and API for a user’s model which does not depend on Python or Brian, and takes as arguments the values of the chosen parameters so that it does not need to be recompiled when these change.
Skills: C++ and Python programming, experience with Brian highly desirable
13.4 Improving Brian's parallelization capabilities (OpenMP)
Description: Brian can parallelize simulations over multiple processor cores by making use of the OpenMP framework. However, in its current state Brian does not yet make full use of the parallelization potential. In addition, there are some corner cases where activating parallelization can lead to incorrect results (this is why OpenMP support is still marked as "experimental").
The aim of this project is to finalize the OpenMP support, by:
- deciding whether parallelization can be safely used based on information about the respective code fragment (which variables are read/written in what way)
- identifying and implementing additional opportunities for parallelization
- extensive testing
- Optional ("stretch goal"): transferring the OpenMP support to other code generation targets (Cython, C++ via weave)
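To make the first point concrete, here is one possible conservative rule, sketched in Python. It is an assumption for illustration, not Brian's actual analysis: the `Access` record, the index categories and the `can_parallelize` helper are all hypothetical.

```python
from collections import namedtuple

# mode:  "read" or "write"
# index: "own"      -> element accessed at the loop index itself
#        "indirect" -> element accessed through another index array
#        "scalar"   -> a single shared value
Access = namedtuple("Access", ["variable", "mode", "index"])


def can_parallelize(accesses):
    """Conservatively decide whether loop iterations are independent."""
    written = {a.variable for a in accesses if a.mode == "write"}
    for a in accesses:
        if a.mode == "write" and a.index != "own":
            # Two iterations might write to the same element.
            return False
        if a.mode == "read" and a.variable in written and a.index != "own":
            # One iteration might read what another iteration writes.
            return False
    return True


# Neuron state update: each iteration touches only its own element of v.
state_update = [
    Access("v", "read", "own"),
    Access("tau", "read", "scalar"),
    Access("v", "write", "own"),
]

# Synaptic propagation: several synapses may target the same neuron.
propagation = [
    Access("w", "read", "own"),
    Access("v_post", "write", "indirect"),
]

print(can_parallelize(state_update))  # prints True
print(can_parallelize(propagation))   # prints False
```

A rule like this rejects some loops that could be parallelized with atomic updates or reductions, which is where the additional opportunities mentioned above would come in.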
Skills: C++ and Python programming, experience with OpenMP or other parallelization techniques helpful
13.5 Realtime simulations
Description: Brian 2 can generate code in a so-called "standalone mode", where it generates a full set of code files that encapsulate the full simulation without references to external libraries (apart from the standard library). Furthermore, it allows high-level simulation code (differential equations, etc.) to be combined with user-specified low-level functions written in the target language (e.g. C++). This can be used to connect a neural simulation to external input (e.g. from a camera or a microphone) or to actuators (e.g. a motor). These features make Brian 2 potentially well suited to writing neural simulation code that runs on an external device such as a robot. However, the simulation time currently advances in fixed time steps, completely independent of "real time". To make Brian more useful in such contexts, we would therefore like to support two modes of "realtime" simulation:
- a mechanism that slows down a simulation so that its progress matches real time
- a mechanism that adapts the simulation time step so that it matches the real time that has passed since the last update
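The two modes can be sketched in plain Python as follows. This is a conceptual illustration only, not Brian code: `step` stands for whatever advances the simulation by a given amount of simulated time, and the function names are hypothetical.

```python
import time


def run_throttled(step, dt, n_steps, clock=time.monotonic, sleep=time.sleep):
    """Mode 1: fixed time step; sleep whenever the simulation is ahead of the wall clock."""
    start = clock()
    for i in range(n_steps):
        step(dt)
        # Simulated time reached so far, minus wall-clock time elapsed so far.
        ahead = (i + 1) * dt - (clock() - start)
        if ahead > 0:
            sleep(ahead)


def run_adaptive(step, n_steps, clock=time.monotonic):
    """Mode 2: each step advances by the wall-clock time elapsed since the last update."""
    last = clock()
    for _ in range(n_steps):
        now = clock()
        # The very first step sees almost no elapsed time.
        step(now - last)
        last = now
```

The first mode only works if one step can be computed in less than `dt` of wall-clock time; otherwise the simulation falls behind and never sleeps. The second mode always keeps up but makes the time step variable, which affects the accuracy of numerical integration. The `clock` and `sleep` parameters exist so that the loops can be exercised with a simulated clock.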
Skills: C++ and Python programming, experience with Brian or other neural simulators helpful
14 MOOSE simulator projects
14.1 Efficient Estimation/Optimization of Biochemical Models in MOOSE Simulator
MOOSE (Multiscale Object Oriented Simulation Environment) is designed to simulate multiscale neural networks; for example, it can simulate a detailed electrical neural model with localised biochemical reactions. These underlying biochemical pathways are fundamental to the diverse computations a neuron performs.
The set of reactions and molecular players involved in a biochemical pathway are usually known. One can collect the outcomes of various experiments from the literature. Given the reaction network and experimental data, the aim of this project is to estimate the model (e.g. rate parameters of reactions, concentration of intermediate species) such that the model can ‘explain/fit’ the experimental data with ‘reasonable’ accuracy. In other words, it is an optimization problem where the parameters of the model need to be tweaked by the optimizer so that the model does a good job of replicating the available data.
In particular, this project involves the following major tasks:
- Given chemical models in SBML (or MOOSE) format and a set of experimental results in a table, formulate the optimization problem H. Language: Python. XML parsing. Familiarity with python-libsbml is a plus.
- Find strategies to efficiently solve H. Language: Python for prototyping.
- Implement the final strategy as a solver in MOOSE in C++. Familiarity with the GNU Scientific Library and/or Boost libraries.
- Optional: CUDA/OpenMP/multithreaded implementation.
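As a toy instance of such an optimization problem, the sketch below fits the rate constant of a single first-order reaction A -> B to a table of concentrations by minimizing a sum of squared errors. Everything in it is an assumption for illustration: the data are synthetic, the model has a closed-form solution rather than being simulated by MOOSE, and golden-section search is just one simple one-dimensional optimizer.

```python
import math


def simulate_A(k, a0, times):
    """Closed-form concentration of A for the reaction A -> B with rate constant k."""
    return [a0 * math.exp(-k * t) for t in times]


def objective(k, a0, times, observed):
    """Sum of squared differences between model output and data."""
    return sum((m - o) ** 2 for m, o in zip(simulate_A(k, a0, times), observed))


def golden_section_minimize(f, lo, hi, tol=1e-8):
    """Minimize a unimodal function f on the interval [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    while hi - lo > tol:
        if f(c) < f(d):
            hi = d      # the minimum lies in [lo, d]
        else:
            lo = c      # the minimum lies in [c, hi]
        c = hi - inv_phi * (hi - lo)
        d = lo + inv_phi * (hi - lo)
    return (lo + hi) / 2


# Synthetic "experimental" table generated with k = 0.5 (illustration only).
times = [0.0, 1.0, 2.0, 4.0, 8.0]
observed = simulate_A(0.5, 1.0, times)

k_fit = golden_section_minimize(
    lambda k: objective(k, 1.0, times, observed), 0.0, 5.0
)
print(round(k_fit, 4))
```

Realistic pathways have many parameters, noisy data and objective functions with several local minima, which is why finding efficient solution strategies is a task in its own right.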
The student should be familiar with scientific computing in Python and C++. He/she should also be familiar with optimization techniques involved in model estimation. Familiarity with parallelizing algorithms using CUDA/OpenMP is a plus.
Programming Languages: Python, C++, CUDA/OpenMP (optional).
Skill Set: Optimization techniques, Model estimation, Concurrent Programming.
Mentors: Upinder Bhalla (email@example.com), Dilawar Singh (firstname.lastname@example.org)
Institute: National Centre for Biological Sciences, Bangalore, India
14.2 Optimization/Parallelization of Neural Networks with Sequence Recognizing Elements in MOOSE Simulator
Biological neurons differ from the logistic units used in artificial neural networks in several ways. The complexity and diversity of biological neurons can give rise to different types of interesting computations, such as sequence recognition by individual neurons. MOOSE (Multiscale Object Oriented Simulation Environment) is a tool that can be used to explore such computations at multiple scales using abstract as well as detailed models.
Currently MOOSE implements sequence recognition in abstract model neurons suitable for making large neural networks. The aim of this project is to optimize and parallelize this feature, in order to improve the computational efficiency.
The tasks involved in this project are:
- Become familiar with the MOOSE C++ core (previous work).
- Write a parallel solver for the sequence recognizer.
- Write benchmarks and tests for the solver.
The student should be familiar with C++ and, optionally, Python. Knowledge of parallelization and code optimization techniques is highly desirable. Familiarity with the GNU Scientific Library is required. Familiarity with the Boost libraries is a plus.
Programming Languages: C++, Python (optional)
Skill Set: Optimization techniques, Parallelization techniques
Mentors: Upinder Bhalla (email@example.com), Bhanu Priya (firstname.lastname@example.org)
Institute: National Centre for Biological Sciences, Bangalore, India
15. MRI Defacing Detector
Image courtesy of NAMIC
General intro: Magnetic resonance imaging (MRI) data is expensive to acquire, and is subject to privacy and ethical restrictions that don’t exist in other scientific domains. Nevertheless, public sharing of MRI data is enabling larger-scale neuroimaging studies, and pooling data from multiple studies is enabling many new research scenarios. Public sharing is also democratizing access to precious resources, allowing new and less wealthy investigators to participate in the great scientific endeavour of understanding the brain and treating neurological disorders. To ethically share MRI data, images must be prepared so that research subjects (or patients) cannot be identified from the image. One of the de-identification steps involves removing (or masking) the part of the image that corresponds to the face, so that the subject cannot be visually recognized. The aim of this project is to construct an automatic classifier that would reliably detect whether this “defacing” step was already performed, and thus if the data is ready to be shared.
Aims: The project will consist of the following stages:
- Preparation of a dataset. MRI data must be assembled that captures the range of variability possible for MRI data, in terms of subject age, possible subject pathology, MRI scanner type, MRI acquisition parameters and MRI field strength. From this dataset, defaced images can be produced in different ways, for example using the pydeface or mri_deface tools.
- Training and evaluation of the classifier. High specificity will be prioritized over sensitivity, and the final test results should be reported on data from scanning sites (and maybe defacing tools) that were never seen in the training dataset.
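Prioritizing specificity here means that an image that still contains a face should almost never be reported as defaced. The sketch below shows one way to evaluate this and to pick a decision threshold accordingly. It assumes a classifier that outputs a score per image, with "defaced" as the positive class; the scores, labels and helper names are made up for illustration.

```python
def sensitivity_specificity(scores, labels, threshold):
    """labels: True = defaced (positive class); predict defaced when score >= threshold."""
    tp = fp = tn = fn = 0
    for score, is_defaced in zip(scores, labels):
        predicted = score >= threshold
        if predicted and is_defaced:
            tp += 1
        elif predicted and not is_defaced:
            fp += 1     # a face would be shared by mistake
        elif not predicted and is_defaced:
            fn += 1     # a safe image is held back unnecessarily
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return sensitivity, specificity


def pick_threshold(scores, labels, min_specificity):
    """Lowest observed score that meets the specificity target, or None.

    Raising the threshold never lowers specificity and never raises
    sensitivity, so the lowest qualifying threshold keeps sensitivity
    as high as the target allows.
    """
    for threshold in sorted(set(scores)):
        sens, spec = sensitivity_specificity(scores, labels, threshold)
        if spec >= min_specificity:
            return threshold, sens, spec
    return None


# Made-up held-out scores and labels, for illustration only.
scores = [0.1, 0.4, 0.35, 0.8, 0.9, 0.65, 0.7, 0.2]
labels = [False, False, True, True, True, False, True, False]
print(pick_threshold(scores, labels, min_specificity=1.0))
# prints (0.7, 0.75, 1.0)
```

In line with the evaluation plan above, the threshold would be chosen on validation data and the final figures reported on scanning sites that were not part of training.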
Mentors: Chris Gorgolewski, Stanford University and Andrew Doyle, McGill University
16. BIDS-starter-kit: Creating tutorials for Brain Imaging Data Structure (BIDS)
Context and motivation: The Brain Imaging Data Structure (BIDS) is a new standard for neuroimaging data organization that improves interoperability across labs, enabling data discovery and innovation. However, as a new standard, there is still a relative lack of accessible materials to encourage user adoption.
The proposed project aims to address this gap by developing a series of interactive tutorials that will serve to onboard users to the BIDS ecosystem through the BIDS-starter-kit. As BIDS is a community-driven effort, tutorials will be created in consultation with the existing BIDS community. Evaluating usability and interacting with BIDS users/developers to integrate relevant feedback will therefore take up a significant portion of the project time.
Figure caption: If you’ve ever tried to understand someone else’s directory structure for their neuroimaging analyses, you’ll know how hard it is. The BIDS-starter-kit GSoC student will build interactive tutorials to help thousands of researchers around the world make the whole field of neuroimaging more efficient and reproducible.
Tool description: Tutorials will be implemented as a compilation of Jupyter Notebooks, a language-agnostic platform for interspersing code with narrative text and resulting output. Tutorials will be made interactive via Binder integration, allowing users to execute all material directly within the browser. Content for tutorials will be sourced from discussions with active BIDS community members but will cover basic content including: creating relevant scan metadata, validating that existing data sets are in BIDS format, and employing BIDS-oriented applications (i.e., BIDS Apps) for data analysis.
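A tutorial cell on validation might start from something as small as the following sketch, which checks whether a file path follows the basic BIDS naming pattern. It is a deliberately simplified illustration, not the official BIDS validator: it covers only a fraction of the naming rules, and the `looks_like_bids` helper and the regular expression are assumptions made for this example.

```python
import re
from pathlib import PurePosixPath

NAME_RE = re.compile(
    r"^sub-(?P<sub>[A-Za-z0-9]+)"        # subject entity, e.g. sub-01
    r"(?:_[A-Za-z]+-[A-Za-z0-9]+)*"      # further key-value entities (ses-, task-, run-, ...)
    r"_(?P<suffix>[A-Za-z0-9]+)"         # suffix such as T1w or bold
    r"\.(?:nii\.gz|nii|json|tsv)$"       # a few common extensions only
)


def looks_like_bids(path):
    """Rough check of a dataset-relative file path against the naming pattern."""
    p = PurePosixPath(path)
    match = NAME_RE.match(p.name)
    if match is None:
        return False
    # The file must sit somewhere below the matching sub-<label> directory.
    return f"sub-{match.group('sub')}" in p.parts[:-1]


for path in [
    "sub-01/anat/sub-01_T1w.nii.gz",
    "sub-01/func/sub-01_task-rest_bold.nii.gz",
    "sub-01/anat/T1.nii.gz",
    "sub-02/anat/sub-01_T1w.nii.gz",
]:
    print(path, looks_like_bids(path))
```

The first two paths pass and the last two fail: one lacks the subject entity and suffix, the other sits in the wrong subject directory. A tutorial would then hand over to the official validator for the full set of rules.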
Project description and aims: This project is aimed towards students seeking to develop their coding, community development, and user support skills. The successful candidate will gain both 1) real world experience engaging with a wide range of researchers and developers as well as 2) experience with design thinking for open innovation.
Measurable outcomes include the creation of a series of interactive, lightweight tutorials for orienting new adopters to the BIDS ecosystem. The Nipype tutorial is an excellent example of the format we aim to develop.
Skills needed/desired: Interested students should be comfortable with the Python, MATLAB, and R programming languages in order to interface with a wide variety of neuroimaging tools. A basic familiarity with neuroimaging data formats and preprocessing would also be desirable. Project development will be on GitHub, so an appetite to learn git and open project management is important if the student does not already have this experience.
Keywords: Python; R; MATLAB; Jupyter notebooks; Binder; Docker; user experience; usability; community development; brain imaging; reproducible research
Mentors: Kirstie Whitaker (Alan Turing Institute and University of Cambridge) with Elizabeth DuPre (McGill University) and Dora Hermes (Stanford University)