GSoC 2017 project ideas

We have been approved as a mentoring organization again in 2017, with projects proposed and mentored by our Nodes and international community.

Below is our project ideas list, which is now final.

We have a discussion forum for mentors and interested students. To be added, email gsoc@incf.org with a description of which project you are interested in.

If you have general questions about INCF and our participation in GSoC, please contact us at gsoc@incf.org.


Proposals and ideas for potential INCF projects within Google Summer of Code


G-Node projects

1.1 NIX Dataframe support

The NIX[1] project aims to develop standardized methods and models for storing electrophysiology and other neuroscience data together with their metadata in one common, open file format. It does so by having a central DataArray object that can hold n-dimensional homogeneous data, linked with units and other metadata. Although a DataArray can hold any type of data (int, float, etc.), it can only hold one type at a time. In recent years, working with heterogeneous data in the form of tables (DataFrames) has become more and more popular (cf. pandas[2] for Python). The aim of this GSoC project would be to develop a proof-of-concept for such DataFrames in NIX and its Python bindings (so that pandas DataFrames can be read from and written to NIX files).
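To make the gap concrete: a homogeneous array holds a single value type, whereas a table has one type per column. One possible mapping, shown here purely as a sketch, stores every column as its own homogeneous typed array together with a name and a unit. The names `Column` and `Table` are hypothetical and are not part of the NIX or pandas APIs.

```python
from array import array
from dataclasses import dataclass, field


@dataclass
class Column:
    """One homogeneous column: a single value type plus a unit (hypothetical)."""
    name: str
    values: object          # array('d'), array('q'), or a list of str
    unit: str = ""


@dataclass
class Table:
    """A heterogeneous table built from homogeneous columns of equal length."""
    columns: list = field(default_factory=list)

    def add_column(self, column):
        if self.columns and len(column.values) != len(self.columns[0].values):
            raise ValueError("all columns must have the same number of rows")
        self.columns.append(column)

    def row(self, index):
        """Reassemble one mixed-type row from the per-column storage."""
        return {c.name: c.values[index] for c in self.columns}


# Made-up example data: integers, floats and strings side by side.
table = Table()
table.add_column(Column("trial", array("q", [1, 2, 3])))
table.add_column(Column("latency", array("d", [12.5, 11.0, 13.2]), unit="ms"))
table.add_column(Column("condition", ["control", "stim", "stim"]))

print(table.row(1))
```

Keeping each column homogeneous means that every column could, in principle, reuse the existing single-type storage, while the table only adds the bookkeeping that ties the columns together.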

Skills needed: C++ (C++11) and Python.

Mentors: G-Node Developers (Jan Grewe, @jgrewe; Achilleas Koutsou, @achilleas-k)

[1] https://github.com/G-Node/nix
[2] http://pandas.pydata.org

1.2 NIX filesystem backend support

The NIX[1] project aims to develop standardized methods and models for storing electrophysiology and other neuroscience data together with their metadata in one common, open file format. Currently NIX uses HDF5 files[2] to store the data on disk. However, the NIX C++ library was designed from the beginning with the idea that different use cases might require different storage solutions. Therefore NIX has the concept of backends that are responsible for writing and reading the data. The basic groundwork has been done for a ‘filesystem’ backend that stores the data not in an HDF5 file but directly on the filesystem, using appropriate binary file formats, such as NumPy’s npy format[3], for numeric data and YAML for metadata. The aim of this GSoC project would be to complete the implementation of the ‘filesystem’ backend to the same level as the HDF5 backend, and to implement copy and sync methods to be able to copy between files of possibly different backends.

Skills needed: C++ (C++11)
Mentors: G-Node & NIX Core Team (Jan Grewe, @jgrewe; Achilleas Koutsou, @achilleas-k)

[1] https://github.com/G-Node/nix
[2] https://en.wikipedia.org/wiki/Hierarchical_Data_Format
[3] http://docs.scipy.org/doc/numpy/neps/npy-format.html

1.3 odml Editor - Enhancements and Python 3 compatibility

The odml[1, 2] data model is designed to store arbitrary experimental metadata in a hierarchical format. The odml data model is also part of the NIX data format [3] that links metadata and data. We offer a GUI, implemented in Python, for viewing and editing odml data files (XML). So far it has been implemented as part of the core Python odml library for Python 2, but it needs to be prepared for the future, i.e. Python 3.x. This project would include separating the UI-related parts of the codebase to make the editor stand-alone, making it Python 3.x compatible, and allowing the editing of metadata originating from NIX files.

Skills needed: Python (2.x and 3.x)
Mentors: G-Node & NIX Core Team (Achilleas Koutsou, @achilleas-k; Jan Grewe, @jgrewe; Michael Sonntag, @mpsonntag)

[1] https://github.com/G-Node/python-odml
[2] Grewe et al., 2011, Frontiers in Neuroinformatics, http://dx.doi.org/10.3389/fninf.2011.00016
[3] https://github.com/G-Node/nix


2. Distributed data storage for electronic assistive system

Description: We perform electroencephalography (EEG) and event-related potential (ERP) experiments. Our current work includes the development of an assistive system for brain-impaired people. The system collects EEG data from a subject exposed to audio/video stimulation. The collected data are processed by customized classifiers and feedback is provided. Based on this feedback, a desired activity can be performed, such as switching on a TV or opening a window blind.

These experiments produce large collections of data and unstructured metadata. Management and long-term storage of these data are crucial for their processing. In the initial phase we expect tens of tested subjects, and we will perform several experiments per subject. As the number of experiments increases, so do the demands on flexible data storage and distributed data processing. Fast and accurate processing of such data enables on-line interaction with the tested subject.

In current research, large data sets and their processing are moving from local computers to cloud-based solutions. These data (so-called Big Data) are processed in a distributed manner: the server load is spread across different nodes, and this scaling is often implemented with MapReduce functions.

Aims: The aim for an applicant is to develop a distributed system for long term storage of experimental data, and provide suitable MapReduce functions for distributed processing of stored data.

We will provide the data and the signal processing methods; the applicant will write MapReduce functions for distributed processing and implement distributed data storage using, e.g., Apache Hadoop. Finally, a suitable user interface will be implemented as well.
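For applicants unfamiliar with the MapReduce pattern, the following single-process Python sketch shows its three stages on made-up data: a map function emits key/value pairs, a shuffle groups them by key, and an associative reduce combines each group. It only illustrates the pattern; it does not use the Hadoop API, and the function names and sample values are assumptions chosen for the example.

```python
from collections import defaultdict
from functools import reduce


def map_phase(record):
    """Map: emit (channel, (sum, count)) for one recorded sample."""
    channel, amplitude = record
    return channel, (amplitude, 1)


def shuffle(pairs):
    """Group mapped values by key, as a framework would do between the phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(values):
    """Reduce: add up partial (sum, count) pairs; associative, so it can run per node."""
    return reduce(lambda a, b: (a[0] + b[0], a[1] + b[1]), values)


# Made-up (channel, amplitude in microvolts) samples, not real EEG data.
samples = [("Cz", 4.0), ("Pz", 2.0), ("Cz", 6.0), ("Pz", 4.0), ("Fz", 1.0)]

grouped = shuffle(map(map_phase, samples))
mean_amplitude = {}
for channel, values in grouped.items():
    total, count = reduce_phase(values)
    mean_amplitude[channel] = total / count

print(mean_amplitude)
```

Because the reduce step only adds partial sums and counts, each node can reduce its own share of the data first and the partial results can be merged afterwards.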

Skills: Java, Java EE, Maven, JSON, XML, Web-Services (SOAP and REST), GitHub, Hadoop

Mentor: Petr Ježek (University of West Bohemia, Czech Republic, Czech National INCF Node)
Co-mentor: Roman Mouček (University of West Bohemia, Czech Republic, Czech National INCF Node)


GPU enhanced Neuronal Networks (GeNN)

GeNN is an open source framework for GPU accelerated simulations of spiking neuronal networks based on code generation methods. Users define neuronal networks in a simple C++ API. GeNN translates this model description into optimised CUDA and C/C++ code that can then be used in user-side application code to simulate the described network. Depending on the GPU hardware and the model details, GeNN can achieve speedups ranging from none to 500x.

Below are a number of proposals that suggest improvements and extensions to GeNN.

3.1. A PyNN interface to GeNN

PyNN is a Python based framework for describing neuronal network models. It is widely used in the computational neuroscience and neuromorphic computing communities. The proposal is to develop a PyNN interface for GeNN so that users of PyNN will be able to benefit from accelerated GPU simulations with GeNN. Important aspects of this work will be a flexible design that allows for future changes in both PyNN and GeNN, good coverage of the entire PyNN model range and optimised data management between Python and the C/C++ based GeNN.

Skills required: Python, PyNN, C/C++; experience with neuronal network simulations and CUDA would be helpful.

Mentors: Jamie Knight (J.C.Knight@sussex.ac.uk), Michael Schmuker (m.schmuker@biomachinelearning.net), and Thomas Nowotny (t.nowotny@sussex.ac.uk)

3.2. Adding MPI support to GeNN

Currently GeNN generates C/C++ and CUDA code designed to run on a single GPU on a single shared-memory system. However, in order to improve their power-efficiency, modern computer cluster and supercomputer systems have begun to include GPU acceleration. By adding support for MPI (Message Passing Interface) to GeNN, the simulation of large models could be further accelerated by distributing it amongst multiple nodes using MPI.
An important aspect of this work will be to balance the parallelisation of simulations across the hierarchy of MPI-based parallelism of hosts and the thread/block based parallelism on individual GPUs.

Skills required: C/C++, MPI; experience with neuronal network simulations would be helpful.

Mentors: Jamie Knight (J.C.Knight@sussex.ac.uk), Thomas Nowotny (t.nowotny@sussex.ac.uk)


4. Open source, cross simulator, large scale cortical models

Description: An increasing number of studies are using large scale network models incorporating realistic connectivity to understand information processing in cortical structures. High performance computational resources are becoming more widely available to computational neuroscientists for this type of modelling, and general purpose, well tested simulation environments such as NEURON and NEST are widely used. New, well annotated experimental data and detailed compartmental models are becoming available from the large scale brain initiatives. However, the majority of neuronal models which have been published over the past number of years are only available in simulator specific formats, illustrating a subset of features associated with the original studies.

This work will involve converting a number of published large scale network models into open, simulator independent formats such as NeuroML and PyNN and testing them across multiple simulator implementations. They will be made freely available to the community through the Open Source Brain repository (http://www.opensourcebrain.org) for reuse, modification and extension.

Skills

Required: Python; XML; open source development; computational modelling experience.
Desirable: Java experience; a background in theoretical neuroscience and/or large scale modelling.

Aims:

  1. Select a number of large scale cortical network models for the conversion & testing process.
  2. Convert network structure and cell/synaptic properties to NeuroML and/or PyNN. Where appropriate use the simulator independent specification in LEMS to specify cell/synapse dynamics & to allow mapping to simulators. Implementing extensions to PyNN, NeuroML or other tools may be required.
  3. Make models available on the Open Source Brain repository, along with documentation and references.

Mentor: Padraig Gleeson, UCL

Keywords: Python, XML, networks, modelling, simulation


The Virtual Brain: An open-source simulator for whole brain network modeling.

The Virtual Brain (TVB) is one of the few open source neuroinformatics platforms used to simulate whole brain dynamics. Models are not limited to the human brain, and researchers can also work with the macaque's or the rodent's connectome. Models based on biologically realistic macroscopic connectivity will hopefully help us to understand the global dynamics observed in the healthy and diseased brain.
Whether you are interested in beautiful visualizations or differential equations, you can join us and help us improve!

Several open issues in TVB, addressed by the following proposals, involve:

  • enhancing visualization
  • improving portability

Further information:
TVB's main web site is http://www.thevirtualbrain.com/ and more technical documentation can be found at http://docs.thevirtualbrain.com/

5.1.  Modernize visualizers (WebGL, HTML5)

Description: Data visualization plays a crucial role in TVB's neuroinformatics platform. Some of our viewers are already old (developed when WebGL was at version 0.9). We want them improved (in fps and code quality) and extended in terms of user experience. The student will need to propose an interaction paradigm and implement it. We have over 20 existing viewers, from which the student can choose which ones he or she wants to review. We also have about 4 new viewers that could be implemented from scratch by the student. Rendering performance as well as per-element interaction is important.

Skills required: HTML5/JS/CSS & Python; Experience in web development, JQuery, SVG, WebGL, as well as server side frameworks such as CherryPy, is helpful.

Mentors: Lia Domide (@liadomide), Mihai Andrei (@mihandrei)

5.2. Neural mass models in NeuroML/LEMS

Description: TVB provides many options in terms of neural mass models. However, comparing these models to simulations from other simulators such as NEST or PyNN remains challenging because those tools do not implement neural mass models such as those in TVB. A standard model description language, NeuroML / LEMS, has been developed. This project proposes to translate TVB's neural mass models into the NeuroML or LEMS format, test their behavior against the current Python implementation and publish them as an open source resource.

Skills required: Python & XML, as well as experience with neural mass models & differential equations.

Mentors: Lia Domide (@liadomide), Mihai Andrei (@mihandrei)


6. Active Segmentation Toolbox for ImageJ

ImageJ is a public domain Java image processing program extensively used in life sciences. The program was designed with an open architecture that provides extensibility via Java plugins. User-written plugins make it possible to solve almost any image processing or analysis problem or integrate the program with 3rd party software.

The Active Segmentation plugin is a redesign of the existing Trainable Weka Segmentation (TWS) plugin for ImageJ. The platform was developed in the context of GSoC 2016. Active Segmentation was developed with the main goal of providing a general-purpose environment that allows biologists and other domain experts to transparently use state-of-the-art machine learning techniques to improve their image segmentation results.

Active Segmentation provides generic functionality and a user-friendly interface, so that the user can include state-of-the-art filters and machine learning frameworks from the WEKA library:

  • active learning
  • multi-instance learning designed by third party in a robust manner.

The platform is still under development, although the main functionality was completed in the context of GSoC 2016. In last year's Google Summer of Code, the major focus was on integrating generic filter families, and specifically on one family of filters, i.e. Gaussian Scale Space.

Motivation: We would like to expand the existing functionality of the Active Segmentation plugin to incorporate learning from entire images presented as instances. In this way image classification functionality can be achieved.

The Project: The project will start by examining the existing Active Segmentation plugin with the aim of adding statistical features (in addition to the filters already present) as an extra module.
The immediate objectives of the development are to

  • Develop a proof of concept module, which computes Zernike image moments based on a Region of Interest.
  • Propose the necessary design changes in Active Segmentation to incorporate entire image features.
  • Update the existing graphical user interface to handle both filters and statistical features.
  • Update the meta-data export functionality to handle the new set of features.

Minimal set of deliverables

  • Requirement specification - Prepared by the candidate after understanding the functionality.
  • System Design - Detailed plan for development of the plugin and test cases.
  • Implementation and testing - Details of implementation and testing of the plugin.

The candidate

Required skills: Experience with Java
Desired skills: experience with ImageJ, machine learning preferably WEKA

Mentors: Dimiter Prodanov (dimiterpp@gmail.com), INCF Belgian Node; (backup) Sumit Vohra, KU Leuven


7. Migrating Ultra-High Resolution Neuroimaging Tools To Python

Description: The CBS High-Res Brain Processing Tools a.k.a. CBSTools is a suite of Open Source software tools for processing ultra-high resolution brain imaging data from high field MRI at sub-millimeter resolutions. Our group has developed these tools in Java as a set of plugins for the MIPAV software package and the JIST pipeline environment. In response to growing community interest, we have recently begun to migrate CBSTools to Python. For this purpose we use the JCC package, a C++ generator to produce the code necessary to call Java classes from CPython, which already provides a basic infrastructure. Our goal is to add user-friendly Python interfaces for our tools, document them through concrete examples and facilitate integration with other popular Python-based Neuroimaging tools such as Nibabel, Nilearn and Nipype. With this we aim to make high-resolution data processing tools available to the broader community and hope to encourage other scientists to contribute with their own code. As this work is just beginning, many open tasks could be addressed in a Google Summer of Code project, such as:

  • encapsulation of the Java code using JCC and creation of Python packages;
  • implementation of Python interfaces for calling CBSTools and manipulating data;
  • integration with Nibabel (data formatting), Nipype (workflow management) and Nilearn (data post-processing and visualization);
  • generation of example workflows on open neuroimaging data.

Skills: The student should be proficient in Python and have a background in MR image processing. Basic knowledge of Java would be advantageous, as would a familiarity with other Python-based Neuroimaging tools and experience in Open Source software development.

Mentors: Pierre-Louis Bazin (@piloubazin), Christopher Steele (@steelec)

Links

  • https://www.nitrc.org/projects/cbs-tools/
  • https://github.com/piloubazin
  • https://github.com/steelec

8. In-memory compression for neuroscience applications.

Context/Motivation: Simulations of brain models involve very large volumes of data. With our current techniques, for instance, models of the complete human brain would require hundreds of petabytes. Meeting such a requirement would create huge demands on storage infrastructure. Given limited hardware bandwidth, it would also place a severe constraint on performance.

The goal of the proposed project is to explore the use of an external library (e.g. miniz or zlib) to compress data in memory. The objective is to integrate this technology into the NEURON simulator, one of the main simulation packages that the Blue Brain Project (BBP) uses in its brain-modelling effort.

Project/tool info: The project will use neuromapp (https://github.com/BlueBrain/neuromapp), a software library developed to facilitate the exploration of BBP software and the NEURON simulator. The library reproduces key BBP algorithms as a collection of mini-apps. NEURON computational functions and memory management are represented by mini-app “kernels”.

The successful candidate for the project will integrate an in-memory compression library of his/her choice into the neuromapp library in the form of a new mini-application, lying between the data management layer and the computational kernels.

A second goal is to design a contiguous C++ container with an STL-like interface that transparently implements a compressed memory store.
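The idea of a transparently compressed store can be sketched in a few lines. The example below is written in Python, using its standard `zlib` module, only to illustrate the concept; it is not the C++ container the project asks for, and the class name `CompressedBlocks` and the block size are assumptions made for the example. Values are collected into fixed-size blocks, each full block is kept compressed in memory, and a block is decompressed only when one of its elements is read.

```python
import zlib
from array import array


class CompressedBlocks:
    """Keeps float64 values in zlib-compressed blocks; decompresses on access."""

    def __init__(self, block_size=1024, level=6):
        self.block_size = block_size
        self.level = level
        self._blocks = []          # compressed bytes, one entry per full block
        self._tail = array("d")    # uncompressed values not yet filling a block
        self._length = 0

    def append(self, value):
        self._tail.append(value)
        self._length += 1
        if len(self._tail) == self.block_size:
            self._blocks.append(zlib.compress(self._tail.tobytes(), self.level))
            self._tail = array("d")

    def __len__(self):
        return self._length

    def __getitem__(self, index):
        if not 0 <= index < self._length:
            raise IndexError(index)
        block, offset = divmod(index, self.block_size)
        if block == len(self._blocks):
            return self._tail[offset]          # still in the uncompressed tail
        values = array("d")
        values.frombytes(zlib.decompress(self._blocks[block]))
        return values[offset]

    def compressed_bytes(self):
        """Bytes currently held: compressed blocks plus the uncompressed tail."""
        return sum(len(b) for b in self._blocks) + len(self._tail) * self._tail.itemsize


store = CompressedBlocks(block_size=256)
for _ in range(1000):
    store.append(-65.0)    # a constant value, which compresses very well
store.append(-64.5)

print(len(store), store[0], store[1000])
print(store.compressed_bytes() < len(store) * 8)
```

The block size is the main design trade-off in such a scheme: larger blocks usually compress better, but every element access then pays for decompressing more data.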

Skills needed: The student should be proficient in C++. Since he/she may be working remotely on BBP machines, familiarity with the Unix Shell and command-line tools is essential. A minimum knowledge of compression algorithms would be an advantage. Since compression/decompression may be asynchronous, knowledge of threading APIs and execution models (e.g. OpenMP, Pthreads) would also be a plus. Regression tests will be based on Boost-test. Familiarity with unit testing, test driven development and Boost test will be considered as an asset.

Mentors: Timothée Ewart - BBP (timothee.ewart@epfl.ch), Cremonesi Francesco - BBP (francesco.cremonesi@epfl.ch), Fabien Delalondre - BBP (fabien.delalondre@epfl.ch)


Improvements to the Brian simulator

Brian (http://www.briansimulator.org) is a widely used simulator for spiking neural networks, written in Python. We believe that a simulator should not only save the time of processors, but also the time of scientists. Brian is therefore designed to be easy to learn and use, highly flexible and easily extensible. In order to benefit from fast-running simulations despite this flexibility, Brian is built on the concept of code generation: user-specified high-level model descriptions are transparently converted into low-level code (e.g. in C++).

Mentors: Marcel Stimberg (marcel.stimberg@inserm.fr), Dan Goodman (d.goodman@imperial.ac.uk)

9.1 Improvements to the Brian simulator: Import NeuroML morphologies

Description: While Brian 2 is most commonly used to simulate networks of single-compartment neurons, it also offers support for multi-compartmental models of potentially complex morphologies (see documentation). While it already offers support to import morphologies in the SWC format (the format used for morphologies on neuromorpho.org), this only concerns the morphologies and not the ion channels and their distributions across the neuron. In recent years, NeuroML has emerged as a common standard to describe detailed models in a simulator-independent way, and a significant number of models ported to this format have been made available (see opensourcebrain.org).

The aim of this project is to:

  • Implement support to import NeuroML morphologies from Brian 2
  • Add convenient (semi-manual) access to other information stored in the NeuroML file, i.e. the LEMS definitions of the ion channels and their distribution
  • Test and evaluate differences between simulations of NeuroML models in Brian 2 and other simulators (such as NEURON)

Skills: Python programming, experience with computational neuroscience desirable, experience with XML-based formats helpful


9.2 Improvements to the Brian simulator: Random numbers

Description: Random numbers are an important part of neural simulations, used when setting up a simulation (stochastic synaptic connectivity, random distribution of synaptic weights or delays, etc.) as well as during a simulation (simplified "Poisson neurons", stochastically fluctuating input conductances, unreliable synapses, etc.). Currently, Brian 2 generates random numbers using the well-established Mersenne Twister algorithm (the algorithm that is also used in the numpy library). The current system allows the generation of reproducible random numbers but has a few shortcomings:

  • Random number generation is somewhat slow
  • Random numbers are not reproducible across code generation targets and across different numbers of threads

We would like to improve the random number generation system, by:

  • Introducing a general interface that allows switching to a different random number generator
  • Integrating an existing counter-based random number generator (e.g. Random123, also used by the NEURON simulator) into Brian 2, and implementing random number calls in a way that allows random number streams to be reproduced independently of the code generation target and the number of threads.
  • Evaluating and documenting the options for the user
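The key property of a counter-based generator is that each random number is a pure function of a key and a counter, with no hidden state that advances from call to call. The sketch below illustrates that property only: it uses SHA-256 from Python's standard library as a stand-in mixing function, not the Philox or Threefry algorithms of Random123, and the function names and the choice of "neuron index as stream, time step as counter" are assumptions made for the example.

```python
import hashlib
import struct


def counter_uniform(seed, stream, counter):
    """Uniform float in [0, 1) computed purely from (seed, stream, counter)."""
    key = struct.pack("<QQQ", seed, stream, counter)
    digest = hashlib.sha256(key).digest()
    # Keep 53 bits of the hash so the value is exactly representable as float64.
    bits = int.from_bytes(digest[:8], "little") >> 11
    return bits / float(1 << 53)


def draw_for_neurons(seed, time_step, neuron_ids):
    """One draw per neuron: the neuron index picks the stream, the time step the counter."""
    return {n: counter_uniform(seed, n, time_step) for n in neuron_ids}


# The same eight neurons, processed in one go or split into two "threads".
all_at_once = draw_for_neurons(42, 7, range(8))
two_chunks = {**draw_for_neurons(42, 7, range(0, 4)),
              **draw_for_neurons(42, 7, range(4, 8))}

print(all_at_once == two_chunks)
```

Since no draw depends on how many draws came before it, the result is the same regardless of how the neurons are divided between threads or in which order they are visited.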

Skills: Python and C programming, experience with using mathematical libraries desirable, experience with computational neuroscience helpful


9.3 Improvements to the Brian simulator: Numerical integration

Description: In Brian 2, differential equations are solved symbolically and transformed into a series of statements that are then converted into the final target-language code and executed every time step (for more details, see our paper). While this explicit solution makes the numerical integration procedure very transparent, it also has some shortcomings:

  • Integration is limited to fixed-step updates, with the same integration timestep for all neurons/synapses within a population
  • For complex equations, symbolically solving the equations takes a significant amount of time, which is inconvenient for short-running simulations.

We would therefore like to extend the current system, by:

  • allowing for a new approach to numerical integration (more similar to the standard approach in most simulators): generate code that describes the system of differential equations (and possibly the Jacobian) and calls a solver (e.g. from the scipy project or the GSL)
  • Evaluating and documenting the options for the user
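The separation described above, between code that only describes the system of differential equations and a generic solver that is called with it, can be sketched as follows. This is a minimal illustration and not Brian's code generation: the hand-written fixed-step Runge-Kutta routine stands in for an external solver such as those in SciPy or the GSL, and the function names, the leaky membrane equation and the parameter values are assumptions chosen for the example.

```python
import math


def rk4_solve(rhs, y0, t0, t1, steps, params):
    """Generic fixed-step Runge-Kutta 4 solver; it knows nothing about the model."""
    h = (t1 - t0) / steps
    t, y = t0, list(y0)
    for _ in range(steps):
        k1 = rhs(t, y, params)
        k2 = rhs(t + h / 2, [yi + h / 2 * k for yi, k in zip(y, k1)], params)
        k3 = rhs(t + h / 2, [yi + h / 2 * k for yi, k in zip(y, k2)], params)
        k4 = rhs(t + h, [yi + h * k for yi, k in zip(y, k3)], params)
        y = [yi + h / 6 * (a + 2 * b + 2 * c + d)
             for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]
        t += h
    return y


def leaky_membrane(t, y, params):
    """Right-hand side of dv/dt = (v_rest - v) / tau: the only model-specific code."""
    (v,) = y
    return [(params["v_rest"] - v) / params["tau"]]


params = {"v_rest": -70.0, "tau": 10.0}    # mV and ms, illustrative values
v_end = rk4_solve(leaky_membrane, [-55.0], 0.0, 20.0, 200, params)[0]

# Analytical solution for comparison: v(t) = v_rest + (v0 - v_rest) * exp(-t / tau)
v_exact = params["v_rest"] + 15.0 * math.exp(-20.0 / params["tau"])
print(abs(v_end - v_exact) < 1e-6)
```

With this split, swapping the solver (for example for an adaptive-step method) does not require touching the model description, which is the flexibility the proposal is after.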

Skills: Python and C programming, experience with solving differential equations numerically desirable, experience with computational neuroscience helpful


9.4 Improvements to the Brian simulator: Model encapsulation (C++ standalone mode)

Description: Brian's "standalone mode" gives the user maximal performance by converting the full model description into a set of C++ files and transparently compiling and executing these files. However, the resulting binaries lack the flexibility to adjust parameters without recompiling, thus negating some of the speed-up and limiting its use in model fitting. The aim of this project is to implement a new "encapsulation" mode for Brian replacing the existing standalone mode. In addition to the current standalone-binary, this mode will generate a C++ package and API for a user's model which does not depend on Python or Brian, and takes as arguments the values of the chosen parameters so that it does not need to be recompiled when these change.

Skills: C++ and Python programming, experience with Brian highly desirable


10. Speeding up functional network analysis on fMRI data with distributed, in-memory computation using Apache Spark

Description: Network analysis of functional MRI data involves many computationally intensive steps, such as registration to high-resolution brain templates, extraction of brain and correlated voxels, and parcellation of brain regions. Rapid processing of individual brains will facilitate new applications of network analysis, e.g. performing neurofeedback with fMRI systems, interactive neuroimaging to adaptively assess brain dynamics, and personalized analysis of brain function. Processing speedup may be achieved by distributing computation over many nodes or by parallelization on a GPU.

In this work, our focus will first be on the distributed computation approach with particular attention to in-memory computation strategies. If there is sufficient interest, the applicant is invited to propose optimizations that could be more efficiently implemented on GPU. The ideal outcome will be a hybrid software system of both GPU and virtual machines which may adaptively combine and manage available resources to the processing task at hand.

Aims

  1. Design and implement an in-memory data partitioning algorithm for fMRI processing
  2. Implement critical-path algorithms in fMRI processing, such as correlation and registration, as Resilient Distributed Dataset (RDD) functions on Spark
  3. Integrate the Spark code on a cluster management platform with neuroimaging management software (XNAT)

Skills
Required: Python; Apache Spark or Hadoop; Memcache / Redis or other in-memory distributed datastores
Desired: Familiarity with fMRI processing pipeline; CUDA and GPU programming

Mentors
Eric TW Ho, UTP, Malaysia (hotattwei@utp.edu.my)
Epifanio Bagarinao, Nagoya University, Japan (ebagarinao@met.nagoya-u.ac.jp)

Keywords: Spark, fMRI, distributed computing, in-memory analytics, GPU


Polish Node projects

11.1 Brain-computer interface components in PsychoPy

Description: PsychoPy [1] is an open-source application allowing for running a wide range of neuroscience, psychology and psychophysics experiments. The tool is unique in giving an experimenter a choice of interface: use the Builder interface to build rich, flexible experiments easily or use the Coder interface to write extremely powerful experiments in the widely-used Python programming language. However powerful, the tool is generic and thus lacks support for many specific types of procedures. One particular group of procedures are those related to brain-computer interfaces based on EEG (BCI). BCI paradigms like P300, SSVEP and motor imagery require procedures with specific hardware support and an interactive user interface that constantly communicates with online classification modules. Consequently, the aim of the project is to enrich PsychoPy with support for procedures used in BCI research. The biggest task will be to design and implement additional components in the Builder interface providing a GUI for creating the above-mentioned procedures.

[1] http://www.psychopy.org/

Skills needed: Advanced Python (2.x and 3.x) programming skills, experience with the GTK library.

Mentors:
Mateusz Kruszyński mateusz.kruszynski@braintech.pl, BrainTech Ltd. http://braintech.pl
Piotr Durka durka@fuw.edu.pl, University of Warsaw http://www.fuw.edu.pl


11.2 Multilingual support in Svarog

Description: Svarog [1] stands for “Signal Viewer, Analyzer and Recorder On GPL”. It provides a commercial-grade, user-friendly interface, the best in the FOSS domain, for the review and analysis of multivariate biomedical time series, mainly EEG. It was created to solve the two basic technical problems encountered in EEG research, which are incompatibility of proprietary EEG data formats and difficult (for non-technical users like neuroscientists) access to the novel algorithms for EEG analysis. The first problem is solved within the framework of SignalML [2][3], which is an XML-based metaformat for machine-readable description of EEG formats, which can be compiled on the fly into code for reading EEG data. As a solution to the latter issue, we propose an open API for plugins operating on the multivariate time series within the mouse-driven Svarog environment. Already implemented methods include FFT, spectrogram, wavelet analysis, matching pursuit, directed transfer function and ICA.

An obvious step towards making the application available to the international community is to support different languages, as well as local formats for dates, times and other values. The best way to do that is to implement internationalisation throughout the application, so that it can be translated into any specific language without any software modifications.

This idea was implemented in the first versions of Svarog, but was partly abandoned in the last decade, and the approach chosen at that time is at risk of becoming deprecated in future versions of Java. As the codebase grew to include new modules and plugins, a need emerged to rethink and implement updated internationalisation support covering 100% of the application. Consequently, the aim of the project is to review and redesign the parts of the current code related to i18n and to implement full support for localising Svarog into any new language.

[1] http://svarog.pl
[2] SignalML: metaformat for description of biomedical time series P.J. Durka and D. Ircha, Computer Methods and Programs in Biomedicine Volume 76, Issue 3, pp. 253-259, December 2004, http://braintech.pl/wp-content/uploads/2015/09/SignalML.pdf
[3] http://signalml.org

Skills needed: Advanced Java programming skills.

Mentors:
Mateusz Kruszyński mateusz.kruszynski@braintech.pl, BrainTech Ltd. http://braintech.pl
Piotr Durka durka@fuw.edu.pl, University of Warsaw http://www.fuw.edu.pl


OpenWorm Projects

12.1 Advanced Neuron Dynamics in WormSim

Description: The OpenWorm project is building a simulation of C. elegans in an open science fashion. Last year, OpenWorm released WormSim, which puts a simple version of the worm simulation online, making it available within a web browser without any need to compile any code, courtesy of Geppetto. Under the hood, Geppetto reuses a lot of open source libraries, both on the browser client side, and many Java-based libraries on the server side.

Geppetto functionality has been built with a strong focus on its API, both server side and client side (with JavaScript), to ensure reproducibility and scripting capabilities. The console-based interactions are ideal for developers and testers, and give scientists easy access to all the existing functionality.

A live demo of Geppetto can be found at https://live.geppetto.org and the documentation can be found at http://docs.geppetto.org. The public development kanban board can be found at https://waffle.io/openworm/org.geppetto. Additional samples are available (Hodgkin-Huxley Cell, Auditory cortex Network, and Fluid dynamics simulation).

Aims: The current visualization of the C. elegans nervous system in WormSim represents its 302 neurons as spheres connected by lines in a “ball and stick” model laid out in the shape of the worm. It currently shows only the connectivity between the neurons, not the dynamics of the simulated neurons. Since Geppetto 0.2.4, however, there is the potential to add several things to improve the neuronal view experience and make it easier to understand what is going on:

  • Incorporate dynamics of neuronal simulations as color changes, incorporating a neuronal simulator into WormSim
  • Visualize more realistic neuronal 3D shapes (i.e. morphologies), reusing the shapes of neurons from http://browser.openworm.org
  • Add Cytoscape.js-based widgets to show animated network connectivity & dynamics graphs
  • Add dynamic plotly.js-based widgets to show animated 3D phase diagrams that collapse the activity of the network to one picture.

When this project is complete, the candidate would have added all of these to WormSim to enable a new release to the OpenWorm audience.
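
The first aim above, showing dynamics as color changes, amounts to mapping each neuron's simulated membrane potential onto a color scale. A minimal sketch follows, written in Python for brevity although the project itself targets Javascript. The function names, the display range of -80 mV to 40 mV, and the blue-to-red scale are illustrative assumptions, not WormSim or Geppetto settings.

```python
def voltage_to_rgb(v_mv, v_min=-80.0, v_max=40.0):
    """Map a membrane potential in mV to an (r, g, b) tuple, blue (low) to red (high).

    v_min and v_max are assumed display bounds, not values taken from WormSim.
    """
    # Normalize to [0, 1], clamping values that fall outside the display range.
    t = (v_mv - v_min) / (v_max - v_min)
    t = max(0.0, min(1.0, t))
    # Linear blend between blue (0, 0, 255) and red (255, 0, 0).
    return (round(255 * t), 0, round(255 * (1 - t)))


def to_hex(rgb):
    """Format an (r, g, b) tuple as a CSS-style hex color string."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)
```

Calling `to_hex(voltage_to_rgb(v))` once per neuron at each time step would yield a color that a client-side renderer could apply to the corresponding sphere or morphology.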

Skills:
Essential: Javascript, HTML5, CSS, Open source development
Desired: Backbone, WebGL, Java, UI/UX, Computational neuroscience training

Mentors: Matteo Cantarelli (matteo@openworm.org), Giovanni Idili (giovanni@openworm.org), Stephen Larson (stephen@openworm.org)


12.2 Model completion dashboard for OpenWorm computational neuroscience

Description: The OpenWorm project is building a simulation of the nematode C. elegans in an open science fashion. The model completion dashboard is a web-based visualization of the digital versions of biological entities that are currently captured within OpenWorm’s database API, PyOpenWorm. This interface is designed to display the results of the unifying modeling activity, and should be coordinated with the crowdsourcing platform for C. elegans ion channels known as ChannelWorm. This interface allows a user to drill down into our model and view the state of completion of modeled components at each level. At the highest level, matrices use a color indicator to display the level of completion of each cell in the model. Rolling over the data displayed at each level gives information about the references for that particular piece of data.

Aims: In 2015, a good start was made towards implementing this interface. This year we want to complete the interface and put it into production, to give the OpenWorm community greater transparency about what modeling has been done and what modeling is left to accomplish. We also want to integrate this with the scientific roadmap, which will also involve line graphs.

Skills:
Essential: Python, Javascript, HTML5, CSS, Open source development
Desired: Good communication skills, database experience, UI/UX

Mentor: Stephen Larson (stephen@openworm.org)


12.3 Physics-based Modeling of the Mosaic Embryo in CompuCell3D

Description: The DevoWorm project is building a physics-based simulation of mosaic embryogenesis, with an emphasis on the nematode Caenorhabditis elegans. This initiative will focus on incorporating secondary data from nematodes and other species (e.g. sea squirts) with existing computational frameworks. The outcome will be to create physics-based models describing the process of cellular differentiation along with trees and networks that describe the relationships between cells. Our investigations of embryogenesis have an emphasis on modeling the precursor of neuronal cells and nervous systems.

Aims: The current plan is to utilize existing software platforms, including CompuCell3D, GraphViz, and Gephi. Experience with C++ is essential for building representations of the function and physical interactions between cells via CompuCell3D, while experience with Java and Python would be helpful in working with tree and graph representations (e.g. GraphViz and Gephi). The goal would be to work on the CompuCell3D model first, and then move toward integrating these models with tree and network visualizations. Therefore, the project has a longer-term arc, and the applicant would be encouraged to contribute beyond the formal "Summer of Code". Our commitment to various software and languages beyond the CompuCell3D models is open-ended, and is contingent on a "whatever works, works" philosophy. If successfully developed, these models would fill a critical gap in the DevoWorm project, namely the ability to simulate the physical constraints and intercellular signaling potential within whole embryos among systems that exhibit deterministic cellular differentiation.
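
As an illustration of the tree representations mentioned above, the following Python sketch stores the first divisions of the C. elegans embryo (P0 into AB and P1, and so on) as a dictionary and writes them out in the DOT text format that GraphViz reads. The dictionary layout and the helper names `lineage_to_dot` and `leaves` are hypothetical and not part of any DevoWorm code.

```python
# Early C. elegans lineage: each dividing cell maps to its two daughters.
LINEAGE = {
    "P0": ["AB", "P1"],
    "AB": ["ABa", "ABp"],
    "P1": ["EMS", "P2"],
}


def lineage_to_dot(tree, name="lineage"):
    """Return a GraphViz DOT description with one edge per mother-daughter pair."""
    lines = ["digraph {} {{".format(name)]
    for mother in tree:
        for daughter in tree[mother]:
            lines.append('    "{}" -> "{}";'.format(mother, daughter))
    lines.append("}")
    return "\n".join(lines)


def leaves(tree, root):
    """Return the cells below root that have not divided, in left-to-right order."""
    if root not in tree:
        return [root]
    result = []
    for daughter in tree[root]:
        result.extend(leaves(tree, daughter))
    return result
```

The string returned by `lineage_to_dot(LINEAGE)` can be saved to a file and rendered with the GraphViz `dot` tool, while `leaves(LINEAGE, "P0")` lists the cells present at the four-cell stage.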

Skills: open source development experience with C++, Python, Java. An interest in the underlying biological processes is essential, and applicants with strong abstract thinking abilities are preferred. Good communication skills and familiarity with open science practices are expected.

Mentor: Bradly Alicea (balicea@openworm.org)


12.4 Image processing with ImageJ (segmentation of high-resolution images)

Description: The DevoWorm project is looking to build segmentation and digitization tools to extract information from microscopy imaging data. The project will focus on finding a way to extract information about cellular structures and a cell's relative position within the embryo. The outcome would be open-source code for ImageJ plugins (Java) and applications run in Python. These tools will have particular relevance for identifying and parameterizing the precursors of neuronal cells within early-stage embryos.

Aims: The current plan is to develop plug-ins for ImageJ (Java) or scripts and other code to be run in a Python environment. This will require some research on the student's part. While documentation exists for building applications in these two domains, our project does not currently have a library of existing software in this area. Our longer-term goal is to have a set of ready-made tools for extracting numeric data from high-resolution microscopy images and high-quality publication images. The student will work with the mentor to formulate the details of feature extraction and feature engineering, and then implement and benchmark these solutions on a number of secondary datasets.
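
To make "extracting numeric data" concrete, here is a minimal Python sketch of one possible first step: threshold a grayscale image and label its 4-connected bright regions, reporting the area and centroid of each. It uses only the standard library and a toy nested-list image. The function name `label_regions` and the fixed threshold are assumptions for illustration, and the sketch does not use the ImageJ API.

```python
from collections import deque


def label_regions(image, threshold):
    """Return [(area, (row_centroid, col_centroid)), ...] for each bright region.

    image is a list of rows of pixel intensities. Pixels with a value
    >= threshold count as foreground, and regions are 4-connected.
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] < threshold or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and not seen[ny][nx] and image[ny][nx] >= threshold:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            area = len(pixels)
            centroid = (
                sum(p[0] for p in pixels) / area,
                sum(p[1] for p in pixels) / area,
            )
            regions.append((area, centroid))
    return regions
```

Real microscopy images would need more than a global threshold (for example, denoising and separation of touching cells), which is where the feature engineering described above comes in.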

Skills: open source development experience with Java and Python, along with coursework in machine learning/pattern recognition/computer vision. An interest in the underlying biological processes is essential. Good communication skills and familiarity with open science practices are expected.

Mentor: Bradly Alicea (balicea@openworm.org)


13. GPU implementations for MOOSE

Neuronal and multiscale simulations are computationally demanding, yet they require large numbers of very similar calculations. One of the core problems in this domain is to rapidly perform detailed single-neuron calculations. These are typically the bottleneck in detailed multiscale and network models.

The central computation is the solution of a large, almost tridiagonal matrix representing compartments in a neuron. Individual entries in this matrix require an inner loop to compute current contributions to the compartment. It is a particularly interesting problem to optimize GPU computations for these neuron calculations, since there is a tradeoff between memory transfers and speed of individual GPU cores.
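
For orientation, the simplest version of this computation is the solution of a strictly tridiagonal system, as would arise for an unbranched chain of compartments. The Python sketch below implements the standard Thomas algorithm for that case. It is not MOOSE code: the function name is hypothetical, it assumes a diagonally dominant matrix so that no pivoting is needed, and it does not handle the off-band entries that branching introduces in the "almost tridiagonal" case.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve a tridiagonal linear system with the Thomas algorithm.

    lower[i] is the entry to the left of diag[i] (lower[0] is unused) and
    upper[i] is the entry to the right of diag[i] (upper[-1] is unused).
    Assumes diagonal dominance, so no pivoting is performed.
    """
    n = len(diag)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    # Forward sweep: eliminate the sub-diagonal row by row.
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    # Back substitution: recover the unknowns from the last row upward.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x
```

Both sweeps are sequential along the chain, so each step depends on the previous one. This dependency is one reason a straightforward port to a GPU is not enough, in addition to the tradeoffs described above.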

The MOOSE project has had an effort on GPU implementations since 2014, and this has progressed well both through GSoC efforts and through in-house work. There is a GitHub repository and documentation for the effort. The current stage is very interesting for a GSoC project as it involves fine-tuning and optimizing extant GPU code, and smooth integration of this into the main MOOSE code-base.

Skills: The project requires good C++ knowledge. Experience with MPI and/or a GPU environment such as OpenCL or CUDA is recommended.

Mentors: Upi Bhalla, Dilawar Singh (NCBS, India)