12 research outputs found
Neuron Segmentation Using Deep Complete Bipartite Networks
In this paper, we consider the problem of automatically segmenting neuronal
cells in dual-color confocal microscopy images. This problem is a key task in
various quantitative analysis applications in neuroscience, such as tracing
cell genesis in Danio rerio (zebrafish) brains. Deep learning, especially using
fully convolutional networks (FCN), has profoundly changed segmentation
research in biomedical imaging. We face two major challenges in this problem.
First, neuronal cells may form dense clusters, making it difficult (even for
human experts) to correctly identify all individual cells. Consequently, the
segmentation results of known FCN-type models are not accurate enough.
Second, pixel-wise ground truth is difficult to obtain. Only a limited amount
of approximate instance-wise annotation can be collected, which makes the
training of FCN models quite cumbersome. We propose a new FCN-type deep
learning model, called deep complete bipartite networks (CB-Net), and a new
scheme for leveraging approximate instance-wise annotation to train our
pixel-wise prediction model. Evaluated using seven real datasets, our CB-Net
model outperforms state-of-the-art FCN models and produces neuron segmentation
results of remarkable quality.
Comment: MICCAI 201
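One way to leverage approximate instance-wise annotation, as the abstract describes, is to rasterize it into weak pixel-wise labels for training. A minimal sketch of that idea; the centroid-plus-fixed-radius annotation format and the function name are illustrative assumptions, not the paper's actual scheme:

```python
import numpy as np

def weak_pixel_labels(shape, centers, radius):
    """Rasterize approximate instance annotations (here simplified to cell
    centroids with a fixed radius) into a weak pixel-wise label map.
    Label 0 = background, 1..N = individual cell instances."""
    labels = np.zeros(shape, dtype=np.int32)
    yy, xx = np.mgrid[0:shape[0], 0:shape[1]]
    for idx, (cy, cx) in enumerate(centers, start=1):
        # paint a disk of the given radius around each annotated centroid
        disk = (yy - cy) ** 2 + (xx - cx) ** 2 <= radius ** 2
        labels[disk] = idx
    return labels

labels = weak_pixel_labels((32, 32), [(8, 8), (20, 24)], radius=3)
```

Such a label map can then supervise a pixel-wise prediction model, accepting that pixels near instance boundaries are only approximately labeled.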
Studying Software Engineering Patterns for Designing Machine Learning Systems
Machine-learning (ML) techniques have become popular in recent years. ML
techniques rely on mathematics and on software engineering. Researchers and
practitioners have been studying best practices for designing ML application
systems and software to address the software complexity and quality of ML
techniques. Such design practices are often formalized as architecture
patterns and design patterns, encapsulating reusable solutions to commonly
occurring problems within given contexts. However, to the best of our
knowledge, no work has systematically collected, classified, and discussed
these software-engineering (SE) design patterns for ML techniques. Thus, we
set out to collect good/bad SE design patterns for ML techniques to provide
developers with a comprehensive and ordered classification of such patterns.
We report here preliminary results of a systematic literature review (SLR) of
good/bad design patterns for ML
Automatic Fault Detection for Deep Learning Programs Using Graph Transformations
Nowadays, we are witnessing an increasing demand in both corporates and
academia for exploiting Deep Learning (DL) to solve complex real-world
problems. A DL program encodes the network structure of a desirable DL model
and the process by which the model learns from the training dataset. Like any
software, a DL program can be faulty, which implies substantial challenges of
software quality assurance, especially in safety-critical domains. It is
therefore crucial to equip DL development teams with efficient fault detection
techniques and tools. In this paper, we propose NeuraLint, a model-based fault
detection approach for DL programs, using meta-modelling and graph
transformations. First, we design a meta-model for DL programs that includes
their base skeleton and fundamental properties. Then, we construct a
graph-based verification process that covers 23 rules defined on top of the
meta-model and implemented as graph transformations to detect faults and design
inefficiencies in the generated models (i.e., instances of the meta-model).
The proposed approach is first evaluated by finding faults and design
inefficiencies in 28 synthesized examples built from common problems reported
in the literature. NeuraLint then successfully finds 64 faults and design
inefficiencies in 34 real-world DL programs extracted from Stack Overflow posts
and GitHub repositories. The results show that NeuraLint effectively detects
faults and design issues in both synthesized and real-world examples, with a
recall of 70.5% and a precision of 100%. Although the proposed meta-model is
designed for feedforward neural networks, it can be extended to support other
neural network architectures such as recurrent neural networks. Researchers can
also expand our set of verification rules to cover more types of issues in DL
programs
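The abstract describes verification rules defined over a meta-model of DL programs. A toy sketch of one such rule, checking a layer-graph representation for a common classifier fault; the layer dictionary format, the rule itself, and the message text are hypothetical illustrations, not NeuraLint's actual meta-model or rule set:

```python
# A DL program modeled as an ordered layer graph (instance of a meta-model).
layers = [
    {"type": "Conv2D", "activation": "relu"},
    {"type": "Flatten", "activation": None},
    {"type": "Dense", "activation": "relu"},   # fault: final layer lacks softmax
]

def check_final_activation(graph):
    """Rule: the final Dense layer of a classifier should use a softmax
    activation; otherwise report a fault."""
    last = graph[-1]
    if last["type"] == "Dense" and last["activation"] != "softmax":
        return ["Missing softmax on final Dense layer"]
    return []

issues = check_final_activation(layers)
```

In the actual approach such rules are expressed as graph transformations over the meta-model rather than ad-hoc checks, which makes the rule set extensible.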
Application performance evaluation using Deep Learning
Developing software for exascale systems will be even
more challenging than for today’s systems. Methods for evaluating
the performance of applications and identifying potential
weaknesses are essential for reaching optimal performance.
However, the tools available today are not widely used
and generally require expert knowledge.
In recent years, deep learning techniques have enjoyed
great success in various fields, especially in image
recognition. However, deep learning has yet to find its way
into the area of application performance evaluation.
This work will take the first step towards introducing deep
learning to the area of HPC performance evaluation, opening
the door for others. Convolutional neural networks will be
fed images of timeline views of HPC applications and will
identify the intrinsic behavior of the application and return
some principal performance metrics.
The results show that deep learning techniques can indeed
be utilized for evaluating the performance of parallel applications,
with the main limitation being the size of the available
data sets. Furthermore, a number of exciting directions
for taking the next step in applying deep learning
techniques to performance evaluation are suggested
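Feeding a CNN "images of timeline views" presupposes rasterizing trace data into a fixed-size image. A minimal sketch of that preprocessing step; the event-tuple format, function name, and parameters are illustrative assumptions, not the work's actual pipeline:

```python
import numpy as np

def timeline_to_image(events, n_ranks, n_bins, t_max):
    """Rasterize a trace of (rank, start, end, state) events into a 2-D
    timeline-view image: one row per process rank, one column per time bin.
    State ids become pixel intensities, mimicking what a trace viewer draws."""
    img = np.zeros((n_ranks, n_bins), dtype=np.float32)
    for rank, start, end, state in events:
        b0 = int(start / t_max * n_bins)
        b1 = max(b0 + 1, int(end / t_max * n_bins))  # at least one bin wide
        img[rank, b0:b1] = state
    return img

# Two ranks: rank 0 computes for the first half, rank 1 communicates later.
events = [(0, 0.0, 0.5, 1), (1, 0.25, 1.0, 2)]
img = timeline_to_image(events, n_ranks=2, n_bins=8, t_max=1.0)
```

Images produced this way have a uniform shape regardless of trace length, which is what makes a convolutional network applicable.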
The Development of Regional Forest Inventories Through Novel Means
For two decades Light Detection and Ranging (LiDAR) data has been used to develop spatially-explicit forest inventories. Data derived from LiDAR depict three-dimensional forest canopy structure and are useful for predicting forest attributes such as biomass, stem density, and species. Such enhanced forest inventories (EFIs) are useful for carbon accounting, forest management, and wildlife habitat characterization by allowing practitioners to target specific areas without extensive field work. Here in New England, LiDAR data covers nearly the entire geographical extent of the region. However, until now the region’s forest attributes have not been mapped. Developing regional inventories has traditionally been problematic because most regions – including New England – are comprised of a patchwork of datasets acquired with various specifications. These variations in specifications prohibit developing a single set of predictive models for a region. The purpose of this work is to develop a new set of modeling techniques, allowing for EFIs consisting of disparate LiDAR datasets. The work presented in the first chapter improves upon existing LiDAR modeling techniques by developing a new set of metrics for quantifying LiDAR based on ecological principles. These fall into five categories: canopy height, canopy complexity, individual tree attributes, crowding, and abiotic. These metrics were compared to those traditionally used, and results indicated that they are a more effective means of modeling forest attributes across multiple LiDAR datasets. In the following chapters, artificial intelligence (AI) algorithms were developed to interpret LiDAR data and make forest predictions. After settling on the optimal algorithm, we incorporated satellite spectral, disturbance, and climate data. Our results indicated that this approach dramatically outperformed the traditional modeling techniques.
We then applied the AI model to the region’s LiDAR, developing 10 m resolution wall-to-wall forest inventory maps of fourteen forest attributes. We assessed error using U.S. federal inventory data, and determined that our EFIs did not differ significantly in 33, 25, and 30 of 38 counties when predicting biomass, percent conifer, and stem density, respectively. We were ultimately able to develop the region’s most complete and detailed forest inventories. This will allow practitioners to assess forest characteristics without the cost and effort associated with extensive field inventories
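The canopy-height category of metrics mentioned above typically reduces the height values of a plot's LiDAR returns to summary statistics used as model predictors. A small sketch of such generic height statistics; the specific metrics and the 2 m canopy-cover threshold are common conventions assumed here, not the thesis's exact metric set:

```python
import numpy as np

def canopy_height_metrics(z):
    """Compute illustrative canopy-height metrics from the height values
    (metres above ground) of LiDAR returns within one plot."""
    z = np.asarray(z, dtype=float)
    return {
        "max_height": float(z.max()),
        "mean_height": float(z.mean()),
        "p95": float(np.percentile(z, 95)),           # 95th height percentile
        "canopy_cover": float((z > 2.0).mean()),      # fraction of returns > 2 m
    }

m = canopy_height_metrics([0.1, 0.5, 3.0, 12.0, 18.5, 22.0])
```

Metrics like these feed the predictive models that map plot-level field measurements (biomass, stem density) onto wall-to-wall LiDAR coverage.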
Testing Feedforward Neural Networks Training Programs
Nowadays, we are witnessing an increasing effort to improve the performance
and trustworthiness of Deep Neural Networks (DNNs), with the aim to enable
their adoption in safety-critical systems such as self-driving cars. Multiple
testing techniques are proposed to generate test cases that can expose
inconsistencies in the behavior of DNN models. These techniques assume
implicitly that the training program is bug-free and appropriately configured.
However, satisfying this assumption for a novel problem requires significant
engineering work to prepare the data, design the DNN, implement the training
program, and tune the hyperparameters in order to produce the model for which
current automated test data generators search for corner-case behaviors. All
these model training steps can be error-prone. Therefore, it is crucial to
detect and correct errors throughout all the engineering steps of DNN-based
software systems and not only on the resulting DNN model. In this paper, we
gather a catalog of training issues and, based on their symptoms and their
effects on the behavior of the training program, we propose practical
verification routines to detect these issues automatically, by
continuously validating that important properties of the learning dynamics
hold during training. Then, we design TheDeepChecker, an end-to-end
property-based debugging approach for DNN training programs. We assess the
effectiveness of TheDeepChecker on synthetic and real-world buggy DL programs
and compare it with Amazon SageMaker Debugger (SMD). Results show that
TheDeepChecker's on-execution validation of DNN-based programs' properties
succeeds in revealing several coding bugs and system misconfigurations, early
on and at a low cost. Moreover, TheDeepChecker outperforms SMD's offline
rule verification on training logs in terms of detection accuracy and DL bug
coverage
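Property-based checks on learning dynamics, as the abstract describes, can be as simple as asserting invariants over the values a training loop produces. A toy sketch of two such checks; the properties, messages, and function name are hypothetical illustrations, not TheDeepChecker's actual rule set:

```python
import math

def check_training_properties(loss_history, params):
    """Validate two illustrative properties of the learning dynamics:
    (1) parameters stay finite (no exploding/NaN gradients), and
    (2) the average loss in the second half of training is lower than
    in the first half (learning is actually happening)."""
    issues = []
    if any(not math.isfinite(p) for p in params):
        issues.append("non-finite parameter (possible exploding gradient)")
    half = len(loss_history) // 2
    early = sum(loss_history[:half]) / half
    late = sum(loss_history[half:]) / (len(loss_history) - half)
    if late >= early:
        issues.append("loss is not decreasing (possible bug or bad learning rate)")
    return issues

# A healthy run: decreasing loss, finite parameters.
issues = check_training_properties([2.0, 1.5, 1.2, 1.0], params=[0.3, -1.2])
```

Running such checks continuously during training, rather than inspecting logs offline, is what lets bugs be caught early and at low cost.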