
    On coding labeled trees

    Trees are probably the most studied class of graphs in Computer Science. In this thesis we study bijective codes that represent labeled trees by means of strings of node labels. We contribute to the understanding of their algorithmic tractability, their properties, and their applications. The thesis is divided into two parts. In the first part we focus on two types of tree codes, namely Prüfer-like codes and Transformation codes. We study optimal encoding and decoding algorithms, both in a sequential and in a parallel setting. We propose a unified approach that works for all Prüfer-like codes and a more generic scheme based on the transformation of a tree into a functional digraph suitable for all bijective codes. Our results in this area close a variety of open problems. We also consider possible applications of tree encodings, discussing how to exploit these codes in Genetic Algorithms and in the generation of random trees. Moreover, we introduce a modified version of a known code that, in Genetic Algorithms, outperforms all the other known codes. In the second part of the thesis we focus on two possible generalizations of our work. We first take into account the classes of k-trees and k-arch graphs (both superclasses of trees): we study bijective codes for these classes of graphs and their algorithmic feasibility. Then, we shift our attention to Informative Labeling Schemes. In this context labels are no longer considered as simple unique node identifiers; rather, they convey information useful to achieve efficient computations on the tree. We exploit this idea to design a concurrent data structure for the lowest common ancestor problem on dynamic trees. We also present an experimental comparison between our labeling scheme and the one proposed by Peleg for static trees.
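
To make the idea of a bijective tree code concrete, here is a minimal Python sketch of the classical Prüfer code, which maps a labeled tree on n nodes to a string of n - 2 node labels and back. It assumes nodes are labeled 0..n-1 and n >= 2, and it uses a heap, so it runs in O(n log n); it is an illustration only and not one of the optimal algorithms studied in the thesis. The function names are hypothetical.

```python
import heapq


def prufer_encode(n, edges):
    """Return the Prüfer sequence of a labeled tree on nodes 0..n-1 (n >= 2)."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    # Min-heap of current leaves, so the smallest-labeled leaf is removed first.
    leaves = [v for v in range(n) if len(adj[v]) == 1]
    heapq.heapify(leaves)
    code = []
    for _ in range(n - 2):
        leaf = heapq.heappop(leaves)
        parent = adj[leaf].pop()      # a leaf has exactly one neighbour
        adj[parent].remove(leaf)
        code.append(parent)
        if len(adj[parent]) == 1:     # the neighbour has become a leaf
            heapq.heappush(leaves, parent)
    return code


def prufer_decode(code):
    """Rebuild the edge list of the tree encoded by a Prüfer sequence."""
    n = len(code) + 2
    # Each node appears in the code (its degree - 1) times.
    degree = [1] * n
    for v in code:
        degree[v] += 1
    leaves = [v for v in range(n) if degree[v] == 1]
    heapq.heapify(leaves)
    edges = []
    for v in code:
        leaf = heapq.heappop(leaves)
        edges.append((leaf, v))
        degree[v] -= 1
        if degree[v] == 1:
            heapq.heappush(leaves, v)
    # Exactly two leaves remain; they form the last edge.
    edges.append((heapq.heappop(leaves), heapq.heappop(leaves)))
    return edges
```

Decoding the output of `prufer_encode` returns the original edge set, which is the bijection property that makes such codes usable as chromosomes in Genetic Algorithms or for generating random trees from random strings.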

    Advances in integrating autonomy with acoustic communications for intelligent networks of marine robots

    Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy at the Massachusetts Institute of Technology and the Woods Hole Oceanographic Institution, February 2013. Autonomous marine vehicles are increasingly used in clusters for an array of oceanographic tasks. The effectiveness of this collaboration is often limited by communications: throughput, latency, and ease of reconfiguration. This thesis argues that improved communication on intelligent marine robotic agents can be gained from acting on knowledge gained by improved awareness of the physical acoustic link and higher network layers by the AUV’s decision-making software. This thesis presents a modular acoustic networking framework, realized through a C++ library called goby-acomms, to provide collaborating underwater vehicles with an efficient short-range single-hop network. goby-acomms is comprised of four components that provide: 1) losslessly compressed encoding of short messages; 2) a set of message queues that dynamically prioritize messages based both on overall importance and time sensitivity; 3) Time Division Multiple Access (TDMA) Medium Access Control (MAC) with automatic discovery; and 4) an abstract acoustic modem driver. Building on this networking framework, two approaches that use the vehicle’s “intelligence” to improve communications are presented. The first is a “non-disruptive” approach which is a novel technique for using state observers in conjunction with an entropy source encoder to enable highly compressed telemetry of autonomous underwater vehicle (AUV) position vectors. This system was analyzed on experimental data and implemented on a fielded vehicle. Using an adaptive probability distribution in combination with either of two state observer models, greater than 90% compression, relative to a 32-bit integer baseline, was achieved.
The second approach is “disruptive,” as it changes the vehicle’s course to effect an improvement in the communications channel. A hybrid data- and model-based autonomous environmental adaptation framework is presented which allows autonomous underwater vehicles (AUVs) with acoustic sensors to follow a path which optimizes their ability to maintain connectivity with an acoustic contact for optimal sensing or communication. I wish to acknowledge the sponsors of this research for their generous support of my tuition, stipend, and research: the WHOI/MIT Joint Program, the MIT Presidential Fellowship, the Office of Naval Research (ONR) # N00014-08-1-0011, # N00014-08-1-0013, and the ONR PlusNet Program Graduate Fellowship, the Defense Advanced Research Projects Agency (DARPA) (Deep Sea Operations: Applied Physical Sciences (APS) Award # APS 11-15 3352-006, APS 11-15-3352-215 ST 2.6 and 2.7
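
The abstract mentions message queues that prioritize dynamically by overall importance and time sensitivity. The Python sketch below shows one way such a rule could work; it is not the goby-acomms API (which is a C++ library), and the class name, the fields `base_value` and `ttl`, and the linear priority formula are all assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class MessageQueue:
    name: str
    base_value: float          # overall importance (hypothetical parameter)
    ttl: float                 # seconds; a smaller value means more time-sensitive
    last_sent: float = 0.0     # time at which this queue last sent a message
    messages: deque = field(default_factory=deque)

    def priority(self, now):
        """Assumed rule: priority grows linearly with time since the last send."""
        if not self.messages:
            return float("-inf")   # empty queues never win
        return self.base_value * (now - self.last_sent) / self.ttl


def next_message(queues, now):
    """Pop one message from the queue with the highest current priority."""
    if not queues:
        return None
    best = max(queues, key=lambda q: q.priority(now))
    if not best.messages:
        return None
    best.last_sent = now
    return best.name, best.messages.popleft()
```

With a rule of this shape, an important queue wins most slots, but a low-importance queue that has waited long enough eventually overtakes it, so no message type starves on a low-throughput acoustic link.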

    A biased random-key genetic algorithm for the capacitated minimum spanning tree problem

    This paper focuses on the capacitated minimum spanning tree (CMST) problem. Given a central processor and a set of remote terminals with specified demands for traffic that must flow between the central processor and terminals, the goal is to design a minimum-cost network to carry this demand. Potential links exist between any pair of terminals and between the central processor and the terminals. Each potential link can be included in the design at a given cost. The CMST problem is to design a minimum-cost network connecting the terminals with the central processor so that the flow on any arc of the network is at most Q. A biased random-key genetic algorithm (BRKGA) is a metaheuristic for combinatorial optimization which evolves a population of random vectors that encode solutions to the combinatorial optimization problem. This paper explores several solution encodings as well as different strategies for some steps of the algorithm and finally proposes a BRKGA heuristic for the CMST problem. Computational experiments are presented showing the effectiveness of the approach: seven new best-known solutions are presented for the set of benchmark instances used in the experiments. Peer Reviewed. Postprint (author’s final draft)
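
Two generic building blocks of a BRKGA can be sketched in a few lines of Python: decoding a vector of random keys into a permutation, and the biased uniform crossover that favours the elite parent. This is a minimal illustration of the general technique, not the CMST-specific encodings or decoders explored in the paper; the function names and the default `rho=0.7` are assumptions.

```python
import random


def decode_permutation(keys):
    """Decode random keys in [0, 1) into a permutation by sorting indices by key."""
    return sorted(range(len(keys)), key=lambda i: keys[i])


def biased_crossover(elite, other, rho=0.7, rng=random):
    """Take each key from the elite parent with probability rho, else from the other."""
    return [e if rng.random() < rho else o for e, o in zip(elite, other)]
```

For example, `decode_permutation([0.7, 0.1, 0.4])` yields the order `[1, 2, 0]`, which a problem-specific decoder could then read as the order in which terminals are considered when building a capacity-feasible tree.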

    Scalable software and models for large-scale extracellular recordings

    The brain represents information about the world through the electrical activity of populations of neurons. By placing an electrode near a neuron that is firing (spiking), it is possible to detect the resulting extracellular action potential (EAP) that is transmitted down an axon to other neurons. In this way, it is possible to monitor the communication of a group of neurons to uncover how they encode and transmit information. As the number of recorded neurons continues to increase, however, so do the data processing and analysis challenges. It is crucial that scalable software and analysis tools are developed and made available to the neuroscience community to keep up with the large amounts of data that are already being gathered. This thesis is composed of three pieces of work which I develop in order to better process and analyze large-scale extracellular recordings. My work spans all stages of extracellular analysis from the processing of raw electrical recordings to the development of statistical models to reveal underlying structure in neural population activity. In the first work, I focus on developing software to improve the comparison and adoption of different computational approaches for spike sorting. When analyzing neural recordings, most researchers are interested in the spiking activity of individual neurons, which must be extracted from the raw electrical traces through a process called spike sorting. Much development has been directed towards improving the performance and automation of spike sorting. This continuous development, while essential, has contributed to an over-saturation of new, incompatible tools that hinders rigorous benchmarking and complicates reproducible analysis. To address these limitations, I develop SpikeInterface, an open-source, Python framework designed to unify preexisting spike sorting technologies into a single toolkit and to facilitate straightforward benchmarking of different approaches. 
With this framework, I demonstrate that modern, automated spike sorters have low agreement when analyzing the same dataset, i.e. they find different numbers of neurons with different activity profiles. This result holds true for a variety of simulated and real datasets. Also, I demonstrate that utilizing a consensus-based approach to spike sorting, where the outputs of multiple spike sorters are combined, can dramatically reduce the number of falsely detected neurons. In the second work, I focus on developing an unsupervised machine learning approach for determining the source location of individually detected spikes that are recorded by high-density, microelectrode arrays. By localizing the source of individual spikes, my method is able to determine the approximate position of the recorded neurons in relation to the microelectrode array. To allow my model to work with large-scale datasets, I utilize deep neural networks, a family of machine learning algorithms that can be trained to approximate complicated functions in a scalable fashion. I evaluate my method on both simulated and real extracellular datasets, demonstrating that it is more accurate than other commonly used methods. Also, I show that location estimates for individual spikes can be utilized to improve the efficiency and accuracy of spike sorting. After training, my method allows for localization of one million spikes in approximately 37 seconds on a TITAN X GPU, enabling real-time analysis of massive extracellular datasets. In my third and final presented work, I focus on developing an unsupervised machine learning model that can uncover patterns of activity from neural populations associated with a behaviour being performed. Specifically, I introduce Targeted Neural Dynamical Modelling (TNDM), a statistical model that jointly models the neural activity and any external behavioural variables. TNDM decomposes neural dynamics (i.e. 
temporal activity patterns) into behaviourally relevant and behaviourally irrelevant dynamics; the behaviourally relevant dynamics constitute all activity patterns required to generate the behaviour of interest while behaviourally irrelevant dynamics may be completely unrelated (e.g. other behavioural or brain states), or even related to behaviour execution (e.g. dynamics that are associated with behaviour generally but are not task specific). Again, I implement TNDM using a deep neural network to improve its scalability and expressivity. On synthetic data and on real recordings from the premotor (PMd) and primary motor cortex (M1) of a monkey performing a center-out reaching task, I show that TNDM is able to extract low-dimensional neural dynamics that are highly predictive of behaviour without sacrificing its fit to the neural data
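
The notion of agreement between spike sorters can be illustrated with a small Python sketch that compares two sorted spike trains. It assumes spike times are sorted integers (sample indices), that two spikes match when they fall within a tolerance `tol`, and that agreement is the number of matches divided by the size of the union; this is an illustrative definition with hypothetical function names, not the SpikeInterface API.

```python
def count_matches(times_a, times_b, tol):
    """Greedily count spikes in two sorted trains that coincide within tol samples."""
    i = j = matches = 0
    while i < len(times_a) and j < len(times_b):
        diff = times_a[i] - times_b[j]
        if abs(diff) <= tol:
            matches += 1
            i += 1
            j += 1
        elif diff < 0:
            i += 1      # spike in A has no partner; move on
        else:
            j += 1      # spike in B has no partner; move on
    return matches


def agreement(times_a, times_b, tol):
    """Matches divided by the total number of distinct spikes in both trains."""
    matches = count_matches(times_a, times_b, tol)
    union = len(times_a) + len(times_b) - matches
    return matches / union if union else 0.0
```

A consensus step could then keep only those units whose best agreement with a unit from another sorter exceeds a chosen threshold, which is one way to discard units found by a single sorter only.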

    Converging models for transcriptome studies of human diseases : the case of oculopharyngeal muscular dystrophy

    This dissertation mainly focuses on interdisciplinary approaches for biomedical knowledge discovery. This required special efforts in developing systematic strategies to integrate various data sources and techniques, leading to improved discovery of mechanistic insights on human diseases. Chapter one examines whether combining various bioinformatics-based strategies can significantly improve the characterization of the OPMD mouse model. We discuss how this approach to knowledge discovery, on the basis of our extensive analysis, helped shed light on how this model system relates to OPMD pathophysiology in humans. In Chapter two, we expand on this combinatory approach by conducting a cross-species data analysis. In this study, we have looked for common patterns that emerge by assessing the transcriptome data from three OPMD model systems and patients. This strategy led to unravelling the most prominent molecular pathway involved in OPMD pathology. The third chapter pursues a similar goal: identifying molecular and pathophysiological features shared between OPMD and the common process of skeletal muscle ageing. A study focused on the universality of biological processes, in the light of evolutionary mechanisms and common functional features, led to novel discoveries. This work helped us uncover remarkable insights on molecular mechanisms of ageing muscles and protein aggregation. Chapters four and five take a different route by tackling the field of computational biology. These chapters aim to extend network inference by providing novel strategies for the exploitation and integration of multiple data sources. We show that these developments allow more robust regulatory mechanisms to be identified while translations and predictions are made across very different datasets, platforms, and organisms. 
Finally, the dissertation is concluded by providing an outlook on ways the field of systems biology can evolve in order to offer enhanced, diversified and robust strategies for knowledge discovery.

    Towards Better Image Embeddings Using Neural Networks

    The primary focus of this dissertation is to study image embeddings extracted by neural networks. Deep Learning (DL) is preferred over traditional Machine Learning (ML) for the reason that feature representations can be automatically constructed from data without human involvement. On account of the effectiveness of deep features, the last decade has witnessed unprecedented advances in Computer Vision (CV), and more real-world applications are expected to be introduced in the coming years. A diverse collection of studies has been included, covering areas such as person re-identification, vehicle attribute recognition, neural image compression, clustering and unsupervised anomaly detection. More specifically, three aspects of feature representations have been thoroughly analyzed. Firstly, features should be distinctive, i.e., features of samples from distinct categories ought to differ significantly. Extracting distinctive features is essential for image retrieval systems, in which an algorithm finds the gallery sample that is closest to a query sample. Secondly, features should be privacy-preserving, i.e., inferring sensitive information from features must be infeasible. With the widespread adoption of Machine Learning as a Service (MLaaS), utilizing privacy-preserving features prevents privacy violations even if the server has been compromised. Thirdly, features should be compressible, i.e., compact features are preferable as they require less storage space. Obtaining compressible features plays a vital role in data compression. Towards the goal of deriving distinctive, privacy-preserving and compressible feature representations, research articles included in this dissertation reveal different approaches to improving image embeddings learned by neural networks. This topic remains a fundamental challenge in Machine Learning, and further research is needed to gain a deeper understanding.
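
The retrieval step described above, finding the gallery sample closest to a query, can be sketched in plain Python once embeddings are available. The sketch assumes embeddings are non-zero lists of floats and that "closest" means highest cosine similarity; the abstract does not specify the distance measure, and the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve(query, gallery):
    """Return the id of the gallery embedding most similar to the query.

    gallery maps a sample id to its embedding vector.
    """
    return max(gallery, key=lambda item: cosine_similarity(query, gallery[item]))
```

Distinctive features matter here because retrieval only works when embeddings of the same identity or category lie closer to each other than to embeddings of other categories.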

    ON NEURAL ARCHITECTURES FOR SEGMENTATION IN NATURAL AND MEDICAL IMAGES

    Segmentation is an important research field in computer vision. It requires recognizing and segmenting the objects at the pixel level. In the past decade, many deep neural networks have been proposed, which have been central to the development in this area. These frameworks have demonstrated human-level performance or beyond on many challenging benchmarks, and have been widely used in many real-life applications, including surveillance, autonomous driving, and medical image analysis. However, it is non-trivial to design neural architectures with both efficiency and effectiveness, especially when they need to be tailored to the target tasks and datasets. In this dissertation, I will present our research in this area from the following aspects. (i) To enable automatic neural architecture design on the costly 3D medical image segmentation, we propose an efficient and effective neural architecture search algorithm that tackles the problem in a coarse-to-fine manner. (ii) To further take advantage of the neural architecture search, we propose to search for a channel-level replacement for 3D networks, which leads to strong alternatives to 3D networks. (iii) To perform segmentation with great detail, we design a coarse-to-fine segmentation framework for matting-level segmentation. (iv) To provide stronger features for segmentation, we propose a stronger transformer-based backbone that can work on dense tasks. (v) To better resolve the panoptic segmentation problem in an end-to-end manner, we propose to combine transformers with the traditional clustering algorithm, which leads to a more intuitive segmentation framework with better performance.