3,340 research outputs found

    Hacia el modelado 3d de tumores cerebrales mediante endoneurosonografía y redes neuronales

    Minimally invasive surgeries have become popular because they carry fewer risks than traditional interventions. In neurosurgery, recent trends suggest the combined use of endoscopy and ultrasound, a technique called endoneurosonography (ENS), for real-time 3D virtualization of brain structures. ENS information can be used to generate 3D models of brain tumors during surgery. This paper introduces a methodology for 3D modeling of brain tumors using ENS and unsupervised neural networks. In particular, it studies the use of self-organizing maps (SOM) and neural gas networks (NGN). Compared to other techniques, 3D modeling with neural networks offers advantages: tumor morphology is encoded directly in the synaptic weights of the network, no a priori knowledge is required, and the representation can be developed in two stages, off-line training and on-line adaptation. Experimental tests were performed using virtualized phantom brain tumors. The paper closes with the results of 3D modeling from an ENS database.
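The abstract above says that tumor morphology is encoded directly in the network weights. A minimal sketch of that idea follows, assuming a plain neural gas update and synthetic points sampled from a sphere as a stand-in for ENS surface points. The function names, the number of units, and the decay schedules are illustrative assumptions, not taken from the paper.

```python
import math
import random


def train_neural_gas(points, n_units=8, epochs=20,
                     eps=(0.5, 0.05), lam=(4.0, 0.5), seed=0):
    """Fit neural gas units to 3D points and return the unit weight vectors.

    All hyperparameters are hypothetical defaults chosen for illustration.
    """
    rng = random.Random(seed)
    # Initialise each unit on a randomly chosen data point (copied).
    units = [list(p) for p in rng.sample(points, n_units)]
    total_steps = epochs * len(points)
    step = 0
    for _ in range(epochs):
        order = points[:]
        rng.shuffle(order)
        for x in order:
            t = step / total_steps
            # Exponentially decay the learning rate and neighbourhood range.
            rate = eps[0] * (eps[1] / eps[0]) ** t
            width = lam[0] * (lam[1] / lam[0]) ** t
            # Rank units by distance to the sample; closer units move more.
            ranked = sorted(units, key=lambda w: math.dist(w, x))
            for rank, w in enumerate(ranked):
                h = rate * math.exp(-rank / width)
                for d in range(3):
                    w[d] += h * (x[d] - w[d])
            step += 1
    return units


def sphere_points(n, radius=1.0, seed=1):
    """Sample n points on a sphere surface (synthetic stand-in for a tumor)."""
    rng = random.Random(seed)
    pts = []
    for _ in range(n):
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in v))
        pts.append([radius * c / norm for c in v])
    return pts


if __name__ == "__main__":
    model = train_neural_gas(sphere_points(200))
    for w in model:
        print([round(c, 3) for c in w])
```

After training, the list of weight vectors is itself the shape model: each unit is a 3D position inside the region covered by the data, which is what allows later on-line adaptation by continuing the same update with new samples.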

    ModDrop: adaptive multi-modal gesture recognition

    We present a method for gesture detection and localisation based on multi-scale and multi-modal deep learning. Each visual modality captures spatial information at a particular spatial scale (such as motion of the upper body or a hand), and the whole system operates at three temporal scales. Key to our technique is a training strategy which exploits: i) careful initialization of individual modalities; and ii) gradual fusion involving random dropping of separate channels (dubbed ModDrop) for learning cross-modality correlations while preserving uniqueness of each modality-specific representation. We present experiments on the ChaLearn 2014 Looking at People Challenge gesture recognition track, in which we placed first out of 17 teams. Fusing multiple modalities at several spatial and temporal scales leads to a significant increase in recognition rates, allowing the model to compensate for errors of the individual classifiers as well as noise in the separate channels. Furthermore, the proposed ModDrop training technique makes the classifier robust to missing signals in one or several channels, so that it produces meaningful predictions from any number of available modalities. In addition, we demonstrate the applicability of the proposed fusion scheme to modalities of arbitrary nature through experiments on the same dataset augmented with audio. Comment: 14 pages, 7 figures
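The random dropping of separate channels described above can be sketched as follows. This is a minimal illustration, assuming each modality is a plain feature vector and that a dropped modality is replaced by zeros. The modality names, the drop probability, and the rule that at least one modality is always kept are assumptions for this sketch, not details from the paper.

```python
import random


def moddrop(features, drop_prob=0.2, rng=random, training=True):
    """Randomly zero out whole modalities during training.

    features: dict mapping a modality name to a list of floats.
    drop_prob and the keep-at-least-one rule are hypothetical choices.
    """
    if not training:
        # At inference time every available modality is passed through.
        return {name: list(vec) for name, vec in features.items()}
    names = list(features)
    kept = [name for name in names if rng.random() >= drop_prob]
    if not kept:
        # Assumption: never drop every modality in the same sample.
        kept = [rng.choice(names)]
    return {
        name: list(features[name]) if name in kept
        else [0.0] * len(features[name])
        for name in names
    }


def fuse(features):
    """Concatenate modality vectors in a fixed (sorted) order."""
    fused = []
    for name in sorted(features):
        fused.extend(features[name])
    return fused


if __name__ == "__main__":
    sample = {"video": [1.0, 2.0], "depth": [3.0], "audio": [4.0, 5.0]}
    print(fuse(moddrop(sample, drop_prob=0.5, rng=random.Random(0))))
```

Because the fused vector keeps a fixed layout whether or not a modality was dropped, a downstream classifier trained this way sees the same input shape when a channel is genuinely missing at test time.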

    Fourteenth Biennial Status Report: März 2017 - February 2019


    Deep Learning for Sensor-based Human Activity Recognition: Overview, Challenges and Opportunities

    The vast proliferation of sensor devices and the Internet of Things enables applications of sensor-based activity recognition. However, substantial challenges can degrade the performance of a recognition system in practical scenarios. Recently, as deep learning has demonstrated its effectiveness in many areas, many deep methods have been investigated to address the challenges in activity recognition. In this study, we present a survey of the state-of-the-art deep learning methods for sensor-based human activity recognition. We first introduce the multi-modality of the sensory data and provide information on public datasets that can be used for evaluation in different challenge tasks. We then propose a new taxonomy that structures the deep methods by challenge. Challenges and challenge-related deep methods are summarized and analyzed to form an overview of the current research progress. At the end of this work, we discuss the open issues and provide some insights for future directions.

    Cortical Dynamics of Navigation and Steering in Natural Scenes: Motion-Based Object Segmentation, Heading, and Obstacle Avoidance

    Visually guided navigation through a cluttered natural scene is a challenging problem that animals and humans accomplish with ease. The ViSTARS neural model proposes how primates use motion information to segment objects and determine heading for purposes of goal approach and obstacle avoidance in response to video inputs from real and virtual environments. The model produces trajectories similar to those of human navigators. It does so by predicting how computationally complementary processes in cortical areas MT-/MSTv and MT+/MSTd compute object motion for tracking and self-motion for navigation, respectively. The model retina responds to transients in the input stream. Model V1 generates a local speed and direction estimate. This local motion estimate is ambiguous due to the neural aperture problem. Model MT+ interacts with MSTd via an attentive feedback loop to compute accurate heading estimates in MSTd that quantitatively simulate properties of human heading estimation data. Model MT- interacts with MSTv via an attentive feedback loop to compute accurate estimates of speed, direction and position of moving objects. This object information is combined with heading information to produce steering decisions wherein goals behave like attractors and obstacles behave like repellers. These steering decisions lead to navigational trajectories that closely match human performance. National Science Foundation (SBE-0354378, BCS-0235398); Office of Naval Research (N00014-01-1-0624); National Geospatial Intelligence Agency (NMA201-01-1-2016)
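The closing idea of the abstract, goals acting as attractors and obstacles as repellers, can be illustrated with a generic first-order heading update. This is only a sketch under stated assumptions: the abstract does not give the model's equations, so the functional form, the gains `k_goal` and `k_obs`, and the decay constants below are hypothetical and do not represent the ViSTARS model itself.

```python
import math


def heading_rate(heading, goal_angle, obstacles,
                 k_goal=1.0, k_obs=2.0, c_angle=4.0, c_dist=0.5):
    """Rate of change of heading (radians per unit time).

    obstacles: list of (angle, distance) pairs.
    All gains and decay constants are hypothetical illustration values.
    """
    # Attractor: pulls the heading toward the goal direction.
    rate = -k_goal * (heading - goal_angle)
    for obs_angle, obs_dist in obstacles:
        offset = heading - obs_angle
        # Repeller: pushes the heading away from the obstacle direction,
        # fading with angular offset and with distance.
        rate += (k_obs * offset
                 * math.exp(-c_angle * abs(offset))
                 * math.exp(-c_dist * obs_dist))
    return rate


def simulate(heading, goal_angle, obstacles, dt=0.05, steps=400):
    """Integrate the heading with forward Euler steps."""
    for _ in range(steps):
        heading += dt * heading_rate(heading, goal_angle, obstacles)
    return heading


if __name__ == "__main__":
    print(simulate(0.0, 0.5, []))
    print(heading_rate(0.0, 0.0, [(-0.1, 1.0)]))
```

With no obstacles the heading settles on the goal direction; an obstacle slightly to one side of the current heading produces a turn toward the other side.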

    Multimodal machine learning for intelligent mobility

    Scientific problems are solved by finding the optimal solution for a specific task. Some problems can be solved analytically, while others are solved using data-driven methods. Intelligent mobility, the use of digital technologies to improve the transportation of people and goods, is one of the principal beneficiaries of data-driven solutions. Autonomous vehicles are at the heart of the developments that propel intelligent mobility. Because real-world environments are high-dimensional and complex, it is nearly impossible to program decision-making logic for every eventuality manually, so data-driven solutions need to become commonplace in intelligent mobility. While recent data-driven solutions such as deep learning enable machines to learn effectively from large datasets, applications of these techniques within safety-critical systems such as driverless cars remain scarce. Autonomous vehicles need to be able to make context-driven decisions autonomously in the different environments in which they operate. The recent literature on driverless vehicles focuses heavily on road or highway environments and has largely neglected pedestrianized areas and indoor environments. These unstructured environments tend to have more clutter and change rapidly over time. Therefore, for intelligent mobility to make a significant impact on human life, it is vital to extend its application beyond structured environments. To further advance intelligent mobility, researchers need to take cues from multiple sensor streams and multiple machine learning algorithms so that decisions can be robust and reliable. Only then will machines be able to operate safely in unstructured and dynamic environments. To address these limitations, this thesis investigates data-driven solutions for crucial building blocks of intelligent mobility. 
Specifically, the thesis investigates multimodal sensor data fusion, machine learning, and multimodal deep representation learning, and their application to intelligent mobility. This work demonstrates that mobile robots can use multimodal machine learning to derive driver policy and therefore make autonomous decisions. To facilitate the autonomous decisions necessary to derive safe driving algorithms, we present algorithms for free space detection and human activity recognition. Driving these decision-making algorithms are specific datasets collected throughout this study. They include the Loughborough London Autonomous Vehicle dataset and the Loughborough London Human Activity Recognition dataset. The datasets were collected using an autonomous platform designed and developed in-house as part of this research activity. The proposed framework for Free Space Detection is based on an active learning paradigm that leverages the relative uncertainty of multimodal sensor data streams (ultrasound and camera). It uses an online learning methodology to continuously update the learnt model whenever the vehicle experiences new environments. The proposed Free Space Detection algorithm enables an autonomous vehicle to self-learn, evolve and adapt to environments it has never encountered before. The results illustrate that the online learning mechanism is superior to one-off training of deep neural networks, which require large datasets to generalize to unfamiliar surroundings. The thesis takes the view that humans should be at the centre of any technological development related to artificial intelligence. This is imperative within intelligent mobility, where an autonomous vehicle should be aware of what humans are doing in its vicinity. To improve the robustness of human activity recognition, this thesis proposes a novel algorithm that classifies point-cloud data originating from Light Detection and Ranging sensors. 
The proposed algorithm leverages multimodality by using the camera data to identify humans and segment the region of interest in the point cloud data. The corresponding 3-dimensional data is converted to a Fisher Vector representation before being classified by a deep Convolutional Neural Network. The proposed algorithm classifies the indoor activities performed by a human subject with an average precision of 90.3%. When compared to an alternative point cloud classifier, PointNet [1], [2], the proposed framework outperformed it on all classes. The autonomous testbed developed for data collection and algorithm validation, together with the multimodal data-driven solutions for driverless cars, are the major contributions of this thesis. It is anticipated that these results and the testbed will have significant implications for the future of intelligent mobility by amplifying the development of intelligent driverless vehicles.

    Deep Learning Methods for Human Activity Recognition using Wearables

    Wearable sensors provide an infrastructure-less multi-modal sensing method. Current trends point to a pervasive integration of wearables into our lives, with these devices providing the basis for wellness and healthcare applications across rehabilitation, caring for a growing older population, and improving human performance. Fundamental to these applications is our ability to automatically and accurately recognise human activities from often tiny sensors embedded in wearables. In this dissertation, we consider the problem of human activity recognition (HAR) using multi-channel time-series data captured by wearable sensors. Our collective know-how regarding the solution of HAR problems with wearables has progressed immensely through the use of deep learning paradigms. Nevertheless, this field still faces unique methodological challenges. As such, this dissertation focuses on developing end-to-end deep learning frameworks to promote HAR application opportunities using wearable sensor technologies and to mitigate specific associated challenges. The investigated problems cover a diverse range of HAR challenges and span from fully supervised to unsupervised problem domains. In order to enhance automatic feature extraction from multi-channel time-series data for HAR, the problem of learning enriched and highly discriminative activity feature representations with deep neural networks is considered. Accordingly, novel end-to-end network elements are designed which: (a) exploit the latent relationships between multi-channel sensor modalities and specific activities, (b) employ effective regularisation through data-agnostic augmentation for multi-modal sensor data streams, and (c) incorporate optimization objectives to encourage minimal intra-class representation differences, while maximising inter-class differences to achieve more discriminative features. 
In order to promote new opportunities in HAR with emerging battery-less sensing platforms, the problem of learning from irregularly sampled and temporally sparse readings captured by passive sensing modalities is considered. For the first time, an efficient set-based deep learning framework is developed to address the problem. This framework is able to learn directly from the generated data, bypassing the need for the conventional interpolation pre-processing stage. In order to address the multi-class window problem and create potential solutions for the challenging task of concurrent human activity recognition, the problem of enabling simultaneous prediction of multiple activities for sensory segments is considered. As such, the flexibility provided by the emerging set learning concepts is further leveraged to introduce a novel formulation of HAR. This formulation treats HAR as a set prediction problem and elegantly caters for segments carrying sensor data from multiple activities. To address this set prediction problem, a unified deep HAR architecture is designed that: (a) incorporates a set objective to learn mappings from raw input sensory segments to target activity sets, and (b) precedes the supervised learning phase with unsupervised parameter pre-training to exploit unlabelled data for better generalisation performance. In order to leverage the easily accessible unlabelled activity data-streams to serve downstream classification tasks, the problem of unsupervised representation learning from multi-channel time-series data is considered. For the first time, a novel recurrent generative adversarial network (GAN) framework is developed that explores the GAN’s latent feature space to extract highly discriminating activity features in an unsupervised fashion. The superiority of the learned representations is substantiated by their ability to outperform the de facto unsupervised approaches based on autoencoder frameworks. 
At the same time, they rival the recognition performance of fully supervised trained models on downstream classification benchmarks. In recognition of the scarcity of large-scale annotated sensor datasets and the tediousness of collecting additional labelled data in this domain, the hitherto unexplored problem of end-to-end clustering of human activities from unlabelled wearable data is considered. To address this problem, a first study is presented for the purpose of developing a stand-alone deep learning paradigm to discover semantically meaningful clusters of human actions. In particular, the paradigm is intended to: (a) leverage the inherently sequential nature of sensory data, (b) exploit self-supervision from reconstruction and future prediction tasks, and (c) incorporate clustering-oriented objectives to promote the formation of highly discriminative activity clusters. The systematic investigations in this study create new opportunities for HAR to learn human activities using unlabelled data that can be conveniently and cheaply collected from wearables. Thesis (Ph.D.) -- University of Adelaide, School of Computer Science, 202
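The multi-class window problem mentioned in this abstract arises when a fixed-length segment overlaps more than one activity. A minimal sketch of how set-valued targets could be formed for such segments is shown below. It assumes one label per sensor sample and a simple sliding window; the function name, window length, and stride are hypothetical and are not taken from the thesis.

```python
def window_label_sets(labels, window, stride):
    """Return the set of activities present in each sliding window.

    labels: one activity label per sensor sample.
    window and stride are given in samples (illustrative parameters).
    """
    targets = []
    for start in range(0, len(labels) - window + 1, stride):
        # A window spanning an activity change yields more than one label.
        targets.append(frozenset(labels[start:start + window]))
    return targets


if __name__ == "__main__":
    stream = ["walk"] * 4 + ["sit"] * 4
    for target in window_label_sets(stream, window=4, stride=2):
        print(sorted(target))
```

The middle window in this toy stream carries both `walk` and `sit`, which a single-label classifier cannot represent but a set-valued target can.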