2,721 research outputs found

    A Self-Organizing Neural System for Learning to Recognize Textured Scenes

    Full text link
    A self-organizing ARTEX model is developed to categorize and classify textured image regions. ARTEX specializes the FACADE model of how the visual cortex sees, and the ART model of how temporal and prefrontal cortices interact with the hippocampal system to learn visual recognition categories and their names. FACADE processing generates a vector of boundary and surface properties, notably texture and brightness properties, by utilizing multi-scale filtering, competition, and diffusive filling-in. Its context-sensitive local measures of textured scenes can be used to recognize scenic properties that gradually change across space, as well as abrupt texture boundaries. ART incrementally learns recognition categories that classify FACADE output vectors, the class names of these categories, and their probabilities. Top-down expectations within ART encode learned prototypes that pay attention to expected visual features. When novel visual information creates a poor match with the best existing category prototype, a memory search selects a new category with which to classify the novel data. ARTEX is compared with psychophysical data, and is benchmarked on classification of natural textures and synthetic aperture radar images. It outperforms state-of-the-art systems that use rule-based, backpropagation, and K-nearest neighbor classifiers. Defense Advanced Research Projects Agency; Office of Naval Research (N00014-95-1-0409, N00014-95-1-0657)
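
    As a rough, simplified sketch of the ART match/reset cycle described above (not ARTEX's actual implementation), the following Python fragment tests an input against stored category prototypes, refines the best-matching prototype on resonance, and commits a new category when the vigilance test fails. The vigilance and learning-rate values are illustrative assumptions.

```python
import numpy as np

def art_classify(x, prototypes, vigilance=0.75, lr=0.5):
    """Simplified fuzzy-ART-style category search (illustrative only):
    only the best-matching category is tested against vigilance."""
    if not prototypes:                        # first input commits category 0
        prototypes.append(x.copy())
        return 0
    # Match score: |min(x, w)| / |x| (fuzzy AND against each prototype)
    scores = [np.minimum(x, w).sum() / (x.sum() + 1e-9) for w in prototypes]
    best = int(np.argmax(scores))
    if scores[best] >= vigilance:
        # Resonance: nudge the winning prototype toward the input
        prototypes[best] = (lr * np.minimum(x, prototypes[best])
                            + (1 - lr) * prototypes[best])
        return best
    # Mismatch reset: the memory search fails, so commit a new category
    prototypes.append(x.copy())
    return len(prototypes) - 1

protos = []
for vec in np.random.default_rng(0).random((5, 8)):
    print(art_classify(vec, protos))          # category index per input
```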

    Development of a Novel Technique for Predicting Tumor Response in Adaptive Radiation Therapy

    Get PDF
    This dissertation concentrates on the introduction of Predictive Adaptive Radiation Therapy (PART) as a potential method to improve cancer treatment. PART is a novel technique that utilizes volumetric image-guided radiation therapy (IGRT) data to actively predict the tumor response to therapy and estimate clinical outcomes during the course of treatment. To implement PART, a patient database containing IGRT image data for 40 lesions, obtained from patients who were imaged and treated with helical tomotherapy, was constructed. The data was then modeled using locally weighted regression. This model predicts future tumor volumes and masses, and the associated confidence intervals, based on limited observations during the first two weeks of treatment. All predictions were made using only 8 days' worth of observations from early in the treatment and were bound by a 95% confidence interval. Since the predictions were accurate with quantified uncertainty, they could eventually be used to optimize and adapt treatment accordingly, hence the term PART (Predictive Adaptive Radiation Therapy). A challenge in implementing PART in a clinical setting is the increased quality assurance that it will demand. To help ease this burden, a technique was developed to automatically evaluate helical tomotherapy treatments during delivery using exit detector data. This technique uses an auto-associative kernel regression (AAKR) model to detect errors in tomotherapy delivery. This modeling scheme is especially suited to the problem of monitoring the fluence values found in the exit detector data because it is able to learn the complex detector data relationships. Several AAKR models were tested using tomotherapy detector data from deliveries that had intentionally inserted errors and different attenuations from the sinograms that were used to develop the model. The model proved to be robust and could predict the correct “error-free” values for a projection in which the opening time of a single MLC leaf had been decreased by 10%. The model was also able to detect machine output errors. The automation of this technique should significantly ease the QA burden that accompanies adaptive therapy, and will help to make the implementation of PART more feasible.
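
    The volume predictions described above rest on locally weighted regression. The following minimal Python sketch, with hypothetical variable names and a toy shrinking-tumor curve rather than the dissertation's patient data, shows the basic mechanics: weight the early observations by a Gaussian kernel centred on the query day and extrapolate a weighted linear fit.

```python
import numpy as np

def lwr_predict(days, vols, day_query, tau=3.0):
    """Locally weighted linear regression (hypothetical names/values):
    Gaussian-weight the observations around the query day, then
    evaluate the weighted least-squares line at that day."""
    X = np.column_stack([np.ones_like(days), days])
    w = np.exp(-(days - day_query) ** 2 / (2 * tau ** 2))
    beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * vols))
    return np.array([1.0, day_query]) @ beta

days = np.arange(1.0, 9.0)                 # 8 early observation days
vols = 50.0 * np.exp(-0.03 * days)         # toy shrinking-tumor volumes
print(lwr_predict(days, vols, day_query=20.0))
```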

    PlaNet - Photo Geolocation with Convolutional Neural Networks

    Full text link
    Is it possible to build a system to determine the location where a photo was taken using just its pixels? In general, the problem seems exceptionally difficult: it is trivial to construct situations where no location can be inferred. Yet images often contain informative cues such as landmarks, weather patterns, vegetation, road markings, and architectural details, which in combination may allow one to determine an approximate location and occasionally an exact location. Websites such as GeoGuessr and View from your Window suggest that humans are relatively good at integrating these cues to geolocate images, especially en masse. In computer vision, the photo geolocation problem is usually approached using image retrieval methods. In contrast, we pose the problem as one of classification by subdividing the surface of the earth into thousands of multi-scale geographic cells, and train a deep network using millions of geotagged images. While previous approaches only recognize landmarks or perform approximate matching using global image descriptors, our model is able to use and integrate multiple visible cues. We show that the resulting model, called PlaNet, outperforms previous approaches and even attains superhuman levels of accuracy in some cases. Moreover, we extend our model to photo albums by combining it with a long short-term memory (LSTM) architecture. By learning to exploit temporal coherence to geolocate uncertain photos, we demonstrate that this model achieves a 50% performance improvement over the single-image model.
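
    The classification formulation is the key idea: once the earth's surface is carved into discrete cells, geolocation reduces to ordinary multi-class prediction. The toy Python sketch below uses a fixed-size lat/lng grid purely for illustration; PlaNet itself uses adaptive multi-scale S2 cells sized by photo density.

```python
def cell_id(lat, lng, deg_per_cell=5.0):
    """Toy stand-in for PlaNet's adaptive S2 partitioning: bucket the
    globe into fixed-size lat/lng cells and return a class index."""
    rows = int(180 / deg_per_cell)
    cols = int(360 / deg_per_cell)
    r = min(int((lat + 90) / deg_per_cell), rows - 1)
    c = min(int((lng + 180) / deg_per_cell), cols - 1)
    return r * cols + c

# Training then reduces to cross-entropy classification over cell_id
# targets; the predicted location is the centre of the argmax cell.
print(cell_id(48.8584, 2.2945))   # the Eiffel Tower's Paris-area cell
```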

    Deep learning cardiac motion analysis for human survival prediction

    Get PDF
    Motion analysis is used in computer vision to understand the behaviour of moving objects in sequences of images. Optimising the interpretation of dynamic biological systems requires accurate and precise motion tracking as well as efficient representations of high-dimensional motion trajectories so that these can be used for prediction tasks. Here we use image sequences of the heart, acquired using cardiac magnetic resonance imaging, to create time-resolved three-dimensional segmentations using a fully convolutional network trained on anatomical shape priors. This dense motion model formed the input to a supervised denoising autoencoder (4Dsurvival), which is a hybrid network consisting of an autoencoder that learns a task-specific latent code representation trained on observed outcome data, yielding a latent representation optimised for survival prediction. To handle right-censored survival outcomes, our network used a Cox partial likelihood loss function. In a study of 302 patients the predictive accuracy (quantified by Harrell's C-index) was significantly higher (p < .0001) for our model (C=0.73, 95% CI: 0.68-0.78) than for the human benchmark (C=0.59, 95% CI: 0.53-0.65). This work demonstrates how a complex computer vision task using high-dimensional medical image data can efficiently predict human survival.
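
    The Cox partial likelihood loss mentioned above is what lets the network learn from right-censored outcomes. Below is a minimal NumPy sketch of the negative partial log-likelihood (ignoring tied event times); the toy batch is invented for illustration, not drawn from the study.

```python
import numpy as np

def cox_loss(risk, time, event):
    """Negative Cox partial log-likelihood (ties ignored). `risk` is the
    network's scalar output per patient; censored cases (event=0) enter
    only through the risk sets of patients with observed events."""
    order = np.argsort(-time)                     # descending survival time
    risk, event = risk[order], event[order]
    log_risk_set = np.logaddexp.accumulate(risk)  # log-sum-exp over risk set
    return -np.sum((risk - log_risk_set)[event == 1]) / max(event.sum(), 1)

# Toy batch of three patients, the third right-censored
print(cox_loss(np.array([0.2, 1.1, -0.3]),
               np.array([5.0, 2.0, 8.0]),
               np.array([1, 1, 0])))
```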

    Platonic model of mind as an approximation to neurodynamics

    Get PDF
    A hierarchy of approximations involved in the simplification of microscopic theories, from the sub-cellular to the whole-brain level, is presented. A new approximation to neural dynamics is described, leading to a Platonic-like model of mind based on psychological spaces. Objects and events in these spaces correspond to quasi-stable states of brain dynamics and may be interpreted from a psychological point of view. The Platonic model bridges the gap between the neurosciences and the psychological sciences. Static and dynamic versions of this model are outlined, and Feature Space Mapping, a neurofuzzy realization of the static version of the Platonic model, is described. Categorization experiments with human subjects are analyzed from the neurodynamical and Platonic model points of view.

    Linear Regression and Unsupervised Learning For Tracking and Embodied Robot Control.

    Get PDF
    Computer vision problems, such as tracking and robot navigation, tend to be solved using models of the objects of interest to the problem. These models are often either hard-coded or learned in a supervised manner. In either case, an engineer is required to identify the visual information that is important to the task, which is both time-consuming and problematic. Issues with these engineered systems relate to the ungrounded nature of the knowledge imparted by the engineer: the systems have no meaning attached to their representations. This leads to systems that are brittle and prone to failure when expected to act in environments not envisaged by the engineer. The work presented in this thesis removes the need for hard-coded or engineered models of either visual information representations or behaviour. This is achieved by developing novel approaches for learning from example, in both input (percept) and output (action) spaces. This approach leads to the development of novel feature tracking algorithms and methods for robot control. Applying this approach to feature tracking, unsupervised learning is employed, in real time, to build appearance models of the target that represent the input space structure, and this structure is exploited to partition banks of computationally efficient, linear regression based target displacement estimators. This thesis presents the first application of regression based methods to the problem of simultaneously modeling and tracking a target object. The computationally efficient Linear Predictor (LP) tracker is investigated, along with methods for combining and weighting flocks of LPs. The tracking algorithms developed operate with accuracy comparable to other state-of-the-art online approaches and with a significant gain in computational efficiency. This is achieved as a result of two specific contributions. First, novel online approaches for the unsupervised learning of modes of target appearance that identify aspects of the target are introduced. Second, a general tracking framework is developed within which the identified aspects of the target are adaptively associated with subsets of a bank of LP trackers. This results in the partitioning of LPs and the online creation of aspect-specific LP flocks that facilitate tracking through significant appearance changes. Applying the approach to the percept-action domain, unsupervised learning is employed to discover the structure of the action space, and this structure is used in the formation of meaningful perceptual categories and to facilitate the use of localised input-output (percept-action) mappings. This approach provides a realisation of an embodied and embedded agent that organises its perceptual space, and hence its cognitive process, based on interactions with its environment. Central to the proposed approach is the technique of clustering an input-output exemplar set based on output similarity, and using the resultant input exemplar groupings to characterise a perceptual category. All input exemplars that are coupled to a certain class of outputs form a category: the category of a given affordance, action, or function. In this sense the formed perceptual categories have meaning and are grounded in the embodiment of the agent. The approach is shown to identify the relative importance of perceptual features and is able to solve percept-action tasks, defined only by demonstration, in previously unseen situations.
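
    A minimal Python sketch of the Linear Predictor idea described above, with illustrative parameters and a random toy image rather than the thesis's data: synthetically displace a sparse template, record the intensity differences, and solve a least-squares mapping from differences back to the displacement that re-centres the target.

```python
import numpy as np

def train_lp(img, x, y, offsets, n_samples=200, max_d=5, seed=0):
    """Illustrative Linear Predictor training: displace the template
    synthetically, record intensity differences at sparse support
    pixels, and solve for a linear difference-to-displacement map."""
    rng = np.random.default_rng(seed)
    ref = img[y + offsets[:, 1], x + offsets[:, 0]].astype(float)
    D, T = [], []
    for _ in range(n_samples):
        dx, dy = rng.integers(-max_d, max_d + 1, size=2)
        cur = img[y + dy + offsets[:, 1], x + dx + offsets[:, 0]].astype(float)
        D.append(cur - ref)                  # intensity-difference vector
        T.append([-dx, -dy])                 # correction re-centring the target
    P, *_ = np.linalg.lstsq(np.asarray(D), np.asarray(T, float), rcond=None)
    return ref, P                            # at runtime: (sample - ref) @ P

rng = np.random.default_rng(1)
img = rng.integers(0, 255, size=(120, 120))
offsets = rng.integers(-8, 9, size=(60, 2))
ref, P = train_lp(img, 60, 60, offsets)
# One sparse sample and one small matrix product per displacement
# estimate is the source of the computational efficiency gain.
```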
    Within this percept-action learning framework, two alternative approaches are developed. The first approach employs hierarchical output-space clustering of point-to-point mappings to achieve search efficiency and input- and output-space generalisation, as well as a mechanism for identifying the important variance and invariance in the input space. The exemplar hierarchy provides, in a single structure, a mechanism for classifying previously unseen inputs and generating appropriate outputs. The second approach to a percept-action learning framework integrates the regression mappings used in the feature tracking domain with the action space clustering and imitation learning techniques developed in the percept-action domain. These components are utilised within a novel percept-action data mining methodology that is able to discover the visual entities that are important to a specific problem and to map from these entities onto the action space. Applied to the robot control task, this approach allows for real-time generation of continuous action signals, without the use of any supervision or definition of representations or rules of behaviour.
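
    As a hedged illustration of the core percept-action move, clustering exemplars by output (action) similarity and letting each action cluster define a grounded perceptual category, the following Python sketch uses hierarchical (Ward) clustering; the names and toy data are assumptions, not the thesis's implementation.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

def percept_categories(percepts, actions, n_groups=3):
    """Cluster exemplars by ACTION similarity; each action cluster then
    defines a grounded perceptual category via its member percepts."""
    Z = linkage(actions, method='ward')          # hierarchy over action space
    labels = fcluster(Z, t=n_groups, criterion='maxclust')
    # Summarise each category by its mean percept; classify novel
    # percepts by the nearest category mean.
    cats = {k: percepts[labels == k].mean(axis=0) for k in np.unique(labels)}
    return labels, cats

def classify(percept, cats):
    return min(cats, key=lambda k: np.linalg.norm(percept - cats[k]))

rng = np.random.default_rng(0)
percepts = rng.random((30, 6))               # toy percept vectors
actions = np.repeat(np.eye(3), 10, axis=0)   # three distinct action types
labels, cats = percept_categories(percepts, actions)
print(classify(percepts[0], cats))
```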

    An Integrated Fuzzy Inference Based Monitoring, Diagnostic, and Prognostic System

    Get PDF
    To date, the majority of the research related to the development and application of monitoring, diagnostic, and prognostic systems has been exclusive in the sense that only one of the three areas is the focus of the work. While previous research has advanced each of the respective fields, the end result is a grab bag of techniques that address each problem independently. Also, the new field of prognostics is lacking in the sense that few methods have been proposed that produce estimates of the remaining useful life (RUL) of a device or can be realistically applied to real-world systems. This work addresses both problems by developing the nonparametric fuzzy inference system (NFIS), which is adapted for monitoring, diagnosis, and prognosis, and then proposing the path classification and estimation (PACE) model that can be used to predict the RUL of a device that does or does not have a well-defined failure threshold. To test and evaluate the proposed methods, they were applied to detect, diagnose, and prognose faults and failures in the hydraulic steering system of a deep oil exploration drill. The monitoring system implementing an NFIS predictor and sequential probability ratio test (SPRT) detector produced detection rates comparable to a monitoring system implementing an autoassociative kernel regression (AAKR) predictor and SPRT detector, specifically 80% vs. 85% for the NFIS and AAKR monitors, respectively. It was also found that the NFIS monitor produced fewer false alarms. Next, the monitoring system outputs were used to generate symptom patterns for k-nearest neighbor (kNN) and NFIS classifiers that were trained to diagnose different fault classes. The NFIS diagnoser was shown to significantly outperform the kNN diagnoser, with overall accuracies of 96% vs. 89% respectively. Finally, the PACE model implementing the NFIS was used to predict the RUL for different failure modes. The errors of the RUL estimates produced by the PACE-NFIS prognosers ranged from 1.2-11.4 hours with 95% confidence intervals (CI) from 0.67-32.02 hours, which are significantly better than the population-based prognoser estimates with errors of ~45 hours and 95% CIs of ~162 hours.
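
    The AAKR prediction and SPRT detection steps referenced above can be sketched compactly. The Python fragment below is an illustrative rendering of the general techniques (Gaussian kernel weighting over fault-free exemplars, then a Wald sequential test on the residuals), not the dissertation's tuned models; all parameters are placeholders.

```python
import numpy as np

def aakr_predict(query, memory, h=1.0):
    """Estimate the 'error-free' signal vector as a kernel-weighted
    average of fault-free exemplars (bandwidth h is a placeholder)."""
    d2 = ((memory - query) ** 2).sum(axis=1)
    w = np.exp(-d2 / (2 * h ** 2))
    return (w[:, None] * memory).sum(axis=0) / (w.sum() + 1e-12)

def sprt(residuals, m0=0.0, m1=1.0, sigma=0.5, a=-4.6, b=4.6):
    """Wald SPRT on Gaussian residuals: accumulate the log-likelihood
    ratio of faulted mean m1 vs. healthy mean m0 until a bound is hit."""
    llr = 0.0
    for r in residuals:
        llr += ((r - m0) ** 2 - (r - m1) ** 2) / (2 * sigma ** 2)
        if llr >= b:
            return 'fault'
        if llr <= a:
            return 'healthy'
    return 'undecided'

memory = np.random.default_rng(0).random((50, 8))   # fault-free exemplars
obs = memory[0] + 0.3                               # drifted observation
residual = obs - aakr_predict(obs, memory)
print(sprt(residual))
```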