
    Recognising facial expressions in video sequences

    We introduce a system that processes a sequence of images of a front-facing human face and recognises a set of facial expressions. We use an efficient appearance-based face tracker to locate the face in the image sequence and estimate the deformation of its non-rigid components. The tracker works in real time. It is robust to strong illumination changes and factors out changes in appearance caused by illumination from changes due to face deformation. We adopt a model-based approach for facial expression recognition. In our model, an image of a face is represented by a point in a deformation space. The variability of the classes of images associated with facial expressions is represented by a set of samples that model a low-dimensional manifold in the space of deformations. We introduce a probabilistic procedure based on a nearest-neighbour approach that combines the information provided by the incoming image sequence with the prior information stored in the expression manifold in order to compute a posterior probability associated with each facial expression. In the experiments conducted, we show that this system is able to work in an unconstrained environment with strong changes in illumination and face location. It achieves an 89% recognition rate on a set of 333 sequences from the Cohn-Kanade database.
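
    As an illustrative sketch of the nearest-neighbour posterior computation described above, the snippet below assumes the current deformation vector, the stored manifold samples with their expression labels, and a class prior are available as NumPy arrays; the Gaussian-kernel weighting and all names are assumptions, not the authors' implementation.

        import numpy as np

        def expression_posterior(query, samples, labels, prior, sigma=1.0, k=10):
            # query   : (d,) deformation vector for the current frame
            # samples : (n, d) stored samples of the expression manifold
            # labels  : (n,) integer expression class of each sample
            # prior   : (c,) prior probability of each expression class
            d2 = np.sum((samples - query) ** 2, axis=1)   # squared distances
            nn = np.argsort(d2)[:k]                       # k nearest neighbours
            w = np.exp(-d2[nn] / (2.0 * sigma ** 2))      # kernel weights
            like = np.zeros_like(prior, dtype=float)
            for cls, wt in zip(labels[nn], w):            # per-class likelihood
                like[cls] += wt
            post = like * prior                           # Bayes: likelihood x prior
            return post / post.sum() if post.sum() > 0 else prior

    In a sequence setting, the posterior computed for one frame could be fed back as the prior for the next, accumulating temporal evidence in the way the abstract describes.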

    Learning to detect chest radiographs containing lung nodules using visual attention networks

    Machine learning approaches hold great potential for the automated detection of lung nodules in chest radiographs, but training the algorithms requires very large amounts of manually annotated images, which are difficult to obtain. Weak labels indicating whether a radiograph is likely to contain pulmonary nodules are typically easier to obtain at scale by parsing historical free-text radiological reports associated with the radiographs. Using a repository of over 700,000 chest radiographs, we demonstrate in this study that promising nodule detection performance can be achieved using weak labels through convolutional neural networks for radiograph classification. We propose two network architectures for the classification of images likely to contain pulmonary nodules using both weak labels and manually delineated bounding boxes, when these are available. Annotated nodules are used at training time to drive a visual attention mechanism that informs the model about its localisation performance. The first architecture extracts saliency maps from high-level convolutional layers and compares the estimated position of a nodule against the ground truth, when this is available. A corresponding localisation error is then back-propagated along with the softmax classification error. The second approach consists of a recurrent attention model that learns to observe a short sequence of smaller image portions through reinforcement learning. When a nodule annotation is available at training time, the reward function is modified so that exploring portions of the radiograph away from a nodule incurs a larger penalty. Our empirical results demonstrate the potential advantages of these architectures in comparison to competing methodologies.
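
    As a sketch of how the first architecture's localisation error might be combined with the softmax classification error, the snippet below uses a soft-argmax over the saliency map so that the distance penalty stays differentiable; the abstract does not give the exact loss, so this formulation and all names are assumptions.

        import torch
        import torch.nn.functional as F

        def attention_loss(logits, saliency, labels, centres, lam=0.5):
            # logits   : (B, 2) nodule / no-nodule scores
            # saliency : (B, H, W) maps from a high-level conv layer
            # labels   : (B,) weak labels parsed from the reports
            # centres  : per-image (cx, cy) nodule centre, or None if unannotated
            loss = F.cross_entropy(logits, labels)
            B, H, W = saliency.shape
            ys = torch.arange(H, dtype=saliency.dtype)
            xs = torch.arange(W, dtype=saliency.dtype)
            for b in range(B):
                if centres[b] is None:                 # weak label only
                    continue
                p = F.softmax(saliency[b].flatten(), dim=0).view(H, W)
                py = (p.sum(dim=1) * ys).sum()         # soft-argmax row
                px = (p.sum(dim=0) * xs).sum()         # soft-argmax column
                cx, cy = centres[b]
                dist = ((px - cx) ** 2 + (py - cy) ** 2).sqrt()
                loss = loss + lam * dist               # localisation penalty
            return loss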

    SENSOR PERFORMANCE ANALYSIS FOR MINE DETECTION WITH UNMANNED VEHICLES IN VERY SHALLOW WATER AND SURF ZONES

    The very shallow water and surf zones present extraordinary challenges for classifying submerged objects such as mines or shoals. Accessing these areas with traditional unmanned underwater vehicles is difficult, and remotely operated vehicles often require putting operators in harm’s way. This research explores the potential to perform object classification using only forward-looking sonar in the desired operating zones. Experiments were conducted in a controlled environment for two different target objects, a glass sphere and a rectangular cinder block. Next, forward-looking sonar images were analyzed to determine how the intensity and distribution of target returns changed as a function of distance and angle from the sonar. The ability to correlate experimentally measured intensity profiles with a target’s physical size and shape is examined. Finally, recommendations for future research are proposed to further develop this approach for potential naval applications like mine countermeasures. NECC, Virginia Beach, VA, 23459. Lieutenant, United States Navy. Approved for public release. Distribution is unlimited.
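
    A hedged sketch of the kind of intensity-profile analysis described above: assuming polar-format sonar frames (rows as range bins, columns as beam angles) and known per-frame target positions, it summarises the target return around each position; the data layout and all names are assumptions, not the thesis's method.

        import numpy as np

        def return_profile(frames, target_rc, win=5):
            # frames    : list of 2-D arrays, rows = range bins, cols = beam angles
            # target_rc : per-frame (range_bin, beam_idx) of the target
            # win       : half-width of the window sampled around the target
            out = []
            for img, (r, c) in zip(frames, target_rc):
                patch = img[max(r - win, 0):r + win + 1,
                            max(c - win, 0):c + win + 1]
                out.append((r, c, float(patch.mean()), float(patch.std())))
            return np.array(out)   # intensity mean/spread vs. range and angle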

    Cumulative Single-cell Laser Ablation of Functionally or Genetically Defined Respiratory Neurons Interrogates Network Properties of Mammalian Breathing-related Neural Circuits in vitro

    A key feature of many neurodegenerative diseases is the pathological loss of neurons that participate in generating behavior. To mimic the degeneration of a functioning neural circuit, we designed a computer-automated system that algorithmically detects and sequentially laser-ablates constituent neurons of a neural network with single-cell precision while monitoring the progressive change in network function in real time. We applied this cell-specific cumulative lesion technique to an advantageous experimental model, the preBötzinger Complex (preBötC), the mammalian respiratory central pattern generator (CPG), which can be retained in thin slice preparations and spontaneously generates breathing-related motor activity in vitro. We sought to investigate the question: how many neurons are necessary for generating respiratory behavior in vitro? This question pertains to whether and how progressive cell destruction impairs, and possibly precludes, behaviorally relevant network function. Our ablation system identifies rhythm-generating interneurons in the preBötC based on genetically encoded fluorescent protein markers or imaged Ca2+ activity patterns, stores their physical locations in memory, and then laser-ablates the target neurons one at a time in random order, while continuously measuring changes to respiratory motor output via hypoglossal (XII) nerve electrophysiology in vitro. A critical feature of the system is a custom software package, dubbed Ablator (written in Python), that detects cell targets, controls stage translation, focuses the laser, and implements the spot-lesion protocol automatically. Experiments are typically carried out in three steps: 1) define the lesion domain and initialize the system; 2) run image acquisition and target-detection algorithms that map the populations of respiratory neurons in the bilateral volumes of the slice; 3) determine the order of lesions and then spot-lesion target neurons sequentially until all targets are exhausted. Here we show that selectively and cumulatively deleting rhythmically active inspiratory neurons detected via Ca2+ imaging in the preBötC progressively decreases respiratory frequency and the amplitude of motor output. On average, the deletion of 120 ± 45 neurons stopped the spontaneous respiratory rhythm, and our data suggest that ∼82% of the rhythm-generating neurons remained un-lesioned. Similarly, destruction of 85 ± 45 neurons derived from the homeodomain transcription factor Dbx1 (Dbx1+ neurons), which are hypothesized to comprise the rhythmogenic core of the respiratory CPG in the preBötC, likewise precludes respiratory motor behavior in vitro. These two estimates of the size of the critical rhythmogenic core can be reconciled by considering that the Ca2+ imaging method identifies ∼50% inhibitory neurons, which are found in the preBötC but are not rhythmogenic, whereas the Dbx1+ marker identifies only excitatory rhythmogenic neurons. Serial ablations in other medullary respiratory regions did not affect frequency and diminished the amplitude of motor output to a lesser degree. These data support the hypothesis that cumulative single-cell ablations cross a critical threshold during lesioning, after which rhythm generation in the respiratory network is unsustainable. Furthermore, this study provides a novel measurement that can help quantify network properties of the preBötC and gauge its susceptibility to failure. Our results may in turn help explain respiratory failure in patients with neurodegenerative diseases that cause progressive cell death in brainstem respiratory networks.
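
    A schematic sketch of the cumulative spot-lesion loop (step 3 of the protocol); Ablator is written in Python per the abstract, but this is not its code: the three callbacks and the stopping rule are hypothetical stand-ins.

        import random

        def spot_lesion_protocol(detect_targets, ablate, measure_output,
                                 stop_threshold=0.0):
            # detect_targets : returns mapped (x, y, z) neuron coordinates (step 2)
            # ablate         : translates the stage, focuses the laser, and fires
            # measure_output : returns the current XII motor-output amplitude
            targets = detect_targets()       # target map stored in memory
            random.shuffle(targets)          # lesion order is randomised
            log = []
            for n, xyz in enumerate(targets, start=1):
                ablate(xyz)                  # one cell at a time
                amp = measure_output()       # network function in real time
                log.append((n, amp))
                if amp <= stop_threshold:    # rhythm no longer detectable
                    break
            return log                       # lesion count vs. output amplitude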

    A hierarchical and regional deep learning architecture for image description generation

    This research proposes a distinctive deep learning network architecture for image captioning and description generation. Specifically, we propose a hierarchically trained deep network that increases the fluidity and descriptive nature of the generated image captions. The proposed deep network consists of an initial regional proposal generation stage and two key stages for image description generation. The initial regional proposal generation is based upon the Region Proposal Network from Faster R-CNN. This process generates regions of interest that are then used to annotate and classify human and object attributes. The first key stage of the proposed system conducts detailed label description generation for each region of interest. The second stage uses a Recurrent Neural Network (RNN)-based encoder-decoder structure to translate these regional descriptions into a full image description. Notably, the proposed deep network model can label scenes, objects, and human and object attributes simultaneously, which is achieved through multiple individually trained RNNs. The empirical results indicate that our work is comparable to existing research and considerably outperforms existing state-of-the-art methods when evaluated on out-of-domain images from the IAPR TC-12 dataset, especially considering that our system is not trained on images from any of the image captioning datasets. When evaluated with several well-known evaluation metrics, the proposed system achieves an improvement of ∼60% at BLEU-1 over existing methods on the IAPR TC-12 dataset. Moreover, compared with related methods, the proposed deep network requires substantially fewer data samples for training, leading to a much-reduced computational cost.
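
    A minimal sketch of the second-stage encoder-decoder idea, translating tokenised regional descriptions into a full image description; the GRU choice, layer sizes, and names are assumptions rather than the paper's architecture.

        import torch
        import torch.nn as nn

        class RegionToCaption(nn.Module):
            # Encode concatenated regional descriptions, decode a full caption.
            def __init__(self, vocab, emb=256, hid=512):
                super().__init__()
                self.embed = nn.Embedding(vocab, emb)
                self.encoder = nn.GRU(emb, hid, batch_first=True)
                self.decoder = nn.GRU(emb, hid, batch_first=True)
                self.out = nn.Linear(hid, vocab)

            def forward(self, region_tokens, caption_tokens):
                # region_tokens  : (B, Tr) token ids of regional descriptions
                # caption_tokens : (B, Tc) shifted caption ids (teacher forcing)
                _, h = self.encoder(self.embed(region_tokens))   # summary state
                y, _ = self.decoder(self.embed(caption_tokens), h)
                return self.out(y)                               # (B, Tc, vocab)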