
    A Monocular SLAM Method to Estimate Relative Pose During Satellite Proximity Operations

    Automated satellite proximity operations are an increasingly relevant area of mission operations for the US Air Force, with the potential to significantly enhance space situational awareness (SSA). Simultaneous localization and mapping (SLAM) is a computer vision method of constructing and updating a 3D map while keeping track of the location and orientation of the imaging agent inside the map. The main objective of this research effort is to design a monocular SLAM method customized for the space environment. The method developed in this research will be implemented in an indoor proximity operations simulation laboratory. A run-time analysis is performed, showing near real-time operation. The method is verified by comparing SLAM results to truth vertical rotation data from a CubeSat air bearing testbed. This work enables control and testing of simulated proximity operations hardware in a laboratory environment. Additionally, this research lays the foundation for autonomous satellite proximity operations with unknown targets and minimal additional size, weight, and power requirements, creating opportunities for numerous mission concepts not previously available.
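    The abstract verifies its SLAM output against truth rotation data. As a minimal illustrative sketch (not the thesis's actual pipeline), the rotation-recovery step at the heart of such a comparison can be posed as an orthogonal Procrustes (Kabsch) problem over matched unit bearing vectors; the synthetic data and the `estimate_rotation` helper below are assumptions for illustration:

```python
import numpy as np

def estimate_rotation(p, q):
    """Kabsch/Procrustes: find the rotation R minimizing sum ||R p_i - q_i||^2,
    where p and q are 3xN arrays whose columns are matched unit vectors."""
    H = q @ p.T                      # 3x3 cross-covariance of the matches
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(U @ Vt))   # guard against a reflection
    return U @ np.diag([1.0, 1.0, d]) @ Vt

# Synthetic check: recover a known yaw rotation from noiseless matches.
theta = np.deg2rad(30)
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0,            0.0,           1.0]])
rng = np.random.default_rng(0)
p = rng.normal(size=(3, 20))
p /= np.linalg.norm(p, axis=0)       # unit bearing vectors
q = R_true @ p                       # the same features seen after rotating
R_est = estimate_rotation(p, q)
```

In a real monocular pipeline the matches come from tracked image features rather than being given, but the same SVD-based solve is the standard closed form for the attitude component being compared against the air-bearing truth data.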

    Vision Science and Technology at NASA: Results of a Workshop

    A broad review is given of vision science and technology within NASA. The subject is defined and its applications in both NASA and the nation at large are noted. A survey of current NASA efforts is given, noting strengths and weaknesses of the NASA program.

    Learning a Family of Detectors

    Object detection and recognition are important problems in computer vision. The challenges of these problems come from the presence of noise, background clutter, large within-class variations of the object class, and limited training data. In addition, the computational complexity of the recognition process is also a concern in practice. In this thesis, we propose one approach to handle the problem of detecting an object class that exhibits large within-class variations, and a second approach to speed up the classification processes. In the first approach, we show that foreground-background classification (detection) and within-class classification of the foreground class (pose estimation) can be jointly solved using a multiplicative form of two kernel functions. One kernel measures similarity for foreground-background classification. The other kernel accounts for latent factors that control within-class variation and implicitly enables feature sharing among foreground training samples. For applications where explicit parameterization of the within-class states is unavailable, a nonparametric formulation of the kernel can be constructed with a proper foreground distance/similarity measure. Detector training is accomplished via standard Support Vector Machine learning. The resulting detectors are tuned to specific variations in the foreground class. They also serve to evaluate hypotheses of the foreground state. When the image masks for foreground objects are provided in training, the detectors can also produce object segmentation. Methods for generating a representative sample set of detectors are proposed that can enable efficient detection and tracking. In addition, because individual detectors verify hypotheses of foreground state, they can also be incorporated in a tracking-by-detection framework to recover foreground state in image sequences.
    To run the detectors efficiently at the online stage, an input-sensitive speedup strategy is proposed to select the most relevant detectors quickly. The proposed approach is tested on data sets of human hands, vehicles, and human faces. On all data sets, the proposed approach achieves improved detection accuracy over the best competing approaches. In the second part of the thesis, we formulate a filter-and-refine scheme to speed up recognition processes. The binary outputs of the weak classifiers in a boosted detector are used to identify a small number of candidate foreground state hypotheses quickly via Hamming distance or weighted Hamming distance. The approach is evaluated in three applications: face recognition on the face recognition grand challenge version 2 data set, hand shape detection and parameter estimation on a hand data set, and vehicle detection and estimation of the view angle on a multi-pose vehicle data set. On all data sets, our approach is at least five times faster than simply evaluating all foreground state hypotheses, with virtually no loss in classification accuracy.
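    The filter-and-refine scheme described above can be sketched in a few lines: a binary code (the weak classifiers' outputs) prunes the hypothesis set by Hamming distance, and only the survivors get the expensive full evaluation. The code sizes, the synthetic codebook, and the stand-in `full_score` below are assumptions for illustration, not the thesis's implementation:

```python
import numpy as np

rng = np.random.default_rng(1)
n_hyp, n_weak = 500, 64           # foreground hypotheses x weak classifiers

# Offline: binary response of each weak classifier to each stored hypothesis.
codebook = rng.integers(0, 2, size=(n_hyp, n_weak), dtype=np.uint8)

def filter_and_refine(query_bits, codebook, full_score, k=10):
    """Filter: rank hypotheses by Hamming distance between their stored bit
    patterns and the query's bit pattern. Refine: evaluate the expensive
    score only on the top-k survivors."""
    hamming = np.count_nonzero(codebook != query_bits, axis=1)
    candidates = np.argsort(hamming)[:k]
    best = max(candidates, key=full_score)
    return best, candidates

# Query: the bit pattern of hypothesis 42 with a few bits flipped (noise).
query = codebook[42].copy()
query[:3] ^= 1
full_score = lambda i: -np.count_nonzero(codebook[i] != query)  # stand-in
best, cands = filter_and_refine(query, codebook, full_score)
```

The speedup comes from the filter touching only cheap bit comparisons over all 500 hypotheses, while the costly scorer runs on just 10 of them.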

    Information Theoretical Analysis of the Uniqueness of Iris Biometrics

    With the rapid globalization of technology in the world, a more reliable and secure online method of authentication is required. This can be achieved by using each individual’s distinctive biometric identifiers, such as the face, iris, fingerprint, palmprint, etc.; however, there is a bound to the uniqueness of each identifier and, consequently, a limit to the capacity that a biometric recognition system can sustain before false matches occur. Therefore, knowing the limitations on the maximum population that a biometric modality can uniquely represent is essential now more than ever. In an effort to address the general problem, we turn to the use of iris biometrics to measure its uniqueness. The measure of iris uniqueness was first introduced by John Daugman in 2003, and its analysis has since remained an open research problem. Daugman defines uniqueness as the ability to enroll more and more classes into a recognition system while the probability of collision among the classes remains fixed and near zero. Due to errors while collecting these datasets (such as occlusions, illumination conditions, camera noise, motion, and out-of-focus blur) and quality degradation from any signal processing of the iris data, even the highest-quality datasets will not approach a perfect zero probability of collision. Because of this, we appeal to techniques presented in information theory to analyze and find the maximum possible population the system can support while also measuring the quality of the iris data present in the datasets themselves. The focus of this work is divided into two new techniques to find the maximum population of an iris database: finding the limitations of Daugman's widely accepted IrisCode and proposing a new methodology leveraging the raw iris data. Firstly, Daugman's IrisCode is defined as binary templates representing each independent class present in the database.
    Through the assumption that a one-to-one encoding technique is available to map the IrisCode of each class to a new binary codeword, with length determined by the degrees of freedom inferred from the distribution of distances between each pair of independent class IrisCodes, we can appeal to Rate-Distortion Theory (limits of error-correcting codes) to establish bounds on the maximum population the IrisCode algorithm can sustain, using the minimum Hamming distance (HD) between codewords as a quality metric. Our second approach leverages an Autoregressive (AR) model to estimate each iris class's distinctive power spectral densities and then assumes a similar one-to-one mapping of each iris class to a unique Gaussian codeword. A Gaussian Sphere Packing Bound is invoked to realize the maximum population of the dataset and measure the iris quality dependent on the noise present in the data. Another bound, the Daugman-like Bound, is developed that uses the relative entropy between models of classes as a distance metric, like Hamming distance, to find the maximum population given a fixed recognition error for the system. Using these two approaches, we hope to help researchers understand the limitations present in their recognition system depending on the quality of their iris database.
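    Two quantities underpin this line of analysis: the fractional Hamming distance between masked IrisCodes, and the effective degrees of freedom that Daugman infers from the impostor-distance distribution (a mean-p, std-σ distribution behaves like a binomial over N = p(1−p)/σ² independent bits). A minimal sketch, with synthetic codes standing in for real iris templates:

```python
import numpy as np

def fractional_hd(code_a, code_b, mask_a, mask_b):
    """Fractional Hamming distance counted only over bits that are valid
    (unoccluded) in both templates."""
    valid = mask_a & mask_b
    disagreements = (code_a ^ code_b) & valid
    return np.count_nonzero(disagreements) / np.count_nonzero(valid)

def degrees_of_freedom(p, sigma):
    """Daugman's estimate: impostor HDs with mean p and std sigma look
    binomial with N = p(1-p)/sigma^2 effective independent bits."""
    return p * (1.0 - p) / sigma**2

rng = np.random.default_rng(2)
a = rng.integers(0, 2, 2048, dtype=np.uint8)   # synthetic 2048-bit codes
b = rng.integers(0, 2, 2048, dtype=np.uint8)
m = np.ones(2048, dtype=np.uint8)              # no occlusion in this toy case
hd = fractional_hd(a, b, m, m)                 # ~0.5 for unrelated codes
n_dof = degrees_of_freedom(0.499, 0.0317)      # Daugman (2003) reports ~249
```

The thesis's bounds then treat these N effective bits as the codeword length available for sphere-packing-style arguments about the maximum enrollable population.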

    Smart environment monitoring through micro unmanned aerial vehicles

    In recent years, the improvements of small-scale Unmanned Aerial Vehicles (UAVs) in terms of flight time, automatic control, and remote transmission are promoting the development of a wide range of practical applications. In aerial video surveillance, the monitoring of broad areas still has many challenges due to the achievement of different tasks in real-time, including mosaicking, change detection, and object detection. In this thesis work, a small-scale UAV-based vision system to maintain regular surveillance over target areas is proposed. The system works in two modes. The first mode monitors an area of interest over several flights. During the first flight, it creates an incremental geo-referenced mosaic of the area of interest and classifies all the known elements (e.g., persons) found on the ground using a previously trained, improved Faster R-CNN architecture. In subsequent reconnaissance flights, the system searches for any changes (e.g., disappearance of persons) that may occur in the mosaic using a histogram equalization and RGB-Local Binary Pattern (RGB-LBP) based algorithm. If changes are present, the mosaic is updated. The second mode performs real-time classification using, again, our improved Faster R-CNN model, which is useful for time-critical operations. Thanks to different design features, the system works in real-time and performs mosaicking and change detection tasks at low altitude, thus allowing the classification even of small objects. The proposed system was tested using the whole set of challenging video sequences contained in the UAV Mosaicking and Change Detection (UMCD) dataset and other public datasets. The evaluation of the system by well-known performance metrics has shown remarkable results in terms of mosaic creation and updating, as well as in terms of change detection and object detection.
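    The RGB-LBP descriptor used for change detection can be sketched as follows: compute the classic 3x3 local binary pattern on each color channel and concatenate the per-channel histograms. This is a generic LBP sketch under assumed parameters (8 neighbors, 256-bin histograms), not the thesis's exact algorithm:

```python
import numpy as np

def lbp(channel):
    """Basic 3x3 local binary pattern for one image channel: compare the 8
    neighbours of each interior pixel to the centre and pack into a byte."""
    c = channel[1:-1, 1:-1]
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]
    out = np.zeros_like(c, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(shifts):
        neigh = channel[1 + dy: channel.shape[0] - 1 + dy,
                        1 + dx: channel.shape[1] - 1 + dx]
        out |= (neigh >= c).astype(np.uint8) << bit
    return out

def rgb_lbp_hist(image):
    """Concatenate the 256-bin LBP histograms of the R, G, and B channels."""
    return np.concatenate([np.bincount(lbp(image[..., ch]).ravel(),
                                       minlength=256) for ch in range(3)])

rng = np.random.default_rng(3)
img = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
hist = rgb_lbp_hist(img)   # 768-dimensional texture descriptor
```

Comparing such histograms between a stored mosaic tile and the corresponding tile from a new flight (after histogram equalization) flags regions whose texture has changed.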

    Siamese Instance Search for Tracking

    In this paper we present a tracker, which is radically different from state-of-the-art trackers: we apply no model updating, no occlusion detection, no combination of trackers, no geometric matching, and still deliver state-of-the-art tracking performance, as demonstrated on the popular online tracking benchmark (OTB) and six very challenging YouTube videos. The presented tracker simply matches the initial patch of the target in the first frame with candidates in a new frame and returns the most similar patch by a learned matching function. The strength of the matching function comes from being extensively trained generically, i.e., without any data of the target, using a Siamese deep neural network, which we design for tracking. Once learned, the matching function is used as is, without any adapting, to track previously unseen targets. It turns out that the learned matching function is so powerful that a simple tracker built upon it, coined Siamese INstance search Tracker, SINT, which only uses the original observation of the target from the first frame, suffices to reach state-of-the-art performance. Further, we show the proposed tracker even allows for target re-identification after the target was absent for a complete video shot. Comment: This paper is accepted to the IEEE Conference on Computer Vision and Pattern Recognition, 201
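    The matching idea is simple: embed the first-frame target patch and every candidate patch through the *same* learned network, then return the candidate with the highest similarity. The one-layer `embed` stand-in below replaces SINT's actual deep network purely for illustration; the weights and patches are synthetic:

```python
import numpy as np

rng = np.random.default_rng(4)

def embed(patch, W):
    """Stand-in for the learned Siamese branch: target and candidates all
    pass through the same shared weights W (one ReLU layer as a sketch),
    followed by L2 normalization so a dot product is cosine similarity."""
    h = np.maximum(W @ patch.ravel(), 0.0)
    return h / (np.linalg.norm(h) + 1e-12)

W = rng.normal(size=(64, 16 * 16))          # shared weights, both branches
target = rng.normal(size=(16, 16))          # initial patch from frame 1

# Candidates in a new frame: noisy copies of the target plus distractors.
candidates = [target + 0.05 * rng.normal(size=(16, 16)) for _ in range(3)]
candidates += [rng.normal(size=(16, 16)) for _ in range(7)]

t = embed(target, W)
scores = [float(embed(c, W) @ t) for c in candidates]  # cosine similarities
best = int(np.argmax(scores))               # index of the selected patch
```

Because the matching function is fixed after training, tracking reduces to this argmax over candidate patches in every frame, with no model updating.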

    Perception of tactile vibrations and a putative neuronal code

    We devised a delayed comparison task, appropriate for humans and rats, in which subjects discriminate between pairs of vibrations delivered either to their whiskers, in rats, or fingertips, in humans, with a delay inserted between the two stimuli. Stimuli were composed of a random time series of velocity values (“noise”) taken from a Gaussian distribution with 0 mean and a standard deviation referred to as σ1 for the first stimulus and σ2 for the second stimulus. The subject must select a response depending on the two vibrations’ relative standard deviations, σ1 > σ2 or σ1 < σ2. In the standard condition, the base and comparison stimuli both had a duration of 400 ms and were separated by an 800 ms pause. In this condition, humans had better performance than rats on average, yet the best rats were better than the worst humans. To learn how signals are integrated over time, we varied the duration of the second stimulus. In rats, performance progressively improved when the comparison stimulus duration increased from 200 to 400 and then to 600 ms. In humans, the effect of comparison stimulus duration was different: an increase in duration did not improve their performance but biased their choice. Stimuli of longer duration were perceived as having a larger value of σ. We employed a novel psychophysical reverse correlation method to find out which kinematic features of the stochastic stimulus influenced the choices of the subjects. This analysis revealed that rats rely principally on features related to velocity and speed values normalized by stimulus duration – that is, the rate of velocity and speed features per unit time. In contrast, while human subjects used velocity- and speed-related features, they tended to be influenced by the summated values of those features over time.
    The summation strategy in humans versus the rate strategy in rats accounts for both (i) the lack of improvement in humans for greater stimulus durations and (ii) the bias by which they judged longer stimuli as having a greater value of σ. Next, we focused on the capacity of rats to accomplish a task of parametric working memory, a capacity until now not found in rodents. For delays between the base and comparison stimuli of up to 6-10 seconds, humans and rats showed similar performance. However, when the difference in σ was small, the rats’ performance began to decay over long inter-stimulus delays more markedly than did the humans’ performance. The next chapter reports the analyses of the activity of barrel cortex neurons during the vibration comparison task. 35% of sampled neuron clusters showed a significant change in firing rate as σ varied, and the change was positive in every case – the slope of firing rate versus σ was positive. We used methods related to signal detection theory to estimate the behavioral performance that could be supported by single neuron clusters and found that the resulting “neurometric” curve was much shallower than the psychometric curve (the performance of the whole rat). This led to the notion that stimuli are encoded by larger populations. A general linear model (GLM) that combined multiple simultaneously recorded clusters performed much better than single clusters and began to approach animal performance. We conclude that a potential code for the stimulus is the variation in firing rate according to σ, distributed across large populations. In conclusion, this thesis characterizes the perceptual capacities of humans and rats in a novel working memory task. Both humans and rats can extract the statistical structure of a “noisy” tactile vibration, but seem to integrate signals by different operations.
    A major finding is that rats are endowed with a capacity to hold stimulus parameters in working memory with a proficiency that, until now, could be ascribed only to primates. The statistical properties of the stimulus appear to be encoded by a distributed population.
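    The signal detection logic behind the neurometric comparison can be sketched with synthetic spike counts: an ideal observer's percent correct for discriminating σ1 from σ2 trials is the ROC area between the two firing-rate distributions, and pooling many weakly tuned clusters sharpens it. All rates and counts below are illustrative assumptions, not the thesis's measurements:

```python
import numpy as np

rng = np.random.default_rng(5)
n_trials = 2000

def auc(low, high):
    """Area under the ROC curve: the probability that a high-sigma trial
    yields a larger spike count than a low-sigma trial (ties count half),
    i.e. an ideal observer's two-alternative percent correct."""
    return (low[:, None] < high[None, :]).mean() + \
           0.5 * (low[:, None] == high[None, :]).mean()

def spike_counts(rate, n):
    """Poisson spike counts at a given mean rate (firing grows with sigma,
    matching the positive slopes reported above)."""
    return rng.poisson(rate, size=n)

single_low  = spike_counts(10.0, n_trials)    # sigma1 trials, one cluster
single_high = spike_counts(11.0, n_trials)    # sigma2 trials, one cluster

# Summing counts across many similarly tuned clusters separates the two
# distributions far better than any single cluster can.
pool_low  = sum(spike_counts(10.0, n_trials) for _ in range(20))
pool_high = sum(spike_counts(11.0, n_trials) for _ in range(20))

p_single = auc(single_low, single_high)
p_pool   = auc(pool_low, pool_high)
```

The gap between `p_single` and `p_pool` mirrors the gap between the shallow single-cluster neurometric curve and the steeper psychometric curve, motivating the distributed-population code.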