666 research outputs found
Active Object Localization in Visual Situations
We describe a method for performing active localization of objects in
instances of visual situations. A visual situation is an abstract
concept---e.g., "a boxing match", "a birthday party", "walking the dog",
"waiting for a bus"---whose image instantiations are linked more by their
common spatial and semantic structure than by low-level visual similarity. Our
system combines given and learned knowledge of the structure of a particular
situation, and adapts that knowledge to a new situation instance as it actively
searches for objects. More specifically, the system learns a set of probability
distributions describing spatial and other relationships among relevant
objects. The system uses those distributions to iteratively sample object
proposals on a test image, but also continually uses information from those
object proposals to adaptively modify the distributions based on what the
system has detected. We test our approach's ability to efficiently localize
objects, using a situation-specific image dataset created by our group. We
compare the results with several baselines and variations on our method, and
demonstrate the strong benefit of using situation knowledge and active
context-driven localization. Finally, we contrast our method with several other
approaches that use context as well as active search for object localization in
images. Comment: 14 pages
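The sample-then-adapt loop described in this abstract can be illustrated with a minimal sketch. It is not the authors' system: the 2-D Gaussian proposal distribution, the shrink factor, and the `score` callback (standing in for an object detector) are all assumptions made for illustration.

```python
import random

def active_localize(score, mean, std, rounds=20, samples=30, seed=0):
    """Iteratively sample location proposals and adapt the sampling distribution.

    score: hypothetical detector confidence for an (x, y) proposal.
    mean, std: assumed Gaussian prior over object location.
    """
    rng = random.Random(seed)
    best, best_score = mean, score(mean)
    for _ in range(rounds):
        # Sample proposals from the current location distribution.
        proposals = [(rng.gauss(mean[0], std), rng.gauss(mean[1], std))
                     for _ in range(samples)]
        for proposal in proposals:
            s = score(proposal)
            if s > best_score:
                best, best_score = proposal, s
        # Adapt the distribution to what has been detected so far:
        # recentre on the best proposal and narrow the search (assumed rule).
        mean = best
        std = max(std * 0.8, 1.0)
    return best, best_score
```

A usage example would pass a real detector's confidence as `score`; here any function that peaks at the object's location is enough to see the search narrow over the rounds.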
Shape Representation in Primate Visual Area 4 and Inferotemporal Cortex
The representation of contour shape is an essential component of object recognition, but the cortical mechanisms underlying it are incompletely understood, leaving it a fundamental open question in neuroscience. Such an understanding would be useful theoretically as well as in developing computer vision and Brain-Computer Interface applications. We ask two fundamental questions: “How is contour shape represented in cortex and how can neural models and computer vision algorithms more closely approximate this?” We begin by analyzing the statistics of contour curvature variation and develop a measure of salience based upon the arc length over which it remains within a constrained range. We create a population of V4-like cells – responsive to a particular local contour conformation located at a specific position on an object’s boundary – and demonstrate high recognition accuracies classifying handwritten digits in the MNIST database and objects in the MPEG-7 Shape Silhouette database. We compare the performance of the cells to the “shape-context” representation (Belongie et al., 2002) and achieve roughly comparable recognition accuracies using a small test set. We analyze the relative contributions of various feature sensitivities to recognition accuracy and robustness to noise. Local curvature appears to be the most informative for shape recognition. We create a population of IT-like cells, which integrate specific information about the 2-D boundary shapes of multiple contour fragments, and evaluate its performance on a set of real images as a function of the V4 cell inputs. We determine the sub-population of cells that are most effective at identifying a particular category. We classify based upon cell population response and obtain very good results. We use the Morris-Lecar neuronal model to more realistically illustrate the previously explored shape representation pathway in V4 – IT. 
We demonstrate recognition using spatiotemporal patterns within a winnerless competition network with FitzHugh-Nagumo model neurons. Finally, we use the Izhikevich neuronal model to produce an enhanced response in IT, correlated with recognition, via gamma synchronization in V4. Our results support the hypothesis that the response properties of V4 and IT cells, as well as our computer models of them, function as robust shape descriptors in the object recognition process
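For readers unfamiliar with the neuron models named above, the following is a minimal sketch of a single FitzHugh-Nagumo neuron integrated with forward Euler. It illustrates only the model equations, not the winnerless competition network from the thesis; the parameter values `a`, `b`, `tau`, the step size, and the initial state are commonly used textbook choices and are assumptions here.

```python
def simulate_fitzhugh_nagumo(current, steps=20000, dt=0.01,
                             a=0.7, b=0.8, tau=12.5):
    """Return the membrane-variable trace of one FitzHugh-Nagumo neuron.

    dv/dt = v - v^3/3 - w + current
    dw/dt = (v + a - b*w) / tau
    """
    v, w = -1.0, 1.0  # assumed initial state
    trace = []
    for _ in range(steps):
        dv = v - v ** 3 / 3.0 - w + current
        dw = (v + a - b * w) / tau
        v += dt * dv  # forward Euler step
        w += dt * dw
        trace.append(v)
    return trace
```

With these assumed parameters, a sufficiently strong constant input current drives sustained oscillations, while zero input lets the neuron settle to rest.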
Robust fuzzy clustering for multiple instance regression.
Multiple instance regression (MIR) operates on a collection of bags, where each bag contains multiple instances sharing an identical real-valued label. Only a few instances, called primary instances, contribute to the bag labels. The remaining instances are noise and outlier observations. The goal in MIR is to identify the primary instances within each bag and learn a regression model that can predict the label of a previously unseen bag. In this thesis, we introduce an algorithm that uses robust fuzzy clustering with an appropriate distance to learn multiple linear models from a noisy feature space simultaneously. We show that fuzzy memberships are useful in allowing instances to belong to multiple models, while possibilistic memberships allow identification of the primary instances of each bag with respect to each model. We also use possibilistic memberships to identify and ignore noisy instances and determine the optimal number of regression models. We evaluate our approach on a series of synthetic data sets, on remote sensing data used to predict the yearly average yield of a crop, and on drug activity prediction. We show that our approach achieves higher accuracy than existing methods.
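As a rough illustration of how possibilistic memberships can single out a primary instance, the sketch below scores each instance in a bag against one linear model. The membership formula is the standard possibilistic c-means form; using the squared regression residual as the distance, the scale `eta`, and the helper names are assumptions, since the abstract does not specify the thesis's distance.

```python
def possibilistic_membership(sq_dist, eta, m=2.0):
    """Standard possibilistic typicality: 1 / (1 + (d^2 / eta)^(1 / (m - 1)))."""
    return 1.0 / (1.0 + (sq_dist / eta) ** (1.0 / (m - 1.0)))

def primary_instance(bag, label, weights, bias, eta=1.0):
    """Return (membership, instance) for the instance that best fits the model.

    bag: list of feature tuples sharing one real-valued label.
    weights, bias: a hypothetical, already-fitted linear regression model.
    """
    def residual_sq(x):
        prediction = sum(w * v for w, v in zip(weights, x)) + bias
        return (prediction - label) ** 2  # assumed distance: squared residual

    scored = [(possibilistic_membership(residual_sq(x), eta), x) for x in bag]
    return max(scored, key=lambda pair: pair[0])
```

Instances with memberships near zero for every model would be the candidates to treat as noise under this scheme.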
Second CLIPS Conference Proceedings, volume 2
Papers presented at the 2nd C Language Integrated Production System (CLIPS) Conference held at the Lyndon B. Johnson Space Center (JSC) on 23-25 September 1991 are documented in these proceedings. CLIPS is an expert system tool developed by the Software Technology Branch at NASA JSC and is used at over 4000 sites by government, industry, and business. During the three days of the conference, over 40 papers were presented by experts from NASA, Department of Defense, other government agencies, universities, and industry
Attention Mechanism for Recognition in Computer Vision
It has been proven that humans do not focus their attention on an entire scene at once when they perform a recognition task. Instead, they pay attention to the most important parts of the scene to extract the most discriminative information. Inspired by this observation, in this dissertation, the importance of the attention mechanism in recognition tasks in computer vision is studied by designing novel attention-based models. Specifically, four scenarios are investigated that represent the most important aspects of the attention mechanism. First, an attention-based model is designed to reduce the dimensionality of the visual features by selectively processing only a small subset of the data. We study this aspect of the attention mechanism in a framework based on object recognition in distributed camera networks. Second, an attention-based image retrieval system (i.e., person re-identification) is proposed which learns to focus on the most discriminative regions of the person's image and process those regions with higher computation power using a deep convolutional neural network. Furthermore, we show how visualizing the attention maps can make deep neural networks more interpretable. In other words, by visualizing the attention maps we can observe the regions of the input image on which the neural network relies in order to make a decision. Third, a model for estimating the importance of the objects in a scene based on a given task is proposed. More specifically, the proposed model estimates the importance of the road users that a driver (or an autonomous vehicle) should pay attention to in a driving scenario in order to have safe navigation. In this scenario, the attention estimation is the final output of the model.
Fourth, an attention-based module and a new loss function in a meta-learning based few-shot learning system are proposed in order to incorporate the context of the task into the feature representations of the samples and increase the few-shot recognition accuracy. In this dissertation, we showed that attention can be multifaceted and studied the attention mechanism from the perspectives of feature selection, reducing the computational cost, interpretable deep learning models, task-driven importance estimation, and context incorporation. Through the study of four scenarios, we further advanced the field of where "attention is all you need".
A study of instance-based algorithms for supervised learning tasks : mathematical, empirical, and psychological evaluations
This dissertation introduces a framework for specifying instance-based algorithms that can solve supervised learning tasks. These algorithms input a sequence of instances and yield a partial concept description, which is represented by a set of stored instances and associated information. This description can be used to predict values for subsequently presented instances. The thesis of this framework is that extensional concept descriptions and lazy generalization strategies can support efficient supervised learning behavior. The instance-based learning framework consists of three components. The pre-processor component transforms an instance into a more palatable form for the performance component, which computes the instance's similarity with a set of stored instances and yields a prediction for its target value(s). Therefore, the similarity and prediction functions impose generalizations on the stored instances to inductively derive predictions. The learning component assesses the accuracy of these prediction(s) and updates partial concept descriptions to improve their predictive accuracy. This framework is evaluated in four ways. First, its generality is evaluated by mathematically determining the classes of symbolic concepts and numeric functions that can be closely approximated by IB_1, a simple algorithm specified by this framework. Second, this framework is empirically evaluated for its ability to specify algorithms that improve IB_1's learning efficiency. Significant efficiency improvements are obtained by instance-based algorithms that reduce storage requirements, tolerate noisy data, and learn domain-specific similarity functions respectively. Alternative component definitions for these algorithms are empirically analyzed in a set of five high-level parameter studies. Third, this framework is evaluated for its ability to specify psychologically plausible process models for categorization tasks.
Results from subject experiments indicate a positive correlation between a model's ability to utilize attribute correlation information and its ability to explain psychological phenomena. Finally, this framework is evaluated for its ability to explain and relate a dozen prominent instance-based learning systems. The survey shows that this framework requires only slight modifications to fit these highly diverse systems. Relationships with edited nearest neighbor algorithms, case-based reasoners, and artificial neural networks are also described.
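The performance and learning components described above can be sketched as a store-everything nearest-neighbour learner, in the spirit of IB_1. This is a simplified illustration, not the dissertation's specification: the class name, the Euclidean similarity, and the omission of a pre-processor (for example, attribute normalization) are assumptions.

```python
import math

class NearestNeighbourLearner:
    """Stores every instance and predicts from the most similar stored one."""

    def __init__(self):
        self.stored = []  # partial concept description: (features, label) pairs

    def predict(self, features):
        """Performance component: label of the closest stored instance."""
        if not self.stored:
            return None
        _, label = min(self.stored,
                       key=lambda inst: math.dist(inst[0], features))
        return label

    def learn(self, features, label):
        """Learning component: check the prediction, then store the instance.

        Returns True when the prediction made before storing was correct.
        """
        was_correct = self.predict(features) == label
        self.stored.append((tuple(features), label))
        return was_correct
```

Generalization is lazy here: nothing is abstracted at training time, and all work happens when `predict` compares a new instance with the stored ones.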
An evaluation of learning analytics to identify exploratory dialogue in online discussions
Social learning analytics are concerned with the process of knowledge construction as learners build knowledge together in their social and cultural environments. One of the most important tools employed during this process is language. In this paper we take exploratory dialogue, a joint form of co-reasoning, to be an external indicator that learning is taking place. Using techniques developed within the field of computational linguistics, we build on previous work using cue phrases to identify exploratory dialogue within online discussion. Automatic detection of this type of dialogue is framed as a binary classification task that labels each contribution to an online discussion as exploratory or non-exploratory. We describe the development of a self-training framework that employs discourse features and topical features for classification by integrating both cue-phrase matching and k-nearest neighbour classification. Experiments with a corpus constructed from the archive of a two-day online conference show that our proposed framework outperforms other approaches. A classifier developed using the self-training framework is able to make useful distinctions between the learning dialogue taking place at different times within an online conference as well as between the contributions of individual participants
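The combination of cue-phrase matching and k-nearest neighbour classification can be sketched as follows. This shows only a single labelling step, not the paper's self-training framework; the cue phrases, the Jaccard word-overlap similarity, and the function names are hypothetical choices made for illustration.

```python
# Hypothetical cue phrases; the paper's actual list is not given in the abstract.
CUE_PHRASES = ("what if", "i think", "because", "do you agree")

def tokens(text):
    """Lower-cased set of whitespace-separated words."""
    return set(text.lower().split())

def jaccard(a, b):
    """Word-overlap similarity between two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def classify(text, labelled, k=3):
    """Label a contribution as exploratory (True) or non-exploratory (False).

    labelled: list of (text, is_exploratory) examples.
    A cue-phrase match decides immediately; otherwise the k most similar
    labelled examples vote.
    """
    if any(cue in text.lower() for cue in CUE_PHRASES):
        return True
    words = tokens(text)
    neighbours = sorted(labelled,
                        key=lambda ex: jaccard(words, tokens(ex[0])),
                        reverse=True)[:k]
    votes = sum(1 for _, is_exploratory in neighbours if is_exploratory)
    return votes * 2 > len(neighbours)
```

In a self-training setting, confidently classified contributions would be added back to `labelled` and the process repeated; that loop is left out here.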
Second CLIPS Conference Proceedings, volume 1
Topics covered at the 2nd CLIPS Conference held at the Johnson Space Center, September 23-25, 1991 are given. Topics include rule groupings, fault detection using expert systems, decision making using expert systems, knowledge representation, computer aided design and debugging expert systems