    Distributional Feature Mapping in Data Classification

    The performance of a machine learning algorithm depends on the representation of the input data. In computer vision problems, histogram-based feature representations have significantly improved classification. L1-normalized histograms can be modelled by Dirichlet and related distributions to transform the input space into a feature space. We propose a mapping technique that encodes prior knowledge about the distribution of the data and increases the discriminative power of supervised classifiers such as the Support Vector Machine (SVM). The mapping technique for proportional data, which is based on the Dirichlet, generalized Dirichlet, Beta Liouville, scaled Dirichlet and shifted scaled Dirichlet distributions, can be combined with traditional kernels to improve the accuracy of the base kernels. Experimental results show that the proposed technique for proportional data increases accuracy for machine vision tasks such as natural scene recognition, satellite image classification, gender classification, facial expression recognition and human action recognition in videos. In addition, in object tracking, learning parametric features of the target object using Dirichlet and related distributions may help to capture representations invariant to noise. This further motivated our study of such distributions in object tracking. We propose a framework for feature representation on the probability simplex for proportional data that uses the histogram representation of the target object in the initial frame. A set of parameter vectors determines the appearance features of the target object in the subsequent frames. Motivated by the success of distribution-based feature mapping for proportional data, we extend this technique to semi-bounded data using the inverted Dirichlet, generalized inverted Dirichlet and inverted Beta Liouville distributions. A similar approach is taken for count data, where the Dirichlet multinomial and generalized Dirichlet multinomial distributions are used to map density features with input features.
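
To make the ingredients of this abstract concrete, here is a minimal sketch of L1-normalizing a histogram and scoring it under a Dirichlet density. It is not the mapping proposed in the thesis, which the abstract does not spell out. The function names, the smoothing constant `eps` and the parameter vector `alpha` are assumptions chosen for illustration.

```python
import math


def l1_normalize(hist, eps=1e-9):
    """Scale a non-negative histogram so its bins sum to 1 (a point on the simplex)."""
    smoothed = [h + eps for h in hist]  # eps keeps empty bins away from log(0)
    total = sum(smoothed)
    return [h / total for h in smoothed]


def dirichlet_log_pdf(x, alpha):
    """Log-density of a Dirichlet(alpha) distribution at a simplex point x."""
    log_norm = math.lgamma(sum(alpha)) - sum(math.lgamma(a) for a in alpha)
    return log_norm + sum((a - 1.0) * math.log(xi) for a, xi in zip(alpha, x))


hist = [12, 30, 5, 3]          # raw bin counts (made-up example data)
alpha = [2.0, 5.0, 1.5, 1.0]   # hypothetical parameters, not fitted to anything
x = l1_normalize(hist)
score = dirichlet_log_pdf(x, alpha)  # one possible distribution-aware feature
```

In a real pipeline `alpha` would be estimated from training histograms, and the resulting scores or parameters would be fed to a kernel classifier; both steps are left out here.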

    Image similarity in medical images

    Recent experiments have indicated a strong influence of the substrate grain orientation on self-ordering in anodic porous alumina. Anodic porous alumina with straight pore channels grown in a stable, self-ordered manner is formed on (001) oriented Al grains, while a disordered porous pattern with tilted pore channels growing in an unstable manner is formed on (101) oriented Al grains. In this work, numerical simulation of the pore growth process is carried out to understand this phenomenon. The rate-determining step of the oxide growth is assumed to be the Cabrera-Mott barrier at the oxide/electrolyte (o/e) interface, while the substrate is assumed to determine the ratio β between the ionization and oxidation reactions at the metal/oxide (m/o) interface. By numerically solving for the electric field inside a growing porous alumina during anodization, the migration rates of the ions and hence the evolution of the o/e and m/o interfaces are computed. The simulated results show that pore growth is more stable when β is higher. A higher β corresponds to more Al being ionized and migrating away from the m/o interface rather than being oxidized, and hence a higher retained O:Al ratio in the oxide. The experimentally measured oxygen content in the self-ordered porous alumina on (001) Al is indeed found to be about 3% higher than that in the disordered alumina on (101) Al, in agreement with the theoretical prediction. The results therefore suggest that ionization on the (001) Al substrate is relatively easier than on (101) Al, and that this leads to the more stable growth of the pore channels on (001) Al.

    Information-Theoretic Active Learning for Content-Based Image Retrieval

    We propose Information-Theoretic Active Learning (ITAL), a novel batch-mode active learning method for binary classification, and apply it to acquiring meaningful user feedback in the context of content-based image retrieval. Instead of combining different heuristics such as uncertainty, diversity, or density, our method is based on maximizing the mutual information between the predicted relevance of the images and the expected user feedback regarding the selected batch. We propose suitable approximations to this computationally demanding problem and also integrate an explicit model of user behavior that accounts for possible incorrect labels and unnameable instances. Furthermore, our approach takes into account not only the structure of the data but also the expected model output change caused by the user feedback. In contrast to other methods, ITAL turns out to be highly flexible and provides state-of-the-art performance across various datasets, such as MIRFLICKR and ImageNet. Comment: GCPR 2018 paper (14 pages text + 2 pages references + 6 pages appendix)
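
The mutual-information criterion can be illustrated for the simplest case of one image with a binary relevance variable and feedback that is flipped with a fixed probability. This is a simplified sketch, not ITAL itself: it scores images independently instead of jointly over a batch, and it ignores unnameable instances. The names `feedback_information`, `pick_queries` and `flip_prob` are assumptions introduced for this example.

```python
import math


def binary_entropy(p):
    """Entropy in bits of a Bernoulli(p) variable."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def feedback_information(p_relevant, flip_prob):
    """I(R; F) = H(F) - H(F | R) for relevance R and noisy user feedback F."""
    p_feedback_pos = p_relevant * (1 - flip_prob) + (1 - p_relevant) * flip_prob
    return binary_entropy(p_feedback_pos) - binary_entropy(flip_prob)


def pick_queries(p_relevant_by_id, flip_prob, k):
    """Return the k image ids whose feedback is individually most informative."""
    ranked = sorted(
        p_relevant_by_id,
        key=lambda i: feedback_information(p_relevant_by_id[i], flip_prob),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical predicted relevance probabilities for three images.
batch = pick_queries({"a": 0.95, "b": 0.5, "c": 0.7}, flip_prob=0.1, k=2)
```

Under this noise model the score peaks for images whose predicted relevance is near 0.5 and shrinks toward zero as the user becomes less reliable, which is why an explicit user model changes which images are worth asking about.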

    Intrinsic dimensionality in vision: Nonlinear filter design and applications

    Biological vision and computer vision can no longer be treated independently. The digital revolution and the emergence of increasingly sophisticated technical applications have brought the two communities into a symbiosis. Competitive technical devices that challenge human performance rely increasingly on algorithms motivated by the human visual system. On the other hand, computational methods can be used to gain a richer understanding of neural behavior, e.g. the behavior of populations of multiple processing units. The relations between computational approaches and biological findings range from low-level vision to the cortical areas responsible for higher cognitive abilities. In early stages of the visual cortex, cells have been recorded whose behavior can no longer be explained by the standard approach of orientation- and frequency-selective linear filters. These cells did not respond to straight lines or simple gratings, but they fired whenever a more complicated stimulus, such as a corner or an end-stopped line, was presented within the receptive field. Using the concept of intrinsic dimensionality, these cells can be classified as intrinsic-two-dimensional systems. The intrinsic dimensionality is the number of degrees of freedom in the domain that are required to completely determine a signal. A constant image has dimension zero, straight lines and trigonometric functions in one direction have dimension one, and the remaining signals, which require the full number of degrees of freedom, have dimension two. In these terms, the reported cells respond to two-dimensional signals only. Motivated by the classical approach, which can be realized by orientation- and frequency-selective Gabor filter functions, a generalized Gabor framework is developed in the context of second-order Volterra systems. The generalized Gabor approach is then used to design intrinsic-two-dimensional systems that have the same selectivity properties as the reported cells in early visual cortex. Numerical cognition is commonly assumed to be a higher cognitive ability of humans. Estimating the number of things in the environment requires a high degree of abstraction. Several studies have shown that humans and other species have access to this abstract information, but it is still unclear how this information can be extracted by neural hardware. To address this issue, one has to consider the immense invariance property of number: one can apply a large number of operations to objects without changing their number. In this work, the problem is considered from a topological perspective. Well-known relations between differential geometry and topology are used to develop a computational model. Surprisingly, the resulting operators, which provide the features integrated in the system, are intrinsic-two-dimensional operators. This model is used to conduct standard number estimation experiments, and the results are compared to reported human behavior. The last topic of this work is active object recognition. The ability to move the information-gathering device, as humans can move their eyes, provides the opportunity to choose the next action. Studies of human saccade behavior suggest that this is not done in a random manner. In order to decrease the time an active object recognition system needs to reach a certain level of performance, several action selection strategies are investigated. The strategies considered within this work are based on information-theoretic and probabilistic concepts. These strategies are finally compared to a strategy based on an intrinsic-two-dimensional operator. All three topics are investigated with respect to their relation to the concept of intrinsic dimensionality from a mathematical point of view.
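
The three classes of intrinsic dimensionality described in this abstract (constant, one-directional, fully two-dimensional) can be illustrated with a structure-tensor sketch. This is a standard substitute technique, not the generalized Gabor or Volterra design developed in the thesis. The function name `intrinsic_dimension` and the threshold `tol` are assumptions for this example.

```python
import math


def intrinsic_dimension(patch, tol=1e-6):
    """Count the non-negligible eigenvalues of the patch's summed structure tensor.

    Returns 0 for a constant patch, 1 for a patch varying in one direction only,
    and 2 for patches such as corners or line ends.
    """
    rows, cols = len(patch), len(patch[0])
    jxx = jxy = jyy = 0.0
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = (patch[r][c + 1] - patch[r][c - 1]) / 2.0  # central difference in x
            gy = (patch[r + 1][c] - patch[r - 1][c]) / 2.0  # central difference in y
            jxx += gx * gx
            jxy += gx * gy
            jyy += gy * gy
    # Closed-form eigenvalues of the symmetric 2x2 matrix [[jxx, jxy], [jxy, jyy]].
    half_trace = (jxx + jyy) / 2.0
    root = math.sqrt(((jxx - jyy) / 2.0) ** 2 + jxy ** 2)
    return sum(1 for lam in (half_trace + root, half_trace - root) if lam > tol)


constant = [[1.0] * 5 for _ in range(5)]
ramp = [[float(c) for c in range(5)] for _ in range(5)]
corner = [[1.0 if r >= 2 and c >= 2 else 0.0 for c in range(5)] for r in range(5)]
dims = [intrinsic_dimension(p) for p in (constant, ramp, corner)]
```

The corner patch is the kind of stimulus the abstract associates with intrinsic-two-dimensional cells: its gradients point in more than one direction, so both eigenvalues are clearly non-zero.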

    Novelty detection for semantic place categorization

    Integrated master's thesis. Informatics and Computing Engineering. Faculdade de Engenharia. Universidade do Porto. 201

    Visual Human Tracking and Group Activity Analysis: A Video Mining System for Retail Marketing

    Thesis (PhD) - Indiana University, Computer Sciences, 2007. In this thesis we present a system for automatic human tracking and activity recognition from video sequences. The problem of automated analysis of visual information in order to derive descriptors of high-level human activities has intrigued the computer vision community for decades and is considered to be largely unsolved. Part of this interest derives from the vast range of applications in which such a solution may be useful. We attempt to find efficient formulations of these tasks as applied to extracting customer behavior information in a retail marketing context. Based on these formulations, we present a system that visually tracks customers in a retail store and performs a number of activity analysis tasks based on the output from the tracker. In tracking, we introduce new techniques for pedestrian detection, initialization of the body model, and a formulation of temporal tracking as a global trans-dimensional optimization problem. Initial human detection is addressed by a novel method for head detection, which incorporates knowledge of the camera projection model. The initialization of the human body model is addressed by newly developed shape and appearance descriptors. Temporal tracking of customer trajectories is performed by a human body tracking system designed as a Bayesian jump-diffusion filter. This approach demonstrates the ability to overcome model dimensionality ambiguities as people leave and enter the scene. Following the tracking, we developed a two-stage group activity formulation based on ideas from swarming research. For modeling purposes, all moving actors in the scene are viewed as simplistic agents in the swarm. This allows us to effectively define a set of inter-agent interactions, which combine to derive a distance metric used in further swarm clustering. In the first stage, shoppers that belong to the same group are identified by deterministically clustering bodies to detect short-term events; in the second stage, events are post-processed to form clusters of group activities with fuzzy memberships. Quantitative analysis of the tracking subsystem shows an improvement over state-of-the-art methods when used under similar conditions. Finally, based on the output from the tracker, the activity recognition procedure achieves over 80% correct shopper group detection, as validated against human-generated ground truth.