465,910 research outputs found

    A study into annotation ranking metrics in geo-tagged image corpora

    Get PDF
    Community contributed datasets are becoming increasingly common in automated image annotation systems. One important issue with community image data is that there is no guarantee that the associated metadata is relevant. A method is required that can accurately rank the semantic relevance of community annotations. This should enable the extracting of relevant subsets from potentially noisy collections of these annotations. Having relevant, non heterogeneous tags assigned to images should improve community image retrieval systems, such as Flickr, which are based on text retrieval methods. In the literature, the current state of the art approach to ranking the semantic relevance of Flickr tags is based on the widely used tf-idf metric. In the case of datasets containing landmark images, however, this metric is inefficient due to the high frequency of common landmark tags within the data set and can be improved upon. In this paper, we present a landmark recognition framework, that provides end-to-end automated recognition and annotation. In our study into automated annotation, we evaluate 5 alternate approaches to tf-idf to rank tag relevance in community contributed landmark image corpora. We carry out a thorough evaluation of each of these ranking metrics and results of this evaluation demonstrate that four of these proposed techniques outperform the current commonly-used tf-idf approach for this task

    View based approach to forensic face recognition

    Get PDF
    Face recognition is a challenging problem for surveillance view images commonly encountered in a forensic face recognition case. One approach to deal with a non-frontal test image is to synthesize the corresponding frontal view image and compare it with frontal view reference images. However, it is often difficult to synthesize a good quality frontal view image from a surveillance video because the test image is usually of low quality. In this paper, we investigate if it is useful to instead transform the reference images so that it matches the pose, illumination and camera of the surveillance view test image. This approach, also called the view based approach, ensures that a face recognition system always gets to compare images having a similar, not necessarily the frontal, view. Our results with surveillance view images captured 6 months apart (taken from the MultiPIE data set) and using five different face recognition systems show that improved recognition performance under surveillance conditions can be attained by exactly matching the pose, illumination and camera between the test and reference images

    Weakly Supervised Localization using Deep Feature Maps

    Full text link
    Object localization is an important computer vision problem with a variety of applications. The lack of large scale object-level annotations and the relative abundance of image-level labels makes a compelling case for weak supervision in the object localization task. Deep Convolutional Neural Networks are a class of state-of-the-art methods for the related problem of object recognition. In this paper, we describe a novel object localization algorithm which uses classification networks trained on only image labels. This weakly supervised method leverages local spatial and semantic patterns captured in the convolutional layers of classification networks. We propose an efficient beam search based approach to detect and localize multiple objects in images. The proposed method significantly outperforms the state-of-the-art in standard object localization data-sets with a 8 point increase in mAP scores

    Integration of Quantitative and Qualitative Techniques for Deformable Model Fitting from Orthographic, Perspective, and Stereo Projections

    Get PDF
    In this paper, we synthesize a new approach to 3-D object shape recovery by integrating qualitative shape recovery techniques and quantitative physics based shape estimation techniques. Specifically, we first use qualitative shape recovery and recognition techniques to provide strong fitting constraints on physics-based deformable model recovery techniques. Secondly, we extend our previously developed technique of fitting deformable models to occluding image contours to the case of image data captured under general orthographic, perspective, and stereo projections

    Modeling of evolving textures using granulometries

    Get PDF
    This chapter describes a statistical approach to classification of dynamic texture images, called parallel evolution functions (PEFs). Traditional classification methods predict texture class membership using comparisons with a finite set of predefined texture classes and identify the closest class. However, where texture images arise from a dynamic texture evolving over time, estimation of a time state in a continuous evolutionary process is required instead. The PEF approach does this using regression modeling techniques to predict time state. It is a flexible approach which may be based on any suitable image features. Many textures are well suited to a morphological analysis and the PEF approach uses image texture features derived from a granulometric analysis of the image. The method is illustrated using both simulated images of Boolean processes and real images of corrosion. The PEF approach has particular advantages for training sets containing limited numbers of observations, which is the case in many real world industrial inspection scenarios and for which other methods can fail or perform badly. [41] G.W. Horgan, Mathematical morphology for analysing soil structure from images, European Journal of Soil Science, vol. 49, pp. 161ā€“173, 1998. [42] G.W. Horgan, C.A. Reid and C.A. Glasbey, Biological image processing and enhancement, Image Processing and Analysis, A Practical Approach, R. Baldock and J. Graham, eds., Oxford University Press, Oxford, UK, pp. 37ā€“67, 2000. [43] B.B. Hubbard, The World According to Wavelets: The Story of a Mathematical Technique in the Making, A.K. Peters Ltd., Wellesley, MA, 1995. [44] H. Iversen and T. Lonnestad. An evaluation of stochastic models for analysis and synthesis of gray-scale texture, Pattern Recognition Letters, vol. 15, pp. 575ā€“585, 1994. [45] A.K. Jain and F. Farrokhnia, Unsupervised texture segmentation using Gabor filters, Pattern Recognition, vol. 24(12), pp. 1167ā€“1186, 1991. [46] T. Jossang and F. Feder, The fractal characterization of rough surfaces, Physica Scripta, vol. T44, pp. 9ā€“14, 1992. [47] A.K. Katsaggelos and T. Chun-Jen, Iterative image restoration, Handbook of Image and Video Processing, A. Bovik, ed., Academic Press, London, pp. 208ā€“209, 2000. [48] M. KĀØoppen, C.H. Nowack and G. RĀØosel, Pareto-morphology for color image processing, Proceedings of SCIA99, 11th Scandinavian Conference on Image Analysis 1, Kangerlussuaq, Greenland, pp. 195ā€“202, 1999. [49] S. Krishnamachari and R. Chellappa, Multiresolution Gauss-Markov random field models for texture segmentation, IEEE Transactions on Image Processing, vol. 6(2), pp. 251ā€“267, 1997. [50] T. Kurita and N. Otsu, Texture classification by higher order local autocorrelation features, Proceedings of ACCV93, Asian Conference on Computer Vision, Osaka, pp. 175ā€“178, 1993. [51] S.T. Kyvelidis, L. Lykouropoulos and N. Kouloumbi, Digital system for detecting, classifying, and fast retrieving corrosion generated defects, Journal of Coatings Technology, vol. 73(915), pp. 67ā€“73, 2001. [52] Y. Liu, T. Zhao and J. Zhang, Learning multispectral texture features for cervical cancer detection, Proceedings of 2002 IEEE International Symposium on Biomedical Imaging: Macro to Nano, pp. 169ā€“172, 2002. [53] G. McGunnigle and M.J. Chantler, Modeling deposition of surface texture, Electronics Letters, vol. 37(12), pp. 749ā€“750, 2001. [54] J. McKenzie, S. Marshall, A.J. Gray and E.R. Dougherty, Morphological texture analysis using the texture evolution function, International Journal of Pattern Recognition and Artificial Intelligence, vol. 17(2), pp. 167ā€“185, 2003. [55] J. McKenzie, Classification of dynamically evolving textures using evolution functions, Ph.D. Thesis, University of Strathclyde, UK, 2004. [56] S.G. Mallat, Multiresolution approximations and wavelet orthonormal bases of L2(R), Transactions of the American Mathematical Society, vol. 315, pp. 69ā€“87, 1989. [57] S.G. Mallat, A theory for multiresolution signal decomposition: the wavelet representation, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 11, pp. 674ā€“693, 1989. [58] B.S. Manjunath and W.Y. Ma, Texture features for browsing and retrieval of image data, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 18, pp. 837ā€“842, 1996. [59] B.S. Manjunath, G.M. Haley and W.Y. Ma, Multiband techniques for texture classification and segmentation, Handbook of Image and Video Processing, A. Bovik, ed., Academic Press, London, pp. 367ā€“381, 2000. [60] G. Matheron, Random Sets and Integral Geometry, Wiley Series in Probability and Mathematical Statistics, John Wiley and Sons, New York, 1975

    Unsupervised learning of invariant representations

    Get PDF
    The present phase of Machine Learning is characterized by supervised learning algorithms relying on large sets of labeled examples (. n\u2192 1e). The next phase is likely to focus on algorithms capable of learning from very few labeled examples (. n\u21921), like humans seem able to do. We propose an approach to this problem and describe the underlying theory, based on the unsupervised, automatic learning of a "good" representation for supervised learning, characterized by small sample complexity. We consider the case of visual object recognition, though the theory also applies to other domains like speech. The starting point is the conjecture, proved in specific cases, that image representations which are invariant to translation, scaling and other transformations can considerably reduce the sample complexity of learning. We prove that an invariant and selective signature can be computed for each image or image patch: the invariance can be exact in the case of group transformations and approximate under non-group transformations. A module performing filtering and pooling, like the simple and complex cells described by Hubel and Wiesel, can compute such signature. The theory offers novel unsupervised learning algorithms for "deep" architectures for image and speech recognition. We conjecture that the main computational goal of the ventral stream of visual cortex is to provide a hierarchical representation of new objects/images which is invariant to transformations, stable, and selective for recognition-and show how this representation may be continuously learned in an unsupervised way during development and visual experienc
    • ā€¦
    corecore