1,994 research outputs found

    Detecting the presence of large buildings in natural images

    Get PDF
    This paper addresses the issue of classification of lowlevel features into high-level semantic concepts for the purpose of semantic annotation of consumer photographs. We adopt a multi-scale approach that relies on edge detection to extract an edge orientation-based feature description of the image, and apply an SVM learning technique to infer the presence of a dominant building object in a general purpose collection of digital photographs. The approach exploits prior knowledge on the image context through an assumption that all input images are �outdoor�, i.e. indoor/outdoor classification (the context determination stage) has been performed. The proposed approach is validated on a diverse dataset of 1720 images and its performance compared with that of the MPEG-7 edge histogram descriptor

    Recent Trends in Computational Intelligence

    Get PDF
    Traditional models struggle to cope with complexity, noise, and the existence of a changing environment, while Computational Intelligence (CI) offers solutions to complicated problems as well as reverse problems. The main feature of CI is adaptability, spanning the fields of machine learning and computational neuroscience. CI also comprises biologically-inspired technologies such as the intellect of swarm as part of evolutionary computation and encompassing wider areas such as image processing, data collection, and natural language processing. This book aims to discuss the usage of CI for optimal solving of various applications proving its wide reach and relevance. Bounding of optimization methods and data mining strategies make a strong and reliable prediction tool for handling real-life applications

    A New DCT-FFT Fusion Based Method for Caption and Scene Text Classification in Action Video Images

    Full text link
    Achieving better recognition rate for text in video action images is challenging due to multi-type texts with unpredictable backgrounds. We propose a new method for the classification of captions (which is edited text) and scene texts (which is part of an image in video images of Yoga, Concert, Teleshopping, Craft, and Recipe classes). The proposed method introduces a new fusion criterion-based on DCT and Fourier coefficients to extract features that represent good clarity and visibility of captions to separate them from scene texts. The variances for coefficients of corresponding pixels of DCT and Fourier images are computed to derive the respective weights. The weights and coefficients are further used to generate a fused image. Furthermore, the proposed method estimates sparsity in Canny edge image of each fused image to derive rules for classifying caption and scene texts. Lastly, the proposed method is evaluated on images of five above-mentioned action image classes to validate the derived rules. Comparative studies with the state-of-the-art methods on the standard databases show that the proposed method outperforms the existing methods in terms of classification. The recognition experiments before and after classification show that the recognition performance rate improves significantly after classification

    Motion Segmentation Aided Super Resolution Image Reconstruction

    Get PDF
    This dissertation addresses Super Resolution (SR) Image Reconstruction focusing on motion segmentation. The main thrust is Information Complexity guided Gaussian Mixture Models (GMMs) for Statistical Background Modeling. In the process of developing our framework we also focus on two other topics; motion trajectories estimation toward global and local scene change detections and image reconstruction to have high resolution (HR) representations of the moving regions. Such a framework is used for dynamic scene understanding and recognition of individuals and threats with the help of the image sequences recorded with either stationary or non-stationary camera systems. We introduce a new technique called Information Complexity guided Statistical Background Modeling. Thus, we successfully employ GMMs, which are optimal with respect to information complexity criteria. Moving objects are segmented out through background subtraction which utilizes the computed background model. This technique produces superior results to competing background modeling strategies. The state-of-the-art SR Image Reconstruction studies combine the information from a set of unremarkably different low resolution (LR) images of static scene to construct an HR representation. The crucial challenge not handled in these studies is accumulating the corresponding information from highly displaced moving objects. In this aspect, a framework of SR Image Reconstruction of the moving objects with such high level of displacements is developed. Our assumption is that LR images are different from each other due to local motion of the objects and the global motion of the scene imposed by non-stationary imaging system. Contrary to traditional SR approaches, we employed several steps. These steps are; the suppression of the global motion, motion segmentation accompanied by background subtraction to extract moving objects, suppression of the local motion of the segmented out regions, and super-resolving accumulated information coming from moving objects rather than the whole scene. This results in a reliable offline SR Image Reconstruction tool which handles several types of dynamic scene changes, compensates the impacts of camera systems, and provides data redundancy through removing the background. The framework proved to be superior to the state-of-the-art algorithms which put no significant effort toward dynamic scene representation of non-stationary camera systems

    Visual Information Retrieval in Endoscopic Video Archives

    Get PDF
    In endoscopic procedures, surgeons work with live video streams from the inside of their subjects. A main source for documentation of procedures are still frames from the video, identified and taken during the surgery. However, with growing demands and technical means, the streams are saved to storage servers and the surgeons need to retrieve parts of the videos on demand. In this submission we present a demo application allowing for video retrieval based on visual features and late fusion, which allows surgeons to re-find shots taken during the procedure.Comment: Paper accepted at the IEEE/ACM 13th International Workshop on Content-Based Multimedia Indexing (CBMI) in Prague (Czech Republic) between 10 and 12 June 201

    The rectification and recognition of document images with perspective and geometric distortions

    Get PDF
    Ph.DDOCTOR OF PHILOSOPH

    A Review on Advanced Decision Trees for Efficient & Effective k-NN Classification

    Get PDF
    K Nearest Neighbor (KNN) strategy is a notable classification strategy in data mining and estimations in light of its direct execution and colossal arrangement execution. In any case, it is outlandish for ordinary KNN strategies to select settled k esteem to all tests. Past courses of action assign different k esteems to different test tests by the cross endorsement strategy however are typically tedious. This work proposes new KNN strategies, first is a KTree strategy to learn unique k esteems for different test or new cases, by including a training arrange in the KNN classification. This work additionally proposes a change rendition of KTree technique called K*Tree to speed its test organize by putting additional data of the training tests in the leaf node of KTree, for example, the training tests situated in the leaf node, their KNNs, and the closest neighbor of these KNNs. K*Tree, which empowers to lead KNN arrangement utilizing a subset of the training tests in the leaf node instead of all training tests utilized in the recently KNN techniques. This really reduces the cost of test organize
    corecore