219 research outputs found

    Unsupervised Diverse Colorization via Generative Adversarial Networks

    Full text link
    Colorization of grayscale images has been a hot topic in computer vision. Previous research mainly focuses on producing a colored image that matches the original one. However, since many colors share the same gray value, an input grayscale image can be diversely colored while maintaining its realism. In this paper, we design a novel solution for unsupervised diverse colorization. Specifically, we leverage conditional generative adversarial networks to model the distribution of real-world item colors, in which we develop a fully convolutional generator with multi-layer noise to enhance diversity, multi-layer condition concatenation to maintain realism, and stride 1 to preserve spatial information. With this novel network architecture, the model yields highly competitive performance on the open LSUN bedroom dataset. A Turing test with 80 human judges further indicates that our generated color schemes are highly convincing.
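    To make the architecture concrete, here is a minimal PyTorch sketch of a stride-1, fully convolutional conditional generator that injects fresh noise and re-concatenates the grayscale condition at every layer, as the abstract describes. The class name, channel counts, depth, and the two-channel chrominance output are illustrative assumptions, not the paper's actual configuration.

```python
# Illustrative sketch only: layer depth, channel widths, and the ab-channel
# output are assumptions, not the configuration reported in the paper.
import torch
import torch.nn as nn

class DiverseColorizationGenerator(nn.Module):
    """Stride-1 fully convolutional generator with per-layer noise injection
    and per-layer concatenation of the grayscale condition."""

    def __init__(self, base_channels=64, n_layers=4):
        super().__init__()
        self.blocks = nn.ModuleList()
        in_ch = 2  # grayscale condition + one noise channel at the input
        for _ in range(n_layers):
            self.blocks.append(nn.Sequential(
                # stride 1 throughout, so spatial resolution is preserved
                nn.Conv2d(in_ch, base_channels, 3, stride=1, padding=1),
                nn.BatchNorm2d(base_channels),
                nn.ReLU(inplace=True),
            ))
            in_ch = base_channels + 2  # features + condition + fresh noise
        self.to_color = nn.Conv2d(in_ch, 2, 3, stride=1, padding=1)

    def forward(self, gray):
        b, _, h, w = gray.shape
        x = torch.cat([gray, torch.randn(b, 1, h, w, device=gray.device)], 1)
        for block in self.blocks:
            feats = block(x)
            noise = torch.randn(b, 1, h, w, device=gray.device)
            x = torch.cat([feats, gray, noise], 1)  # multi-layer concatenation
        return torch.tanh(self.to_color(x))  # e.g. ab chrominance channels

# Two forward passes on the same input yield two different colorizations,
# because fresh noise is drawn at every layer on every call.
g = DiverseColorizationGenerator()
gray = torch.rand(1, 1, 64, 64) * 2 - 1
color_a, color_b = g(gray), g(gray)
```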

    AdaBoost with "keypoint presence features" for real-time vehicle visual detection

    No full text
    We present promising results for real-time vehicle visual detection, obtained with adaBoost using new original “keypoint presence features”. These weak classifiers produce a boolean response based on the presence or absence in the tested image of a “keypoint” (approximately a SURF interest point) with a descriptor sufficiently similar (i.e. within a given distance) to a reference descriptor characterizing the feature. A first experiment was conducted on a public image dataset containing lateral-viewed cars, yielding 95% recall with 95% precision on the test set. Moreover, analysis of the positions of adaBoost-selected keypoints shows that they correspond to a specific part of the object category (such as “wheel” or “side skirt”) and thus have a “semantic” meaning.

    A novel object tracking algorithm based on compressed sensing and entropy of information

    Get PDF
    Acknowledgments: This research is supported by (1) the Ph.D. Programs Foundation of the Ministry of Education of China under Grant no. 20120061110045, (2) the Science and Technology Development Projects of Jilin Province of China under Grant no. 20150204007GX, and (3) the Key Laboratory for Symbol Computation and Knowledge Engineering of the National Education Ministry of China. Peer reviewed. Publisher PDF.

    Visual object categorization with new keypoint-based adaBoost features

    Full text link
    We present promising results for visual object categorization, obtained with adaBoost using new original “keypoint-based features”. These weak classifiers produce a boolean response based on the presence or absence in the tested image of a “keypoint” (a kind of SURF interest point) with a descriptor sufficiently similar (i.e. within a given distance) to a reference descriptor characterizing the feature. A first experiment was conducted on a public image dataset containing lateral-viewed cars, yielding 95% recall with 95% precision on the test set. Preliminary tests on a small subset of a pedestrian database also give a promising 97% recall with 92% precision, which shows the generality of our new family of features. Moreover, analysis of the positions of adaBoost-selected keypoints shows that they correspond to a specific part of the object category (such as “wheel” or “side skirt” in the case of lateral-viewed cars) and thus have a “semantic” meaning. We also made a first test on video for detecting vehicles from adaBoost-selected keypoints filtered in real time from all detected keypoints.
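    As a rough illustration of the feature family described in the two adaBoost abstracts above, here is a sketch of a boolean keypoint-presence weak classifier and the weighted vote that adaBoost builds from such learners. The function names, the Euclidean distance, and the (alpha, reference, distance) tuple layout are assumptions made for illustration; the papers' actual descriptor matching may differ.

```python
# Hedged sketch of a "keypoint presence feature" weak classifier: it fires
# iff the test image contains a keypoint whose descriptor lies within a
# given distance of a reference descriptor. Details here are illustrative.
import numpy as np

def keypoint_presence_feature(image_descriptors, reference_descriptor, max_distance):
    """Boolean weak classifier: True if any detected keypoint descriptor
    (e.g. SURF, one 64-D vector per row) is close enough to the reference."""
    if len(image_descriptors) == 0:
        return False
    dists = np.linalg.norm(image_descriptors - reference_descriptor, axis=1)
    return bool(dists.min() <= max_distance)

def strong_classifier(image_descriptors, weak_learners, threshold=0.0):
    """adaBoost decision as a weighted vote over selected weak learners,
    given as (alpha, reference_descriptor, max_distance) tuples."""
    score = sum(alpha * (1.0 if keypoint_presence_feature(image_descriptors,
                                                          ref, dist) else -1.0)
                for alpha, ref, dist in weak_learners)
    return score >= threshold
```

    Training would proceed as in standard adaBoost: each round picks the (reference descriptor, distance) pair with the lowest weighted error, assigns it a weight alpha, and reweights the training images.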

    Enhancement of ELDA Tracker Based on CNN Features and Adaptive Model Update

    Get PDF
    Appearance representation and the observation model are the most important components in designing a robust visual tracking algorithm for video-based sensors. The exemplar-based linear discriminant analysis (ELDA) model has shown good performance in object tracking. Building on that, we improve the ELDA tracking algorithm with deep convolutional neural network (CNN) features and an adaptive model update. Deep CNN features have been used successfully in various computer vision tasks, but extracting CNN features on all of the candidate windows is time-consuming. To address this problem, a two-step CNN feature extraction method is proposed that computes the convolutional layers and the fully-connected layers separately. Owing to the strong discriminative ability of CNN features and the exemplar-based model, we update both the object and background models to improve their adaptivity and to deal with the tradeoff between discriminative ability and adaptivity. An object updating method is proposed to select the “good” models (detectors), which are highly discriminative and uncorrelated with the other selected models. Meanwhile, we build the background model as a Gaussian mixture model (GMM) to adapt to complex scenes; it is initialized offline and updated online. The proposed tracker is evaluated on a benchmark dataset of 50 video sequences with various challenges. It achieves the best overall performance among the compared state-of-the-art trackers, which demonstrates the effectiveness and robustness of our tracking algorithm.
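    The two-step extraction amounts to running the shared convolutional layers once per frame and evaluating only the fully-connected layers per candidate window. A hedged PyTorch sketch of that idea follows; the AlexNet backbone, the roi_align-based window cropping, and the 6x6 crop size are stand-ins chosen for illustration, not the paper's actual pipeline.

```python
# Sketch under stated assumptions: AlexNet and roi_align are illustrative
# stand-ins for the paper's actual network and window-cropping scheme.
import torch
import torchvision
from torchvision.ops import roi_align

backbone = torchvision.models.alexnet(weights=None)
conv_layers = backbone.features                      # step 1: shared conv pass
fc_layers = torch.nn.Sequential(torch.nn.Flatten(), backbone.classifier)

def candidate_features(frame, boxes):
    """frame: (1, 3, H, W) tensor; boxes: (K, 4) tensor of candidate windows
    as (x1, y1, x2, y2) in pixel coordinates. Returns one feature per window."""
    with torch.no_grad():
        fmap = conv_layers(frame)                    # conv layers run only once
        scale = fmap.shape[-1] / frame.shape[-1]     # pixels -> feature cells
        rois = torch.cat([torch.zeros(len(boxes), 1), boxes], dim=1)
        crops = roi_align(fmap, rois, output_size=(6, 6), spatial_scale=scale)
        return fc_layers(crops)                      # step 2: FC per window

# toy usage: 200 candidate windows share a single convolutional forward pass
frame = torch.rand(1, 3, 224, 224)
boxes = torch.tensor([[10.0, 10.0, 80.0, 80.0]]).repeat(200, 1)
feats = candidate_features(frame, boxes)             # shape (200, 1000)
```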

    Development of modern methods for the diagnostics of murals in architectural monuments

    Get PDF
    The paper studies monitoring of the state of murals, retrieval of data pertaining to this state, and management and storage of the said data. The possibility of integrating traditional methods of mural mapping with modern methods of data visualization, including the new Google Project Tango device technology for capturing complex textures of the inner 3D volumes of architectural monuments, has been investigated (for instance, in the Assumption Cathedral). We further discuss express scanning for automated cartogram generation, used for subsequent comparison of states and for assessing the damage done to the murals. Results indicate that additional work is needed to improve the precision of the method.

    Review of Person Re-identification Techniques

    Full text link
    Person re-identification across different surveillance cameras with disjoint fields of view has become one of the most interesting and challenging subjects in the area of intelligent video surveillance. Although several methods have been developed and proposed, certain limitations and unresolved issues remain. In all of the existing re-identification approaches, feature vectors are extracted from segmented still images or video frames. Different similarity or dissimilarity measures have been applied to these vectors. Some methods have used simple constant metrics, whereas others have utilised models to obtain optimised metrics. Some have created models based on local colour or texture information, and others have built models based on the gait of people. In general, the main objective of all these approaches is to achieve higher accuracy rates and lower computational costs. This study summarises several developments in recent literature and discusses the various available methods used in person re-identification. Specifically, their advantages and disadvantages are mentioned and compared.

    Resources for computer-based sign recognition from video, and the criticality of consistency of gloss labeling across multiple large ASL video corpora

    Get PDF
    The WLASL purports to be “the largest video dataset for Word-Level American Sign Language (ASL) recognition.” It brings together various publicly shared video collections that could be quite valuable for sign recognition research, and it has been used extensively for such research. However, a critical problem with the accompanying annotations has heretofore not been recognized by the authors, nor by those who have exploited these data: There is no 1-1 correspondence between sign productions and gloss labels. Here we describe a large, linguistically annotated, video corpus of citation-form ASL signs shared by the ASLLRP—with 23,452 sign tokens and an online Sign Bank—in which such correspondences are enforced. We furthermore provide annotations for 19,672 of the WLASL video examples consistent with ASLLRP glossing conventions. For those wishing to use WLASL videos, this provides a set of annotations making it possible: (1) to use those data reliably for computational research; and/or (2) to combine the WLASL and ASLLRP datasets, creating a combined resource that is larger and richer than either of those datasets individually, with consistent gloss labeling for all signs. We also offer a summary of our own sign recognition research to date that exploits these data resources.

    Conditional Random Fields and Supervised Learning in Automated Skin Lesion Diagnosis

    Get PDF
    Many subproblems in automated skin lesion diagnosis (ASLD) can be unified under a single generalization: assigning a label, from a predefined set, to each pixel in an image. We first formalize this generalization and then present two probabilistic models capable of solving it. The first model is based on independent pixel labeling using maximum a-posteriori (MAP) estimation. The second model is based on conditional random fields (CRFs), where dependencies between pixels are defined using a graph structure. Furthermore, we demonstrate how supervised learning and an appropriate training set can be used to automatically determine all model parameters. We evaluate both models' ability to segment a challenging dataset consisting of 116 images and compare our results to 5 previously published methods.
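    For the first of the two models, independent per-pixel MAP labeling reduces to an argmax over per-pixel class scores. A minimal sketch follows, assuming per-pixel log-likelihoods and shared class log-priors as inputs; the function name and array layout are illustrative, not taken from the paper.

```python
# Minimal sketch of independent per-pixel MAP labeling; the CRF variant would
# additionally couple neighboring pixels through pairwise graph potentials.
import numpy as np

def map_label_image(log_likelihood, log_prior):
    """log_likelihood: (H, W, K) per-pixel class log-likelihoods.
    log_prior: (K,) class log-priors. Returns an (H, W) label map."""
    return np.argmax(log_likelihood + log_prior, axis=-1)

# toy usage: 3 hypothetical classes (lesion, skin, background) on a 4x4 image
ll = np.log(np.random.dirichlet(np.ones(3), size=(4, 4)))
labels = map_label_image(ll, np.log(np.array([0.2, 0.5, 0.3])))
```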