24 research outputs found

    Discriminative tracking using tensor pooling

    How local descriptors are organized into a global representation has a critical impact on the performance of vision tasks. Recently, local sparse representation has been successfully applied to visual tracking, owing to its discriminative nature and robustness against local noise and partial occlusions. Local sparse codes computed with a template form a third-order tensor according to their original layout, although most existing pooling operators convert the codes to a vector by concatenating them or computing statistics on them. We argue that, compared to pooled vectors, the tensor form can deliver more intrinsic structural information about the target appearance, and can also avoid the high-dimensional learning problems that affect concatenation-based pooling methods. Therefore, in this paper, we propose to represent target templates and candidates directly with sparse coding tensors, and to build the appearance model by incrementally learning on these tensors. We propose a discriminative framework to further improve the robustness of our method against drifting and environmental noise. Experiments on a recent comprehensive benchmark indicate that our method performs better than state-of-the-art trackers.

    When Deep Learning Meets Data Alignment: A Review on Deep Registration Networks (DRNs)

    Registration is the process of computing the transformation that aligns sets of data. A registration process can commonly be divided into four main steps: target selection, feature extraction, feature matching, and transform computation for the alignment. The accuracy of the result depends on multiple factors, the most significant being the quantity of input data; the presence of noise, outliers and occlusions; the quality of the extracted features; real-time requirements; and the type of transformation, especially those defined by multiple parameters, such as non-rigid deformations. Recent advances in machine learning could be a turning point for these issues, particularly the development of deep learning (DL) techniques, which are helping to improve multiple computer vision problems through an abstract understanding of the input data. This paper presents a review of deep learning-based registration methods. We classify the papers using a framework derived from the traditional registration pipeline in order to analyse the strengths of the new learning-based proposals. Deep Registration Networks (DRNs) try to solve the alignment task either by replacing part of the traditional pipeline with a network or by solving the registration problem in full. The main conclusions are: 1) learning-based registration techniques cannot always be clearly classified within the traditional pipeline; 2) these approaches allow more complex inputs, such as conceptual models, as well as the traditional 3D datasets; 3) in spite of the generality of learning, the current proposals are still ad hoc solutions; and 4) this is a young topic that still requires a large effort to reach general solutions able to cope with the problems that affect traditional approaches. Comment: Submitted to Pattern Recognition
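
As an illustration of the last step of the traditional pipeline named in this abstract (transform computation), the following is a minimal sketch that estimates a 2D rigid transform from point pairs that are assumed to be already matched. It is a classical closed-form least-squares solution, not a method taken from the reviewed paper; the function names `estimate_rigid_2d` and `apply_rigid_2d` are hypothetical.

```python
import math


def estimate_rigid_2d(source, target):
    """Least-squares rotation and translation mapping matched 2D points
    source[i] -> target[i]. Assumes correspondences are already known."""
    n = len(source)
    # Centroids of both point sets.
    sx = sum(p[0] for p in source) / n
    sy = sum(p[1] for p in source) / n
    tx = sum(q[0] for q in target) / n
    ty = sum(q[1] for q in target) / n
    # Accumulate dot and cross terms of the centred correspondences.
    dot = 0.0
    cross = 0.0
    for (px, py), (qx, qy) in zip(source, target):
        ax, ay = px - sx, py - sy
        bx, by = qx - tx, qy - ty
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    # The angle that maximises the alignment of the centred sets.
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    # The translation moves the rotated source centroid onto the target centroid.
    translation = (tx - (c * sx - s * sy), ty - (s * sx + c * sy))
    return theta, translation


def apply_rigid_2d(points, theta, translation):
    """Rotate each point by theta (radians) and then translate it."""
    c, s = math.cos(theta), math.sin(theta)
    dx, dy = translation
    return [(c * x - s * y + dx, s * x + c * y + dy) for x, y in points]


if __name__ == "__main__":
    src = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
    tgt = apply_rigid_2d(src, 0.5, (2.0, -1.0))
    print(estimate_rigid_2d(src, tgt))
```

In a learning-based pipeline of the kind the abstract describes, a network might replace the matching step and hand its correspondences to a closed-form solver like this one, or it might replace the whole chain.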

    OSC-CO2: Coattention and Cosegmentation Framework for Plant State Change with Multiple Features

    Cosegmentation and coattention are extensions of traditional segmentation methods aimed at detecting a common object (or objects) in a group of images. Current cosegmentation and coattention methods are ineffective for objects, such as plants, that change their morphological state while being captured in different modalities and views. The Object State Change using Coattention-Cosegmentation (OSC-CO2) is an end-to-end unsupervised deep-learning framework that enhances traditional segmentation techniques by processing, analyzing, selecting, and combining suitable segmentation results that may contain most of our target object’s pixels, and then displaying a final segmented image. The framework leverages coattention-based convolutional neural networks (CNNs) and cosegmentation-based dense Conditional Random Fields (CRFs) to address segmentation accuracy in high-dimensional plant imagery with evolving plant objects. The efficacy of OSC-CO2 is demonstrated using plant growth sequences imaged with infrared, visible, and fluorescence cameras in multiple views using a remote sensing, high-throughput phenotyping platform, and is evaluated using the Jaccard index and precision measures. We also introduce CosegPP+, a structured dataset that can provide quantitative information on the efficacy of our framework. Results show that OSC-CO2 outperformed state-of-the-art segmentation and cosegmentation methods, improving segmentation accuracy by 3% to 45%.
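
The two evaluation measures named in this abstract, the Jaccard index and precision, can be sketched for binary segmentation masks as follows. This is a generic illustration under stated assumptions, not the authors' evaluation code: masks are assumed to be nested lists of 0/1 values of equal shape, the function names are hypothetical, and the return values for empty masks are a convention chosen here.

```python
def jaccard_index(pred, truth):
    """Intersection over union of two binary masks (nested lists of 0/1)."""
    inter = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                inter += 1
            if p or t:
                union += 1
    # Convention assumed here: two empty masks count as a perfect match.
    return inter / union if union else 1.0


def precision(pred, truth):
    """Fraction of predicted foreground pixels that are truly foreground."""
    true_pos = 0
    false_pos = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                true_pos += 1
            elif p and not t:
                false_pos += 1
    predicted = true_pos + false_pos
    # Convention assumed here: no predicted foreground gives precision 0.
    return true_pos / predicted if predicted else 0.0


if __name__ == "__main__":
    pred = [[1, 1, 0],
            [0, 1, 0]]
    truth = [[1, 0, 0],
             [0, 1, 1]]
    print(jaccard_index(pred, truth))  # 2 shared pixels / 4 in the union = 0.5
    print(precision(pred, truth))      # 2 correct out of 3 predicted
```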

    University entry selection framework using rule-based and back-propagation

    Processing thousands of applications can be a challenging task, especially when applicants do not consider the university requirements and their own qualifications. The selection officer has to check the program requirements and calculate the merit score of each applicant. This process is based on rules determined by the Ministry of Education, and the institution has to select the qualified applicants from among thousands of applications. In recent years, several student selection methods have been proposed using fuzzy multiple decision making and decision trees. These approaches have produced high accuracy and good detection rates on closed-domain university data. However, the current selection procedure requires admission officers to manually evaluate the applications and match the applicants’ qualifications with the programs they applied for. Because the selection process is tedious and very prone to mistakes, a comprehensive approach to detecting and identifying qualified applicants for university enrollment is highly desirable. In this work, a student selection framework using rule-based processing and a back-propagation neural network is presented. The work involves two phases. The first phase, known as pre-processing, uses rules to check the university requirements, calculate merit, and convert the data to serve as input for the next phase. The second phase uses a back-propagation neural network model to evaluate the qualified candidates for admission to particular programs. This means that only the data of applicants who qualify in the first phase are sent to the second phase for further processing. The dataset consists of 3,790 records from Universiti Pendidikan Sultan Idris. The experiments show that the proposed combination of rule-based processing and a back-propagation neural network produced better performance; the framework has been successfully implemented and validated, with an average accuracy of more than 95% for student selection across all sets of the test data.

    High resolution trichromatic road surface scanning with a line scan camera and light emitting diode lighting for road-kill detection

    This paper presents a road surface scanning system that uses a trichromatic line scan camera with light emitting diode (LED) lighting to achieve sub-millimeter road surface resolution. It was part of a project named Roadkills-Intelligent systems for surveying mortality of amphibians in Portuguese roads, sponsored by the Portuguese Science and Technology Foundation. A trailer was developed to accommodate the complete system, which provides standalone power generation; computer image capture and recording; controlled lighting to operate day or night without disturbance; an incremental encoder with 5000 pulses per revolution attached to one of the trailer wheels; and sub-meter Global Positioning System (GPS) localization. The trailer is easy to use with any vehicle that has a trailer towing system, and the design focuses on a complete low-cost solution. The paper describes the system architecture of the developed prototype, its calibration procedure, the experiments performed and some of the results obtained, along with a discussion and comparison with existing systems. Sustained operating trailer speeds of up to 30 km/h are achievable without loss of quality at an image width of 4096 pixels (1 m width of road surface) with 250 µm/pixel resolution. Higher scanning speeds can be achieved by lowering the image resolution (120 km/h with 1 mm/pixel). Computer vision algorithms are under development to operate on the captured images in order to automatically detect road-kills of amphibians. This work was financed by FEDER Funds, through the Operational Programme for Competitiveness Factors-COMPETE, and by National Funds through FCT-Foundation for Science and Technology of Portugal, under the project PTDC/BIA-BIC/4296/2012 with the name-Roadkills: Intelligent systems for mapping amphibian mortality on Portuguese roads. C.S. and M.F. are supported by Research Grants contracts by FCT (UMINHO/BI/172/2013 and UMINHO/BI/175/2013 respectively). N.S. is supported by an IF (Investigador FCT) contract by FCT (IF/01526/2013). The authors also wish to thank the entities involved, in particular, School of Engineering of the University of Minho and the Algoritmi research center, the Faculty of Sciences of the University of Porto and the University Institute of Maia.
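
The speed and resolution figures quoted in this abstract can be cross-checked with a short calculation. The sketch below assumes square pixels, meaning the along-track sampling distance equals the quoted across-track resolution; the abstract does not state the camera line rate, so the values computed here are derived estimates, and the function names are hypothetical.

```python
def swath_width_m(pixels, resolution_m_per_px):
    """Width of road covered by one scan line."""
    return pixels * resolution_m_per_px


def required_line_rate_hz(speed_kmh, resolution_m_per_px):
    """Scan lines per second needed so that consecutive lines are spaced
    one pixel apart along the direction of travel (square-pixel assumption)."""
    speed_m_per_s = speed_kmh * 1000.0 / 3600.0
    return speed_m_per_s / resolution_m_per_px


if __name__ == "__main__":
    # 4096 pixels at 250 micrometres per pixel: about 1 m of road width.
    print(swath_width_m(4096, 250e-6))
    # Line rate at 30 km/h with 250 micrometres per pixel.
    print(required_line_rate_hz(30.0, 250e-6))
    # Line rate at 120 km/h with 1 mm per pixel.
    print(required_line_rate_hz(120.0, 1e-3))
```

Under the square-pixel assumption both operating points need the same line rate of roughly 33,000 lines per second, which is consistent with the abstract's statement that higher speeds are reached by lowering the resolution.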

    Mixed tracking and learning techniques for tracking in video sequences

    Within the field of computer vision, the tracking problem consists of following the position of an object through a video sequence. It has been addressed in various ways, mainly with methods based on optical flow and, more recently, with detection and machine learning techniques. In particular, the TLD tracker has shown good results by combining both schemes. However, the Nearest Neighbour (NN) classification algorithm used in TLD has high memory and computing power requirements. This work explores other classification methods under the same scheme used in TLD, so as to maintain good tracking performance without those limitations. More specifically, a linear classifier is used together with representations based on Fisher vectors, whose resource usage is considerably lower than that of NN. The tracking performance of this new scheme is compared experimentally with the original TLD, validating the use of the new scheme to address the tracking problem.