24 research outputs found
Discriminative tracking using tensor pooling
How local descriptors are organized into a global representation has a critical impact on the performance of vision tasks. Recently, local sparse representation has been successfully applied to visual tracking, owing to its discriminative nature and robustness against local noise and partial occlusions. Local sparse codes computed with a template actually form a third-order tensor according to their original layout, although most existing pooling operators convert the codes to a vector by concatenating them or computing statistics on them. We argue that, compared to pooling vectors, the tensor form can deliver more intrinsic structural information about the target appearance, and can also avoid the high-dimensional learning problems that affect concatenation-based pooling methods. Therefore, in this paper, we propose to represent target templates and candidates directly with sparse coding tensors, and to build the appearance model by incrementally learning on these tensors. We propose a discriminative framework to further improve the robustness of our method against drifting and environmental noise. Experiments on a recent comprehensive benchmark indicate that our method performs better than state-of-the-art trackers.
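The difference between keeping the codes as a tensor and pooling them into a vector can be illustrated with a toy example. The sketch below is not the authors' method: the grid size, code length and the way the toy codes are filled are all assumptions chosen for illustration. It only shows that concatenation yields a long vector, that averaging discards the spatial layout, and that the tensor keeps both the layout and the code dimension.

```python
# Hypothetical sizes: a 3x3 grid of local patches, each with a 4-entry sparse code.
ROWS, COLS, CODE_DIM = 3, 3, 4

def make_codes():
    """Build toy sparse codes: patch (r, c) activates exactly one dictionary atom."""
    codes = [[[0.0] * CODE_DIM for _ in range(COLS)] for _ in range(ROWS)]
    for r in range(ROWS):
        for c in range(COLS):
            codes[r][c][(r + c) % CODE_DIM] = 1.0
    return codes

def concat_pool(codes):
    """Concatenation pooling: flatten to one vector of ROWS * COLS * CODE_DIM entries."""
    return [v for row in codes for patch in row for v in patch]

def average_pool(codes):
    """Average pooling: one CODE_DIM vector; the spatial layout is discarded."""
    n = ROWS * COLS
    return [
        sum(codes[r][c][k] for r in range(ROWS) for c in range(COLS)) / n
        for k in range(CODE_DIM)
    ]

codes = make_codes()
# The nested list is a third-order tensor indexed by (row, column, code entry).
tensor_shape = (len(codes), len(codes[0]), len(codes[0][0]))
print("tensor shape:", tensor_shape)
print("concatenated length:", len(concat_pool(codes)))
print("average-pooled length:", len(average_pool(codes)))
```

With realistic patch grids and dictionary sizes, the concatenated vector grows with the product of all three dimensions, which is the high-dimensionality issue the abstract refers to.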
When Deep Learning Meets Data Alignment: A Review on Deep Registration Networks (DRNs)
Registration is the process that computes the transformation that aligns sets
of data. Commonly, a registration process can be divided into four main steps:
target selection, feature extraction, feature matching, and transform
computation for the alignment. The accuracy of the result depends on multiple
factors, the most significant of which are the quantity of input data, the
presence of noise, outliers and occlusions, the quality of the extracted
features, real-time requirements and the type of transformation, especially
transformations defined by many parameters, such as non-rigid deformations.
Recent advancements in machine learning could be a turning point in these
issues, particularly with the development of deep learning (DL) techniques,
which are helping to improve multiple computer vision problems through an
abstract understanding of the input data. In this paper, a review of deep
learning-based registration methods is presented. We classify the papers
using a framework derived from the traditional registration pipeline, in order
to analyse the strengths of the new learning-based proposals. Deep
Registration Networks (DRNs) try to solve the alignment task either by
replacing part of the traditional pipeline with a network or by fully solving
the registration problem. The main conclusions are: 1) learning-based
registration techniques cannot always be clearly classified within the
traditional pipeline; 2) these approaches allow more complex inputs, such as
conceptual models, in addition to traditional 3D datasets; 3) in spite of the
generality of learning, the current proposals are still ad hoc solutions; and
4) this is a young topic that still requires a large effort to reach general
solutions able to cope with the problems that affect traditional approaches.
Comment: Submitted to Pattern Recognition
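To make the last step of the traditional pipeline described above (transform computation) concrete, here is a minimal sketch of a closed-form least-squares rigid alignment in 2D. It is a classical technique, not one of the reviewed DRNs, and it assumes that feature matching has already produced point pairs free of outliers. The function names `estimate_rigid_2d` and `apply_rigid_2d` are hypothetical.

```python
import math

def estimate_rigid_2d(source, target):
    """Least-squares rotation angle and translation mapping source onto target.

    Assumes source[i] and target[i] are already matched point pairs.
    """
    n = len(source)
    sx = sum(p[0] for p in source) / n
    sy = sum(p[1] for p in source) / n
    tx = sum(p[0] for p in target) / n
    ty = sum(p[1] for p in target) / n
    dot = cross = 0.0
    for (ax, ay), (bx, by) in zip(source, target):
        # Work with centred coordinates so rotation and translation decouple.
        ax, ay, bx, by = ax - sx, ay - sy, bx - tx, by - ty
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    # Translation = target centroid minus the rotated source centroid.
    return theta, (tx - (c * sx - s * sy), ty - (s * sx + c * sy))

def apply_rigid_2d(points, theta, translation):
    """Rotate each point by theta about the origin, then translate it."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + translation[0], s * x + c * y + translation[1])
            for x, y in points]

# Usage: recover a known 90-degree rotation plus a shift of (2, 3).
source = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
target = apply_rigid_2d(source, math.pi / 2, (2.0, 3.0))
theta, translation = estimate_rigid_2d(source, target)
print(theta, translation)
```

Noise, outliers and non-rigid deformations, which the abstract lists as the main difficulties, are exactly the cases this simple closed form does not handle.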
OSC-CO2: Coattention and Cosegmentation Framework for Plant State Change with Multiple Features
Cosegmentation and coattention are extensions of traditional segmentation methods aimed at detecting a common object (or objects) in a group of images. Current cosegmentation and coattention methods are ineffective for objects, such as plants, that change their morphological state while being captured in different modalities and views. Object State Change using Coattention-Cosegmentation (OSC-CO2) is an end-to-end unsupervised deep-learning framework that enhances traditional segmentation techniques by processing, analyzing, selecting, and combining suitable segmentation results that may contain most of the target object’s pixels, and then displaying a final segmented image. The framework leverages coattention-based convolutional neural networks (CNNs) and cosegmentation-based dense Conditional Random Fields (CRFs) to address segmentation accuracy in high-dimensional plant imagery with evolving plant objects. The efficacy of OSC-CO2 is demonstrated using plant growth sequences imaged with infrared, visible, and fluorescence cameras in multiple views on a remote sensing, high-throughput phenotyping platform, and is evaluated using the Jaccard index and precision. We also introduce CosegPP+, a structured dataset that can provide quantitative information on the efficacy of our framework. Results show that OSC-CO2 outperformed state-of-the-art segmentation and cosegmentation methods, improving segmentation accuracy by 3% to 45%.
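The two evaluation measures named in the abstract above have standard definitions for binary masks. The sketch below uses those definitions on flattened 0/1 masks; the toy masks are invented for illustration, and the values returned for empty masks are a convention assumed here rather than taken from the paper.

```python
def jaccard_index(pred, truth):
    """Intersection over union of two binary masks given as flat 0/1 sequences."""
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    union = sum(1 for p, t in zip(pred, truth) if p or t)
    # Assumed convention: two empty masks count as a perfect match.
    return inter / union if union else 1.0

def precision(pred, truth):
    """Fraction of predicted foreground pixels that are truly foreground."""
    true_pos = sum(1 for p, t in zip(pred, truth) if p and t)
    predicted = sum(1 for p in pred if p)
    # Assumed convention: no predicted foreground gives precision 0.
    return true_pos / predicted if predicted else 0.0

# Toy flattened masks (hypothetical data): 1 = plant pixel, 0 = background.
pred_mask = [1, 1, 1, 0, 0, 0]
true_mask = [0, 1, 1, 1, 0, 0]
print(jaccard_index(pred_mask, true_mask), precision(pred_mask, true_mask))
```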
University entry selection framework using rule-based and back-propagation
Processing thousands of applications can be a challenging task, especially when applicants do not take the university requirements and their own qualifications into account. The selection officer has to check the program requirements and calculate the merit score of each applicant. This process is based on rules determined by the Ministry of Education, and the institution has to select the qualified applicants from among thousands of applications.
In recent years, several student selection methods have been proposed using fuzzy multiple decision making and decision trees. These approaches have produced high accuracy and good detection rates on closed-domain university data. However, the current selection procedure requires the admission officers to manually evaluate the applications and match the applicants’ qualifications with the programs they applied for. Because the selection process is tedious and very prone to mistakes, a comprehensive approach to detect and identify qualified applicants for university enrollment is highly desired.
In this work, a student selection framework that combines a rule-based approach with a back-propagation neural network is presented. The framework has two phases. The first phase, pre-processing, uses rules to check the university requirements, calculate merit scores and convert the data into input for the next phase. The second phase uses a back-propagation neural network model to evaluate the qualified candidates for admission to particular programs. Only the data of applicants who qualify in the first phase are therefore passed to the second phase for further processing. The dataset consists of 3,790 records from Universiti Pendidikan Sultan Idris.
The experiments show that the proposed combination of rule-based processing and a back-propagation neural network performs well: the framework was successfully implemented and validated, with an average accuracy of more than 95% for student selection across all sets of the test data.
High resolution trichromatic road surface scanning with a line scan camera and light emitting diode lighting for road-kill detection
This paper presents a road surface scanning system that uses a trichromatic line scan camera with light emitting diode (LED) lighting to achieve a road surface resolution below one millimeter. It was part of a project named Roadkills-Intelligent systems for surveying mortality of amphibians in Portuguese roads, sponsored by the Portuguese Science and Technology Foundation. A trailer was developed to accommodate the complete system, which comprises standalone power generation, computer image capture and recording, controlled lighting for undisturbed operation by day or night, an incremental encoder with 5000 pulses per revolution attached to one of the trailer wheels, and sub-meter Global Positioning System (GPS) localization. The trailer is easy to use with any vehicle that has a trailer towing system, and the design focuses on a complete low-cost solution. The paper describes the system architecture of the developed prototype, its calibration procedure, the experiments performed and some of the results obtained, along with a discussion and comparison with existing systems. Sustained trailer operating speeds of up to 30 km/h are achievable without loss of quality at an image width of 4096 pixels (1 m width of road surface) with 250 µm/pixel resolution. Higher scanning speeds can be achieved by lowering the image resolution (120 km/h with 1 mm/pixel). Computer vision algorithms are under development to operate on the captured images in order to automatically detect road-kills of amphibians.
This work was financed by FEDER Funds, through the Operational Programme for Competitiveness Factors-COMPETE, and by National Funds through FCT-Foundation for Science and Technology of Portugal, under the project PTDC/BIA-BIC/4296/2012, named Roadkills: Intelligent systems for mapping amphibian mortality on Portuguese roads. C.S. and M.F. are supported by Research Grants contracts by FCT (UMINHO/BI/172/2013 and UMINHO/BI/175/2013 respectively). N.S. is supported by an IF (Investigador FCT) contract by FCT (IF/01526/2013). The authors also wish to thank the entities involved, in particular, School of Engineering of the University of Minho and the Algoritmi research center, the Faculty of Sciences of the University of Porto and the University Institute of Maia.
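The speed and resolution figures quoted in the abstract above can be checked with a short calculation. The sketch below assumes square pixels, so that the along-track spacing between scan lines equals the quoted per-pixel resolution; that assumption and the function names are not taken from the paper.

```python
def required_line_rate(speed_kmh, resolution_m):
    """Scan lines per second needed so consecutive lines are resolution_m apart."""
    speed_m_per_s = speed_kmh * 1000.0 / 3600.0
    return speed_m_per_s / resolution_m

def swath_width_m(pixels, resolution_m):
    """Width of road surface covered by one scan line."""
    return pixels * resolution_m

# Operating points quoted in the abstract.
fine = required_line_rate(30, 250e-6)    # 30 km/h at 250 um/pixel
coarse = required_line_rate(120, 1e-3)   # 120 km/h at 1 mm/pixel
width = swath_width_m(4096, 250e-6)
print(round(fine), round(coarse), round(width, 3))
```

Under this assumption both operating points need roughly the same line rate, about 33,000 lines per second, and 4096 pixels at 250 µm/pixel cover about 1.02 m, which matches the quoted 1 m width. This would be consistent with the line rate being the limiting factor, although the abstract does not state that.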
Mixed tracking and learning techniques for tracking in video sequences
Within the field of computer vision, the tracking problem consists of following the position of an object through a video sequence. It has been addressed in various ways, mainly with methods based on optical flow and, more recently, with detection and machine learning techniques. In particular, the TLD tracker has shown good results by combining both approaches. However, the Nearest Neighbour (NN) classification algorithm used in TLD has high memory and computing power requirements. This work explores other classification methods within the same scheme used in TLD, so as to maintain good tracking performance without those limitations. More specifically, a linear classifier is used together with representations based on Fisher vectors, whose resource usage is considerably lower than that of NN. The tracking performance of this new scheme is compared experimentally with the original TLD, validating the use of the new scheme for the tracking problem.