
    Classification of Broadcast News Audio Data Employing Binary Decision Architecture

    A novel binary decision architecture (BDA) for the broadcast news audio classification task is presented in this paper. The idea of developing such an architecture comes from the fact that an appropriate combination of multiple binary classifiers for the two-class discrimination problem can reduce misclassification error without a rapid increase in computational complexity. The core element of the classification architecture is a binary decision (BD) algorithm that discriminates between each pair of acoustic classes using two types of decision function. The first is a simple rule-based approach in which the final decision is made according to the value of a selected discrimination parameter. Its main advantage is the relatively short processing time needed to classify all acoustic classes; the cost is low classification accuracy. The second employs a support vector machine (SVM) classifier. In this case, the overall classification accuracy is conditioned on finding optimal parameters for the decision function, resulting in higher computational complexity and better classification performance. The final form of the proposed BDA is created by combining four BD discriminators supplemented by a decision table. The effectiveness of the proposed BDA, using both the rule-based approach and the SVM classifier, is compared with the two most popular strategies for multiclass classification, namely binary decision trees (BDT) and One-Against-One SVM (OAOSVM). Experimental results show that the proposed classification architecture decreases the overall classification error in comparison with the BDT architecture; however, an optimization technique for selecting the optimal set of training data is needed in order to overcome the OAOSVM
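    The pairwise rule-based discrimination and the decision table described above can be sketched as follows. This is a minimal illustration: the acoustic classes, discrimination parameters, thresholds, and table entries are invented for the example, not taken from the paper.

```python
# Hedged sketch of a binary decision architecture: pairwise rule-based
# discriminators vote, and a decision table resolves the final class.
# All names and thresholds below are illustrative, not the paper's.

def make_rule_discriminator(feature_index, threshold, below, above):
    """Rule-based binary decision: choose one of two classes by comparing
    a single discrimination parameter against a threshold."""
    def decide(features):
        return below if features[feature_index] < threshold else above
    return decide

def classify(features, discriminators, decision_table):
    # Each pairwise discriminator votes for one of its two classes;
    # the ordered tuple of votes indexes the decision table.
    votes = tuple(d(features) for d in discriminators)
    return decision_table.get(votes, votes[0])  # fall back to first vote

# Toy setup: discriminate speech vs. music vs. noise on two parameters.
d1 = make_rule_discriminator(0, 0.5, "speech", "music")  # e.g. a zero-crossing-rate-like parameter
d2 = make_rule_discriminator(1, 0.3, "speech", "noise")  # e.g. a spectral-flux-like parameter
table = {
    ("speech", "speech"): "speech",
    ("music", "speech"): "music",
    ("speech", "noise"): "noise",
    ("music", "noise"): "music",
}
print(classify([0.2, 0.1], [d1, d2], table))  # -> speech
```

    Replacing each `decide` function with a trained SVM decision function, as the paper does in its second variant, changes only the discriminators, not the table-based combination.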

    Design and Realization of an FPGA-Based Framework for Embedded Systems Applied to an Optimum-Path Forest Classifier

    Advisors: Eurípedes Guilherme de Oliveira Nóbrega, Isabelle Fantoni-Coichot, Vincent Frémont. Doctoral thesis - Universidade Estadual de Campinas, Faculdade de Engenharia Mecânica, Université de Technologie de Compiègne. Abstract: Many modern applications rely on Artificial Intelligence methods such as automatic classification. However, the computational cost associated with these techniques limits their use in resource-constrained embedded platforms. A large amount of data may overwhelm the computational power available in such embedded environments, making the process of designing them a challenging task. Common processing pipelines use many functions of high computational cost, which brings the necessity of combining high computational capacity with energy efficiency. One strategy to overcome this limitation and provide sufficient computational power allied with low energy consumption is the use of specialized hardware such as FPGAs. This class of devices is widely known for its performance-to-consumption ratio, making it an interesting alternative for building capable embedded systems. This thesis proposes an FPGA-based framework for performance acceleration of a classification algorithm to be implemented in an embedded system. Acceleration is achieved using a SIMD-based parallelization scheme, taking advantage of the fine-grain parallelism of FPGAs. The proposed system is implemented and tested on actual FPGA hardware. For the architecture validation, a graph-based classifier, the OPF, is evaluated in a proposed application and afterward implemented on the proposed architecture. The study of the OPF led to the proposition of a new learning algorithm for it, using evolutionary computation concepts, aiming at reducing classification processing time; combined with the hardware implementation, this offers performance acceleration sufficient for a variety of embedded systems. Doctorate. Solid Mechanics and Mechanical Design. Doctor of Mechanical Engineering. 3077/2013-09. CAPE
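    As a rough sketch of how the OPF (Optimum-Path Forest) classifier assigns a label: a test sample takes the label of the training node that minimizes the maximum of that node's optimum-path cost and its distance to the sample. This is a simplification of the full algorithm, and the toy forest below is invented.

```python
import math

def opf_classify(sample, nodes):
    """Simplified OPF classification: the test sample takes the label of
    the training node s minimising max(cost(s), distance(s, sample))."""
    best_label, best_cost = None, math.inf
    for features, cost, label in nodes:
        d = math.dist(features, sample)          # Euclidean distance
        path_cost = max(cost, d)                 # max-arc path cost
        if path_cost < best_cost:
            best_cost, best_label = path_cost, label
    return best_label

# Toy forest: (features, optimum-path cost from training, class label)
forest = [((0.0, 0.0), 0.0, "A"), ((1.0, 1.0), 0.2, "B")]
print(opf_classify((0.1, 0.1), forest))  # -> A
```

    The inner loop is independent across training nodes, which is what makes a SIMD-style FPGA parallelization of the kind the thesis describes attractive.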

    Sparse Binary Features for Image Classification

    In this work a new method for automatic image classification is proposed. It relies on a compact representation of images using sets of sparse binary features. The work first evaluates the Fast Retina Keypoint (FREAK) binary descriptor and proposes improvements based on an efficient descriptor representation, created using dimensionality reduction techniques, entropy analysis, and decorrelated sampling. In a second part, the problem of image classification is tackled. The traditional approach uses machine learning algorithms to build classifiers, and some prior works already use a compact image representation obtained by feature extraction as preprocessing. The second contribution of this work is to show that binary features, while being very compact and low-dimensional compared to traditional image representations, still provide very high discriminative power. This is demonstrated using various learning algorithms and binary descriptors. In recent years, a scheme based on the concept of Bag of Visual Words has been widely used for object recognition in images, or equivalently image classification: an image is described by an unordered set of visual words, generally represented by feature descriptors. The last contribution of this work is to use binary features with a simple Bag of Visual Words classifier. Image classification performance is tested on a large database of images
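    A minimal sketch of the Bag of Visual Words idea with binary descriptors: each descriptor votes for its nearest visual word under Hamming distance, and the image is represented by the resulting histogram. The toy vocabulary and descriptors below are invented for illustration.

```python
def hamming(a, b):
    """Hamming distance between two binary descriptors stored as ints."""
    return bin(a ^ b).count("1")

def bovw_histogram(descriptors, vocabulary):
    """Bag of Visual Words: each binary descriptor votes for its nearest
    visual word; the image is represented by the vote histogram."""
    hist = [0] * len(vocabulary)
    for d in descriptors:
        nearest = min(range(len(vocabulary)),
                      key=lambda i: hamming(d, vocabulary[i]))
        hist[nearest] += 1
    return hist

vocab = [0b0000, 0b1111]          # two toy visual words
descs = [0b0001, 0b1110, 0b1111]  # binary descriptors from an image
print(bovw_histogram(descs, vocab))  # -> [1, 2]
```

    The histogram is the compact, fixed-length vector that a downstream classifier consumes; with binary descriptors the nearest-word search reduces to cheap XOR-and-popcount operations.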

    Computer Vision for Microscopy Applications


    In-hand object detection and tracking using 2D and 3D information

    As robots are increasingly introduced into human-inhabited areas, they will need a perception system able to detect the actions that the humans around them are performing. This information is crucial in order to act accordingly in a changing environment. Humans use different objects and tools in various tasks, and hence one of the most useful cues for recognizing actions is the object the person is using. For example, if a person is holding a book, he or she is probably reading. Information about the objects humans are holding therefore helps determine the activities they are engaged in. This thesis presents a system that is able to track the user's hand and to learn and recognize the object being held. When instructed to learn, the software extracts key information about the object and stores it with a unique identification number for later recognition. If the user triggers the recognition mode, the system compares the current object's information with the data previously stored and outputs the best match. The system uses both 2D and 3D descriptors to improve the recognition stage. To reduce noise, two separate matching procedures for 2D and 3D each output a preliminary prediction at a rate of 30 predictions per second. Finally, a weighted average is computed over these predictions from both 2D and 3D to obtain the final prediction of the system. Experiments carried out to validate the system reveal that it is capable of recognizing objects from a pool of 6 different objects with an F1 score near 80% in each case. The experiments demonstrate that the system performs better when it combines the 2D and 3D descriptor information than when it uses 2D or 3D descriptors separately. Performance tests show that the system runs in real time with minimum computer requirements of roughly one physical core (at 2.4 GHz) and less than 1 GB of RAM. It is also possible to implement the software in a distributed system, since the bandwidth measurements carried out disclose a maximum bandwidth below 7 MB/s. This system is, to the best of my knowledge, the first to implement an in-hand object learning and recognition algorithm using 2D and 3D information. The use of both types of data and the inclusion of a posterior decision step improve the robustness and accuracy of the system. The software developed in this thesis is intended to serve as a building block for further research on the topic, in order to create more natural human-robot interaction and understanding. Giving robots a human-like interaction with the environment is a crucial step towards their complete autonomy and acceptance in human areas.
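    The fusion of the preliminary 2D and 3D predictions into one final decision can be sketched as a weighted vote. The weights and object names below are illustrative assumptions, not values from the thesis.

```python
from collections import Counter

def fuse_predictions(preds_2d, preds_3d, w2d=0.5, w3d=0.5):
    """Weighted vote over the per-frame predictions from the 2D and 3D
    matchers; the highest-scoring object identity wins. Equal weights
    are an assumption here, not the thesis's tuned values."""
    scores = Counter()
    for p in preds_2d:
        scores[p] += w2d
    for p in preds_3d:
        scores[p] += w3d
    return scores.most_common(1)[0][0]

# 30 preliminary predictions per modality, fused into one final answer.
pred_2d = ["mug"] * 18 + ["book"] * 12
pred_3d = ["mug"] * 25 + ["book"] * 5
print(fuse_predictions(pred_2d, pred_3d))  # -> mug
```

    Pooling both modalities this way is what lets a strong, consistent 3D match outvote a noisy 2D one, which matches the reported finding that the combined system beats either modality alone.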

    Histopathological image analysis : a review

    Over the past decade, dramatic increases in computational power and improvements in image analysis algorithms have allowed the development of powerful computer-assisted analytical approaches to radiological data. With the recent advent of whole-slide digital scanners, tissue histopathology slides can now be digitized and stored in digital image form. Consequently, digitized tissue histopathology has become amenable to the application of computerized image analysis and machine learning techniques. Analogous to the role of computer-assisted diagnosis (CAD) algorithms in medical imaging, which complement the opinion of a radiologist, CAD algorithms have begun to be developed for disease detection, diagnosis, and prognosis prediction to complement the opinion of the pathologist. In this paper, we review the recent state of the art in CAD technology for digitized histopathology. The paper also briefly describes the development and application of novel image analysis technology for a few specific histopathology-related problems being pursued in the United States and Europe

    Using information content to select keypoints for UAV image matching

    Image matching is one of the most important tasks in Unmanned Aerial Vehicle (UAV) photogrammetry applications. The number and distribution of extracted keypoints play an essential role in the reliability and accuracy of image matching and orientation results. Conventional detectors generally produce too many redundant keypoints. In this paper, we study the effect of applying various information content criteria to the keypoint selection task. To this end, the quality measures of entropy, spatial saliency, and texture coefficient are used to select keypoints extracted with the SIFT, SURF, MSER, and BRISK operators. Experiments are conducted on several synthetic and real UAV image pairs. Results show that the keypoint selection methods perform differently depending on the applied detector and scene type, but in most cases the precision of the matching results is improved, by an average of 15%. In general, applying proper keypoint selection techniques can improve the accuracy and efficiency of UAV image matching and orientation results. In addition to the evaluation, a new hybrid keypoint selection method is proposed that combines all of the information content criteria discussed in this paper. This new screening method was also compared against SIFT, showing 22% to 40% improvement in the bundle adjustment of UAV images
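    Entropy-based screening, one of the information content criteria mentioned above, can be sketched as follows: rank keypoints by the Shannon entropy of their surrounding image patch and keep the most informative fraction. The patches, keypoint names, and keep ratio below are invented for illustration.

```python
import math

def patch_entropy(patch):
    """Shannon entropy (bits) of the pixel intensities in a patch."""
    total = len(patch)
    counts = {}
    for p in patch:
        counts[p] = counts.get(p, 0) + 1
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def select_keypoints(keypoints, patches, keep_ratio=0.5):
    """Keep the keypoints whose surrounding patch carries the most
    information content, discarding the redundant rest."""
    ranked = sorted(zip(keypoints, patches),
                    key=lambda kp: patch_entropy(kp[1]), reverse=True)
    n = max(1, int(len(ranked) * keep_ratio))
    return [k for k, _ in ranked[:n]]

kps = ["kp_sky", "kp_texture"]
patches = [[128] * 16,                 # flat sky patch: entropy 0
           [0, 255, 30, 200] * 4]      # textured patch: high entropy
print(select_keypoints(kps, patches))  # -> ['kp_texture']
```

    A flat patch (e.g. sky or pavement) has near-zero entropy and is discarded first, which is exactly the redundancy the paper's selection criteria target.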

    Methods for efficient object categorization, detection, scene recognition, and image search

    In the past few years there has been tremendous growth in the use of digital images. Users can now access millions of photos, a fact that creates the need for methods that can efficiently and effectively search for the visual information of interest. In this thesis, we propose methods to learn image representations that compactly represent a large collection of images, enabling accurate image recognition with linear classification models, which offer the advantage of being efficient both to train and to test. The entries of our descriptors are the outputs of a set of basis classifiers evaluated on the image, which capture the presence or absence of a set of high-level visual concepts. We propose two different techniques to automatically discover the visual concepts and learn the basis classifiers from a given labeled dataset of pictures, producing descriptors that are highly discriminative for the original categories of the dataset. We empirically show that these descriptors can encode new unseen pictures and produce state-of-the-art results in conjunction with cheap linear classifiers. We describe several strategies to aggregate the outputs of basis classifiers evaluated on multiple subwindows of the image, in order to handle cases where the photo contains multiple objects and large amounts of clutter. We extend this framework to the task of object detection, where the goal is to spatially localize an object within an image. We use the outputs of a collection of detectors trained in an offline stage as features for new detection problems, showing results competitive with the current state of the art. Since generating rich manual annotations for an image dataset is a crucial limitation of modern methods in object localization and detection, this thesis also proposes a method to automatically generate training data for an object detector in a weakly-supervised fashion, yielding considerable savings in human annotation effort. We show that our automatically generated regions can be used to train object detectors with recognition results remarkably close to those obtained by training on manually annotated bounding boxes
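    The descriptor built from basis-classifier outputs, followed by a cheap linear classifier, can be sketched as follows. The "sky" and "grass" concepts, the feature dictionary, and the weights are hypothetical stand-ins for the learned components described above.

```python
def basis_descriptor(image_features, basis_classifiers):
    """Compact image descriptor: each entry is the score of one basis
    classifier (one high-level visual concept) on the image."""
    return [clf(image_features) for clf in basis_classifiers]

def linear_predict(descriptor, weights, bias):
    """Cheap linear classifier evaluated on the compact descriptor."""
    score = sum(w * x for w, x in zip(weights, descriptor)) + bias
    return 1 if score > 0 else 0

# Hypothetical basis classifiers scoring "sky" and "grass" concepts.
basis = [lambda f: f["blue"], lambda f: f["green"]]
desc = basis_descriptor({"blue": 0.9, "green": 0.1}, basis)
print(linear_predict(desc, weights=[1.0, -1.0], bias=0.0))  # -> 1
```

    The point of the design is that the descriptor is short (one entry per concept) regardless of image size, so training and testing the final linear model stays cheap.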

    Evaluation of process-structure-property relationships of carbon nanotube forests using simulation and deep learning

    This work aims to explore process-structure-property relationships of carbon nanotube (CNT) forests. CNTs have superior mechanical, electrical, and thermal properties that make them suitable for many applications. Yet, due to a lack of manufacturing control, there is a huge performance gap between the promising properties of individual CNTs and those of CNT forests, which hinders their adoption in potential industrial applications. In this research, computational modelling, in-situ electron microscopy of CNT synthesis, and data-driven, high-throughput deep convolutional neural networks are employed not only to accelerate the implementation of CNTs in various applications but also to establish a framework for building validated predictive models that can easily be extended to achieve application-tailored synthesis of other materials. A time-resolved, physics-based finite-element simulation tool is implemented in MATLAB to investigate the synthesis of CNT forests, especially CNT-CNT interactions, the mechanical forces they generate, and their role in ensemble structure and properties. A companion numerical model with a similar construct is then employed to examine forest mechanical properties in compression. In addition, in-situ experiments are carried out inside an Environmental Scanning Electron Microscope (ESEM) to nucleate and synthesize CNTs. The findings may primarily be used to expand knowledge of forest growth and self-assembly and to validate the assumptions of the simulation package. The SEM images can also serve as a database for constructing a deep learning model to grow CNTs by design. The chemical vapor deposition parameter space of CNT synthesis is so vast that it is not feasible, in time or cost, to investigate all conceivable combinations. Hence, simulated CNT forest morphology images are used to train machine learning algorithms that predict CNT synthesis conditions from desired properties. Exceptionally high prediction accuracies of R2 > 0.94 are achieved for buckling load and stiffness, as well as accuracies above 0.91 for the classification task. This high accuracy helps uncover CNT forest synthesis-structure relationships so that their promising performance can be adopted in real-world applications. We foresee this work as a meaningful step towards creating an unsupervised simulation, using machine learning techniques, that can seek out the CNT forest synthesis parameters needed to achieve desired property sets for diverse applications.
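    The R2 (coefficient of determination) figure reported above for buckling load and stiffness is the standard regression quality metric; a minimal computation, with invented toy data, looks like this:

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 minus the ratio of residual to
    total sum of squares. R2 = 1 means perfect prediction."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

# Toy buckling-load values vs. model predictions (invented numbers).
loads = [1.0, 2.0, 3.0, 4.0]
preds = [1.1, 1.9, 3.2, 3.9]
print(round(r_squared(loads, preds), 3))  # -> 0.986
```

    An R2 above 0.94, as reported, means the model's residual error is under 6% of the total variance in the measured property.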