
    Novel Image Classification technique using Particle Filter Framework optimised by Multikernel Sparse Representation (Brazilian Archives of Biology and Technology)

    The robustness and speed of image classification remain challenging problems in satellite image processing. This paper introduces a novel image classification technique that uses a particle filter framework (PFF)-based optimisation technique for satellite image classification. The framework uses a template-matching algorithm, comprising fast marching algorithm (FMA) and level set method (LSM)-based segmentation, which assists in creating the initial templates for comparison with other test images. The created templates are trained and used as inputs for the optimisation. The optimisation technique used in this work is multikernel sparse representation (MKSR). The combined execution of the FMA, LSM, PFF and MKSR approaches results in a substantial reduction in processing time for the various classes in a satellite image, which is small when compared with […]
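
The abstract alone does not specify the authors' implementation, but the particle-filter idea at the core of the pipeline can be sketched. Below is a minimal, generic particle-filter template matcher in Python/NumPy; the function name, parameters, and the negative-SSD likelihood are illustrative assumptions, not the paper's method (which additionally uses FMA/LSM segmentation to build templates and MKSR for optimisation).

```python
import numpy as np

def particle_filter_match(image, template, n_particles=300, n_iters=15, noise=4.0):
    """Localise `template` in `image` with a generic particle filter.

    Particles are candidate top-left positions; weights come from a
    similarity score (negative SSD) between the template and the image
    patch under each particle.
    """
    H, W = image.shape
    th, tw = template.shape
    parts = np.column_stack([
        np.random.uniform(0, H - th, n_particles),   # row
        np.random.uniform(0, W - tw, n_particles),   # col
    ])
    for _ in range(n_iters):
        # Proposal/diffusion step, clipped to valid positions.
        parts = np.clip(parts + np.random.normal(0, noise, parts.shape),
                        [0, 0], [H - th, W - tw])
        # Weight each particle by template similarity (negative SSD).
        scores = np.array([
            -np.sum((image[y:y + th, x:x + tw] - template) ** 2)
            for y, x in parts.astype(int)
        ])
        w = np.exp((scores - scores.max()) / (scores.std() + 1e-9))
        w /= w.sum()
        # Resample particles in proportion to their weights.
        parts = parts[np.random.choice(n_particles, n_particles, p=w)]
    return parts.mean(axis=0)  # posterior-mean position estimate
```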

    Human-robot interaction and computer-vision-based services for autonomous robots

    Imitation Learning (IL), or robot Programming by Demonstration (PbD), covers methods by which a robot learns new skills through human guidance and imitation. PbD takes its inspiration from the way humans learn new skills by imitation in order to develop methods by which new tasks can be transmitted to robots. This thesis is motivated by the generic question of "what to imitate?", which concerns the problem of how to extract the essential features of a task. To this end, we adopt an Action Recognition (AR) perspective in order to allow the robot to decide what has to be imitated or inferred when interacting with a human. The proposed approach is based on a well-known method from natural language processing: namely, Bag of Words (BoW). This method is applied to large databases in order to obtain a trained model. Although BoW is a machine learning technique used in various fields of research, in action classification for robot learning it is far from accurate. Moreover, it focuses on the classification of objects and gestures rather than actions. Thus, in this thesis we show that the method is suitable, in action classification scenarios, for merging information from different sources or different trials. This thesis makes three contributions: (1) it proposes a general method for dealing with action recognition and thus contributes to imitation learning; (2) the methodology can be applied to large databases which include different modes of action capture; and (3) the method is applied in a real international innovation project called Vinbot.
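
As a rough, self-contained illustration of the BoW pipeline the thesis builds on (not the thesis code), the following Python sketch clusters local descriptors into a visual vocabulary, encodes each sequence as a word histogram, and trains a classifier on the histograms; the toy data and all names are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import LinearSVC

# Toy stand-in: each "recording" is a set of local motion descriptors (rows).
rng = np.random.default_rng(0)
recordings = [rng.normal(c, 1.0, size=(40, 16)) for c in (0.0, 0.0, 3.0, 3.0)]
labels = [0, 0, 1, 1]

# 1. Build a visual vocabulary by clustering all local descriptors.
vocab = KMeans(n_clusters=8, n_init=10, random_state=0)
vocab.fit(np.vstack(recordings))

# 2. Represent each recording as a normalised histogram of visual words.
def bow_histogram(descriptors):
    words = vocab.predict(descriptors)
    hist = np.bincount(words, minlength=vocab.n_clusters).astype(float)
    return hist / hist.sum()

X = np.array([bow_histogram(r) for r in recordings])

# 3. Train any classifier on the histograms (a linear SVM here).
clf = LinearSVC().fit(X, labels)
print(clf.predict(X))
```

Because histograms from different sources or trials live in the same vocabulary space, they can simply be summed or concatenated before classification, which is the fusion property the thesis exploits.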

    Robust visual tracking via speedup multiple kernel ridge regression

    Most tracking methods attempt to build feature spaces to represent the appearance of a target. However, limited by the complex structure of the feature distribution, feature spaces constructed in a linear manner cannot characterize the nonlinear structure well. We propose an appearance model based on kernel ridge regression for visual tracking. Dense sampling is performed around the target image patches to collect the training samples. To obtain a kernel space well suited to describing the target appearance, multiple kernel learning is introduced into the selection of kernels. Under this framework, instead of a single kernel, a linear combination of kernels is learned from the training samples to create a kernel space. Exploiting the circulant property of the kernel matrix, a fast iterative interpolation algorithm is developed to seek the coefficients assigned to these kernels so as to give an optimal combination. After the regression function is learned, all gathered candidate image patches are taken as the input of the function, and the candidate with the maximal response is regarded as the object image patch. Extensive experimental results demonstrate that the proposed method outperforms other state-of-the-art tracking methods.
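
The speedup rests on the circulant structure of the kernel matrix. A minimal single-kernel sketch of that trick, i.e. the standard Fourier-domain kernel ridge regression used in correlation-filter trackers rather than the paper's multikernel algorithm, looks like this (1-D signals for brevity):

```python
import numpy as np

def gaussian_corr(x, z, sigma):
    """Gaussian kernel evaluated between x and every cyclic shift of z."""
    c = np.fft.ifft(np.fft.fft(z) * np.conj(np.fft.fft(x))).real
    return np.exp(-(np.dot(x, x) + np.dot(z, z) - 2 * c) / (sigma**2 * x.size))

def train(x, y, sigma=0.5, lam=1e-4):
    """Closed-form kernel ridge regression in O(n log n) via the FFT.

    With training samples given by all cyclic shifts of x, the kernel
    matrix is circulant, so (K + lam*I)^{-1} y diagonalises in Fourier space.
    """
    return np.fft.fft(y) / (np.fft.fft(gaussian_corr(x, x, sigma)) + lam)

def respond(alpha_f, x, z, sigma=0.5):
    """Regression response at every cyclic shift of candidate z."""
    return np.fft.ifft(alpha_f * np.fft.fft(gaussian_corr(x, z, sigma))).real

# Toy usage: the response peak should recover a known shift of the target.
x = np.random.randn(64)
alpha_f = train(x, np.exp(-np.arange(64.0)**2 / 4.0))  # label peaks at shift 0
print(np.argmax(respond(alpha_f, x, np.roll(x, 5))))   # expected: ~5
```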

    Human Action Recognition via Fused Kinematic Structure and Surface Representation

    Human action recognition from visual data has remained a challenging problem in the field of computer vision and pattern recognition. This dissertation introduces a new methodology for human action recognition using motion features extracted from kinematic structure, and shape features extracted from a surface representation of the human body. Motion features provide sufficient information about human movement, whereas shape features describe the structure of the silhouette. These features are fused at the kernel level using the Multikernel Learning (MKL) technique to enhance the overall performance of human action recognition. There are advantages in using multiple types of features for human action recognition, especially if the features are complementary to each other (e.g. kinematic/motion features and shape features). For instance, challenging problems such as inter-class similarity among actions and performance variation, which cannot be resolved easily using a single type of feature, can be handled by fusing multiple types of features. This dissertation presents a new method for representing the human body surface provided by a depth map (3-D) using a spherical harmonics representation. The advantage of the spherical harmonics representation is that it encodes the whole body surface as a finite series of spherical harmonics coefficients. Furthermore, these series can be used to describe the pose of the body using the phase information encoded inside the coefficients. Another method for detecting/tracking distal limb segments using the kinematic structure is developed. The advantage of using the distal limb segments is to extract discriminative features that provide sufficient and compact information to recognize human actions. Our experimental results show that the aforementioned methods for human action description are complementary to each other. Hence, combining both features can enhance the robustness of action recognition. In this context, a framework to fuse multiple features using the MKL technique is developed. The experimental results show that this framework is promising for incorporating multiple features from different domains for automated recognition of human actions.
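
As one concrete piece of such a pipeline, a function on the sphere (e.g. a radial surface description) can be projected onto spherical harmonics numerically. The sketch below uses SciPy's sph_harm and a simple midpoint quadrature; the grid sizes and names are illustrative assumptions, not the dissertation's code.

```python
import numpy as np
from scipy.special import sph_harm

def sph_harm_coeffs(f, l_max, n_theta=64, n_phi=128):
    """Project f(theta, phi) on the unit sphere onto spherical harmonics.

    theta = polar angle in [0, pi], phi = azimuth in [0, 2*pi].
    Returns {(l, m): c_lm} with c_lm = integral of f * conj(Y_lm) dOmega,
    approximated by midpoint quadrature on a regular angular grid.
    """
    theta = (np.arange(n_theta) + 0.5) * np.pi / n_theta
    phi = (np.arange(n_phi) + 0.5) * 2 * np.pi / n_phi
    T, P = np.meshgrid(theta, phi, indexing="ij")
    vals = f(T, P)
    dA = np.sin(T) * (np.pi / n_theta) * (2 * np.pi / n_phi)  # area element
    coeffs = {}
    for l in range(l_max + 1):
        for m in range(-l, l + 1):
            # SciPy argument order: sph_harm(m, l, azimuth, polar).
            Y = sph_harm(m, l, P, T)
            coeffs[(l, m)] = np.sum(vals * np.conj(Y) * dA)
    return coeffs

# Sanity check: f = Y_2^1 should give c_{2,1} ~ 1 and ~0 elsewhere.
c = sph_harm_coeffs(lambda t, p: sph_harm(1, 2, p, t), l_max=2)
print(abs(c[(2, 1)]))
```

The per-degree energies of the coefficients are rotation-invariant, while the phases retain pose information, which is the property the dissertation exploits.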

    Event-based Vision: A Survey

    Event cameras are bio-inspired sensors that differ from conventional frame cameras: instead of capturing images at a fixed rate, they asynchronously measure per-pixel brightness changes, and output a stream of events that encode the time, location and sign of the brightness changes. Event cameras offer attractive properties compared to traditional cameras: high temporal resolution (on the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low power consumption, and high pixel bandwidth (on the order of kHz) resulting in reduced motion blur. Hence, event cameras have large potential for robotics and computer vision in scenarios that challenge traditional cameras, such as those demanding low latency, high speed, and high dynamic range. However, novel methods are required to process the unconventional output of these sensors in order to unlock their potential. This paper provides a comprehensive overview of the emerging field of event-based vision, with a focus on the applications and the algorithms developed to unlock the outstanding properties of event cameras. We present event cameras from their working principle, the actual sensors that are available, and the tasks they have been used for, from low-level vision (feature detection and tracking, optic flow, etc.) to high-level vision (reconstruction, segmentation, recognition). We also discuss the techniques developed to process events, including learning-based techniques, as well as specialized processors for these novel sensors, such as spiking neural networks. Additionally, we highlight the challenges that remain to be tackled and the opportunities that lie ahead in the search for a more efficient, bio-inspired way for machines to perceive and interact with the world.
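
For readers new to the data format: an event is just a tuple (t, x, y, polarity), and many algorithms start by aggregating a time slice of the stream into a frame-like array. A minimal Python sketch with toy events and a hypothetical helper name:

```python
import numpy as np

# An event is (t [us], x, y, polarity in {-1, +1}); sensors emit these
# asynchronously, pixel by pixel, rather than as whole frames.
events = np.array([
    (10, 3, 4, +1),
    (25, 3, 5, -1),
    (40, 7, 2, +1),
], dtype=[("t", "i8"), ("x", "i4"), ("y", "i4"), ("p", "i1")])

def accumulate(events, shape, t0, t1):
    """Sum event polarities per pixel over the time window [t0, t1)."""
    frame = np.zeros(shape, dtype=np.int32)
    win = events[(events["t"] >= t0) & (events["t"] < t1)]
    np.add.at(frame, (win["y"], win["x"]), win["p"])
    return frame

frame = accumulate(events, shape=(8, 8), t0=0, t1=50)
```

The survey's point is precisely that such frame reconstructions discard the asynchrony and microsecond timing of the stream, which motivates event-native algorithms.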

    Identification of infants at high-risk for autism spectrum disorder using multiparameter multiscale white matter connectivity networks

    Autism spectrum disorder (ASD) covers a wide range of disabilities that cause life-long cognitive impairment and social, communication, and behavioral challenges. Early diagnosis and medical intervention are important for improving the quality of life of autistic patients. However, in current practice, diagnosis often has to be delayed until the behavioral symptoms become evident during childhood. In this study, we demonstrate the feasibility of using machine learning techniques for identifying high-risk ASD infants as early as six months after birth. This is based on the observation that ASD-induced abnormalities in white matter (WM) tracts and whole-brain connectivity have already started to appear within 24 months after birth. In particular, we propose a novel multikernel support vector machine classification framework using connectivity features gathered from WM connectivity networks, which are generated via multiscale regions of interest (ROIs) and multiple diffusion statistics such as fractional anisotropy, mean diffusivity, and average fiber length. Our proposed framework achieves an accuracy of 76% and an area under the receiver operating characteristic curve (AUC) of 0.80, compared with an accuracy of 70% and an AUC of 0.70 for the best single-parameter, single-scale network. The improvement in accuracy is mainly due to the complementary information provided by multiparameter multiscale networks. In addition, our framework provides potential imaging connectomic markers and an objective means for early ASD diagnosis.
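
A minimal sketch of the multikernel idea, using scikit-learn with precomputed kernels and a grid search over a single mixing weight in place of a full MKL solver, could look as follows; the data and feature names are placeholders, not the study's pipeline.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
# Toy stand-ins for connectivity features from two diffusion parameters
# (e.g. FA- and MD-based networks); rows are subjects.
X_fa, X_md = rng.normal(size=(60, 30)), rng.normal(size=(60, 30))
y = rng.integers(0, 2, 60)

# One base kernel per feature set; MKL learns (here: searches) a weighting.
K_fa, K_md = rbf_kernel(X_fa), rbf_kernel(X_md)

best = max(
    ((b, cross_val_score(SVC(kernel="precomputed"),
                         b * K_fa + (1 - b) * K_md, y, cv=5).mean())
     for b in np.linspace(0, 1, 11)),
    key=lambda t: t[1],
)
print(f"best weight beta={best[0]:.1f}, CV accuracy={best[1]:.2f}")
```

A convex combination of valid kernels is itself a valid kernel, so the combined matrix can be fed to any kernel classifier unchanged; real MKL solvers optimise the weights jointly with the classifier instead of by grid search.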

    Event-based object detection and tracking for space situational awareness

    In this work, we present an optical space imaging dataset recorded with a range of event-based neuromorphic vision sensors. The unique method of operation of event-based sensors makes them ideal for space situational awareness (SSA) applications due to the sparseness inherent in space imaging data. These sensors offer significantly lower bandwidth and power requirements, making them particularly well suited for use in remote locations and on space-based platforms. We present the first publicly accessible event-based space imaging dataset, including recordings from sensors by multiple providers, greatly lowering the barrier to entry for other researchers given the scarcity of such sensors and the expertise required to operate them for SSA applications. The dataset contains both daytime and nighttime recordings, including simultaneous co-collections from different event-based sensors. Recorded at a remote site, and containing 572 labeled targets with a wide range of sizes, trajectories, and signal-to-noise ratios, this real-world event-based dataset represents a challenging detection and tracking task that is not readily solved using previously proposed methods. We propose a highly optimized and robust feature-based detection and tracking method, designed specifically for SSA applications and implemented via a cascade of increasingly selective event filters. These filters rapidly isolate events associated with space objects while maintaining the high temporal resolution of the sensors. The results of this simple yet highly optimized algorithm on the space imaging dataset demonstrate robust high-speed event-based detection and tracking, which can readily be implemented on sensor platforms in space as well as in terrestrial environments.
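
The paper's cascade is not reproduced here, but the pattern of increasingly selective event filters can be sketched: a cheap per-pixel refractory pass followed by a costlier spatio-temporal support test. All thresholds and names below are illustrative, and `events` is a structured array of (t, x, y, p) records as in the earlier sketch.

```python
import numpy as np

def refractory_filter(events, dt):
    """Drop events firing within dt of the previous event at the same pixel."""
    last, keep = {}, []
    for i, e in enumerate(events):
        key = (e["x"], e["y"])
        if key not in last or e["t"] - last[key] >= dt:
            keep.append(i)
        last[key] = e["t"]
    return events[keep]

def support_filter(events, dt, radius=1):
    """Keep events with a recent neighbour within `radius` pixels.

    Isolated events with no spatio-temporal support are treated as noise.
    """
    recent, keep = {}, []
    for i, e in enumerate(events):
        supported = any(
            e["t"] - recent.get((e["x"] + dx, e["y"] + dy), -np.inf) <= dt
            for dx in range(-radius, radius + 1)
            for dy in range(-radius, radius + 1)
            if (dx, dy) != (0, 0)
        )
        if supported:
            keep.append(i)
        recent[(e["x"], e["y"])] = e["t"]
    return events[keep]

# Cascade: run the cheap refractory pass first, then the costlier test,
# so each stage sees fewer events than the last.
# filtered = support_filter(refractory_filter(events, dt=1_000), dt=5_000)
```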