17 research outputs found

    Recursive multikernel filters exploiting nonlinear temporal structure

    In kernel methods, temporal information on the data is commonly included by using time-delayed embeddings as inputs. Recently, an alternative formulation was proposed by defining a γ-filter explicitly in a reproducing kernel Hilbert space, giving rise to a complex model where multiple kernels operate on different temporal combinations of the input signal. In the original formulation, the kernels are then simply combined to obtain a single kernel matrix (for instance by averaging), which provides computational benefits but discards important information on the temporal structure of the signal. Inspired by works on multiple kernel learning, we overcome this drawback by considering the different kernels separately. We propose an efficient strategy to adaptively combine and select these kernels during the training phase. The resulting batch and online algorithms automatically learn to process highly nonlinear temporal information extracted from the input signal, which is implicitly encoded in the kernel values. We evaluate our proposal on several artificial and real tasks, showing that it can outperform classical approaches both in batch and online settings. S. Van Vaerenbergh is supported by the Spanish Ministry of Economy and Competitiveness (under project TEC2014-57402-JIN). S. Scardapane is supported in part by Italian MIUR, “Progetti di Ricerca di Rilevante Interesse Nazionale”, GAUChO project, under Grant 2015YPXH4W-004.
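
The time-delayed embedding that this abstract names as the common baseline can be illustrated with a minimal sketch. The function name `delay_embed` and the parameter `order` are hypothetical, chosen only for illustration; the sketch does not reproduce the paper's γ-filter or its multikernel algorithms.

```python
from typing import List, Sequence


def delay_embed(signal: Sequence[float], order: int) -> List[List[float]]:
    """Return time-delayed embedding vectors [x[n], x[n-1], ..., x[n-order+1]].

    `order` (the embedding length) is an illustrative parameter name.
    Only time steps with a full history are returned.
    """
    if order < 1:
        raise ValueError("order must be at least 1")
    return [
        [signal[n - d] for d in range(order)]
        for n in range(order - 1, len(signal))
    ]


# Each row is one input vector that a kernel method would receive.
embedded = delay_embed([1.0, 2.0, 3.0, 4.0, 5.0], order=3)
for row in embedded:
    print(row)
```

Each embedded vector stacks the current sample with its predecessors, so a kernel evaluated on two such vectors compares short windows of the signal rather than single samples.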

    Advances in Hyperspectral Image Classification Methods for Vegetation and Agricultural Cropland Studies

    Hyperspectral data are becoming more widely available via sensors on airborne and unmanned aerial vehicle (UAV) platforms, as well as proximal platforms. While space-based hyperspectral data continue to be limited in availability, multiple spaceborne Earth-observing missions on traditional platforms are scheduled for launch, and companies are experimenting with small satellites for constellations to observe the Earth, as well as for planetary missions. Land cover mapping via classification is one of the most important applications of hyperspectral remote sensing and will increase in significance as time series of imagery are more readily available. However, while the narrow bands of hyperspectral data provide new opportunities for chemistry-based modeling and mapping, challenges remain. Hyperspectral data are high dimensional, and many bands are highly correlated or irrelevant for a given classification problem. For supervised classification methods, the quantity of training data is typically limited relative to the dimension of the input space. The resulting Hughes phenomenon, often referred to as the curse of dimensionality, increases potential for unstable parameter estimates, overfitting, and poor generalization of classifiers. This is particularly problematic for parametric approaches such as Gaussian maximum likelihood-based classifiers that have been the backbone of pixel-based multispectral classification methods. This issue has motivated investigation of alternatives, including regularization of the class covariance matrices, ensembles of weak classifiers, development of feature selection and extraction methods, adoption of nonparametric classifiers, and exploration of methods to exploit unlabeled samples via semi-supervised and active learning. Data sets are also quite large, motivating computationally efficient algorithms and implementations.
This chapter provides an overview of the recent advances in classification methods for mapping vegetation using hyperspectral data. Three data sets that are used in the hyperspectral classification literature (e.g., Botswana Hyperion satellite data and AVIRIS airborne data over both Kennedy Space Center and Indian Pines) are described in Section 3.2 and used to illustrate methods described in the chapter. An additional high-resolution hyperspectral data set acquired by a SpecTIR sensor on an airborne platform over the Indian Pines area is included to exemplify the use of new deep learning approaches, and a multiplatform example of airborne hyperspectral data is provided to demonstrate transfer learning in hyperspectral image classification. Classical approaches for supervised and unsupervised feature selection and extraction are reviewed in Section 3.3. In particular, nonlinearities exhibited in hyperspectral imagery have motivated development of nonlinear feature extraction methods in manifold learning, which are outlined in Section 3.3.1.4. Spatial context is also important in classification of both natural vegetation with complex textural patterns and large agricultural fields with significant local variability within fields. Approaches to exploit spatial features at both the pixel level (e.g., co-occurrence-based texture and extended morphological attribute profiles [EMAPs]) and integration of segmentation approaches (e.g., HSeg) are discussed in this context in Section 3.3.2. Recently, classification methods that leverage nonparametric methods originating in the machine learning community have grown in popularity. An overview of both widely used and newly emerging approaches, including support vector machines (SVMs), Gaussian mixture models, and deep learning based on convolutional neural networks, is provided in Section 3.4.
Strategies to exploit unlabeled samples, including active learning and metric learning, which combine feature extraction and augmentation of the pool of training samples in an active learning framework, are outlined in Section 3.5. Integration of image segmentation with classification to accommodate spatial coherence typically observed in vegetation is also explored, including as an integrated active learning system. Exploitation of multisensor strategies for augmenting the pool of training samples is investigated via a transfer learning framework in Section 3.5.1.2. Finally, we look to the future, considering opportunities soon to be provided by new paradigms, as hyperspectral sensing is becoming common at multiple scales from ground-based and airborne autonomous vehicles to manned aircraft and space-based platforms.

    Human-robot interaction and computer-vision-based services for autonomous robots

    Imitation Learning (IL), or robot Programming by Demonstration (PbD), covers methods by which a robot learns new skills through human guidance and imitation. PbD takes its inspiration from the way humans learn new skills by imitation in order to develop methods by which new tasks can be transmitted to robots. This thesis is motivated by the generic question of “what to imitate?”, which concerns the problem of how to extract the essential features of a task. To this end, here we adopt an Action Recognition (AR) perspective in order to allow the robot to decide what has to be imitated or inferred when interacting with a human. The proposed approach is based on a well-known method from natural language processing: namely, Bag of Words (BoW). This method is applied to large databases in order to obtain a trained model. Although BoW is a machine learning technique that is used in various fields of research, in action classification for robot learning it is far from accurate. Moreover, it focuses on the classification of objects and gestures rather than actions. Thus, in this thesis we show that the method is suitable in action classification scenarios for merging information from different sources or different trials.
This thesis makes three contributions: (1) it proposes a general method for dealing with action recognition, thus contributing to imitation learning; (2) the methodology can be applied to large databases which include different modes of action capture; and (3) the method is applied specifically in a real international innovation project called Vinbot.
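
The Bag of Words representation mentioned in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: the thesis does not specify its feature descriptors, codebook construction or classifier here, so the hand-written `codebook`, the toy `features` and the function names are all hypothetical. In practice the codebook is typically learned by clustering training features.

```python
from collections import Counter
from typing import List, Sequence


def nearest_word(feature: Sequence[float],
                 codebook: Sequence[Sequence[float]]) -> int:
    """Index of the codebook entry closest to `feature` (squared Euclidean distance)."""
    distances = [
        sum((f - c) ** 2 for f, c in zip(feature, word)) for word in codebook
    ]
    return distances.index(min(distances))


def bow_histogram(features: Sequence[Sequence[float]],
                  codebook: Sequence[Sequence[float]]) -> List[float]:
    """Normalised histogram of codeword counts for one recording."""
    total = len(features)
    if total == 0:
        return [0.0] * len(codebook)
    counts = Counter(nearest_word(f, codebook) for f in features)
    return [counts[k] / total for k in range(len(codebook))]


# Hypothetical 2-D codebook and local features extracted from one action clip.
codebook = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
features = [[0.1, -0.1], [0.9, 1.2], [1.1, 0.8], [4.7, 5.2]]
histogram = bow_histogram(features, codebook)
print(histogram)
```

Because the histogram has a fixed length regardless of how many features a recording yields, histograms from different sources or trials can be compared or concatenated, which is one plausible reading of the fusion use case the abstract describes.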

    Event-based Vision: A Survey

    Event cameras are bio-inspired sensors that differ from conventional frame cameras: instead of capturing images at a fixed rate, they asynchronously measure per-pixel brightness changes and output a stream of events that encode the time, location and sign of the brightness changes. Event cameras offer attractive properties compared to traditional cameras: high temporal resolution (on the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low power consumption, and high pixel bandwidth (on the order of kHz) resulting in reduced motion blur. Hence, event cameras have a large potential for robotics and computer vision in scenarios that are challenging for traditional cameras, such as those requiring low latency, high speed, and high dynamic range. However, novel methods are required to process the unconventional output of these sensors in order to unlock their potential. This paper provides a comprehensive overview of the emerging field of event-based vision, with a focus on the applications and the algorithms developed to unlock the outstanding properties of event cameras. We present event cameras from their working principle, the actual sensors that are available and the tasks that they have been used for, from low-level vision (feature detection and tracking, optic flow, etc.) to high-level vision (reconstruction, segmentation, recognition). We also discuss the techniques developed to process events, including learning-based techniques, as well as specialized processors for these novel sensors, such as spiking neural networks. Additionally, we highlight the challenges that remain to be tackled and the opportunities that lie ahead in the search for a more efficient, bio-inspired way for machines to perceive and interact with the world.
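
The event stream described above (time, location and sign of each brightness change) can be pictured with a small sketch. The `Event` type, the field names and the accumulation window are assumptions made for illustration; summing polarities into a frame is only one simple way to view events and is not presented as the survey's recommended method.

```python
from typing import List, NamedTuple


class Event(NamedTuple):
    t: float       # timestamp in seconds
    x: int         # pixel column
    y: int         # pixel row
    polarity: int  # +1 for a brightness increase, -1 for a decrease


def accumulate(events: List[Event], width: int, height: int,
               t_start: float, t_end: float) -> List[List[int]]:
    """Sum event polarities per pixel over the window [t_start, t_end)."""
    frame = [[0] * width for _ in range(height)]
    for ev in events:
        if t_start <= ev.t < t_end:
            frame[ev.y][ev.x] += ev.polarity
    return frame


# A hypothetical stream from a 2x2 sensor, with microsecond-scale timestamps.
events = [
    Event(0.000010, 0, 0, +1),
    Event(0.000025, 1, 0, -1),
    Event(0.000040, 0, 0, +1),
    Event(0.002000, 1, 1, +1),  # falls outside the window used below
]
frame = accumulate(events, width=2, height=2, t_start=0.0, t_end=0.001)
print(frame)
```

Accumulating over a window discards the fine timing inside that window, which is part of why the abstract stresses that new methods are needed to exploit the sensor's temporal resolution.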

    Sparse Graphical Linear Dynamical Systems

    Time-series datasets are central in numerous fields of science and engineering, such as biomedicine, Earth observation, and network analysis. Extensive research exists on state-space models (SSMs), which are powerful mathematical tools that allow for probabilistic and interpretable learning on time series. Estimating the model parameters in SSMs is arguably one of the most complicated tasks, and the inclusion of prior knowledge is known both to ease the interpretation and to complicate the inferential tasks. Very recent works have attempted to incorporate a graphical perspective on some of those model parameters, but they present notable limitations that this work addresses. More generally, existing graphical modeling tools are designed to incorporate either static information, focusing on statistical dependencies among independent random variables (e.g., graphical Lasso approach), or dynamic information, emphasizing causal relationships among time series samples (e.g., graphical Granger approaches). However, there are no joint approaches combining static and dynamic graphical modeling within the context of SSMs. This work proposes a novel approach to fill this gap by introducing a joint graphical modeling framework that bridges the static graphical Lasso model and a causal-based graphical approach for the linear-Gaussian SSM. We present DGLASSO (Dynamic Graphical Lasso), a new inference method within this framework that implements an efficient block alternating majorization-minimization algorithm. The algorithm's convergence is established by relying on modern tools from nonlinear analysis. Experimental validation on synthetic and real weather variability data showcases the effectiveness of the proposed model and inference algorithm.

    Distributed adaptive signal processing for frequency estimation

    It is widely recognised that future smart grids will heavily rely upon intelligent communication and signal processing as enabling technologies for their operation. Traditional tools for power system analysis, which have been built from a circuit theory perspective, are a good match for balanced system conditions. However, the unprecedented changes that are imposed by smart grid requirements are pushing the limits of these old paradigms. To this end, we provide new signal processing perspectives to address some fundamental operations in power systems such as frequency estimation, regulation and fault detection. Firstly, motivated by our finding that any excursion from nominal power system conditions results in a degree of non-circularity in the measured variables, we cast the frequency estimation problem into a distributed estimation framework for noncircular complex random variables. Next, we derive the required next-generation widely linear frequency estimators, which incorporate the so-called augmented data statistics and cater for the noncircularity and widely linear nature of system functions. Uniquely, we also show that by virtue of augmented complex statistics, it is possible to treat frequency tracking and fault detection in a unified way. To address the ever-shortening time-scales in future frequency regulation tasks, the developed distributed widely linear frequency estimators are equipped with the ability to compensate for the fewer available temporal voltage data by exploiting spatial diversity in wide area measurements. This contribution is further supported by new physically meaningful theoretical results on the statistical behavior of distributed adaptive filters. Our approach avoids the current restrictive assumptions routinely employed to simplify the analysis by making use of the collaborative learning strategies of distributed agents.
The efficacy of the proposed distributed frequency estimators over standard strictly linear and stand-alone algorithms is illustrated in case studies over synthetic and real-world three-phase measurements. An overarching theme in this thesis is the elucidation of underlying commonalities between different methodologies employed in classical power engineering and signal processing. By revisiting fundamental power system ideas within the framework of augmented complex statistics, we provide a physically meaningful signal processing perspective of three-phase transforms and reveal their intimate connections with spatial discrete Fourier transform (DFT), optimal dimensionality reduction and frequency demodulation techniques. Moreover, under the widely linear framework, we also show that the two most widely used frequency estimators in the power grid are in fact special cases of frequency demodulation techniques. Finally, revisiting classic estimation problems in power engineering through the lens of non-circular complex estimation has made it possible to develop a new self-stabilising adaptive three-phase transformation which enables algorithms designed for balanced operating conditions to be straightforwardly implemented in a variety of real-world unbalanced operating conditions. This thesis therefore aims to help bridge the gap between signal processing and power communities by providing power system designers with advanced estimation algorithms and modern physically meaningful interpretations of key power engineering paradigms in order to match the dynamic and decentralised nature of the smart grid.
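
The claim that departures from nominal (balanced) conditions introduce non-circularity can be checked numerically with a small sketch. Two assumptions are made here that the abstract does not state: the three-phase transform is taken to be the Clarke (αβ) transform, and non-circularity is measured by the ratio |E[v²]| / E[|v|²]. The function names, the 50 Hz frequency and the sampling rate are illustrative choices, and the sketch does not implement the thesis's widely linear estimators.

```python
import math
from typing import List, Tuple


def clarke(va: float, vb: float, vc: float) -> complex:
    """Map three phase voltages to the complex voltage v_alpha + j*v_beta."""
    scale = math.sqrt(2.0 / 3.0)
    v_alpha = scale * (va - 0.5 * vb - 0.5 * vc)
    v_beta = scale * (math.sqrt(3.0) / 2.0) * (vb - vc)
    return complex(v_alpha, v_beta)


def three_phase(amplitudes: Tuple[float, float, float], freq_hz: float,
                fs_hz: float, n_samples: int) -> List[complex]:
    """Sample a three-phase system with the given per-phase amplitudes."""
    a, b, c = amplitudes
    shift = 2.0 * math.pi / 3.0
    samples = []
    for k in range(n_samples):
        theta = 2.0 * math.pi * freq_hz * k / fs_hz
        samples.append(clarke(a * math.cos(theta),
                              b * math.cos(theta - shift),
                              c * math.cos(theta + shift)))
    return samples


def noncircularity(v: List[complex]) -> float:
    """Sample estimate of |E[v^2]| / E[|v|^2]; near 0 for a second-order circular signal."""
    pseudo_cov = sum(z * z for z in v) / len(v)
    cov = sum(abs(z) ** 2 for z in v) / len(v)
    return abs(pseudo_cov) / cov


# 1000 samples at 5 kHz cover exactly ten periods of a 50 Hz system.
balanced = noncircularity(three_phase((1.0, 1.0, 1.0), 50.0, 5000.0, 1000))
unbalanced = noncircularity(three_phase((1.0, 0.5, 1.0), 50.0, 5000.0, 1000))
print(balanced, unbalanced)
```

In the balanced case the transformed voltage is a single rotating phasor, so the squared term averages out over whole periods. A sag on one phase adds a counter-rotating component, and the ratio becomes clearly nonzero, which is the kind of structure a widely linear model can exploit.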

    Probabilistic prediction of Alzheimer’s disease from multimodal image data with Gaussian processes

    Alzheimer’s disease, the most common form of dementia, is an extremely serious health problem, and one that will become even more so in the coming decades as the global population ages. This has led to a massive effort to develop both new treatments for the condition and new methods of diagnosis; in fact the two are intimately linked, as future treatments will depend on earlier diagnosis, which in turn requires the development of biomarkers that can be used to identify and track the disease. This is made possible by studies such as the Alzheimer’s Disease Neuroimaging Initiative, which provides previously unimaginable quantities of imaging and other data freely to researchers. It is the task of early diagnosis that this thesis focuses on. We do so by borrowing modern machine learning techniques and applying them to image data. In particular, we use Gaussian processes (GPs), a previously neglected tool, and show they can be used in place of the more widely used support vector machine (SVM). As combinations of complementary biomarkers have been shown to be more useful than the biomarkers are individually, we go on to show GPs can also be applied to integrate different types of image and non-image data, and thanks to their properties this improves results further than it does with SVMs. In the final two chapters, we also look at different ways to formulate both the prediction of conversion to Alzheimer’s disease as a machine learning problem and the way image data can be used to generate features for input to a machine learning algorithm. Both of these show how unconventional approaches may improve results. The result is an advance in the state-of-the-art for a very clinically important problem, which may prove useful in practice and show a direction of future research to further increase the usefulness of such methods.

    Neuromorphic perception for greenhouse technology using event-based sensors

    Event-Based Cameras (EBCs), unlike conventional cameras, feature independent pixels that asynchronously generate outputs upon detecting changes in their field of view. Short calculations are performed on each event to mimic the brain. The output is a sparse sequence of events with high temporal precision. Conventional computer vision algorithms do not leverage these properties. Thus a new paradigm has been devised. While event cameras are very efficient in representing sparse sequences of events with high temporal precision, many approaches are challenged in applications where a large amount of spatio-temporally rich information must be processed in real time. In reality, most tasks in everyday life take place in complex and uncontrollable environments, which require sophisticated models and intelligent reasoning. Typical hard problems in real-world scenes are detecting various non-uniform objects or navigating in an unknown and complex environment. In addition, colour perception is an essential property for distinguishing objects in natural scenes. Colour is a new aspect of event-based sensors, which work fundamentally differently from standard cameras, measuring per-pixel brightness changes per colour filter asynchronously rather than measuring “absolute” brightness at a constant rate. This thesis explores neuromorphic event-based processing methods for high-noise and cluttered environments with imbalanced classes. A fully event-driven processing pipeline was developed for agricultural applications to perform fruit detection and classification, in order to unlock the outstanding properties of event cameras. The nature of features in such data was explored, and methods to represent and detect features were demonstrated. A framework for detecting and classifying features was developed and evaluated on the N-MNIST and Dynamic Vision Sensor (DVS) gesture datasets.
The same network was evaluated on laboratory-recorded and real-world data with various internal variations for fruit detection, such as overlap and variation in size and appearance. In addition, a method to handle highly imbalanced data was developed. We examined the characteristics of spatio-temporal patterns for each colour filter to help expand our understanding of this novel data and explored their applications in classification tasks where colours were more relevant features than shapes and appearances. The results presented in this thesis demonstrate the potential and efficacy of event-based systems by demonstrating the applicability of colour event data and the viability of event-driven classification.

    Approximation contexts in addressing graph data structures

    While the application of machine learning algorithms to practical problems has expanded from fixed-size input data to sequence, tree or graph input data, the composition of learning systems has developed from a single model to integrated ones. Recent advances in graph-based learning algorithms include the SOMSD (Self Organizing Map for Structured Data), PMGraphSOM (Probability Measure Graph Self Organizing Map), GNN (Graph Neural Network) and GLSVM (Graph Laplacian Support Vector Machine). A main motivation of this thesis is to investigate whether such algorithms, individually, modified, or in various combinations, would provide better performance than the more traditional artificial neural networks or kernel machine methods on some challenging practical problems. More succinctly, this thesis seeks to answer the main research question: when, or under what conditions or contexts, could graph-based models be adjusted and tailored to be most efficacious in terms of predictive or classification performance on some challenging practical problems? A range of sub-questions emerges, including: how do we craft an effective neural learning system that integrates several graph-based and non-graph-based models? How can various graph-based and non-graph-based kernel machine algorithms be integrated? How can the capability of the integrated model be enhanced when working with challenging problems? And how can the long-term dependency issues that degrade the performance of layer-wise graph-based neural systems be tackled? This thesis will answer these questions. Recent research on multiple-staged learning models has demonstrated the efficacy of multiple layers of alternating unsupervised and supervised learning approaches. This underlies the very successful front-end feature extraction techniques in deep neural networks.
However, much exploration is still possible in investigating the number of layers required and the types of unsupervised or supervised learning models which should be used. Such issues have not been considered so far when the underlying input data structure is in the form of a graph. We will explore empirically the capabilities of models of increasing complexity: the combination of the unsupervised learning algorithms, SOM or PMGraphSOM, with or without a cascade connection with a multilayer perceptron, and with or without being followed by multiple layers of GNN. Such studies explore the effects of including or ignoring context. A parallel study involving kernel machines with or without graph inputs has also been conducted empirically.