Sequences of regressions and their independences

Author: Kayvan Sadeghi
Nanny Wermuth
Publication date: 01/01/2011

Ordered sequences of univariate or multivariate regressions provide statistical models for analysing data from randomized, possibly sequential interventions, from cohort or multi-wave panel studies, but also from cross-sectional or retrospective studies. Conditional independences are captured by what we name regression graphs, provided the generated distribution shares some properties with a joint Gaussian distribution. Regression graphs extend purely directed, acyclic graphs by two types of undirected graph, one type for components of joint responses and the other for components of the context vector variable. We review the special features and the history of regression graphs, derive criteria to read all implied independences of a regression graph and prove criteria for Markov equivalence, that is, to judge whether two different graphs imply the same set of independence statements. Knowledge of Markov equivalence provides alternative interpretations of a given sequence of regressions, is essential for machine learning strategies and permits the use of the simple graphical criteria of regression graphs on graphs for which the corresponding criteria are in general more complex. Under the known conditions that a Markov equivalent directed acyclic graph exists for any given regression graph, we give a polynomial time algorithm to find one such graph.

Comment: 43 pages with 17 figures. The manuscript is to appear as an invited discussion paper in the journal TES

arXiv.org e-Print Archive

CiteSeerX

Chalmers Research

Chalmers Publication Library

Tensor Networks for Dimensionality Reduction and Large-Scale Optimizations. Part 2 Applications and Future Perspectives

Author: Cichocki A.
Lee N.
Mandic D.
Oseledets I. V.
Phan A-H.
Sugiyama M.
Zhao Q.
Publication venue: 'Now Publishers'
Publication date: 01/01/2017

Part 2 of this monograph builds on the introduction to tensor networks and their operations presented in Part 1. It focuses on tensor network models for super-compressed higher-order representation of data/parameters and related cost functions, while providing an outline of their applications in machine learning and data analytics. A particular emphasis is on the tensor train (TT) and Hierarchical Tucker (HT) decompositions, and their physically meaningful interpretations which reflect the scalability of the tensor network approach. Through a graphical approach, we also elucidate how, by virtue of the underlying low-rank tensor approximations and sophisticated contractions of core tensors, tensor networks have the ability to perform distributed computations on otherwise prohibitively large volumes of data/parameters, thereby alleviating or even eliminating the curse of dimensionality. The usefulness of this concept is illustrated over a number of applied areas, including generalized regression and classification (support tensor machines, canonical correlation analysis, higher order partial least squares), generalized eigenvalue decomposition, Riemannian optimization, and the optimization of deep neural networks. Part 1 and Part 2 of this work can be used either as stand-alone separate texts, or indeed as a conjoint comprehensive review of the exciting field of low-rank tensor networks and tensor decompositions.

Comment: 232 page
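The abstract's claim about alleviating the curse of dimensionality can be made concrete with a storage count. The sketch below is not taken from the monograph. It assumes the standard tensor train layout, in which the n-th core has shape R(n-1) x I(n) x R(n) with boundary ranks equal to 1. The function names, mode sizes, and ranks are illustrative choices.

```python
from math import prod


def full_tensor_entries(mode_sizes):
    """Number of entries in a dense tensor with the given mode sizes."""
    return prod(mode_sizes)


def tt_entries(mode_sizes, ranks):
    """Number of entries stored by a tensor train (TT) representation.

    Assumes the usual TT layout: core n has shape
    ranks[n] x mode_sizes[n] x ranks[n + 1], with ranks[0] == ranks[-1] == 1.
    """
    if len(ranks) != len(mode_sizes) + 1:
        raise ValueError("ranks must have one more entry than mode_sizes")
    if ranks[0] != 1 or ranks[-1] != 1:
        raise ValueError("boundary TT ranks must both be 1")
    return sum(
        ranks[n] * mode_sizes[n] * ranks[n + 1]
        for n in range(len(mode_sizes))
    )


# Hypothetical example: an order-20 tensor, every mode of size 10,
# all internal TT ranks equal to 5.
mode_sizes = [10] * 20
ranks = [1] + [5] * 19 + [1]

dense = full_tensor_entries(mode_sizes)   # 10**20 entries
compressed = tt_entries(mode_sizes, ranks)  # 4600 entries
print(dense, compressed)
```

With bounded ranks, storage grows linearly in the tensor order rather than exponentially. The representation is exact or accurate only when the data actually admit a low-rank TT approximation.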

arXiv.org e-Print Archive

Crossref

Tensor Networks for Dimensionality Reduction and Large-Scale Optimizations. Part 2 Applications and Future Perspectives

Author: Cichocki A.
Phan A-H.
Zhao Q.
Lee N.
Oseledets I. V.
Sugiyama M.
Mandic D.
Publication date: 01/01/2017

arXiv.org e-Print Archive

Crossref

FigShare

Predicting soil organic carbon in a small farm system using in situ spectral measurements and the random forest regression

Author: Bangelesa Freddy Fefe
Publication date: 01/01/2017

A research report submitted to the Faculty of Science, University of the Witwatersrand, Johannesburg, in partial fulfillment of the requirements for the degree of Master of Science (Geographical Information Sciences and Remote Sensing), Johannesburg, 2017.

Soil organic carbon is considered the most decisive indicator of soil fertility. The purpose of this research was to predict soil organic carbon in the Mokhotlong region of eastern Lesotho using in situ spectral measurements and random forest regression. Soil reflectance spectra were acquired with a portable field spectrometer. The performance of random forest regression was assessed by comparing it with one of the most popular models in spectroscopy, partial least squares regression. Laboratory spectroscopy measurements of the soil samples were analysed to assess the accuracy of the in situ spectroscopy-based models. The effect of the Savitzky-Golay first derivative in improving partial least squares regression and random forest regression for both types of spectral data was also assessed. The results indicated that random forest regression could accurately predict soil organic carbon content on an independent dataset using in situ spectroscopy data (RPD = 3.77, Rp2 = 0.88, RMSEP = 0.64%). The overall best predictive model was achieved with the derivative laboratory spectral data using random forest with the optimum number of key wavelengths (RPD = 3.77, Rp2 = 0.88, RMSEP = 0.64%). In contrast, partial least squares regression was likely to overfit the calibration dataset. Important wavelengths for predicting soil organic carbon content were localised around the visible range (400-700 nm). An implication of this research is that soil organic carbon can be accurately estimated using derivative in situ spectroscopy measurements and random forest regression with key wavelengths.

MT 201
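The abstract reports RPD, Rp2, and RMSEP without defining them. The sketch below shows how these validation statistics are commonly computed in soil spectroscopy. It assumes RMSEP is the root mean squared error on the independent prediction set, RPD is the sample standard deviation of the observed values divided by RMSEP, and Rp2 is the coefficient of determination on the prediction set. The report may use slightly different conventions (for example, a squared correlation for Rp2). The sample values are invented for illustration and are not data from the study.

```python
import math
import statistics


def rmsep(observed, predicted):
    """Root mean squared error of prediction on a validation set."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(observed))


def rpd(observed, predicted):
    """Ratio of performance to deviation: SD of observed values / RMSEP.

    Assumes the sample (n - 1) standard deviation; some authors use
    the population form instead.
    """
    return statistics.stdev(observed) / rmsep(observed, predicted)


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_observed = statistics.fmean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Invented validation values (percent soil organic carbon), illustration only.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.5, 2.0, 2.5, 4.0, 5.0]

print(f"RMSEP = {rmsep(observed, predicted):.3f}")
print(f"RPD   = {rpd(observed, predicted):.2f}")
print(f"Rp2   = {r_squared(observed, predicted):.2f}")
```

A larger RPD means the prediction error is small relative to the natural spread of the measured values, which is why it is reported alongside RMSEP.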

Wits Institutional Repository on DSPACE

Chemometrics for ion mobility spectrometry data:Recent advances and future prospects

Author: Buydens Lutgarde M C
Davies Antony N.
Szymańska Ewa
Publication venue: 'Royal Society of Chemistry (RSC)'
Publication date: 01/01/2016

Historically, advances in the field of ion mobility spectrometry have been hindered by the variation in measured signals between instruments developed by different research laboratories or manufacturers. This has triggered the development and application of chemometric techniques able to reveal and analyze the valuable information content of ion mobility spectra. Recent advances in multidimensional coupling of ion mobility spectrometry to chromatography and mass spectrometry have created new, unique challenges for data processing, yielding high-dimensional, megavariate datasets. In this paper, a complete overview of available chemometric techniques used in the analysis of ion mobility spectrometry data is given. We describe the current state of the art of ion mobility spectrometry data analysis, comprising datasets with different complexities and two different scopes of data analysis, i.e. targeted and non-targeted analyte analyses. Two main steps of data analysis are considered: data preprocessing and pattern recognition. A detailed description of recent advances in chemometric techniques is provided for these steps, together with a list of interesting applications. We demonstrate that chemometric techniques have contributed significantly to the recent and great expansion of ion mobility spectrometry technology into different application fields. We conclude that well-thought-out, comprehensive data analysis strategies are currently emerging, including several chemometric techniques and addressing different data challenges. In our opinion, this trend will continue in the near future, stimulating developments in ion mobility spectrometry instrumentation even further.

University of South Wales Research Explorer

Radboud Repository

Application of temporal streamflow descriptors in hydrologic model parameter estimation

Author: Gupta HV
Imam B
Shamir E
Sorooshian S
Publication venue: eScholarship, University of California
Publication date: 01/06/2005

This paper presents a parameter estimation approach based on hydrograph descriptors that capture dominant streamflow characteristics at three timescales (monthly, yearly, and record extent). The scheme, entitled hydrograph descriptors multitemporal sensitivity analyses (HYDMUS), yields an ensemble of model simulations generated from a reduced parameter space, based on a set of streamflow descriptors that emphasize the timescale dynamics of the streamflow record. In this procedure the posterior distributions of model parameters derived at coarser timescales are used to sample model parameters for the next finer timescale. The procedure was used to estimate the parameters of the Sacramento soil moisture accounting model (SAC-SMA) for the Leaf River, Mississippi. The results indicated that in addition to a significant reduction in the range of parameter uncertainty, HYDMUS improved parameter identifiability for all 13 of the model parameters. The performance of the procedure was compared to four previous calibration studies on the same watershed. Although our application of HYDMUS did not explicitly consider the error at each simulation time step during the calibration process, the model performance was, in some important respects, found to be better than in previous deterministic studies. Copyright 2005 by the American Geophysical Union.

eScholarship - University of California