Calibrated Multivariate Regression with Application to Neural Semantic Basis Discovery

Author: Liu Han
Wang Lie
Zhao Tuo
Publication date: 01/08/2015

We propose a calibrated multivariate regression method named CMR for fitting high dimensional multivariate regression models. Compared with existing methods, CMR calibrates regularization for each regression task with respect to its noise level so that it simultaneously attains improved finite-sample performance and tuning insensitiveness. Theoretically, we provide sufficient conditions under which CMR achieves the optimal rate of convergence in parameter estimation. Computationally, we propose an efficient smoothed proximal gradient algorithm with a worst-case numerical rate of convergence $\mathcal{O}(1/\epsilon)$, where $\epsilon$ is a pre-specified accuracy of the objective function value. We conduct thorough numerical simulations to illustrate that CMR consistently outperforms other high dimensional multivariate regression methods. We also apply CMR to solve a brain activity prediction problem and find that it is as competitive as a handcrafted model created by human experts. The R package camel implementing the proposed method is available on the Comprehensive R Archive Network: http://cran.r-project.org/web/packages/camel/. Comment: Journal of Machine Learning Research, 201
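The abstract does not state the objective function, so the following is only a minimal sketch of what calibrating each task to its own noise level could look like. It assumes a loss that sums the unsquared L2 norm of each task's residual column, plus a row-wise group penalty on the coefficient matrix; the function name `calibrated_objective`, the penalty choice, and the parameter `lam` are illustrative assumptions rather than the `camel` package's API.

```python
import numpy as np

def calibrated_objective(X, Y, B, lam):
    """Assumed CMR-style objective (illustrative, not taken from the abstract).

    X: (n, d) design matrix, Y: (n, m) responses for m tasks,
    B: (d, m) coefficient matrix, lam: regularization weight.
    """
    residual = Y - X @ B
    # Unsquared L2 norm per task (column): each task's loss is on the scale
    # of its own noise level, rather than its noise variance.
    loss = np.linalg.norm(residual, axis=0).sum()
    # Row-wise group penalty: encourages features shared across tasks.
    penalty = np.linalg.norm(B, axis=1).sum()
    return loss + lam * penalty
```

With unsquared per-task norms, a noisy task contributes to the loss in proportion to its noise level instead of its noise variance, which is one way a single `lam` can be made to suit all tasks at once.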

arXiv.org e-Print Archive

Princeton University Open Access Repository

On the role of pre and post-processing in environmental data mining

Author: Athanasiadis Ioannis
Comas Joaquim
Gibert Karina
Holmes Geoffrey
Izquierdo Joaquin
Sanchez-Marre Miquel
Publication venue: International Environmental Modelling and Software Society
Publication date: 01/01/2008

The quality of discovered knowledge depends heavily on data quality. Unfortunately, real data often contain noise, uncertainty, errors, redundancies, or even irrelevant information. The more complex the reality to be analyzed, the higher the risk of getting low-quality data. Knowledge Discovery from Databases (KDD) offers a global framework to prepare data in the right form to perform correct analyses. On the other hand, the quality of decisions taken upon KDD results depends not only on the quality of the results themselves, but also on the capacity of the system to communicate those results in an understandable form. Environmental systems are particularly complex, and environmental users particularly require clarity in their results. This paper provides some details about how this can be achieved and discusses the role of pre- and post-processing in the whole process of Knowledge Discovery in environmental systems.

Research Commons@Waikato

Quality data assessment and improvement in pre-processing pipeline to minimize impact of spurious signals in functional magnetic imaging (fMRI)

Author: NIGRI ANNA
Publication venue: country:Italy
Publication date: 01/01/2017

In recent years, the field of data quality assessment and signal denoising in functional magnetic resonance imaging (fMRI) has been evolving rapidly, and the identification and reduction of spurious signals within the pre-processing pipeline is one of the most discussed topics. In particular, subject motion or physiological signals, such as respiratory and/or cardiac pulsatility, have been shown to introduce false-positive activations in subsequent statistical analyses. Different measures have been introduced to evaluate the impact of motion-related artefacts, such as frame-wise displacement and the root mean square of movement parameters, along with different approaches to reduce these artefacts, such as linear regression of nuisance signals and scrubbing or censoring procedures. However, we identify two main drawbacks: i) the measures used to evaluate motion artefacts are based on user-dependent thresholds, and ii) each study describes and applies its own pre-processing pipeline. Few studies have analysed the effect of these different pipelines on subsequent analysis methods in task-based fMRI. The first aim of the study is to obtain a tool for assessing motion in fMRI data, based on auto-calibrated procedures, to detect outlier subjects and outlier volumes, tailored to each investigated sample to ensure homogeneity of the data with respect to motion. The second aim is to compare the impact of different pre-processing pipelines on task-based fMRI using GLM, based on recent advances in resting-state fMRI preprocessing pipelines. Different output measures based on signal variability and task strength were used for the assessment.
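For readers unfamiliar with the motion measures named in this abstract, here is a minimal sketch of frame-wise displacement and threshold-based scrubbing. It assumes the widely used convention of summing absolute frame-to-frame changes in six realignment parameters, with rotations converted to millimetres on a sphere; the 50 mm radius, the 0.5 mm threshold, and both function names are illustrative assumptions, not values or tools from the thesis.

```python
import numpy as np

def framewise_displacement(motion, radius_mm=50.0):
    """Frame-wise displacement per volume.

    motion: (T, 6) array; columns are 3 translations in mm followed by
    3 rotations in radians (an assumed layout).
    """
    motion = np.asarray(motion, dtype=float)
    # Absolute change of each parameter between consecutive volumes.
    deltas = np.abs(np.diff(motion, axis=0))
    # Convert rotational changes to arc length on a sphere (assumed radius).
    deltas[:, 3:] *= radius_mm
    # The first volume has no predecessor, so its displacement is set to 0.
    return np.concatenate([[0.0], deltas.sum(axis=1)])

def scrub_mask(fd, threshold_mm=0.5):
    """Boolean mask of volumes to keep; the threshold is user-chosen."""
    return np.asarray(fd) <= threshold_mm
```

The fixed `threshold_mm` argument is an instance of the user-dependent threshold that the abstract names as a drawback; an auto-calibrated procedure would instead derive the cut-off from the sample under study.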

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

Econometrics meets sentiment : an overview of methodology and applications

Author: Algaba Andres
Ardia David
Bluteau Keven
Borms Samuel
Boudt Kris
Publication venue: Wiley
Publication date: 01/01/2020

The advent of massive amounts of textual, audio, and visual data has spurred the development of econometric methodology to transform qualitative sentiment data into quantitative sentiment variables, and to use those variables in an econometric analysis of the relationships between sentiment and other variables. We survey this emerging research field and refer to it as sentometrics, which is a portmanteau of sentiment and econometrics. We provide a synthesis of the relevant methodological approaches, illustrate with empirical results, and discuss useful software

VU Research Portal

Crossref

Ghent University Academic Bibliography

Deep Neural Networks and Data for Automated Driving

Publication venue: Springer Science and Business Media LLC
Publication date: 13/07/2022

This open access book brings together the latest developments from industry and research on automated driving and artificial intelligence. Environment perception for highly automated driving heavily employs deep neural networks, facing many challenges. How much data do we need for training and testing? How to use synthetic data to save labeling costs for training? How do we increase robustness and decrease memory usage? For inevitably poor conditions: How do we know that the network is uncertain about its decisions? Can we understand a bit more about what actually happens inside neural networks? This leads to a very practical problem particularly for DNNs employed in automated driving: What are useful validation techniques and how about safety? This book unites the views from both academia and industry, where computer vision and machine learning meet environment perception for highly automated driving. Naturally, aspects of data, robustness, uncertainty quantification, and, last but not least, safety are at the core of it. This book is unique: In its first part, an extended survey of all the relevant aspects is provided. The second part contains the detailed technical elaboration of the various questions mentioned above

Directory of Open Access Books (DOAB)