Hyperparameter Importance Across Datasets
With the advent of automated machine learning, automated hyperparameter
optimization methods are by now routinely used in data mining. However, this
progress is not yet matched by equal progress on automatic analyses that yield
information beyond performance-optimizing hyperparameter settings. In this
work, we aim to answer the following two questions: Given an algorithm, what
are generally its most important hyperparameters, and what are typically good
values for these? We present methodology and a framework to answer these
questions based on meta-learning across many datasets. We apply this
methodology using the experimental meta-data available on OpenML to determine
the most important hyperparameters of support vector machines, random forests
and Adaboost, and to infer priors for all their hyperparameters. The results,
obtained fully automatically, provide a quantitative basis to focus efforts in
both manual algorithm design and in automated hyperparameter optimization. The
conducted experiments confirm that the hyperparameters selected by the proposed
method are indeed the most important ones and that the obtained priors also
lead to statistically significant improvements in hyperparameter optimization.
Comment: © 2018. Copyright is held by the owner/author(s).
Publication rights licensed to ACM. This is the author's version of the work.
It is posted here for your personal use, not for redistribution. The
definitive Version of Record was published in Proceedings of the 24th ACM
SIGKDD International Conference on Knowledge Discovery & Data Mining
Spatio-temporal statistical methods in environmental and biometrical problems
This is the editorial letter for the Special Issue dedicated to the VIII International Workshop on Spatio-temporal Modelling (METMAVIII), which took place in Valencia (Spain) from 1 to 3 June 2016, and to the second Galician-Portuguese meeting of Biometry, with applications to Health Sciences, Ecology and Environmental Sciences (BIOAPP2016), held in Santiago de Compostela (Spain) from 30 June to 2 July 2016. This special issue summarises and discusses selected peer-reviewed contributions on spatial and spatio-temporal statistical methodologies, comprising both new methodological approaches and a wide range of applications to environmental and biometrical problems. Point processes, lattice data and geostatistical methods are covered. These methods are illustrated with statistical analyses of animal or plant species in ecological studies, seismic data, temperatures and monthly precipitation, daily ozone concentration values, air pollution data, breast cancer incidence rates, mussels, wildfires, pore structures in pharmaceutical coatings, hake recruitment and cancer mortality data.
Multivariate and repeated measures (MRM): A new toolbox for dependent and multimodal group-level neuroimaging data.
Repeated measurements and multimodal data are common in neuroimaging research. Despite this, conventional approaches to group-level analysis ignore these repeated measurements in favour of multiple between-subject models using contrasts of interest. This approach has a number of drawbacks, as certain designs and comparisons of interest are either not possible or complex to implement. Unfortunately, even when attempting to analyse group-level data within a repeated-measures framework, the methods implemented in popular software packages make potentially unrealistic assumptions about the covariance structure across the brain. In this paper, we describe how this issue can be addressed in a simple and efficient manner using the multivariate form of the familiar general linear model (GLM), as implemented in a new MATLAB toolbox. This multivariate framework is discussed, paying particular attention to methods of inference by permutation. Comparisons with existing approaches and software packages for dependent group-level neuroimaging data are made. We also demonstrate how this method is easily adapted for dependency at the group level when multiple modalities of imaging are collected from the same individuals. Follow-up of these multimodal models using linear discriminant functions (LDA) is also discussed, with applications to future studies wishing to integrate multiple scanning techniques into investigating populations of interest.
This work was supported by an MRC Centenary Early Career Award (MR/J500410/1). The example datasets were collected using support from an MRC DTP studentship and an MRC grant (G0900593).
This is the author accepted manuscript. The final version is available from Elsevier via http://dx.doi.org/10.1016/j.neuroimage.2016.02.05
Shape Outlier Detection and Visualization for Functional Data: the Outliergram
We propose a new method to visualize and detect shape outliers in samples of
curves. In functional data analysis we observe curves defined over a given real
interval and shape outliers are those curves that exhibit a different shape
from the rest of the sample. Whereas magnitude outliers, that is, curves that
exhibit atypically high or low values at some points or across the whole
interval, are in general easy to identify, shape outliers are often masked
among the rest of the curves and thus difficult to detect. In this article we
exploit the relation between two depths for functional data to help visualize
curves in terms of shape and to develop an algorithm for shape outlier
detection. We illustrate the use of the visualization tool, the outliergram,
through several examples and assess the performance of the algorithm in a
simulation study. We apply them to the detection of outliers in a children's
growth dataset in which the girls' sample is contaminated with boys' curves and
vice versa.
Comment: 27 pages, 5 figures
Disentangling causal webs in the brain using functional Magnetic Resonance Imaging: A review of current approaches
In the past two decades, functional Magnetic Resonance Imaging has been used
to relate neuronal network activity to cognitive processing and behaviour.
Recently this approach has been augmented by algorithms that allow us to infer
causal links between component populations of neuronal networks. Multiple
inference procedures have been proposed to approach this research question but
so far, each method has limitations when it comes to establishing whole-brain
connectivity patterns. In this work, we discuss eight ways to infer causality
in fMRI research: Bayesian Nets, Dynamical Causal Modelling, Granger Causality,
Likelihood Ratios, LiNGAM, Patel's Tau, Structural Equation Modelling, and
Transfer Entropy. We finish by formulating some recommendations for
future directions in this area.
Multiomics modeling of the immunome, transcriptome, microbiome, proteome and metabolome adaptations during human pregnancy.
Motivation: Multiple biological clocks govern a healthy pregnancy. These biological mechanisms produce immunologic, metabolomic, proteomic, genomic and microbiomic adaptations during the course of pregnancy. Modeling the chronology of these adaptations during full-term pregnancy provides the frameworks for future studies examining deviations implicated in pregnancy-related pathologies including preterm birth and preeclampsia.
Results: We performed a multiomics analysis of 51 samples from 17 pregnant women, delivering at term. The datasets included measurements from the immunome, transcriptome, microbiome, proteome and metabolome of samples obtained simultaneously from the same patients. Multivariate predictive modeling using the Elastic Net (EN) algorithm was used to measure the ability of each dataset to predict gestational age. Using stacked generalization, these datasets were combined into a single model. This model not only significantly increased predictive power by combining all datasets, but also revealed novel interactions between different biological modalities. Future work includes expansion of the cohort to preterm-enriched populations and in vivo analysis of immune-modulating interventions based on the mechanisms identified.
Availability and implementation: Datasets and scripts for reproduction of results are available through: https://nalab.stanford.edu/multiomics-pregnancy/.
Supplementary information: Supplementary data are available at Bioinformatics online.