Network modeling helps to tackle the complexity of drug-disease systems
From the (patho)physiological point of view, diseases can be considered emergent properties of living systems that stem from the complexity of those systems. Complex systems display some typical features, including emergent behavior and organization in successive hierarchical levels. Drug treatments add to this complexity, and for some years network models have been used to describe drug-disease systems and to make predictions about them with regard to several aspects of drug discovery. Here, we review some recent examples with the aim of illustrating how effective network science tools can be in addressing both tasks. We examine the use of bipartite networks, which lead to the important concept of the "disease module", as well as the introduction of more articulated models, such as multi-scale and multiplex networks, that can describe disease systems at increasing levels of organization. Examples of predictive models are then discussed, considering both those that rely purely on graph theory and those that integrate machine learning methods. A short account of both kinds of methodological applications is provided. Finally, we assess the present state of modeling complex drug-disease systems and highlight some open issues. This article is categorized under: Neurological Diseases > Computational Models; Infectious Diseases > Computational Models; Cardiovascular Diseases > Computational Models
A Literature Review of Fault Diagnosis Based on Ensemble Learning
The accuracy of fault diagnosis is an important indicator for ensuring the reliability of key equipment systems. Ensemble learning combines several weak learners into a stronger one and has achieved remarkable results in the field of fault diagnosis. This paper reviews recent research on ensemble learning from both technical and field-application perspectives. It surveys 87 journals from recent Web of Science and other academic resources, covering 209 papers in total. It summarizes 78 different ensemble-learning-based fault diagnosis methods, involving 18 public datasets and more than 20 different equipment systems. In detail, the paper summarizes the accuracy rates, fault classification types, fault datasets, data signals used, learners (traditional machine learning or deep-learning-based learners), and ensemble learning methods (bagging, boosting, stacking and other ensemble models) of these fault diagnosis models. The paper uses the accuracy of fault diagnosis as the main evaluation metric, supplemented by generalization and the ability to handle imbalanced data, to evaluate the performance of these ensemble learning methods. The discussion and evaluation of these methods provide valuable references for identifying and developing appropriate intelligent fault diagnosis models for various equipment. The paper also discusses the technical challenges, the lessons learned from the review, and future development directions in the field of ensemble-learning-based fault diagnosis and intelligent maintenance.
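As a rough illustration of bagging, one of the ensemble families named in the abstract above, the sketch below trains threshold "stumps" on bootstrap resamples of a one-dimensional signal feature and combines them by majority vote. It is not taken from any of the reviewed papers; the toy data, the labels (0 for healthy, 1 for faulty) and all function names are assumptions made for illustration.

```python
import random
from collections import Counter


def fit_stump(xs, ys):
    """Return (threshold, label_above) with the fewest training errors."""
    best = None
    for t in sorted(set(xs)):
        for label_above in (0, 1):
            errors = sum(
                (label_above if x > t else 1 - label_above) != y
                for x, y in zip(xs, ys)
            )
            if best is None or errors < best[0]:
                best = (errors, t, label_above)
    return best[1], best[2]


def predict_stump(stump, x):
    threshold, label_above = stump
    return label_above if x > threshold else 1 - label_above


def bagging_fit(xs, ys, n_learners=25, seed=0):
    """Train each weak learner on a bootstrap resample of the data."""
    rng = random.Random(seed)
    n = len(xs)
    stumps = []
    for _ in range(n_learners):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        stumps.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return stumps


def bagging_predict(stumps, x):
    """Combine the weak learners by majority vote."""
    votes = Counter(predict_stump(s, x) for s in stumps)
    return votes.most_common(1)[0][0]


# Toy data (assumed): a vibration amplitude of 1.0 or more counts as faulty.
amplitudes = [i / 10 for i in range(20)]
labels = [0 if a < 1.0 else 1 for a in amplitudes]
ensemble = bagging_fit(amplitudes, labels)
```

Calling `bagging_predict(ensemble, 0.2)` and `bagging_predict(ensemble, 1.8)` classifies a low and a high amplitude reading. Boosting and stacking differ in how the learners are trained and combined, and are not shown here.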
Auto-adaptive multi-scale Laplacian Pyramids for modeling non-uniform data
Kernel-based techniques have become a common way of describing the local and global relationships of data samples generated in real-world processes. In this research, we focus on a multi-scale kernel-based technique named Auto-adaptive Laplacian Pyramids (ALP). This method can be useful for function approximation and interpolation. ALP is an extension of the standard Laplacian Pyramids model that incorporates a modified Leave-One-Out Cross Validation procedure, which makes the method stable and automatic in terms of parameter selection at no extra cost. This paper introduces a new algorithm that extends ALP to fit datasets that are non-uniformly distributed. In particular, the optimal stopping criterion is point-dependent with respect to the local noise level and the sample rate. Experimental results on real datasets highlight the advantages of the proposed multi-scale technique for modeling and learning complex, high-dimensional data. They wish to thank Prof. Ronald R. Coifman for helpful remarks. They also gratefully acknowledge the use of the facilities of Centro de Computación Científica (CCC) at Universidad Autónoma de Madrid. Funding: This work was supported by Spanish grants of the Ministerio de Ciencia, Innovación y Universidades [grant numbers: TIN2013-42351-P, TIN2015-70308-REDT, TIN2016-76406-P]; project CASI-CAM-CM supported by Madri+d [grant number: S2013/ICE-2845]; project FACIL supported by Fundación BBVA (2016); and the UAM–ADIC Chair for Data Science and Machine Learning
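The multi-scale idea in the abstract above can be sketched for one-dimensional data as follows. This is a simplified reading of a Laplacian pyramid with a leave-one-out stopping rule, not the authors' algorithm: the Gaussian kernel form, the halving of the scale at each level, the single global stopping level (the paper's extension is point-dependent) and all names are assumptions.

```python
import math


def _smooth(x, xs, values, sigma, skip=None):
    """Normalized Gaussian-kernel average of `values` at x.

    `skip` excludes one training index, which turns the training
    residual into a leave-one-out estimate.
    """
    num = den = 0.0
    for j, (xj, vj) in enumerate(zip(xs, values)):
        if j == skip:
            continue
        w = math.exp(-((x - xj) ** 2) / sigma ** 2)
        num += w * vj
        den += w
    return num / den if den > 0 else 0.0


def alp_fit(xs, ys, sigma0=4.0, max_levels=10):
    """Fit successive residuals at halving scales; stop at the lowest LOO error."""
    residual = list(ys)
    levels, errors = [], []
    for level in range(max_levels):
        sigma = sigma0 / 2 ** level
        approx = [_smooth(x, xs, residual, sigma, skip=i) for i, x in enumerate(xs)]
        levels.append((sigma, list(residual)))  # what this level smooths
        residual = [r - a for r, a in zip(residual, approx)]
        errors.append(sum(r * r for r in residual) / len(xs))
    best = min(range(max_levels), key=errors.__getitem__)
    return levels[: best + 1], errors


def alp_predict(levels, xs, x_new):
    """Sum the per-level kernel approximations at a new point."""
    return sum(_smooth(x_new, xs, res, sigma) for sigma, res in levels)


# Toy data (assumed): a noise-free sine sampled on a uniform grid.
grid = [i / 10 for i in range(50)]
target = [math.sin(x) for x in grid]
pyramid, loo_errors = alp_fit(grid, target)
```

`alp_predict(pyramid, grid, 2.05)` then evaluates the fitted pyramid between two grid points. Because the leave-one-out error is computed from quantities the fit already needs, choosing the stopping level adds no separate validation pass, which is the sense in which the abstract describes the selection as coming without extra cost.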
Essays on Tree-based Methods for Prediction and Causal Inference
The first chapter of this thesis contains an application of causal forests to a residential electricity smart meter trial dataset. Household specific estimates are obtained for the effect of a Time-of-Use pricing scheme on peak demand. The most and least responsive households differ across education, age, employment status, and past electricity consumption. The results suggest that past consumption information is more useful than pre-trial survey information, which includes building characteristics, household characteristics, and responses to appliance usage questions.
The second chapter explores new variations of Bayesian tree-based machine learning algorithms. Bayesian Additive Regression Trees (BART) (Chipman et al. 2010) and Bayesian Causal Forests (BCF) (Hahn et al. 2020) are state-of-the-art machine learning methods for prediction and causal inference. A number of existing implementations of BART make use of Markov Chain Monte Carlo algorithms, which can be computationally expensive when applied to high-dimensional datasets, do not always perform well in terms of mixing of chains, and have limited parallelizability.
The second chapter introduces four variations of BART that do not rely on MCMC:
1. An improved implementation of the existing method BART-BMA (Hernandez et al. 2018), which averages over sum-of-tree models found by a model search algorithm, performs well on high-dimensional datasets, and produces more interpretable output than other BART implementations because the output includes a comparatively small number of sum-of-tree models, each of which contains (under the default settings) 5 trees. Improvements are made to the model search algorithm, calculation of predictions, and credible intervals. The algorithm is entirely deterministic.
2. A treatment effect estimation algorithm that combines the model structure of BCF with the implementation of BART-BMA (BCF-BMA). This method successfully accounts for confounding on observables using the BCF parameterization, while retaining the parsimonious model selection approach of BART-BMA.
3. A simple alternative BART implementation algorithm that uses importance sampling of models (BART-IS). This approach contrasts with existing MCMC and model-search based approaches in that BART-IS makes fast data-independent draws of many sum-of-tree models. The advantages of this approach are that it is straightforward to implement, fast, and trivially parallelizable.
4. Bayesian Causal Forests using Importance Sampling (BCF-IS). This is a combination of the BCF model framework with the BART-IS implementation. BART-IS and BCF-IS exhibit comparable performance to BART-MCMC and BCF across a large number of simulated datasets.
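The importance-sampling idea behind items 3 and 4 can be illustrated with a deliberately reduced example. The sketch below is not BART-IS: it averages over single-split trees (stumps) on one covariate rather than sums of trees. Split points are drawn from a uniform prior without looking at the data, and each draw is weighted by its marginal likelihood under a conjugate Gaussian leaf prior. The known noise variance `sigma2`, the prior variance `tau2`, the covariate scaling to [0, 1] and all function names are assumptions.

```python
import math
import random


def leaf_stats(ys, sigma2, tau2):
    """Log marginal likelihood and posterior mean of one leaf.

    Model (assumed): y_i = mu + noise, mu ~ N(0, tau2), noise ~ N(0, sigma2).
    """
    n = len(ys)
    s = sum(ys)
    q = sum(y * y for y in ys)
    log_ml = (
        -0.5 * n * math.log(2 * math.pi * sigma2)
        - 0.5 * math.log(1 + n * tau2 / sigma2)
        - q / (2 * sigma2)
        + tau2 * s * s / (2 * sigma2 * (sigma2 + n * tau2))
    )
    post_mean = tau2 * s / (sigma2 + n * tau2)
    return log_ml, post_mean


def is_stump_predict(xs, ys, x_new, n_draws=2000, sigma2=0.25, tau2=4.0, seed=0):
    """Self-normalized importance sampling with the prior as proposal."""
    rng = random.Random(seed)
    log_weights, preds = [], []
    for _ in range(n_draws):
        split = rng.random()  # data-independent draw of a model
        left = [y for x, y in zip(xs, ys) if x <= split]
        right = [y for x, y in zip(xs, ys) if x > split]
        log_ml_left, mean_left = leaf_stats(left, sigma2, tau2)
        log_ml_right, mean_right = leaf_stats(right, sigma2, tau2)
        log_weights.append(log_ml_left + log_ml_right)
        preds.append(mean_left if x_new <= split else mean_right)
    top = max(log_weights)  # subtract the max for numerical stability
    weights = [math.exp(lw - top) for lw in log_weights]
    return sum(w * p for w, p in zip(weights, preds)) / sum(weights)


# Toy data (assumed): a step function with a jump at x = 0.5.
covariate = [i / 40 for i in range(40)]
outcome = [0.0 if x < 0.5 else 2.0 for x in covariate]
```

Each draw is independent of the others, so the loop could be split across workers and the weighted sums combined afterwards, which is the sense in which such a scheme is trivially parallelizable.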
The second chapter also includes some illustrative applications. The methods are extendable to multiple treatments, multivariate outcomes, and panel data methods.
The third chapter of this thesis describes how the methods introduced in the second chapter can be generalized from regression and treatment effect estimation for continuous outcomes, to a range of models with various link functions and outcome variables. As examples of how to apply the general approach, Logit-BART-BMA and Logit-BART-IS are introduced with illustrative applications
A general framework for functional regression modelling
Researchers are increasingly interested in regression models for functional data. This article discusses a comprehensive framework for additive (mixed) models for functional responses and/or functional covariates, based on the guiding principle of reframing functional regression in terms of corresponding models for scalar data, which allows a large body of existing methods to be adapted for these novel tasks. The framework encompasses many existing as well as new models. It includes regression for 'generalized' functional data, mean regression, quantile regression, and generalized additive models for location, shape and scale (GAMLSS) for functional data. It admits many flexible linear, smooth or interaction terms of scalar and functional covariates as well as (functional) random effects, and allows flexible choices of bases (particularly splines and functional principal components) and corresponding penalties for each term. It covers functional data observed on common (dense) or curve-specific (sparse) grids. Penalized-likelihood-based and gradient-boosting-based inference for these models is implemented in the R packages refund and FDboost, respectively. We also discuss identifiability and computational complexity for the functional regression models covered. A running example on a longitudinal multiple sclerosis imaging study serves to illustrate the flexibility and utility of the proposed model class. Reproducible code for this case study is made available online.
- …