Efficient Regularized Least-Squares Algorithms for Conditional Ranking on Relational Data
In domains like bioinformatics, information retrieval and social network
analysis, one can find learning tasks where the goal consists of inferring a
ranking of objects, conditioned on a particular target object. We present a
general kernel framework for learning conditional rankings from various types
of relational data, where rankings can be conditioned on unseen data objects.
We propose efficient algorithms for conditional ranking by optimizing squared
regression and ranking loss functions. We show theoretically that learning
with the ranking loss is likely to generalize better than with the regression
loss. Further, we prove that symmetry or reciprocity properties of relations
can be efficiently enforced in the learned models. Experiments on synthetic and
real-world data illustrate that the proposed methods deliver state-of-the-art
performance in terms of predictive power and computational efficiency.
Moreover, we also show empirically that incorporating symmetry or reciprocity
properties can improve the generalization performance.
Interactions between species introduce spurious associations in microbiome studies
Microbiota contribute to many dimensions of host phenotype, including
disease. To link specific microbes to specific phenotypes, microbiome-wide
association studies compare microbial abundances between two groups of samples.
Abundance differences, however, reflect not only direct associations with the
phenotype, but also indirect effects due to microbial interactions. We found
that microbial interactions could easily generate a large number of spurious
associations that provide no mechanistic insight. Using techniques from
statistical physics, we developed a method to remove indirect associations and
applied it to the largest dataset on pediatric inflammatory bowel disease. Our
method corrected the inflation of p-values in standard association tests and
showed that only a small subset of associations is directly linked to the
disease. Direct associations had a much higher accuracy in separating cases
from controls and pointed to immunomodulation, butyrate production, and the
brain-gut axis as important factors in inflammatory bowel disease.
Comment: 4 main text figures, 15 supplementary figures (i.e., appendix), and 6
supplementary tables. Overall 49 pages including references.
Disentangling causal webs in the brain using functional Magnetic Resonance Imaging: A review of current approaches
In the past two decades, functional Magnetic Resonance Imaging has been used
to relate neuronal network activity to cognitive processing and behaviour.
Recently this approach has been augmented by algorithms that allow us to infer
causal links between component populations of neuronal networks. Multiple
inference procedures have been proposed to approach this research question but
so far, each method has limitations when it comes to establishing whole-brain
connectivity patterns. In this work, we discuss eight ways to infer causality
in fMRI research: Bayesian Nets, Dynamical Causal Modelling, Granger Causality,
Likelihood Ratios, LiNGAM, Patel's Tau, Structural Equation Modelling, and
Transfer Entropy. We conclude with some recommendations for future
directions in this area.
Structure learning of undirected graphical models for count data
Biological processes underlying the basic functions of a cell involve complex
interactions between genes. From a technical point of view, these interactions
can be represented through a graph where genes and their connections are,
respectively, nodes and edges. The main objective of this paper is to develop a
statistical framework for modelling the interactions between genes when the
activity of genes is measured on a discrete scale. In detail, we define a new
algorithm for learning the structure of undirected graphs, PC-LPGM, proving its
theoretical consistency in the limit of infinite observations. The proposed
algorithm shows promising results when applied to simulated data as well as to
real data.
- …