Approximate Inference in Continuous Determinantal Point Processes
Determinantal point processes (DPPs) are random point processes well-suited
for modeling repulsion. In machine learning, the focus of DPP-based models has
been on diverse subset selection from a discrete and finite base set. This
discrete setting admits an efficient sampling algorithm based on the
eigendecomposition of the defining kernel matrix. Recently, there has been
growing interest in using DPPs defined on continuous spaces. While the
discrete-DPP sampler extends formally to the continuous case, computationally,
the steps required are not tractable in general. In this paper, we present two
efficient DPP sampling schemes that apply to a wide range of kernel functions:
one based on low-rank approximations via Nyström and random Fourier feature
techniques and another based on Gibbs sampling. We demonstrate the utility of
continuous DPPs in repulsive mixture modeling and synthesizing human poses
spanning activity spaces.
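As a rough illustration of the random Fourier feature idea mentioned in this abstract, the sketch below builds a low-rank approximation of a Gaussian kernel matrix. It is not the paper's sampler: it only shows the kernel approximation step, and the function names (`rff_features`, `gaussian_kernel`), the lengthscale, and the feature count are assumptions chosen for the example.

```python
import numpy as np

def rff_features(X, num_features, lengthscale, rng):
    """Random Fourier features Z such that Z @ Z.T approximates a Gaussian kernel.

    Assumes k(x, y) = exp(-||x - y||^2 / (2 * lengthscale^2)), whose spectral
    density is a zero-mean Gaussian with standard deviation 1 / lengthscale.
    """
    d = X.shape[1]
    W = rng.normal(scale=1.0 / lengthscale, size=(d, num_features))  # random frequencies
    b = rng.uniform(0.0, 2.0 * np.pi, size=num_features)             # random phases
    return np.sqrt(2.0 / num_features) * np.cos(X @ W + b)

def gaussian_kernel(X, lengthscale):
    """Exact Gaussian kernel matrix, used here only for comparison."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-d2 / (2.0 * lengthscale ** 2))

# Hypothetical toy data: 20 points in 2 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))
Z = rff_features(X, num_features=5000, lengthscale=1.0, rng=rng)
K_approx = Z @ Z.T              # rank at most num_features
K_exact = gaussian_kernel(X, lengthscale=1.0)
max_err = np.max(np.abs(K_approx - K_exact))
```

The approximation error shrinks roughly as one over the square root of the number of features, so the feature count trades accuracy against the rank of the resulting kernel.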
Mining Missing Hyperlinks from Human Navigation Traces: A Case Study of Wikipedia
Hyperlinks are an essential feature of the World Wide Web. They are
especially important for online encyclopedias such as Wikipedia: an article can
often only be understood in the context of related articles, and hyperlinks
make it easy to explore this context. But important links are often missing,
and several methods have been proposed to alleviate this problem by learning a
linking model based on the structure of the existing links. Here we propose a
novel approach to identifying missing links in Wikipedia. We build on the fact
that the ultimate purpose of Wikipedia links is to aid navigation. Rather than
merely suggesting new links that are in tune with the structure of existing
links, our method finds missing links that would immediately enhance
Wikipedia's navigability. We leverage data sets of navigation paths collected
through a Wikipedia-based human-computation game in which users must find a
short path from a start to a target article by only clicking links encountered
along the way. We harness human navigational traces to identify a set of
candidates for missing links and then rank these candidates. Experiments show
that our procedure identifies missing links of high quality.
Regret analysis for performance metrics in multi-label classification: the case of Hamming and subset zero-one loss
Learning to Infer Social Ties in Large Networks
In online social networks, most relationships lack meaningful labels (e.g., “colleague” and “intimate friends”), simply because users do not take the time to label them. An interesting question is: can we automatically infer the type of social relationships in a large network? What are the fundamental factors that imply the type of social relationships? In this work, we formalize the problem of social relationship learning in a semi-supervised framework and propose a Partially-labeled Pairwise Factor Graph Model (PLP-FGM) for learning to infer the type of social ties. We tested the model on three different genres of data sets: Publication, Email and Mobile. Experimental results demonstrate that the proposed PLP-FGM model can accurately infer 92.7% of advisor-advisee relationships from the coauthor network (Publication), 88.0% of manager-subordinate relationships from the email network (Email), and 83.1% of the friendships from the mobile network (Mobile). Finally, we develop a distributed learning algorithm to scale up the model to real large networks.
Computer-aided ventilator resetting is feasible on the basis of a physiological profile.
BACKGROUND: Ventilator resetting is frequently needed to adjust tidal volume, pressure and gas exchange. The system comprising lungs and ventilator is so complex that a trial and error strategy is often applied. Comprehensive characterization of lung physiology is feasible by monitoring. The hypothesis that the effect of ventilator resetting could be predicted by computer simulation based on a physiological profile was tested in healthy pigs. METHODS: Flow, pressure and CO2 signals were recorded in 7 ventilated pigs. Elastic recoil pressure was measured at post-inspiratory and post-expiratory pauses. Inspiratory and expiratory resistance as a function of volume and compliance were calculated. CO2 elimination per breath was expressed as a function of tidal volume. The effect of ventilator action was simulated by calculating pressure and flow moment by moment while respiratory rate was varied between 10 and 30 min(-1) and minute volume was changed so as to maintain PaCO2. Predicted values of peak airway pressure, plateau pressure, and CO2 elimination were compared to values measured after resetting. RESULTS: With 95% confidence, predicted pressures and CO2 elimination deviated from measured values by < 1 cm H2O and < 6%, respectively. CONCLUSION: It is feasible to predict effects of ventilator resetting on the basis of a physiological profile, at least in health.
User Identity Linkage by Latent User Space Modelling
National Research Foundation (NRF) Singapore under International Research Centres in Singapore Funding Initiative
Predicting zinc binding at the proteome level
BACKGROUND: Metalloproteins are proteins capable of binding one or more metal ions, which may be required for their biological function, for regulation of their activities or for structural purposes. Metal-binding properties remain difficult to predict as well as to investigate experimentally at the whole-proteome level. Consequently, the current knowledge about metalloproteins is only partial. RESULTS: The present work reports on the development of a machine learning method for the prediction of the zinc-binding state of pairs of nearby amino acids, using predictors based on support vector machines. The predictor was trained using chains containing zinc-binding sites and non-metalloproteins in order to provide positive and negative examples. Results based on strong non-redundancy tests show that (1) zinc-binding residues can be predicted and (2) modelling the correlation between the binding state of nearby residues significantly improves performance. The trained predictor was then applied to the human proteome. The results were in good agreement with the outcomes of previous, heavily manually curated efforts to identify human zinc-binding proteins. Some previously unreported zinc-binding sites could be identified, and were further validated through structural modelling. The software implementing the predictor is freely available at: CONCLUSION: The proposed approach constitutes a highly automated tool for the identification of metalloproteins, which provides results of comparable quality to heavily manually refined predictions. The ability to model correlations between pairs of residues allows it to obtain a significant improvement over standard 1D-based approaches. In addition, the method permits the identification of previously unreported metal sites, providing important hints for the work of experimentalists.