Weighted Semi-Supervised Approaches for Predictive Modeling and Truth Discovery

Author: Chandrasekaran Sai Nivedita
Publication venue: 'Paleontological Institute at The University of Kansas'
Publication date: 01/01/2017

Multi-View Learning (MVL) is a framework that efficiently combines data from heterogeneous sources: the different views learn from each other, thereby improving the overall prediction performance on the task. Because the data from different views are not merged, the underlying statistical properties of each view are preserved, and learning takes place in each view's original feature space. MVL also mitigates the problem of high dimensionality that arises when data from multiple sources are integrated. We have exploited this property of MVL to predict chemical-target and drug-disease associations. Every chemical or drug can be represented in diverse feature spaces, which can be treated as multiple views. Similarly, multi-task learning (MTL) frameworks enable the joint learning of related tasks, which improves the performance of the tasks over learning them individually. This allows us to learn related targets and related diseases together. An empirical study has been carried out on the combined effects of multi-view multi-task learning (MVMTL) in predicting chemical-target interactions and drug-disease associations.

The first half of the thesis focuses on two methods that closely resemble MVMTL. We first explain the weighted Multi-View Learning (wMVL) framework, which systematically learns from heterogeneous data sources by weighting the views in terms of their predictive power. We then extend this work to include multi-task learning and formulate the second method, Multi-Task with weighted Multi-View Learning (MTwMVL). The performance of these two methods has been evaluated on cheminformatics data sets.

We change gears in the second part of this thesis and turn to truth discovery (TD). Truth discovery closely resembles a multi-view setting, but the two differ strongly in certain aspects.
While the underlying assumption in multi-view learning is that the different views have label consistency, truth finding has a different setup: the main objective is to find the true value of an object given that different sources might conflict with each other and claim different values for that object. The sources can be considered views, and the primary strategy in truth finding is to estimate the reliability of each source and its contribution to the truth. Many methods address various challenges and aspects of truth discovery; in this thesis we look at TD in a semi-supervised setting.

As the third contribution of this dissertation, we adopt a semi-supervised truth discovery framework in which we consider the labeled objects and unlabeled objects as two closely related tasks, one having strong labels and the other having weak labels. We show that a small set of ground truth helps in achieving better accuracy than the unsupervised methods.
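
The idea of estimating source reliability with the help of a few labeled objects can be illustrated with a small sketch. This is not the framework developed in the thesis; it is a generic iterative weighted-voting scheme, written under the assumption that claims are categorical and that labeled objects simply keep their known value. The function name `discover_truths`, the smoothing constants, and the toy data are all hypothetical.

```python
from collections import defaultdict


def discover_truths(claims, labeled=None, n_iter=10):
    """Iterative weighted voting (illustrative sketch, not the thesis method).

    claims:  dict mapping (source, object) -> claimed value
    labeled: optional dict mapping object -> known true value
    Returns (truths, weights).
    """
    labeled = dict(labeled or {})
    sources = sorted({s for s, _ in claims})
    objects = sorted({o for _, o in claims})
    weights = {s: 1.0 for s in sources}  # start with equal trust in every source
    truths = {}
    for _ in range(n_iter):
        # Truth step: weighted vote per object; labeled objects keep their label.
        for o in objects:
            if o in labeled:
                truths[o] = labeled[o]
                continue
            votes = defaultdict(float)
            for (s, obj), value in claims.items():
                if obj == o:
                    votes[value] += weights[s]
            # sorted() makes tie-breaking deterministic
            truths[o] = max(sorted(votes), key=votes.get)
        # Weight step: smoothed fraction of a source's claims that match the truths.
        for s in sources:
            own = [(o, v) for (src, o), v in claims.items() if src == s]
            correct = sum(1 for o, v in own if truths[o] == v)
            weights[s] = (correct + 1) / (len(own) + 2)
    return truths, weights


# Toy data: source A is right on the labeled objects, B and C are not.
claims = {}
for obj in ["o1", "o2", "o3", "o4"]:
    claims[("A", obj)] = "x"
    claims[("B", obj)] = "y"
    claims[("C", obj)] = "y"
claims[("A", "o5")] = "p"
claims[("B", "o5")] = "q"
claims[("C", "o5")] = "q"
labeled = {"o1": "x", "o2": "x", "o3": "x", "o4": "x"}

semi_truths, semi_weights = discover_truths(claims, labeled)
unsup_truths, unsup_weights = discover_truths(claims)
print(semi_truths["o5"], unsup_truths["o5"])
```

In this toy setting the unsupervised run follows the majority for the unlabeled object `o5`, while the run with four labeled objects learns to trust source A and sides with it instead. The example is constructed to show the effect and says nothing about the results reported in the thesis.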

KU ScholarWorks