Discovery Science : 11th International Conference, DS 2008, Budapest, Hungary, October 13-16, 2008, Proceedings
This book constitutes the refereed proceedings of the 11th International Conference on Discovery Science, DS 2008, held in Budapest, Hungary, in October 2008, co-located with the 19th International Conference on Algorithmic Learning Theory, ALT 2008. The 26 revised long papers, presented together with 5 invited papers, were carefully reviewed and selected from 58 submissions. The papers address all current issues in the area of development and analysis of methods for intelligent data analysis, knowledge discovery and machine learning, as well as their application to scientific knowledge discovery. The papers are organized in topical sections on learning, feature selection, associations, discovery processes, learning and chemistry, clustering, structured data, and text analysis.
Proof of Concept for a Visual Analytics Dashboard for Transportation Network Analysis
This paper discusses the latest developments in the field of visual analytics and the role of network analysis for transportation systems. Multilayer- and multiplex-based visualizations are considered reliable solutions for handling the information overload that decision makers face in this domain. Existing tools that meet these requirements are briefly reviewed. A proof of concept for a dashboard is then presented, focusing on transportation network analysis with multiple network measures and indices in a multiplex visualization.
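As a rough illustration of what "multiple network measures in a multiplex" can mean, the sketch below computes one simple measure, node degree, per layer and summed across layers. The abstract does not say which measures the dashboard uses; the stop names, layer names, and function names here are hypothetical.

```python
from collections import defaultdict


def layer_degrees(edges):
    """Count how many edges touch each node within a single layer."""
    degree = defaultdict(int)
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return dict(degree)


def multiplex_degrees(layers):
    """Return {node: {layer: degree}} for every node seen in any layer."""
    per_layer = {name: layer_degrees(edges) for name, edges in layers.items()}
    nodes = {node for degrees in per_layer.values() for node in degrees}
    return {
        node: {name: per_layer[name].get(node, 0) for name in layers}
        for node in nodes
    }


def overlapping_degree(profile):
    """Sum a node's degree over all layers."""
    return sum(profile.values())


# Hypothetical toy network: stops A-D, one layer per transport mode.
layers = {
    "bus": [("A", "B"), ("B", "C")],
    "rail": [("A", "C"), ("C", "D")],
}
profiles = multiplex_degrees(layers)
for node in sorted(profiles):
    print(node, profiles[node], overlapping_degree(profiles[node]))
```

Keeping the per-layer profile next to the aggregate makes it possible to tell a stop that is moderately connected in every mode from one that is a hub in a single mode.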
Large-Scale Off-Target Identification Using Fast and Accurate Dual Regularized One-Class Collaborative Filtering and Its Application to Drug Repurposing
Target-based screening is one of the major approaches in drug discovery. Besides the intended target, unexpected drug off-target interactions often occur, and many of them have not been recognized and characterized. The off-target interactions can be responsible for either therapeutic or side effects. Thus, identifying the genome-wide off-targets of lead compounds or existing drugs will be critical for designing effective and safe drugs, and for providing new opportunities for drug repurposing. Although many computational methods have been developed to predict drug-target interactions, they are either less accurate than the method proposed here or too computationally intensive, which limits their capability for large-scale off-target identification. In addition, the performance of most machine-learning-based algorithms has mainly been evaluated on predicting off-target interactions within the same gene family for hundreds of chemicals. It is not clear how these algorithms perform in detecting off-targets across gene families on a proteome scale. Here, we present a fast and accurate off-target prediction method, REMAP, which is based on a dual regularized one-class collaborative filtering algorithm, to explore continuous chemical space, protein space, and their interactome on a large scale. When tested on a reliable, extensive, cross-gene-family benchmark, REMAP outperforms the state-of-the-art methods. Furthermore, REMAP is highly scalable: it can screen a dataset of 200 thousand chemicals against 20 thousand proteins within 2 hours. Using the reconstructed genome-wide target profile as the fingerprint of a chemical compound, we predicted that seven FDA-approved drugs can be repurposed as novel anti-cancer therapies. The anti-cancer activity of six of them is supported by experimental evidence.
Thus, REMAP is a valuable addition to the existing in silico toolbox for drug target identification, drug repurposing, phenotypic screening, and side effect prediction. The software and benchmark are available at https://github.com/hansaimlim/REMAP. The article is published at http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.100513
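To give a feel for the one-class collaborative filtering idea mentioned in the abstract, here is a minimal sketch of weighted matrix factorization. Known interactions count as confident positives, and unobserved pairs count as weak negatives with a small weight. This is not the REMAP implementation: it omits the dual (chemical-similarity and protein-similarity) regularization, and the function names, hyperparameters, and toy matrix are assumptions chosen for illustration.

```python
import random


def one_class_mf(R, rank=2, unobserved_weight=0.1, reg=0.05,
                 lr=0.05, epochs=1000, seed=0):
    """Factor a binary matrix R (rows: chemicals, cols: proteins) as U V^T.

    Observed 1s get weight 1; unobserved 0s get `unobserved_weight`,
    so they act as weak negatives rather than confirmed non-interactions.
    """
    rng = random.Random(seed)
    m, n = len(R), len(R[0])
    U = [[rng.uniform(0.0, 0.1) for _ in range(rank)] for _ in range(m)]
    V = [[rng.uniform(0.0, 0.1) for _ in range(rank)] for _ in range(n)]
    for _ in range(epochs):
        for i in range(m):
            for j in range(n):
                weight = 1.0 if R[i][j] else unobserved_weight
                pred = sum(U[i][k] * V[j][k] for k in range(rank))
                err = weight * (R[i][j] - pred)
                for k in range(rank):
                    u, v = U[i][k], V[j][k]
                    # Gradient step on weighted squared error plus L2 penalty.
                    U[i][k] += lr * (err * v - reg * u)
                    V[j][k] += lr * (err * u - reg * v)
    return U, V


def score(U, V, i, j):
    """Predicted interaction strength for chemical i and protein j."""
    return sum(a * b for a, b in zip(U[i], V[j]))


# Toy interaction matrix with two groups; entry (1, 1) is unobserved.
R = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
U, V = one_class_mf(R)
print("in-group candidate :", round(score(U, V, 1, 1), 3))
print("out-of-group pair  :", round(score(U, V, 1, 2), 3))
```

Because chemical 1 shares a target with chemical 0, the unobserved pair (1, 1) should rank above a pair from the other group, which is the kind of signal used to propose off-target candidates.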
Improving Stability in Decision Tree Models
Owing to their inherently interpretable structure, decision trees are
commonly used in applications where interpretability is essential. Recent work
has focused on improving various aspects of decision trees, including their
predictive power and robustness; however, their instability, albeit
well-documented, has been addressed to a lesser extent. In this paper, we take
a step towards the stabilization of decision tree models through the lens of
real-world health care applications due to the relevance of stability and
interpretability in this space. We introduce a new distance metric for decision
trees and use it to determine a tree's level of stability. We propose a novel
methodology to train stable decision trees and investigate the existence of
trade-offs that are inherent to decision tree models, including those between
stability, predictive power, and interpretability. We demonstrate the value of
the proposed methodology through an extensive quantitative and qualitative
analysis of six case studies from real-world health care applications, and we
show that, on average, with a small 4.6% decrease in predictive power, we gain
a significant 38% improvement in the model's stability.
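The instability the abstract refers to can be made concrete with a small experiment. The sketch below does not use the paper's tree distance metric, which the abstract does not define. It swaps in a simpler proxy: train depth-1 trees (stumps) on bootstrap resamples and measure how often their predictions disagree. All names and data are hypothetical.

```python
import random
from itertools import combinations


def fit_stump(xs, ys):
    """Fit a depth-1 tree on one feature: predict 1 when x > threshold."""
    best_threshold, best_errors = None, None
    for t in sorted(set(xs)):
        errors = sum((x > t) != bool(y) for x, y in zip(xs, ys))
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = t, errors
    return best_threshold


def predict(threshold, xs):
    """Apply a fitted stump to a list of feature values."""
    return [int(x > threshold) for x in xs]


def disagreement(preds_a, preds_b):
    """Fraction of points on which two prediction vectors differ."""
    return sum(a != b for a, b in zip(preds_a, preds_b)) / len(preds_a)


def instability(xs, ys, eval_xs, n_models=10, seed=0):
    """Mean pairwise disagreement of stumps trained on bootstrap resamples."""
    rng = random.Random(seed)
    n = len(xs)
    all_preds = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        threshold = fit_stump([xs[i] for i in idx], [ys[i] for i in idx])
        all_preds.append(predict(threshold, eval_xs))
    pairs = list(combinations(all_preds, 2))
    return sum(disagreement(a, b) for a, b in pairs) / len(pairs)


# Hypothetical data: a noisy boundary around x = 5.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]
print("instability:", instability(xs, ys, eval_xs=xs))
```

A value of 0 means every resampled model predicts identically on the evaluation points, and larger values mean the learned split moves more between resamples.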