613 research outputs found

    Towards Personalized Medicine Using Systems Biology And Machine Learning

    The rate of acquiring biological data has greatly surpassed our ability to interpret it. At the same time, we have started to understand that the evolution of many diseases, such as cancer, is the result of the interplay between the disease itself and the immune system of the host. It is now well accepted that cancer is not a single disease, but a “complex collection of distinct genetic diseases united by common hallmarks”. Understanding the differences between such disease subtypes is key not only to providing adequate treatments for known subtypes but also to identifying new ones. These unforeseen disease subtypes are one of the main reasons high-profile clinical trials fail. To identify such cases, we proposed a classification technique, based on Support Vector Machines, that is able to automatically identify samples that are dissimilar from the classes used for training. We assessed the performance of this approach both with artificial data and with data from the UCI machine learning repository. Moreover, we showed in a leukemia experiment that our method is able to identify 65% of the MLL patients when trained only on AML vs. ALL. In addition, to augment our ability to understand the disease mechanism in each subgroup, we proposed a systems biology approach able to consider all measured gene expression changes, thus eliminating the possibility that small but important gene changes (e.g. transcription factors) are omitted from the analysis. We showed that this approach provides consistent results that do not depend on the choice of an arbitrary threshold for differential regulation. We also showed in a multiple sclerosis study that this approach obtains consistent results across multiple experiments performed by different groups on different technologies, a consistency that could not be achieved using differential expression alone. The cut-off free impact analysis was released as part of the ROntoTools Bioconductor package.

    Model Diagnostics meets Forecast Evaluation: Goodness-of-Fit, Calibration, and Related Topics

    Principled forecast evaluation and model diagnostics are vital in fitting probabilistic models and forecasting outcomes of interest. A common principle is that fitted or predicted distributions ought to be calibrated, ideally in the sense that the outcome is indistinguishable from a random draw from the posited distribution. Much of this thesis is centered on calibration properties of various types of forecasts. In the first part of the thesis, a simple algorithm for exact multinomial goodness-of-fit tests is proposed. The algorithm computes exact p-values based on various test statistics, such as the log-likelihood ratio and Pearson's chi-square. A thorough analysis shows improvement on extant methods. However, the runtime of the algorithm grows exponentially in the number of categories and hence its use is limited. In the second part, a framework rooted in probability theory is developed, which gives rise to hierarchies of calibration, and applies to both predictive distributions and stand-alone point forecasts. Based on a general notion of conditional T-calibration, the thesis introduces population versions of T-reliability diagrams and revisits a score decomposition into measures of miscalibration, discrimination, and uncertainty. Stable and efficient estimators of T-reliability diagrams and score components arise via nonparametric isotonic regression and the pool-adjacent-violators algorithm. For in-sample model diagnostics, a universal coefficient of determination is introduced that nests and reinterprets the classical R^2 in least squares regression. In the third part, probabilistic top lists are proposed as a novel type of prediction in classification, which bridges the gap between single-class predictions and predictive distributions. The probabilistic top list functional is elicited by strictly consistent evaluation metrics, based on symmetric proper scoring rules, which admit comparison of various types of predictions.
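To make the idea of an exact multinomial goodness-of-fit test concrete, the sketch below computes an exact p-value by brute-force enumeration of every possible count vector. This is not the algorithm proposed in the thesis, only the textbook definition it builds on: the p-value is the total null probability of all outcomes whose test statistic is at least as large as the observed one. All function names are hypothetical, and the null probabilities are assumed to be strictly positive.

```python
from math import factorial, log, prod


def compositions(n, k):
    """Yield every tuple of k non-negative integers that sums to n."""
    if k == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in compositions(n - first, k - 1):
            yield (first,) + rest


def multinomial_pmf(counts, probs):
    """Probability of observing `counts` under category probabilities `probs`."""
    coef = factorial(sum(counts))
    for c in counts:
        coef //= factorial(c)  # stays an exact integer at every step
    return coef * prod(p ** c for p, c in zip(probs, counts))


def pearson_chi2(counts, probs):
    """Pearson's chi-square statistic (assumes every probability is > 0)."""
    n = sum(counts)
    return sum((c - n * p) ** 2 / (n * p) for c, p in zip(counts, probs))


def log_likelihood_ratio(counts, probs):
    """Log-likelihood ratio statistic; empty cells contribute zero."""
    n = sum(counts)
    return 2.0 * sum(c * log(c / (n * p)) for c, p in zip(counts, probs) if c > 0)


def exact_p_value(observed, probs, statistic=pearson_chi2):
    """Sum the null probabilities of all outcomes at least as extreme as `observed`."""
    n, k = sum(observed), len(observed)
    t_obs = statistic(observed, probs)
    tol = 1e-9 * max(1.0, abs(t_obs))  # guard against floating-point ties
    return sum(
        multinomial_pmf(x, probs)
        for x in compositions(n, k)
        if statistic(x, probs) >= t_obs - tol
    )
```

The enumeration visits every count vector with the given total, so the cost of this naive version rises steeply as categories are added, which illustrates why runtime limits the practical use of exact tests.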

    On understanding, modeling and predicting user behavior in web search

    Behavioral Economics - Enhanced: Machine Learning and Decision Making

    In this thesis, I investigate decision-making in the fields of behavioral economics, experimental economics, and law and economics. The research questions I ask are: Can we nudge people towards being more honest? Can we use language to find out who lies? Which factors influence a judge’s decision, and how do people cooperate? Specifically, I investigate contributions in a public goods game and (dis-)honest decision-making in a die-in-the-cup game and a tax compliance game. Furthermore, I investigate the bounds of rational decision-making in the context of the law. To answer these questions, I apply machine learning methods alongside traditional econometrics: I use natural language classification to predict decisions based on text data, and time-series clustering to reduce complexity and thereby enable theory building and interpretation.