448 research outputs found

    構造化データに対する予測手法:グラフ,順序,時系列

    Get PDF
    京都大学新制・課程博士博士(情報学)甲第23439号情博第769号新制||情||131(附属図書館)京都大学大学院情報学研究科知能情報学専攻(主査)教授 鹿島 久嗣, 教授 山本 章博, 教授 阿久津 達也学位規則第4条第1項該当Doctor of InformaticsKyoto UniversityDFA

    Analyzing Granger causality in climate data with time series classification methods

    Get PDF
    Attribution studies in climate science aim for scientifically ascertaining the influence of climatic variations on natural or anthropogenic factors. Many of those studies adopt the concept of Granger causality to infer statistical cause-effect relationships, while utilizing traditional autoregressive models. In this article, we investigate the potential of state-of-the-art time series classification techniques to enhance causal inference in climate science. We conduct a comparative experimental study of different types of algorithms on a large test suite that comprises a unique collection of datasets from the area of climate-vegetation dynamics. The results indicate that specialized time series classification methods are able to improve existing inference procedures. Substantial differences are observed among the methods that were tested

    Presentation Bias in movie recommendation algorithms

    Get PDF
    Dissertation presented as the partial requirement for obtaining a Master's degree in Statistics and Information Management, specialization Information Analysis and ManagementThe emergence of video on demand (VOD) has transformed the way the content finds its audience. Several improvements have been made on algorithms to provide better movie recommendations to individuals. Given the huge variety of elements that characterize a film (such as casting, genre, soundtrack, amongst others artistic and technical aspects) and that characterize individuals, most of the improvements relied on accomplishing those characteristics to do a better job regarding matching potential clients to each product. However, little attention has been given to evaluate how the algorithms’ result selection are affected by presentation bias. Understanding bias is key to choosing which algorithms will be used by the companies. The existence of a system with presentation bias and feedback loop is already a problem stated by Netflix. In this sense, this research will fill that gap providing a comparative analysis of the bias of the major movie recommendation algorithms

    Collaborative-demographic hybrid for financial: product recommendation

    Get PDF
    Internship Report presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced AnalyticsDue to the increased availability of mature data mining and analysis technologies supporting CRM processes, several financial institutions are striving to leverage customer data and integrate insights regarding customer behaviour, needs, and preferences into their marketing approach. As decision support systems assisting marketing and commercial efforts, Recommender Systems applied to the financial domain have been gaining increased attention. This thesis studies a Collaborative- Demographic Hybrid Recommendation System, applied to the financial services sector, based on real data provided by a Portuguese private commercial bank. This work establishes a framework to support account managers’ advice on which financial product is most suitable for each of the bank’s corporate clients. The recommendation problem is further developed by conducting a performance comparison for both multi-output regression and multiclass classification prediction approaches. Experimental results indicate that multiclass architectures are better suited for the prediction task, outperforming alternative multi-output regression models on the evaluation metrics considered. Withal, multiclass Feed-Forward Neural Networks, combined with Recursive Feature Elimination, is identified as the topperforming algorithm, yielding a 10-fold cross-validated F1 Measure of 83.16%, and achieving corresponding values of Precision and Recall of 84.34%, and 85.29%, respectively. Overall, this study provides important contributions for positioning the bank’s commercial efforts around customers’ future requirements. By allowing for a better understanding of customers’ needs and preferences, the proposed Recommender allows for more personalized and targeted marketing contacts, leading to higher conversion rates, corporate profitability, and customer satisfaction and loyalty

    Machine Learning Methods To Identify Hidden Phenotypes In The Electronic Health Record

    Get PDF
    The widespread adoption of Electronic Health Records (EHRs) means an unprecedented amount of patient treatment and outcome data is available to researchers. Research is a tertiary priority in the EHR, where the priorities are patient care and billing. Because of this, the data is not standardized or formatted in a manner easily adapted to machine learning approaches. Data may be missing for a large variety of reasons ranging from individual input styles to differences in clinical decision making, for example, which lab tests to issue. Few patients are annotated at a research quality, limiting sample size and presenting a moving gold standard. Patient progression over time is key to understanding many diseases but many machine learning algorithms require a snapshot, at a single time point, to create a usable vector form. In this dissertation, we develop new machine learning methods and computational workflows to extract hidden phenotypes from the Electronic Health Record (EHR). In Part 1, we use a semi-supervised deep learning approach to compensate for the low number of research quality labels present in the EHR. In Part 2, we examine and provide recommendations for characterizing and managing the large amount of missing data inherent to EHR data. In Part 3, we present an adversarial approach to generate synthetic data that closely resembles the original data while protecting subject privacy. We also introduce a workflow to enable reproducible research even when data cannot be shared. In Part 4, we introduce a novel strategy to first extract sequential data from the EHR and then demonstrate the ability to model these sequences with deep learning

    Automatic machine learning:methods, systems, challenges

    Get PDF

    Automatic machine learning:methods, systems, challenges

    Get PDF
    This open access book presents the first comprehensive overview of general methods in Automatic Machine Learning (AutoML), collects descriptions of existing systems based on these methods, and discusses the first international challenge of AutoML systems. The book serves as a point of entry into this quickly-developing field for researchers and advanced students alike, as well as providing a reference for practitioners aiming to use AutoML in their work. The recent success of commercial ML applications and the rapid growth of the field has created a high demand for off-the-shelf ML methods that can be used easily and without expert knowledge. Many of the recent machine learning successes crucially rely on human experts, who select appropriate ML architectures (deep learning architectures or more traditional ML workflows) and their hyperparameters; however the field of AutoML targets a progressive automation of machine learning, based on principles from optimization and machine learning itself
    corecore