Accelerated Parallel Non-conjugate Sampling for Bayesian Non-parametric Models
Inference of latent feature models in the Bayesian nonparametric setting is
generally difficult, especially in high dimensional settings, because it
usually requires proposing features from some prior distribution. In special
cases, where the integration is tractable, we could sample new feature
assignments according to a predictive likelihood. However, this still may not
be efficient in high dimensions. We present a novel method to accelerate the
mixing of latent variable model inference by proposing feature locations from
the data, as opposed to the prior. First, we introduce our accelerated feature proposal mechanism and show that it is a valid Bayesian inference algorithm; next, we propose an approximate inference strategy to perform accelerated inference in parallel. This sampling method promotes proper mixing of the Markov chain Monte Carlo sampler, is computationally attractive, and is theoretically guaranteed to converge to the posterior distribution as its limiting distribution. (Comment: previously known as "Accelerated Inference for Latent Variable Models".)
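A minimal illustrative sketch of the idea described above, not the paper's actual algorithm: in a linear-Gaussian latent feature model X ≈ ZA with a Gaussian prior on feature locations, a new location for feature k is proposed from a data-driven distribution (a Gaussian centred on a randomly chosen data row) rather than from the prior, and a standard Metropolis-Hastings correction keeps the posterior as the invariant distribution. The model, function names, and hyperparameters are assumptions made for the example.

```python
import numpy as np
from scipy.stats import multivariate_normal as mvn

def mh_update_feature(X, Z, A, k, sigma_x=1.0, sigma_a=1.0, sigma_q=0.5, rng=None):
    """One Metropolis-Hastings update of feature location A[k] using a
    data-driven independence proposal instead of the prior."""
    rng = rng or np.random.default_rng()
    N, D = X.shape

    def log_lik(A_):                         # Gaussian likelihood X ~ N(Z A_, sigma_x^2 I)
        resid = X - Z @ A_
        return -0.5 * np.sum(resid ** 2) / sigma_x ** 2

    def log_prior(a):                        # Gaussian prior on feature locations
        return mvn.logpdf(a, mean=np.zeros(D), cov=sigma_a ** 2 * np.eye(D))

    def log_q(a):                            # proposal density: uniform mixture of
        comps = [mvn.logpdf(a, mean=x, cov=sigma_q ** 2 * np.eye(D)) for x in X]
        return np.logaddexp.reduce(comps) - np.log(N)   # Gaussians centred on data rows

    # Propose the new feature location from the data, not from the prior.
    centre = X[rng.integers(N)]
    a_new = rng.normal(centre, sigma_q)

    A_prop = A.copy()
    A_prop[k] = a_new
    log_alpha = (log_lik(A_prop) + log_prior(a_new) + log_q(A[k])
                 - log_lik(A) - log_prior(A[k]) - log_q(a_new))
    if np.log(rng.uniform()) < log_alpha:    # accept with probability min(1, alpha)
        A[k] = a_new
    return A
```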
A Literature Review on Predictive Monitoring of Business Processes
We have reviewed a wide range of predictive monitoring techniques for business processes. The goal of predictive monitoring is to help businesses achieve their goals, help them take the right business path, predict outcomes, estimate delivery time, and make business processes risk-aware. In this thesis, we have carefully collected and reviewed in detail the literature that falls within this process mining category. The objective of the thesis is to design a Predictive Monitoring Framework and classify the different predictive monitoring techniques; based on the findings and observations of the literature review, we have carefully designed such a framework. The framework acts as a guide for researchers investigating this field and for businesses that want to apply these techniques in their respective domains.
The Intuitive Appeal of Explainable Machines
Algorithmic decision-making has become synonymous with inexplicable decision-making, but what makes algorithms so difficult to explain? This Article examines what sets machine learning apart from other ways of developing rules for decision-making and the problem these properties pose for explanation. We show that machine learning models can be both inscrutable and nonintuitive and that these are related, but distinct, properties. Calls for explanation have treated these problems as one and the same, but disentangling the two reveals that they demand very different responses. Dealing with inscrutability requires providing a sensible description of the rules; addressing nonintuitiveness requires providing a satisfying explanation for why the rules are what they are. Existing laws like the Fair Credit Reporting Act (FCRA), the Equal Credit Opportunity Act (ECOA), and the General Data Protection Regulation (GDPR), as well as techniques within machine learning, are focused almost entirely on the problem of inscrutability. While such techniques could allow a machine learning system to comply with existing law, doing so may not help if the goal is to assess whether the basis for decision-making is normatively defensible. In most cases, intuition serves as the unacknowledged bridge between a descriptive account and a normative evaluation. But because machine learning is often valued for its ability to uncover statistical relationships that defy intuition, relying on intuition is not a satisfying approach. This Article thus argues for other mechanisms for normative evaluation. To know why the rules are what they are, one must seek explanations of the process behind a model’s development, not just explanations of the model itself
Personalized marketing campaign for upselling using predictive modeling in the health insurance sector
Internship Report presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics. Nowadays, with the oversupply of several different solutions in the private Health Insurance sector and the constantly increasing demand for value-for-money services from the client's perspective, it becomes clear that Insurance Companies should not only strive for excellence but also engage their client base by offering solutions better suited to their needs.
This project aims, using the power of predictive models, to identify the existing Health Insurance clients who are willing to move to a higher-tier product, a case commonly described as upselling. The final model will be used for a personalized marketing campaign in one of the most prominent bancassurance providers in Portugal. At the moment, the ongoing upselling campaign uses only a few eligibility criteria.
The goal of the model is to assign a probability to each client who is eligible to be contacted for this campaign. The data retrieved to train the model had a buffer period of one week from when the 'event' took place. This is crucial for the business, because there is always a time-to-market parameter that should be taken into consideration in the real world.
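The project itself was implemented in SAS (see below); the sketch here only illustrates, in Python, the kind of propensity model the report describes: train on labelled historical clients (respecting the one-week buffer before the upsell 'event') and assign every eligible client a probability of moving to a higher-tier product. The file name, column names, and model choice are assumptions made for the example, not the report's actual data mart schema.

```python
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Hypothetical extract of eligible clients with numeric features and a label
# indicating whether the client upgraded to a higher-tier product.
clients = pd.read_csv("eligible_clients.csv")
features = clients.drop(columns=["client_id", "upgraded"])
target = clients["upgraded"]

X_train, X_test, y_train, y_test = train_test_split(
    features, target, test_size=0.3, stratify=target, random_state=0)

model = GradientBoostingClassifier().fit(X_train, y_train)
print("holdout accuracy:", model.score(X_test, y_test))

# Score every eligible client: the campaign contacts those with the highest
# predicted probability of accepting the higher-tier product.
clients["upsell_probability"] = model.predict_proba(features)[:, 1]
print(clients[["client_id", "upsell_probability"]]
      .sort_values("upsell_probability", ascending=False)
      .head())
```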
The tools used to complete this Data Mining project were mostly SAS Enterprise Guide and SAS Enterprise Miner. All the Data Marts needed for the project were built and loaded in SAS, so there were no obstacles or connectivity issues. For data visualization and reporting, Microsoft PowerBI was used.
Some of the tables in the Data Marts are updated on a daily basis and others on a monthly basis. All the historical information is stored in separate tables, so there is no information loss or discrepancy.
Finally, the methodology followed for the implementation of the Data Mining project was a hybrid framework combining the SEMMA approach, proposed by the SAS Institute to carry out the core tasks of model development, with CRISP-DM.
Interpretability of machine learning solutions in public healthcare: the CRISP-ML approach
Public healthcare has a history of cautious adoption of artificial intelligence (AI) systems. The rapid growth of data collection and linking capabilities, combined with the increasing diversity of data-driven AI techniques, including machine learning (ML), has brought both ubiquitous opportunities for data analytics projects and increased demands for the regulation and accountability of the outcomes of these projects. As a result, the area of interpretability and explainability of ML is gaining significant research momentum. While there has been some progress in the development of ML methods, the methodological side has shown limited progress. This limits the practicality of using ML in the health domain: the difficulty of explaining the outcomes of ML algorithms to medical practitioners and policy makers in public health has been a recognized obstacle to the broader adoption of data science approaches in this domain. This study builds on earlier work which introduced CRISP-ML, a methodology that determines the interpretability level required by stakeholders for a successful real-world solution and then helps in achieving it. CRISP-ML was built on the strengths of CRISP-DM, addressing its gaps in handling interpretability. Its application in the Public Healthcare sector follows its successful deployment in a number of recent real-world projects across several industries and fields, including credit risk, insurance, utilities, and sport. This study elaborates on how the CRISP-ML methodology determines, measures, and achieves the necessary level of interpretability of ML solutions in the Public Healthcare sector. It demonstrates how CRISP-ML addressed the problems of data diversity, the unstructured nature of data, and the relatively low linkage between diverse data sets in the healthcare domain. The characteristics of the case study used here are typical of healthcare data, and CRISP-ML managed to deliver on these issues, ensuring the required level of interpretability of the ML solutions discussed in the project. The approach ensured that interpretability requirements were met, taking into account public healthcare specifics, regulatory requirements, project stakeholders, project objectives, and data characteristics. The study concludes with three main directions for the development of the presented cross-industry standard process.
Predictive maintenance of electrical grid assets: internship at EDP Distribuição - Energia S.A.
Internship Report presented as the partial requirement for obtaining a Master's degree in Information Management, specialization in Knowledge Management and Business Intelligence. This report describes the activities developed during an internship at EDP Distribuição, focusing on a Predictive Maintenance analytics project directed at high-voltage electrical grid assets, including Overhead Lines, Power Transformers, and Circuit Breakers. The project's main goal is to support EDP's asset management processes by improving maintenance and investment planning. The project's main deliverables are the Probability of Failure metric, which forecasts asset failures 15 days ahead of time and is estimated through supervised machine learning models; the Health Index metric, which indicates an asset's current state and condition and is implemented through the Ofgem methodology; and two asset management dashboards. The project was implemented by an external service provider, a consulting company, and during the internship it was possible to join the team and participate in the development activities.
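A minimal sketch (with assumed file and column names, not EDP's implementation) of the Probability of Failure deliverable described above: label each condition reading with whether the asset fails within the following 15 days, train a supervised classifier on those labels, and read its output as a 15-day-ahead failure probability per asset.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Hypothetical extracts: periodic condition readings and recorded failure events
# (for simplicity, at most one recorded failure per asset is assumed).
readings = pd.read_csv("asset_condition_history.csv", parse_dates=["date"])
failures = pd.read_csv("failure_events.csv", parse_dates=["failure_date"])

# Label: does the asset fail within the 15 days following this reading?
readings = readings.merge(failures, on="asset_id", how="left")
days_to_failure = (readings["failure_date"] - readings["date"]).dt.days
readings["fails_within_15d"] = ((days_to_failure >= 0) & (days_to_failure <= 15)).astype(int)

feature_cols = [c for c in readings.columns
                if c not in ("asset_id", "date", "failure_date", "fails_within_15d")]
model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(readings[feature_cols], readings["fails_within_15d"])

# Probability of Failure for the latest reading of each asset.
latest = readings.sort_values("date").groupby("asset_id").tail(1).copy()
latest["prob_of_failure_15d"] = model.predict_proba(latest[feature_cols])[:, 1]
print(latest[["asset_id", "prob_of_failure_15d"]])
```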
Big Data and Analytics: Issues and Challenges for the Past and Next Ten Years
In this paper we continue the minitrack series of papers recognizing issues and challenges identified in the field of Big Data and Analytics, looking both to the past and forward. As this field has evolved, it has begun to encompass other analytical regimes, notably AI/ML systems. We focus on two areas: continuing main issues for which some progress has been made, and new and emerging issues that we believe form the basis for near-term and future research in Big Data and Analytics. The Bottom Line: Big Data and Analytics is healthy, is growing in scope and evolving in capability, and is finding applicability in more problem domains than ever before.
The Explanation Matters: Enhancing AI Adoption in Human Resource Management
Artificial intelligence (AI) has ubiquitous applications in companies, permeating multiple business divisions such as human resource management (HRM). Yet, in these high-stakes domains, where transparency and interpretability of results are of utmost importance, the black-box character of AI is even more of a threat to AI adoption. Hence, explainable AI (XAI), which is regular AI equipped with or complemented by techniques to explain it, comes into play. We present a systematic literature review of n=62 papers on XAI in HRM. Further, we conducted an experiment among a German sample (n=108) of HRM personnel on a turnover prediction task with or without (X)AI support. We find that AI support leads to better task performance, self-assessment accuracy, and response characteristics toward the AI, and that XAI, i.e., transparent models, allows for more accurate self-assessment of one's performance. Future studies could enhance our research by employing local explanation techniques on real-world data with a larger and international sample.
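An illustrative sketch of the "local explanation techniques" the authors point to: fit a turnover classifier and explain a single employee's prediction with SHAP values. The data file, column names, and model choice are assumptions made for the example, not the paper's experimental setup, and numeric feature columns are assumed.

```python
import pandas as pd
import shap
from sklearn.ensemble import RandomForestClassifier

# Hypothetical HRM extract with numeric features and a binary turnover label.
hr = pd.read_csv("hr_records.csv")
X = hr.drop(columns=["left_company"])
y = hr["left_company"]

model = RandomForestClassifier(random_state=0).fit(X, y)

# Local explanation: per-feature contributions to one employee's prediction,
# the kind of output that could support self-assessment in HRM decision tasks.
explainer = shap.Explainer(model, X)
explanation = explainer(X.iloc[:1])
print(explanation.values)        # contribution of each feature for this employee
```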