364 research outputs found

    Send frequency prediction on email marketing

    Email Marketing is a form of direct marketing that uses email as a means of commercial communication. In a broader perspective, any email sent to potential or current subscribers can also be considered email marketing. Because a subscriber receives several communications throughout the day, each new arrival reduces the visibility of older emails and, consequently, lowers open rates. Considering that some subscribers prefer to open and read their communications in the morning, others in the afternoon, and some at night, a communication needs to be sent at a time that gives it the visibility that leads to higher open rates and captures the subscriber's interest in the entity that sent it. This thesis presents a solution for sending marketing communications to subscribers or potential subscribers at the right time. Its contribution consists of a segmented model that uses a traditional clustering algorithm based on the information exchanged between companies and subscribers. The model then implements a parallel ensemble approach using simple averaging and stacking techniques with trained regression algorithms (RF, Linear Regression, KNN, and SVR) and a deep learning algorithm (RNNs) to determine the best time to send email communications. The approach is trained and tested on a dataset provided by the company E-goi. The results obtained in this thesis indicate that, of all the trained ML algorithms, KNN is best suited to predicting the best time to send email communications. Of the two techniques used for the parallel ensemble approach, stacking is the more suitable for predicting the best time to send email communications.
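The difference between the two ensemble techniques named in this abstract, simple averaging and stacking, can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes two tiny hand-written base learners (a 1-D linear regression and a 1-D KNN regressor) in place of the RF, Linear Regression, KNN, SVR, and RNN models, a no-intercept linear meta-learner for stacking, and purely synthetic toy numbers. All function names are hypothetical.

```python
from statistics import mean


def fit_linear(xs, ys):
    """Closed-form 1-D least squares; returns a predict function."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    return lambda x: intercept + slope * x


def fit_knn(xs, ys, k=2):
    """k-nearest-neighbours regressor on a single numeric feature."""
    pairs = list(zip(xs, ys))

    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return mean(y for _, y in nearest)

    return predict


def simple_average(models, x):
    """Simple averaging: every base model gets the same weight."""
    return mean(m(x) for m in models)


def fit_stacking_weights(models, xs, ys):
    """Stacking with a linear meta-learner (assumption: two base models,
    no intercept). Solves the 2x2 normal equations on held-out data."""
    p = [models[0](x) for x in xs]
    q = [models[1](x) for x in xs]
    a = sum(pi * pi for pi in p)
    b = sum(pi * qi for pi, qi in zip(p, q))
    d = sum(qi * qi for qi in q)
    u = sum(pi * yi for pi, yi in zip(p, ys))
    v = sum(qi * yi for qi, yi in zip(q, ys))
    det = a * d - b * b  # zero if the two models' predictions are collinear
    return (u * d - b * v) / det, (a * v - b * u) / det


def stacked_predict(models, weights, x):
    """Combine base predictions with the learned meta-weights."""
    return sum(w * m(x) for w, m in zip(weights, models))


if __name__ == "__main__":
    # Synthetic toy data (not from the E-goi dataset): one numeric feature
    # mapped to a preferred send hour.
    train_x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    train_y = [9.0, 10.0, 12.0, 13.0, 15.0, 18.0]
    hold_x = [1.5, 3.5, 5.5]  # held out to fit the meta-learner
    hold_y = [9.5, 12.5, 16.0]

    models = [fit_linear(train_x, train_y), fit_knn(train_x, train_y, k=2)]
    weights = fit_stacking_weights(models, hold_x, hold_y)

    print("simple average:", simple_average(models, 4.5))
    print("stacking:      ", stacked_predict(models, weights, 4.5))
```

Simple averaging fixes the weights in advance, whereas stacking learns them from predictions on data the base models did not see during training, which is one plausible reason a stacked ensemble can outperform a plain average.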

    Effective and Secure Healthcare Machine Learning System with Explanations Based on High Quality Crowdsourcing Data

    Affordable cloud computing technologies allow users to efficiently outsource, store, and manage their Personal Health Records (PHRs) and share them with their caregivers or physicians. With the exponential growth of stored large-scale clinical data and the growing need for personalized care, researchers are keen on developing data mining methodologies to learn hidden patterns in such data efficiently. While studies have shown that such advances can significantly improve the performance of various healthcare applications for clinical decision making and personalized medicine, the collected medical datasets are highly ambiguous and noisy. Thus, it is essential to develop better tools for disease progression and survival rate prediction, in which the dataset is cleaned before it is used and useful feature selection techniques are employed before prediction models are constructed. In addition, predictions without explanations prevent medical personnel and patients from adopting such healthcare deep learning models. Thus, any prediction model must come with some explanation. Finally, despite the efficiency of machine learning systems and their outstanding prediction performance, reusing pre-trained models remains risky, since most machine learning modules contributed and maintained by third parties lack proper checking to ensure that they are robust to adversarial attacks. Mechanisms for detecting such attacks are needed. In this thesis, we focus on addressing all of the above issues: (i) Privacy Preserving Disease Treatment & Complication Prediction System (PDTCPS): A privacy-preserving disease treatment and complication prediction scheme (PDTCPS) is proposed, which allows authorized users to conduct searches for disease diagnosis, personalized treatments, and prediction of potential complications. 
(ii) Incentivizing High Quality Crowdsourcing Data For Disease Prediction: A new incentive model with individual rationality and platform profitability features is developed to encourage different hospitals to share high-quality data so that better prediction models can be constructed. We also explore how data cleaning and feature selection techniques affect the performance of the prediction models. (iii) Explainable Deep Learning Based Medical Diagnostic System: A deep learning based medical diagnosis system (DL-MDS) is presented, which integrates heterogeneous medical data sources to produce better disease diagnoses with explanations for authorized users who submit their personalized health-related queries. (iv) Attacks on RNN based Healthcare Learning Systems and Their Detection & Defense Mechanisms: Potential attacks on Recurrent Neural Network (RNN) based ML systems are identified, and low-cost detection and defense schemes are designed to prevent such adversarial attacks. Finally, we conduct extensive experiments using both synthetic and real-world datasets to validate the feasibility and practicality of our proposed systems.

    Characterizing Productive Perseverance Using Sensor-Free Detectors of Student Knowledge, Behavior, and Affect

    Failure is a necessary step in the process of learning. For this reason, there has been a myriad of research dedicated to the study of student perseverance in the presence of failure, leading to several commonly-cited theories and frameworks to characterize productive and unproductive representations of the construct of persistence. While researchers are in agreement that it is important for students to persist when struggling to learn new material, there can be both positive and negative aspects of persistence. What is it, then, that separates productive from unproductive persistence? The purpose of this work is to address this question through the development, extension, and study of data-driven models of student affect, behavior, and knowledge. The increased adoption of computer-based learning platforms in real classrooms has led to unique opportunities to study student learning at both fine levels of granularity and longitudinally at scale. Prior work has leveraged machine learning methods, existing learning theory, and previous education research to explore various aspects of student learning. These include the development of sensor-free detectors that utilize only the student interaction data collected through such learning platforms. Building off of the considerable amount of prior research, this work employs state-of-the-art machine learning methods in conjunction with the large scale granular data collected by computer-based learning platforms in alignment with three goals. First, this work focuses on the development of student models that study learning through the use of advancements in student modeling and deep learning methodologies. Second, this dissertation explores the development of tools that incorporate such models to support teachers in taking action in real classrooms to promote productive approaches to learning. 
Finally, this work aims to close the loop by applying these detector models to better understand the underlying constructs they measure and their connection to productive perseverance and commonly observed learning outcomes.

    Advances in Reinforcement Learning

    Reinforcement Learning (RL) is a very dynamic area in terms of theory and application. This book brings together many different aspects of current research in several fields associated with RL, which has been growing rapidly and producing a wide variety of learning algorithms for different applications. Across 24 chapters, it covers a broad range of topics in RL and their application in autonomous systems. Some chapters provide a general overview of RL, while others focus mostly on applications of RL paradigms: Game Theory, Multi-Agent Theory, Robotics, Networking Technologies, Vehicular Navigation, Medicine, and Industrial Logistics.

    Proceedings, MSVSCC 2018

    Proceedings of the 12th Annual Modeling, Simulation & Visualization Student Capstone Conference held on April 19, 2018 at VMASC in Suffolk, Virginia. 155 pp

    Essays in Identity and Urban Economics.

    This dissertation explores aspects of identity choice and change in an economic context, and how choice of location can help predict “quality of life”. The first chapter studies the malleability of race for those that are mixed-race. Many modern surveys that collect demographic information now allow one or more racial categories to be chosen for one person. I construct a simple model of racial identity choice which implies that cultural and socioeconomic factors will influence the racial choices of those with multiracial ancestry. Using nationally representative data from the Census and the American Community Survey (ACS) I show that factors such as region, year, age, employment, and wages are associated with race selection among this population. Therefore, measuring socioeconomic outcomes of multiracial groups may be complicated if these same socioeconomic outcomes influence self-reported race. The second chapter examines relative group size, or whether a group is in the minority, a changeable social identity. We study how laboratory-created majorities and minorities interact, and how changing relative group size affects behavior. Our novel design allows us to examine whether two groups of unequal size exhibit differences in levels of trust and of trustworthiness and test whether causing the majority group to become the minority group, and vice-versa, changes behavior. We find that real-world majority race interacts with laboratory-created minority identity. We also find that subjects do not change their behavior when their relative group sizes change; behavior is driven by initial group size differences. In the third chapter we examine variation in local rents, wage levels, commuting costs, household characteristics, and amenities for 2071 areas covering the United States, within metropolitan areas, by density and central-city status. 
We demonstrate the sensibility of estimating wage levels by workplace, not residence, and recover decentralized rent gradients that fall with commuting costs. We construct and map a willingness-to-pay index, which indicates the “quality of life” typical households receive from local amenities, when households are similar, mobile, and informed. This index varies considerably within metros, and is typically high in areas that are dense, suburban, sunny, mild, safe, entertaining, and have elevated school funding.
    PhD, Economics, University of Michigan, Horace H. Rackham School of Graduate Studies
    http://deepblue.lib.umich.edu/bitstream/2027.42/113392/1/bertlue_1.pd