3,079 research outputs found
An Overview of the Use of Neural Networks for Data Mining Tasks
In the recent years the area of data mining has experienced a considerable demand for technologies that extract knowledge from large and complex data sources. There is a substantial commercial interest as well as research investigations in the area that aim to develop new and improved approaches for extracting information, relationships, and patterns from datasets. Artificial Neural Networks (NN) are popular biologically inspired intelligent methodologies, whose classification, prediction and pattern recognition capabilities have been utilised successfully in many areas, including science, engineering, medicine, business, banking, telecommunication, and many other fields. This paper highlights from a data mining perspective the implementation of NN, using supervised and unsupervised learning, for pattern recognition, classification, prediction and cluster analysis, and focuses the discussion on their usage in bioinformatics and financial data analysis tasks
Improving customer churn prediction by data augmentation using pictorial stimulus-choice data
The purpose of this paper is to determine the added value of pictorial stimulus-choice data in customer churn prediction. Using Random Forests and 5 times 2 fold cross-validation, this study analyzes how much pictorial stimulus choice data and survey data increase the AUC of a churn model over and above administrative, operational and complaints data. The finding is that pictorial-stimulus choice data significantly increases AUC of models with administrative and operational data. The practical implication of this finding is that companies should start considering mining pictorial data from social media sites (e.g. Pinterest), in order to augment their internal customer database. This study is original in that it is the first that assesses the added value of pictorial stimulus-choice data in predictive models. This is important because more and more social media websites are focusing on pictures
Privacy and Confidentiality in an e-Commerce World: Data Mining, Data Warehousing, Matching and Disclosure Limitation
The growing expanse of e-commerce and the widespread availability of online
databases raise many fears regarding loss of privacy and many statistical
challenges. Even with encryption and other nominal forms of protection for
individual databases, we still need to protect against the violation of privacy
through linkages across multiple databases. These issues parallel those that
have arisen and received some attention in the context of homeland security.
Following the events of September 11, 2001, there has been heightened attention
in the United States and elsewhere to the use of multiple government and
private databases for the identification of possible perpetrators of future
attacks, as well as an unprecedented expansion of federal government data
mining activities, many involving databases containing personal information. We
present an overview of some proposals that have surfaced for the search of
multiple databases which supposedly do not compromise possible pledges of
confidentiality to the individuals whose data are included. We also explore
their link to the related literature on privacy-preserving data mining. In
particular, we focus on the matching problem across databases and the concept
of ``selective revelation'' and their confidentiality implications.Comment: Published at http://dx.doi.org/10.1214/088342306000000240 in the
Statistical Science (http://www.imstat.org/sts/) by the Institute of
Mathematical Statistics (http://www.imstat.org
Statistical foundations of ecological rationality
If we reassess the rationality question under the assumption that the uncertainty of the natural world is largely unquantifiable, where do we end up? In this article the author argues that we arrive at a statistical, normative, and cognitive theory of ecological rationality. The main casualty of this rebuilding process is optimality. Once we view optimality as a formal implication of quantified uncertainty rather than an ecologically meaningful objective, the rationality question shifts from being axiomatic/probabilistic in nature to being algorithmic/predictive in nature. These distinct views on rationality mirror fundamental and long-standing divisions in statistics
Personalized information systems : enabling technologies and architecture
Personalized Information Systems are ICT applications that exhibit personalized behaviour, adjusted to the preferences or needs of their users. Personalized Information Systems can bring up business benefits related to customer relationship or to increased efficiency of organizational work.
This work characterizes the Personalized Information Systems in what concerns types of personalized features, presents the enabling technologies, and suggests a general architecture for Personalized Information Systems.Fundação para a Ciência e a Tecnologia (FCT
Observing and recommending from a social web with biases
The research question this report addresses is: how, and to what extent,
those directly involved with the design, development and employment of a
specific black box algorithm can be certain that it is not unlawfully
discriminating (directly and/or indirectly) against particular persons with
protected characteristics (e.g. gender, race and ethnicity)?Comment: Technical Report, University of Southampton, March 201
An Analytical Review of Privacy Preservation Using K Anonymity along with Bayesian Classifier in Data Mining
Privacy Preservation for social media is one of most trending research subject around the world. In addition to its trendiness, it is a very sensitive issue also. People around the world share their private information on the social media without thinking that it may affect their privacy. In such a condition it becomes the unrest duty to prevent the user information which is private. A lot of research workers have already put their ideas on table for the same issue. This paper studies the effect of Bayesian network in contrast to the prevention of the private data over social media. This paper also describes the pros and cons of using Bayesian Network for privacy preservation and also it compares some of the ethical prevention algorithms for the same. The evaluation has been done on the basis of ethical data mining parameters like Precision, Recall, F-Measur
The Role of the Mangement Sciences in Research on Personalization
We present a review of research studies that deal with personalization. We synthesize current knowledge about these areas, and identify issues that we envision will be of interest to researchers working in the management sciences. We take an interdisciplinary approach that spans the areas of economics, marketing, information technology, and operations. We present an overarching framework for personalization that allows us to identify key players in the personalization process, as well as, the key stages of personalization. The framework enables us to examine the strategic role of personalization in the interactions between a firm and other key players in the firm's value system. We review extant literature in the strategic behavior of firms, and discuss opportunities for analytical and empirical research in this regard. Next, we examine how a firm can learn a customer's preferences, which is one of the key components of the personalization process. We use a utility-based approach to formalize such preference functions, and to understand how these preference functions could be learnt based on a customer's interactions with a firm. We identify well-established techniques in management sciences that can be gainfully employed in future research on personalization.CRM, Persoanlization, Marketing, e-commerce,
- …