A Random Forest Guided Tour

Author: Biau Gérard
Scornet Erwan
Publication date: 18/11/2015

The random forest algorithm, proposed by L. Breiman in 2001, has been extremely successful as a general-purpose classification and regression method. The approach, which combines several randomized decision trees and aggregates their predictions by averaging, has shown excellent performance in settings where the number of variables is much larger than the number of observations. Moreover, it is versatile enough to be applied to large-scale problems, is easily adapted to various ad-hoc learning tasks, and returns measures of variable importance. The present article reviews the most recent theoretical and methodological developments for random forests. Emphasis is placed on the mathematical forces driving the algorithm, with special attention given to the selection of parameters, the resampling mechanism, and variable importance measures. This review is intended to provide non-experts easy access to the main ideas

arXiv.org e-Print Archive
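
The abstract above describes a forest as several randomized decision trees whose predictions are aggregated by averaging. A minimal sketch of that idea follows, assuming depth-1 regression trees (stumps), one randomly chosen feature per tree, and bootstrap resampling. These simplifications and the names `fit_stump`, `fit_forest`, and `predict` are illustrative assumptions and are not taken from the paper.

```python
import random
from statistics import mean


def fit_stump(X, y, feature):
    """Fit a depth-1 regression tree on one feature by minimizing squared error."""
    best = None
    values = sorted(set(row[feature] for row in X))
    # Candidate thresholds are midpoints between consecutive distinct values,
    # so both sides of every split are non-empty.
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [yi for row, yi in zip(X, y) if row[feature] <= t]
        right = [yi for row, yi in zip(X, y) if row[feature] > t]
        ml, mr = mean(left), mean(right)
        sse = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
        if best is None or sse < best[0]:
            best = (sse, t, ml, mr)
    if best is None:
        # The feature is constant in this sample: predict the sample mean.
        m = mean(y)
        return (feature, 0.0, m, m)
    _, t, ml, mr = best
    return (feature, t, ml, mr)


def fit_forest(X, y, n_trees=50, seed=0):
    """Train n_trees stumps, each on a bootstrap sample and a random feature."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        feature = rng.randrange(d)                  # randomize the split variable
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feature))
    return forest


def predict(forest, x):
    """Aggregate the individual tree predictions by averaging."""
    return mean(ml if x[f] <= t else mr for f, t, ml, mr in forest)
```

A practical forest grows deep trees and draws a fresh random subset of candidate features at every split; the stumps here only keep the resampling and averaging steps easy to follow.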

Can Deep Learning Predict Risky Retail Investors? A Case Study in Financial Risk Behavior Forecasting

Author: Johnson Johnnie E. V.
Kolesnikova Alisa
Lessmann Stefan
Ma Tiejun
Sung Ming-Chien
Yang Yaodong
Publication date: 17/11/2019

The paper examines the potential of deep learning to support decisions in financial risk management. We develop a deep learning model for predicting whether individual spread traders secure profits from future trades. This task embodies typical modeling challenges faced in risk and behavior forecasting. Conventional machine learning requires data that is representative of the feature-target relationship and relies on the often costly development, maintenance, and revision of handcrafted features. Consequently, modeling highly variable, heterogeneous patterns such as trader behavior is challenging. Deep learning promises a remedy. Learning hierarchical distributed representations of the data in an automatic manner (e.g., risk-taking behavior), it uncovers generative features that determine the target (e.g., trader's profitability), avoids manual feature engineering, and is more robust toward change (e.g., dynamic market conditions). The results of employing a deep network for operational risk forecasting confirm the feature learning capability of deep learning, provide guidance on designing a suitable network architecture, and demonstrate the superiority of deep learning over machine learning and rule-based benchmarks. Comment: Within the "equal" contribution, Yaodong Yang contributed the core deep learning algorithm along with its experimental results, and the first draft of the manuscript (including Figures 1, 2, 3, 4, 7, 8, 9, 11, and Table 3

arXiv.org e-Print Archive

On-the-Job Learning with Bayesian Decision Theory

Author: Chaganty Arun
Liang Percy
Manning Chris
Werling Keenon
Publication date: 07/12/2015

Our goal is to deploy a high-accuracy system starting with zero training examples. We consider an "on-the-job" setting, where as inputs arrive, we use real-time crowdsourcing to resolve uncertainty where needed and output our prediction when confident. As the model improves over time, the reliance on crowdsourcing queries decreases. We cast our setting as a stochastic game based on Bayesian decision theory, which allows us to balance latency, cost, and accuracy objectives in a principled way. Computing the optimal policy is intractable, so we develop an approximation based on Monte Carlo Tree Search. We tested our approach on three datasets: named-entity recognition, sentiment classification, and image classification. On the NER task we obtained more than an order of magnitude reduction in cost compared to full human annotation, while boosting performance relative to the expert-provided labels. We also achieve an 8% F1 improvement over having a single human label the whole set, and a 28% F1 improvement over online learning. Comment: As appearing in NIPS 201

arXiv.org e-Print Archive

FECBench: A Holistic Interference-aware Approach for Application Performance Modeling

Author: Barve Yogesh D.
Bhattacharjee Anirban
Chhokra Ajay Dev
Gokhale Aniruddha
Kang Zhuangwei
Khare Shweta
Shekhar Shashank
Sun Hongyang
Publication date: 12/04/2019

Services hosted in multi-tenant cloud platforms often encounter performance interference due to contention for non-partitionable resources, which in turn causes unpredictable behavior and degradation in application performance. To grapple with these problems and to define effective resource management solutions for their services, providers often must expend significant efforts and incur prohibitive costs in developing performance models of their services under a variety of interference scenarios on different hardware. This is a hard problem due to the wide range of possible co-located services and their workloads, and the growing heterogeneity in the runtime platforms including the use of fog and edge-based resources, not to mention the accidental complexity in performing application profiling under a variety of scenarios. To address these challenges, we present FECBench, a framework to guide providers in building performance interference prediction models for their services without incurring undue costs and efforts. The contributions of the paper are as follows. First, we developed a technique to build resource stressors that can stress multiple system resources all at once in a controlled manner to gain insights about the interference on an application's performance. Second, to overcome the need for exhaustive application profiling, FECBench intelligently uses the design of experiments (DoE) approach to enable users to build surrogate performance models of their services. Third, FECBench maintains an extensible knowledge base of application combinations that create resource stresses across the multi-dimensional resource design space. Empirical results using real-world scenarios to validate the efficacy of FECBench show that the predicted application performance has a median error of only 7.6% across all test cases, with 5.4% in the best case and 13.5% in the worst case

arXiv.org e-Print Archive

A Survey of Prediction Using Social Media

Author: Kak Subhash
Yu Sheng
Publication date: 07/03/2012

Social media comprises interactive applications and platforms for creating, sharing, and exchanging user-generated content. The past ten years have brought huge growth in social media, especially online social networking services, and it is changing the way we organize and communicate. It aggregates opinions and feelings of diverse groups of people at low cost. Mining the attributes and contents of social media gives us an opportunity to discover social structure characteristics, analyze action patterns qualitatively and quantitatively, and sometimes predict future human-related events. In this paper, we first discuss the realms which can be predicted with current social media, then give an overview of available predictors and techniques of prediction, and finally discuss challenges and possible future directions. Comment: 20 page

arXiv.org e-Print Archive

CiteSeerX

Attacking Machine Learning models as part of a cyber kill chain

Author: Nguyen Tam N.
Publication date: 06/04/2018

Machine learning is gaining popularity in the network security domain as many more network-enabled devices get connected, as malicious activities become stealthier, and as new technologies like Software Defined Networking emerge. Compromising a machine learning model is a desirable goal. In fact, spammers have been quite successful at getting through machine-learning-enabled spam filters for years. While previous work has addressed adversarial machine learning, none of it has been considered within a defense-in-depth environment, in which correct classification alone may not be good enough. For the first time, this paper proposes a cyber kill-chain for attacking machine learning models together with a proof of concept. The intention is to provide a high-level attack model that inspires more secure processes in research/design/implementation of machine-learning-based security solutions. Comment: 8 page

arXiv.org e-Print Archive

A Survey of Online Experiment Design with the Stochastic Multi-Armed Bandit

Author: Burtini Giuseppe
Lawrence Ramon
Loeppky Jason
Publication date: 03/11/2015

Adaptive and sequential experiment design is a well-studied area in numerous domains. We survey and synthesize the work of the online statistical learning paradigm referred to as multi-armed bandits, integrating the existing research as a resource for a certain class of online experiments. We first explore the traditional stochastic model of a multi-armed bandit, then explore a taxonomic scheme of complications to that model, relating each complication to a specific requirement or consideration of the experiment design context. Finally, at the end of the paper, we present a table of known upper bounds of regret for all studied algorithms, providing both perspectives for future theoretical work and a decision-making tool for practitioners looking for theoretical guarantees. Comment: 49 pages, 1 figur

arXiv.org e-Print Archive
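
To make the stochastic multi-armed bandit setting in the abstract above concrete, here is a minimal sketch that uses UCB1, a widely known index policy, on simulated Bernoulli arms. The abstract does not name specific algorithms, so the choice of UCB1, the function name `ucb1`, and the simulated reward probabilities are all assumptions made for illustration.

```python
import math
import random


def ucb1(arm_probs, horizon, seed=0):
    """Run UCB1 on simulated Bernoulli arms and return the pull count per arm."""
    rng = random.Random(seed)
    k = len(arm_probs)
    counts = [0] * k   # number of times each arm was pulled
    sums = [0.0] * k   # total reward collected from each arm
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # initialization: pull every arm once
        else:
            # Pick the arm with the largest empirical mean plus exploration bonus.
            arm = max(
                range(k),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2 * math.log(t) / counts[a]),
            )
        reward = 1.0 if rng.random() < arm_probs[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return counts


# Hypothetical experiment: two variants with success rates 0.2 and 0.8.
pulls = ucb1([0.2, 0.8], horizon=2000)
```

The exploration bonus shrinks as an arm is pulled more often, so over time the policy concentrates its pulls on the arm with the higher observed success rate while still occasionally sampling the other.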

A Bayesian Perspective of Statistical Machine Learning for Big Data

Author: Das Sourish
Sahu Sujit K
Sambasivan Rajiv
Publication date: 12/11/2018

Statistical Machine Learning (SML) refers to a body of algorithms and methods by which computers are allowed to discover important features of input data sets which are often very large in size. The very task of feature discovery from data is essentially the meaning of the keyword 'learning' in SML. Theoretical justifications for the effectiveness of the SML algorithms are underpinned by sound principles from different disciplines, such as Computer Science and Statistics. The theoretical underpinnings particularly justified by statistical inference methods are together termed statistical learning theory. This paper provides a review of SML from a Bayesian decision theoretic point of view, where we argue that many SML techniques are closely connected to making inference by using the so-called Bayesian paradigm. We discuss many important SML techniques such as supervised and unsupervised learning, deep learning, online learning and Gaussian processes, especially in the context of very large data sets where these are often employed. We present a dictionary which maps the key concepts of SML from Computer Science and Statistics. We illustrate the SML techniques with three moderately large data sets where we also discuss many practical implementation issues. Thus the review is especially targeted at statisticians and computer scientists who are aspiring to understand and apply SML for moderately large to big data sets. Comment: 26 pages, 3 figures, Review pape

arXiv.org e-Print Archive

Student Success Prediction in MOOCs

Author: Brooks Christopher
Gardner Josh
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 10/04/2018

Predictive models of student success in Massive Open Online Courses (MOOCs) are a critical component of effective content personalization and adaptive interventions. In this article we review the state of the art in predictive models of student success in MOOCs and present a categorization of MOOC research according to the predictors (features), prediction (outcomes), and underlying theoretical model. We critically survey work across each category, providing data on the raw data source, feature engineering, statistical model, evaluation method, prediction architecture, and other aspects of these experiments. Such a review is particularly useful given the rapid expansion of predictive modeling research in MOOCs since the emergence of major MOOC platforms in 2012. This survey reveals several key methodological gaps, which include extensive filtering of experimental subpopulations, ineffective student model evaluation, and the use of experimental data which would be unavailable for real-world student success prediction and intervention, which is the ultimate goal of such models. Finally, we highlight opportunities for future research, which include temporal modeling, research bridging predictive and explanatory student models, work which contributes to learning theory, and evaluating long-term learner success in MOOCs

arXiv.org e-Print Archive

A Study of WhatsApp Usage Patterns and Prediction Models without Message Content

Author: Avidov Or
Kraus Sarit
Rosenfeld Avi
Sarne David
Sina Sigal
Publication date: 09/02/2018

Internet social networks have become ubiquitous applications that allow people to easily share text, pictures, and audio and video files. Popular networks include WhatsApp, Facebook, Reddit and LinkedIn. We present an extensive study of the usage of the WhatsApp social network, an Internet messaging application that is quickly replacing SMS messaging. In order to better understand people's use of the network, we provide an analysis of over 6 million messages from over 100 users, with the objective of building demographic prediction models using activity data. We performed extensive statistical and numerical analysis of the data and found significant differences in WhatsApp usage across people of different genders and ages. We also loaded the data into the Weka data mining package and studied models created from decision tree and Bayesian network algorithms. We found that different genders and age demographics had significantly different usage habits in almost all message and group attributes. We also noted differences in users' group behavior and created prediction models for it, including whether a given group would have relatively more file attachments, a larger number of participants, a higher frequency of activity, quicker response times, and shorter messages. We were successful in quantifying and predicting a user's gender and age demographic. Similarly, we were able to predict different types of group usage. All models were built without analyzing message content. We present a detailed discussion of the specific attributes that were contained in all predictive models and suggest possible applications based on these results. Comment: 24 page

arXiv.org e-Print Archive