Risk-based audits in a behavioural model.
The tools of predictive analytics are widely used in the analysis of large data sets to predict future patterns in the system. In particular, predictive analytics is used to estimate risk of engaging in certain behavior. Risk-based audits are used by revenue services to target potentially noncompliant taxpayers, but the results of predictive analytics serve predominantly only as a guide rather than a rule. “Auditor judgment” retains an important role in selecting audit targets. This article assesses the effectiveness of using predictive analytics in a model of the compliance decision that incorporates several components from behavioral economics: subjective beliefs about audit probabilities, a social custom reward from honest tax payment, and a degree of risk aversion that increases with age. Simulation analysis shows that predictive analytics are successful in raising compliance and that the resulting pattern of audits is very close to being a cutoff rule
What Types of Predictive Analytics are Being Used in Talent Management Organizations?
[Excerpt] Talent management organizations are increasingly deriving insights from data to make better decisions. Their use of data analytics is advancing from descriptive to predictive and prescriptive analytics. Descriptive analytics is the most basic form, providing the hindsight view of what happened and laying the foundation for turning data into information. More advanced uses are predictive (advanced forecasts and the ability to model future results) and prescriptive (“the top-tier of analytics that leverage machine learning techniques … to both interpret data and recommend actions”) analytics (1). Appendix A illustrates these differences. This report summarizes our most relevant findings about how both academic researchers and HR practitioners are successfully using data analytics to inform decision-making in workforce issues, with a focus on executive assessment and selection
Privacy Tradeoffs in Predictive Analytics
Online services routinely mine user data to predict user preferences, make
recommendations, and place targeted ads. Recent research has demonstrated that
several private user attributes (such as political affiliation, sexual
orientation, and gender) can be inferred from such data. Can a
privacy-conscious user benefit from personalization while simultaneously
protecting her private attributes? We study this question in the context of a
rating prediction service based on matrix factorization. We construct a
protocol of interactions between the service and users that has remarkable
optimality properties: it is privacy-preserving, in that no inference algorithm
can succeed in inferring a user's private attribute with a probability better
than random guessing; it has maximal accuracy, in that no other
privacy-preserving protocol improves rating prediction; and, finally, it
involves a minimal disclosure, as the prediction accuracy strictly decreases
when the service reveals less information. We extensively evaluate our protocol
using several rating datasets, demonstrating that it successfully blocks the
inference of gender, age and political affiliation, while incurring less than
5% decrease in the accuracy of rating prediction. Comment: Extended version of the paper appearing in SIGMETRICS 201
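As a rough illustration of the kind of rating prediction service the abstract refers to, the sketch below fits a plain matrix factorization with stochastic gradient descent. It does not implement the privacy-preserving protocol described above; the function names, the hyperparameters, and the toy ratings are all assumptions made for illustration.

```python
import numpy as np

def factorize(ratings, n_users, n_items, rank=2, lr=0.02, reg=0.02,
              epochs=200, seed=0):
    """Fit user and item factors to (user, item, rating) triples with SGD.

    Hypothetical helper: a generic matrix factorization, not the
    protocol from the paper.
    """
    rng = np.random.default_rng(seed)
    U = 0.1 * rng.standard_normal((n_users, rank))
    V = 0.1 * rng.standard_normal((n_items, rank))
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - U[u] @ V[i]
            u_old = U[u].copy()  # use the pre-update user factor for the item step
            U[u] += lr * (err * V[i] - reg * U[u])
            V[i] += lr * (err * u_old - reg * V[i])
    return U, V

def rmse(ratings, U, V):
    """Root-mean-square error of predicted ratings U[u] . V[i]."""
    errs = [r - U[u] @ V[i] for u, i, r in ratings]
    return float(np.sqrt(np.mean(np.square(errs))))

# Toy data (assumed): two groups of users with opposite tastes.
ratings = [
    (0, 0, 5.0), (0, 1, 5.0), (0, 2, 1.0),
    (1, 0, 5.0), (1, 1, 4.0), (1, 3, 1.0),
    (2, 0, 1.0), (2, 2, 5.0), (2, 3, 5.0),
    (3, 1, 1.0), (3, 2, 4.0), (3, 3, 5.0),
]

U0, V0 = factorize(ratings, 4, 4, epochs=0)   # untrained factors
U, V = factorize(ratings, 4, 4)               # trained factors
rmse_before = rmse(ratings, U0, V0)
rmse_after = rmse(ratings, U, V)
```

A predicted rating for an unseen (user, item) pair is then `U[u] @ V[i]`. In the setting of the abstract, the open question is how much information a user must disclose for such predictions to remain accurate.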
Efficient Scalable Accurate Regression Queries in In-DBMS Analytics
Recent trends aim to incorporate advanced data analytics capabilities within DBMSs. Linear regression queries are fundamental to exploratory analytics and predictive modeling. However, computing their exact answers leaves a lot to be desired in terms of efficiency and scalability. We contribute a novel predictive analytics model and associated regression query processing algorithms, which are efficient, scalable and accurate. We focus on predicting the answers to two key query types that reveal dependencies between the values of different attributes: (i) mean-value queries and (ii) multivariate linear regression queries, both within specific data subspaces defined based on the values of other attributes. Our algorithms achieve many orders of magnitude improvement in query processing efficiency and near-perfect approximations of the underlying relationships among data attributes
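To make the two query types concrete, the following sketch computes their exact answers by scanning the data, which is the costly baseline the abstract contrasts with. It is not the paper's predictive model. Defining a subspace as a ball around a center point is an assumption, as are all function names and the synthetic data.

```python
import numpy as np

def subspace_mask(X, center, radius):
    """Select rows of X within Euclidean distance `radius` of `center`.

    Assumption: a subspace is a ball over the input attributes.
    """
    return np.linalg.norm(X - center, axis=1) <= radius

def mean_value_query(X, y, center, radius):
    """Exact mean of the output attribute y inside the subspace."""
    mask = subspace_mask(X, center, radius)
    if not mask.any():
        return None
    return float(y[mask].mean())

def regression_query(X, y, center, radius):
    """Exact least-squares fit of y on X inside the subspace.

    Returns [intercept, slope_1, ..., slope_d], or None when the
    subspace holds too few rows to fit the model.
    """
    mask = subspace_mask(X, center, radius)
    if mask.sum() < X.shape[1] + 1:
        return None
    A = np.column_stack([np.ones(mask.sum()), X[mask]])
    coef, *_ = np.linalg.lstsq(A, y[mask], rcond=None)
    return coef

# Synthetic, noise-free data (assumed): y = 3 + 2*x1 - x2.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(1000, 2))
y = 3.0 + 2.0 * X[:, 0] - 1.0 * X[:, 1]

center = np.array([0.5, 0.5])
coef = regression_query(X, y, center, radius=0.2)
mean_y = mean_value_query(X, y, center, radius=0.2)
```

Both functions touch every row of `X` on each call, so their cost grows with the size of the table. That per-query scan is what a model that predicts the answers would avoid.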
Damned Lies & Criminal Sentencing Using Evidence-Based Tools
The boom of big data and predictive analytics has revolutionized business. eHarmony matches customers based on shared likes and expectations for romance, and Target uses similar methods to strategically push its products on shoppers. Courts and Departments of Corrections have also sought to employ similar tools. However, the use of data analytics in sentencing raises a host of constitutional concerns. In State v. Loomis, the Wisconsin Supreme Court was faced with whether the use of an actuarial risk assessment tool based on a proprietary formula violates a defendant’s right to due process where the defendant could not review how the various inputs were weighed. The opinion attempts to save a constitutionally dubious technique and reads as a warning to lower courts in the proper use of predictive analytics. This article explores certain equal protection and due process arguments implicated by Loomis
Predictive Analytics in Information Systems Research
This research essay highlights the need to integrate predictive analytics into information systems research and shows several concrete ways in which this goal can be accomplished. Predictive analytics include empirical methods (statistical and other) that generate data predictions as well as methods for assessing predictive power. Predictive analytics not only assist in creating practically useful models, they also play an important role alongside explanatory modeling in theory building and theory testing. We describe six roles for predictive analytics: new theory generation, measurement development, comparison of competing theories, improvement of existing models, relevance assessment, and assessment of the predictability of empirical phenomena. Despite the importance of predictive analytics, we find that they are rare in the empirical IS literature. Extant IS literature relies nearly exclusively on explanatory statistical modeling, where statistical inference is used to test and evaluate the explanatory power of underlying causal models, and predictive power is assumed to follow automatically from the explanatory model. However, explanatory power does not imply predictive power and thus predictive analytics are necessary for assessing predictive power and for building empirical models that predict well. To show that predictive analytics and explanatory statistical modeling are fundamentally disparate, we show that they are different in each step of the modeling process. These differences translate into different final models, so that a pure explanatory statistical model is best tuned for testing causal hypotheses and a pure predictive model is best in terms of predictive power. We convert a well-known explanatory paper on TAM to a predictive context to illustrate these differences and show how predictive analytics can add theoretical and practical value to IS research
Scalable aggregation predictive analytics: a query-driven machine learning approach
We introduce a predictive modeling solution that provides high quality predictive analytics over aggregation queries in Big Data environments. Our predictive methodology is generally applicable in environments in which large-scale data owners may or may not restrict access to their data and allow only aggregation operators like COUNT to be executed over their data. In this context, our methodology is based on historical queries and their answers to accurately predict ad-hoc queries’ answers. We focus on the widely used set-cardinality, i.e., COUNT, aggregation query, as COUNT is a fundamental operator for both internal data system optimizations and for aggregation-oriented data exploration and predictive analytics. We contribute a novel, query-driven Machine Learning (ML) model whose goals are to: (i) learn the query-answer space from past issued queries, (ii) associate the query space with local linear regression & associative function estimators, (iii) define query similarity, and (iv) predict the cardinality of the answer set of unseen incoming queries, referred to as the Set Cardinality Prediction (SCP) problem. Our ML model incorporates incremental ML algorithms for ensuring high quality prediction results. The significance of our contribution lies in that it (i) is the only query-driven solution applicable over general Big Data environments, which include restricted-access data, (ii) offers incremental learning adjusted for arriving ad-hoc queries, which is well suited for query-driven data exploration, and (iii) offers a performance (in terms of scalability, SCP accuracy, processing time, and memory requirements) that is superior to data-centric approaches. We provide a comprehensive performance evaluation of our model evaluating its sensitivity, scalability and efficiency for quality predictive analytics. In addition, we report on the development and incorporation of our ML model in Spark showing its superior performance compared to Spark’s COUNT method
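The query-driven idea can be sketched as follows: the predictor sees only past queries and their COUNT answers, never the underlying rows. This is a simplified stand-in and not the paper's model. It uses Euclidean distance as the query similarity and a batch least-squares fit over the k most similar past queries, with no incremental learning. The query encoding (center, radius), the function names, and the synthetic data are assumptions.

```python
import numpy as np

def predict_count(past_queries, past_counts, new_query, k=10):
    """Predict a COUNT answer from the k most similar past queries.

    Fits a local linear model (intercept plus one weight per query
    parameter) on the neighbours and evaluates it at `new_query`.
    """
    dist = np.linalg.norm(past_queries - new_query, axis=1)
    idx = np.argsort(dist)[:k]
    A = np.column_stack([np.ones(len(idx)), past_queries[idx]])
    coef, *_ = np.linalg.lstsq(A, past_counts[idx], rcond=None)
    pred = coef[0] + new_query @ coef[1:]
    return max(0.0, float(pred))  # a cardinality cannot be negative

# Hidden data set (assumed); only true_count() ever reads it.
rng = np.random.default_rng(1)
data = rng.uniform(0.0, 1.0, size=10_000)

def true_count(query):
    """Exact COUNT of values within `radius` of `center`."""
    center, radius = query
    return int(np.sum(np.abs(data - center) <= radius))

# Query log: each row is (center, radius) with its observed answer.
past_queries = np.column_stack([
    rng.uniform(0.2, 0.8, 500),
    rng.uniform(0.05, 0.15, 500),
])
past_counts = np.array([true_count(q) for q in past_queries], dtype=float)

new_query = np.array([0.5, 0.1])
predicted = predict_count(past_queries, past_counts, new_query, k=20)
actual = true_count(new_query)
```

Here `predict_count` never receives `data`, which mirrors the restricted-access setting in which only aggregate answers are exposed.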
