    An Online Causal Inference Framework for Modeling and Designing Systems Involving User Preferences: A State-Space Approach

    We provide a causal inference framework to model the effects of machine learning algorithms on user preferences. We then use this mathematical model to prove that the overall system can be tuned to alter those preferences in a desired manner. A user can be an online shopper or a social media user, exposed to digital interventions produced by machine learning algorithms. A user preference can be anything from inclination towards a product to a political party affiliation. Our framework uses a state-space model to represent user preferences as latent system parameters which can only be observed indirectly via online user actions such as a purchase activity or social media status updates, shares, blogs, or tweets. Based on these observations, machine learning algorithms produce digital interventions such as targeted advertisements or tweets. We model the effects of these interventions through a causal feedback loop, which alters the corresponding preferences of the user. We then introduce algorithms in order to estimate and later tune the user preferences to a particular desired form. We demonstrate the effectiveness of our algorithms through experiments in different scenarios. © 2017 Ibrahim Delibalta et al
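
    The estimate-then-tune loop described in this abstract can be illustrated with a minimal sketch. It assumes a scalar linear-Gaussian state-space model with a Kalman filter as the estimator; the model form, the function name `simulate_and_steer`, and all parameter values are hypothetical illustrations and are not taken from the paper.

```python
import random


def simulate_and_steer(target, steps=200, a=0.95, b=0.5, q=0.01, r=0.25, seed=0):
    """Estimate a latent preference from noisy actions and steer it to `target`.

    Hypothetical model (an assumption, not the paper's):
        x[t+1] = a * x[t] + b * u[t] + w[t],   w ~ N(0, q)
        y[t]   = x[t] + v[t],                  v ~ N(0, r)
    """
    rng = random.Random(seed)
    x = 0.0                # true latent preference, hidden from the algorithm
    x_hat, p = 0.0, 1.0    # Kalman estimate and its variance
    for _ in range(steps):
        # The preference is observed only indirectly, via a noisy user action.
        y = x + rng.gauss(0.0, r ** 0.5)
        # Kalman measurement update.
        k = p / (p + r)
        x_hat += k * (y - x_hat)
        p *= 1.0 - k
        # Pick the intervention that moves the predicted preference onto the target.
        u = (target - a * x_hat) / b
        # Causal feedback: the true preference evolves under the intervention.
        x = a * x + b * u + rng.gauss(0.0, q ** 0.5)
        # Kalman time update, using the known intervention.
        x_hat = a * x_hat + b * u
        p = a * a * p + q
    return x, x_hat


final_x, final_estimate = simulate_and_steer(target=1.0)
print(final_x, final_estimate)
```

    In this sketch the estimate lands on the target after every step by construction, while the true preference only fluctuates around it because of process and observation noise.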

    Online Anomaly Detection with Nested Trees

    We introduce an online anomaly detection algorithm that processes data in a sequential manner. At each time step, the algorithm makes a new observation, produces a decision, and then adaptively updates all its parameters to enhance its performance. The algorithm mainly works in an unsupervised manner since in most real-life applications labeling the data is costly. Even so, whenever feedback is available, the algorithm uses it for better adaptation. The algorithm has two stages. In the first stage, it constructs a score function similar to a probability density function to model the underlying nominal distribution (if there is one) or to fit the observed data. In the second stage, this score function is used to evaluate the newly observed data to provide the final decision. The decision is made by thresholding the score. We construct the score using a highly versatile and completely adaptive nested decision tree. Nested soft decision trees are used to partition the observation space in a hierarchical manner. We adaptively optimize every component of the tree, i.e., decision regions and probabilistic models at each node as well as the overall structure, based on the sequential performance. This extensive in-time adaptation provides strong modeling capabilities; however, it may cause overfitting. To mitigate the overfitting issues, we first use the intermediate nodes of the tree to produce several subtrees, which constitute all the models from the coarsest to the full extent, and then adaptively combine them. Using a real-life dataset, we show that our algorithm significantly outperforms the state of the art. © 1994-2012 IEEE
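
    The two-stage structure (a density-like score, then thresholding, with parameters updated after every observation) can be sketched as follows. This is a deliberately simplified stand-in: it replaces the nested soft decision tree with a single running Gaussian, and the class name `OnlineGaussianDetector`, the threshold, and the warm-up length are assumptions made for illustration only.

```python
import math


class OnlineGaussianDetector:
    """Sequential score-then-threshold detector using one running Gaussian."""

    def __init__(self, threshold=-6.0, warmup=10):
        self.threshold = threshold  # log-density below this is flagged
        self.warmup = warmup        # observations to absorb before judging
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0               # running sum of squared deviations

    def score(self, x):
        # Stage 1: Gaussian log-density under the current running estimates.
        var = max(self.m2 / (self.n - 1), 1e-12)
        return -0.5 * math.log(2 * math.pi * var) - (x - self.mean) ** 2 / (2 * var)

    def update(self, x):
        # Welford's online update of the mean and the squared deviations.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def observe(self, x):
        # Stage 2: threshold the score, then adapt the parameters.
        is_anomaly = self.n >= self.warmup and self.score(x) < self.threshold
        self.update(x)
        return is_anomaly


detector = OnlineGaussianDetector()
stream = [-1.0, -0.5, 0.0, 0.5, 1.0] * 10 + [10.0]
print([t for t, value in enumerate(stream) if detector.observe(value)])
```

    A single Gaussian cannot model multimodal nominal data, which is the kind of limitation a hierarchical partition of the observation space is meant to address.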

    Concern for information privacy: a cross-nation study of the United Kingdom and South Africa

    Individuals have differing levels of information privacy concern, formed by their expectations and the confidence they have that organisations meet these expectations in practice. Variance in privacy laws and national factors may also play a role. This study analyses individuals’ information privacy expectations and confidence across two nations, the United Kingdom and South Africa, through a survey of 1463 respondents. The findings indicate that expectations for privacy in both countries are very high. However, numerous significant differences exist between expectations and confidence when examining privacy principles. The overall results for both countries show that there is a gap between the privacy expectations of respondents and the confidence they have that organisations are meeting those expectations. Governments, regulators, and organisations with an online presence need to consider individuals’ expectations and ensure that controls that meet regulatory requirements, as well as expectations, are in place.

    Privacy Protection in Tourism: Where We Are and Where We Should Be Heading For

    The link between information privacy concerns and privacy behaviours has been a focus of extensive investigation in various disciplines. However, little attention has been devoted to this issue in the tourism literature. Spurred by technological development and shaped by tourism-related environments, emerging privacy issues call for comprehensive yet context-specific studies to ensure tourists are making beneficial privacy choices. This paper first presents a comprehensive review of state-of-the-art research on privacy concerns and behaviours. Then, it suggests a list of overarching research priorities, merging social and technical aspects of privacy protection approaches as they apply to tourism. The priorities include research to measure tourists’ privacy concerns, explore specific biases in tourists’ privacy decisions, experiment with privacy nudges, and explore how to integrate privacy nudges in system design. Thus, this paper contributes to guiding the direction of future research on privacy protection in tourism.

    Online text classification for real life tweet analysis [Gerçek Hayat Tweet Analizi için Çevrimiçi Metin Siniflandirmasi]

    In this paper, we study multi-class classification of tweets, where we introduce highly efficient dimensionality reduction techniques suitable for online processing of high-dimensional feature vectors generated from freely worded text. For the real-life case study, we work on tweets in the Turkish language; however, our methods are generic and can be used for other languages, as explained in the paper. Since we work on a real-life application and the tweets are freely worded, we introduce text correction, normalization, and root-finding algorithms. Although text processing and classification are highly important due to many applications such as emotion recognition and advertisement selection, online classification and regression algorithms over text are limited by the need for high-dimensional vectors to represent natural text inputs. We overcome such limitations by showing that randomized projections and piecewise linear models can be efficiently leveraged to significantly reduce the computational cost of feature vector extraction from the tweets. Hence, we can perform multi-class tweet classification and regression in real time. We demonstrate our results over tweets collected from a real-life case study where the tweets are freely worded (e.g., with emoticons, shortened words, and special characters) and unstructured. We implement several well-known machine learning algorithms as well as novel regression methods and demonstrate that we can significantly reduce the computational complexity with an insignificant change in classification and regression performance. © 2016 IEEE
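
    The randomized-projection idea mentioned in this abstract can be sketched as below. This is only an assumed, generic construction (a random ±1 projection of bag-of-words counts, with each word's projection column generated on demand from a seed); the function name `project_text`, the whitespace tokenization, and the output dimension are hypothetical, and the sketch does not cover the paper's text normalization or piecewise linear models.

```python
import random
from collections import Counter


def project_text(text, dim=16, seed=0):
    """Map free text to a fixed-length vector via a random +/-1 projection.

    Each distinct word owns one column of an implicit random matrix. The
    column is regenerated from a per-word seed, so the full
    vocabulary-by-dim matrix is never stored.
    """
    counts = Counter(text.lower().split())  # naive whitespace tokenization
    vec = [0.0] * dim
    for word, count in counts.items():
        rng = random.Random(f"{seed}:{word}")  # deterministic per word
        for j in range(dim):
            vec[j] += count * rng.choice((-1.0, 1.0))
    scale = dim ** -0.5  # keeps expected squared norms comparable
    return [v * scale for v in vec]


features = project_text("bu film çok iyi :)")
print(len(features))
```

    Because the output length is fixed regardless of vocabulary size, each new tweet can be mapped to features in one pass, which suits online processing.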