8,346 research outputs found

    Survey Expectations

    This paper focuses on survey expectations and discusses their uses for testing and modeling of expectations. Alternative models of expectations formation are reviewed and the importance of allowing for heterogeneity of expectations is emphasized. A weak form of the rational expectations hypothesis, which focuses on average expectations rather than individual expectations, is advanced. Other models of expectations formation, such as the adaptive expectations hypothesis, are briefly discussed. Testable implications of rational and extrapolative models of expectations are reviewed and the importance of the loss function for the interpretation of the test results is discussed. The paper then provides an account of the various surveys of expectations, reviews alternative methods of quantifying the qualitative surveys, and discusses the use of aggregate and individual survey responses in the analysis of expectations and for forecasting.

    Topic-Specific Sentiment Analysis Can Help Identify Political Ideology

    Ideological leanings of an individual can often be gauged by the sentiment one expresses about different issues. We propose a simple framework that represents a political ideology as a distribution of sentiment polarities towards a set of topics. This representation can then be used to detect ideological leanings of documents (speeches, news articles, etc.) based on the sentiments expressed towards different topics. Experiments performed using a widely used dataset show the promise of our proposed approach, which achieves comparable performance to other methods despite being much simpler and more interpretable. Comment: Presented at EMNLP Workshop on Computational Approaches to Subjectivity, Sentiment & Social Media Analysis, 201

    Online Optimization Methods for the Quantification Problem

    The estimation of class prevalence, i.e., the fraction of a population that belongs to a certain class, is a very useful tool in data analytics and learning, and finds applications in many domains such as sentiment analysis, epidemiology, etc. For example, in sentiment analysis, the objective is often not to estimate whether a specific text conveys a positive or a negative sentiment, but rather to estimate the overall distribution of positive and negative sentiments during an event window. A popular way of performing the above task, often dubbed quantification, is to use supervised learning to train a prevalence estimator from labeled data. Contemporary literature cites several performance measures used to measure the success of such prevalence estimators. In this paper we propose the first online stochastic algorithms for directly optimizing these quantification-specific performance measures. We also provide algorithms that optimize hybrid performance measures that seek to balance quantification and classification performance. Our algorithms present a significant advancement in the theory of multivariate optimization and we show, by a rigorous theoretical analysis, that they exhibit optimal convergence. We also report extensive experiments on benchmark and real data sets which demonstrate that our methods significantly outperform existing optimization techniques used for these performance measures. Comment: 26 pages, 6 figures. A short version of this manuscript will appear in the proceedings of the 22nd ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 201
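
    To make the notion of quantification concrete, the following is a minimal sketch of two textbook baselines: "classify and count" and its adjusted variant that corrects for classifier error rates. It is not the online algorithm proposed in the paper. The function names, the example predictions, and the `tpr`/`fpr` values are all hypothetical and chosen only for illustration.

```python
def classify_and_count(predictions):
    """Estimate positive-class prevalence as the fraction of items
    that a classifier labelled positive (labels are 0 or 1)."""
    if not predictions:
        raise ValueError("need at least one prediction")
    return sum(predictions) / len(predictions)


def adjusted_count(predictions, tpr, fpr):
    """Correct the raw count using the classifier's true positive rate
    and false positive rate (assumed to be estimated on held-out data).

    The raw count satisfies raw = tpr * p + fpr * (1 - p), so solving
    for the true prevalence p gives p = (raw - fpr) / (tpr - fpr).
    """
    raw = classify_and_count(predictions)
    if tpr == fpr:
        # Classifier is uninformative; the correction is undefined.
        return raw
    estimate = (raw - fpr) / (tpr - fpr)
    # Clip to the valid probability range.
    return min(1.0, max(0.0, estimate))


def absolute_error(true_prevalence, estimated_prevalence):
    """A simple quantification-specific error measure."""
    return abs(true_prevalence - estimated_prevalence)


# Hypothetical example: 3 of 10 texts predicted positive.
preds = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
raw_estimate = classify_and_count(preds)
corrected = adjusted_count(preds, tpr=0.8, fpr=0.1)
print(raw_estimate, corrected)
```

    The point of the sketch is that the target is a single population-level number rather than per-item labels, which is why performance measures for quantification differ from ordinary classification accuracy.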

    Disparity between the Programmatic Views and the User Perceptions of Mobile Apps

    User perception in any mobile-app ecosystem is represented as user ratings of apps. Unfortunately, the user ratings are often biased and do not reflect the actual usability of an app. To address the challenges associated with selection and ranking of apps, we need to use a comprehensive and holistic view of the behavior of an app. In this paper, we present and evaluate the Trust based Rating and Ranking (TRR) approach. It relies solely on an app's internal view that uses programmatic artifacts. We compute a trust tuple (Belief, Disbelief, Uncertainty - B, D, U) for each app based on the internal view and use it to rank-order apps offering similar functionality. Apps used for empirically evaluating the TRR approach are collected from the Google Play Store. Our experiments compare the TRR ranking with the user review-based ranking present in the Google Play Store. Although there are disparities between the two rankings, a slightly deeper investigation indicates an underlying similarity between the two alternatives.

    Information measure for financial time series: quantifying short-term market heterogeneity

    A well-interpretable measure of information has been recently proposed based on a partition obtained by intersecting a random sequence with its moving average. The partition yields disjoint sets of the sequence, which are then ranked according to their size to form a probability distribution function and finally fed into the expression of the Shannon entropy. In this work, such an entropy measure is implemented on the time series of prices and volatilities of six financial markets. The analysis has been performed on tick-by-tick data sampled every minute for six years of data from 1999 to 2004, for a broad range of moving average windows and volatility horizons. The study shows that the entropy of the volatility series depends on the individual market, while the entropy of the price series is practically a market-invariant for the six markets. Finally, a cumulative information measure, the 'Market Heterogeneity Index', is derived from the integral of the proposed entropy measure. The values of the Market Heterogeneity Index are discussed as possible tools for optimal portfolio construction and compared with those obtained by using the Sharpe ratio, a traditional risk diversity measure.
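
    The construction described in this abstract can be sketched roughly as follows. This is only one plausible reading: it assumes a trailing moving average, takes a "cluster" to be a run of consecutive points on the same side of that average, and uses the run length as the cluster size. The paper's exact definitions may differ, and all function names here are hypothetical.

```python
import math
from collections import Counter


def moving_average(series, window):
    """Trailing moving average, defined from index window - 1 onward."""
    if window < 1 or len(series) < window:
        raise ValueError("series must contain at least `window` points")
    running = sum(series[:window])
    averages = [running / window]
    for i in range(window, len(series)):
        running += series[i] - series[i - window]
        averages.append(running / window)
    return averages


def cluster_lengths(series, window):
    """Lengths of the runs between consecutive crossings of the
    series and its moving average."""
    averages = moving_average(series, window)
    aligned = series[window - 1:]
    above = [x >= m for x, m in zip(aligned, averages)]
    lengths = []
    run = 1
    for previous, current in zip(above, above[1:]):
        if current == previous:
            run += 1
        else:
            lengths.append(run)  # a crossing closes the current cluster
            run = 1
    lengths.append(run)
    return lengths


def cluster_entropy(series, window):
    """Shannon entropy (in nats) of the empirical distribution of
    cluster lengths."""
    lengths = cluster_lengths(series, window)
    counts = Counter(lengths)
    total = len(lengths)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


# Hypothetical toy series, not market data.
toy = [0, 1, 0, -1]
print(cluster_lengths(toy, window=2), cluster_entropy(toy, window=2))
```

    A perfectly regular series, in which every cluster has the same length, gives zero entropy under this reading, while a series whose clusters vary widely in length gives a larger value.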

    A Framework for Twitter Events Detection, Differentiation and its Application for Retail Brands

    We propose a framework for Twitter events detection, differentiation and quantification of their significance for predicting spikes in sales. In previous approaches, the differentiation between Twitter events has mainly been done based on spatial, temporal or topic information. We suggest a novel approach that performs clustering of Twitter events based on their shapes (taking into account growth and relaxation signatures). Our study provides empirical evidence that, through events differentiation based on their shape, one can clearly identify clusters of Twitter events that contain more information about future sales than the non-clustered Twitter signal. We also propose a method for automatic identification of the optimum event window, solving the task of window selection, which is a common problem in the event study field. The framework described in this paper was tested on a large-scale dataset of 150 million Tweets and sales data of 75 brands, and can be applied to the analysis of time series from other domains.

    The Extreme Risk of Personal Data Breaches & The Erosion of Privacy

    Personal data breaches from organisations, enabling mass identity fraud, constitute an extreme risk. This risk worsens daily as an ever-growing amount of personal data are stored by organisations and on-line, and the attack surface surrounding this data becomes larger and harder to secure. Further, breached information is distributed and accumulates in the hands of cyber criminals, thus driving a cumulative erosion of privacy. Statistical modeling of breach data from 2000 through 2015 provides insights into this risk: A current maximum breach size of about 200 million is detected, and is expected to grow by fifty percent over the next five years. The breach sizes are found to be well modeled by an extremely heavy tailed truncated Pareto distribution, with tail exponent parameter decreasing linearly from 0.57 in 2007 to 0.37 in 2015. With this current model, given a breach contains above fifty thousand items, there is a ten percent probability of exceeding ten million. A size effect is unearthed where both the frequency and severity of breaches scale with organisation size like $s^{0.6}$. Projections indicate that the total amount of breached information is expected to double from two to four billion items within the next five years, eclipsing the population of users of the Internet. This massive and uncontrolled dissemination of personal identities raises fundamental concerns about privacy. Comment: 16 pages, 3 sets of figures, and 4 tables
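
    The ten percent figure quoted in this abstract can be checked approximately with the standard exceedance formula for an upper-truncated Pareto distribution. The sketch below assumes a lower threshold of fifty thousand items, an upper truncation point equal to the quoted maximum breach size of about 200 million, and the 2015 tail exponent of 0.37. Whether the authors used exactly these parameter choices is an assumption, and the function name is hypothetical.

```python
def truncated_pareto_exceedance(x, lower, upper, alpha):
    """P(X > x | X > lower) for a Pareto tail with exponent `alpha`
    truncated from above at `upper`.

    Untruncated, the survival function is (lower / x) ** alpha; the
    truncation removes the mass beyond `upper` and renormalises.
    """
    if not 0 < lower < upper:
        raise ValueError("require 0 < lower < upper")
    if x <= lower:
        return 1.0
    if x >= upper:
        return 0.0
    tail_at_x = (lower / x) ** alpha
    tail_at_upper = (lower / upper) ** alpha
    return (tail_at_x - tail_at_upper) / (1.0 - tail_at_upper)


# Parameter values taken from the abstract; their roles are assumed.
probability = truncated_pareto_exceedance(
    x=1e7,       # ten million items
    lower=5e4,   # fifty thousand items
    upper=2e8,   # maximum breach size of about 200 million
    alpha=0.37,  # tail exponent quoted for 2015
)
print(f"P(size > 10 million | size > 50 thousand) = {probability:.3f}")
```

    Under these assumptions the result comes out close to one in ten, in line with the stated figure. A tail exponent below 1 means that the untruncated distribution would have no finite mean, so under this model the upper truncation point largely determines the expected breach size.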