Hybrid Statistical Estimation of Mutual Information for Quantifying Information Flow
Analysis of a probabilistic system often requires learning the joint probability distribution of its random variables. The computation of the exact distribution is usually an exhaustive precise analysis on all executions of the system. To avoid the high computational cost of such an exhaustive search, statistical analysis has been studied to efficiently obtain approximate estimates by analyzing only a small but representative subset of the system's behavior. In this paper we propose a hybrid statistical estimation method that combines precise and statistical analyses to estimate mutual information and its confidence interval. We show how to combine the analyses on different components of the system with different precision to obtain an estimate for the whole system. The new method performs weighted statistical analysis with different sample sizes over different components and dynamically finds their optimal sample sizes. Moreover, it can reduce sample sizes by using prior knowledge about systems and a new abstraction-then-sampling technique based on qualitative analysis. We show that the new method outperforms the state of the art in quantifying information leakage.
High quality topic extraction from business news explains abnormal financial market volatility
Understanding the mutual relationships between information flows and social
activity in society today is one of the cornerstones of the social sciences. In
financial economics, the key issue in this regard is understanding and
quantifying how news of all possible types (geopolitical, environmental,
social, financial, economic, etc.) affect trading and the pricing of firms in
organized stock markets. In this article, we seek to address this issue by
performing an analysis of more than 24 million news records provided by
Thomson Reuters and of their relationship with trading activity for 206 major
stocks in the S&P US stock index. We show that the whole landscape of news that
affect stock price movements can be automatically summarized via simple
regularized regressions between trading activity and news information pieces
decomposed, with the help of simple topic modeling techniques, into their
"thematic" features. Using these methods, we are able to estimate and quantify
the impacts of news on trading. We introduce network-based visualization
techniques to represent the whole landscape of news information associated with
a basket of stocks. The examination of the words that are representative of the
topic distributions confirms that our method is able to extract the significant
pieces of information influencing the stock market. Our results show that one
of the most puzzling stylized facts in financial economies, namely that at
certain times trading volumes appear to be "abnormally large," can be partially
explained by the flow of news. In this sense, our results prove that there is
no "excess trading," when restricting to times when news are genuinely novel
and provide relevant financial information.
Comment: The previous version of this article included an error. This is a revised version.
Star power: the effect of Morningstar ratings on mutual fund flows
Morningstar, Inc., has been hailed in both academic and practitioner circles as having the most influential rating system in the mutual fund industry. We investigate Morningstar’s influence by estimating the value of a star in terms of the asset flow it generates for the typical fund. We use event-study methods on a sample of 3,388 domestic equity mutual funds from November 1996 to October 1999 to isolate the “Morningstar effect” from other influences on fund flow. We separately study initial rating events, whereby a fund is rated for the first time on its 36-month anniversary, and rating change events. An initial five-star rating results in an average six-month abnormal flow of $26 million, or 53 percent above normal expected flow. Following rating changes, we find economically and statistically significant abnormal flow in the expected direction, positive for rating upgrades and negative for rating downgrades. Furthermore, we observe an immediate flow response, suggesting that some investors vigilantly monitor this information and view the rating change as “new” information on fund quality. Overall, our results indicate that Morningstar ratings have unique power to affect asset flow.
Keywords: Mutual funds
On the Measurement of Privacy as an Attacker's Estimation Error
A wide variety of privacy metrics have been proposed in the literature to
evaluate the level of protection offered by privacy-enhancing technologies.
Most of these metrics are specific to concrete systems and adversarial models,
and are difficult to generalize or translate to other contexts. Furthermore, a
better understanding of the relationships between the different privacy metrics
is needed to enable a more grounded and systematic approach to measuring privacy,
as well as to assist systems designers in selecting the most appropriate metric
for a given application.
In this work we propose a theoretical framework for privacy-preserving
systems, endowed with a general definition of privacy in terms of the
estimation error incurred by an attacker who aims to disclose the private
information that the system is designed to conceal. We show that our framework
permits interpreting and comparing a number of well-known metrics under a
common perspective. The arguments behind these interpretations are based on
fundamental results related to the theories of information, probability and
Bayes decision.
Comment: This paper has 18 pages and 17 figures.