Stochastic Attribute-Value Grammars
Probabilistic analogues of regular and context-free grammars are well known in
computational linguistics and are currently the subject of intensive research.
To date, however, no satisfactory probabilistic analogue of attribute-value
grammars has been proposed: previous attempts have failed to define a correct
parameter-estimation algorithm.
In the present paper, I define stochastic attribute-value grammars and give a
correct algorithm for estimating their parameters. The estimation algorithm is
adapted from Della Pietra, Della Pietra, and Lafferty (1995). To estimate model
parameters, it is necessary to compute the expectations of certain functions
under random fields. In the application discussed by Della Pietra, Della
Pietra, and Lafferty (representing English orthographic constraints), Gibbs
sampling can be used to estimate the needed expectations. The fact that
attribute-value grammars generate constrained languages makes Gibbs sampling
inapplicable, but I show how a variant of Gibbs sampling, the
Metropolis-Hastings algorithm, can be used instead.
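As a concrete illustration of the sampling step this abstract describes, the
minimal Python sketch below estimates feature expectations E[f_i] under an
unnormalized random-field model p(x) ∝ exp(Σ_i λ_i f_i(x)) with the
Metropolis-Hastings algorithm. The binary state space, indicator features, and
bit-flip proposal are illustrative assumptions, not the grammar representation
used in the paper.

```python
import math
import random

# Sketch: Metropolis-Hastings estimation of feature expectations E_p[f_i]
# under an unnormalized random field p(x) ∝ exp(sum_i lambda_i * f_i(x)).
# State space and features below are toy assumptions, not the paper's.

def log_weight(x, features, lambdas):
    """Unnormalized log-probability of state x under the random field."""
    return sum(lam * f(x) for f, lam in zip(features, lambdas))

def metropolis_hastings(init, propose, features, lambdas, n_samples, burn_in=1000):
    """Estimate E[f_i] for each feature via a symmetric-proposal MH chain."""
    x = init
    logw = log_weight(x, features, lambdas)
    sums = [0.0] * len(features)
    for step in range(burn_in + n_samples):
        x_new = propose(x)                      # symmetric proposal: q(x'|x) = q(x|x')
        logw_new = log_weight(x_new, features, lambdas)
        # Accept with probability min(1, p(x')/p(x)).
        if logw_new >= logw or random.random() < math.exp(logw_new - logw):
            x, logw = x_new, logw_new
        if step >= burn_in:
            for i, f in enumerate(features):
                sums[i] += f(x)
    return [s / n_samples for s in sums]

if __name__ == "__main__":
    # Toy usage: binary strings of length 8, one indicator feature per bit.
    features = [lambda x, i=i: float(x[i]) for i in range(8)]
    lambdas = [0.5] * 8

    def propose(x):
        i = random.randrange(len(x))
        return x[:i] + (1 - x[i],) + x[i + 1:]  # flip one bit

    print(metropolis_hastings(tuple([0] * 8), propose, features, lambdas, n_samples=20000))
```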
Criticality in Formal Languages and Statistical Physics
We show that the mutual information between two symbols, as a function of the
number of symbols between the two, decays exponentially in any probabilistic
regular grammar, but can decay like a power law for a context-free grammar.
This result about formal languages is closely related to the well-known result
in classical statistical mechanics that there are no phase transitions in
fewer than two dimensions. It is also related to the emergence of power-law
correlations in turbulence and cosmological inflation through recursive
generative processes. We elucidate these physics connections and comment on
potential applications of our results to machine learning tasks like training
artificial recurrent neural networks. Along the way, we introduce a useful
quantity which we dub the rational mutual information and discuss
generalizations of our claims involving more complicated Bayesian networks.Comment: Replaced to match final published version. Discussion improved,
references adde
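To make the exponential-decay claim concrete, the sketch below computes the
mutual information I(X_0; X_d) exactly for a two-state Markov chain, the
simplest analogue of a probabilistic regular grammar. The transition matrix is
an illustrative assumption, not taken from the paper.

```python
import numpy as np

# Sketch: I(X_0; X_d) for a stationary two-state Markov chain, which decays
# exponentially in d, in line with the regular-grammar result above.
# The transition matrix is an assumed toy example.

T = np.array([[0.9, 0.1],
              [0.2, 0.8]])                 # row-stochastic transition matrix

# Stationary distribution pi: left eigenvector of T with eigenvalue 1.
evals, evecs = np.linalg.eig(T.T)
pi = np.real(evecs[:, np.argmax(np.real(evals))])
pi = pi / pi.sum()

def mutual_information(d):
    """I(X_0; X_d) in nats for the stationary chain."""
    joint = pi[:, None] * np.linalg.matrix_power(T, d)   # P(x0, xd)
    marg = joint.sum(axis=0)                             # P(xd), equals pi
    prod = pi[:, None] * marg[None, :]                   # product of marginals
    mask = joint > 0
    return np.sum(joint[mask] * np.log(joint[mask] / prod[mask]))

for d in (1, 2, 4, 8, 16):
    print(d, mutual_information(d))   # decays roughly like |lambda_2|**(2*d)
```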
Realized volatility and absolute return volatility: a comparison indicating market risk
Measuring volatility in financial markets is a primary challenge in the theory and practice of risk management and is essential when developing investment strategies. Although the vast literature on the topic describes many different models, two nonparametric measurements have emerged and received wide use over the past decade: realized volatility and absolute return volatility. The former is strongly favored in the financial sector and the latter by econophysicists. We examine the memory and clustering features of these two methods and find that both enable strong predictions. We compare the two in detail and find that although realized volatility better captures short-term dynamics and thus allows predictions of near-future market behavior, absolute return volatility is easier to calculate and, as a risk indicator, has approximately the same sensitivity as realized volatility. Our detailed empirical analysis yields valuable guidelines for both researchers and market participants because it provides a significantly clearer comparison of the strengths and weaknesses of the two methods.
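For readers comparing the two estimators, here is a minimal sketch of both
measures on synthetic data, following their standard definitions: realized
volatility as the square root of the summed squared intraday returns, and
absolute return volatility as the absolute daily return. The price series and
sampling frequency are assumptions for illustration only.

```python
import numpy as np

# Sketch: realized volatility vs. absolute return volatility on synthetic
# log-prices. Data and bar frequency are illustrative assumptions.

rng = np.random.default_rng(0)

n_days, ticks_per_day = 250, 78           # e.g. 5-minute bars in a 6.5h session
log_prices = np.cumsum(rng.normal(0, 0.001, n_days * ticks_per_day))
log_prices = log_prices.reshape(n_days, ticks_per_day)

# Realized volatility: root of the sum of squared intraday returns per day.
intraday_returns = np.diff(log_prices, axis=1)
realized_vol = np.sqrt(np.sum(intraday_returns ** 2, axis=1))

# Absolute return volatility: absolute close-to-close daily return.
daily_close = log_prices[:, -1]
daily_returns = np.diff(daily_close, prepend=daily_close[0])
absolute_vol = np.abs(daily_returns)

print("mean RV :", realized_vol.mean())
print("mean |r|:", absolute_vol.mean())
print("corr    :", np.corrcoef(realized_vol[1:], absolute_vol[1:])[0, 1])
```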
A high-reproducibility and high-accuracy method for automated topic classification
Much of human knowledge sits in large databases of unstructured text.
Leveraging this knowledge requires algorithms that extract and record metadata
on unstructured text documents. Assigning topics to documents will enable
intelligent search, statistical characterization, and meaningful
classification. Latent Dirichlet allocation (LDA) is the state-of-the-art in
topic classification. Here, we perform a systematic theoretical and numerical
analysis that demonstrates that current optimization techniques for LDA often
fail to infer the most suitable model parameters accurately. Adapting
approaches for community detection in networks, we propose a new algorithm that
displays high reproducibility and high accuracy, and also has high
computational efficiency. We apply it to a large set of documents in the
English Wikipedia and reveal its hierarchical structure. Our algorithm promises
to make "big data" text analysis systems more reliable.
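The sketch below is not the authors' algorithm (which is specified in the
paper); it only illustrates, on assumed toy data, the general idea of casting
topic inference as community detection on a word-document network, here using
a generic modularity-based method from networkx.

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

# Sketch: topics as communities in a bipartite word-document network.
# Toy corpus and the modularity method are illustrative assumptions.

docs = {
    "d1": "stars galaxy telescope galaxy".split(),
    "d2": "telescope stars orbit".split(),
    "d3": "market volatility risk".split(),
    "d4": "risk market returns volatility".split(),
}

# Bipartite word-document graph; edge weight = term count in the document.
G = nx.Graph()
for doc, words in docs.items():
    for w in set(words):
        G.add_edge(doc, w, weight=words.count(w))

# Communities mixing documents and words play the role of topics.
for i, community in enumerate(greedy_modularity_communities(G, weight="weight")):
    topic_docs = sorted(n for n in community if n in docs)
    topic_words = sorted(n for n in community if n not in docs)
    print(f"topic {i}: docs={topic_docs} words={topic_words}")
```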
Deconvolution with correct sampling
A new method for improving the resolution of astronomical images is
presented. It is based on the principle that sampled data cannot be fully
deconvolved without violating the sampling theorem. Thus, the sampled image
should not be deconvolved by the total Point Spread Function, but by a narrower
function chosen so that the resolution of the deconvolved image is compatible
with the adopted sampling. Our deconvolution method gives results which are, in
at least some cases, superior to those of other commonly used techniques: in
particular, it does not produce ringing around point sources superimposed on a
smooth background. Moreover, it makes it possible to perform accurate astrometry and
photometry of crowded fields. These improvements are a consequence of both the
correct treatment of sampling and the recognition that the most probable
astronomical image is not a flat one. The method is also well adapted to the
optimal combination of different images of the same object, as can be obtained,
e.g., from infrared observations or via adaptive optics techniques.
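The following sketch illustrates the core principle under simplifying
assumptions: rather than dividing out the total PSF t, the image is
deconvolved only down to a chosen narrower target PSF r that the pixel grid
can still represent, i.e. the partial kernel s with t = r * s is removed in
Fourier space. The Gaussian PSFs and the Wiener-style damping are assumptions;
the paper's actual method is an iterative fit, not this one-step filter.

```python
import numpy as np

# Sketch: partial deconvolution to a narrower, still well-sampled target PSF,
# instead of full deconvolution to a delta function. PSF shapes and the
# damping constant are illustrative assumptions.

def gaussian_psf(size, sigma):
    y, x = np.mgrid[:size, :size] - size // 2
    g = np.exp(-(x**2 + y**2) / (2 * sigma**2))
    return g / g.sum()

size = 64
t = gaussian_psf(size, sigma=3.0)        # total instrumental PSF
r = gaussian_psf(size, sigma=1.5)        # narrower target PSF, well sampled

# Build the observation: a point source on a smooth background, blurred by t.
truth = np.zeros((size, size))
truth[32, 32] = 100.0
truth += 5.0
observed = np.real(np.fft.ifft2(np.fft.fft2(truth) * np.fft.fft2(np.fft.ifftshift(t))))

# Partial deconvolution: divide out S = T/R (with assumed Wiener-style
# damping), leaving the image at the resolution of r, not a delta function.
T = np.fft.fft2(np.fft.ifftshift(t))
R = np.fft.fft2(np.fft.ifftshift(r))
S = T / R
eps = 1e-3                               # damping against noise amplification
deconvolved = np.real(np.fft.ifft2(np.fft.fft2(observed) * np.conj(S) / (np.abs(S)**2 + eps)))

print("peak before:", observed.max(), " peak after:", deconvolved.max())
```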