Viterbi Training for PCFGs: Hardness Results and Competitiveness of Uniform Initialization
We consider the search for a maximum likelihood assignment of hidden derivations and grammar weights for a probabilistic context-free grammar, the problem approximately solved by “Viterbi training.” We show that solving and even approximating Viterbi training for PCFGs is NP-hard. We motivate the use of uniform-at-random initialization for Viterbi EM as an optimal initializer in the absence of further information about the correct model parameters, providing an approximate bound on the log-likelihood.
Empirical Risk Minimization for Probabilistic Grammars: Sample Complexity and Hardness of Learning
Probabilistic grammars are generative statistical models that are useful for compositional and sequential structures. They are used ubiquitously in computational linguistics. We present a framework, reminiscent of structural risk minimization, for empirical risk minimization of probabilistic grammars using the log-loss. We derive sample complexity bounds in this framework that apply both to the supervised setting and the unsupervised setting. By making assumptions about the underlying distribution that are appropriate for natural language scenarios, we are able to derive distribution-dependent sample complexity bounds for probabilistic grammars. We also give simple algorithms for carrying out empirical risk minimization using this framework in both the supervised and unsupervised settings. In the unsupervised case, we show that the problem of minimizing empirical risk is NP-hard. We therefore suggest an approximate algorithm, similar to expectation-maximization, to minimize the empirical risk. Learning from data is central to contemporary computational linguistics. It is common in such learning to estimate a model in a parametric family using the maximum likelihood principle. This principle applies in the supervised case (i.e., using annotate
Empirical Risk Minimization with Approximations of Probabilistic Grammars
Probabilistic grammars are generative statistical models that are useful for compositional and sequential structures. We present a framework, reminiscent of structural risk minimization, for empirical risk minimization of the parameters of a fixed probabilistic grammar using the log-loss. We derive sample complexity bounds in this framework that apply both to the supervised setting and the unsupervised setting.
Joint Morphological and Syntactic Disambiguation
In morphologically rich languages, should morphological and syntactic disambiguation be treated sequentially or as a single problem? We describe several efficient, probabilistically interpretable ways to apply joint inference to morphological and syntactic disambiguation using lattice parsing. Joint inference is shown to compare favorably to pipeline parsing methods across a variety of component models. State-of-the-art performance on Hebrew Treebank parsing is demonstrated using the new method. The benefits of joint inference are modest with the current component models, but appear to increase as components themselves improve
Progress in Electroweak Baryogenesis
Recent work on generating the excess of matter over antimatter in the early
universe during the electroweak phase transition is reviewed. Comment: 50 pages (figures on request), uses harvmac (table of contents
correct for "l" format), UCSD-93-2,BU-HEP-93-
First Results of the 74 MHz VLA-Pie Town Link. Hercules A at Low Frequencies
We present the results of the first successful observations of the Pie Town
link with the Very Large Array (VLA) at 74 MHz on Hercules A. The improvement
in resolution from 25 arcsec to 10 arcsec resolves the helical- and ring-like
features seen at higher frequencies. We also present new high dynamic range
images of this powerful radio galaxy at 325 MHz. Our low frequency observations
confirm the multiple outburst interpretation of the spectral index differences
at high frequencies. Comparison between our radio and ROSAT X-ray data does not
reveal any association between the X-ray emission from the cluster and the
radio lobes. There are no extra regions of radio emission at 74 MHz. Comment: 9 pages, 7 figures, accepted for publication in MNRA
A Poset Connected to Artin Monoids of Simply Laced Type
Let W be a Weyl group whose type is a simply laced Dynkin diagram. On several
W-orbits of sets of mutually commuting reflections, a poset is described which
plays a role in linear representations of the corresponding Artin group A. The
poset generalizes many properties of the usual order on positive roots of W
given by height. In this paper, a linear representation of the positive monoid
of A is defined by use of the poset
BMW algebras of simply laced type
It is known that the recently discovered representations of the Artin groups
of type A_n, the braid groups, can be constructed via BMW algebras. We
introduce similar algebras of type D_n and E_n which also lead to the newly
found faithful representations of the Artin groups of the corresponding types.
We establish finite dimensionality of these algebras. Moreover, they have
ideals I_1 and I_2 with I_2 contained in I_1 such that the quotient with
respect to I_1 is the Hecke algebra and I_1/I_2 is a module for the
corresponding Artin group generalizing the Lawrence-Krammer representation.
Finally we give conjectures on the structure, the dimension and parabolic
subalgebras of the BMW algebra, as well as on a generalization of deformations
to Brauer algebras for simply laced spherical type other than A_n. Comment: 39 page
Solar wind radiation damage effects in lunar material
The research on solar wind radiation damage and other effects in lunar samples which was conducted to understand the optical properties of lunar materials is reported. Papers presented include: solar radiation effects in lunar samples, albedo of the moon, radiation effects in lunar crystalline rocks, valence states of 3rd transition elements in Apollo 11 and 12 rocks, and trace ferric iron in lunar and meteoritic titanaugites