Speech Recognition by Composition of Weighted Finite Automata
We present a general framework based on weighted finite automata and weighted
finite-state transducers for describing and implementing speech recognizers.
The framework allows us to represent uniformly the information sources and data
structures used in recognition, including context-dependent units,
pronunciation dictionaries, language models and lattices. Furthermore, general
but efficient algorithms can be used for combining information sources in actual
recognizers and for optimizing their application. In particular, a single
composition algorithm is used both to combine in advance information sources
such as language models and dictionaries, and to combine acoustic observations
and information sources dynamically during recognition. Comment: 24 pages, uses psfig.st
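The abstract names composition as the single combining operation but does not spell it out. As a rough illustration only, the sketch below composes two epsilon-free weighted transducers in the tropical semiring (weights add along a path). The dictionary layout, the function name `compose`, and the toy machines are assumptions made for this example, not the authors' implementation, and epsilon transitions, which a real recognizer needs, are deliberately left out.

```python
from collections import deque


def compose(t1, t2):
    """Compose two epsilon-free weighted transducers (tropical semiring).

    Assumed layout for each transducer (hypothetical, for illustration):
      "start":  start state
      "finals": {state: final weight}
      "arcs":   {state: [(input, output, weight, next_state), ...]}
    """
    start = (t1["start"], t2["start"])
    arcs, finals = {}, {}
    queue, seen = deque([start]), {start}
    while queue:
        state = queue.popleft()
        q1, q2 = state
        # A pair state is final only if both component states are final.
        if q1 in t1["finals"] and q2 in t2["finals"]:
            finals[state] = t1["finals"][q1] + t2["finals"][q2]
        for i1, o1, w1, n1 in t1["arcs"].get(q1, []):
            for i2, o2, w2, n2 in t2["arcs"].get(q2, []):
                if o1 != i2:
                    continue  # output of t1 must match input of t2
                nxt = (n1, n2)
                arcs.setdefault(state, []).append((i1, o2, w1 + w2, nxt))
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return {"start": start, "finals": finals, "arcs": arcs}


# Toy machines: t1 rewrites "a" as "x" (cost 1.0), t2 rewrites "x" as "y" (cost 2.0).
t1 = {"start": 0, "finals": {1: 0.0}, "arcs": {0: [("a", "x", 1.0, 1)]}}
t2 = {"start": 0, "finals": {1: 0.5}, "arcs": {0: [("x", "y", 2.0, 1)]}}
composed = compose(t1, t2)
```

Only pair states reachable from the start pair are built, which is one reason the same routine can in principle be run ahead of time on static sources or lazily against incoming observations.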
Beyond Word N-Grams
We describe, analyze, and evaluate experimentally a new probabilistic model
for word-sequence prediction in natural language based on prediction suffix
trees (PSTs). By using efficient data structures, we extend the notion of PST
to unbounded vocabularies. We also show how to use a Bayesian approach based on
recursive priors over all possible PSTs to efficiently maintain tree mixtures.
These mixtures have provably and practically better performance than almost any
single model. We evaluate the model on several corpora. The low perplexity
achieved by relatively small PST mixture models suggests that they may be an
advantageous alternative, both theoretically and practically, to the widely
used n-gram models. Comment: 15 pages, one PostScript figure, uses psfig.sty and fullname.sty.
Revised version of a paper in the Proceedings of the Third Workshop on Very
Large Corpora, MIT, 199
Similarity-Based Models of Word Cooccurrence Probabilities
In many applications of natural language processing (NLP) it is necessary to
determine the likelihood of a given word combination. For example, a speech
recognizer may need to determine which of the two word combinations ``eat a
peach'' and ``eat a beach'' is more likely. Statistical NLP methods determine
the likelihood of a word combination from its frequency in a training corpus.
However, the nature of language is such that many word combinations are
infrequent and do not occur in any given corpus. In this work we propose a
method for estimating the probability of such previously unseen word
combinations using available information on ``most similar'' words.
We describe probabilistic word association models based on distributional
word similarity, and apply them to two tasks, language modeling and pseudo-word
disambiguation. In the language modeling task, a similarity-based model is used
to improve probability estimates for unseen bigrams in a back-off language
model. The similarity-based method yields a 20% perplexity improvement in the
prediction of unseen bigrams and statistically significant reductions in
speech-recognition error.
We also compare four similarity-based estimation methods against back-off and
maximum-likelihood estimation methods on a pseudo-word sense disambiguation
task in which we controlled for both unigram and bigram frequency to avoid
giving too much weight to easy-to-disambiguate high-frequency configurations.
The similarity-based methods perform up to 40% better on this particular task. Comment: 26 pages, 5 figure
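To make the idea of borrowing probability mass from "most similar" words concrete, here is a minimal sketch. It estimates P(w2 | w1) for an unseen bigram as a similarity-weighted average of P(w2 | w1') over the neighbours w1' of w1. The function name, the data layout, and all numbers are invented for illustration; the abstract does not specify the similarity measure or the weighting, so this should not be read as the paper's exact estimator.

```python
def similarity_estimate(w1, w2, cond_prob, similar):
    """Similarity-weighted estimate of P(w2 | w1).

    cond_prob: {(w1_prime, w2): P(w2 | w1_prime)} from seen bigrams (assumed layout)
    similar:   {w1: {w1_prime: similarity weight}} (assumed layout)
    """
    neighbours = similar.get(w1, {})
    norm = sum(neighbours.values())
    if norm == 0:
        return 0.0  # no similar words available, so nothing to borrow from
    weighted = sum(
        weight * cond_prob.get((neighbour, w2), 0.0)
        for neighbour, weight in neighbours.items()
    )
    return weighted / norm


# Toy values (made up): "devour peach" was never seen, but "eat" and "consume" are similar.
cond_prob = {("eat", "peach"): 0.02, ("consume", "peach"): 0.01}
similar = {"devour": {"eat": 0.75, "consume": 0.25}}
p_peach = similarity_estimate("devour", "peach", cond_prob, similar)
p_beach = similarity_estimate("devour", "beach", cond_prob, similar)
```

In a back-off language model, an estimate of this kind would replace the lower-order fallback only for bigrams that were not observed in training.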
Principles and Implementation of Deductive Parsing
We present a system for generating parsers based directly on the metaphor of
parsing as deduction. Parsing algorithms can be represented directly as
deduction systems, and a single deduction engine can interpret such deduction
systems so as to implement the corresponding parser. The method generalizes
easily to parsers for augmented phrase structure formalisms, such as
definite-clause grammars and other logic grammar formalisms, and has been used
for rapid prototyping of parsing algorithms for a variety of formalisms
including variants of tree-adjoining grammars, categorial grammars, and
lexicalized context-free grammars. Comment: 69 pages, includes full Prolog cod
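The paper's own code is in Prolog, per the comment above. Purely as an illustration of the "one engine, many deduction systems" idea, the following Python sketch pairs a generic agenda-driven engine with a CKY-style deduction system over items (category, i, j). The names `deduce` and `cky_recognize`, the rule encoding, and the toy grammar are assumptions made for this example.

```python
def deduce(axioms, infer, is_goal):
    """Generic agenda-driven engine: close the axioms under `infer`."""
    chart, agenda = set(), list(axioms)
    while agenda:
        item = agenda.pop()
        if item in chart:
            continue  # already derived, skip redundant work
        chart.add(item)
        agenda.extend(infer(item, chart))
    return any(is_goal(item) for item in chart)


def cky_recognize(words, lexicon, rules, start="S"):
    """CKY recognition as deduction; rules are (parent, left, right) triples."""
    axioms = [
        (cat, i, i + 1)
        for i, word in enumerate(words)
        for cat in lexicon.get(word, ())
    ]

    def infer(item, chart):
        cat, i, j = item
        new_items = []
        for other, k, l in chart:
            if l == i:  # `other` ends where `item` starts
                new_items += [(a, k, j) for a, b, c in rules if b == other and c == cat]
            if k == j:  # `other` starts where `item` ends
                new_items += [(a, i, l) for a, b, c in rules if b == cat and c == other]
        return new_items

    return deduce(axioms, infer, lambda item: item == (start, 0, len(words)))


# Toy grammar (hypothetical): S -> NP VP, VP -> V NP.
rules = [("S", "NP", "VP"), ("VP", "V", "NP")]
lexicon = {"she": ["NP"], "eats": ["V"], "peaches": ["NP"]}
accepted = cky_recognize(["she", "eats", "peaches"], lexicon, rules)
```

Swapping in a different `infer` function and item shape would give a different parsing algorithm while `deduce` stays unchanged, which is the separation the abstract describes.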
Detection of new eruptions in the Magellanic Clouds LBVs R 40 and R 110
We performed a spectroscopic and photometric analysis to study new eruptions
in two luminous blue variables (LBVs) in the Magellanic Clouds. We detected a
strong new eruption in the LBV R 40 that reached in 2016, which is
around mag brighter than the minimum registered in 1985. During this new
eruption, the star changed from an A-type to a late F-type spectrum. Based on
photometric and spectroscopic empirical calibrations and synthetic spectral
modeling, we determine that R 40 reached ~K
during this new eruption. This object is thereby probably one of the coolest
identified LBVs. We could also identify an enrichment of nitrogen and r- and
s-process elements. We detected a weak eruption in the LBV R 110 with a maximum
of mag in 2011, that is, around mag brighter than in the
quiescent phase. On the other hand, this new eruption is about mag
fainter than the first eruption detected in 1990, but the temperature did not
decrease below 8500 K. Spitzer spectra show indications of cool dust in the
circumstellar environment of both stars, but no hot or warm dust was present,
except for the probable presence of PAHs in R 110. We also discuss a possible
post-red supergiant nature for both stars.
Chemical analysis of giant stars in the young open cluster NGC 3114
Context: Open clusters are very useful targets for examining possible trends
in galactocentric distance and age, especially when young and old open clusters
are compared. Aims: We carried out a detailed spectroscopic analysis to derive
the chemical composition of seven red giants in the young open cluster NGC
3114. Abundances of C, N, O, Li, Na, Mg, Al, Ca, Si, Ti, Ni, Cr, Y, Zr, La, Ce,
and Nd were obtained, as well as the carbon isotopic ratio. Methods: The
atmospheric parameters of the studied stars and their chemical abundances were
determined using high-resolution optical spectroscopy. We employed the
local-thermodynamic-equilibrium model atmospheres of Kurucz and the spectral
analysis code MOOG. The abundances of the light elements were derived using the
spectral synthesis technique. Results: We found that NGC 3114 has a mean
metallicity of [Fe/H] = -0.01+/-0.03. The isochrone fit yielded a turn-off mass
of 4.2 Msun. The [N/C] ratio is in good agreement with the models predicted by
first dredge-up. We found that two stars, HD 87479 and HD 304864, have high
rotational velocities of 15.0 km/s and 11.0 km/s; HD 87526 is a halo star and
is not a member of NGC 3114. Conclusions: The carbon and nitrogen abundances in
NGC 3114 agree with the field and cluster giants. The oxygen abundance in NGC
3114 is lower compared to the field giants. The [O/Fe] ratio is similar to the
giants in young clusters. We detected sodium enrichment in the analyzed cluster
giants. As far as the other elements are concerned, their [X/Fe] ratios follow
the same trend seen in giants with the same metallicity. Comment: 17 pages, 9 figures, 10 tables; accepted for publication in A&
Retrofit Options for Increasing Energy Efficiency in Office Buildings - Methodology Review
Portuguese buildings accounted for 35% of primary energy consumption in 2006, with the non-residential sector
representing almost half of this figure overall and around 65% in the city of Lisbon. Expected to grow 5% yearly
in this period, the rehabilitation of non-residential buildings is a great opportunity for energy rehabilitation
across a stock of 800,000 buildings needing medium to high levels of intervention. For this task to succeed,
procedures urgently need an accurate technical framework that takes existing technologies and the best case
studies into account, in order to drive retrofitting with passive measures forward. This paper presents an
overview of the development of a methodology that aims to include the energy component in rehabilitation schemes
through an integrated and comprehensive analysis, reaching all those directly involved in the building process
(owners, consumers, public bodies, and the construction and project design industry) as well as important new
players such as ESCOs