Spoken content retrieval: A survey of techniques and technologies
Speech media, that is, digital audio and video containing spoken content, has blossomed in recent years. Large collections are accruing on the Internet as well as in private and enterprise settings. This growth has motivated extensive research on techniques and technologies that facilitate reliable indexing and retrieval. Spoken content retrieval (SCR) requires the combination of audio and speech processing technologies with methods from information retrieval (IR). SCR research initially investigated planned speech structured in document-like units, but has subsequently shifted focus to more informal spoken content produced spontaneously, outside of the studio and in conversational settings. This survey provides an overview of the field of SCR encompassing component technologies, the relationship of SCR to text IR and automatic speech recognition, and user interaction issues. It is aimed at researchers with backgrounds in speech technology or IR who are seeking deeper insight into how these fields are integrated to support research and development, thus addressing the core challenges of SCR.
On the Unification of Active Galactic Nuclei
The inevitable spread in properties of the toroidal obscuration of active
galactic nuclei (AGNs) invalidates the widespread notion that type 1 and 2 AGNs
are intrinsically the same objects, drawn randomly from the distribution of
torus covering factors. Instead, AGNs are drawn preferentially from this
distribution; type 2 are more likely drawn from the distribution's higher end,
type 1 from its lower end. Type 2 AGNs have a higher IR luminosity, lower
narrow-line luminosity and a higher fraction of Compton thick X-ray obscuration
than type 1. Meaningful studies of unification statistics cannot be conducted
without first determining the intrinsic distribution function of torus covering
factors.
Comment: ApJ Letters, to be published. This is the final, journal version;
minor editing revisions from original on
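The selection effect described in this abstract can be illustrated with a small simulation. The sketch below is not taken from the paper. It assumes a hypothetical Beta(2, 2) intrinsic distribution of covering factors, and it assumes that a source with covering factor f is seen as type 2 with probability f (the fraction of sky its torus blocks). Under those assumptions the observed type 2 sample is skewed toward high covering factors and the type 1 sample toward low ones.

```python
import random


def simulate_agn_types(n_sources=100_000, alpha=2.0, beta=2.0, seed=0):
    """Draw covering factors and classify each source by a random line of sight.

    alpha and beta parameterize a hypothetical Beta distribution of covering
    factors; they are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    type1, type2 = [], []
    for _ in range(n_sources):
        f = rng.betavariate(alpha, beta)  # intrinsic torus covering factor
        # Assumption: a random line of sight is obscured with probability f.
        if rng.random() < f:
            type2.append(f)
        else:
            type1.append(f)
    return type1, type2


def mean(values):
    return sum(values) / len(values)


type1, type2 = simulate_agn_types()
mean_type1 = mean(type1)
mean_type2 = mean(type2)
print(f"mean covering factor, type 1: {mean_type1:.2f}")
print(f"mean covering factor, type 2: {mean_type2:.2f}")
```

For the assumed Beta(2, 2) distribution the two sample means should land near 0.4 and 0.6 even though every source is drawn from the same intrinsic distribution with mean 0.5.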
An Occultation Event in Centaurus A and the Clumpy Torus Model
We have analyzed 16 months of sustained monitoring observations of Cen A from
the Rossi X-ray Timing Explorer to search for changes in the absorbing column
in the line of sight to the central nucleus. We present time-resolved
spectroscopy which indicates that a discrete clump of material transited the
line of sight to the central illuminating source over the course of ~170 days
between 2010 August and 2011 February with a maximum increase in the column
density of about 8.4 x 10^22 cm^-2. These are the best-quality data of such an
event ever analyzed, with the shape of the ingress and egress
clearly seen. Modeling the clump of material as roughly spherical with a
linearly decreasing density profile and assuming a distance from the central
nucleus commensurate with the dusty torus we found that the clump would have a
diameter of 1.4-2.4 x 10^15 cm with a central number density of n_H = 1.8-3.0 x
10^7 cm^-3. This is consistent with previous results for a similar (though
possibly much longer) occultation event inferred in this source in 2003-2004
and supports models of the molecular torus as a clumpy medium.
Comment: 4 pages, 5 figures, Submitted to ApJ
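As a rough consistency check on the numbers quoted in this abstract, the clump diameter and the event duration together imply a transverse speed. The sketch below assumes, as a simplification not stated in the abstract, that the ~170 days corresponds to the clump moving one full diameter across the line of sight. The function name is hypothetical.

```python
SECONDS_PER_DAY = 86_400
CM_PER_KM = 1.0e5


def crossing_speed_km_s(diameter_cm, duration_days):
    """Speed needed to travel one clump diameter in the given time."""
    return diameter_cm / (duration_days * SECONDS_PER_DAY) / CM_PER_KM


duration_days = 170.0                 # approximate event duration from the abstract
diameter_range_cm = (1.4e15, 2.4e15)  # clump diameter range from the abstract

# Assumption: the event duration equals the time to cross one diameter.
speeds = [crossing_speed_km_s(d, duration_days) for d in diameter_range_cm]
print(f"implied transverse speed: {speeds[0]:.0f}-{speeds[1]:.0f} km/s")
```

Under that assumption the implied speeds are of order 1,000 km/s.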
Detecting (Un)Important Content for Single-Document News Summarization
We present a robust approach for detecting intrinsic sentence importance in
news, by training on two corpora of document-summary pairs. When used for
single-document summarization, our approach, combined with the "beginning of
document" heuristic, outperforms a state-of-the-art summarizer and the
beginning-of-article baseline in both automatic and manual evaluations. These
results represent an important advance because in the absence of cross-document
repetition, single document summarizers for news have not been able to
consistently outperform the strong beginning-of-article baseline.
Comment: Accepted By EACL 201
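The beginning-of-article baseline mentioned in this abstract simply takes the opening sentences of a news story as its summary. A minimal sketch follows, assuming a naive regex sentence splitter and a hypothetical `lead_baseline` helper; neither the function name, the splitter, nor the sample article comes from the paper.

```python
import re


def lead_baseline(article, n_sentences=3):
    """Return the first n_sentences of an article as its summary."""
    # Naive splitter: break after '.', '!' or '?' when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", article.strip())
    return " ".join(sentences[:n_sentences])


# Made-up example article, for illustration only.
article = (
    "The council approved the budget on Monday. "
    "The vote was 7 to 2. "
    "Opponents said the plan underfunds transit. "
    "A final reading is scheduled for next month."
)
summary = lead_baseline(article, n_sentences=2)
print(summary)
```

News writing tends to put the key facts first, which is why this trivial baseline is hard to beat on single-document news summarization.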
Study of sorption properties of lignin-derivatized fibrous composites for the remediation of oil polluted receiving waters
The sorption properties of lignin-wool composites towards oil pollution at different concentrations of the contamination were studied. The release ability of oil pollutant was studied by a gravimetric method and by determining the chemical oxygen demand of cleaned water. It has been established that technical hydrolysis lignin-wool composites display a low release ability of oil-based pollutants and a slow rate for achieving release equilibrium.
Nuclear X-ray properties of the peculiar radio-loud hidden AGN 4C+29.30
We present results from a study of the nuclear emission of a nearby radio
galaxy, 4C+29.30, over a broad 0.5-200 keV X-ray band. This study used new
XMM-Newton (~17 ksec) and Chandra (~300 ksec) data, and archival Swift/BAT data
from the 58-month catalog. The hard (>2 keV) X-ray spectrum of 4C+29.30 can be
decomposed into an intrinsic hard power-law (Gamma ~ 1.56) modified by a cold
absorber with an intrinsic column density N_{H,z} ~ 5x10^{23} cm^{-2}, and its
reflection (|Omega/2pi| ~ 0.3) from neutral matter, including a narrow iron
Kalpha emission line at the rest frame energy ~6.4 keV. The reflected component
is less absorbed than the intrinsic one with an upper limit on the absorbing
column of N^{refl}_{H,z} < 2.5x10^{22} cm^{-2}. The X-ray spectrum varied
between the XMM-Newton and Chandra observations. We show that a scenario
invoking variations of the normalization of the power-law is favored over a
model with variable intrinsic column density. X-rays in the 0.5-2 keV band are
dominated by diffuse emission modeled with a thermal bremsstrahlung component
with temperature ~0.7 keV, and contain only a marginal contribution from the
scattered power-law component. We hypothesize that 4C+29.30 belongs to a class
of `hidden' AGN containing a geometrically thick torus. However, unlike the
majority of them, 4C+29.30 is radio-loud. Correlations between the scattering
fraction and Eddington luminosity ratio, and the one between black hole mass
and stellar velocity dispersion, imply that 4C+29.30 hosts a black hole with
~10^8 M_{Sun} mass.
Comment: 13 pages, 5 figures, ApJ in pres
Automatic Text Summarization of Newswire: Lessons Learned from the Document Understanding Conference
Since 2001, the Document Understanding Conferences have been the forum for researchers in automatic text summarization to compare methods and results on common test sets. Over the years, several types of summarization tasks have been addressed: single-document summarization, multi-document summarization, summarization focused by question, and headline generation. This paper is an overview of the achieved results in the different types of summarization tasks. We compare both the broader classes of baselines, systems and humans, as well as individual pairs of summarizers (both human and automatic). An analysis of variance model is fitted, with summarizer and input set as independent variables and the coverage score as the dependent variable, and simulation-based multiple comparisons are performed. The results document the progress in the field as a whole, rather than focusing on a single system, and thus can serve as a future reference on the work done to date, as well as a starting point in the formulation of future tasks. Results also indicate that most progress in the field has been achieved in generic multi-document summarization and that the most challenging task is that of producing a focused summary in answer to a question/topic.
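The analysis of variance described in this abstract, with summarizer and input set as factors and coverage as the response, can be sketched as a two-way ANOVA without interaction. The code below is an illustration under stated assumptions, not the paper's analysis: it assumes exactly one coverage score per summarizer and input set, uses made-up scores, and omits the simulation-based multiple comparisons. The function name `two_way_anova` is hypothetical.

```python
def two_way_anova(scores):
    """Two-way ANOVA without interaction, one observation per cell.

    scores[i][j] is the coverage score of summarizer i on input set j.
    Returns the sums of squares and the F statistic for the summarizer factor.
    """
    a = len(scores)      # number of summarizers
    b = len(scores[0])   # number of input sets
    grand = sum(sum(row) for row in scores) / (a * b)
    row_means = [sum(row) / b for row in scores]
    col_means = [sum(scores[i][j] for i in range(a)) / a for j in range(b)]

    ss_total = sum((y - grand) ** 2 for row in scores for y in row)
    ss_summarizer = b * sum((m - grand) ** 2 for m in row_means)
    ss_input = a * sum((m - grand) ** 2 for m in col_means)
    ss_residual = ss_total - ss_summarizer - ss_input

    df_summarizer = a - 1
    df_residual = (a - 1) * (b - 1)
    f_summarizer = (ss_summarizer / df_summarizer) / (ss_residual / df_residual)
    return {
        "ss_total": ss_total,
        "ss_summarizer": ss_summarizer,
        "ss_input": ss_input,
        "ss_residual": ss_residual,
        "f_summarizer": f_summarizer,
    }


# Made-up coverage scores: rows are summarizers, columns are input sets.
scores = [
    [0.20, 0.30, 0.25, 0.35],  # hypothetical baseline
    [0.30, 0.35, 0.30, 0.45],  # hypothetical automatic system
    [0.50, 0.55, 0.45, 0.60],  # hypothetical human summarizer
]
result = two_way_anova(scores)
print(f"F for summarizer effect: {result['f_summarizer']:.1f}")
```

Including input set as a factor removes the variation due to some inputs being easier to summarize than others, so the summarizer effect is tested against a smaller residual.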