
    Spoken content retrieval: A survey of techniques and technologies

    Speech media, that is, digital audio and video containing spoken content, has blossomed in recent years. Large collections are accruing on the Internet as well as in private and enterprise settings. This growth has motivated extensive research on techniques and technologies that facilitate reliable indexing and retrieval. Spoken content retrieval (SCR) requires the combination of audio and speech processing technologies with methods from information retrieval (IR). SCR research initially investigated planned speech structured in document-like units, but has subsequently shifted focus to more informal spoken content produced spontaneously, outside of the studio and in conversational settings. This survey provides an overview of the field of SCR encompassing component technologies, the relationship of SCR to text IR and automatic speech recognition, and user interaction issues. It is aimed at researchers with backgrounds in speech technology or IR who are seeking deeper insight on how these fields are integrated to support research and development, thus addressing the core challenges of SCR.

    On the Unification of Active Galactic Nuclei

    The inevitable spread in properties of the toroidal obscuration of active galactic nuclei (AGNs) invalidates the widespread notion that type 1 and 2 AGNs are intrinsically the same objects, drawn randomly from the distribution of torus covering factors. Instead, AGNs are drawn preferentially from this distribution; type 2 are more likely drawn from the higher end of the distribution, type 1 from its lower end. Type 2 AGNs have a higher IR luminosity, lower narrow-line luminosity and a higher fraction of Compton thick X-ray obscuration than type 1. Meaningful studies of unification statistics cannot be conducted without first determining the intrinsic distribution function of torus covering factors. Comment: ApJ Letters, to be published. This is the final, journal version; minor editing revisions from original on

    An Occultation Event in Centaurus A and the Clumpy Torus Model

    We have analyzed 16 months of sustained monitoring observations of Cen A from the Rossi X-ray Timing Explorer to search for changes in the absorbing column in the line of sight to the central nucleus. We present time-resolved spectroscopy which indicates that a discrete clump of material transited the line of sight to the central illuminating source over the course of ~170 days between 2010 August and 2011 February, with a maximum increase in the column density of about 8.4 x 10^22 cm^-2. These are the best-quality data ever analyzed for such an event, with the shape of the ingress and egress clearly seen. Modeling the clump of material as roughly spherical with a linearly decreasing density profile, and assuming a distance from the central nucleus commensurate with the dusty torus, we found that the clump would have a diameter of 1.4-2.4 x 10^15 cm with a central number density of n_H = 1.8-3.0 x 10^7 cm^-3. This is consistent with previous results for a similar (though possibly much longer) occultation event inferred in this source in 2003-2004 and supports models of the molecular torus as a clumpy medium. Comment: 4 pages, 5 figures, submitted to ApJ

    Detecting (Un)Important Content for Single-Document News Summarization

    We present a robust approach for detecting intrinsic sentence importance in news, by training on two corpora of document-summary pairs. When used for single-document summarization, our approach, combined with the "beginning of document" heuristic, outperforms a state-of-the-art summarizer and the beginning-of-article baseline in both automatic and manual evaluations. These results represent an important advance because, in the absence of cross-document repetition, single-document summarizers for news have not been able to consistently outperform the strong beginning-of-article baseline. Comment: Accepted by EACL 201

    Study of sorption properties of lignin-derivatized fibrous composites for the remediation of oil polluted receiving waters

    The sorption properties of lignin-wool composites towards oil pollution at different concentrations of the contamination were studied. The release ability of the oil pollutant was studied by a gravimetric method and by determining the chemical oxygen demand of the cleaned water. It has been established that technical hydrolysis lignin–wool composites display a low release ability of oil-based pollutants and a slow rate for achieving release equilibrium.

    Nuclear X-ray properties of the peculiar radio-loud hidden AGN 4C+29.30

    We present results from a study of the nuclear emission of a nearby radio galaxy, 4C+29.30, over a broad 0.5-200 keV X-ray band. This study used new XMM-Newton (~17 ksec) and Chandra (~300 ksec) data, and archival Swift/BAT data from the 58-month catalog. The hard (>2 keV) X-ray spectrum of 4C+29.30 can be decomposed into an intrinsic hard power-law (Gamma ~ 1.56) modified by a cold absorber with an intrinsic column density N_{H,z} ~ 5x10^{23} cm^{-2}, and its reflection (|Omega/2pi| ~ 0.3) from neutral matter, including a narrow iron K-alpha emission line at the rest frame energy ~6.4 keV. The reflected component is less absorbed than the intrinsic one, with an upper limit on the absorbing column of N^{refl}_{H,z} < 2.5x10^{22} cm^{-2}. The X-ray spectrum varied between the XMM-Newton and Chandra observations. We show that a scenario invoking variations of the normalization of the power-law is favored over a model with variable intrinsic column density. X-rays in the 0.5-2 keV band are dominated by diffuse emission modeled with a thermal bremsstrahlung component with temperature ~0.7 keV, and contain only a marginal contribution from the scattered power-law component. We hypothesize that 4C+29.30 belongs to a class of 'hidden' AGN containing a geometrically thick torus. However, unlike the majority of them, 4C+29.30 is radio-loud. Correlations between the scattering fraction and Eddington luminosity ratio, and the one between black hole mass and stellar velocity dispersion, imply that 4C+29.30 hosts a black hole with ~10^8 M_{Sun} mass. Comment: 13 pages, 5 figures, ApJ in pres

    Automatic Text Summarization of Newswire: Lessons Learned from the Document Understanding Conference

    Since 2001, the Document Understanding Conferences have been the forum for researchers in automatic text summarization to compare methods and results on common test sets. Over the years, several types of summarization tasks have been addressed: single-document summarization, multi-document summarization, summarization focused by question, and headline generation. This paper is an overview of the achieved results in the different types of summarization tasks. We compare both the broader classes of baselines, systems and humans, as well as individual pairs of summarizers (both human and automatic). An analysis of variance model is fitted, with summarizer and input set as independent variables and the coverage score as the dependent variable, and simulation-based multiple comparisons are performed. The results document the progress in the field as a whole, rather than focusing on a single system, and thus can serve as a future reference on the work done to date, as well as a starting point in the formulation of future tasks. Results also indicate that most progress in the field has been achieved in generic multi-document summarization and that the most challenging task is that of producing a focused summary in answer to a question/topic.