Scientific Information Extraction with Semi-supervised Neural Tagging
This paper addresses the problem of extracting keyphrases from scientific
articles and categorizing them as corresponding to a task, process, or
material. We cast the problem as sequence tagging and introduce semi-supervised
methods to a neural tagging model, which builds on recent advances in named
entity recognition. Since annotated training data is scarce in this domain, we
introduce a graph-based semi-supervised algorithm together with a data
selection scheme to leverage unannotated articles. Both inductive and
transductive semi-supervised learning strategies outperform state-of-the-art
information extraction systems on SemEval 2017 Task 10 (ScienceIE).
Comment: accepted by EMNLP 2017
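The abstract above casts typed keyphrase extraction as sequence tagging. A minimal sketch of that reduction, assuming the standard BIO scheme over the ScienceIE categories (Task, Process, Material); the helper below is illustrative and not the authors' code:

```python
def to_bio(tokens, spans):
    """Convert (start, end, label) keyphrase spans over a token list into
    per-token BIO tags, e.g. B-Material, I-Material, O. `end` is exclusive."""
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        tags[start] = f"B-{label}"          # first token of the keyphrase
        for i in range(start + 1, end):     # continuation tokens
            tags[i] = f"I-{label}"
    return tags

tokens = ["We", "anneal", "the", "copper", "oxide", "samples"]
spans = [(1, 2, "Process"), (3, 5, "Material")]
print(to_bio(tokens, spans))
# → ['O', 'B-Process', 'O', 'B-Material', 'I-Material', 'O']
```

A neural tagger is then trained to predict one such tag per token, and contiguous B/I runs are read back out as typed keyphrases.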
Key Phrase Extraction of Lightly Filtered Broadcast News
This paper explores the impact of light filtering on automatic key phrase
extraction (AKE) applied to Broadcast News (BN). Key phrases are words and
expressions that best characterize the content of a document. Key phrases are
often used to index the document or as features in further processing. This
makes improvements in AKE accuracy particularly important. We hypothesized that
filtering out marginally relevant sentences from a document would improve AKE
accuracy. Our experiments confirmed this hypothesis. Elimination of as little
as 10% of the document sentences led to a 2% improvement in AKE precision and
recall. Our AKE system is built on the MAUI toolkit, which follows a supervised
learning approach. We trained and tested our AKE method on a gold standard of 8 BN
programs containing 110 manually annotated news stories. The experiments were
conducted within a Multimedia Monitoring Solution (MMS) system for TV and radio
news/programs, running daily and monitoring 12 TV and 4 radio channels.
Comment: In the 15th International Conference on Text, Speech and Dialogue (TSD 2012)
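The light-filtering step described above drops the least relevant sentences before keyphrase extraction runs. A sketch of one way to do this, where relevance is approximated by term overlap with the document's overall word frequencies; that scoring rule is an assumption, not the paper's exact measure:

```python
# Drop the ~drop_ratio least relevant sentences before AKE. Relevance here
# is a simple proxy: average document-frequency of the sentence's words.
from collections import Counter

def light_filter(sentences, drop_ratio=0.10):
    doc_freq = Counter(w for s in sentences for w in s.lower().split())

    def score(s):
        words = s.lower().split()
        return sum(doc_freq[w] for w in words) / max(len(words), 1)

    keep = max(1, round(len(sentences) * (1 - drop_ratio)))
    kept = set(sorted(sentences, key=score, reverse=True)[:keep])
    # preserve the original sentence order of the kept sentences
    return [s for s in sentences if s in kept]
```

For example, with four sentences and `drop_ratio=0.25`, an off-topic sentence sharing no vocabulary with the rest scores lowest and is removed, and the surviving sentences are passed on to the keyphrase extractor.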
Using term clouds to represent segment-level semantic content of podcasts
Spoken audio, like any time-continuous medium, is notoriously difficult to browse or skim without the support of an interface providing semantically annotated jump points that signal the user where to listen in. Creation of time-aligned metadata by human annotators is prohibitively expensive, motivating the investigation of representations of segment-level semantic content based on transcripts generated by automatic speech recognition (ASR). This paper examines the feasibility of using term clouds to provide users with a structured representation of the semantic content of podcast episodes. Podcast episodes are visualized as a series of sub-episode segments, each represented by a term cloud derived from an ASR transcript. The quality of segment-level term clouds is measured quantitatively, and their utility is investigated in a small-scale user study based on human-labeled segment boundaries. Since the segment-level clouds generated from ASR transcripts prove useful, we examine an adaptation of text tiling techniques to speech, in order to generate segments as part of a completely automated indexing and structuring system for browsing spoken audio. Results demonstrate that the segments generated are comparable with human-selected segment boundaries.
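A segment-level term cloud weights each term by how characteristic it is of its segment. A minimal sketch using a TF-IDF-style weighting across segments; the exact formula is an assumption, since the paper evaluates cloud quality rather than prescribing one weighting:

```python
# Build a top-k term cloud per segment: term frequency in the segment,
# discounted by the number of segments the term appears in.
import math
from collections import Counter

def term_clouds(segments, top_k=5):
    seg_tokens = [s.lower().split() for s in segments]
    df = Counter()                      # segment-level document frequency
    for toks in seg_tokens:
        df.update(set(toks))
    n = len(segments)
    clouds = []
    for toks in seg_tokens:
        tf = Counter(toks)
        weights = {t: c * math.log((n + 1) / df[t]) for t, c in tf.items()}
        ranked = sorted(weights, key=weights.get, reverse=True)
        clouds.append(ranked[:top_k])
    return clouds
```

Terms frequent within one segment but rare across the episode rise to the top, which is what makes the clouds usable as jump points when browsing the audio.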
Automatic population of knowledge bases with multimodal data about named entities
Knowledge bases are of great importance for Web search, recommendations, and many Information Retrieval tasks. However, maintaining them for not so popular entities is often a bottleneck. Typically, such entities have limited textual coverage and only a few ontological facts. Moreover, these entities are not well populated with multimodal data, such as images, videos, or audio recordings.
The goals in this thesis are (1) to populate a given knowledge base with multimodal data about entities, such as images or audio recordings, and (2) to ease the task of maintaining and expanding the textual knowledge about a given entity, by recommending valuable text excerpts to the contributors of knowledge bases.
The thesis makes three main contributions. The first two contributions concentrate on finding images of named entities with high precision, high recall, and high visual diversity. Our main focus is on less popular entities, for which image search engines fail to retrieve good results. Our methods utilize background knowledge about the entity, such as ontological facts or a short description, together with visual-based image similarity to rank and diversify a set of candidate images.
Our third contribution is an approach for extracting text contents related to a given entity. It leverages a language-model-based similarity between a short description of the entity and the text sources, and solves a budget-constrained optimization program without any assumptions on the text structure. Moreover, our approach is also able to reliably extract entity-related audio excerpts from news podcasts. We derive the time boundaries from the usually very noisy audio transcriptions.
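The budget-constrained selection step can be sketched with a standard greedy cost-benefit heuristic: rank candidate excerpts by similarity per unit of budget consumed and take them while the budget allows. This greedy rule is an assumption for illustration; the thesis solves an optimization program whose exact form is not reproduced here:

```python
# Pick text excerpts most similar to the entity's short description,
# subject to a total length budget (here measured in characters).
def select_excerpts(excerpts, budget):
    """excerpts: list of (text, similarity_score); budget: max total chars."""
    remaining = budget
    chosen = []
    # rank by similarity per character of budget consumed
    for text, sim in sorted(excerpts, key=lambda e: e[1] / len(e[0]),
                            reverse=True):
        if len(text) <= remaining:
            chosen.append(text)
            remaining -= len(text)
    return chosen
```

A short, highly similar excerpt is preferred over a long, mediocre one, so the budget is spent where the language-model similarity signal is strongest.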
Spoken content retrieval: A survey of techniques and technologies
Speech media, that is, digital audio and video containing spoken content, has blossomed in recent years. Large collections are accruing on the Internet as well as in private and enterprise settings. This growth has motivated extensive research on techniques and technologies that facilitate reliable indexing and retrieval. Spoken content retrieval (SCR) requires the combination of audio and speech processing technologies with methods from information retrieval (IR). SCR research initially investigated planned speech structured in document-like units, but has subsequently shifted focus to more informal spoken content produced spontaneously, outside of the studio and in conversational settings. This survey provides an overview of the field of SCR, encompassing component technologies, the relationship of SCR to text IR and automatic speech recognition, and user interaction issues. It is aimed at researchers with backgrounds in speech technology or IR who are seeking deeper insight into how these fields are integrated to support research and development, thus addressing the core challenges of SCR.
- …