51 research outputs found

    Factoid question answering for spoken documents

    In this dissertation, we present a factoid question answering system specifically tailored for Question Answering (QA) on spoken documents. This work explores, for the first time, which techniques can be robustly adapted from the usual QA on written documents to the more difficult spoken-document scenario. More specifically, we study new information retrieval (IR) techniques designed for speech, and exploit several levels of linguistic information for the speech-based QA task: named-entity detection with phonetic information, syntactic parsing applied to speech transcripts, and coreference resolution. Our approach is largely based on supervised machine learning techniques, with a special focus on the answer extraction step, and makes little use of handcrafted knowledge; consequently, it should be easily adaptable to other domains and languages. As part of the work behind this thesis, we have promoted and coordinated the creation of an evaluation framework for the task of QA on spoken documents. The framework, named QAst (Question Answering on Speech Transcripts), provides multi-lingual corpora, evaluation questions, and answer keys. These corpora were used in the QAst evaluations held at the CLEF workshops of 2007, 2008 and 2009, thus helping the development of state-of-the-art techniques for this particular topic. The presented QA system and all its modules are extensively evaluated on the English European Parliament Plenary Sessions (EPPS) corpus, composed of manual transcripts and automatic transcripts obtained with three different Automatic Speech Recognition (ASR) systems that exhibit significantly different word error rates. This data belongs to the CLEF 2009 track for QA on speech transcripts. The main results confirm that syntactic information is very useful for learning to rank answer candidates, improving results on both manual and automatic transcripts unless the ASR quality is very low.
Overall, the performance of our system is comparable to or better than the state of the art on this corpus, confirming the validity of our approach.
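The answer extraction step described above learns to rank candidate answers from linguistic features. As a minimal hypothetical sketch (the feature names, weights, and candidate answers below are invented for illustration and are not the thesis's actual model or data), a linear ranker over such features might look like:

```python
# Hypothetical sketch of supervised answer ranking: each candidate answer
# found in the retrieved passages carries a small feature vector (lexical
# overlap with the question, whether its named-entity type matches the
# expected answer type, a syntactic-distance feature), and a linear model
# with learned weights scores the candidates. All values are illustrative.

def score(weights, features):
    """Linear ranking score: dot product of weights and feature values."""
    return sum(weights[name] * value for name, value in features.items())

def best_candidate(weights, candidates):
    """Return the candidate answer with the highest ranking score."""
    return max(candidates, key=lambda c: score(weights, c["features"]))

# Illustrative weights, e.g. as learned from annotated question-answer pairs.
weights = {"keyword_overlap": 1.0, "ne_type_match": 2.0, "syntactic_distance": -0.5}

candidates = [
    {"text": "Strasbourg",
     "features": {"keyword_overlap": 0.4, "ne_type_match": 1.0, "syntactic_distance": 2.0}},
    {"text": "the rapporteur",
     "features": {"keyword_overlap": 0.6, "ne_type_match": 0.0, "syntactic_distance": 1.0}},
]

print(best_candidate(weights, candidates)["text"])
```

The ranking view explains why syntactic features help: they let the model prefer candidates that stand in the right structural relation to the question terms, not merely near them.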

    TechNews digests: Jan - Mar 2010

    TechNews is a technology news and analysis service aimed at anyone in the education sector keen to stay informed about technology developments, trends and issues. TechNews focuses on emerging technologies and other technology news. The TechNews service comprises digests from September 2004 to May 2010; analysis pieces and news are published together every two to three months.

    Personal long-term memory aids

    Thesis (Ph. D.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, February 2005. MIT Institute Archives copy: p. 101-132 bound in reverse order. Includes bibliographical references (p. 126-132). The prevalence and affordability of personal and environmental recording apparatuses are leading to increased documentation of our daily lives. This trend is bound to continue, and academic, industry, and government groups are showing increased interest in such endeavors for various purposes. In the present case, I assert that such documentation can be used to help remedy common memory problems. Assuming a long-term personal archive exists, when confronted with a memory problem one faces a new challenge: finding relevant memory triggers. This dissertation examines the use of information-retrieval technologies on long-term archives of personal experiences toward remedying certain types of long-term forgetting. The approach focuses on capturing audio for the content. Research on Spoken Document Retrieval examines the pitfalls of applying information-retrieval techniques to error-prone speech-recognizer-generated transcripts, and these challenges carry over to the present task. However, "memory retrieval" can benefit from the person's familiarity with the recorded data and the context in which it was recorded to help guide the effort. To study this, I constructed memory-retrieval tools designed to leverage a person's familiarity with their past to optimize their search task. To evaluate the utility of these tools for solving long-term memory problems, I (1) recorded public events and evaluated witnesses' memory-retrieval approaches using these tools; and (2) conducted a longer-term memory-retrieval study based on recordings of several years of my personal and research-related conversations.
    Subjects succeeded at memory-retrieval tasks in both studies, typically finding answers within minutes. This is far less time than the alternative of re-listening to hours of recordings. Subjects' memories of the past events, in particular their ability to narrow the window of time in which past events occurred, improved their ability to find answers. In addition to the results from the memory-retrieval studies, I present a technique called "speed listening": by using a transcript (even one with many errors), it allows people to reduce listening time while maintaining comprehension. Finally, I report on my experiences recording events in my life over 2.5 years. By Sunil Vemuri. Ph.D.
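The finding above, that narrowing the time window of a remembered event makes keyword search over transcripts far more effective, can be sketched as a simple query over timestamped segments. The segment data, field names, and scoring are invented for illustration; the actual tools in the dissertation are more elaborate.

```python
# Sketch of a memory-retrieval query: first filter timestamped transcript
# segments to the time window the person remembers, then rank the survivors
# by keyword overlap with the query. All data and names are hypothetical.

from datetime import datetime

segments = [
    {"time": datetime(2004, 3, 14, 10, 5),
     "text": "we discussed the demo for the media lab open house"},
    {"time": datetime(2004, 9, 2, 15, 30),
     "text": "remember to email the speech recognizer logs"},
    {"time": datetime(2005, 1, 20, 9, 0),
     "text": "the open house demo went well last spring"},
]

def search(query, start, end):
    """Return segments in [start, end], best keyword overlap first."""
    words = set(query.lower().split())
    window = [s for s in segments if start <= s["time"] <= end]
    return sorted(window, key=lambda s: -len(words & set(s["text"].split())))

# Narrowing to 2004 excludes the similar-looking 2005 segment up front.
hits = search("open house demo", datetime(2004, 1, 1), datetime(2004, 12, 31))
print(hits[0]["text"])
```

The time filter is what makes noisy transcripts tolerable: even with many recognition errors, the candidate set is small enough that a coarse keyword match surfaces the right recording.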

    Digital imaging technology assessment: Digital document storage project

    An ongoing technical assessment and requirements definition project is examining the potential role of digital imaging technology at NASA's STI facility. The focus is on the basic components of imaging technology in today's marketplace as well as the components anticipated in the near future. Presented is a requirement specification for a prototype project, an initial examination of current image processing at the STI facility, and an initial summary of image processing projects at other sites. Operational imaging systems incorporate scanners, optical storage, high resolution monitors, processing nodes, magnetic storage, jukeboxes, specialized boards, optical character recognition gear, pixel addressable printers, communications, and complex software processes.

    Error analysis in automatic speech recognition and machine translation

    Automatic speech recognition and machine translation are well-known terms in the translation world nowadays. Systems that carry out these processes are increasingly taking over work from humans, mainly because of the speed at which the tasks are performed and their lower cost. However, the quality of these systems is debatable: they are not yet capable of delivering the same performance as human transcribers or translators. A lack of creativity, of the ability to interpret texts, and of a sense of language is often cited as the reason why machines do not yet perform at the level of human translation or transcription work. Despite this, some companies use these machines in their production pipelines. Unbabel, an online translation platform powered by artificial intelligence, is one of them: through a combination of human translators and machines, Unbabel tries to provide its customers with translations of good quality. This internship report was written with the aim of gaining an overview of the performance of these systems and the errors they produce, and of building a picture of the error patterns each system exhibits. The work consists of an extensive analysis of the errors produced by automatic speech recognition and machine translation systems after automatically transcribing and translating 10 English videos into Dutch. Different kinds of videos were deliberately chosen to see whether the error patterns differed significantly between videos. The data and results generated by this work aim to suggest possible ways of improving the quality of the services mentioned.
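Transcription errors of the kind analysed in such a report are conventionally summarized by word error rate (WER): the minimum number of word substitutions, insertions, and deletions needed to turn the hypothesis transcript into the reference, divided by the reference length. A minimal dynamic-programming sketch (the example sentences are invented, and this is the standard metric rather than the report's own tooling):

```python
# Word error rate (WER) via edit distance between word sequences.
# d[i][j] = edit distance between the first i reference words and the
# first j hypothesis words; WER = d[len(ref)][len(hyp)] / len(ref).

def wer(reference, hypothesis):
    ref, hyp = reference.split(), hypothesis.split()
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # deleting i reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j  # inserting j hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j - 1] + sub,  # match / substitution
                          d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1)        # insertion
    return d[len(ref)][len(hyp)] / len(ref)

print(wer("the session is open", "the session was open"))  # 0.25
```

Note that WER only counts surface mismatches; the qualitative error patterns the report is after (misinterpretation, lost sense of language) require the kind of manual analysis it performs.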

    Automatic topic detection of multi-lingual news stories.

    Wong Kam Lai. Thesis (M.Phil.)--Chinese University of Hong Kong, 2000. Includes bibliographical references (leaves 92-98). Abstracts in English and Chinese. Contents:
    Chapter 1 -- Introduction: 1.1 Our Contributions; 1.2 Organization of this Thesis
    Chapter 2 -- Literature Review: 2.1 Dragon Systems; 2.2 Carnegie Mellon University (CMU); 2.3 University of Massachusetts (UMass); 2.4 IBM T.J. Watson Research Center; 2.5 BBN Technologies; 2.6 National Taiwan University (NTU); 2.7 Drawbacks of Existing Approaches
    Chapter 3 -- Overview of Proposed Approach: 3.1 News Source; 3.2 Story Preprocessing; 3.3 Concept Term Generation; 3.4 Named Entity Extraction; 3.5 Gross Translation of Chinese to English; 3.6 Topic Detection Method (3.6.1 Deferral Period; 3.6.2 Detection Approach)
    Chapter 4 -- Concept Term Model: 4.1 Background of Contextual Analysis; 4.2 Concept Term Generation (4.2.1 Concept Generation Algorithm; 4.2.2 Concept Term Representation for Detection)
    Chapter 5 -- Topic Detection Model: 5.1 Text Representation and Term Weights (5.1.1 Story Representation; 5.1.2 Topic Representation; 5.1.3 Similarity Score; 5.1.4 Time Adjustment Scheme); 5.2 Gross Translation Method; 5.3 The Detection System (5.3.1 Detection Requirement; 5.3.2 The Top Level Model); 5.4 The Clustering Algorithm (5.4.1 Similarity Calculation; 5.4.2 Grouping Related Elements; 5.4.3 Topic Identification)
    Chapter 6 -- Experimental Results and Analysis: 6.1 Evaluation Model (6.1.1 Evaluation Methodology); 6.2 Experiments on the effects of tuning the parameter; 6.3 Experiments on the effects of named entities and concept terms; 6.4 Experiments on the effect of using time adjustment; 6.5 Experiments on mono-lingual detection (6.2-6.5 each comprising Experiment Setup, and Results and Analysis)
    Chapter 7 -- Conclusions and Future Work: 7.1 Conclusions; 7.2 Future Work
    Appendix A -- List of Topics annotated for TDT3 Corpus; Appendix B -- Matching evaluation topics to hypothesized topics; Bibliography
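The detection pipeline outlined in the contents (story representation as weighted term vectors, a similarity score, then clustering into topics) is typically realized as single-pass clustering. A minimal sketch under that assumption; the threshold, the raw bag-of-words weighting, and the sample stories are illustrative only, not the thesis's actual settings:

```python
# Single-pass topic detection sketch: each incoming story is compared by
# cosine similarity against existing topic clusters; it joins the best
# match above a threshold, otherwise it seeds a new topic.

import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def detect_topics(stories, threshold=0.3):
    topics = []  # each topic is a list of story term vectors
    for story in stories:
        vec = Counter(story.lower().split())
        best, best_sim = None, 0.0
        for topic in topics:
            # Compare against the topic's seed story (a simple centroid
            # would be another common choice).
            sim = cosine(vec, topic[0])
            if sim > best_sim:
                best, best_sim = topic, sim
        if best is not None and best_sim >= threshold:
            best.append(vec)
        else:
            topics.append([vec])
    return topics

stories = [
    "earthquake strikes taiwan coast",
    "taiwan earthquake rescue effort continues",
    "parliament debates new budget",
]
print(len(detect_topics(stories)))  # 2
```

The thesis's refinements (concept terms, named entities, time adjustment, gross Chinese-to-English translation) slot into this skeleton as changes to how `vec` is built and how `sim` is adjusted.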

    Symposium on the future: focus on firms

    https://egrove.olemiss.edu/aicpa_comm/1205/thumbnail.jp