    Spoken content retrieval: A survey of techniques and technologies

    Speech media, that is, digital audio and video containing spoken content, has blossomed in recent years. Large collections are accruing on the Internet as well as in private and enterprise settings. This growth has motivated extensive research on techniques and technologies that facilitate reliable indexing and retrieval. Spoken content retrieval (SCR) requires the combination of audio and speech processing technologies with methods from information retrieval (IR). SCR research initially investigated planned speech structured in document-like units, but has subsequently shifted focus to more informal spoken content produced spontaneously, outside of the studio and in conversational settings. This survey provides an overview of the field of SCR, encompassing component technologies, the relationship of SCR to text IR and automatic speech recognition, and user interaction issues. It is aimed at researchers with backgrounds in speech technology or IR who are seeking deeper insight into how these fields are integrated to support research and development, thus addressing the core challenges of SCR.

    Filling Knowledge Gaps in a Broad-Coverage Machine Translation System

    Knowledge-based machine translation (KBMT) techniques yield high quality in domains with detailed semantic models, limited vocabulary, and controlled input grammar. Scaling up along these dimensions means acquiring large knowledge resources. It also means behaving reasonably when definitive knowledge is not yet available. This paper describes how we can fill various KBMT knowledge gaps, often using robust statistical techniques. We describe quantitative and qualitative results from JAPANGLOSS, a broad-coverage Japanese-English MT system. Comment: 7 pages, compressed and uuencoded PostScript. To appear: IJCAI-9

    Access to recorded interviews: A research agenda

    Recorded interviews form a rich basis for scholarly inquiry. Examples include oral histories, community memory projects, and interviews conducted for broadcast media. Emerging technologies offer the potential to radically transform the way in which recorded interviews are made accessible, but this vision will demand substantial investments from a broad range of research communities. This article reviews the present state of practice for making recorded interviews available and the state of the art for key component technologies. A large number of important research issues are identified, and from that set of issues, a coherent research agenda is proposed.

    Agent-mediated shared conceptualizations in tagging services

    Collaborative tagging systems are among the most notable technologies to emerge from Web 2.0. They allow folksonomies to serve as a useful structure for a number of tasks in the social web, such as navigation and knowledge organization. One of their main deficiencies stems from the differing tagging behaviour of users, which causes semantic heterogeneity in tagging. As a consequence, a user cannot benefit from the adequate tagging of others. To solve this problem, an agent-based reconciliation knowledge system, based on Formal Concept Analysis, is applied to facilitate semantic interoperability between personomies. This article describes experiments that focus on the conceptual structures produced by the system when it is applied to a collaborative tagging service, Delicious. The results show the prevalence of shared tags in the sharing of common resources in the reconciliation process. Funding: Ministerio de Ciencia e Innovación TIN2009-09492; Ministerio de Ciencia e Innovación TIN2010-20967-C04-0