1,827 research outputs found

    Augmenting conversations through context-aware multimedia retrieval based on speech recognition

    Future environments will be sensitive and responsive to the presence of people, supporting them in carrying out their everyday activities, tasks and rituals in an easy and natural way. Such interactive spaces will use information and communication technologies to bring computation into the physical world in order to enhance the ordinary activities of their users. This paper describes a speech-based multimedia retrieval system that can present relevant video-podcast (vodcast) footage in response to spontaneous speech and conversations during daily life activities. The proposed system allows users to search the spoken content of multimedia files rather than their associated meta-information, and lets them navigate to the portion where the queried words are spoken by facilitating within-medium searches of multimedia content through a bag-of-words approach. Finally, we have studied the proposed system in different scenarios, using English-language vodcasts from various categories as the targeted multimedia, and discuss how it would enhance people's everyday activities in scenarios including education, entertainment, marketing, news and workplace.
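The within-medium, bag-of-words search described in this abstract can be illustrated with a minimal sketch. It assumes a speech recognizer has already produced time-stamped transcript segments; the `SEGMENTS` data, the function names, and the overlap-count scoring are hypothetical choices for illustration, not the authors' implementation.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def rank_segments(segments, query):
    """Rank (start_seconds, text) segments by bag-of-words overlap with the query.

    Returns the start times of matching segments, best match first.
    """
    query_bag = Counter(tokenize(query))
    scored = []
    for start, text in segments:
        segment_bag = Counter(tokenize(text))
        # Counter intersection keeps the minimum count of each shared word.
        overlap = sum((query_bag & segment_bag).values())
        if overlap > 0:
            scored.append((overlap, start))
    # Highest overlap first; earlier segments win ties.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [start for _, start in scored]


# Hypothetical transcript of one vodcast (assumed recognizer output).
SEGMENTS = [
    (0.0, "welcome to the show today we talk about cooking"),
    (12.5, "first chop the onions and heat the olive oil"),
    (31.0, "the olive oil should not smoke before you add onions"),
    (48.0, "thanks for watching see you next week"),
]

if __name__ == "__main__":
    # Start times (in seconds) a player could seek to for this query.
    print(rank_segments(SEGMENTS, "olive oil onions"))
```

Returning start times rather than whole files is what lets a player jump to the portion where the queried words are spoken.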

    Utilizing Review Summarization in a Spoken Recommendation System

    In this paper we present a framework for spoken recommendation systems. To provide reliable recommendations to users, we incorporate a review summarization technique which extracts informative opinion summaries from grass-roots users' reviews. The dialogue system then utilizes these review summaries to support both quality-based opinion inquiry and feature-specific entity search. We propose a probabilistic language generation approach to automatically creating recommendations in spoken natural language from the text-based opinion summaries. A user study in the restaurant domain shows that the proposed approaches can effectively generate reliable and helpful recommendations in human-computer conversations. T-Party Project. Quanta Computer (Firm

    Audio Transcription and Summarization System using Cloud Computing and Artificial Intelligence

    In the modern era, organizations increasingly rely on virtual meetings to address customer issues promptly and effectively. However, dealing with recorded customer calls can be arduous. This review abstract introduces a methodology for summarizing audio data from customer interactions, which can streamline virtual meetings. Leveraging a speech recognizer, such as AssemblyAI's API, the methodology converts audio data into text and then employs a graph-theoretic approach to generate concise summaries. This review abstract also examines the growing prominence of cloud-based AI and ML services in the tech industry. It underscores the distinct competitive strategies and focuses of the major players, namely Amazon, Microsoft, and Google, in AI and ML platform development. The analysis explores these companies' internal applications and external ecosystems, dissecting their respective AI and ML development strategies. Finally, it predicts future directions for AI and ML platforms, including potential business models and emerging trends, and considers how Amazon, Microsoft, and Google align their platform development strategies with these prospects.
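The abstract does not say which graph-theoretic method it uses, so the following is only a sketch of one common option: build a sentence-similarity graph over the transcript and keep the most central sentences. The Jaccard edge weights, the degree-centrality scoring, the function names, and the sample `TRANSCRIPT` are all assumptions made for illustration; the transcription step (for example through a speech-recognition API) is assumed to have already produced plain text.

```python
import re


def split_sentences(text):
    """Split text into sentences at terminal punctuation followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def word_set(sentence):
    """Return the set of lowercase word tokens in a sentence."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def summarize(text, k=1):
    """Return the k most central sentences, in their original order.

    Sentences are graph nodes; each edge is weighted by the Jaccard
    similarity of the two sentences' word sets; a node's score is the
    sum of its edge weights (weighted degree centrality).
    """
    sentences = split_sentences(text)
    bags = [word_set(s) for s in sentences]
    n = len(sentences)
    scores = [0.0] * n
    for i in range(n):
        for j in range(i + 1, n):
            union = bags[i] | bags[j]
            if not union:
                continue
            weight = len(bags[i] & bags[j]) / len(union)
            scores[i] += weight
            scores[j] += weight
    top = sorted(range(n), key=lambda i: (-scores[i], i))[:k]
    return [sentences[i] for i in sorted(top)]


# Hypothetical transcript text (assumed output of a speech recognizer).
TRANSCRIPT = (
    "The customer reported a billing error. "
    "The billing error was caused by a duplicate charge. "
    "We refunded the duplicate charge. "
    "The weather was nice."
)

if __name__ == "__main__":
    print(summarize(TRANSCRIPT, k=2))
```

Iterative rankings such as TextRank refine the same graph idea; plain degree centrality is used here only to keep the sketch short.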

    Abstractive Multi-Document Summarization via Phrase Selection and Merging

    We propose an abstraction-based multi-document summarization framework that can construct new sentences by exploring more fine-grained syntactic units than sentences, namely, noun/verb phrases. Different from existing abstraction-based approaches, our method first constructs a pool of concepts and facts represented by phrases from the input documents. Then new sentences are generated by selecting and merging informative phrases to maximize the salience of phrases while satisfying the sentence construction constraints. We employ integer linear optimization to conduct phrase selection and merging simultaneously in order to achieve the globally optimal solution for a summary. Experimental results on the benchmark data set TAC 2011 show that our framework outperforms the state-of-the-art models under the automated pyramid evaluation metric, and achieves reasonably good results on manual linguistic quality evaluation. Comment: 11 pages, 1 figure, accepted as a full paper at ACL 201
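The selection half of this formulation can be illustrated with a toy binary optimization: choose the subset of phrases with maximum total salience under a summary length budget. This sketch swaps the paper's integer linear program for exhaustive search, which is only feasible for a handful of phrases, and it does not model phrase merging or the sentence construction constraints. The phrases, salience scores, and function name are hypothetical.

```python
from itertools import combinations


def select_phrases(phrases, budget):
    """Pick the (text, salience) phrases with maximum total salience whose
    combined word count fits within the budget, by exhaustive search.

    Each phrase is either taken or not, which mirrors the 0/1 decision
    variables an integer linear program would use.
    """
    best_score, best_subset = 0.0, ()
    indices = range(len(phrases))
    for size in range(1, len(phrases) + 1):
        for subset in combinations(indices, size):
            length = sum(len(phrases[i][0].split()) for i in subset)
            if length > budget:
                continue  # Violates the length constraint.
            score = sum(phrases[i][1] for i in subset)
            if score > best_score:
                best_score, best_subset = score, subset
    return [phrases[i][0] for i in best_subset]


# Hypothetical phrase pool with made-up salience scores.
PHRASES = [
    ("the earthquake", 3.0),
    ("struck the coastal city", 4.0),
    ("on Tuesday morning", 1.5),
    ("damaged hundreds of homes", 3.5),
]

if __name__ == "__main__":
    print(select_phrases(PHRASES, budget=6))
```

A real system would hand the same objective and constraints to an ILP solver, since the number of subsets grows exponentially with the size of the phrase pool.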

    L-Eval: Instituting Standardized Evaluation for Long Context Language Models

    Recently, there has been growing interest in extending the context length of instruction-following models in order to effectively process single-turn long input (e.g. summarizing a paper) and conversations with more extensive histories. While proprietary models such as GPT-4 and Claude have demonstrated considerable advancements in handling tens of thousands of tokens of context, open-sourced models are still in the early stages of experimentation. It also remains unclear whether developing these long context models can offer substantial gains on practical downstream tasks over retrieval-based methods or models simply trained on chunked contexts. To address this challenge, we propose to institute standardized evaluation for long context language models. Concretely, we develop L-Eval, which contains 411 long documents and over 2,000 query-response pairs manually annotated and checked by the authors, encompassing areas such as law, finance, school lectures, lengthy conversations, news, long-form novels, and meetings. L-Eval also adopts diverse evaluation methods and instruction styles, enabling a more reliable assessment of Long Context Language Models (LCLMs). Our findings indicate that while open-source models typically lag behind their commercial counterparts, they still exhibit impressive performance. LLaMA2 achieves the best results (win 45% vs turbo-16k) on open-ended tasks with only 4k context length and ChatGLM2 achieves the best results on closed-ended tasks with 8k input tokens. We release our new evaluation suite, code, and all generation results including predictions from all open-sourced LCLMs, GPT4-32k, Claude-100k at https://github.com/OpenLMLab/LEval.

    Bringing a Design to the Table

    This writing reflects upon an artist and designer’s shared experience of making a film about a design and planning team’s collaborative process. The film, entitled ‘(Re)searching a Welsh Design Vernacular’, documents a meeting that took place on Friday 3rd October 2008 at Grwp Gwalia, a social housing organisation based in Swansea, Wales. The film was exhibited as part of ‘Reflecting Wales: an architectural exhibition of innovative, speculative and built work in Wales’ at the Senedd in October 2008.

    Access to recorded interviews: A research agenda

    Recorded interviews form a rich basis for scholarly inquiry. Examples include oral histories, community memory projects, and interviews conducted for broadcast media. Emerging technologies offer the potential to radically transform the way in which recorded interviews are made accessible, but this vision will demand substantial investments from a broad range of research communities. This article reviews the present state of practice for making recorded interviews available and the state of the art for key component technologies. A large number of important research issues are identified, and from that set of issues, a coherent research agenda is proposed.