449 research outputs found

    TSST: A Benchmark and Evaluation Models for Text Speech-Style Transfer

    Text style is highly abstract, as it encompasses various aspects of a speaker's characteristics, habits, logical thinking, and the content they express. However, previous text-style transfer tasks have primarily focused on data-driven approaches, lacking in-depth analysis and research from the perspectives of linguistics and cognitive science. In this paper, we introduce a novel task called Text Speech-Style Transfer (TSST). The main objective is to further explore topics related to human cognition, such as personality and emotion, based on the capabilities of existing LLMs. Considering the objective of our task and the distinctive characteristics of oral speech in real-life scenarios, we trained multi-dimensional (i.e., filler words, vividness, interactivity, emotionality) evaluation models for the TSST and validated their correlation with human assessments. We thoroughly analyze the performance of several large language models (LLMs) and identify areas where further improvement is needed. Moreover, driven by our evaluation models, we have released a new corpus that improves the capabilities of LLMs in generating text with speech-style characteristics. In summary, we present the TSST task, a new benchmark for style transfer that emphasizes human-oriented evaluation, exploring and advancing the performance of current LLMs. Comment: Work in progress.

    Multimedia content description framework

    A framework is provided for describing multimedia content and a system in which a plurality of multimedia storage devices employing the content description methods of the present invention can interoperate. In accordance with one form of the present invention, the content description framework is a description scheme (DS) for describing streams or aggregations of multimedia objects, which may comprise audio, images, video, text, time series, and various other modalities. This description scheme can accommodate an essentially limitless number of descriptors in terms of features, semantics or metadata, and facilitate content-based search, index, and retrieval, among other capabilities, for both streamed and aggregated multimedia objects.

    An Investigation of the Persuasive Effects of Rhetorical Questions, Message Framing, and the ELM in Promoting Responsible Cell Phone Usage

    This study evaluated persuasive messages that advocate support for a ban against cell phones while driving using Petty and Cacioppo's Elaboration Likelihood Model of persuasion as its theoretical framework. Seven hypotheses were tested using a 2 x 2 x 2 factorial design assessing the influence of need for cognition (high vs. low) in tandem with the variables of message framing (gain vs. loss statements) and message form (questions vs. statements) upon assessments of elaboration (ME), cognition message value (CMV), message effectiveness ratings (MEF), and attitude toward the prescribed behavior (ATPB). A significant main effect was found for message framing as positively framed messages produced more positive ratings for CMV, the degree to which individuals found the advocacy to be intellectually stimulating and worthwhile as vehicles for persuasion. A pair of significant two way interactions were detected as: (1) High need for cognition individuals registered a stronger commitment toward the prescribed behavior (don't use a cell phone while driving) when exposed to negatively framed messages and (2) Low cognition receivers exposed to negatively framed messages registered a greater willingness to adopt the targeted behavior, future intent not to use a cell phone while driving. This latter result partially contradicted the original hypothesis.

    Contextual awareness, messaging and communication in nomadic audio environments

    Thesis (M.S.)--Massachusetts Institute of Technology, Program in Media Arts & Sciences, 1998. Includes bibliographical references (p. 119-122). Nitin Sawhney. M.S.

    Automatic Mobile Video Remixing and Collaborative Watching Systems

    In the thesis, the implications of combining collaboration with automation for remix creation are analyzed. We first present a sensor-enhanced Automatic Video Remixing System (AVRS), which intelligently processes mobile videos in combination with mobile device sensor information. The sensor-enhanced AVRS system involves certain architectural choices, which meet the key system requirements (leverage user generated content, use sensor information, reduce end user burden), and user experience requirements. Architecture adaptations are required to improve certain key performance parameters. In addition, certain operating parameters need to be constrained, for real world deployment feasibility. Subsequently, sensor-less cloud based AVRS and low footprint sensorless AVRS approaches are presented. The three approaches exemplify the importance of operating parameter tradeoffs for system design. The approaches cover a wide spectrum, ranging from a multimodal multi-user client-server system (sensor-enhanced AVRS) to a mobile application which can automatically generate a multi-camera remix experience from a single video. Next, we present the findings from the four user studies involving 77 users related to automatic mobile video remixing. The goal was to validate selected system design goals, provide insights for additional features and identify the challenges and bottlenecks. Topics studied include the role of automation, the value of a video remix as an event memorabilia, the requirements for different types of events and the perceived user value from creating multi-camera remix from a single video. System design implications derived from the user studies are presented. Subsequently, sport summarization, which is a specific form of remix creation is analyzed. In particular, the role of content capture method is analyzed with two complementary approaches. The first approach performs saliency detection in casually captured mobile videos; in contrast, the second one creates multi-camera summaries from role based captured content. Furthermore, a method for interactive customization of summary is presented. Next, the discussion is extended to include the role of users’ situational context and the consumed content in facilitating collaborative watching experience. Mobile based collaborative watching architectures are described, which facilitate a common shared context between the participants. The concept of movable multimedia is introduced to highlight the multidevice environment of current day users. The thesis presents results which have been derived from end-to-end system prototypes tested in real world conditions and corroborated with extensive user impact evaluation.

    Recent Trends in Computational Intelligence

    Traditional models struggle to cope with complexity, noise, and changing environments, while Computational Intelligence (CI) offers solutions to complicated problems as well as reverse problems. The main feature of CI is adaptability, spanning the fields of machine learning and computational neuroscience. CI also comprises biologically-inspired technologies such as swarm intelligence as part of evolutionary computation, and encompasses wider areas such as image processing, data collection, and natural language processing. This book aims to discuss the use of CI for optimally solving various applications, demonstrating its wide reach and relevance. Bounding of optimization methods and data mining strategies make a strong and reliable prediction tool for handling real-life applications.

    Creating an iPED Tour of Nantucket

    To enhance visitor learning and enjoyment, museums are transitioning from the traditional delivery of information via maps and guidebooks to the use of handheld interpretive and wayfinding devices. The Nantucket Historical Association (NHA) desired a handheld device to disseminate information about its historic sites. To address this desire, we evaluated handheld technologies, tested their acceptability among NHA patrons, developed our own prototype tour, and then tested it. Our project resulted in an expandable prototype tour and recommendations for the NHA.

    Activity Report 2002
