
    Transforming Copied Text Based on Paste Destination

    This publication describes systems and techniques directed to transforming copied text based on a paste destination. A user selects, via user input at a computing device, a text string that is output by a first application. The computing device analyzes the text string using natural language processing algorithms to identify associations between the text string and entities included in a knowledge base. In one example, an entity annotator analyzes the text string to determine a meaning and context of the text string (e.g., keywords, data types, conditions, etc.). The computing device generates, based on the meaning and context, a structured version of the text string by mapping the text string to a unique set of identifiers (IDs) that correspond to entities included in the knowledge base. The computing device then stores the structured version of the text string at intermediary storage. In response to receiving a paste command, the computing device identifies a destination context associated with the paste command and uses the destination context to identify augmented paste content from the knowledge base. In turn, the computing device pastes the augmented paste content to the destination.
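
    As a purely illustrative aside, the copy/annotate/store/paste flow described above could be sketched as follows in Python; every name here (StructuredClip, CopyPasteTransformer, the dictionary knowledge base) is a hypothetical stand-in, and a simple keyword lookup substitutes for the natural-language entity annotator.

```python
# Illustrative sketch only; all names are hypothetical and a naive keyword
# match substitutes for the entity annotator's NLP analysis.
from dataclasses import dataclass, field


@dataclass
class StructuredClip:
    """Intermediary-storage representation of copied text plus entity IDs."""
    raw_text: str
    entity_ids: list[str] = field(default_factory=list)


class CopyPasteTransformer:
    def __init__(self, knowledge_base: dict):
        self.kb = knowledge_base   # entity ID -> record with context-specific fields
        self.clipboard = None

    def copy(self, text: str) -> None:
        # Stand-in for the entity annotator: link the text to knowledge-base IDs.
        ids = [eid for eid, entity in self.kb.items()
               if entity["name"].lower() in text.lower()]
        self.clipboard = StructuredClip(raw_text=text, entity_ids=ids)

    def paste(self, destination_context: str) -> str:
        if self.clipboard is None:
            return ""
        # The destination context selects which knowledge-base fields to merge
        # into the pasted text (the "augmented paste content").
        extras = [self.kb[eid][destination_context]
                  for eid in self.clipboard.entity_ids
                  if destination_context in self.kb[eid]]
        suffix = f" ({'; '.join(extras)})" if extras else ""
        return self.clipboard.raw_text + suffix


# Example: pasting into a calendar-like destination pulls in a stored date.
kb = {"evt:launch": {"name": "product launch", "calendar": "2024-06-01"}}
cp = CopyPasteTransformer(kb)
cp.copy("Remember the product launch")
print(cp.paste("calendar"))   # Remember the product launch (2024-06-01)
```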

    Multimodal Content Editing

    This publication describes systems and techniques for multimodal content editing that enable a user to edit or extend content in a given modality using a variety of different input modes, while matching the underlying format of the content and preserving the original input. When the user uses an input mode different from the underlying format of a piece of content to provide input at a computing device to edit or extend the piece of content, the computing device may convert the input provided by the user from the input mode to the underlying format and may edit the piece of content based on the converted input. For example, if the user uses voice input to edit a text document, the computing device may convert the voice input into text and may include the converted text in the text document. In another example, if the user uses text input to edit an audio recording, the computing device may convert the text input into audio using a text-to-speech technique and may include the converted audio in the audio recording. The computing device may also preserve the provided input in its original form, such as by storing the provided input at the computing device, and may associate the stored input with the converted input so that the user may be able to refer back to the originally provided input. For example, if the computing device converts text input into audio for inclusion in an audio recording, the computing device may store the text input and may link the audio recording to the text input so that a user may be able to view the text input while listening to the audio recording.
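
    A minimal sketch of that convert-and-preserve flow, assuming a hypothetical transcribe() speech-to-text stand-in and a plain dictionary as the store for original inputs:

```python
# Minimal sketch of the described flow; transcribe() is a hypothetical
# stand-in for a real speech-to-text engine, and the document/store types
# are simplified for illustration.
import uuid


def transcribe(audio_bytes: bytes) -> str:
    """Placeholder speech-to-text conversion."""
    return "<transcribed text>"


def edit_text_document(document: str, voice_input: bytes, store: dict) -> str:
    # Convert the input mode (voice) into the document's underlying format (text).
    converted = transcribe(voice_input)
    # Preserve the original input and link it to the converted content so the
    # user can refer back to what was originally provided.
    original_id = str(uuid.uuid4())
    store[original_id] = voice_input
    return document + converted + f" [source-audio:{original_id}]"


preserved_inputs: dict = {}
doc = edit_text_document("Notes: ", b"raw-audio-bytes", preserved_inputs)
print(doc)
```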

    Ontologies for Tracking Ubiquitous Interest

    Within a ubiquitous environment, intelligent displays can select the most appropriate material depending on factors such as the audience's preferences and diversity of interest. In addition, such intelligent displays should adapt according to how the audience responds. To do this, they need to determine the composition of the audience, in terms of numbers and diversity of interest. This can affect the choice of video clip shown by taking into consideration the number of people in the local region and the preferences of the individuals in that region. In this paper, we introduce BluScreen, an agent-oriented market-place that uses ubiquitous wireless technology to determine the audience composition as part of the bidding process, and present an ontology that is used to describe the wireless devices (used to identify and track users) within the local region of a display.
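
    For illustration only, a data model in this spirit might look as follows; the class and field names are assumptions and are not taken from the BluScreen ontology itself.

```python
# Hypothetical data model loosely mirroring the kind of ontology the paper
# describes; none of these names come from the BluScreen ontology.
from dataclasses import dataclass


@dataclass
class WirelessDevice:
    device_id: str        # e.g. a Bluetooth address used to identify/track a user
    owner_interests: set  # interest profile associated with the device's owner


@dataclass
class DisplayRegion:
    devices: list

    def audience_interest_counts(self) -> dict:
        """Summarise audience composition for use by bidding advertising agents."""
        counts = {}
        for device in self.devices:
            for interest in device.owner_interests:
                counts[interest] = counts.get(interest, 0) + 1
        return counts


region = DisplayRegion(devices=[
    WirelessDevice("aa:bb:cc:dd:ee:01", {"sports", "music"}),
    WirelessDevice("aa:bb:cc:dd:ee:02", {"music"}),
])
# An agent whose content matches the dominant interest ("music": 2 here)
# could bid more in the market-place for the next advertising slot.
print(region.audience_interest_counts())
```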

    Fulfilling User Queries via the User’s Social and Professional Networks

    The queries virtual assistants can respond to are limited by the information sources available to the virtual assistant. As a result, when a user query cannot be satisfactorily answered from such sources, the user receives a suboptimal answer or is told that an answer cannot be provided. This disclosure describes techniques that, with user permission, leverage a user’s personal and professional networks to respond to user queries to a virtual assistant. A classifier is used to determine if a particular query is best handled by a person in the user’s networks. In such cases, the virtual assistant facilitates correspondence with the individual and can optionally take action based on the response received. Integrating the user’s networks within the interactive flow of seeking information via a virtual assistant can enhance the user experience (UX) by improving the quality, accuracy, and utility of the provided responses.
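
    A rough sketch of the routing decision, assuming a hypothetical keyword-based classify_query() and a simple expertise match in place of the actual classifier:

```python
# Illustrative routing sketch; classify_query() and the expertise-matching
# heuristic are hypothetical, not the classifier the disclosure describes.
def classify_query(query: str) -> str:
    """Return 'network' if the query likely needs a person, else 'knowledge'."""
    personal_cues = ("recommend", "has anyone", "who do i know")
    return "network" if any(cue in query.lower() for cue in personal_cues) else "knowledge"


def answer(query: str, contacts: list, knowledge_base: dict) -> str:
    if classify_query(query) == "network":
        # With user permission, find a contact whose expertise matches the query
        # and facilitate the correspondence on the user's behalf.
        for contact in contacts:
            if contact["expertise"] in query.lower():
                return f"Asking {contact['name']} ({contact['expertise']}) on your behalf..."
        return "No suitable contact found; falling back to standard sources."
    return knowledge_base.get(query, "I could not find an answer.")


contacts = [{"name": "Dana", "expertise": "plumbing"}]
print(answer("Who do I know that can recommend a plumbing service?", contacts, {}))
```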

    AudioLM: a Language Modeling Approach to Audio Generation

    We introduce AudioLM, a framework for high-quality audio generation with long-term consistency. AudioLM maps the input audio to a sequence of discrete tokens and casts audio generation as a language modeling task in this representation space. We show how existing audio tokenizers provide different trade-offs between reconstruction quality and long-term structure, and we propose a hybrid tokenization scheme to achieve both objectives. Namely, we leverage the discretized activations of a masked language model pre-trained on audio to capture long-term structure and the discrete codes produced by a neural audio codec to achieve high-quality synthesis. By training on large corpora of raw audio waveforms, AudioLM learns to generate natural and coherent continuations given short prompts. When trained on speech, and without any transcript or annotation, AudioLM generates syntactically and semantically plausible speech continuations while also maintaining speaker identity and prosody for unseen speakers. Furthermore, we demonstrate how our approach extends beyond speech by generating coherent piano music continuations, despite being trained without any symbolic representation of music.
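
    To make the "language modeling over discrete audio tokens" framing concrete, here is a deliberately toy sketch; a bigram model over invented token IDs stands in for AudioLM's learned tokenizers and Transformer stages.

```python
# Toy illustration of "audio generation as language modeling over discrete
# tokens". AudioLM itself uses learned tokenizers (a masked audio language
# model plus a neural codec) and Transformer language models; here a bigram
# model over made-up token IDs stands in for all of that.
import random
from collections import defaultdict


def train_bigram_lm(token_sequences):
    """Count bigram statistics over discrete (e.g. codec) token IDs."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in token_sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return counts


def continue_tokens(counts, prompt, length=10):
    """Autoregressively extend a token prompt, as AudioLM extends tokenized audio."""
    out = list(prompt)
    for _ in range(length):
        next_counts = counts.get(out[-1])
        if not next_counts:
            break
        tokens, weights = zip(*next_counts.items())
        out.append(random.choices(tokens, weights=weights)[0])
    return out


# Pretend these are tokenized audio clips (semantic or acoustic token IDs);
# the continuation would be decoded back to a waveform by the codec.
corpus = [[1, 2, 3, 2, 3, 4], [2, 3, 4, 5, 4, 5]]
lm = train_bigram_lm(corpus)
print(continue_tokens(lm, prompt=[1, 2], length=6))
```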

    AudioPaLM: A Large Language Model That Can Speak and Listen

    We introduce AudioPaLM, a large language model for speech understanding and generation. AudioPaLM fuses text-based and speech-based language models, PaLM-2 [Anil et al., 2023] and AudioLM [Borsos et al., 2022], into a unified multimodal architecture that can process and generate text and speech, with applications including speech recognition and speech-to-speech translation. AudioPaLM inherits from AudioLM the capability to preserve paralinguistic information such as speaker identity and intonation, and from PaLM-2 the linguistic knowledge present only in text large language models. We demonstrate that initializing AudioPaLM with the weights of a text-only large language model improves speech processing, successfully leveraging the larger quantity of text training data used in pretraining to assist with the speech tasks. The resulting model significantly outperforms existing systems for speech translation tasks and has the ability to perform zero-shot speech-to-text translation for many languages for which input/target language combinations were not seen in training. AudioPaLM also demonstrates features of audio language models, such as transferring a voice across languages based on a short spoken prompt. We release examples of our method at https://google-research.github.io/seanet/audiopalm/examples.
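
    A hedged sketch of the underlying idea of a single model over a joint text-and-audio token vocabulary follows; the vocabulary sizes, embedding width, and initialization scheme are assumptions for illustration, not the published AudioPaLM configuration.

```python
# Sketch of the "one model over text and audio tokens" idea; all sizes and
# the initialization shown here are assumptions, not AudioPaLM's actual setup.
import numpy as np

TEXT_VOCAB_SIZE = 32_000   # tokens of the pretrained text LM (assumed size)
AUDIO_VOCAB_SIZE = 1_024   # discrete audio tokens appended to that vocabulary
EMBED_DIM = 512            # assumed embedding width

# Start from the text model's embedding table (random stand-in weights here),
# so the linguistic knowledge of the text-only pretraining is inherited...
text_embeddings = np.random.randn(TEXT_VOCAB_SIZE, EMBED_DIM).astype(np.float32)

# ...then append freshly initialized rows for the audio tokens, giving a single
# multimodal model that can read and emit both kinds of token.
audio_embeddings = 0.02 * np.random.randn(AUDIO_VOCAB_SIZE, EMBED_DIM).astype(np.float32)
combined_embeddings = np.concatenate([text_embeddings, audio_embeddings], axis=0)


def embed(token_ids):
    """Look up mixed text/audio tokens (audio IDs are offset by TEXT_VOCAB_SIZE)."""
    return combined_embeddings[np.asarray(token_ids)]


# A speech task would interleave the two kinds of IDs, e.g. a text task prefix
# followed by audio-token IDs for the input utterance.
example = embed([17, 42, TEXT_VOCAB_SIZE + 5, TEXT_VOCAB_SIZE + 99])
print(example.shape)   # (4, 512)
```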

    RAND appropriateness panel to determine the applicability of UK guidelines on the management of acute respiratory distress syndrome (ARDS) and other strategies in the context of the COVID-19 pandemic.

    BACKGROUND: COVID-19 has become the most common cause of acute respiratory distress syndrome (ARDS) worldwide. Features of the pathophysiology and clinical presentation partially distinguish it from 'classical' ARDS. A Research and Development (RAND) analysis gauged the opinion of an expert panel about the management of ARDS with and without COVID-19 as the precipitating cause, using recent UK guidelines as a template. METHODS: An 11-person panel comprising intensive care practitioners rated the appropriateness of ARDS management options at different times during hospital admission, in the presence, absence, or varying severity of SARS-CoV-2 infection, on a scale of 1-9 (where 1-3 is inappropriate, 4-6 is uncertain and 7-9 is appropriate). A summary of the anonymised results was discussed at an online meeting moderated by an expert in RAND methodology. The modified online survey, comprising 76 questions subdivided into investigations (16), non-invasive respiratory support (18), basic intensive care unit management of ARDS (20), management of refractory hypoxaemia (8), pharmacotherapy (7) and anticoagulation (7), was completed again. RESULTS: Disagreement between experts was significant only when addressing the appropriateness of diagnostic bronchoscopy in patients with confirmed or suspected COVID-19. Adherence to existing published guidelines for the management of ARDS for relevant evidence-based interventions was recommended. Responses of the experts to the final survey suggested that the supportive management of ARDS should be the same, regardless of a COVID-19 diagnosis. For patients with ARDS with COVID-19, the panel recommended routine treatment with corticosteroids and a lower threshold for full anticoagulation based on a high index of suspicion for venous thromboembolic disease. CONCLUSION: The expert panel found no reason to deviate from the evidence-based supportive strategies for managing ARDS outlined in recent guidelines.
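
    As an illustrative aside, panel ratings on this scale can be summarised per question as in the sketch below; the appropriateness bands follow the abstract, while the disagreement rule is an assumed convention rather than the panel's exact definition.

```python
# Toy summary of RAND-style appropriateness ratings. The band definitions come
# from the abstract; the disagreement rule (at least a third of the panel in
# each extreme band) is an assumed convention, not necessarily the panel's.
from statistics import median


def classify(ratings):
    """Classify one management option from a panel's 1-9 appropriateness ratings."""
    third = len(ratings) / 3
    low = sum(1 for r in ratings if r <= 3)
    high = sum(1 for r in ratings if r >= 7)
    if low >= third and high >= third:
        return "disagreement"
    m = median(ratings)
    if m <= 3:
        return "inappropriate"
    if m <= 6:
        return "uncertain"
    return "appropriate"


# Eleven ratings, as from the 11-person panel described in the paper.
print(classify([7, 8, 8, 9, 7, 7, 8, 9, 7, 8, 8]))   # appropriate
```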