68 research outputs found

    Toward Interactive Dictation

    Voice dictation is an increasingly important text input modality. Existing systems that allow both dictation and editing-by-voice restrict their command language to flat templates invoked by trigger words. In this work, we study the feasibility of allowing users to interrupt their dictation with spoken editing commands in open-ended natural language. We introduce a new task and dataset, TERTiUS, to experiment with such systems. To support this flexibility in real time, a system must incrementally segment and classify spans of speech as either dictation or command, and interpret the spans that are commands. We experiment with using large pre-trained language models to predict the edited text, or alternatively, to predict a small text-editing program. Experiments show a natural trade-off between model accuracy and latency: a smaller model achieves 30% end-state accuracy with 1.3 seconds of latency, while a larger model achieves 55% end-state accuracy with 7 seconds of latency. Comment: 17 pages, 5 tables, 4 figures; AC

    Eliciting Human Preferences with Language Models

    Language models (LMs) can be directed to perform target tasks by using labeled examples or natural language prompts. But selecting examples or writing prompts can be challenging--especially in tasks that involve unusual edge cases, demand precise articulation of nebulous preferences, or require an accurate mental model of LM behavior. We propose to use *LMs themselves* to guide the task specification process. In this paper, we introduce **Generative Active Task Elicitation (GATE)**: a learning framework in which models elicit and infer intended behavior through free-form, language-based interaction with users. We study GATE in three domains: email validation, content recommendation, and moral reasoning. In preregistered experiments, we show that LMs prompted to perform GATE (e.g., by generating open-ended questions or synthesizing informative edge cases) elicit responses that are often more informative than user-written prompts or labels. Users report that interactive task elicitation requires less effort than prompting or example labeling and surfaces novel considerations not initially anticipated by users. Our findings suggest that LM-driven elicitation can be a powerful tool for aligning models to complex human preferences and values. Comment: 26 pages, 15 figures

    Quantifying Adaptability in Pre-trained Language Models with 500 Tasks

    When a neural language model (LM) is adapted to perform a new task, what aspects of the task predict the eventual performance of the model? In NLP, systematic features of LM generalization to individual examples are well characterized, but systematic aspects of LM adaptability to new tasks are not nearly as well understood. We present a large-scale empirical study of the features and limits of LM adaptability using a new benchmark, TaskBench500, built from 500 procedurally generated sequence modeling tasks. These tasks combine core aspects of language processing, including lexical semantics, sequence processing, memorization, logical reasoning, and world knowledge. Using TaskBench500, we evaluate three facets of adaptability, finding that: (1) adaptation procedures differ dramatically in their ability to memorize small datasets; (2) within a subset of task types, adaptation procedures exhibit compositional adaptability to complex tasks; and (3) failure to match training label distributions is explained by mismatches in the intrinsic difficulty of predicting individual labels. Our experiments show that adaptability to new tasks, like generalization to new examples, can be systematically described and understood, and we conclude with a discussion of additional aspects of adaptability that could be studied using the new benchmark. Comment: NAACL 2022; 20 pages, 6 figures, 8 tables

    Deep Sequencing of the Nicastrin Gene in Pooled DNA, the Identification of Genetic Variants That Affect Risk of Alzheimer's Disease

    Nicastrin is an obligatory component of γ-secretase, the enzyme complex that leads to the production of Aβ fragments central to the pathogenesis of Alzheimer's disease (AD). Analyses of the effects of common variation in this gene on risk for late-onset AD have been inconclusive. We investigated the effect of rare variation in the coding regions of the Nicastrin gene in a cohort of AD patients and matched controls using an innovative pooling approach and next-generation sequencing. Five SNPs were identified and validated by individual genotyping from 311 cases and 360 controls. Association analysis identified a non-synonymous rare SNP (N417Y) with a statistically higher frequency in cases compared to controls in the Greek population (OR 3.994, CI 1.105–14.439, p = 0.035). This finding warrants further investigation in a larger cohort and adds weight to the hypothesis that rare variation explains some of the genetic heritability still to be identified in Alzheimer's disease.
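The odds ratio and confidence interval reported above can be illustrated with a minimal sketch. It assumes a 2×2 carrier table and the Woolf (log-normal) interval; the abstract does not say which interval method the authors used, and the function name `odds_ratio_ci` and the counts in the usage note are hypothetical, not taken from the study.

```python
import math


def odds_ratio_ci(case_carriers, case_noncarriers,
                  control_carriers, control_noncarriers, z=1.96):
    """Return (odds ratio, lower bound, upper bound).

    Uses the Woolf (log-normal) approximation, an assumption of this
    sketch rather than the method confirmed by the abstract.
    z=1.96 corresponds to a 95% confidence interval.
    """
    a, b = case_carriers, case_noncarriers
    c, d = control_carriers, control_noncarriers
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts for illustration only (not the study's data).
example = odds_ratio_ci(10, 20, 5, 40)
```

With rare variants the cell counts are small, so the interval is wide; an interval whose lower bound sits just above 1, as in the reported result, is what one would expect from a nominally significant association in a modest cohort.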

    Identification and characterization of microRNAs expressed in the African malaria vector Anopheles funestus life stages using high throughput sequencing

    Background: Over the past several years, thousands of microRNAs (miRNAs) have been identified in the genomes of various insects through cloning and sequencing or even by computational prediction. However, the number of miRNAs identified in anopheline species is low and little is known about their role. The mosquito Anopheles funestus is one of the dominant malaria vectors in Africa, which infects and kills millions of people every year. Therefore, small RNA molecules isolated from the four life stages (eggs, larvae, pupae and unfed adult females) of An. funestus were sequenced using next-generation sequencing technology. Results: High-throughput sequencing of four replicates in combination with computational analysis identified 107 mature miRNA sequences expressed in the An. funestus mosquito. These include 20 novel miRNAs without sequence identity in any organism and eight miRNAs not previously reported in the Anopheles genus but known in non-anopheline mosquitoes. Finally, the changes in the expression of miRNAs during mosquito development were determined, and the analysis showed that many miRNAs have stage-specific expression and are co-transcribed and co-regulated during development. Conclusions: This study presents the first direct experimental evidence of miRNAs in An. funestus and the first profiling study of miRNAs associated with maturation in this mosquito. Overall, the results indicate that miRNAs play important roles during growth and development. Silencing such molecules in a specific life stage could decrease the vector population and therefore interrupt malaria transmission.

    Molecular Evolution of Ultraspiracle Protein (USP/RXR) in Insects

    Ultraspiracle protein/retinoid X receptor (USP/RXR) is a nuclear receptor and transcription factor which is an essential component of a heterodimeric receptor complex with the ecdysone receptor (EcR). In insects this complex binds ecdysteroids and plays an important role in the regulation of growth, development, metamorphosis and reproduction. In some holometabolous insects, including Lepidoptera and Diptera, USP/RXR is thought to have experienced several important shifts in function. These include the acquisition of novel ligand-binding properties and an expanded dimerization interface with EcR. In light of these recent hypotheses, we implemented codon-based likelihood methods to investigate if the proposed shifts in function are reflected in changes in site-specific evolutionary rates across functional and structural motifs in insect USP/RXR sequences, and if there is any evidence for positive selection at functionally important sites. Our results reveal evidence of positive selection acting on sites within the loop connecting helices H1 and H3, the ligand-binding pocket, and the dimer interface in the holometabolous lineage leading to the Lepidoptera/Diptera/Trichoptera. Similar analyses conducted using EcR sequences did not indicate positive selection. However, analyses allowing for variation across sites demonstrated elevated non-synonymous/synonymous rate ratios (dN/dS), suggesting relaxed constraint, within the dimerization interface of both USP/RXR and EcR as well as within the coactivator binding groove and helix H12 of USP/RXR. Since the above methods are based on the assumption that dS is constant among sites, we also used more recent models which relax this assumption and obtained results consistent with traditional random-sites models. Overall, our findings support the evolution of novel function in USP/RXR of more derived holometabolous insects, and are consistent with shifts in structure and function which may have increased USP/RXR reliance on EcR for cofactor recruitment. Moreover, these findings raise important questions regarding hypotheses which suggest the independent activation of USP/RXR by its own ligand.

    The development and validation of a scoring tool to predict the operative duration of elective laparoscopic cholecystectomy

    Background: The ability to accurately predict operative duration has the potential to optimise theatre efficiency and utilisation, thus reducing costs and increasing staff and patient satisfaction. With laparoscopic cholecystectomy being one of the most commonly performed procedures worldwide, a tool to predict operative duration could be extremely beneficial to healthcare organisations. Methods: Data collected from the CholeS study on patients undergoing cholecystectomy in UK and Irish hospitals between 04/2014 and 05/2014 were used to study operative duration. A multivariable binary logistic regression model was produced in order to identify significant independent predictors of long (> 90 min) operations. The resulting model was converted to a risk score, which was subsequently validated on a second cohort of patients using ROC curves. Results: After exclusions, data were available for 7227 patients in the derivation (CholeS) cohort. The median operative duration was 60 min (interquartile range 45–85), with 17.7% of operations lasting longer than 90 min. Ten factors were found to be significant independent predictors of operative durations > 90 min, including ASA, age, previous surgical admissions, BMI, gallbladder wall thickness and CBD diameter. A risk score was then produced from these factors, and applied to a cohort of 2405 patients from a tertiary centre for external validation. This returned an area under the ROC curve of 0.708 (SE = 0.013, p < …), with the proportion of operations lasting > 90 min increasing more than eightfold from 5.1 to 41.8% in the extremes of the score. Conclusion: The scoring tool produced in this study was found to be significantly predictive of long operative durations on validation in an external cohort. As such, the tool may have the potential to enable organisations to better organise theatre lists and deliver greater efficiencies in care.
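The validation step described above, scoring an external cohort and summarising discrimination as an area under the ROC curve, can be sketched as follows. This is an illustrative implementation of the standard pairwise (Mann-Whitney) definition of AUC, not the authors' analysis code; the function name `roc_auc` and the toy scores are hypothetical.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: risk scores, higher meaning a long operation is more likely.
    labels: 1 if the operation lasted > 90 min, otherwise 0.
    The AUC equals the probability that a randomly chosen long
    operation has a higher score than a randomly chosen short one,
    counting ties as one half.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data for illustration only (not from the CholeS cohort).
toy_auc = roc_auc([1, 2, 3, 4], [0, 1, 0, 1])
```

An AUC of 0.5 corresponds to a score with no discriminative value and 1.0 to perfect separation, which places the reported 0.708 in the range usually described as moderate discrimination. The quadratic pairwise loop is chosen for clarity; a rank-based computation would scale better to cohorts of several thousand patients.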