33 research outputs found

    Execution-based Code Generation using Deep Reinforcement Learning

    The utilization of programming language (PL) models, pretrained on large-scale code corpora, as a means of automating software engineering processes has demonstrated considerable potential in streamlining various code generation tasks such as code completion, code translation, and program synthesis. However, current approaches mainly rely on supervised fine-tuning objectives borrowed from text generation, neglecting specific sequence-level features of code, including but not limited to compilability as well as syntactic and functional correctness. To address this limitation, we propose PPOCoder, a new framework for code generation that combines pretrained PL models with Proximal Policy Optimization (PPO) deep reinforcement learning and employs execution feedback as the external source of knowledge into the model optimization. PPOCoder is transferable across different code generation tasks and PLs. Extensive experiments on three code generation tasks demonstrate the effectiveness of our proposed approach compared to SOTA methods, improving the success rate of compilation and functional correctness over different PLs. Our code can be found at https://github.com/reddy-lab-code-research/PPOCoder
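
    The idea of using execution feedback as a training signal can be illustrated with a minimal sketch. The function name `execution_reward` and the reward values (-1.0, -0.5, +1.0) are assumptions chosen for illustration; they are not taken from PPOCoder, whose actual reward design is described in the paper and repository.

```python
def execution_reward(source: str, test: str) -> float:
    """Score a generated Python snippet by compiling and running it.

    Hypothetical reward scheme (illustrative values only):
      -1.0  the snippet does not compile
      -0.5  it compiles but raises an error or fails the test
      +1.0  it compiles and the test passes
    """
    try:
        code = compile(source, "<candidate>", "exec")
    except SyntaxError:
        return -1.0

    namespace: dict = {}
    try:
        exec(code, namespace)   # define the candidate's functions
        exec(test, namespace)   # run the test against them
    except Exception:
        return -0.5
    return 1.0


if __name__ == "__main__":
    unit_test = "assert add(1, 2) == 3"
    print(execution_reward("def add(a, b): return a + b", unit_test))
    print(execution_reward("def add(a, b) return a + b", unit_test))
    print(execution_reward("def add(a, b): return a - b", unit_test))
```

    In practice, generated code would be run in a sandbox with time and memory limits rather than with a bare `exec`, and the scalar reward would be passed to the policy-gradient update.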

    Identifying TBI Physiological States by Clustering Multivariate Clinical Time-Series Data

    Determining clinically relevant physiological states from multivariate time series data with missing values is essential for providing appropriate treatment for acute conditions such as Traumatic Brain Injury (TBI), respiratory failure, and heart failure. Utilizing non-temporal clustering or data imputation and aggregation techniques may lead to loss of valuable information and biased analyses. In our study, we apply the SLAC-Time algorithm, an innovative self-supervision-based approach that maintains data integrity by avoiding imputation or aggregation, offering a more useful representation of acute patient states. By using SLAC-Time to cluster data in a large research dataset, we identified three distinct TBI physiological states and their specific feature profiles. We employed various clustering evaluation metrics and incorporated input from a clinical domain expert to validate and interpret the identified physiological states. Further, we discovered how specific clinical events and interventions can influence patient states and state transitions. Comment: 10 pages, 7 figures, 2 tables

    A Self-Supervised Learning-based Approach to Clustering Multivariate Time-Series Data with Missing Values (SLAC-Time): An Application to TBI Phenotyping

    Self-supervised learning approaches provide a promising direction for clustering multivariate time-series data. However, real-world time-series data often include missing values, and the existing approaches require imputing missing values before clustering, which may cause extensive computations and noise and result in invalid interpretations. To address these challenges, we present a Self-supervised Learning-based Approach to Clustering multivariate Time-series data with missing values (SLAC-Time). SLAC-Time is a Transformer-based clustering method that uses time-series forecasting as a proxy task for leveraging unlabeled data and learning more robust time-series representations. This method jointly learns the neural network parameters and the cluster assignments of the learned representations. It iteratively clusters the learned representations with the K-means method and then utilizes the subsequent cluster assignments as pseudo-labels to update the model parameters. To evaluate our proposed approach, we applied it to clustering and phenotyping Traumatic Brain Injury (TBI) patients in the Transforming Research and Clinical Knowledge in Traumatic Brain Injury (TRACK-TBI) study. Our experiments demonstrate that SLAC-Time outperforms the baseline K-means clustering algorithm in terms of silhouette coefficient, Calinski Harabasz index, Dunn index, and Davies Bouldin index. We identified three TBI phenotypes that are distinct from one another in terms of clinically significant variables as well as clinical outcomes, including the Extended Glasgow Outcome Scale (GOSE) score, Intensive Care Unit (ICU) length of stay, and mortality rate. The experiments show that the TBI phenotypes identified by SLAC-Time can be potentially used for developing targeted clinical trials and therapeutic strategies. Comment: Submitted to the Journal of Biomedical Informatics
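
    The pseudo-labeling step described above (cluster the learned representations with K-means, then treat the assignments as labels) can be sketched in plain Python. This is only a toy illustration: the function name `pseudo_label_round`, the 2-D points standing in for learned representations, and the fixed seed are all assumptions, and the Transformer update that would consume the labels is not shown.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def pseudo_label_round(embeddings, k, iters=20, seed=0):
    """Run plain K-means on the embeddings and return one pseudo-label per point."""
    rng = random.Random(seed)
    centroids = rng.sample(embeddings, k)  # initialise from k distinct points
    labels = [0] * len(embeddings)
    for _ in range(iters):
        # Assignment step: nearest centroid for every embedding.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in embeddings
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(embeddings, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels


if __name__ == "__main__":
    # Hypothetical 2-D "representations" forming two well-separated groups.
    toy = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    print(pseudo_label_round(toy, k=2))
```

    In a full training loop, the returned labels would serve as classification targets for one round of parameter updates, after which the representations are recomputed and clustered again.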

    Psychopathology predicts the outcome of medial branch blocks with corticosteroid for chronic axial low back or cervical pain: a prospective cohort study

    Background: Comorbid psychopathology is an important predictor of poor outcome for many types of treatments for back or neck pain. However, it is unknown whether this applies to the results of medial branch blocks (MBBs) for chronic low back or neck pain, which involve injecting the medial branch of the dorsal ramus nerves that innervate the facet joints. The objective of this study was to determine whether high levels of psychopathology are predictive of pain relief after MBB injections in the lumbar or cervical spine. Methods: This was a prospective cohort study. Consecutive patients in a pain medicine practice undergoing MBBs of the lumbar or cervical facets with corticosteroids were recruited to participate. Subjects were selected for an MBB based on operationalized selection criteria, and the procedure was performed in a standardized manner. Subjects completed the Brief Pain Inventory (BPI) and the Hospital Anxiety and Depression Scale (HADS) just prior to the procedure and at one-month follow-up. Scores on the HADS classified the subjects into three groups based on psychiatric symptoms, which formed the primary predictor variable: Low, Moderate, or High levels of psychopathology. The primary outcome measure was the percent improvement in average daily pain rating one month following an injection. Analysis of variance and chi-square tests were used to analyze the analgesia and functional rating differences between groups, and to perform a responder analysis. Results: Eighty-six (86) subjects completed the study. The Low psychopathology group (n = 37) reported a mean of 23% improvement in pain at one month, while the High psychopathology group (n = 29) reported a mean worsening of 5.8% in pain (p < .001). Forty-five percent (45%) of the Low group had at least 30% improvement in pain versus 10% in the High group (p < .001). Using an analysis of covariance, no baseline demographic, social, or medical variables were significant predictors of pain improvement, nor did they mitigate the effect of psychopathology on the outcome. Conclusion: Psychiatric comorbidity is associated with diminished pain relief after an MBB injection performed with steroid at one-month follow-up. These findings illustrate the importance of assessing comorbid psychopathology as part of a spine care evaluation.
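
    The outcome arithmetic in this abstract (percent improvement in pain rating, and a responder defined as at least 30% improvement) can be written out as a short sketch. The function names and the example ratings are hypothetical; only the 30% threshold comes from the abstract.

```python
def percent_improvement(baseline: float, follow_up: float) -> float:
    """Percent improvement in pain rating; negative values mean worsening."""
    if baseline <= 0:
        raise ValueError("baseline pain rating must be positive")
    return 100.0 * (baseline - follow_up) / baseline


def is_responder(baseline: float, follow_up: float, threshold: float = 30.0) -> bool:
    """True when improvement meets the responder threshold (30% in the abstract)."""
    return percent_improvement(baseline, follow_up) >= threshold


if __name__ == "__main__":
    # Hypothetical (baseline, one-month) average daily pain ratings.
    for before, after in [(10, 7), (8, 6), (5, 6)]:
        print(before, after, percent_improvement(before, after), is_responder(before, after))
```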

    Analyzing a fake news authorship network

    This project synthesizes a set of 246 fake news websites previously identified in three earlier research projects. From this dataset, we extract a set of all authors who wrote for these sites in 2016. This author-centric dataset is itself a contribution that will allow future analysis of the fake news ecosystem. Based on the data we collected, we construct a network of fake news sites, linking two sites if they share a common author. Our analysis shows a tight cluster of author-sharing sites, with a small core set of sites sharing dozens of authors.
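
    The network construction described above, in which two sites are linked whenever they share an author, can be sketched as follows. The function name `build_site_network` and the site and author names are invented for illustration and do not come from the dataset.

```python
from collections import defaultdict
from itertools import combinations


def build_site_network(site_authors):
    """Map each pair of sites to the set of authors they have in common.

    `site_authors` maps a site name to an iterable of author names.
    Only pairs sharing at least one author appear in the result.
    """
    # Invert the mapping: author -> sites the author wrote for.
    author_sites = defaultdict(set)
    for site, authors in site_authors.items():
        for author in authors:
            author_sites[author].add(site)

    # Every pair of sites sharing an author gets an edge.
    edges = defaultdict(set)
    for author, sites in author_sites.items():
        for a, b in combinations(sorted(sites), 2):
            edges[(a, b)].add(author)
    return dict(edges)


if __name__ == "__main__":
    # Hypothetical data for illustration only.
    example = {
        "site-a.example": ["alice", "bob"],
        "site-b.example": ["bob", "carol"],
        "site-c.example": ["dave"],
    }
    print(build_site_network(example))
```

    The size of each author set gives a natural edge weight, so a "core" of sites sharing dozens of authors would show up as heavily weighted edges.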

    International Consensus Statement on Rhinology and Allergy: Rhinosinusitis

    Background: The 5 years since the publication of the first International Consensus Statement on Allergy and Rhinology: Rhinosinusitis (ICAR‐RS) have witnessed foundational progress in our understanding and treatment of rhinologic disease. These advances are reflected within the more than 40 new topics covered within the ICAR‐RS‐2021 as well as updates to the original 140 topics. This executive summary consolidates the evidence‐based findings of the document. Methods: ICAR‐RS presents over 180 topics in the forms of evidence‐based reviews with recommendations (EBRRs), evidence‐based reviews, and literature reviews. The highest grade structured recommendations of the EBRR sections are summarized in this executive summary. Results: ICAR‐RS‐2021 covers 22 topics regarding the medical management of RS, which are grade A/B and are presented in the executive summary. Additionally, 4 topics regarding the surgical management of RS are grade A/B and are presented in the executive summary. Finally, a comprehensive evidence‐based management algorithm is provided. Conclusion: This ICAR‐RS‐2021 executive summary provides a compilation of the evidence‐based recommendations for medical and surgical treatment of the most common forms of RS.

    StructCoder: Structure-Aware Transformer for Code Generation

    There has been a recent surge of interest in automating software engineering tasks using deep learning. This work addresses the problem of code generation where the goal is to generate target code given source code in a different language or a natural language description. Most of the state-of-the-art deep learning models for code generation use training strategies that are primarily designed for natural language. However, understanding and generating code requires a more rigorous comprehension of the code syntax and semantics. With this motivation, we develop an encoder-decoder Transformer model where both the encoder and decoder are trained to recognize the syntax and data flow in the source and target codes, respectively. We not only make the encoder structure-aware by leveraging the source code's syntax tree and data flow graph, but we also ensure that our decoder preserves the syntax and data flow of the target code by introducing two auxiliary tasks: AST (Abstract Syntax Tree) paths prediction and data flow prediction. To the best of our knowledge, this is the first work to introduce a structure-aware Transformer decoder to enhance the quality of generated code by modeling target syntax and data flow. The proposed StructCoder model achieves state-of-the-art performance on code translation and text-to-code generation tasks in the CodeXGLUE benchmark
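
    To give a sense of what an "AST path" is, the sketch below lists root-to-leaf paths of node types for a small snippet using Python's standard `ast` module. This is an assumption-laden illustration: the function name `root_to_leaf_paths` is hypothetical, and StructCoder's own definition of the AST paths prediction task (and its parser) may differ.

```python
import ast


def root_to_leaf_paths(source: str):
    """Return every root-to-leaf path in the AST as a list of node type names."""
    tree = ast.parse(source)
    paths = []

    def walk(node, prefix):
        prefix = prefix + [type(node).__name__]
        children = list(ast.iter_child_nodes(node))
        if not children:          # a leaf: record the full path from the root
            paths.append(prefix)
        for child in children:
            walk(child, prefix)

    walk(tree, [])
    return paths


if __name__ == "__main__":
    for path in root_to_leaf_paths("x = 1"):
        print(" -> ".join(path))
```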

    WindowSHAP: An Efficient Framework for Explaining Time-series Classifiers based on Shapley Values

    Unpacking and comprehending how black-box machine learning algorithms make decisions has been a persistent challenge for researchers and end-users. Explaining time-series predictive models is useful for clinical applications with high stakes to understand the behavior of prediction models. However, existing approaches to explain such models are frequently limited to data whose features do not have a time-varying component. In this paper, we introduce WindowSHAP, a model-agnostic framework for explaining time-series classifiers using Shapley values. We intend for WindowSHAP to mitigate the computational complexity of calculating Shapley values for long time-series data as well as improve the quality of explanations. WindowSHAP is based on partitioning a sequence into time windows. Under this framework, we present three distinct algorithms of Stationary, Sliding and Dynamic WindowSHAP, each evaluated against baseline approaches, KernelSHAP and TimeSHAP, using perturbation and sequence analysis metrics. We applied our framework to clinical time-series data from both a specialized clinical domain (Traumatic Brain Injury - TBI) as well as a broad clinical domain (critical care medicine). The experimental results demonstrate that, based on the two quantitative metrics, our framework is superior at explaining clinical time-series classifiers, while also reducing the complexity of computations. We show that for time-series data with 120 time steps (hours), merging 10 adjacent time points can reduce the CPU time of WindowSHAP by 80% compared to KernelSHAP. We also show that our Dynamic WindowSHAP algorithm focuses more on the most important time steps and provides more understandable explanations. As a result, WindowSHAP not only accelerates the calculation of Shapley values for time-series data, but also delivers more understandable explanations with higher quality. Comment: Submitted to the Journal of Biomedical Informatics
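
    The core idea of treating each time window as a single "player" can be illustrated with a small sketch. It partitions a univariate sequence into fixed, non-overlapping windows and computes exact Shapley values by enumerating coalitions of windows. This is not the WindowSHAP implementation: the function names, the toy additive model, and the zero baseline are assumptions, and exact enumeration stands in for the sampling-based approximation that would be used on real data.

```python
from itertools import combinations
from math import factorial


def stationary_windows(length: int, window: int):
    """Split time indices 0..length-1 into consecutive fixed-size windows."""
    return [range(s, min(s + window, length)) for s in range(0, length, window)]


def window_shapley(model, x, baseline, window):
    """Exact Shapley value of each time window for one sequence `x`.

    Windows outside a coalition are replaced by the baseline values.
    Cost grows as 2**(number of windows), so this only suits small examples.
    """
    windows = stationary_windows(len(x), window)
    n = len(windows)

    def value(coalition):
        masked = list(baseline)
        for w in coalition:
            for t in windows[w]:
                masked[t] = x[t]
        return model(masked)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                phi[i] += weight * (value(subset + (i,)) - value(subset))
    return phi


if __name__ == "__main__":
    # Toy additive "model": the sum of the sequence.
    sequence = [1, 2, 3, 4, 5, 6]
    print(window_shapley(sum, sequence, baseline=[0] * 6, window=2))
```

    Grouping time points shrinks the number of players: a sequence of 120 time steps with windows of 10 has 12 players instead of 120, which is where the computational saving comes from.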