556 research outputs found

    Automated Refactoring of Nested-IF Formulae in Spreadsheets

    Spreadsheets are the most popular end-user programming software, where formulae act like programs and can also exhibit smells. One widely recognized smell of spreadsheet formulae is the nested-IF expression, which has low readability, imposes a high cognitive cost on users, and is error-prone during reuse and maintenance. However, end users usually lack the programming knowledge and skills to tackle, or even recognize, the problem. Previous research has made only initial attempts in this direction, and no effective, automated approach is currently available. This paper proposes the first AST-based automated approach to systematically refactoring nested-IF formulae. The idea is two-fold. First, we detect and remove logic redundancy on the AST. Second, we identify higher-level semantics that have been fragmented and scattered, and reassemble the syntax using concise built-in functions. A comprehensive evaluation was conducted on a real-world spreadsheet corpus collected at a leading IT company for research purposes. The results, covering over 68,000 spreadsheets with 27 million nested-IF formulae, show that our approach relieves the smell of over 99% of nested-IF formulae, and over 50% of the refactorings reduce the nesting level by more than half. In addition, a survey of 49 participants indicates that in most cases the participants prefer the refactored formulae and agree that such an automated refactoring approach is necessary and helpful.
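The "reassemble fragmented semantics" step can be illustrated with a toy sketch: a chain of IFs nested in the else-branch flattens into a single IFS call. The paper's actual AST transformation rules are not given in the abstract; the tuple-based node representation here is purely hypothetical.

```python
# Toy AST node: ("IF", condition, then_branch, else_branch).
def flatten_nested_if(node):
    """Rewrite IF(c1, v1, IF(c2, v2, ...)) as IFS(c1, v1, c2, v2, ..., TRUE, default)."""
    if not (isinstance(node, tuple) and node[0] == "IF"):
        return node
    pairs = []
    while isinstance(node, tuple) and node[0] == "IF" and len(node) == 4:
        _, cond, then, else_ = node
        pairs += [cond, flatten_nested_if(then)]
        node = else_  # descend into the else-branch chain
    pairs += ["TRUE", flatten_nested_if(node)]  # default branch
    return ("IFS", *pairs)

formula = ("IF", "A1<10", "low", ("IF", "A1<20", "mid", "high"))
print(flatten_nested_if(formula))
# ('IFS', 'A1<10', 'low', 'A1<20', 'mid', 'TRUE', 'high')
```

The nesting depth drops from 2 to 1 while preserving branch order, which is the readability gain the paper measures at scale.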

    Distinct host immune responses in recurrent vulvovaginal candidiasis and vulvovaginal candidiasis

    Recurrent vulvovaginal candidiasis (RVVC) and vulvovaginal candidiasis (VVC) are among the most common gynecological infections, primarily caused by Candida species. Although risk factors for RVVC and VVC have been identified in many studies, the antifungal immunological mechanisms are still not fully understood. We performed a 1-year prospective study in a local hospital, monitoring 98 patients clinically diagnosed with gynecological Candida infection. The results showed that 20.41% (20/98) of the patients had RVVC and 79.59% (78/98) had VVC. C. albicans accounted for 90% and 96.1% of all strains isolated from RVVC and VVC patients, respectively. Antifungal susceptibility testing showed no significant difference in Candida species between RVVC and VVC patients. However, the serum levels of IFN-γ, TNF-α, and IL-17F in the RVVC group were significantly lower than those of the VVC group, while IL-4, IL-6, and IL-10 were higher in RVVC patients than in VVC patients. IL-17A and IL-2 levels were comparable between the two groups. Taken together, our results suggest that host immune responses, especially Th1/Th2 immunity, may play important roles in the prognosis of RVVC and VVC.

    Identification of Free and Bound Exciton States and Their Phase-Dependent Trapping Behavior in Lead Halide Perovskites

    In this work we probe the sub-gap energy states within polycrystalline and single-crystal lead halide perovskites to better understand their intrinsic photophysical behavior. Through combined temperature- and intensity-dependent optical measurements, we reveal the existence of both free and bound exciton contributions within the sub-gap energy state manifold. The trapping and recombination dynamics of these excitons are shown to be strongly dependent on the structural phase of the perovskite. The orthorhombic phase exhibits ultrafast exciton trapping and distinct trap emission, while the tetragonal phase shows a low monomolecular recombination velocity and small capture cross-sections (~10⁻¹⁸ cm²). Within the multiphonon transition scenario, this suppression of charge trapping is caused by an increase in the charge-capture activation energy due to reduced electron-lattice interactions, which may be the origin of the unexpectedly long carrier lifetimes in these material systems.
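The abstract names the multiphonon scenario but not its formula; in the common textbook form of this picture (a sketch, not taken from the paper), the capture cross-section is thermally activated:

```latex
\sigma(T) \simeq \sigma_{\infty}\, \exp\!\left(-\frac{E_a}{k_B T}\right)
```

so a larger capture activation energy \(E_a\), here attributed to weaker electron-lattice coupling, exponentially suppresses trapping, consistent with the long carrier lifetimes the authors report.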

    Towards Efficient Fine-tuning of Pre-trained Code Models: An Experimental Study and Beyond

    Recently, fine-tuning pre-trained code models such as CodeBERT on downstream tasks has achieved great success in many software testing and analysis tasks. While effective and prevalent, fine-tuning all of the pre-trained parameters incurs a large computational cost. In this paper, we conduct an extensive experimental study to explore what happens to layer-wise pre-trained representations and their encoded code knowledge during fine-tuning, and we then propose efficient alternatives for fine-tuning large pre-trained code models based on the findings. Our experimental study shows that (1) the lexical, syntactic, and structural properties of source code are encoded in the lower, intermediate, and higher layers, respectively, while the semantic property spans the entire model; (2) fine-tuning preserves most code properties: the basic properties captured by the lower and intermediate layers are retained, and only the representations of the top two layers change substantially across various downstream tasks; (3) based on these findings, we propose Telly, which efficiently fine-tunes pre-trained code models via layer freezing. Extensive experimental results on five diverse downstream tasks demonstrate that the number of trainable parameters and the corresponding time cost are greatly reduced, while performance remains similar or better. A replication package including source code, datasets, and an online appendix is available at: https://github.com/DeepSoftwareAnalytics/Telly. Comment: Accepted by ISSTA 2023 (The 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis).
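Layer freezing itself is a small amount of code. The sketch below uses a toy 12-layer stack standing in for a CodeBERT-sized encoder (the paper's exact layer choices and training setup are not given in the abstract): all but the top two layers have gradients disabled, so only their parameters are updated during fine-tuning.

```python
import torch.nn as nn

# Toy 12-layer "encoder" standing in for a pre-trained code model;
# each nn.Linear(16, 16) is a placeholder for a real transformer layer.
encoder = nn.ModuleList([nn.Linear(16, 16) for _ in range(12)])

FREEZE_UP_TO = 10  # freeze layers 0..9, keep only the top two trainable
for layer in encoder[:FREEZE_UP_TO]:
    for p in layer.parameters():
        p.requires_grad = False  # excluded from gradient updates

trainable = sum(p.numel() for p in encoder.parameters() if p.requires_grad)
total = sum(p.numel() for p in encoder.parameters())
print(trainable, total)  # 544 3264
```

Since only trainable parameters accumulate gradients and optimizer state, both backward-pass time and memory shrink roughly in proportion to the frozen fraction.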

    GPT4Table: Can Large Language Models Understand Structured Table Data? A Benchmark and Empirical Study

    Large language models (LLMs) are becoming attractive as few-shot reasoners for natural language (NL) tasks. However, there is still much to learn about how well LLMs understand structured data, such as tables. While tables can be fed to LLMs through serialization, comprehensive studies examining whether LLMs can truly comprehend such data are lacking. In this paper, we investigate this question by designing a benchmark to evaluate the structural understanding capabilities (SUC) of LLMs. The benchmark includes seven tasks, each with its own unique challenges, e.g., cell lookup, row retrieval, and size detection. We conduct a series of evaluations on GPT-3.5 and GPT-4 and find that performance varies depending on several input choices, including table input format, content order, role prompting, and partition marks. Drawing on the insights gained from the benchmark evaluations, we propose self-augmentation for effective structural prompting, such as critical value / range identification using the LLMs' internal knowledge. When combined with carefully chosen input choices, these structural prompting methods lead to promising improvements in LLM performance on a variety of tabular tasks, e.g., TabFact (↑2.31%), HybridQA (↑2.13%), SQA (↑2.72%), Feverous (↑0.84%), and ToTTo (↑5.68%). We believe that our benchmark and proposed prompting methods can serve as a simple yet generic baseline for future research. Comment: This paper has been accepted as a full paper at WSDM 202
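To make "serialization with partition marks" concrete, here is one hypothetical serializer: header and rows are wrapped in explicit marks so the model can locate table boundaries. The exact formats and mark tokens the benchmark compares are not specified in the abstract; this is an illustrative variant, not the paper's.

```python
def serialize_table(header, rows):
    """Serialize a table into prompt text with explicit partition marks."""
    lines = ["<table>"]
    lines.append("<header> " + " | ".join(header) + " </header>")
    for row in rows:
        lines.append("<row> " + " | ".join(map(str, row)) + " </row>")
    lines.append("</table>")
    return "\n".join(lines)

prompt = serialize_table(["city", "population"],
                         [["Tokyo", 37_400_000], ["Delhi", 31_200_000]])
print(prompt)
```

Tasks like cell lookup or size detection then amount to asking the model questions against this serialized text, which is why the choice of format and marks measurably shifts accuracy.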

    XInsight: eXplainable Data Analysis Through The Lens of Causality

    In light of the growing popularity of Exploratory Data Analysis (EDA), understanding the underlying causes of the knowledge acquired through EDA is crucial, yet it remains under-researched. This study promotes a transparent and explicable perspective on data analysis, called eXplainable Data Analysis (XDA). To this end, we present XInsight, a general framework for XDA. XInsight provides data analysis with qualitative and quantitative explanations of causal and non-causal semantics, significantly improving human understanding of and confidence in the outcomes of data analysis, and facilitating accurate data interpretation and decision making in the real world. XInsight is a three-module, end-to-end pipeline that extracts causal graphs, translates causal primitives into XDA semantics, and quantifies the contribution of each explanation to a given data fact. XInsight uses a set of design concepts and optimizations to address the inherent difficulties of integrating causality into XDA. Experiments on synthetic and real-world datasets, as well as a user study, demonstrate the highly promising capabilities of XInsight.

    Detecting local processing unit in Drosophila brain by using network theory

    A community detection method from network theory was applied to the neuron network, constructed from the image overlap between neuron pairs, to automatically detect Local Processing Units (LPUs) in the Drosophila brain. Twenty-six communities consistent with known LPUs, along with 13 subdivisions, were found. In addition, 45 tracts were detected and could be discriminated from the LPUs by analyzing the distribution of the participation coefficient P. Furthermore, layer structures in the fan-shaped body (FB) were observed that coincided with images obtained by optical devices, and a total of 13 communities were shown to be closely related to the FB. The method proposed in this work proved effective at identifying LPU structure in the Drosophila brain independently of subjective judgment, and could be applied extensively to related areas.
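The participation coefficient used to separate tract-like nodes from LPU communities is the standard Guimerà-Amaral quantity, P_i = 1 - Σ_s (k_is / k_i)², where k_is counts node i's links into module s. The toy graph and community assignment below are illustrative only, not data from the paper.

```python
from collections import defaultdict

def participation(adj, community):
    """Participation coefficient P_i = 1 - sum_s (k_is / k_i)^2 per node."""
    P = {}
    for node, nbrs in adj.items():
        k = len(nbrs)                      # total degree k_i
        per_mod = defaultdict(int)         # k_is: links into each module s
        for n in nbrs:
            per_mod[community[n]] += 1
        P[node] = 1.0 - sum((ks / k) ** 2 for ks in per_mod.values())
    return P

adj = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
community = {"a": 0, "b": 0, "c": 0, "d": 1}
P = participation(adj, community)
print(round(P["c"], 3))  # 0.444 — c bridges two modules, so P > 0
```

Nodes whose edges stay within one community get P = 0, while tract-like nodes spreading edges across communities get high P, which is how the two distributions can be told apart.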

    Demonstration of InsightPilot: An LLM-Empowered Automated Data Exploration System

    Exploring data is crucial in data analysis, as it helps users understand and interpret the data more effectively. However, performing effective data exploration requires in-depth knowledge of the dataset and expertise in data analysis techniques. Not being familiar with either can create obstacles that make the process time-consuming and overwhelming for data analysts. To address this issue, we introduce InsightPilot, an LLM (Large Language Model)-based, automated data exploration system designed to simplify the data exploration process. InsightPilot automatically selects appropriate analysis intents, such as understanding, summarizing, and explaining. Then, these analysis intents are concretized by issuing corresponding intentional queries (IQueries) to create a meaningful and coherent exploration sequence. In brief, an IQuery is an abstraction and automation of data analysis operations, which mimics the approach of data analysts and simplifies the exploration process for users. By employing an LLM to iteratively collaborate with a state-of-the-art insight engine via IQueries, InsightPilot is effective in analyzing real-world datasets, enabling users to gain valuable insights through natural language inquiries. We demonstrate the effectiveness of InsightPilot in a case study, showing how it can help users gain valuable insights from their datasets
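The iterate-via-IQueries control flow can be sketched as a small loop: the LLM picks an intent, concretizes it into an IQuery, the insight engine answers it, and the result feeds the next round. All class and method names here are illustrative stubs, not the demo's real API.

```python
class StubLLM:
    """Stand-in for the LLM that chooses and concretizes analysis intents."""
    INTENTS = ["understand", "summarize", "explain"]

    def choose_intent(self, history):
        return self.INTENTS[len(history) % len(self.INTENTS)]

    def concretize(self, intent, dataset):
        return f"{intent}({dataset})"  # an IQuery as a plain string

class StubEngine:
    """Stand-in for the insight engine that answers IQueries."""
    def run(self, iquery):
        return f"insight for {iquery}"

def explore(llm, engine, dataset, steps=3):
    """One exploration sequence: intent -> IQuery -> insight, iterated."""
    history = []
    for _ in range(steps):
        intent = llm.choose_intent(history)
        iquery = llm.concretize(intent, dataset)
        history.append((iquery, engine.run(iquery)))
    return history

print(explore(StubLLM(), StubEngine(), "sales.csv", steps=2))
```

Each iteration conditions on the accumulated history, which is what lets the sequence stay coherent rather than being a bag of independent queries.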