34 research outputs found

    Study of Distractors in Neural Models of Code

    Finding important features that contribute to the prediction of neural models is an active area of research in explainable AI. Neural models are opaque, and finding such features sheds light on their predictions. In contrast, in this work, we present an inverse perspective of distractor features: features that cast doubt on the prediction by affecting the model's confidence in its prediction. Understanding distractors provides a complementary view of the features' relevance in the predictions of neural models. In this paper, we apply a reduction-based technique to find distractors and provide our preliminary results on their impacts and types. Our experiments across various tasks, models, and datasets of code reveal that the removal of tokens can have a significant impact on the confidence of models in their predictions, and that the categories of tokens can also play a vital role in the model's confidence. Our study aims to enhance the transparency of models by emphasizing those tokens that significantly influence the confidence of the models. Comment: The 1st International Workshop on Interpretability and Robustness in Neural Software Engineering, co-located with ICSE (InteNSE'23)

    Memorization and Generalization in Neural Code Intelligence Models

    Deep Neural Networks (DNNs) are increasingly used in software engineering and code intelligence tasks. These are powerful tools that are capable of learning highly generalizable patterns from large datasets through millions of parameters. At the same time, training DNNs means walking a knife's edge, because their large capacity also renders them prone to memorizing data points. While traditionally thought of as an aspect of over-training, recent work suggests that the memorization risk manifests especially strongly when the training datasets are noisy and memorization is the only recourse. Unfortunately, most code intelligence tasks rely on rather noise-prone and repetitive data sources, such as GitHub, which, due to their sheer size, cannot be manually inspected and evaluated. We evaluate the memorization and generalization tendencies in neural code intelligence models through a case study across several benchmarks and model families by leveraging established approaches from other fields that use DNNs, such as introducing targeted noise into the training dataset. In addition to reinforcing prior general findings about the extent of memorization in DNNs, our results shed light on the impact of noisy datasets in training. Comment: manuscript in preparation

    A Study of Variable-Role-based Feature Enrichment in Neural Models of Code

    Although deep neural models substantially reduce the overhead of feature engineering, the features readily available in the inputs might significantly impact training cost and the performance of the models. In this paper, we explore the impact of an unsupervised feature enrichment approach based on variable roles on the performance of neural models of code. The notion of variable roles (as introduced in the works of Sajaniemi et al. [Refs. 1,2]) has been found to improve students' programming abilities. In this paper, we investigate whether this notion would improve the performance of neural models of code. To the best of our knowledge, this is the first work to investigate how Sajaniemi et al.'s concept of variable roles can affect neural models of code. In particular, we enrich a source code dataset by adding the role of individual variables in the dataset programs, and thereby conduct a study on the impact of variable role enrichment in training the Code2Seq model. In addition, we shed light on some challenges and opportunities in feature enrichment for neural code intelligence models. Comment: Accepted at the 1st International Workshop on Interpretability and Robustness in Neural Software Engineering (InteNSE'23), co-located with ICSE

    Assessment of eye health care services of Bangladesh using eye care service assessment tools

    Background: Bangladesh, as a signatory to Vision 2020, a global campaign for the elimination of avoidable blindness by 2020, formulated a national eye care plan. This report illustrates the present status of the Bangladesh eye health care service using the eye care service assessment tool (ECSAT), which assesses an eye health system across six ‘building blocks’ of a health system. Methods: The study followed a mixed method to collect data. The World Health Organization (WHO) standard ECSAT was used to gather information on eye care services. A purposive sampling method was used. Data from the assessment were extracted and all the information was cross-checked with leading stakeholders of the ministry of health. Results: Eye care planning is led by the national eye care. There is a national eye health action plan and a national eye health coordination office under the ministry of health. The health delivery system includes primarily government and non-profit facilities, with eight hospitals delivering specialist eye care services across the country. A significant proportion of eye care is provided through community outreach camps and a network of primary and community health workers. The national cataract surgical rate (CSR) is estimated at 2600 per million population per year. Conclusions: This assessment suggests that although Bangladesh has made some progress towards the elimination of avoidable blindness, it would be difficult to retain without further significant investment and a transparent accountability framework in eye health, considering all limitations and contemporary challenges

    CIAS detection of Fasciola hepatica/F. gigantica intermediate forms in bovines from Bangladesh

    Fascioliasis is an important food-borne parasitic zoonosis caused by two trematode species, Fasciola hepatica and Fasciola gigantica. The characterisation and differentiation of Fasciola populations is crucial to control the disease, given the different transmission, epidemiology and pathology characteristics of the two species. Lineal biometric features of adult liver flukes infecting livestock have been studied to characterise and discriminate fasciolids from Bangladesh. An accurate analysis was conducted to phenotypically discriminate between fasciolids from naturally infected bovines (cattle, buffaloes) throughout the country. Morphometric analyses were made with a computer image analysis system (CIAS) applied on the basis of standardised measurements and the logistic model of the body growth and development of fasciolids in the different host groups. Since it is the first ever comprehensive study of this kind undertaken in Bangladesh, the results are compared to pure fasciolid populations of F. hepatica from the European Mediterranean area and F. gigantica from Burkina Faso, geographical areas where both species do not co-exist. Principal component analysis showed that the biometric characteristics of fasciolids from Bangladesh are situated between F. hepatica and F. gigantica standard populations, indicating the presence of phenotypes of intermediate forms in Bangladesh. These results are analysed by considering the present emergence of animal fascioliasis, the local lymnaeid fauna, the impact of climate change, and the risk of human infection in the country

    Single-cell insights into immune dysregulation in rheumatoid arthritis flare versus drug-free remission

    Immune-mediated inflammatory diseases (IMIDs) are typically characterised by relapsing and remitting flares of inflammation. However, the unpredictability of disease flares impedes their study. Addressing this critical knowledge gap, we use the experimental medicine approach of immunomodulatory drug withdrawal in rheumatoid arthritis (RA) remission to synchronise flare processes allowing detailed characterisation. Exploratory mass cytometry analyses reveal three circulating cellular subsets heralding the onset of arthritis flare – CD45RO+PD1hi CD4+ and CD8+ T cells, and CD27+CD86+CD21- B cells – further characterised by single-cell sequencing. Distinct lymphocyte subsets including cytotoxic and exhausted CD4+ memory T cells, memory CD8+CXCR5+ T cells, and IGHA1+ plasma cells are primed for activation in flare patients. Regulatory memory CD4+ T cells (Treg cells) increase at flare onset, but with dysfunctional regulatory marker expression compared to drug-free remission. Significant clonal expansion is observed in T cells, but not B cells, after drug cessation; this is widespread throughout memory CD8+ T cell subsets but limited to the granzyme-expressing cytotoxic subset within CD4+ memory T cells. Based on our observations, we suggest a model of immune dysregulation for understanding RA flare, with potential for further translational research towards novel avenues for its treatment and prevention

    Sustainable chemical processing and energy-carbon dioxide management: Review of challenges and opportunities


    Syntax-Guided Program Reduction for Understanding Neural Code Intelligence Models

    Neural code intelligence (CI) models are opaque black boxes and offer little insight into the features they use in making predictions. This opacity may lead to distrust in their predictions and hamper their wider adoption in safety-critical applications. Recently, input program reduction techniques have been proposed to identify key features in the input programs to improve the transparency of CI models. However, this approach is syntax-unaware and does not consider the grammar of the programming language. In this paper, we apply a syntax-guided program reduction technique that considers the grammar of the input programs during reduction. Our experiments on multiple models across different types of input programs show that the syntax-guided program reduction technique is faster and provides smaller sets of key tokens in reduced programs. We also show that the key tokens could be used in generating adversarial examples for up to 65% of the input programs. Comment: The 6th ACM SIGPLAN International Symposium on Machine Programming (MAPS'22); Related to arXiv:2202.0647