21 research outputs found

    MEMO: Coverage-guided Model Generation For Deep Learning Library Testing

    Recent deep learning (DL) applications are mostly built on top of DL libraries. The quality assurance of these libraries is critical to the dependable deployment of DL applications. A few techniques have thereby been proposed to test DL libraries by generating DL models as test inputs. Then these techniques feed those DL models to DL libraries for making inferences, in order to exercise the DL library modules related to a DL model's execution. However, the test effectiveness of these techniques is constrained by the diversity of generated DL models. Our investigation finds that these techniques can cover at most 11.7% of layer pairs (i.e., call sequence between two layer APIs) and 55.8% of layer parameters (e.g., "padding" in Conv2D). As a result, we find that many bugs arising from specific layer pairs and parameters can be missed by existing techniques. In view of the limitations of existing DL library testing techniques, we propose MEMO to efficiently generate diverse DL models by exploring layer types, layer pairs, and layer parameters. MEMO: (1) designs an initial model reduction technique to boost test efficiency without compromising model diversity; and (2) designs a set of mutation operators for a customized Markov Chain Monte Carlo (MCMC) algorithm to explore new layer types, layer pairs, and layer parameters. We evaluate MEMO on seven popular DL libraries, including four for model execution (TensorFlow, PyTorch, MXNet, and ONNX) and three for model conversions (Keras-MXNet, TF2ONNX, ONNX2PyTorch). The evaluation result shows that MEMO outperforms recent works by covering 10.3% more layer pairs, 15.3% more layer parameters, and 2.3% more library branches. Moreover, MEMO detects 29 new bugs in the latest versions of DL libraries, with 17 of them confirmed by DL library developers, and 5 of those confirmed bugs have been fixed. Comment: 11 pages, 8 figures
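    The layer-pair coverage metric mentioned in this abstract can be illustrated with a minimal sketch. It is not MEMO's implementation: it assumes a purely sequential model represented as a list of layer-type names, and it assumes the coverage denominator is every ordered pair of the layer types under test. The function names `layer_pairs` and `pair_coverage` are hypothetical.

```python
from itertools import product


def layer_pairs(model):
    """Return the set of adjacent (previous, next) layer-type pairs in a sequential model."""
    return set(zip(model, model[1:]))


def pair_coverage(models, layer_types):
    """Fraction of all ordered layer-type pairs exercised by at least one model.

    Assumption: the universe of pairs is every ordered pair of `layer_types`.
    """
    universe = set(product(layer_types, repeat=2))
    covered = set()
    for model in models:
        covered |= layer_pairs(model) & universe
    return len(covered) / len(universe)


# Illustrative inputs only; these are not models from the paper.
layer_types = ["Conv2D", "ReLU", "Dense"]
models = [
    ["Conv2D", "ReLU", "Dense"],
    ["Dense", "ReLU", "Dense"],
]
coverage = pair_coverage(models, layer_types)
print(f"layer-pair coverage: {coverage:.1%}")
```

    Under these assumptions, a generator that only ever emits a few fixed layer orderings leaves most of the pair universe untouched, which is the diversity gap the abstract describes.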

    Nuances are the Key: Unlocking ChatGPT to Find Failure-Inducing Tests with Differential Prompting

    Automatically detecting software failures is an important task and a longstanding challenge. It requires finding failure-inducing test cases whose test input can trigger the software's fault, and constructing an automated oracle to detect the software's incorrect behaviors. Recent advances in large language models (LLMs) motivate us to study how far this challenge can be addressed by ChatGPT, a state-of-the-art LLM. Unfortunately, our study shows that ChatGPT has a low probability (28.8%) of finding correct failure-inducing test cases for buggy programs. A possible reason is that finding failure-inducing test cases requires analyzing the subtle code differences between a buggy program and its correct version. When these two versions have similar syntax, ChatGPT is weak at recognizing subtle code differences. Our insight is that ChatGPT's performance can be substantially enhanced when ChatGPT is guided to focus on the subtle code difference. We have an interesting observation that ChatGPT is effective in inferring the intended behaviors of a buggy program. The intended behavior can be leveraged to synthesize programs, in order to make the subtle code difference between a buggy program and its correct version (i.e., the synthesized program) explicit. Driven by this observation, we propose a novel approach that synergistically combines ChatGPT and differential testing to find failure-inducing test cases. We evaluate our approach on QuixBugs (a benchmark of buggy programs), and compare it with state-of-the-art baselines, including direct use of ChatGPT and Pynguin. The experimental result shows that our approach has a much higher probability (77.8%) of finding correct failure-inducing test cases, 2.7 times that of the best baseline. Comment: Accepted to the 38th IEEE/ACM International Conference on Automated Software Engineering (ASE 2023)
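    The differential-testing half of this approach can be sketched as follows. This is an illustration under stated assumptions, not the authors' tool: the step in which ChatGPT infers the intended behavior and synthesizes reference programs is not shown, and `buggy_abs`, `reference_abs`, `run`, and `failure_inducing_inputs` are hypothetical names invented for the example.

```python
def run(program, value):
    """Run a program and capture either its result or the type of exception it raises."""
    try:
        return ("ok", program(value))
    except Exception as exc:
        return ("error", type(exc).__name__)


def failure_inducing_inputs(program_under_test, reference, candidates):
    """Return the candidate inputs on which the two programs behave differently."""
    return [x for x in candidates if run(program_under_test, x) != run(reference, x)]


def buggy_abs(n):
    # Hypothetical buggy program: the comparison is off by one, so -1 is mishandled.
    return n if n >= -1 else -n


def reference_abs(n):
    # Hypothetical stand-in for a program synthesized from the inferred intent.
    return abs(n)


found = failure_inducing_inputs(buggy_abs, reference_abs, [-3, -1, 0, 2])
print(found)
```

    The reference program acts as the test oracle here: any input on which the two versions disagree is a candidate failure-inducing test case, which is why the quality of the synthesized reference matters.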

    Risk of thyroid dysfunction associated with mRNA and inactivated COVID-19 vaccines: a population-based study of 2.3 million vaccine recipients

    Background: In view of accumulating case reports of thyroid dysfunction following COVID-19 vaccination, we evaluated the risks of incident thyroid dysfunction following inactivated (CoronaVac) and mRNA (BNT162b2) COVID-19 vaccines using a population-based dataset. / Methods: We identified people who received COVID-19 vaccination between 23 February and 30 September 2021 from a population-based electronic health database in Hong Kong, linked to vaccination records. Thyroid dysfunction encompassed anti-thyroid drug (ATD)/levothyroxine (LT4) initiation, biochemical picture of hyperthyroidism/hypothyroidism, incident Graves’ disease (GD), and thyroiditis. A self-controlled case series design was used to estimate the incidence rate ratio (IRR) of thyroid dysfunction in a 56-day post-vaccination period compared to the baseline period (non-exposure period) using conditional Poisson regression. / Results: A total of 2,288,239 people received at least one dose of COVID-19 vaccination (57.8% BNT162b2 recipients and 42.2% CoronaVac recipients). 94.3% of BNT162b2 recipients and 92.2% of CoronaVac recipients received the second dose. Following the first dose of COVID-19 vaccination, there was no increase in the risks of ATD initiation (BNT162b2: IRR 0.864, 95% CI 0.670–1.114; CoronaVac: IRR 0.707, 95% CI 0.549–0.912), LT4 initiation (BNT162b2: IRR 0.911, 95% CI 0.716–1.159; CoronaVac: IRR 0.778, 95% CI 0.618–0.981), biochemical picture of hyperthyroidism (BNT162b2: IRR 0.872, 95% CI 0.744–1.023; CoronaVac: IRR 0.830, 95% CI 0.713–0.967) or hypothyroidism (BNT162b2: IRR 1.002, 95% CI 0.838–1.199; CoronaVac: IRR 0.963, 95% CI 0.807–1.149), GD, and thyroiditis. 
    Similarly, following the second dose of COVID-19 vaccination, there was no increase in the risks of ATD initiation (BNT162b2: IRR 0.972, 95% CI 0.770–1.227; CoronaVac: IRR 0.879, 95% CI 0.693–1.116), LT4 initiation (BNT162b2: IRR 1.019, 95% CI 0.833–1.246; CoronaVac: IRR 0.768, 95% CI 0.613–0.962), hyperthyroidism (BNT162b2: IRR 1.039, 95% CI 0.899–1.201; CoronaVac: IRR 0.911, 95% CI 0.786–1.055), hypothyroidism (BNT162b2: IRR 0.935, 95% CI 0.794–1.102; CoronaVac: IRR 0.945, 95% CI 0.799–1.119), GD, and thyroiditis. Age- and sex-specific subgroup and sensitivity analyses showed consistent neutral associations between thyroid dysfunction and both types of COVID-19 vaccines. / Conclusions: Our population-based study showed no evidence of vaccine-related increase in incident hyperthyroidism or hypothyroidism with both BNT162b2 and CoronaVac.

    Effects of SARS-CoV-2 infection on incidence and treatment strategies of hepatocellular carcinoma in people with chronic liver disease

    BACKGROUND: Chronic liver disease (CLD) was associated with adverse clinical outcomes among people with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection. AIM: To determine the effects of SARS-CoV-2 infection on the incidence and treatment strategy of hepatocellular carcinoma (HCC) among patients with CLD. METHODS: A retrospective, territory-wide cohort of CLD patients was identified from an electronic health database in Hong Kong. Patients with confirmed SARS-CoV-2 infection [coronavirus disease 2019 (COVID-19)+CLD] between January 1, 2020 and October 25, 2022 were identified and matched 1:1 by propensity-score with those without (COVID-19-CLD). Each patient was followed up until death, outcome event, or November 15, 2022. Primary outcome was incidence of HCC. Secondary outcomes included all-cause mortality, adverse hepatic outcomes, and different treatment strategies for HCC (curative, non-curative treatment, and palliative care). Analyses were further stratified by acute (within 20 d) and post-acute (21 d or beyond) phases of SARS-CoV-2 infection. Incidence rate ratios (IRRs) were estimated by Poisson regression models. RESULTS: Of 193589 CLD patients (> 95% non-cirrhotic) in the cohort, 55163 patients with COVID-19+CLD and 55163 patients with COVID-19-CLD were included after 1:1 propensity-score matching. Upon 249-d median follow-up, COVID-19+CLD was not associated with increased risk of incident HCC (IRR: 1.19, 95%CI: 0.99-1.42, P = 0.06), but was associated with higher risks of receiving palliative care for HCC (IRR: 1.60, 95%CI: 1.46-1.75, P < 0.001), compared to COVID-19-CLD. In both acute and post-acute phases of infection, COVID-19+CLD was associated with increased risks of all-cause mortality (acute: IRR: 7.06, 95%CI: 5.78-8.63, P < 0.001; post-acute: IRR: 1.24, 95%CI: 1.14-1.36, P < 0.001) and adverse hepatic outcomes (acute: IRR: 1.98, 95%CI: 1.79-2.18, P < 0.001; post-acute: IRR: 1.24, 95%CI: 1.13-1.35, P < 0.001), compared to COVID-19-CLD.
    CONCLUSION: Although SARS-CoV-2 infection in CLD patients was not associated with increased risk of HCC, infected patients were more likely to receive palliative treatment than those without infection. The detrimental effects of SARS-CoV-2 infection persisted in the post-acute phase.

    Lupus nephritis in Chinese children--a territory-wide cohort study in Hong Kong

    We report a multicenter study of Chinese children in Hong Kong with systemic lupus erythematosus (SLE) nephritis. Children were included if they fulfilled the ACR criteria, had significant proteinuria or casturia, were Chinese and younger than 19 years, and had been diagnosed with SLE between January 1990 and December 2003. Investigators in each center retrieved data on clinical features, biopsy reports, treatment and outcome of these patients. There were 128 patients (eight boys, 120 girls; mean age: 11.9 ± 2.8 years). About 50% presented with multisystem illness and 40% with nephritic/nephrotic symptoms. Negative anti-dsDNA antibodies were found in 6% of the patients. Renal biopsy revealed WHO Class II, III, IV and V nephritis in 13 (10%), 22 (17%), 69 (54%) and 13 (10%) patients, respectively. The clinical severity of the nephritis did not accurately predict renal biopsy findings. The follow-up period ranged from 1 to 16.5 years (mean ± SD: 5.76 ± 3.61 years). During the study five patients died (two from lupus flare, one from cardiomyopathy, two from infections). Four patients had end-stage renal failure (ESRF) (one died during a lupus flare). All deaths and end-stage renal failure occurred in the Class IV nephritis group. Chronic organ damage was infrequent in the survivors. The actuarial patient survival rates at 5, 10 and 15 years of age were 95.3, 91.8, and 91.8%, respectively. For Class IV nephritis patients, the survival rates without ESRF at 5, 10, and 15 years were 91.5, 82.3 and 76%, respectively. The survival and chronic morbidity rates of the Chinese SLE children in the present study are comparable to those of other published studies.

    Robust estimation of bacterial cell count from optical density

    Optical density (OD) is widely used to estimate the density of cells in liquid culture, but cannot be compared between instruments without a standardized calibration protocol and is challenging to relate to actual cell count. We address this with an interlaboratory study comparing three simple, low-cost, and highly accessible OD calibration protocols across 244 laboratories, applied to eight strains of constitutive GFP-expressing E. coli. Based on our results, we recommend calibrating OD to estimated cell count using serial dilution of silica microspheres, which produces highly precise calibration (95.5% of residuals <1.2-fold), is easily assessed for quality control, also assesses instrument effective linear range, and can be combined with fluorescence calibration to obtain units of Molecules of Equivalent Fluorescein (MEFL) per cell, allowing direct comparison and data fusion with flow cytometry measurements: in our study, fluorescence per cell measurements showed only a 1.07-fold mean difference between plate reader and flow cytometry data.
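    The idea of calibrating OD against a serial dilution of particles with known counts can be sketched roughly as below. The abstract does not give the study's fitting procedure, so this sketch assumes blank-subtracted readings within the instrument's linear range and uses a simple mean of per-well ratios. The function names and all numeric values are made up for illustration and are not data from the study.

```python
def particles_per_od(particle_counts, od_readings, blank_od):
    """Mean particles-per-OD-unit ratio across a dilution series, after blank subtraction.

    Assumption: every reading lies in the instrument's linear range.
    """
    ratios = []
    for count, od in zip(particle_counts, od_readings):
        net_od = od - blank_od
        if net_od > 0:
            ratios.append(count / net_od)
    if not ratios:
        raise ValueError("no readings above the blank")
    return sum(ratios) / len(ratios)


def estimate_cell_count(sample_od, blank_od, scale):
    """Convert a sample's blank-subtracted OD into an estimated cell count."""
    return (sample_od - blank_od) * scale


# Hypothetical two-fold dilution series: known particles per well and measured OD.
counts = [8e8, 4e8, 2e8]
readings = [0.84, 0.44, 0.24]
blank = 0.04

scale = particles_per_od(counts, readings, blank)
estimate = estimate_cell_count(0.54, blank, scale)
print(f"{scale:.3g} particles per OD unit, sample estimate {estimate:.3g} cells")
```

    A real calibration would also check where the dilution series departs from linearity, since the abstract notes that the microsphere protocol is used to assess the instrument's effective linear range.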

    Why We Should Care about Ebola in West Africa and Middle East Respiratory Syndrome in South Korea: Global Health Ethics and the Moral Insignificance of Proximity

    In the era of globalization, no society exists in isolation. Global transportation networks facilitate the international spread of emerging infectious diseases, such as Ebola and Middle East Respiratory Syndrome (MERS). From restrictions of travel with regard to Ebola-stricken countries to international aid delivered to West Africa, from advice against travelling to South Korea (The Government of Hong Kong Special Administrative Region 2015) to experts from the World Health Organization visiting Seoul, decisions made by any country often have global health ramifications. Global health advocates affirm the importance of moral responsibilities for global public health. However, does everyone have moral responsibilities to help stop the Ebola outbreak in West Africa or MERS in South Korea and the Middle East? Should we consider global health issues to be as important as domestic ones?

    The Importance of Solid-state Molecular Motion to Room Temperature Phosphorescence

    Molecular motion is often considered detrimental to luminescence because it favors nonradiative decay. However, nothing is absolute, and molecular motion can also do useful work if utilized properly. For example, photothermal therapy makes use of the heat generated in light irradiation for cancer treatment. To further explore the merits of molecular motion, ortho-substituted benzoic acids were used as model compounds to evaluate the importance of molecular motion to luminescence in the solid state. It is verified that the twisting of the carboxylic acid group can activate spin-vibronic coupling to facilitate intersystem crossing (ISC), resulting in more efficient room temperature phosphorescence (RTP). A five-state model is established to understand the ISC process and an effective pre-twisted molecular design strategy is put forward for the development of efficient RTP materials.