39 research outputs found
On Using Machine Learning to Identify Knowledge in API Reference Documentation
Using API reference documentation like JavaDoc is an integral part of
software development. Previous research introduced a grounded taxonomy that
organizes API documentation knowledge in 12 types, including knowledge about
the Functionality, Structure, and Quality of an API. We study how well modern
text classification approaches can automatically identify documentation
containing specific knowledge types. We compared conventional machine learning
(k-NN and SVM) and deep learning approaches trained on manually annotated Java
and .NET API documentation (n = 5,574). When classifying the knowledge types
individually (i.e., multiple binary classifiers) the best AUPRC was up to 87%.
The deep learning and SVM classifiers seem complementary. For four knowledge
types (Concept, Control, Pattern, and Non-Information), SVM clearly outperforms
deep learning which, on the other hand, is more accurate for identifying the
remaining types. When considering multiple knowledge types at once (i.e.,
multi-label classification) deep learning outperforms naive baselines and
traditional machine learning achieving a MacroAUC up to 79%. We also compared
classifiers using embeddings pre-trained on generic text corpora and
StackOverflow but did not observe significant improvements. Finally, to assess
the generalizability of the classifiers, we re-tested them on a different,
unseen Python documentation dataset. Classifiers for Functionality, Concept,
Purpose, Pattern, and Directive seem to generalize from Java and .NET to Python
documentation. The accuracy related to the remaining types seems API-specific.
We discuss our results and how they inform the development of tools for
supporting developers sharing and accessing API knowledge. Published article:
https://doi.org/10.1145/3338906.333894
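As an illustration of the per-type evaluation described in this abstract, the sketch below scores one binary classifier per knowledge type and reports AUPRC for each. It is a minimal sketch, not the authors' pipeline: AUPRC is estimated here as average precision (one common estimator), the helper names are hypothetical, and the labels and scores are made-up toy data. The macro average shown is over AUPRC and is not the MacroAUC figure quoted above.

```python
from typing import Dict, List, Sequence


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Estimate the area under the precision-recall curve as average precision.

    Items are ranked by descending score; precision is averaged over the
    ranks at which a positive item appears.
    """
    total_positives = sum(labels)
    if total_positives == 0:
        raise ValueError("at least one positive label is required")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / total_positives


def per_type_auprc(
    y_true: Dict[str, List[int]], y_score: Dict[str, List[float]]
) -> Dict[str, float]:
    """Evaluate each knowledge type independently (multiple binary classifiers)."""
    return {name: average_precision(y_true[name], y_score[name]) for name in y_true}


# Hypothetical toy data: 1 = the documentation unit contains the knowledge type.
y_true = {
    "Functionality": [1, 0, 1, 1, 0],
    "Directive": [0, 0, 1, 0, 1],
}
# Hypothetical classifier confidences for the same five units.
y_score = {
    "Functionality": [0.9, 0.2, 0.8, 0.4, 0.3],
    "Directive": [0.1, 0.7, 0.6, 0.2, 0.9],
}

auprc = per_type_auprc(y_true, y_score)
macro_auprc = sum(auprc.values()) / len(auprc)
print(auprc, macro_auprc)
```

Evaluating each type separately keeps a rare type from being masked by frequent ones, which is one reason a precision-recall measure is often preferred for imbalanced labels.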
Estimating the changing burden of disease attributable to alcohol use in South Africa for 2000, 2006 and 2012
Background. Alcohol use was one of the leading contributors to South Africa (SA)’s disease burden in 2000, accounting for 7% of deaths and disability-adjusted life years (DALYs) in the first South African Comparative Risk Assessment Study (SACRA1). Since then, patterns of alcohol use have changed, as has the epidemiological evidence pertaining to the role of alcohol as a risk factor for infectious diseases, most notably HIV/AIDS and tuberculosis (TB).
Objectives. To estimate the burden of disease attributable to alcohol use by sex and age group in SA in 2000, 2006 and 2012.
Methods. The analysis follows the World Health Organization (WHO)’s comparative risk assessment methodology. Population attributable fractions (PAFs) were calculated from modelled exposure estimated from a systematic assessment and synthesis of 17 nationally representative surveys and relative risks based on the global review by the International Model of Alcohol Harms and Policies. PAFs were applied to the burden of disease estimates from the revised second South African National Burden of Disease Study (SANBD2) to calculate the alcohol-attributable burden for deaths and DALYs for 2000, 2006 and 2012. We quantified the uncertainty by observing the posterior distribution of the estimated prevalence of drinkers and mean use among adult drinkers (≥15 years old) in a Bayesian model. We assumed no uncertainty in the outcome measures.
Results. The alcohol-attributable disease burden decreased from 2000 to 2012 after peaking in 2006, owing to shifts in the disease burden, particularly infectious disease and injuries, and changes in drinking patterns. In 2012, alcohol-attributable harm accounted for an estimated 7.1% (95% uncertainty interval (UI) 6.6 - 7.6) of all deaths and 5.6% (95% UI 5.3 - 6.0) of all DALYs. Attributable deaths were split three ways fairly evenly across major disease categories: infectious diseases (36.4%), non-communicable diseases (32.4%) and injuries (31.2%). Top rankings for alcohol-attributable DALYs for specific causes were TB (22.6%), HIV/AIDS (16.0%), road traffic injuries (15.9%), interpersonal violence (12.8%), cardiovascular disease (11.1%), cancer and cirrhosis (both 4%). Alcohol remains an important contributor to the overall disease burden, ranking fifth in terms of deaths and DALYs.
Conclusion. Although reducing overall alcohol use will decrease the burden of disease at a societal level, alcohol harm reduction strategies in SA should prioritise evidence-based interventions to change drinking patterns. Frequent heavy episodic (i.e. binge) drinking accounts for the unusually large share of injuries and infectious diseases in the alcohol-attributable burden of disease profile. Interventions should address the distal causes of heavy drinking through strategies recommended by the WHO’s SAFER initiative.
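The population attributable fraction used in the Methods above can be illustrated with the standard categorical form, PAF = Σ p_i (RR_i − 1) / (1 + Σ p_i (RR_i − 1)). The sketch below is a simplification: the study models exposure from survey data, whereas this example uses two discrete exposure categories, and every prevalence, relative risk and death count is hypothetical.

```python
from typing import Sequence


def population_attributable_fraction(
    prevalence: Sequence[float], relative_risk: Sequence[float]
) -> float:
    """Categorical PAF; the unexposed reference group (RR = 1) is implicit."""
    if len(prevalence) != len(relative_risk):
        raise ValueError("prevalence and relative_risk must have equal length")
    excess = sum(p * (rr - 1.0) for p, rr in zip(prevalence, relative_risk))
    return excess / (1.0 + excess)


# Hypothetical exposure categories: moderate and heavy drinkers.
prevalence = [0.20, 0.10]      # share of the population in each category
relative_risk = [1.5, 3.0]     # risk relative to abstainers

paf = population_attributable_fraction(prevalence, relative_risk)

# Applying the PAF to a hypothetical total of 1 000 deaths from one cause.
attributable_deaths = paf * 1000
print(round(paf, 4), round(attributable_deaths, 1))
```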
Estimating the burden of disease attributable to household air pollution from cooking with solid fuels in South Africa for 2000, 2006 and 2012
Background. Household air pollution (HAP) due to the use of solid fuels for cooking is a global problem with significant impacts on human health, especially in low- and middle-income countries. HAP remains problematic in South Africa (SA). While electrification rates have improved over the past two decades, many people still use solid fuels for cooking owing to energy poverty.
Objectives. To estimate the disease burden attributable to HAP for cooking in SA over three time points: 2000, 2006 and 2012.
Methods. Comparative risk assessment methodology was used. The proportion of South Africans exposed to HAP was assessed and assigned the estimated concentration of particulate matter with a diameter <2.5 μg/m3 (PM2.5) associated with HAP exposure. Health outcomes and relative risks associated with HAP exposure were identified. Population-attributable fractions and the attributable burden of disease due to HAP exposure (deaths, years of life lost, years lived with disability and disability-adjusted life years (DALYs)) for SA were calculated. Attributable burden was estimated for 2000, 2006 and 2012. For the year 2012, we estimated the attributable burden at provincial level.
Results. An estimated 17.6% of the SA population was exposed to HAP in 2012. In 2012, HAP exposure was estimated to have caused 8 862 deaths (95% uncertainty interval (UI) 8 413 - 9 251), or 1.7% (95% UI 1.6% - 1.8%) of all deaths in SA. Loss of healthy life years comprised 208 816 DALYs (95% UI 195 648 - 221 007), or 1.0% of all DALYs (95% UI 0.95% - 1.0%), in 2012. Lower respiratory infections and cardiovascular disease contributed the largest proportion of deaths and DALYs. HAP exposure due to cooking varied across provinces, and was highest in Limpopo (50.0%), Mpumalanga (27.4%) and KwaZulu-Natal (26.4%) provinces in 2012. Age-standardised burden measures showed that these three provinces had the highest rates of death and DALY burden attributable to HAP.
Conclusion. The burden of disease from HAP due to cooking in SA is of significant concern. Effective interventions supported by legislation and policy, together with awareness campaigns, are needed to ensure access to clean household fuels and improved cook stoves. Continued and enhanced efforts in this regard are required to ensure the burden of disease from HAP is curbed in SA.
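The age-standardised burden measures mentioned in the Results can be illustrated with direct standardisation, in which age-specific rates are weighted by a fixed standard population. The sketch below assumes three age bands with hypothetical deaths, populations and standard weights; none of these figures come from the study.

```python
from typing import Sequence


def age_standardised_rate(
    deaths: Sequence[float],
    population: Sequence[float],
    standard_weights: Sequence[float],
    per: float = 100_000,
) -> float:
    """Directly standardised rate: weighted mean of age-specific rates."""
    if not (len(deaths) == len(population) == len(standard_weights)):
        raise ValueError("all inputs must cover the same age bands")
    total_weight = sum(standard_weights)
    weighted = sum(
        w * d / n for d, n, w in zip(deaths, population, standard_weights)
    )
    return weighted / total_weight * per


# Hypothetical province with three age bands (young, middle, old).
deaths = [10, 40, 150]
population = [100_000, 80_000, 50_000]
standard_weights = [0.5, 0.3, 0.2]  # hypothetical standard population shares

asr = age_standardised_rate(deaths, population, standard_weights)
crude = sum(deaths) / sum(population) * 100_000
print(round(asr, 2), round(crude, 2))
```

Because the same weights are applied to every province, differences in the standardised rates reflect differences in age-specific risk rather than in age structure.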
Estimating the burden of disease attributable to ambient air pollution (ambient PM2.5 and ambient ozone) in South Africa for 2000, 2006 and 2012
Impact of dietary patterns, individual and workplace characteristics on blood pressure status among civil servants in Bida and Wushishi communities of Niger State, Nigeria
The global burden of hypertension is alarming and results in several million deaths annually. A high incidence of sudden deaths from cardiovascular diseases in the civil workforce in Nigeria is often reported. However, the associations of Dietary Patterns (DPs) and of individual and workplace characteristics with hypertension among this workforce have not been fully explored. This study aimed to identify DPs in the Bida and Wushishi Communities of Niger State and establish their relationship with hypertension along with other individual and workplace characteristics. Factor analysis was used to establish DPs, the Chi-square test to identify their relationships with hypertension, and logistic regression to determine the predictive risk factors. The prevalence of hypertension was 43.7%; mean weight, height, and body fat were 72.8±15 kg, 166±8.9 cm and 30.4%, respectively. Three DPs were identified: “Efficient Diet,” “Local Diet,” and “Energy Boost Diet.” The factor loading scores for these patterns were divided into quintiles Q1–Q5; none of them had a significant unadjusted effect on hypertension status. Conversely, increasing age, the Ministry, Department, and Agency (MDA) of employment, frequency of eating in restaurants, and obesity were identified as significant risk factors. After adjusting for confounders (age, body mass index, MDA, and eating habits), a high score (Q5) in the “Efficient Diet” pattern was significantly related to a lower likelihood of hypertension than a low score (Q1). The prevalence of hypertension among the participants was very high. Increasing age and working in the education sector were risk factors associated with hypertension. Therefore, it is recommended that civil servants engage in frequent exercise and undergo regular medical checkups, especially as they get older. These findings highlight the need for large-scale assessment of the impact of the variables considered in this study on hypertension among the civil workforce across Niger State and Nigeria.
Overview: Second Comparative Risk Assessment for South Africa (SACRA2) highlights need for health promotion and strengthened surveillance
Background. South Africa (SA) faces multiple health challenges. Quantifying the contribution of modifiable risk factors can be used to identify and prioritise areas of concern for population health and opportunities for health promotion and disease prevention interventions.
Objective. To estimate the attributable burden of 18 modifiable risk factors for 2000, 2006 and 2012.
Methods. Comparative risk assessment (CRA), a standardised and systematic approach, was used to estimate the attributable burden of 18 risk factors. Risk exposure estimates were sourced from local data, and meta-regressions were used to model the parameters, depending on the availability of data. Risk-outcome pairs meeting the criteria for convincing or probable evidence were assessed using relative risks against a theoretical minimum risk exposure level to calculate either a potential impact fraction or population attributable fraction (PAF). Relative risks were sourced from the Global Burden of Disease, Injuries, and Risk Factors (GBD) study as well as published cohort and intervention studies. Attributable burden was calculated for each risk factor for 2000, 2006 and 2012 by applying the PAF to estimates of deaths and years of life lost from the Second South African National Burden of Disease Study (SANBD2). Uncertainty analyses were performed using Monte Carlo simulation, and age-standardised rates were calculated using the World Health Organization standard population.
Results. Unsafe sex was the leading risk factor across all years, accounting for one in four DALYs (26.6%) of the estimated 20.6 million DALYs in 2012. The top five leading risk factors for males and females remained the same between 2000 and 2012. For males, the leading risks were (in order of descending rank): unsafe sex; alcohol consumption; interpersonal violence; tobacco smoking; and high systolic blood pressure; while for females the leading risks were unsafe sex; interpersonal violence; high systolic blood pressure; high body mass index; and high fasting plasma glucose. Since 2000, the attributable age-standardised death rates decreased for most risk factors. The largest decrease was for household air pollution (–41.8%). However, there was a notable increase in the age-standardised death rate for high fasting plasma glucose (44.1%), followed by ambient air pollution (7%).
Conclusion. This study reflects the continued dominance of unsafe sex and interpersonal violence during the study period, as well as the combined effects of poverty and underdevelopment with the emergence of cardiometabolic-related risk factors and ambient air pollution as key modifiable risk factors in SA. Despite reductions in the attributable burden of many risk factors, the study reveals significant scope for health promotion and disease prevention initiatives and provides an important tool for policy makers to influence policy and programme interventions in the country.
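The Monte Carlo uncertainty analysis mentioned in the Methods can be sketched for a single risk factor as follows. This is an illustration under stated assumptions, not the study's procedure: prevalence is held fixed, the relative risk is drawn from a log-normal distribution, and the prevalence, relative risk and standard error are hypothetical values. The function name `simulate_paf_interval` is also hypothetical.

```python
import math
import random
from typing import Tuple


def paf(prevalence: float, relative_risk: float) -> float:
    """Single-exposure population attributable fraction."""
    excess = prevalence * (relative_risk - 1.0)
    return excess / (1.0 + excess)


def simulate_paf_interval(
    prevalence: float,
    rr_point: float,
    rr_log_se: float,
    draws: int = 10_000,
    seed: int = 0,
) -> Tuple[float, float, float]:
    """Return (point estimate, 2.5th percentile, 97.5th percentile) of the PAF."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    samples = sorted(
        paf(prevalence, math.exp(rng.gauss(math.log(rr_point), rr_log_se)))
        for _ in range(draws)
    )
    lower = samples[round(0.025 * draws)]
    upper = samples[round(0.975 * draws) - 1]
    return paf(prevalence, rr_point), lower, upper


# Hypothetical inputs: 25% exposed, RR = 2.0, standard error of ln(RR) = 0.1.
point, lower, upper = simulate_paf_interval(0.25, 2.0, 0.1)
print(round(point, 3), round(lower, 3), round(upper, 3))
```

A fuller analysis would also draw the exposure prevalence (and the burden estimates, if treated as uncertain) in each iteration before taking percentiles.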
French contribution to the LHCb upgrade
This document details the French contribution to the LHCb upgrade, which follows on from the Framework TDR submitted to the LHCC on 25 May 2012. France contributed to the design and construction of the mechanics and readout electronics of the calorimeters. It is the main contributor to the first-level trigger system and the initiator of the DIRAC project, a software package for data processing and analysis in a distributed environment. French physicists and engineers hold many leading responsibilities and are very heavily involved in data analysis. The French groups wish to continue their strong participation in the experiment by contributing to its upgrade, notably to the readout electronics of the calorimeters and of the scintillating fibre tracker, as well as to data processing.
Recent advances in understanding hypertension development in sub-Saharan Africa
Consistent reports indicate that hypertension is a particularly common finding in black populations. Hypertension occurs at younger ages and is often more severe in terms of blood pressure levels and organ damage than in whites, resulting in a higher incidence of cardiovascular disease and mortality. This review provides an outline of recent advances in the pathophysiological understanding of blood pressure elevation and the consequences thereof in black populations in Africa. This is set against the backdrop of populations undergoing demanding and rapid demographic transition, where infection with the Human Immunodeficiency Virus predominates, and where under- and over-nutrition coexist. Collectively, recent findings from Africa illustrate an increased lifetime risk of hypertension from foetal life onwards. From young ages black populations display early endothelial dysfunction, increased vascular tone and reactivity, microvascular structural adaptations, as well as increased aortic stiffness resulting in elevated central and brachial blood pressures during the day and night, when compared to whites. Together with knowledge on the contributions of sympathetic activation and abnormal renal sodium handling, these pathophysiological adaptations result in subclinical and clinical organ damage at younger ages.
This overall enhanced understanding of the determinants of blood pressure elevation in blacks encourages (a) novel approaches to assess and manage hypertension in Africa better, (b) further scientific discovery to develop more effective prevention and treatment strategies, and (c) policymakers and health advocates to collectively contribute to creating health-promoting environments in Africa.
Problem drinking as a risk factor for tuberculosis: a propensity score matched analysis of a national survey
BACKGROUND: Epidemiological and other evidence strongly supports the hypothesis that problem drinking is causally related to the incidence of active tuberculosis and the worsening of the disease course. The presence of a large number of potential confounders, however, complicates the assessment of the actual size of this causal effect, leaving room for a substantial amount of bias. This study aims to contribute to the understanding of the role of confounding in the observed association between problem drinking and tuberculosis, assessing the effect of the adjustment for a relatively large number of potential confounders on the estimated prevalence odds ratio of tuberculosis among problem drinkers vs. moderate drinkers/abstainers in a cross-sectional, nationally representative sample of the South African adult population. METHODS: A propensity score approach was used to match each problem drinker in the sample with a subset of moderate drinkers/abstainers with similar characteristics in respect to a set of potential confounders. The prevalence odds ratio of tuberculosis between the matched groups was then calculated using conditional logistic regression. Sensitivity analyses were conducted to assess the robustness of the results in respect to misspecification of the model. RESULTS: The prevalence odds ratio of tuberculosis between problem drinkers and moderate drinkers/abstainers was 1.97 (95% CI: 1.40 to 2.77), and the result was robust with respect to the matching procedure as well as to incorrect adjustment for potential mediators and to the possible presence of unmeasured confounders. Sub-population analysis did not provide noteworthy evidence for the presence of interaction between problem drinking and the observed confounders.
CONCLUSION: In a cross-sectional national survey of the adult population of a middle-income country with a high tuberculosis burden, problem drinking was associated with a twofold increase in the odds of past TB diagnosis after controlling for a large number of socio-economic and biological confounders. Within the limitations of a cross-sectional study design with self-reported tuberculosis status, these results add to previous evidence of a causal link between problem drinking and tuberculosis, and suggest that the higher prevalence of tuberculosis among problem drinkers commonly found in population studies cannot be attributed to the confounding effect of the uneven distribution of other risk factors.
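To show how a prevalence odds ratio and its 95% confidence interval are read, the sketch below computes both from a single 2×2 table using the Wald (log-scale) interval. This is a deliberately simpler calculation than the conditional logistic regression on propensity-matched sets used in the study, and the counts are hypothetical, not survey data.

```python
import math
from typing import Tuple


def odds_ratio_with_ci(
    exposed_cases: int,
    exposed_noncases: int,
    unexposed_cases: int,
    unexposed_noncases: int,
    z: float = 1.96,
) -> Tuple[float, float, float]:
    """Odds ratio with a Wald confidence interval computed on the log scale."""
    cells = (exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases)
    if min(cells) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (exposed_cases * unexposed_noncases) / (
        exposed_noncases * unexposed_cases
    )
    log_se = math.sqrt(sum(1.0 / c for c in cells))
    lower = math.exp(math.log(odds_ratio) - z * log_se)
    upper = math.exp(math.log(odds_ratio) + z * log_se)
    return odds_ratio, lower, upper


# Hypothetical counts: problem drinkers vs. moderate drinkers/abstainers,
# with and without a past TB diagnosis.
por, ci_low, ci_high = odds_ratio_with_ci(40, 460, 20, 480)
print(round(por, 2), round(ci_low, 2), round(ci_high, 2))
```

An interval that excludes 1, as in this toy table, is what supports describing the association as statistically significant; it says nothing by itself about confounding, which is what the matching step in the study addresses.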