181 research outputs found

    WikiLinkGraphs: A Complete, Longitudinal and Multi-Language Dataset of the Wikipedia Link Networks

    Wikipedia articles contain multiple links connecting a subject to other pages of the encyclopedia. In Wikipedia parlance, these links are called internal links or wikilinks. We present a complete dataset of the network of internal Wikipedia links for the 9 largest language editions. The dataset contains yearly snapshots of the network and spans 17 years, from the creation of Wikipedia in 2001 to March 1st, 2018. While previous work has mostly focused on the complete hyperlink graph, which also includes links automatically generated by templates, we parsed each revision of each article to track links appearing in the main text. In this way we obtained a cleaner network, discarding more than half of the links and representing all and only the links intentionally added by editors. We describe in detail how the Wikipedia dumps have been processed and the challenges we have encountered, including the need to handle special pages such as redirects, i.e., alternative article titles. We present descriptive statistics of several snapshots of this network. Finally, we propose several research opportunities that can be explored using this new dataset. Comment: 10 pages, 3 figures, 7 tables, LaTeX. Final camera-ready version accepted at the 13th International AAAI Conference on Web and Social Media (ICWSM 2019) - Munich (Germany), 11-14 June 201
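
The distinction drawn in this abstract, between links written in the main text and links generated by templates, can be illustrated with a minimal sketch. It is not the authors' pipeline, which is not described in this listing. The function names `strip_templates` and `extract_wikilinks` and both regular expressions are assumptions made for illustration. The sketch drops `{{...}}` templates before collecting `[[...]]` targets, and it does not resolve redirects or handle namespaces and nested file links.

```python
import re

# Assumed pattern: matches [[Target]], [[Target|label]] and [[Target#Section|label]].
WIKILINK_RE = re.compile(r"\[\[([^\[\]|#]+)(?:#[^\[\]|]*)?(?:\|[^\[\]]*)?\]\]")
# Assumed pattern: matches an innermost {{...}} template (no nested braces inside).
TEMPLATE_RE = re.compile(r"\{\{[^{}]*\}\}")


def strip_templates(wikitext: str) -> str:
    """Remove {{...}} templates, innermost first, until none remain."""
    previous = None
    while previous != wikitext:
        previous = wikitext
        wikitext = TEMPLATE_RE.sub("", wikitext)
    return wikitext


def extract_wikilinks(wikitext: str) -> list[str]:
    """Return link targets that appear outside templates, in document order."""
    text = strip_templates(wikitext)
    targets = []
    for match in WIKILINK_RE.finditer(text):
        title = match.group(1).strip()
        if title:
            # MediaWiki treats the first letter of a title as case-insensitive,
            # so normalise it to upper case.
            targets.append(title[0].upper() + title[1:])
    return targets
```

Applied to every revision of an article, a function of this kind would yield the per-revision link sets from which yearly snapshots could be assembled. Redirect handling, which the abstract names as a challenge, would need a separate title-resolution step.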

    Development and Preliminary Validation of an Electromyography-Scoring Protocol for the Assessment and Grading of Muscle Involvement in Patients With Juvenile Idiopathic Inflammatory Myopathies.

    Introduction: We performed a pilot study to investigate the feasibility of an electromyography (EMG)-scoring protocol for the assessment of disease activity in juvenile idiopathic inflammatory myopathies (JIIM). Methods: Children with JIIM followed up in a tertiary-level care center underwent standardized clinical, laboratory, and EMG assessment. An EMG-scoring protocol was devised by a consensus panel including a pediatric neurophysiologist and two pediatric rheumatologists, based on a combined score obtained as the sum of (1) the presence of denervation signs (fibrillation potentials) and (2) motor unit remodeling (mixed pattern of short- and long-duration motor unit action potentials). The EMG-scoring protocol was then validated following the Outcome Measures in Rheumatoid Arthritis Clinical Trials filter for outcome measures in rheumatology and the consensus-based standards for the selection of health measurement instruments methodology. Results: Thirteen children (77% females) were included in the study, with a median age of 10 years (interquartile range: 7-17 years) and median disease duration of 11.8 months (interquartile range: 2.1-44.5). A total of 39 EMG examinations were evaluated. A strong positive association between a standardized tool for muscle strength assessment and the combined score was observed. No significant associations were found with either creatine kinase or erythrocyte sedimentation rate levels. Discussion: Our EMG-scoring protocol is the first standardized and reproducible tool for the neurophysiologic evaluation and grading of muscle involvement in patients with JIIM and could provide relevant additional information in the assessment and follow-up of these rare conditions

    Cone-Beam Computed Tomographic Assessment of the Mandibular Condylar Volume in Different Skeletal Patterns: A Retrospective Study in Adult Patients

    The aim of this study was to assess the condylar volume in adult patients with different skeletal classes and vertical patterns using cone‐beam computed tomography (CBCT). CBCT scans of 146 condyles from 73 patients (mean age 30 ± 12 years old; 49 female, 24 male) were selected from the archive of the Department of Dentistry and Maxillofacial Surgery of Fondazione IRCCS Ca’ Granda, Milan, Italy, and retrospectively analyzed. The following inclusion criteria were used: adult patients; CBCT performed with the same protocol (0.4 mm slice thickness, 16 × 22 cm field of view, 20 s scan time); no systemic diseases; and no previous orthodontic treatments. Three‐dimensional cephalometric tracings were performed for each patient, the mandibular condyles were segmented and the relevant volumes calculated using Mimics Materialize 20.0 software (Materialise, Leuven, Belgium). Right and left variables were analyzed together using random‐intercept linear regression models. No significant association between condylar volumes and skeletal class was found. On the other hand, in relation to vertical patterns, the mean values of the mandibular condyle volumes in hyperdivergent subjects (688 mm³) with a post‐rotation growth pattern (625 mm³) were smaller than in hypodivergent patients (812 mm³) with a horizontal growth pattern (900 mm³). Patients with an increased divergence angle had smaller condylar volumes than subjects with normal or decreased mandibular plane divergence. This relationship may help the clinician when planning orthodontic treatment

    Geostatistical integration and uncertainty in pollutant concentration surface under preferential sampling

    In this paper the focus is on environmental statistics, with the aim of estimating the concentration surface and related uncertainty of an air pollutant. We used air quality data recorded by a network of monitoring stations within a Bayesian framework to overcome difficulties in accounting for prediction uncertainty and to integrate information provided by deterministic models based on emissions, meteorology, and chemico-physical characteristics of the atmosphere. Several authors have proposed such integration, but all the proposed approaches rely on representativeness and completeness of existing air pollution monitoring networks. We considered the situation in which the spatial process of interest and the sampling locations are not independent. This is known in the literature as the preferential sampling problem, which, if ignored in the analysis, can bias geostatistical inferences. We developed a Bayesian geostatistical model to account for preferential sampling with the main interest in statistical integration and uncertainty. We used PM10 data arising from the air quality network of the Environmental Protection Agency of Lombardy Region (Italy) and numerical outputs from the deterministic model. We specified an inhomogeneous Poisson process for the intensity of the sampling locations and a shared spatial random component model for the dependence between the spatial location of monitors and the pollution surface. We found greater predicted standard deviation differences in areas not properly covered by the air quality network. In conclusion, in this context inferences on prediction uncertainty may be misleading when geostatistical modelling does not take into account preferential sampling
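
The shared-component structure mentioned in this abstract can be written down as a short sketch. It follows a common formulation from the preferential sampling literature and is not necessarily the exact specification used in the paper. All symbols are assumptions introduced here: $Y(s_i)$ is the observed concentration at monitor location $s_i$, $m(s)$ is the deterministic model output, $S(s)$ is a zero-mean Gaussian spatial process, and $\lambda(s)$ is the intensity of the inhomogeneous Poisson process generating the monitor locations.

```latex
\begin{align}
  Y(s_i) &= \beta_0 + \beta_1\, m(s_i) + S(s_i) + \varepsilon_i,
    & \varepsilon_i &\sim \mathcal{N}(0, \tau^2), \\
  \lambda(s) &= \exp\{\alpha + \gamma\, S(s)\}.
\end{align}
```

In this sketch the parameter $\gamma$ controls the degree of preferential sampling. With $\gamma = 0$ the monitor locations are independent of the pollution surface and the model reduces to a standard geostatistical one; with $\gamma > 0$ monitors are more likely to be placed where $S(s)$, and hence the concentration, is high.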

    Lung Cancer and Occupation in a Population-based Case-Control Study

    The authors examined the relation between occupation and lung cancer in the large, population-based Environment And Genetics in Lung cancer Etiology (EAGLE) case-control study. In 2002–2005 in the Lombardy region of northern Italy, 2,100 incident lung cancer cases and 2,120 randomly selected population controls were enrolled. Lifetime occupational histories (industry and job title) were coded by using standard international classifications and were translated into occupations known (list A) or suspected (list B) to be associated with lung cancer. Smoking-adjusted odds ratios and 95% confidence intervals were calculated with logistic regression. For men, an increased risk was found for list A (177 exposed cases and 100 controls; odds ratio = 1.74, 95% confidence interval: 1.27, 2.38) and most occupations therein. No overall excess was found for list B with the exception of filling station attendants and bus and truck drivers (men) and launderers and dry cleaners (women). The authors estimated that 4.9% (95% confidence interval: 2.0, 7.8) of lung cancers in men were attributable to occupation. Among those in other occupations, risk excesses were found for metal workers, barbers and hairdressers, and other motor vehicle drivers. These results indicate that past exposure to occupational carcinogens remains an important determinant of lung cancer occurrence

    Phase I Metabolic Genes and Risk of Lung Cancer: Multiple Polymorphisms and mRNA Expression

    Polymorphisms in genes coding for enzymes that activate tobacco lung carcinogens may generate inter-individual differences in lung cancer risk. Previous studies had limited sample sizes, poor exposure characterization, and a few single nucleotide polymorphisms (SNPs) tested in candidate genes. We analyzed 25 SNPs (some previously untested) in 2101 primary lung cancer cases and 2120 population controls from the Environment And Genetics in Lung cancer Etiology (EAGLE) study from six phase I metabolic genes, including cytochrome P450s, microsomal epoxide hydrolase, and myeloperoxidase. We evaluated the main genotype effects and genotype-smoking interactions in lung cancer risk overall and in the major histology subtypes. We tested the combined effect of multiple SNPs on lung cancer risk and on gene expression. Findings were prioritized based on significance thresholds and consistency across different analyses, and accounted for multiple testing and prior knowledge. Two haplotypes in EPHX1 were significantly associated with lung cancer risk in the overall population. In addition, CYP1B1 and CYP2A6 polymorphisms were inversely associated with adenocarcinoma and squamous cell carcinoma risk, respectively. Moreover, the association between CYP1A1 rs2606345 genotype and lung cancer was significantly modified by intensity of cigarette smoking, suggesting an underlying dose-response mechanism. Finally, increasing number of variants at CYP1A1/A2 genes revealed significant protection in never smokers and risk in ever smokers. Results were supported by differential gene expression in non-tumor lung tissue samples with down-regulation of CYP1A1 in never smokers and up-regulation in smokers from CYP1A1/A2 SNPs. The significant haplotype associations emphasize that the effect of multiple SNPs may be important despite null single SNP-associations, and warrants consideration in genome-wide association studies (GWAS). Our findings emphasize the necessity of post-GWAS fine mapping and SNP functional assessment to further elucidate cancer risk associations

    Gene Expression Signature of Cigarette Smoking and Its Role in Lung Adenocarcinoma Development and Survival

    Tobacco smoking is responsible for over 90% of lung cancer cases, and yet the precise molecular alterations induced by smoking in lung that develop into cancer and impact survival have remained obscure. We performed gene expression analysis using HG-U133A Affymetrix chips on 135 fresh frozen tissue samples of adenocarcinoma and paired noninvolved lung tissue from current, former and never smokers, with biochemically validated smoking information. ANOVA analysis adjusted for potential confounders, multiple testing procedure, Gene Set Enrichment Analysis, and GO-functional classification were conducted for gene selection. Results were confirmed in independent adenocarcinoma and non-tumor tissues from two studies. We identified a gene expression signature characteristic of smoking that includes cell cycle genes, particularly those involved in the mitotic spindle formation (e.g., NEK2, TTK, PRC1). Expression of these genes strongly differentiated both smokers from non-smokers in lung tumors and early stage tumor tissue from non-tumor tissue (p<0.001 and fold-change >1.5, for each comparison), consistent with an important role for this pathway in lung carcinogenesis induced by smoking. These changes persisted many years after smoking cessation. NEK2 (p<0.001) and TTK (p = 0.002) expression in the noninvolved lung tissue was also associated with a 3-fold increased risk of mortality from lung adenocarcinoma in smokers. Our work provides insight into the smoking-related mechanisms of lung neoplasia, and shows that the very mitotic genes known to be involved in cancer development are induced by smoking and affect survival. These genes are candidate targets for chemoprevention and treatment of lung cancer in smokers

    Cancer incidence in the population exposed to dioxin after the "Seveso accident": twenty years of follow-up

    Background: The Seveso, Italy, accident in 1976 caused the contamination of a large population by 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD). Possible long-term effects have been examined through mortality and cancer incidence studies. We have updated the cancer incidence study which now covers the period 1977-96. Methods: The study population includes subjects resident at the time of the accident in three contaminated zones with decreasing TCDD soil levels (zone A, very high; zone B, high; zone R, low) and in a surrounding non-contaminated reference territory. Gender-, age-, and period-adjusted rate ratios (RR) and 95% confidence intervals (95% CI) were calculated by using Poisson regression for subjects aged 0-74 years. Results: All cancer incidence did not differ from expectations in any of the contaminated zones. An excess of lymphatic and hematopoietic tissue neoplasms was observed in zones A (four cases; RR, 1.39; 95% CI, 0.52-3.71) and B (29 cases; RR, 1.56; 95% CI, 1.07-2.27) consistent with the findings of the concurrent mortality study. An increased risk of breast cancer was detected in zone A females after 15 years since the accident (five cases, RR, 2.57; 95% CI, 1.07-6.20). No cases of soft tissue sarcomas occurred in the most exposed zones (A and B, 1.17 expected). No cancer cases were observed among subjects diagnosed with chloracne early after the accident. Conclusion: The extension of the Seveso cancer incidence study confirmed an excess risk of lymphatic and hematopoietic tissue neoplasms in the most exposed zones. No clear pattern by time since the accident and zones was evident partly because of the low number of cases. The elevated risk of breast cancer in zone A females after 15 years since the accident deserves further and thorough investigation. The follow-up is continuing in order to cover the long time period (even decades) usually elapsing from exposure to carcinogenic chemicals and disease occurrence.