    Rule-based knowledge aggregation for large-scale protein sequence analysis of influenza A viruses

    Background: The explosive growth of biological data provides opportunities for new statistical and comparative analyses of large information sets, such as alignments comprising tens of thousands of sequences. In such studies, sequence annotations frequently play an essential role, and reliable results depend on metadata quality. However, the semantic heterogeneity and annotation inconsistencies in biological databases greatly increase the complexity of aggregating and cleaning metadata. Manual curation of datasets, traditionally favoured by life scientists, is impractical for studies involving thousands of records. In this study, we investigate quality issues that affect major public databases, and quantify the effectiveness of an automated metadata extraction approach that combines structural and semantic rules. We applied this approach to more than 90,000 influenza A records, to annotate sequences with protein name, virus subtype, isolate, host, geographic origin, and year of isolation. Results: Over 40,000 annotated Influenza A protein sequences were collected by combining information from more than 90,000 documents from NCBI public databases. Metadata values were automatically extracted, aggregated and reconciled from several document fields by applying user-defined structural rules. For each property, values were recovered from ≥88.8% of records, with accuracy exceeding 96% in most cases. Because of semantic heterogeneity, each property required up to six different structural rules to be combined. Significant quality differences between databases were found: GenBank documents yield values more reliably than documents extracted from GenPept. Using a simple set of semantic rules and a reasoner, we reconstructed relationships between sequences from the same isolate, thus identifying 7640 isolates. Validation of isolate metadata against a simple ontology highlighted more than 400 inconsistencies, leading to over 3,000 property value corrections. 
Conclusion: To overcome the quality issues inherent in public databases, automated knowledge aggregation with embedded intelligence is needed for large-scale analyses. Our results show that user-controlled intuitive approaches, based on combinations of simple rules, can reliably automate various curation tasks, reducing the need for manual corrections to approximately 5% of the records. Emerging semantic technologies possess desirable features to support today's knowledge aggregation tasks, with the potential to bring immediate benefits to this field.

    Predictors Of Early Treatment Discontinuation And Severe Anemia In A Brazilian Cohort Of Hepatitis C Patients Treated With First-generation Protease Inhibitors

    The aim of this study was to determine risk factors for adverse events (AE)-related treatment discontinuation and severe anemia among patients with chronic hepatitis C virus (HCV) genotype 1 infection, treated with first-generation protease inhibitor (PI)-based therapy. We included all patients who initiated treatment with PI-based therapy at a Brazilian university hospital between November 2013 and December 2014. We prospectively collected data from medical records using standardized questionnaires and used Epi Info 6.0 for analysis. Severe anemia was defined as hemoglobin ≤8.5 mg/dL. We included 203 patients: 132 treated with telaprevir (TVR) and 71 treated with boceprevir (BOC). The AE-related treatment discontinuation rate was 19.2%, and anemia was the main reason (38.5%). Risk factors for treatment discontinuation were higher comorbidity index (OR=1.85, CI=1.05-3.25) for BOC, and higher bilirubin count (OR=1.02, CI=1.01-1.04) and lower BMI (OR=0.98, CI=0.96-0.99) for TVR. Severe anemia occurred in 35 (17.2%) patients. Risk factors for this outcome were lower estimated glomerular filtration rate (eGFR; OR=0.95, CI=0.91-0.98) for patients treated with TVR, and higher comorbidity index (OR=2.21, CI=1.04-4.67) and ribavirin dosage (OR=0.84, CI=0.72-0.99) for those treated with BOC. Fifty-five (57.3%) patients treated with TVR and 15 (27.3%) patients treated with BOC achieved sustained virological response (SVR). Among patients who received TVR and interrupted treatment due to AE (n=19), only 26.3% (n=5) achieved SVR (P=0.003). Higher number of comorbidities, lower eGFR and advanced liver disease are associated with severe anemia and early treatment cessation, which may compromise SVR achievement.

    Tutorial : applying machine learning in behavioral research

    Machine-learning algorithms hold promise for revolutionizing how educators and clinicians make decisions. However, researchers in behavior analysis have been slow to adopt this methodology to further develop their understanding of human behavior and improve the application of the science to problems of applied significance. One potential explanation for the scarcity of research is that machine learning is not typically taught as part of training programs in behavior analysis. This tutorial aims to address this barrier by promoting increased research using machine learning in behavior analysis. We present how to apply the random forest, support vector machine, stochastic gradient descent, and k-nearest neighbors algorithms on a small dataset to better identify parents of children with autism who would benefit from a behavior analytic interactive web training. These step-by-step applications should allow researchers to implement machine-learning algorithms with novel research questions and datasets.

    Genome-wide analysis of ivermectin response by Onchocerca volvulus reveals that genetic drift and soft selective sweeps contribute to loss of drug sensitivity

    Treatment of onchocerciasis using mass ivermectin administration has reduced morbidity and transmission throughout Africa and Central/South America. Mass drug administration is likely to exert selection pressure on parasites, and phenotypic and genetic changes in several Onchocerca volvulus populations from Cameroon and Ghana, exposed to more than a decade of regular ivermectin treatment, have raised concern that sub-optimal responses to ivermectin's anti-fecundity effect are becoming more frequent and may spread. Pooled next generation sequencing (Pool-seq) was used to characterise genetic diversity within and between 108 adult female worms differing in ivermectin treatment history and response. Genome-wide analyses revealed genetic variation that significantly differentiated good responder (GR) and sub-optimal responder (SOR) parasites. These variants were not randomly distributed but clustered in ~31 quantitative trait loci (QTLs), with little overlap in putative QTL position and gene content between the two countries. Published candidate ivermectin SOR genes were largely absent in these regions; QTLs differentiating GR and SOR worms were enriched for genes in molecular pathways associated with neurotransmission, development, and stress responses. Finally, single worm genotyping demonstrated that geographic isolation and genetic change over time (in the presence of drug exposure) had a significantly greater role in shaping genetic diversity than the evolution of SOR. This study is one of the first genome-wide association analyses in a parasitic nematode, and provides insight into the genomics of ivermectin response and population structure of O. volvulus. We argue that ivermectin response is a polygenically-determined quantitative trait (QT) whereby identical or related molecular pathways, but not necessarily individual genes, are likely to determine the extent of ivermectin response in different parasite populations.
Furthermore, we propose that genetic drift rather than genetic selection of SOR is the underlying driver of population differentiation, which has significant implications for the emergence and potential spread of SOR within and between these parasite populations.

    Search for chargino-neutralino production with mass splittings near the electroweak scale in three-lepton final states in √s=13 TeV pp collisions with the ATLAS detector

    A search for supersymmetry through the pair production of electroweakinos with mass splittings near the electroweak scale and decaying via on-shell W and Z bosons is presented for a three-lepton final state. The analyzed proton-proton collision data taken at a center-of-mass energy of √s=13 TeV were collected between 2015 and 2018 by the ATLAS experiment at the Large Hadron Collider, corresponding to an integrated luminosity of 139 fb⁻¹. A search, emulating the recursive jigsaw reconstruction technique with easily reproducible laboratory-frame variables, is performed. The two excesses observed in the 2015–2016 data recursive jigsaw analysis in the low-mass three-lepton phase space are reproduced. Results with the full data set are in agreement with the Standard Model expectations. They are interpreted to set exclusion limits at the 95% confidence level on simplified models of chargino-neutralino pair production for masses up to 345 GeV.

    Observation of associated near-side and away-side long-range correlations in √sNN=5.02 TeV proton-lead collisions with the ATLAS detector

    Two-particle correlations in relative azimuthal angle (Δϕ) and pseudorapidity (Δη) are measured in √sNN=5.02 TeV p+Pb collisions using the ATLAS detector at the LHC. The measurements are performed using approximately 1 μb⁻¹ of data as a function of transverse momentum (pT) and the transverse energy (ΣE_T^Pb) summed over 3.1<η<4.9 in the direction of the Pb beam. The correlation function, constructed from charged particles, exhibits a long-range (2<|Δη|<5) “near-side” (Δϕ∼0) correlation that grows rapidly with increasing ΣE_T^Pb. A long-range “away-side” (Δϕ∼π) correlation, obtained by subtracting the expected contributions from recoiling dijets and other sources estimated using events with small ΣE_T^Pb, is found to match the near-side correlation in magnitude, shape (in Δη and Δϕ) and ΣE_T^Pb dependence. The resultant Δϕ correlation is approximately symmetric about π/2, and is consistent with a dominant cos 2Δϕ modulation for all ΣE_T^Pb ranges and particle pT.

    Measurement of the inclusive and dijet cross-sections of b-jets in pp collisions at sqrt(s) = 7 TeV with the ATLAS detector

    The inclusive and dijet production cross-sections have been measured for jets containing b-hadrons (b-jets) in proton-proton collisions at a centre-of-mass energy of sqrt(s) = 7 TeV, using the ATLAS detector at the LHC. The measurements use data corresponding to an integrated luminosity of 34 pb^-1. The b-jets are identified using either a lifetime-based method, where secondary decay vertices of b-hadrons in jets are reconstructed using information from the tracking detectors, or a muon-based method where the presence of a muon is used to identify semileptonic decays of b-hadrons inside jets. The inclusive b-jet cross-section is measured as a function of transverse momentum in the range 20 < pT < 400 GeV and rapidity in the range |y| < 2.1. The bbbar-dijet cross-section is measured as a function of the dijet invariant mass in the range 110 < m_jj < 760 GeV, the azimuthal angle difference between the two jets and the angular variable chi in two dijet mass regions. The results are compared with next-to-leading-order QCD predictions. Good agreement is observed between the measured cross-sections and the predictions obtained using POWHEG + Pythia. MC@NLO + Herwig shows good agreement with the measured bbbar-dijet cross-section. However, it does not reproduce the measured inclusive cross-section well, particularly for central b-jets with large transverse momenta. Comment: 10 pages plus author list (21 pages total), 8 figures, 1 table, final version published in European Physical Journal.

    Measurements of fiducial and differential cross sections for Higgs boson production in the diphoton decay channel at √s=8 TeV with ATLAS

    Measurements of fiducial and differential cross sections are presented for Higgs boson production in proton-proton collisions at a centre-of-mass energy of √s=8 TeV. The analysis is performed in the H → γγ decay channel using 20.3 fb⁻¹ of data recorded by the ATLAS experiment at the CERN Large Hadron Collider. The signal is extracted using a fit to the diphoton invariant mass spectrum assuming that the width of the resonance is much smaller than the experimental resolution. The signal yields are corrected for the effects of detector inefficiency and resolution. The pp → H → γγ fiducial cross section is measured to be 43.2 ± 9.4 (stat.) +3.2/−2.9 (syst.) ± 1.2 (lumi) fb for a Higgs boson of mass 125.4 GeV decaying to two isolated photons that have transverse momentum greater than 35% and 25% of the diphoton invariant mass and each with absolute pseudorapidity less than 2.37. Four additional fiducial cross sections and two cross-section limits are presented in phase space regions that test the theoretical modelling of different Higgs boson production mechanisms, or are sensitive to physics beyond the Standard Model. Differential cross sections are also presented, as a function of variables related to the diphoton kinematics and the jet activity produced in the Higgs boson events. The observed spectra are statistically limited but broadly in line with the theoretical expectations.