
    Reliability-based cleaning of noisy training labels with inductive conformal prediction in multi-modal biomedical data mining

    Accurately labeling biomedical data presents a challenge. Traditional semi-supervised learning methods often under-utilize available unlabeled data. To address this, we propose a novel reliability-based training data cleaning method employing inductive conformal prediction (ICP). This method capitalizes on a small set of accurately labeled training data and leverages ICP-calculated reliability metrics to rectify mislabeled data and outliers within vast quantities of noisy training data. The efficacy of the method is validated across three classification tasks within distinct modalities: filtering drug-induced-liver-injury (DILI) literature by title and abstract, predicting ICU admission of COVID-19 patients from CT radiomics and electronic health records, and subtyping breast cancer using RNA-sequencing data. Varying levels of noise were introduced into the training labels through label permutation. Results show significant enhancements in classification performance: accuracy improvements in 86 out of 96 DILI experiments (up to 11.4%), AUROC and AUPRC improvements in all 48 COVID-19 experiments (up to 23.8% and 69.8%), and accuracy and macro-average F1 score improvements in 47 out of 48 RNA-sequencing experiments (up to 74.6% and 89.0%). Our method offers the potential to substantially boost classification performance in multi-modal biomedical machine learning tasks. Importantly, it accomplishes this without requiring an excessive volume of meticulously curated training data.
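The general idea of cleaning labels with inductive conformal prediction can be illustrated with a minimal sketch. It is not the authors' implementation: the nearest-centroid nonconformity score, the threshold `eps`, the relabel/outlier rule, and all function names are assumptions chosen for illustration. A small clean set is split into a proper-training part (used to fit centroids) and a calibration part (used to turn nonconformity scores into p-values).

```python
import math

def centroids(X, y):
    """Mean feature vector per class, fitted on the clean proper-training split."""
    sums, counts = {}, {}
    for xi, yi in zip(X, y):
        acc = sums.setdefault(yi, [0.0] * len(xi))
        for j, v in enumerate(xi):
            acc[j] += v
        counts[yi] = counts.get(yi, 0) + 1
    return {c: [v / counts[c] for v in acc] for c, acc in sums.items()}

def nonconformity(x, label, cents):
    """Assumed score: distance to the centroid of the hypothesised label."""
    return math.dist(x, cents[label])

def p_values(x, cents, cal_scores):
    """ICP p-value of every candidate label for sample x."""
    out = {}
    for label in cents:
        a = nonconformity(x, label, cents)
        n_ge = sum(1 for s in cal_scores if s >= a)
        out[label] = (n_ge + 1) / (len(cal_scores) + 1)
    return out

def clean_labels(X_noisy, y_noisy, cents, cal_scores, eps=0.2):
    """Keep, relabel, or drop (None) each noisy sample. Illustrative rule only."""
    cleaned = []
    for x, y in zip(X_noisy, y_noisy):
        p = p_values(x, cents, cal_scores)
        best = max(p, key=p.get)
        if p[best] <= eps:      # no label is credible: treat as outlier
            cleaned.append(None)
        elif p[y] <= eps:       # given label is not credible: relabel
            cleaned.append(best)
        else:                   # given label is credible: keep
            cleaned.append(y)
    return cleaned

# Toy data (hypothetical): two well-separated classes in 2-D.
X_train = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
y_train = [0, 0, 0, 1, 1, 1]
X_cal = [(0.5, 0.5), (1, 1), (0, 0.5), (5.5, 5.5), (6, 6), (5, 5.5)]
y_cal = [0, 0, 0, 1, 1, 1]

cents = centroids(X_train, y_train)
cal_scores = [nonconformity(x, y, cents) for x, y in zip(X_cal, y_cal)]

X_noisy = [(0.2, 0.2), (5.2, 5.4), (20, 20), (0.4, 0.1)]
y_noisy = [1, 1, 0, 0]  # the first label is deliberately wrong
cleaned = clean_labels(X_noisy, y_noisy, cents, cal_scores)
print(cleaned)  # -> [0, 1, None, 0]
```

With only six calibration samples the smallest attainable p-value is 1/7, which is why the toy threshold is set to 0.2; a realistic calibration set would allow a much stricter threshold.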

    GW25-e0768 Circulating level of miR-378 predicts left ventricular hypertrophy in patients with aortic stenosis

    A New Method of RNA Secondary Structure Prediction Based on Convolutional Neural Network and Dynamic Programming

    In recent years, obtaining RNA secondary structure information has played an important role in RNA and gene function research. Although some RNA secondary structures can be determined experimentally, in most cases efficient and accurate computational methods are still needed to predict RNA secondary structure. Current RNA secondary structure prediction methods are mainly based on the minimum free energy algorithm, which uses an iterative method to find the optimal folding state of RNA in vivo that satisfies the minimum-energy or other constraints. However, due to the complexity of the biological environment, a real RNA structure maintains a balance of biological potential energy rather than the optimal folding state that meets the minimum energy. For a short RNA sequence, the equilibrium energy state of the folded RNA is close to the minimum free energy state; therefore, the minimum free energy algorithm predicts its secondary structure with higher accuracy. In a longer RNA sequence, however, continued folding causes the biopotential energy balance to deviate far from the minimum free energy state. This deviation arises from the complexity of the structure and results in a serious decline in the prediction accuracy of the secondary structure. In this paper, we propose a novel RNA secondary structure prediction algorithm that combines a convolutional neural network model with a dynamic programming method to improve accuracy using large-scale RNA sequence and structure data. We analyze current experimental RNA sequence and structure data to construct a deep convolutional network model, and then extract implicit features for effective classification from large-scale data to predict the pairing probability of each base in an RNA sequence. Given the predicted base-pairing probabilities, an enhanced dynamic programming method is applied to obtain the optimal RNA secondary structure. Results indicate that our proposed method is superior to common RNA secondary structure prediction algorithms in predicting three benchmark RNA families. Given the characteristics of deep learning algorithms, it can be inferred that the proposed method has a 30% higher prediction success rate than other algorithms, and such methods will be needed as the amount of real RNA structure data increases in the future.
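The step from pairing probabilities to a structure can be sketched with a plain Nussinov-style dynamic program. This is not the paper's "enhanced" method, whose details the abstract does not give. The sketch assumes a hypothetical matrix `P` where `P[i][j]` (for `i < j`) is a predicted probability that bases `i` and `j` pair, and it maximizes the summed probability of a nested (non-crossing) set of pairs. The names `fold`, `to_dot_bracket`, `min_loop`, and `threshold` are illustrative.

```python
def fold(P, min_loop=3, threshold=0.5):
    """Return nested base pairs maximising the summed pairing probability.

    P[i][j] (i < j) is an assumed pairing probability; pairs closer than
    min_loop unpaired bases, or below threshold, are never selected.
    """
    n = len(P)
    best = [[0.0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = max(best[i + 1][j], best[i][j - 1])  # i or j unpaired
            if P[i][j] >= threshold:                     # i pairs with j
                score = max(score, best[i + 1][j - 1] + P[i][j])
            for k in range(i + 1, j):                    # two sub-structures
                score = max(score, best[i][k] + best[k + 1][j])
            best[i][j] = score

    # Traceback: re-derive which case produced each optimal score.
    pairs, stack = [], [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue
        s = best[i][j]
        if s == best[i + 1][j]:
            stack.append((i + 1, j))
        elif s == best[i][j - 1]:
            stack.append((i, j - 1))
        elif P[i][j] >= threshold and s == best[i + 1][j - 1] + P[i][j]:
            pairs.append((i, j))
            stack.append((i + 1, j - 1))
        else:
            for k in range(i + 1, j):
                if s == best[i][k] + best[k + 1][j]:
                    stack.append((i, k))
                    stack.append((k + 1, j))
                    break
    return sorted(pairs)

def to_dot_bracket(n, pairs):
    """Render pairs in dot-bracket notation."""
    chars = ["."] * n
    for i, j in pairs:
        chars[i], chars[j] = "(", ")"
    return "".join(chars)

# Toy input (hypothetical): a 10-base hairpin with three confident pairs.
n = 10
P = [[0.0] * n for _ in range(n)]
for i, j in [(0, 9), (1, 8), (2, 7)]:
    P[i][j] = 0.9
hairpin_pairs = fold(P)
print(to_dot_bracket(n, hairpin_pairs))  # -> (((....)))
```

The traceback compares floating-point scores for exact equality, which is safe here only because each comparison repeats the same arithmetic used when the table was filled.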

    Quantification of hypsarrhythmia in infantile spasmatic EEG: a large cohort study

    Infantile spasms (IS) is a neurological disorder causing mental and/or developmental retardation in many infants. Hypsarrhythmia is a typical symptom in the electroencephalography (EEG) signals of patients with IS. Long-term EEG/video monitoring is most frequently employed in clinical practice for IS diagnosis, but manual screening of hypsarrhythmia from these recordings is time-consuming and lacks sufficient reliability. This study aims to identify potential biomarkers for automatic IS diagnosis by quantitative analysis of the EEG signals. A large cohort of 101 IS patients and 155 healthy controls (HC) was involved. Typical hypsarrhythmia and non-hypsarrhythmia EEG signals were annotated, and normal EEG segments were randomly picked from the HC. Root mean square (RMS), Teager energy (TE), mean frequency, sample entropy (SamEn), multi-channel SamEn, multi-scale SamEn, and nonlinear correlation coefficient were computed in each sub-band of the three types of EEG signals, and then compared using either a one-way ANOVA or a Kruskal-Wallis test (depending on their distribution) and receiver operating characteristic (ROC) curves. The effects of infant age on these features were also investigated. For most of the employed features, significant (p < 0.05) differences were observed between hypsarrhythmia EEG and non-hypsarrhythmia EEG or HC, and these differences seem to increase with infant age. RMS and TE produce the best classification in the delta and theta bands, while entropy features yield the best performance in the gamma band. Our study suggests RMS and TE (delta and theta bands) and entropy features (gamma band) to be promising biomarkers for automatic detection of hypsarrhythmia in long-term EEG monitoring. The findings of our study indicate the feasibility of automated IS diagnosis using artificial intelligence.
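Three of the features named above have standard textbook definitions, sketched here for a single-channel signal. This is only an illustration under assumptions: the study's sub-band filtering, windowing, and parameter choices are not given in the abstract, so the defaults `m=2` and `r=0.2` and the function names are hypothetical.

```python
import math

def rms(x):
    """Root mean square amplitude."""
    return math.sqrt(sum(v * v for v in x) / len(x))

def teager_energy(x):
    """Mean Teager-Kaiser energy, psi[n] = x[n]^2 - x[n-1] * x[n+1]."""
    psi = [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]
    return sum(psi) / len(psi)

def sample_entropy(x, m=2, r=0.2):
    """SampEn = -ln(A / B), with tolerance r * std and Chebyshev distance.

    B counts matching template pairs of length m, A those of length m + 1;
    both use the same len(x) - m starting positions.
    """
    n = len(x)
    mean = sum(x) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    tol = r * std

    def count_matches(length):
        templates = [x[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                dist = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if dist <= tol:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return math.inf  # undefined when no matches are found
    return -math.log(a / b)

# Toy signals (hypothetical, not EEG data).
square = [3, -3, 3, -3]
sine = [0, 1, 0, -1, 0, 1, 0, -1]        # sin(pi * n / 2)
periodic = [0, 1] * 20                    # perfectly regular
irregular = [0, 1, 0, 1, 0, 1, 0, 0]      # regularity broken at the end

print(rms(square))                # -> 3.0
print(teager_energy(sine))        # -> 1.0
print(sample_entropy(irregular))  # ln(1.5), about 0.405
```

A perfectly periodic signal gives a sample entropy of zero, while any break in regularity raises it, which is the property that makes entropy features candidates for separating EEG patterns.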

    Multidifferential study of identified charged hadron distributions in $Z$-tagged jets in proton-proton collisions at $\sqrt{s}=13$ TeV

    Jet fragmentation functions are measured for the first time in proton-proton collisions for charged pions, kaons, and protons within jets recoiling against a $Z$ boson. The charged-hadron distributions are studied longitudinally and transversely to the jet direction for jets with transverse momentum $20 < p_{\textrm{T}} < 100$ GeV and in the pseudorapidity range $2.5 < \eta < 4$. The data sample was collected with the LHCb experiment at a center-of-mass energy of 13 TeV, corresponding to an integrated luminosity of 1.64 fb$^{-1}$. Triple differential distributions as a function of the hadron longitudinal momentum fraction, hadron transverse momentum, and jet transverse momentum are also measured for the first time. This helps constrain transverse-momentum-dependent fragmentation functions. Differences in the shapes and magnitudes of the measured distributions for the different hadron species provide insights into the hadronization process for jets predominantly initiated by light quarks. Comment: All figures and tables, along with machine-readable versions and any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2022-013.html (LHCb public pages)

    Study of the $B^{-} \to \Lambda_{c}^{+} \bar{\Lambda}_{c}^{-} K^{-}$ decay

    The decay $B^{-} \to \Lambda_{c}^{+} \bar{\Lambda}_{c}^{-} K^{-}$ is studied in proton-proton collisions at a center-of-mass energy of $\sqrt{s}=13$ TeV using data corresponding to an integrated luminosity of 5 $\mathrm{fb}^{-1}$ collected by the LHCb experiment. In the $\Lambda_{c}^{+} K^{-}$ system, the $\Xi_{c}(2930)^{0}$ state observed at the BaBar and Belle experiments is resolved into two narrower states, $\Xi_{c}(2923)^{0}$ and $\Xi_{c}(2939)^{0}$, whose masses and widths are measured to be $m(\Xi_{c}(2923)^{0}) = 2924.5 \pm 0.4 \pm 1.1 \,\mathrm{MeV}$, $m(\Xi_{c}(2939)^{0}) = 2938.5 \pm 0.9 \pm 2.3 \,\mathrm{MeV}$, $\Gamma(\Xi_{c}(2923)^{0}) = 4.8 \pm 0.9 \pm 1.5 \,\mathrm{MeV}$, and $\Gamma(\Xi_{c}(2939)^{0}) = 11.0 \pm 1.9 \pm 7.5 \,\mathrm{MeV}$, where the first uncertainties are statistical and the second systematic. The results are consistent with a previous LHCb measurement using a prompt $\Lambda_{c}^{+} K^{-}$ sample. Evidence of a new $\Xi_{c}(2880)^{0}$ state is found with a local significance of $3.8\,\sigma$, whose mass and width are measured to be $2881.8 \pm 3.1 \pm 8.5\,\mathrm{MeV}$ and $12.4 \pm 5.3 \pm 5.8 \,\mathrm{MeV}$, respectively. In addition, evidence of a new decay mode $\Xi_{c}(2790)^{0} \to \Lambda_{c}^{+} K^{-}$ is found with a significance of $3.7\,\sigma$. The relative branching fraction of $B^{-} \to \Lambda_{c}^{+} \bar{\Lambda}_{c}^{-} K^{-}$ with respect to the $B^{-} \to D^{+} D^{-} K^{-}$ decay is measured to be $2.36 \pm 0.11 \pm 0.22 \pm 0.25$, where the first uncertainty is statistical, the second systematic and the third originates from the branching fractions of charm hadron decays. Comment: All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2022-028.html (LHCb public pages)

    Measurement of the ratios of branching fractions $\mathcal{R}(D^{*})$ and $\mathcal{R}(D^{0})$

    The ratios of branching fractions $\mathcal{R}(D^{*})\equiv\mathcal{B}(\bar{B}\to D^{*}\tau^{-}\bar{\nu}_{\tau})/\mathcal{B}(\bar{B}\to D^{*}\mu^{-}\bar{\nu}_{\mu})$ and $\mathcal{R}(D^{0})\equiv\mathcal{B}(B^{-}\to D^{0}\tau^{-}\bar{\nu}_{\tau})/\mathcal{B}(B^{-}\to D^{0}\mu^{-}\bar{\nu}_{\mu})$ are measured, assuming isospin symmetry, using a sample of proton-proton collision data corresponding to 3.0 fb$^{-1}$ of integrated luminosity recorded by the LHCb experiment during 2011 and 2012. The tau lepton is identified in the decay mode $\tau^{-}\to\mu^{-}\nu_{\tau}\bar{\nu}_{\mu}$. The measured values are $\mathcal{R}(D^{*})=0.281\pm0.018\pm0.024$ and $\mathcal{R}(D^{0})=0.441\pm0.060\pm0.066$, where the first uncertainty is statistical and the second is systematic. The correlation between these measurements is $\rho=-0.43$. Results are consistent with the current average of these quantities and are at a combined 1.9 standard deviations from the predictions based on lepton flavor universality in the Standard Model. Comment: All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2022-039.html (LHCb public pages)