Active learning with deep pre-trained models for sequence tagging of clinical and biomedical texts
Active learning is a technique that helps to minimize the annotation budget required for the creation of a labeled dataset while maximizing the performance of a model trained on this dataset. It has been shown that active learning can be successfully applied to sequence tagging tasks in text processing in conjunction with deep learning models, even when a limited amount of labeled data is available. Recent advances in transfer learning methods for natural language processing based on deep pre-trained models such as ELMo and BERT offer a much better ability to generalize on small annotated datasets than their shallow counterparts. The combination of deep pre-trained models and active learning is therefore a powerful approach to dealing with annotation scarcity. In this work, we investigate the potential of this approach on clinical and biomedical data. The experimental evaluation shows that the combination of active learning and deep pre-trained models outperforms the standard methods of active learning. We also suggest a modification to a standard uncertainty sampling strategy and empirically show that it can be beneficial for annotation of very skewed datasets. Finally, we propose an annotation tool empowered with active learning and deep pre-trained models that can be used for entity annotation directly from the Jupyter IDE.
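The uncertainty sampling this abstract builds on can be illustrated with a minimal sketch. This is not the authors' modified strategy, only the standard Maximum Normalized Log-Probability (MNLP) criterion commonly used for active learning in sequence tagging: each unlabeled sentence is scored by the average per-token log-probability of the model's best tag sequence, and the lowest-scoring (most uncertain) sentences are sent to the annotator. The function names and the toy pool below are hypothetical.

```python
import math

def mnlp_score(token_log_probs):
    # Maximum Normalized Log-Probability: average per-token log-probability
    # of the model's most likely tag sequence. Lower means more uncertain.
    # Normalizing by length avoids a bias toward selecting long sentences.
    return sum(token_log_probs) / len(token_log_probs)

def select_batch(unlabeled, batch_size):
    # unlabeled: list of (sentence_id, token_log_probs) pairs.
    # Rank ascending by MNLP so the most uncertain sentences come first.
    ranked = sorted(unlabeled, key=lambda pair: mnlp_score(pair[1]))
    return [sid for sid, _ in ranked[:batch_size]]

# Toy pool of three sentences with per-token log-probabilities:
pool = [
    ("s1", [math.log(0.9), math.log(0.95)]),  # model is confident
    ("s2", [math.log(0.5), math.log(0.4)]),   # model is uncertain
    ("s3", [math.log(0.8), math.log(0.3)]),
]
print(select_batch(pool, 2))  # → ['s2', 's3']
```

In a real loop these scores would come from a tagger's Viterbi decode over the unlabeled pool, and the selected batch would be labeled and added to the training set before the model is retrained.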
Thrombin generation test for evaluation of antiplatelet treatment in patients with coronary artery disease after percutaneous coronary intervention
To study the possibility of using thrombin generation tests in platelet-rich and platelet-poor plasma for evaluation of dual antiplatelet therapy efficacy in patients with coronary artery disease (CAD) following percutaneous coronary intervention. Venous blood was analyzed from CAD patients aged 53–75 years who had undergone percutaneous coronary intervention with stenting within one year and had been receiving standard doses of clopidogrel and aspirin (75 and 75–100 mg per day, respectively). The control group comprised age- and sex-matched subjects without clinical signs of CAD who were not receiving these drugs. Thrombin generation tests were performed in platelet-rich and platelet-poor plasma. Intravascular platelet activation, induced platelet aggregation, and routine coagulation were evaluated. Antiplatelet treatment did not influence the results of routine coagulation tests or intravascular platelet activation. Dual antiplatelet therapy reduced collagen-induced platelet aggregation (44 ± 2.5 vs. 7.9 ± 2.6%, p = 10⁻⁷), decreased the endogenous thrombin potential (1900 ± 85 vs. 1740 ± 95 nM·min, p = 0.0045) and the maximum thrombin concentration (134 ± 9.5 vs. 106 ± 6.5 nM, p = 4·10⁻⁶), and increased the time to peak thrombin (27 ± 1.5 vs. 31 ± 2 min, p = 0.0012). The decrease in thrombin generation rate showed the highest statistical significance (13 ± 2 vs. 7.9 ± 0.8 nM/min, p = 10⁻⁸). Antiplatelet treatment did not alter thrombogram parameters in platelet-poor plasma.
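The thrombogram parameters reported above (endogenous thrombin potential, maximum thrombin concentration, time to peak, and generation rate) are all derived from a single thrombin-concentration curve over time. A minimal sketch of how they could be computed from a discretely sampled curve follows; the curve values, the function name, and the slope-based rate approximation are illustrative assumptions, not the study's data or its exact method.

```python
def thrombogram_parameters(times, thrombin):
    # times: sampling times in minutes; thrombin: concentrations in nM,
    # both describing one thrombin generation curve.
    # Endogenous thrombin potential (ETP): area under the curve, nM·min,
    # approximated here with the trapezoid rule.
    etp = sum((thrombin[i] + thrombin[i + 1]) / 2 * (times[i + 1] - times[i])
              for i in range(len(times) - 1))
    peak = max(thrombin)                       # maximum thrombin concentration, nM
    time_to_peak = times[thrombin.index(peak)] # minutes
    # Generation rate approximated as the steepest upward slope between
    # consecutive samples, nM/min (an assumption for this sketch).
    rate = max((thrombin[i + 1] - thrombin[i]) / (times[i + 1] - times[i])
               for i in range(len(times) - 1))
    return etp, peak, time_to_peak, rate

# Illustrative curve, not the study's data:
t = [0, 5, 10, 15, 20, 25, 30, 35, 40]
c = [0, 0, 10, 60, 120, 100, 60, 30, 10]
print(thrombogram_parameters(t, c))  # → (1925.0, 120, 20, 12.0)
```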
Genetic screening of an endemic mutation in the DYSF gene in an isolated, mountainous population in the Republic of Dagestan
Background: Dysferlinopathy has a high prevalence in relatively isolated ethnic groups where consanguineous marriages are characteristic and/or a founder effect exists. However, the frequency of endemic mutations in most isolates has not been investigated. Methods: The prevalence of the pathogenic DYSF gene variant NM_003494.4:c.200_201delinsAT, p.Val67Asp (rs121908957) was investigated in an isolated Avar population in the Republic of Dagestan. Genetic screening was conducted in a remote mountainous region characterized by a high level of consanguinity among its inhabitants. In total, 746 individuals were screened. Results: This pathogenic DYSF variant causes two primary phenotypes of dysferlinopathy: limb-girdle muscular dystrophy (LGMD) type R2 and Miyoshi muscular dystrophy type 1. The results indicated a high prevalence of the allele, at 14% (95% confidence interval [CI]: 12–17; 138 out of 1518 alleles), while the allele in the homozygous state was detected in 29 cases (3.8%; CI: 2.6–5.4). The population load for dysferlinopathy was 832.3 ± 153.9 per 100,000, against an average prevalence of limb-girdle muscular dystrophies ranging from 0.38 ± 0.38 to 5.93 ± 1.44 per 100,000. Conclusion: The significant burden of the allele was due to inbreeding, as evidenced by a deficiency of heterozygotes and a Wright fixation index of 0.14 (CI: 0.06–0.23).
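The Wright fixation index cited in the conclusion is F_IS = 1 − H_obs/H_exp: observed heterozygosity compared against the Hardy–Weinberg expectation 2pq, with positive values indicating the heterozygote deficit expected under inbreeding. A minimal sketch, using hypothetical genotype counts chosen only to illustrate the formula (the abstract does not report the full genotype table):

```python
def fixation_index(n_ref_hom, n_het, n_alt_hom):
    # Wright's F_IS = 1 - H_obs / H_exp, where H_exp = 2pq is the
    # Hardy-Weinberg expected heterozygosity. Positive values indicate
    # a deficiency of heterozygotes, as expected under inbreeding.
    n = n_ref_hom + n_het + n_alt_hom
    p = (2 * n_alt_hom + n_het) / (2 * n)  # frequency of the variant allele
    h_obs = n_het / n
    h_exp = 2 * p * (1 - p)
    return 1 - h_obs / h_exp

# Hypothetical genotype counts (not the study's data):
print(round(fixation_index(600, 120, 30), 3))  # → 0.242
```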