13 research outputs found
Reliably Filter Drug-Induced Liver Injury Literature With Natural Language Processing and Conformal Prediction
Drug-induced liver injury describes the adverse effects of drugs that damage the liver. Life-threatening outcomes have been reported in severe cases. Therefore, liver toxicity is an important assessment for new drug candidates. These reports are documented in research papers that contain preliminary in vitro and in vivo experiments. Conventionally, data extraction from publications relies on resource-demanding manual labeling, which limits the efficiency of information extraction. The development of natural language processing techniques enables the automatic processing of biomedical texts. Herein, based on around 28,000 papers (titles and abstracts) provided by the Critical Assessment of Massive Data Analysis challenge, this study benchmarked model performance on filtering liver-damage-related literature. Among five text embedding techniques, the model using term frequency-inverse document frequency (TF-IDF) and logistic regression outperformed the others with an accuracy of 0.957 on the validation set. Furthermore, an ensemble model with similar overall performance was developed by fitting a logistic regression model on the predicted probabilities given by separate models with different vectorization techniques. The ensemble model achieved a high accuracy of 0.954 and an F1 score of 0.955 on the hold-out validation data in the challenge. Moreover, important words in positive/negative predictions were identified via model interpretation. The prediction reliability was quantified with conformal prediction, which gives users control over the prediction uncertainty. Overall, the ensemble model and the TF-IDF model reached satisfactory classification results and can be used by researchers to rapidly filter literature that describes events related to liver injury induced by medications.
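The way conformal prediction lets a user trade coverage for certainty can be illustrated with a minimal sketch of inductive conformal prediction for a binary classifier. The nonconformity measure (1 minus the probability assigned to a label), the function names, and the toy calibration values below are assumptions made for illustration; they are not taken from the paper.

```python
# Minimal sketch of inductive conformal prediction for a binary filter.
# Assumption: nonconformity = 1 - probability the model gives to a label.
# All numbers are toy values, not results from the study.

def nonconformity(prob_positive, label):
    """Return 1 minus the probability assigned to `label` (1 = DILI-related)."""
    p_label = prob_positive if label == 1 else 1.0 - prob_positive
    return 1.0 - p_label

def p_value(calibration_scores, score):
    """Fraction of calibration scores at least as nonconforming as `score`."""
    n_at_least = sum(1 for s in calibration_scores if s >= score)
    return (n_at_least + 1) / (len(calibration_scores) + 1)

def prediction_set(calibration_scores, prob_positive, significance):
    """Labels whose p-value exceeds the user-chosen significance level."""
    return {
        label
        for label in (0, 1)
        if p_value(calibration_scores, nonconformity(prob_positive, label)) > significance
    }

# Hypothetical held-out calibration set: (predicted P(positive), true label).
calibration = [(0.95, 1), (0.90, 1), (0.80, 1), (0.60, 1),
               (0.10, 0), (0.20, 0), (0.30, 0), (0.45, 0), (0.70, 0)]
calibration_scores = [nonconformity(p, y) for p, y in calibration]

confident = prediction_set(calibration_scores, 0.97, significance=0.2)
uncertain = prediction_set(calibration_scores, 0.50, significance=0.1)
print(confident, uncertain)
```

A lower significance level yields larger, more cautious prediction sets; a set containing both labels signals that the model cannot commit at that level.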
An electronic nose-based assistive diagnostic prototype for lung cancer detection with conformal prediction
Reliability-based cleaning of noisy training labels with inductive conformal prediction in multi-modal biomedical data mining
Accurately labeling biomedical data presents a challenge. Traditional
semi-supervised learning methods often under-utilize available unlabeled data.
To address this, we propose a novel reliability-based training data cleaning
method employing inductive conformal prediction (ICP). This method capitalizes
on a small set of accurately labeled training data and leverages ICP-calculated
reliability metrics to rectify mislabeled data and outliers within vast
quantities of noisy training data. The efficacy of the method is validated
across three classification tasks within distinct modalities: filtering
drug-induced-liver-injury (DILI) literature with title and abstract, predicting
ICU admission of COVID-19 patients through CT radiomics and electronic health
records, and subtyping breast cancer using RNA-sequencing data. Varying levels
of noise were introduced into the training labels through label permutation.
Results show significant enhancements in classification performance: accuracy
enhancement in 86 out of 96 DILI experiments (up to 11.4%), AUROC and AUPRC
enhancements in all 48 COVID-19 experiments (up to 23.8% and 69.8%), and
accuracy and macro-average F1 score improvements in 47 out of 48 RNA-sequencing
experiments (up to 74.6% and 89.0%). Our method offers the potential to
substantially boost classification performance in multi-modal biomedical
machine learning tasks. Importantly, it accomplishes this without necessitating
an excessive volume of meticulously curated training data.
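One way such reliability-based cleaning could work is sketched below: ICP p-values computed against a small, trusted calibration set decide whether a noisy label is kept, replaced, or discarded as an outlier. The decision rule, the two thresholds, and all names and numbers are hypothetical; the abstract does not specify them.

```python
# Sketch of label cleaning driven by ICP p-values.
# Assumptions (not from the abstract): nonconformity = 1 - class probability,
# and the keep/relabel thresholds below are arbitrary illustrative choices.

def icp_p_values(calibration_scores, class_probs):
    """Return one p-value per candidate label."""
    n = len(calibration_scores)
    p_values = {}
    for label, prob in class_probs.items():
        score = 1.0 - prob
        n_at_least = sum(1 for s in calibration_scores if s >= score)
        p_values[label] = (n_at_least + 1) / (n + 1)
    return p_values

def clean_label(calibration_scores, class_probs, noisy_label,
                keep_threshold=0.25, relabel_threshold=0.8):
    """Keep a plausible label, relabel a confident mismatch, else return None."""
    p_values = icp_p_values(calibration_scores, class_probs)
    if p_values[noisy_label] >= keep_threshold:
        return noisy_label            # given label is plausible: keep it
    best = max(p_values, key=p_values.get)
    if p_values[best] >= relabel_threshold:
        return best                   # another label is highly credible: rectify
    return None                       # no credible label: treat as an outlier

# Nonconformity scores from a small, accurately labeled calibration set (toy values).
clean_scores = [0.05, 0.1, 0.15, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]

print(clean_label(clean_scores, {"DILI": 0.95, "non-DILI": 0.05}, "non-DILI"))
print(clean_label(clean_scores, {"DILI": 0.45, "non-DILI": 0.55}, "DILI"))
print(clean_label(clean_scores, {"A": 0.34, "B": 0.33, "C": 0.33}, "B"))
```

Samples returned as `None` would be dropped from the training set in this sketch; the remaining samples would be used to retrain the classifier.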
Toward more accurate and generalizable brain deformation estimators for traumatic brain injury detection with unsupervised domain adaptation
Machine learning head models (MLHMs) are developed to estimate brain
deformation for early detection of traumatic brain injury (TBI). However,
overfitting to simulated impacts and the lack of generalizability caused by
distributional shift across different head impact datasets hinder the broad
clinical application of current MLHMs. We propose brain deformation estimators
that integrate unsupervised domain adaptation with a deep neural network to
predict whole-brain maximum principal strain (MPS) and MPS rate (MPSR). With
12,780 simulated head impacts, we performed unsupervised domain adaptation on
on-field head impacts from 302 college football (CF) impacts and 457 mixed
martial arts (MMA) impacts using domain regularized component analysis (DRCA)
and cycle-GAN-based methods. The new model improved the MPS/MPSR estimation
accuracy, with the DRCA method significantly outperforming other domain
adaptation methods in prediction accuracy (p<0.001): MPS RMSE: 0.027 (CF) and
0.037 (MMA); MPSR RMSE: 7.159 (CF) and 13.022 (MMA). On another two hold-out
test sets with 195 college football impacts and 260 boxing impacts, the DRCA
model significantly outperformed the baseline model without domain adaptation
in MPS and MPSR estimation accuracy (p<0.001). The DRCA domain adaptation
reduces the MPS/MPSR estimation error to be well below TBI thresholds, enabling
accurate brain deformation estimation to detect TBI in future clinical
applications.
Padded Helmet Shell Covers in American Football: A Comprehensive Laboratory Evaluation with Preliminary On-Field Findings
Protective headgear effects measured in the laboratory may not always
translate to the field. In this study, we evaluated the impact attenuation
capabilities of a commercially available padded helmet shell cover in the
laboratory and field. In the laboratory, we evaluated the efficacy of the
padded helmet shell cover in attenuating impact magnitude across six impact
locations and three impact velocities when equipped to three different helmet
models. In a preliminary on-field investigation, we used instrumented
mouthguards to monitor head impact magnitude in collegiate linebackers during
practice sessions while not wearing the padded helmet shell covers (i.e., bare
helmets) for one season and whilst wearing the padded helmet shell covers for
another season. The addition of the padded helmet shell cover was effective in
attenuating the magnitude of angular head accelerations and two brain injury
risk metrics (DAMAGE, HARM) across most laboratory impact conditions, but did
not significantly attenuate linear head accelerations for all helmets. Overall,
HARM values were reduced in laboratory impact tests by an average of 25% at 3.5
m/s (range: 9.7 - 39.6%), 18% at 5.5 m/s (range: -5.5 - 40.5%), and 10% at 7.4
m/s (range: -6.0 - 31.0%). However, on the field, no significant differences in
any measure of head impact magnitude were observed between the bare helmet
impacts and padded helmet impacts. Further laboratory tests were conducted to
evaluate the ability of the padded helmet shell cover to maintain its
performance after exposure to repeated, successive impacts and across a range
of temperatures. This research provides a detailed assessment of padded helmet
shell covers and supports the continuation of in vivo helmet research to
validate laboratory testing results.
Comment: 49 references, 8 figures
Classification of head impacts based on the spectral density of measurable kinematics
Traumatic brain injury can be caused by head impacts, but many brain injury
risk estimation models are less accurate across the variety of impacts that
patients may undergo. We investigated the spectral characteristics of different
head impact types with kinematics classification. Data was analyzed from 3,262
head impacts from lab reconstruction, American football, mixed martial arts,
and publicly available car crash data. A random forest classifier with spectral
densities of linear acceleration and angular velocity was built to classify
head impact types (e.g., football), reaching a median accuracy of 96% over
1,000 random partitions of training and test sets. To test the classifier on
data from different measurement devices, another 271 lab-reconstructed impacts
were obtained from 5 other instrumented mouthguards with the classifier
reaching over 96% accuracy. The most important features in the classification
included both low-frequency and high-frequency features, and both linear
acceleration features and angular velocity features. Different head impact
types had different distributions of spectral densities in low-frequency and
high-frequency ranges (e.g., the spectral densities of MMA impacts were higher
in high-frequency range than in the low-frequency range). Finally, with the
classifier, type-specific, nearest-neighbor regression models were built for
95th percentile maximum principal strain, 95th percentile maximum principal
strain in corpus callosum, and cumulative strain damage (15th percentile). These
models showed generally higher R^2 values than the baseline models. The classifier enables
a better understanding of the impact kinematics in different sports, and it can
be applied to evaluate the quality of impact-simulation systems and on-field
data augmentation. Key words: traumatic brain injury, head impacts,
classification, impact kinematics
Comment: 16 pages, 5 figures
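To illustrate how spectral-density features might be extracted from a kinematic trace before classification, the sketch below computes a one-sided periodogram with a plain discrete Fourier transform and sums it into a low and a high frequency band. The 50 Hz band boundary, the sampling rate, and the synthetic sine-wave signal are assumptions for illustration; the classifier itself (a random forest in the paper) is not shown.

```python
import cmath
import math

def power_spectral_density(signal, sampling_rate_hz):
    """One-sided periodogram of a real signal via a direct (O(n^2)) DFT."""
    n = len(signal)
    freqs, psd = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        power = abs(coeff) ** 2 / (sampling_rate_hz * n)
        if 0 < k < n / 2:
            power *= 2  # fold in the mirrored negative-frequency bin
        freqs.append(k * sampling_rate_hz / n)
        psd.append(power)
    return freqs, psd

def band_power(freqs, psd, low_hz, high_hz):
    """Integrate the PSD over the half-open band [low_hz, high_hz)."""
    bin_width = freqs[1] - freqs[0]
    return sum(p for f, p in zip(freqs, psd) if low_hz <= f < high_hz) * bin_width

# Synthetic stand-in for one channel of angular velocity: a 20 Hz sine
# sampled at 1000 Hz for 100 ms (not real impact data).
sampling_rate_hz = 1000
trace = [math.sin(2 * math.pi * 20 * t / sampling_rate_hz) for t in range(100)]

freqs, psd = power_spectral_density(trace, sampling_rate_hz)
peak_frequency = freqs[psd.index(max(psd))]
features = {
    "low_band_power": band_power(freqs, psd, 0, 50),      # assumed band split
    "high_band_power": band_power(freqs, psd, 50, 501),
}
print(peak_frequency, features)
```

In practice a feature vector like `features` would be computed per channel (linear acceleration and angular velocity) and passed to the classifier; an FFT routine would replace the direct DFT for longer signals.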
Predictive Factors of Kinematics in Traumatic Brain Injury from Head Impacts Based on Statistical Interpretation
Brain tissue deformation resulting from head impacts is primarily caused by
rotation and can lead to traumatic brain injury. To quantify brain injury risk
based on measurements of kinematics on the head, finite element (FE) models and
various brain injury criteria based on different factors of these kinematics
have been developed, but the contribution of different kinematic factors has
not been comprehensively analyzed across different types of head impacts in a
data-driven manner. To better design brain injury criteria, the predictive
power of rotational kinematics factors, which are different in 1) the
derivative order (angular velocity, angular acceleration, angular jerk), 2) the
direction and 3) the power (e.g., square-rooted, squared, cubic) of the angular
velocity, were analyzed based on different datasets including laboratory
impacts, American football, mixed martial arts (MMA), NHTSA automobile
crashworthiness tests and NASCAR crash events. Ordinary least squares
regressions were built from kinematics factors to the 95% maximum principal
strain (MPS95), and we compared zero-order correlation coefficients, structure
coefficients, commonality analysis, and dominance analysis. The angular
acceleration, the magnitude, and the first power factors showed the highest
predictive power for the majority of impacts, including laboratory impacts and
American football impacts, with few exceptions (angular velocity for MMA and
NASCAR impacts). The predictive power of rotational kinematics in three
directions (x: posterior-to-anterior, y: left-to-right, z:
superior-to-inferior) varied with different sports and types of
head impacts.
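Of the statistics compared above, the zero-order correlation coefficient is the simplest: it is the Pearson correlation between a single kinematic factor and MPS95, ignoring all other factors. A minimal sketch follows; the factor names and all numbers are invented for illustration and are not data or results from the study.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def rank_factors(factors, outcome):
    """Sort factors by the absolute value of their zero-order correlation."""
    correlations = {name: pearson_r(values, outcome)
                    for name, values in factors.items()}
    return sorted(correlations.items(), key=lambda item: abs(item[1]), reverse=True)

# Invented toy values for four impacts (hypothetical factor names).
mps95 = [0.10, 0.20, 0.30, 0.40]
factors = {
    "peak_angular_velocity": [10, 30, 20, 40],
    "peak_angular_acceleration": [1000, 2000, 3000, 4000],
}

ranking = rank_factors(factors, mps95)
print(ranking)
```

Zero-order correlations ignore overlap between predictors, which is why the study also compares structure coefficients, commonality analysis, and dominance analysis.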