Prediction of inherited genomic susceptibility to 20 common cancer types by a supervised machine-learning method.
Prevention and early intervention are the most effective ways of avoiding or minimizing psychological, physical, and financial suffering from cancer. However, such proactive action requires the ability to predict the individual's susceptibility to cancer with a measure of probability. Of the triad of cancer-causing factors (inherited genomic susceptibility, environmental factors, and lifestyle factors), the inherited genomic component may be derivable from the recent public availability of a large body of whole-genome variation data. However, genome-wide association studies have so far shown limited success in predicting the inherited susceptibility to common cancers. We present here a multiple classification approach for predicting individuals' inherited genomic susceptibility to acquire the most likely phenotype among a panel of 20 major common cancer types plus 1 "healthy" type by application of a supervised machine-learning method under competing conditions among the cohorts of the 21 types. This approach suggests that, depending on the phenotypes of 5,919 individuals of "white" ethnic population in this study, (i) the portion of the cohort of a cancer type who acquired the observed type due to mostly inherited genomic susceptibility factors ranges from about 33 to 88% (or its corollary: the portion due to mostly environmental and lifestyle factors ranges from 12 to 67%), and (ii) on an individual level, the method also predicts individuals' inherited genomic susceptibility to acquire the other types ranked with associated probabilities. These probabilities may provide practical information for individuals, health professionals, and health policymakers related to prevention and/or early intervention of cancer.
Drug prescription support in dental clinics through drug corpus mining
The rapid increase in the volume and variety of data poses a challenge to safe drug prescription for the dentist. The increasing number of patients who take multiple drugs puts further pressure on the dentist to make the right decision at the point of care. Hence, a robust decision support system will enable dentists to make decisions on drug prescription quickly and accurately. Based on the assumption that similar drug pairs have a higher similarity ratio, this paper suggests an innovative approach to obtain the similarity ratio between the drug that the dentist is going to prescribe and the drug that the patient is currently taking. We conducted experiments to obtain the similarity ratios of both positive and negative drug pairs, using feature vectors generated from term similarities and word embeddings of a biomedical text corpus. This model can be easily adapted and implemented for use in a dental clinic to assist the dentist in deciding whether a drug is suitable for prescription, taking into consideration the medical profile of the patient. Experimental evaluation of our model’s association of the similarity ratio between two drugs yielded a superior F score of 89%. Hence, such an approach, when integrated within the clinical workflow, will reduce prescription errors and thereby improve the health outcomes of patients.
Estimating good discrete partitions from observed data: symbolic false nearest neighbors
A symbolic analysis of observed time series data requires making a discrete
partition of a continuous state space containing observations of the dynamics.
A particular kind of partition, called "generating", preserves all dynamical
information of a deterministic map in the symbolic representation, but such
partitions are not obvious beyond one dimension, and existing methods to find
them require significant knowledge of the dynamical evolution operator or the
spectrum of unstable periodic orbits. We introduce a statistic and algorithm to
refine empirical partitions for symbolic state reconstruction. This method
optimizes an essential property of a generating partition: avoiding topological
degeneracies. It requires only the observed time series and is sensible even in
the presence of noise when no truly generating partition is possible. Because
of its resemblance to a geometrical statistic frequently used for
reconstructing valid time-delay embeddings, we call the algorithm "symbolic
false nearest neighbors".
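To make the notion of a discrete partition concrete, here is a minimal sketch of the symbolization step only: a scalar time series is mapped to symbols by a set of cell boundaries, and short symbol words are then read off. It does not implement the symbolic false nearest neighbors statistic itself. The logistic map, the boundary at 0.5, and the function names `logistic_series`, `symbolize`, and `words` are illustrative assumptions, not taken from the paper.

```python
import bisect


def logistic_series(x0, n, r=4.0):
    """Generate n observations of the logistic map (an assumed toy system)."""
    xs = []
    x = x0
    for _ in range(n):
        xs.append(x)
        x = r * x * (1.0 - x)
    return xs


def symbolize(series, boundaries):
    """Map each observation to the index of the partition cell containing it.

    `boundaries` must be sorted; a value equal to a boundary goes to the
    cell on its right.
    """
    return [bisect.bisect_right(boundaries, x) for x in series]


def words(symbols, length):
    """Return all consecutive symbol words of the given length."""
    return [tuple(symbols[i:i + length])
            for i in range(len(symbols) - length + 1)]


if __name__ == "__main__":
    series = logistic_series(0.3, 20)
    symbols = symbolize(series, [0.5])  # binary partition at 0.5 (assumed)
    print(symbols)
    print(words(symbols, 3)[:5])
```

A refinement procedure such as the one the abstract describes would adjust the boundaries; in this sketch they are fixed by hand.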
Nearest Neighbor and Kernel Survival Analysis: Nonasymptotic Error Bounds and Strong Consistency Rates
We establish the first nonasymptotic error bounds for Kaplan-Meier-based
nearest neighbor and kernel survival probability estimators where feature
vectors reside in metric spaces. Our bounds imply rates of strong consistency
for these nonparametric estimators and, up to a log factor, match an existing
lower bound for conditional CDF estimation. Our proof strategy also yields
nonasymptotic guarantees for nearest neighbor and kernel variants of the
Nelson-Aalen cumulative hazards estimator. We experimentally compare these
methods on four datasets. We find that for the kernel survival estimator, a
good choice of kernel is one learned using random survival forests. Comment: International Conference on Machine Learning (ICML 2019)
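As a rough illustration of a Kaplan-Meier-based nearest neighbor survival estimator, the sketch below finds the k training points closest to a query and applies the standard Kaplan-Meier product to those neighbors alone. Euclidean distance, the function name `knn_kaplan_meier`, and the toy data are assumptions made for illustration; the paper works in general metric spaces, and its exact estimator and tie-handling may differ.

```python
import math


def knn_kaplan_meier(train_x, train_time, train_event, query, k, t):
    """Estimate the survival probability S(t | query) from k nearest neighbors.

    train_x     : list of feature tuples
    train_time  : observed times (event or censoring time)
    train_event : 1 if the event was observed, 0 if censored
    """
    # Rank training points by Euclidean distance to the query (assumed metric).
    order = sorted(range(len(train_x)),
                   key=lambda i: math.dist(train_x[i], query))
    nbrs = order[:k]

    # Distinct observed event times among the neighbors, up to time t.
    event_times = sorted({train_time[i] for i in nbrs
                          if train_event[i] and train_time[i] <= t})

    surv = 1.0
    for s in event_times:
        at_risk = sum(1 for i in nbrs if train_time[i] >= s)
        deaths = sum(1 for i in nbrs
                     if train_event[i] and train_time[i] == s)
        surv *= 1.0 - deaths / at_risk
    return surv


if __name__ == "__main__":
    xs = [(0.0,), (0.1,), (0.2,), (5.0,)]
    times = [1, 2, 3, 1]
    events = [1, 0, 1, 1]
    print(knn_kaplan_meier(xs, times, events, (0.0,), k=3, t=2.5))
```

A kernel variant would replace the hard neighbor set with distance-based weights in the at-risk and event counts.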