DNA methylation-based age prediction and telomere length in white blood cells and cumulus cells of infertile women with normal or poor response to ovarian stimulation.
An algorithm assessing the methylation levels of 353 informative CpG sites in the human genome permits accurate prediction of the chronologic age of a subject. Interestingly, when there is a discrepancy between the predicted age and chronologic age (age acceleration, or AgeAccel), patients are at risk for morbidity and mortality. Identification of infertile patients at risk for accelerated reproductive senescence may permit preventative action. This study aimed to assess the accuracy of the epigenetic clock concept in reproductive-age women undergoing fertility treatment by applying the age prediction algorithm in peripheral (white blood cells [WBCs]) and follicular somatic cells (cumulus cells [CCs]), and to identify whether women with premature reproductive aging (diminished ovarian reserve) were at risk of AgeAccel in their age prediction. Results indicated that the epigenetic algorithm accurately predicts age when applied to WBCs but not to CCs. The predicted age of CCs was substantially younger than chronologic age regardless of the patient's age or response to stimulation. In addition, telomeres of CCs were significantly longer than those of WBCs. Our findings suggest that CCs do not demonstrate changes in methylome-predicted age or telomere length in association with increasing female age or ovarian response to stimulation.
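As a rough illustration of the quantity the abstract calls AgeAccel, the sketch below computes it as the simple difference between predicted and chronologic age. This is only one reading of "discrepancy"; the abstract does not give the exact definition used in the study, and the function name and sample values are hypothetical.

```python
from statistics import mean


def age_acceleration(predicted_age, chronological_age):
    """Return predicted (epigenetic) age minus chronological age, in years.

    Assumption: AgeAccel is taken as a plain difference. The study may use
    a different definition (for example, a regression residual).
    """
    return predicted_age - chronological_age


# Hypothetical (predicted, chronological) age pairs, invented for illustration.
samples = [(34.0, 35.0), (29.5, 31.0), (41.0, 38.0)]

accel = [age_acceleration(p, c) for p, c in samples]
mean_accel = mean(accel)

for (p, c), a in zip(samples, accel):
    print(f"predicted={p:5.1f}  chronological={c:5.1f}  AgeAccel={a:+.1f}")
print(f"mean AgeAccel = {mean_accel:+.3f}")
```

A positive value means the tissue looks older than the subject's calendar age; a negative value means it looks younger, as reported for CCs.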
Predicting Alzheimer's risk: why and how?
Because the pathologic processes that underlie Alzheimer's disease (AD) appear to start 10 to 20 years before symptoms develop, there is currently intense interest in developing techniques to accurately predict which individuals are most likely to become symptomatic. Several AD risk prediction strategies - including identification of biomarkers and neuroimaging techniques and development of risk indices that combine traditional and non-traditional risk factors - are being explored. Most AD risk prediction strategies developed to date have had moderate prognostic accuracy but are limited by two key issues. First, they do not explicitly model mortality along with AD risk and, therefore, do not differentiate individuals who are likely to develop symptomatic AD prior to death from those who are likely to die of other causes. This is critically important so that any preventive treatments can be targeted to maximize the potential benefit and minimize the potential harm. Second, AD risk prediction strategies developed to date have not explored the full range of predictive variables (biomarkers, imaging, and traditional and non-traditional risk factors) over the full preclinical period (10 to 20 years). Sophisticated modeling techniques such as hidden Markov models may enable the development of a more comprehensive AD risk prediction algorithm by combining data from multiple cohorts. As the field moves forward, it will be critically important to develop techniques that simultaneously model the risk of mortality as well as the risk of AD over the full preclinical spectrum and to consider the potential harm as well as the benefit of identifying and treating high-risk older patients.
Leveraging Low-Rank Relations Between Surrogate Tasks in Structured Prediction
We study the interplay between surrogate methods for structured prediction
and techniques from multitask learning designed to leverage relationships
between surrogate outputs. We propose an efficient algorithm based on trace
norm regularization which, differently from previous methods, does not require
explicit knowledge of the coding/decoding functions of the surrogate framework.
As a result, our algorithm can be applied to the broad class of problems in
which the surrogate space is large or even infinite dimensional. We study
excess risk bounds for trace norm regularized structured prediction, implying
the consistency and learning rates for our estimator. We also identify relevant
regimes in which our approach can enjoy better generalization performance than
previous methods. Numerical experiments on ranking problems indicate that
enforcing low-rank relations among surrogate outputs may indeed provide a
significant advantage in practice. Comment: 42 pages, 1 table
Fast Ridge Regression with Randomized Principal Component Analysis and Gradient Descent
We propose a new two-stage algorithm, LING, for large-scale regression
problems. LING has the same risk as the well-known Ridge Regression under the
fixed design setting and can be computed much faster. Our experiments have
shown that LING performs well in terms of both prediction accuracy and
computational efficiency compared with other large-scale regression algorithms
like Gradient Descent, Stochastic Gradient Descent and Principal Component
Regression on both simulated and real datasets.
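For context on one of the baselines the abstract names, here is a minimal sketch of ridge regression solved by plain batch gradient descent. It is not LING (the two-stage method is not specified in the abstract); the objective scaling, step size, iteration count and toy data are all assumptions chosen for illustration.

```python
def ridge_gradient_descent(X, y, lam, lr=0.01, steps=5000):
    """Minimize (1/n) * ||Xw - y||^2 + lam * ||w||^2 by batch gradient descent.

    X is a list of feature rows, y a list of targets. The learning rate and
    step count are illustrative defaults, not tuned values.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(steps):
        # Residual of the current linear prediction for every sample.
        residuals = [
            sum(wj * xj for wj, xj in zip(w, row)) - yi
            for row, yi in zip(X, y)
        ]
        # Gradient of the data term plus gradient of the ridge penalty.
        grad = [
            (2.0 / n) * sum(r * row[j] for r, row in zip(residuals, X))
            + 2.0 * lam * w[j]
            for j in range(d)
        ]
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


# Toy one-feature dataset (hypothetical) so the answer can be checked by hand.
X = [[1.0], [2.0], [3.0]]
y = [2.0, 4.0, 6.0]
lam = 0.1

w = ridge_gradient_descent(X, y, lam)

# Closed-form ridge solution for a single feature under the same objective.
n = len(X)
closed_form = (sum(row[0] * yi for row, yi in zip(X, y)) / n) / (
    sum(row[0] ** 2 for row in X) / n + lam
)
print(w[0], closed_form)
```

The penalty shrinks the fitted slope slightly below the unregularized value of 2, and the iterative and closed-form answers agree once gradient descent has converged.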
High-Resolution Road Vehicle Collision Prediction for the City of Montreal
Road accidents are a major issue in modern societies, responsible
for millions of deaths and injuries worldwide every year. In Quebec alone,
road accidents were responsible for 359 deaths and 33 thousand injuries
in 2018. In this paper, we show how one can leverage open datasets of a city
like Montreal, Canada, to create high-resolution accident prediction models,
using big data analytics. Compared to other studies in road accident
prediction, we have a much higher prediction resolution, i.e., our models
predict the occurrence of an accident within an hour, on road segments defined
by intersections. Such models could be used in the context of road accident
prevention, but also to identify key factors that can lead to a road accident,
and consequently, help elaborate new policies.
We tested various machine learning methods to deal with the severe class
imbalance inherent to accident prediction problems. In particular, we
implemented the Balanced Random Forest algorithm, a variant of the Random
Forest machine learning algorithm, in Apache Spark. Interestingly, we found that
in our case, Balanced Random Forest does not perform significantly better than
Random Forest.
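The sampling idea commonly associated with Balanced Random Forest can be sketched as follows: each tree is trained on a bootstrap sample that draws the same number of examples from every class, which undersamples the majority class. This is a plain-Python illustration under that assumption, not the authors' Apache Spark implementation; the function name and the 95/5 class split are hypothetical.

```python
import random


def balanced_bootstrap(labels, rng):
    """Return sample indices for one tree, with equal draws from each class.

    Each class contributes as many indices as the smallest class has
    members, drawn with replacement.
    """
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    minority_size = min(len(idxs) for idxs in by_class.values())

    sample = []
    for idxs in by_class.values():
        # random.Random.choices samples with replacement.
        sample.extend(rng.choices(idxs, k=minority_size))
    return sample


# Hypothetical imbalanced labels: 95 negatives (no accident), 5 positives.
labels = [0] * 95 + [1] * 5
rng = random.Random(0)

sample = balanced_bootstrap(labels, rng)
counts = {c: sum(1 for i in sample if labels[i] == c) for c in (0, 1)}
print(counts)
```

A full forest would repeat this draw once per tree and fit an ordinary decision tree on each balanced sample.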
Experimental results show that 85% of road vehicle collisions are detected by
our model with a false positive rate of 13%. The examples identified as
positive are likely to correspond to high-risk situations. In addition, we
identify the most important predictors of vehicle collisions for the area of
Montreal: the count of accidents on the same road segment during previous
years, the temperature, the day of the year, the hour and the visibility.
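The two figures quoted above, the share of collisions detected and the false positive rate, are standard confusion-matrix ratios. The sketch below shows how they are usually computed; the labels and predictions are made-up toy values, not data from the paper.

```python
def detection_and_false_positive_rate(y_true, y_pred):
    """Return (detection rate, false positive rate) for binary labels.

    Detection rate = TP / (TP + FN), also called recall.
    False positive rate = FP / (FP + TN).
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp / (tp + fn), fp / (fp + tn)


# Hypothetical ground truth (1 = collision) and model predictions.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

detection_rate, false_positive_rate = detection_and_false_positive_rate(y_true, y_pred)
print(f"detection rate = {detection_rate:.2f}")
print(f"false positive rate = {false_positive_rate:.2f}")
```

With severe class imbalance, even a modest false positive rate can correspond to many more false alarms than true detections, which is why both numbers are reported together.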
