60 research outputs found

    Improved shrunken centroid classifiers for high-dimensional class-imbalanced data

    BACKGROUND: PAM, a nearest shrunken centroid method (NSC), is a popular classification method for high-dimensional data. ALP and AHP are NSC algorithms that were proposed to improve upon PAM. The NSC methods base their classification rules on shrunken centroids; in practice the amount of shrinkage is estimated by minimizing the overall cross-validated (CV) error rate. RESULTS: We show that when data are class-imbalanced the three NSC classifiers are biased towards the majority class. The bias is larger when the number of variables or the class imbalance is larger and/or the differences between classes are smaller. To diminish the class-imbalance problem of the NSC classifiers we propose to estimate the amount of shrinkage by maximizing the CV geometric mean of the class-specific predictive accuracies (g-means). CONCLUSIONS: The results obtained on simulated and real high-dimensional class-imbalanced data show that our approach outperforms the currently used strategy based on the minimization of the overall error rate when NSC classifiers are biased towards the majority class. The number of variables included in the NSC classifiers when using our approach is much smaller than with the original approach. This result is supported by experiments on simulated and real high-dimensional class-imbalanced data.
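The selection criterion described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the threshold values, and the toy cross-validated predictions are all hypothetical, and the NSC classifier itself is not reproduced. The sketch only shows how choosing the shrinkage by g-means can differ from choosing it by overall error rate.

```python
import math
from collections import defaultdict


def g_means(y_true, y_pred):
    """Geometric mean of the class-specific predictive accuracies."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    accuracies = [correct[c] / total[c] for c in total]
    return math.prod(accuracies) ** (1.0 / len(accuracies))


def overall_error(y_true, y_pred):
    """Fraction of misclassified samples, ignoring class membership."""
    return sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)


def pick_by_g_means(cv_predictions, y_true):
    """Return the candidate threshold whose CV predictions maximize g-means."""
    return max(cv_predictions, key=lambda thr: g_means(y_true, cv_predictions[thr]))


def pick_by_error(cv_predictions, y_true):
    """Return the candidate threshold whose CV predictions minimize overall error."""
    return min(cv_predictions, key=lambda thr: overall_error(y_true, cv_predictions[thr]))


# Toy data (hypothetical): 8 majority samples, 2 minority samples.
y_true = ["maj"] * 8 + ["min"] * 2
cv_predictions = {
    # Threshold 0.5: everything assigned to the majority class.
    0.5: ["maj"] * 10,
    # Threshold 2.0: 5 of 8 majority and both minority samples correct.
    2.0: ["maj"] * 5 + ["min"] * 3 + ["min"] * 2,
}

best_g = pick_by_g_means(cv_predictions, y_true)
best_err = pick_by_error(cv_predictions, y_true)
```

In this toy case the all-majority predictions have the lower overall error but a g-means of zero, so the two criteria pick different thresholds.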

    Class prediction for high-dimensional class-imbalanced data

    BACKGROUND: The goal of class prediction studies is to develop rules to accurately predict the class membership of new samples. The rules are derived using the values of the variables available for each subject: the main characteristic of high-dimensional data is that the number of variables greatly exceeds the number of samples. Frequently the classifiers are developed using class-imbalanced data, i.e., data sets where the number of samples in each class is not equal. Standard classification methods used on class-imbalanced data often produce classifiers that do not accurately predict the minority class; the prediction is biased towards the majority class. In this paper we investigate whether high dimensionality poses additional challenges when dealing with class-imbalanced prediction. We evaluate the performance of six types of classifiers on class-imbalanced data, using simulated data and a publicly available data set from a breast cancer gene-expression microarray study. We also investigate the effectiveness of some strategies that are available to overcome the effect of class imbalance. RESULTS: Our results show that the evaluated classifiers are highly sensitive to class imbalance and that variable selection introduces an additional bias towards classification into the majority class. Most new samples are assigned to the majority class from the training set, unless the difference between the classes is very large. As a consequence, the class-specific predictive accuracies differ considerably. When the class imbalance is not too severe, down-sizing and asymmetric bagging embedding variable selection work well, while over-sampling does not. Variable normalization can further worsen the performance of the classifiers. CONCLUSIONS: Our results show that matching the prevalence of the classes in training and test set does not guarantee good performance of classifiers and that the problems related to classification with class-imbalanced data are exacerbated when dealing with high-dimensional data. Researchers using class-imbalanced data should be careful in assessing the predictive accuracy of the classifiers and, unless the class imbalance is mild, they should always use an appropriate method for dealing with the class imbalance problem.
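One of the strategies named in this abstract, down-sizing, can be sketched as follows. The sketch assumes that down-sizing means randomly under-sampling every class to the size of the smallest class before training; the abstract does not spell out the procedure, and the function name `down_size` and its parameters are hypothetical.

```python
import random
from collections import defaultdict


def down_size(samples, labels, seed=0):
    """Randomly under-sample each class to the size of the smallest class.

    Assumption: this is one common reading of "down-sizing"; the paper's
    exact procedure may differ.
    """
    rng = random.Random(seed)
    indices_by_class = defaultdict(list)
    for i, label in enumerate(labels):
        indices_by_class[label].append(i)
    n_min = min(len(idx) for idx in indices_by_class.values())
    keep = []
    for idx in indices_by_class.values():
        keep.extend(rng.sample(idx, n_min))  # sample without replacement
    keep.sort()  # preserve the original sample order
    return [samples[i] for i in keep], [labels[i] for i in keep]


# Toy data (hypothetical): 8 majority samples and 2 minority samples.
samples = list(range(10))
labels = ["maj"] * 8 + ["min"] * 2
balanced_samples, balanced_labels = down_size(samples, labels)
```

The balanced training set discards majority-class samples, which is the price paid for equal class sizes. With only a few minority samples the discarded information can be substantial.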

    News

    Detailed description of the classifiers. In the Additional file we provide a description of each classifier used in the paper. (PDF 201 kb)

    Increased intima-media thickness and decreased ankle-brachial index in patients after myocardial infarction

    Background: Increased carotid intima-media thickness (IMT) and a decreased ankle-brachial index (ABI) are considered early markers of atherosclerosis. The aim of the study was to determine the significance of carotid IMT and ABI as markers of atherosclerosis in patients with a previous acute myocardial infarction (AMI). Methods: We studied two groups of subjects. The test group comprised 50 patients with a previous AMI, aged 38 to 78 years, 32 men and 18 women. The control group, which also comprised 50 subjects, all without ischemic heart disease, was comparable to the test group in age, sex and place of residence. Intima-media thickness was measured by high-resolution B-mode ultrasound on the far wall at three different sites of the carotid arteries: in the common carotid artery 1 cm below the start of the bifurcation, in the bulb, and in the internal carotid artery 1 cm beyond the bifurcation. Peripheral arterial disease (PAD) of the lower limbs, defined as an ABI of 0.9 or less, was identified by measuring systolic blood pressure with an ultrasound Doppler detector. Results: In the test group, 32 (64%) subjects had a maximal intima-media thickness above 0.9 mm, whereas in the control group an IMT above 0.9 mm was found in 16 (32%) subjects (p < 0.05). The IMT in subjects with a previous AMI (median 0.85, interquartile range 0.72–0.95) was statistically significantly greater than in the control group, 0.74 (0.67–0.86) (p < 0.05). Patients with AMI and PAD had a significantly greater IMT, 0.93 (0.86–1.03), than patients with AMI without PAD, 0.73 (0.65–0.84) (p < 0.001). The difference in IMT between patients with ST-elevation AMI (STEMI), 0.84 (0.70–0.96), and patients with non-ST-elevation AMI (NSTEMI), 0.89 (0.79–0.95), was not statistically significant. PAD was found in 24 (48%) patients with a previous AMI and in 10 (20%) subjects in the control group (p < 0.05). Conclusion: Patients with a previous AMI had a significantly greater intima-media thickness and a higher prevalence of PAD than the control group. In subjects of both the test and control groups we found increased intima-media thickness and a decreased ankle-brachial index as markers of atherosclerosis.

    A Prospective Cohort Study on Cardiotoxicity of Adjuvant Trastuzumab Therapy in Breast Cancer Patients

    Background: Cardiotoxicity is an important side effect of trastuzumab therapy and cardiac surveillance is recommended

    Physical activity, screen time and the COVID-19 school closures in Europe – an observational study in 10 countries

    To date, few data on how the COVID-19 pandemic and restrictions affected children’s physical activity in Europe have been published. This study examined the prevalence and correlates of physical activity and screen time in a large sample of European children during the COVID-19 pandemic to inform strategies and provide adequate mitigation measures. An online survey was conducted using convenience sampling from 15 May to 22 June 2020. Parents were eligible if they resided in one of the survey countries and their children were aged 6–18 years. 8395 children were included (median age [IQR], 13 [10–15] years; 47% boys; 57.6% urban residents; 15.5% in self-isolation). Approximately two-thirds followed structured routines (66.4% [95%CI, 65.4–67.4]), and more than half were active during online P.E. (56.6% [95%CI, 55.5–57.6]). 19.0% (95%CI, 18.2–19.9) met the WHO Global physical activity recommendation. Total screen time in excess of 2 h/day was highly prevalent (weekdays: 69.5% [95%CI, 68.5–70.5]; weekend: 63.8% [95%CI, 62.7–64.8]). Playing outdoors more than 2 h/day, following a daily routine and being active in online P.E. increased the odds of healthy levels of physical activity and screen time, particularly in mildly affected countries. In severely affected countries, online P.E. contributed most to meeting the screen time recommendation, whereas outdoor play was most important for adequate physical activity. Promoting safe and responsible outdoor activities, safeguarding P.E. lessons during distance learning and setting pre-planned, consistent daily routines are important in helping children maintain a healthy, active lifestyle in a pandemic situation. These factors should be prioritised by policymakers, schools and parents.
    Highlights
    • To our knowledge, our data provide the first multi-national estimates on physical activity and total screen time in European children roughly two months after COVID-19 was declared a global pandemic.
    • Only 1 in 5 children met the WHO Global physical activity recommendations.
    • Under pandemic conditions, parents should set pre-planned, consistent daily routines and integrate at least 2 hours of outdoor activities into the daily schedule, preferably every day. Schools should make P.E. lessons a priority. Decision makers should mandate that online P.E. be delivered by schools during distance learning. Closing outdoor facilities for physical activity should be considered only as a last resort during lockdowns.

    European fitness landscape for children and adolescents: updated reference values, fitness maps and country rankings based on nearly 8 million test results from 34 countries gathered by the FitBack network

    OBJECTIVES (1) To develop reference values for health-related fitness in European children and adolescents aged 6-18 years that are the foundation for the web-based, open-access and multilanguage fitness platform (FitBack); (2) to provide comparisons across European countries. METHODS This study builds on a previous large fitness reference study in European youth by (1) widening the age demographic, (2) identifying the most recent and representative country-level data and (3) including national data from existing fitness surveillance and monitoring systems. We used the Assessing Levels of PHysical Activity and fitness at population level (ALPHA) test battery as it comprises tests with the highest test-retest reliability, criterion/construct validity and health-related predictive validity: the 20 m shuttle run (cardiorespiratory fitness); handgrip strength and standing long jump (muscular strength); and body height, body mass, body mass index and waist circumference (anthropometry). Percentile values were obtained using the generalised additive models for location, scale and shape method. RESULTS A total of 7 966 693 test results from 34 countries (106 datasets) were used to develop sex-specific and age-specific percentile values. In addition, country-level rankings based on mean percentiles are provided for each fitness test, as well as an overall fitness ranking. Finally, an interactive fitness platform, including individual and group reporting and European fitness maps, is provided and freely available online (www.fitbackeurope.eu). CONCLUSION This study discusses the major implications of fitness assessment in youth from health, educational and sport perspectives, and how the FitBack reference values and interactive web-based platform contribute to it. Fitness testing can be conducted in school and/or sport settings, and the interpreted results be integrated in the healthcare systems across Europe
