18 research outputs found
Exact Cross-Validation for k-NN in binary classification, applications to passive and active learning
Treatment patterns and outcomes for pancreatic tumors in children: an analysis of the National Cancer Database
Pancreatic tumors are rare in children and limited data are available regarding incidence, treatment, and outcomes. We aim to describe patient and tumor characteristics and to report on survival of these diseases.
Children with pancreatic tumors were queried from the National Cancer Database (2004-2014). The association between treatment and hazard of death was assessed using Kaplan-Meier method and Cox regression model.
We identified 109 children with pancreatic tumors; 52% were male and median age at diagnosis was 14 years. Tumors were distributed as follows: pseudopapillary neoplasm (30%), endocrine tumors (27%), pancreatoblastoma (16%), pancreatic adenocarcinoma (16%), sarcoma (6%) and neuroblastoma (5%). Seventy-nine patients underwent surgery, of which 76% achieved R0 resection. Most patients (85%) had lymph nodes examined, of which 22% had positive nodes. Five-year overall survival by tumor histology was 95% (pseudopapillary neoplasm), 75% (neuroblastoma), 70% (pancreatoblastoma), 51% (endocrine tumors), 43% (sarcoma), and 34% (adenocarcinoma). On multivariable analysis, surgical resection was the strongest predictor of survival (HR 0.26, 95% CI 0.10-0.68, p < 0.01).
Overall survival of children with pancreatic tumors is grim, with survival rates varying widely among tumor types. Surgical resection is associated with improved long-term survival.
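The Kaplan-Meier method named in this abstract can be illustrated with a minimal sketch. The function name `kaplan_meier` and the toy durations are hypothetical and are not taken from the National Cancer Database analysis; the sketch only shows how the product-limit estimate is built from event and censoring times.

```python
from collections import Counter

def kaplan_meier(durations, events):
    """Product-limit estimate of the survival function.

    durations: follow-up time for each subject.
    events:    1 if death was observed at that time, 0 if censored.
    Returns [(time, survival_probability)] at each distinct death time.
    """
    deaths = Counter(t for t, e in zip(durations, events) if e)
    leaving = Counter(durations)  # every subject leaves the risk set at their own time
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # multiply by the conditional probability of surviving past t
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve

# Toy data (hypothetical): five subjects, two of them censored.
toy_curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
```

Censored subjects contribute to the risk set up to their last follow-up time but never trigger a drop in the curve, which is why the estimate only changes at observed death times.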
Adjacency-constrained hierarchical clustering of a band similarity matrix with application to Genomics
Motivation: Genomic data analyses such as Genome-Wide Association Studies (GWAS) or Hi-C studies are often faced with the problem of partitioning chromosomes into successive regions based on a similarity matrix of high-resolution, locus-level measurements. An intuitive way of doing this is to perform a modified Hierarchical Agglomerative Clustering (HAC), where only adjacent clusters (according to the ordering of positions within a chromosome) are allowed to be merged. A major practical drawback of this method is its quadratic time and space complexity in the number of loci, which is typically of the order of 10^4 to 10^5 for each chromosome.
Results: By assuming that the similarity between physically distant objects is negligible, we propose an implementation of this adjacency-constrained HAC with quasi-linear complexity. Our illustrations on GWAS and Hi-C datasets demonstrate the relevance of this assumption, and show that this method highlights biologically meaningful signals. Thanks to its small time and memory footprint, the method can be run on a standard laptop in minutes or even seconds.
Availability and Implementation: Software and sample data are available as an R package, adjclust, that can be downloaded from the Comprehensive R Archive Network (CRAN).
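The adjacency constraint and the band assumption described above can be sketched in a few lines of Python. This is a naive illustration, not the quasi-linear algorithm of the paper and not the adjclust API: the function name `adjacency_constrained_hac`, the band width parameter `h`, the use of average linkage, and the toy matrix are all assumptions made for the example.

```python
def adjacency_constrained_hac(sim, h):
    """Naive adjacency-constrained agglomerative clustering.

    sim: square symmetric similarity matrix (list of lists).
    h:   band width; sim[i][j] is treated as 0 when |i - j| > h.
    Returns the merges in order, each as a pair of (start, end) intervals.
    """
    n = len(sim)
    clusters = [(i, i) for i in range(n)]  # each cluster is a contiguous interval

    def linkage(a, b):
        # Average similarity between two intervals, ignoring entries outside the band.
        total = 0.0
        for i in range(a[0], a[1] + 1):
            for j in range(b[0], b[1] + 1):
                if abs(i - j) <= h:
                    total += sim[i][j]
        return total / ((a[1] - a[0] + 1) * (b[1] - b[0] + 1))

    merges = []
    while len(clusters) > 1:
        # Only neighbouring clusters (k, k + 1) are candidates for merging.
        best = max(range(len(clusters) - 1),
                   key=lambda k: linkage(clusters[k], clusters[k + 1]))
        left, right = clusters[best], clusters[best + 1]
        merges.append((left, right))
        clusters[best:best + 2] = [(left[0], right[1])]
    return merges

# Toy similarity matrix (hypothetical) with two blocks of similar loci.
toy_sim = [
    [1.0, 0.9, 0.1, 0.0],
    [0.9, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.8],
    [0.0, 0.1, 0.8, 1.0],
]
toy_merges = adjacency_constrained_hac(toy_sim, h=2)
```

Because every cluster stays a contiguous interval, the resulting hierarchy partitions the ordered positions into successive regions. The sketch recomputes each linkage from scratch, so it does not achieve the small time and memory footprint reported in the abstract.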