Mutation Clusters from Cancer Exome
We apply our statistically deterministic machine learning/clustering
algorithm *K-means (recently developed in https://ssrn.com/abstract=2908286) to
10,656 published exome samples for 32 cancer types. A majority of cancer types
exhibit mutation clustering structure. Our results are in-sample stable. They
are also out-of-sample stable when applied to 1,389 published genome samples
across 14 cancer types. In contrast, we find in- and out-of-sample
instabilities in cancer signatures extracted from exome samples via nonnegative
matrix factorization (NMF), a computationally costly and non-deterministic
method. Extracting stable mutation structures from exome data could have
important implications for speed and cost, which are critical for early-stage
cancer diagnostics such as novel blood-test methods currently in development.
Comment: 84 page
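The contrast drawn in the abstract between deterministic and non-deterministic methods can be illustrated with a small sketch. The code below is not the *K-means algorithm and not NMF; it is plain Lloyd-style k-means with random initial centers, run on made-up two-dimensional toy data, and every name in it (`kmeans`, `canonical`, `distinct_partitions`, `points`) is hypothetical. It counts how many different partitions appear across random seeds, which is one simple way to probe whether a clustering is stable in-sample.

```python
import random


def kmeans(points, k, seed, iters=50):
    """Plain Lloyd's k-means with randomly sampled initial centers (toy sketch)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center
        # (squared Euclidean distance).
        labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
            for p in points
        ]
        # Update step: move each center to the mean of its members;
        # an empty cluster keeps its previous center.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def canonical(labels):
    """Relabel clusters by order of first appearance so that two runs
    yielding the same partition compare equal."""
    mapping = {}
    return tuple(mapping.setdefault(lab, len(mapping)) for lab in labels)


def distinct_partitions(points, k, seeds):
    """Set of distinct partitions produced across the given seeds."""
    return {canonical(kmeans(points, k, s)) for s in seeds}


# Made-up toy data: three noisy blobs in the plane (assumption, not from the paper).
data_rng = random.Random(0)
points = [
    (cx + data_rng.gauss(0, 1.0), cy + data_rng.gauss(0, 1.0))
    for cx, cy in [(0, 0), (4, 0), (2, 3)]
    for _ in range(10)
]

parts = distinct_partitions(points, k=3, seeds=range(20))
print(f"{len(parts)} distinct partition(s) across 20 random initializations")
```

A count of one means every initialization converged to the same clustering on this data; a larger count means the result depends on the random start, which is the kind of instability a deterministic procedure avoids by construction.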