183 research outputs found
Substructure Discovery in SUBDUE
Coordinated Science Laboratory was formerly known as Control Systems Laboratory. National Science Foundation / NSF IST-85-11170; Office of Naval Research / N00014-82-K-0186; Defense Advanced Research Projects Agency / N00014-87-K-0874; Texas Instruments, Inc.
Parallel Knowledge Discovery from Large Complex Databases
NASA is focusing on grand challenge problems in Earth and space sciences. Within these areas of science, new instrumentation will be providing scientists with unprecedented amounts of unprocessed data. Our goal is to design and implement a system that takes raw data as input and efficiently discovers interesting concepts that can target areas for further investigation and can be used to compress the data. Our approach will provide an intelligent parallel data analysis system
Generalized Query-Based Active Learning to Identify Differentially Methylated Regions in DNA
Active learning is a supervised learning technique that reduces the number of examples required to build a successful classifier, because the learner can choose the data it learns from. This technique holds promise for many biological domains in which classified examples are expensive and time-consuming to obtain. Most traditional active learning methods ask the Oracle (e.g., a human expert) very specific queries to label an unlabeled example. The example may consist of numerous features, many of which are irrelevant. Removing such features creates a shorter query containing only relevant features, which is easier for the Oracle to answer. We propose a generalized query-based active learning (GQAL) approach that constructs generalized queries based on multiple instances. By constructing appropriately generalized queries, we can achieve higher accuracy than traditional active learning methods. We apply our active learning method to find differentially methylated regions (DMRs) in DNA. DMRs are DNA locations in the genome that are known to be involved in tissue differentiation, epigenetic regulation, and disease. We also apply our method to 13 other data sets and show that it outperforms another popular active learning technique.
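The idea of a generalized query can be illustrated with a toy sketch. This is not the GQAL algorithm from the abstract, whose details are not given here. It only assumes binary features, a rule that keeps a feature when every selected instance agrees on it, and a hypothetical `oracle` whose hidden concept depends on feature 0 alone. All function names and data are made up for illustration.

```python
from typing import Optional, Sequence, Tuple

# A query is a tuple of feature values; None marks a feature left out of the query.
Query = Tuple[Optional[int], ...]


def generalize(instances: Sequence[Sequence[int]]) -> Query:
    """Keep a feature value only if every instance agrees on it (assumed rule)."""
    first = instances[0]
    return tuple(
        value if all(inst[i] == value for inst in instances) else None
        for i, value in enumerate(first)
    )


def matches(query: Query, instance: Sequence[int]) -> bool:
    """True if the instance agrees with every feature the query specifies."""
    return all(q is None or q == x for q, x in zip(query, instance))


def oracle(query: Query) -> Optional[int]:
    """Hypothetical oracle: the hidden concept is 'feature 0 equals 1'.

    Returns None when the query does not specify feature 0.
    """
    return query[0]


# Toy unlabeled pool with four binary features per instance.
pool = [(1, 0, 1, 0), (1, 1, 1, 0), (1, 0, 0, 1), (0, 1, 1, 0)]

# Build one generalized query from the first three instances and ask the oracle once.
query = generalize(pool[:3])
label = oracle(query)

# A single answer labels every pool instance covered by the generalized query.
labeled = [(x, label) for x in pool if matches(query, x)]
```

In this toy setting the query specifies one feature instead of four, and one answer from the oracle labels three instances at once.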
Toward Intelligent Machine Learning Algorithms
Coordinated Science Laboratory was formerly known as Control Systems Laboratory. National Science Foundation / NSF IST-85-11170; Office of Naval Research / N00014-82-K-0186; Defense Advanced Research Projects Agency / N00014-87-K-0874; Texas Instruments, Inc.
Multiple generation distinct toxicant exposures induce epigenetic transgenerational inheritance of enhanced pathology and obesity
Three successive generations of rats were exposed to different toxicants and then bred to the transgenerational F5 generation to assess the impacts of distinct exposures across multiple generations. The current study examines the actions of the agricultural fungicide vinclozolin on the F0 generation, followed by exposure of the F1 generation to a jet fuel hydrocarbon mixture, and then exposure of the F2 generation gestating females to the pesticide dichlorodiphenyltrichloroethane (DDT). The subsequent F3 and F4 generations and the transgenerational F5 generation were obtained, and the F1-F5 generations were examined for sperm epigenetic alterations and for pathology in males and females. Significant impacts on the sperm differential DNA methylation regions were observed. The F3-F5 generations were similar in ∼50% of the DNA methylation regions. The pathology of each generation was assessed in the testis, ovary, kidney, and prostate, as well as the presence of obesity and tumors. Pathology was assessed using a newly developed deep learning, artificial intelligence-based histopathology analysis. Observations demonstrated compounded disease impacts in obesity and metabolic parameters, but other pathologies plateaued with smaller increases at the F5 transgenerational generation. Observations demonstrate that multiple generational exposures, which occur in human populations, appear to increase epigenetic impacts and disease susceptibility.