Machine Learning and Integrative Analysis of Biomedical Big Data.
Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) is analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges and exacerbates those associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: the curse of dimensionality, data heterogeneity, missing data, class imbalance, and scalability issues.
Curriculum Guidelines for Undergraduate Programs in Data Science
The Park City Math Institute (PCMI) 2016 Summer Undergraduate Faculty Program
met for the purpose of composing guidelines for undergraduate programs in Data
Science. The group consisted of 25 undergraduate faculty from a variety of
institutions in the U.S., primarily from the disciplines of mathematics,
statistics and computer science. These guidelines are meant to provide some
structure for institutions planning for or revising a major in Data Science.
Integrating Prosodic and Lexical Cues for Automatic Topic Segmentation
We present a probabilistic model that uses both prosodic and lexical cues for
the automatic segmentation of speech into topically coherent units. We propose
two methods for combining lexical and prosodic information using hidden Markov
models and decision trees. Lexical information is obtained from a speech
recognizer, and prosodic features are extracted automatically from speech
waveforms. We evaluate our approach on the Broadcast News corpus, using the
DARPA-TDT evaluation metrics. Results show that the prosodic model alone is
competitive with word-based segmentation methods. Furthermore, we achieve a
significant reduction in error by combining the prosodic and word-based
knowledge sources. Comment: 27 pages, 8 figures
E-methods in literary production: integrating e-learning in creative writing
This paper discusses the integration of e-learning in creative writing. The online approach to the teaching of creative writing takes into account today’s Malaysian youth and their fascination with computer technology. It is this appeal of innovation in electronics and knowledge that leads an educator to design an online approach to a creative writing course. The theoretical construct used to support the discussion is Anderson’s theory that online learning is knowledge-, community-, assessment-, and learner-centered. The writer, who is also the course developer, analyzes a poetry-writing activity that students undertake and the e-portfolio used in the course. To analyze the processes involved in this creative writing exercise, Macherey’s (1978) Theory of Literary Production is adapted and utilized. This theory, which regards literary production as a process imitating that of a production line, provides the methodology and conceptual framework for analyzing the raw materials collected by the students and their transformation during the writing process. This paper thus addresses the benefits of e-learning in a creative writing context.