Redescription Mining and Applications in Bioinformatics
Our ability to interrogate the cell and computationally assimilate its answers is improving at a dramatic pace. For instance, the study of even a focused aspect of cellular activity, such as gene action, now benefits from multiple high-throughput data acquisition technologies such as microarrays, genome-wide deletion screens, and RNAi assays. A critical need is the development of algorithms that can bridge, relate, and unify diverse categories of data descriptors. Redescription mining is such an approach. Given a set of biological objects (e.g., genes, proteins) and a collection of descriptors defined over this set, the goal of redescription mining is to use the given descriptors as a vocabulary and find subsets of data that afford multiple definitions. The premise of redescription mining is that subsets that afford multiple definitions are likely to exhibit concerted behavior and are, hence, interesting. We present algorithms for redescription mining based on formal concept analysis and applications of redescription mining to multiple biological datasets. We demonstrate how redescriptions identify conceptual clusters of data using mutually reinforcing features, without explicit training information.
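The idea of a subset that "affords multiple definitions" can be illustrated with a small sketch. This is not the formal-concept-analysis algorithm the abstract refers to; it is a simplified pairwise stand-in that treats each descriptor as the set of objects it covers and reports descriptor pairs whose sets nearly coincide, assuming Jaccard similarity as the measure of agreement. The descriptor names, gene identifiers, and the `min_jaccard` threshold are all hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def find_redescriptions(descriptors, min_jaccard=0.8):
    """Return (name_a, name_b, score) for descriptor pairs that cover
    nearly the same objects, highest score first.

    `descriptors` maps a descriptor name to the set of objects it holds for.
    Only single-descriptor pairs are compared; no conjunctions, disjunctions,
    or negations are explored in this sketch.
    """
    results = []
    for name_a, name_b in combinations(sorted(descriptors), 2):
        score = jaccard(descriptors[name_a], descriptors[name_b])
        if score >= min_jaccard:
            results.append((name_a, name_b, score))
    return sorted(results, key=lambda r: -r[2])


# Hypothetical toy data: three descriptor families over genes g1..g7.
descriptors = {
    "upregulated_in_heat_shock": {"g1", "g2", "g3", "g4"},
    "lethal_on_deletion": {"g2", "g3", "g4", "g5"},
    "rnai_growth_defect": {"g1", "g2", "g3", "g4"},
    "binds_tf_x": {"g6", "g7"},
}

for name_a, name_b, score in find_redescriptions(descriptors, min_jaccard=0.5):
    print(f"{name_a} <=> {name_b} (Jaccard {score:.2f})")
```

In this toy data, the microarray-style and RNAi-style descriptors pick out exactly the same four genes, so that gene set has two independent definitions. No labels or training information are involved; the agreement between descriptors is the only signal used.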
Towards Neural Numeric-To-Text Generation From Temporal Personal Health Data
With an increased interest in the production of personal health technologies
designed to track user data (e.g., nutrient intake, step counts), there is now
more opportunity than ever to surface meaningful behavioral insights to
everyday users in the form of natural language. This knowledge can increase
their behavioral awareness and allow them to take action to meet their health
goals. It can also bridge the gap between the vast collection of personal
health data and the summary generation required to describe an individual's
behavioral tendencies. Previous work has focused on rule-based time-series data
summarization methods designed to generate natural language summaries of
interesting patterns found within temporal personal health data. We examine
recurrent, convolutional, and Transformer-based encoder-decoder models to
automatically generate natural language summaries from numeric temporal
personal health data. We showcase the effectiveness of our models on real user
health data logged in MyFitnessPal and show that we can automatically generate
high-quality natural language summaries. Our work serves as a first step
towards the ambitious goal of automatically generating novel and meaningful
temporal summaries from personal health data.
Comment: 5 pages, 2 figures, 1 table