529 research outputs found
Deep Metric Loss for Multimodal Learning
Multimodal learning often outperforms its unimodal counterparts by exploiting
unimodal contributions and cross-modal interactions. However, focusing only on
integrating multimodal features into a unified comprehensive representation
overlooks the unimodal characteristics. In real data, the contributions of
modalities can vary from instance to instance, and they often reinforce or
conflict with each other. In this study, we introduce a novel MultiModal
loss paradigm for multimodal learning, which subgroups instances according to
their unimodal contributions. MultiModal loss can prevent inefficient
learning caused by overfitting and efficiently optimize multimodal models. On
synthetic data, MultiModal loss demonstrates improved classification
performance by subgrouping difficult instances within certain modalities. On
four real multimodal datasets, our loss is empirically shown to improve the
performance of recent models. Ablation studies verify the effectiveness of our
loss. Additionally, we show that our loss generates a reliable prediction score
for each modality, which is essential for subgrouping. Our MultiModal
loss is a novel loss function to subgroup instances according to the
contribution of modalities in multimodal learning and is applicable to a
variety of multimodal models with unimodal decisions. Our code is available at
https://github.com/SehwanMoon/MultiModalLoss. Comment: 18 pages, 9 figures
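The abstract does not spell out how instances are subgrouped. As a rough illustration only, the sketch below assumes that an instance's subgroup is the set of modalities whose unimodal prediction matches the true label. The function name, the modality names, and this grouping criterion are all hypothetical and are not taken from the paper or its repository.

```python
from typing import Dict, List, Sequence, Tuple


def subgroup_by_unimodal_correctness(
    unimodal_preds: Dict[str, Sequence[int]],
    labels: Sequence[int],
) -> Dict[Tuple[str, ...], List[int]]:
    """Group instance indices by which modalities predict the label correctly.

    ASSUMPTION: "unimodal contribution" is approximated here by whether each
    modality's own prediction is correct; the paper may define it differently.
    """
    groups: Dict[Tuple[str, ...], List[int]] = {}
    for i, y in enumerate(labels):
        # Sorted tuple of the modalities that are correct on instance i.
        key = tuple(sorted(m for m, preds in unimodal_preds.items() if preds[i] == y))
        groups.setdefault(key, []).append(i)
    return groups


# Hypothetical toy predictions from two unimodal heads.
toy_preds = {"audio": [1, 0, 1], "video": [1, 1, 0]}
toy_groups = subgroup_by_unimodal_correctness(toy_preds, labels=[1, 1, 1])
```

Under this assumption, an instance on which no modality is correct falls into the empty-tuple group, which would mark it as difficult for every modality.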
Using Astronomical Photographs to Investigate Misconceptions about Galaxies and Spectra: Question Development for Clicker Use
Many topics in introductory astronomy at the college or high-school level
rely implicitly on using astronomical photographs and visual data in class.
However, students bring many preconceptions to their understanding of these
materials that ultimately lead to misconceptions, and research about students'
interpretation of astronomical images remains scarce. In this
study we probed college students' understanding of astronomical photographs and
visual data about galaxies and spectra, and developed a set of concept
questions based on their common misconceptions. The study was conducted mainly
in three successive surveys: 1) open-ended questions looking for students'
ideas and common misconceptions, 2) combined multiple-choice and open-ended
questions seeking to explore student reasoning and to improve concept questions
for clickers, and 3) a finalized version of the concept questions used to
investigate the strength of each misconception among the students in
introductory astronomy courses. This study reports on the procedures and the
development of the concept questions with the investigated common
misconceptions about galaxies and spectra. We also provide the set of developed
questions for teachers and instructors seeking to implement them in their classes
for the purpose of formative assessment with the use of classroom response
systems. These questions would help them recognize the gap between their
teaching and students' understanding, and ultimately improve teaching of the
concepts. Comment: published in PRST-PER (July 13, 2015)
Body Information Analysis based Personal Exercise Management System
Interest in health has been growing, and health-related systems are being developed in response. Existing exercise management systems provide users with exercise-related information via a PC or smartphone. However, the algorithms they use to analyze the user's body information and supply that information have low accuracy. In this paper, we analyze users' body mass index (BMI) and basal metabolic rate (BMR) and propose a system that provides the user with the necessary information through a recommendation algorithm. It informs the user of exercise intensity and momentum, and graphs the user's exercise history. It also allows the user to refer to the fitness history of other users in the same BMI group. This allows the user to receive more personalized services than existing exercise management systems provide, thereby enabling efficient exercise.
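The two quantities the system analyzes can be computed as in the minimal sketch below. BMI is the standard weight-over-height-squared ratio. The abstract does not say which BMR formula or BMI grouping the authors use, so the Mifflin-St Jeor equation and the WHO adult BMI categories are assumptions here, and all function names are hypothetical.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kilograms divided by height in metres squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def bmi_group(value: float) -> str:
    """WHO adult BMI categories (ASSUMPTION: the paper's grouping may differ)."""
    if value < 18.5:
        return "underweight"
    if value < 25.0:
        return "normal"
    if value < 30.0:
        return "overweight"
    return "obese"


def bmr_mifflin_st_jeor(weight_kg: float, height_cm: float,
                        age_years: float, sex: str) -> float:
    """Basal metabolic rate in kcal/day using the Mifflin-St Jeor equation.

    ASSUMPTION: the abstract does not name its BMR formula; this is one
    commonly used estimate.
    """
    if sex not in ("male", "female"):
        raise ValueError("sex must be 'male' or 'female'")
    base = 10.0 * weight_kg + 6.25 * height_cm - 5.0 * age_years
    return base + 5.0 if sex == "male" else base - 161.0


# Hypothetical user: 70 kg, 1.75 m, 30 years old.
example_bmi = bmi(70.0, 1.75)
example_group = bmi_group(example_bmi)
example_bmr = bmr_mifflin_st_jeor(70.0, 175.0, 30.0, "male")
```

A recommender along the lines described could then use `bmi_group` as the key for looking up the exercise history of other users in the same group.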
An integrated approach to the prediction of domain-domain interactions
BACKGROUND: The development of high-throughput technologies has produced several large-scale protein interaction data sets for multiple species, and significant efforts have been made to analyze these data sets in order to understand protein activities. Because the basic units of protein interactions are domain interactions, it is crucial to understand protein interactions at the level of domains. The availability of many diverse biological data sets provides an opportunity to discover the domain interactions underlying protein interactions by integrating these data sets. RESULTS: We combine protein interaction data sets from multiple species, molecular sequences, and gene ontology to construct a set of high-confidence domain-domain interactions. First, we propose a new measure, the expected number of interactions for each pair of domains, to score domain interactions based on protein interaction data in one species, and show that it performs similarly to the E-value defined by Riley et al. [1]. Our new measure is applied to the protein interaction data sets from yeast, worm, fruit fly and humans. Second, information on pairs of domains that coexist in known proteins and on pairs of domains with the same gene ontology function annotations is incorporated to construct a high-confidence set of domain-domain interactions using a Bayesian approach. Finally, we evaluate the set of domain-domain interactions by comparing predicted domain interactions with those defined in the iPfam database [2,3], which were derived from protein structures. The accuracy of the predicted domain interactions is also confirmed by comparison with experimentally obtained domain interactions from H. pylori [4]. As a result, a total of 2,391 high-confidence domain interactions are obtained, and these domain interactions are used to unravel detailed protein and domain interactions in several protein complexes.
CONCLUSION: Our study shows that integration of multiple biological data sets based on the Bayesian approach provides a reliable framework to predict domain interactions. By integrating multiple data sources, the coverage and accuracy of predicted domain interactions can be significantly increased.
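The abstract does not specify the Bayesian model used to integrate the evidence sources. One common way to combine heterogeneous evidence, shown here purely as an assumed illustration, is a naive Bayes update in which each source contributes a likelihood ratio and the sources are treated as conditionally independent. The function name, the prior, and the example ratios are hypothetical and are not values from the paper.

```python
from math import prod
from typing import Sequence


def posterior_probability(prior: float, likelihood_ratios: Sequence[float]) -> float:
    """Combine independent evidence sources via posterior odds.

    posterior odds = prior odds * product of likelihood ratios,
    where each ratio is P(evidence | interacting) / P(evidence | not interacting).
    ASSUMPTION: sources are conditionally independent (naive Bayes).
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    if any(lr < 0.0 for lr in likelihood_ratios):
        raise ValueError("likelihood ratios must be non-negative")
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * prod(likelihood_ratios)
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical domain pair: 1% prior, with two supporting evidence sources
# (e.g. an interaction-data score and a shared gene ontology annotation).
example_posterior = posterior_probability(0.01, [10.0, 5.0])
```

A pair would then be called high-confidence when its posterior exceeds a chosen threshold; the threshold itself is a design choice that the abstract does not state.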
Prediction of plant promoters based on hexamers and random triplet pair analysis
BACKGROUND: With an increasing number of plant genome sequences, it has become important to develop a robust computational method for detecting plant promoters. Although a wide variety of programs are currently available, their prediction accuracy still requires improvement. The limitations of these methods can be addressed by selecting appropriate features for distinguishing promoters from non-promoters. METHODS: In this study, we proposed two feature selection approaches based on hexamer sequences: the Frequency Distribution Analyzed Feature Selection Algorithm (FDAFSA) and the Random Triplet Pair Feature Selecting Genetic Algorithm (RTPFSGA). In FDAFSA, adjacent triplet-pairs (hexamer sequences) were selected based on the difference in the frequency of hexamers between promoters and non-promoters. In RTPFSGA, random triplet-pairs (RTPs) were selected by exploiting a genetic algorithm that distinguishes frequencies of non-adjacent triplet pairs between promoters and non-promoters. Then, a support vector machine (SVM), a nonlinear machine-learning algorithm, was used to classify promoters and non-promoters by combining these two feature selection approaches. We referred to this novel algorithm as PromoBot. RESULTS: Promoter sequences were collected from the PlantProm database. Non-promoter sequences were collected from plant mRNA, rRNA, and tRNA of PlantGDB and plant miRNA of miRBase. To validate the proposed algorithm, we applied a 5-fold cross-validation test. Training data sets were used to select features based on FDAFSA and RTPFSGA, and these features were used to train the SVM. We achieved 89% sensitivity and 86% specificity. CONCLUSIONS: We compared our PromoBot algorithm to five other algorithms. PromoBot's sensitivity and specificity were comparable to, or better than, those of the algorithms tested.
These results show that the two proposed feature selection methods, based on hexamer frequencies and random triplet pairs, can be successfully incorporated into a supervised machine learning method for the promoter classification problem. As such, we expect that PromoBot can be used to help identify new plant promoters. Source code and analysis results of this work are available upon request.
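The frequency-difference idea behind FDAFSA can be sketched as follows. This is not the authors' implementation: it assumes overlapping hexamer windows, ranks hexamers by the absolute difference in relative frequency between the two classes, and keeps the top k. The function names, the toy sequences, and the top-k rule are hypothetical, and the genetic-algorithm (RTPFSGA) and SVM stages are omitted.

```python
from collections import Counter
from typing import Dict, List, Sequence


def hexamer_frequencies(sequences: Sequence[str]) -> Dict[str, float]:
    """Relative frequency of each overlapping 6-mer across a set of sequences."""
    counts: Counter = Counter()
    total = 0
    for seq in sequences:
        seq = seq.upper()
        # Slide a window of length 6; sequences shorter than 6 contribute nothing.
        for i in range(len(seq) - 5):
            counts[seq[i:i + 6]] += 1
            total += 1
    return {h: c / total for h, c in counts.items()} if total else {}


def select_hexamers(promoters: Sequence[str],
                    non_promoters: Sequence[str], k: int) -> List[str]:
    """Keep the k hexamers whose frequency differs most between the classes.

    ASSUMPTION: absolute frequency difference with a top-k cutoff; the paper
    may use a different statistic or threshold.
    """
    fp = hexamer_frequencies(promoters)
    fn = hexamer_frequencies(non_promoters)
    diffs = {h: abs(fp.get(h, 0.0) - fn.get(h, 0.0)) for h in set(fp) | set(fn)}
    # Sort by descending difference, breaking ties alphabetically.
    return sorted(diffs, key=lambda h: (-diffs[h], h))[:k]


def featurize(seq: str, hexamers: Sequence[str]) -> List[float]:
    """Feature vector: frequency of each selected hexamer within one sequence."""
    freqs = hexamer_frequencies([seq])
    return [freqs.get(h, 0.0) for h in hexamers]


# Hypothetical toy sequences, not real promoter data.
selected = select_hexamers(["TATAAATATAAA"], ["GCGCGCGCGCGC"], k=3)
vector = featurize("TATAAA", ["TATAAA", "GCGCGC"])
```

Vectors produced by `featurize` could then be passed to any SVM implementation for training and cross-validation.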