
529 research outputs found

Deep Metric Loss for Multimodal Learning

Authors: Lee Hyunju, Moon Sehwan
Publication date: 21/08/2023

Multimodal learning often outperforms its unimodal counterparts by exploiting unimodal contributions and cross-modal interactions. However, focusing only on integrating multimodal features into a unified comprehensive representation overlooks the unimodal characteristics. In real data, the contributions of modalities can vary from instance to instance, and they often reinforce or conflict with each other. In this study, we introduce a novel MultiModal loss paradigm for multimodal learning, which subgroups instances according to their unimodal contributions. MultiModal loss can prevent inefficient learning caused by overfitting and efficiently optimize multimodal models. On synthetic data, MultiModal loss demonstrates improved classification performance by subgrouping difficult instances within certain modalities. On four real multimodal datasets, our loss is empirically shown to improve the performance of recent models. Ablation studies verify the effectiveness of our loss. Additionally, we show that our loss generates a reliable prediction score for each modality, which is essential for subgrouping. Our MultiModal loss is a novel loss function to subgroup instances according to the contribution of modalities in multimodal learning and is applicable to a variety of multimodal models with unimodal decisions. Our code is available at https://github.com/SehwanMoon/MultiModalLoss.
Comment: 18 pages, 9 figures

arXiv.org e-Print Archive

Using Astronomical Photographs to Investigate Misconceptions about Galaxies and Spectra: Question Development for Clicker Use

Authors: Lee Hyunju, Schneider Stephen E.
Publication venue: American Physical Society (APS)
Publication date: 01/07/2015

Many topics in introductory astronomy at the college or high-school level rely implicitly on astronomical photographs and visual data used in class. However, students bring many preconceptions to these materials that ultimately lead to misconceptions, and little research has examined how students interpret astronomical images. In this study we probed college students' understanding of astronomical photographs and visual data about galaxies and spectra, and developed a set of concept questions based on their common misconceptions. The study was conducted mainly in three successive surveys: 1) open-ended questions looking for students' ideas and common misconceptions, 2) combined multiple-choice and open-ended questions exploring student reasoning and improving the concept questions for clickers, and 3) a finalized version of the concept questions used to investigate the strength of each misconception among students in introductory astronomy courses. This study reports on the procedures and the development of the concept questions, together with the common misconceptions about galaxies and spectra that were investigated. We also provide the set of developed questions for teachers and instructors who wish to use them in class for formative assessment with classroom response systems. These questions should help them recognize the gap between their teaching and students' understanding, and ultimately improve teaching of the concepts.
Comment: published in PRST-PER (July 13, 2015)

arXiv.org e-Print Archive

Directory of Open Access Journals

iStarDB (The Astronomy Education Research Repository)

Body Information Analysis based Personal Exercise Management System

Authors: Jung Hoekyung, Lee Hyunju, Lee Jongwon, Yu Donggyun
Publication venue: Institute of Advanced Engineering and Science
Publication date: 01/04/2018

Interest in health has been growing, and health-related systems are being developed in response. Existing exercise management systems provide users with exercise-related information via PC or smartphone. However, the algorithms they use to analyze the user's body information and provide information have low accuracy. In this paper, we analyze users' body mass index (BMI) and basal metabolic rate (BMR) and propose a system that provides the user with the necessary information through a recommendation algorithm. It informs the user of exercise intensity and amount, and graphs the user's exercise history. It also allows the user to refer to the fitness history of other users in the same BMI group. This allows the user to receive more personalized services than existing exercise management systems provide, thereby enabling efficient exercise.
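
The abstract above names BMI and BMR as the inputs to the recommendation step but does not give the formulas. As a rough sketch only: BMI below uses the standard definition (weight in kilograms divided by height in metres squared). The Mifflin-St Jeor equation for BMR and the WHO-style BMI cut-offs are assumptions chosen for illustration, and the paper may use different ones. All function names are hypothetical.

```python
def bmi(weight_kg, height_cm):
    """Body mass index: weight (kg) divided by height (m) squared."""
    height_m = height_cm / 100.0
    return weight_kg / (height_m ** 2)


def bmr_mifflin_st_jeor(weight_kg, height_cm, age_years, is_male):
    """Basal metabolic rate in kcal/day.

    Assumption: Mifflin-St Jeor equation; the paper does not state
    which BMR formula its system uses.
    """
    base = 10.0 * weight_kg + 6.25 * height_cm - 5.0 * age_years
    return base + 5.0 if is_male else base - 161.0


def bmi_group(value):
    """Map a BMI value to a coarse group.

    Assumption: WHO-style cut-offs; the paper's own grouping is not given.
    """
    if value < 18.5:
        return "underweight"
    if value < 25.0:
        return "normal"
    if value < 30.0:
        return "overweight"
    return "obese"
```

Grouping users with `bmi_group(bmi(weight_kg, height_cm))` is one way the "same BMI group" comparison mentioned in the abstract could be keyed.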

IAES journal

Crossref

ZENODO

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Institute of Advanced Engineering and Science

An integrated approach to the prediction of domain-domain interactions

Authors: Chen Ting, Deng Minghua, Lee Hyunju, Sun Fengzhu
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: The development of high-throughput technologies has produced several large-scale protein interaction data sets for multiple species, and significant efforts have been made to analyze these data sets in order to understand protein activities. Considering that the basic units of protein interactions are domain interactions, it is crucial to understand protein interactions at the level of the domains. The availability of many diverse biological data sets provides an opportunity to discover the underlying domain interactions within protein interactions through an integration of these biological data sets.
RESULTS: We combine protein interaction data sets from multiple species, molecular sequences, and gene ontology to construct a set of high-confidence domain-domain interactions. First, we propose a new measure, the expected number of interactions for each pair of domains, to score domain interactions based on protein interaction data in one species, and show that it performs similarly to the E-value defined by Riley et al. [1]. Our new measure is applied to the protein interaction data sets from yeast, worm, fruit fly and humans. Second, information on pairs of domains that coexist in known proteins and on pairs of domains with the same gene ontology function annotations is incorporated to construct a high-confidence set of domain-domain interactions using a Bayesian approach. Finally, we evaluate the set of domain-domain interactions by comparing predicted domain interactions with those defined in the iPfam database [2,3], which were derived from protein structures. The accuracy of the predicted domain interactions is also confirmed by comparison with experimentally obtained domain interactions from H. pylori [4]. As a result, a total of 2,391 high-confidence domain interactions are obtained, and these domain interactions are used to unravel detailed protein and domain interactions in several protein complexes.
CONCLUSION: Our study shows that integration of multiple biological data sets based on the Bayesian approach provides a reliable framework to predict domain interactions. By integrating multiple data sources, the coverage and accuracy of predicted domain interactions can be significantly increased.

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Prediction of plant promoters based on hexamers and random triplet pair analysis

Authors: Azad A K M, Lee Hyunju, Noman Nasimul, Shahid Saima
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: With an increasing number of plant genome sequences, it has become important to develop a robust computational method for detecting plant promoters. Although a wide variety of programs are currently available, their prediction accuracy still requires further improvement. The limitations of these methods can be addressed by selecting appropriate features for distinguishing promoters from non-promoters.
Methods: In this study, we proposed two feature selection approaches based on hexamer sequences: the Frequency Distribution Analyzed Feature Selection Algorithm (FDAFSA) and the Random Triplet Pair Feature Selecting Genetic Algorithm (RTPFSGA). In FDAFSA, adjacent triplet-pairs (hexamer sequences) were selected based on the difference in the frequency of hexamers between promoters and non-promoters. In RTPFSGA, random triplet-pairs (RTPs) were selected by exploiting a genetic algorithm that distinguishes frequencies of non-adjacent triplet pairs between promoters and non-promoters. Then, a support vector machine (SVM), a nonlinear machine-learning algorithm, was used to classify promoters and non-promoters by combining these two feature selection approaches. We referred to this novel algorithm as PromoBot.
Results: Promoter sequences were collected from the PlantProm database. Non-promoter sequences were collected from plant mRNA, rRNA, and tRNA of PlantGDB and plant miRNA of miRBase. Then, in order to validate the proposed algorithm, we applied a 5-fold cross-validation test. Training data sets were used to select features based on FDAFSA and RTPFSGA, and these features were used to train the SVM. We achieved 89% sensitivity and 86% specificity.
Conclusions: We compared our PromoBot algorithm to five other algorithms. PromoBot's sensitivity and specificity were comparable to, or better than, those of the algorithms tested. These results show that the two proposed feature selection methods, based on hexamer frequencies and random triplet-pairs, could be successfully incorporated into a supervised machine learning method for the promoter classification problem. As such, we expect that PromoBot can be used to help identify new plant promoters. Source code and analysis results of this work can be provided upon request.
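
The idea of selecting hexamers by how differently they occur in promoters and non-promoters can be illustrated with a small sketch. This is not the authors' FDAFSA implementation, whose details are not given in the abstract. It assumes a simple sliding-window count and ranks hexamers by the absolute difference in relative frequency. The function names and the `top_n` parameter are hypothetical.

```python
from collections import Counter


def hexamer_frequencies(sequences, k=6):
    """Relative frequency of each k-mer (hexamer by default),
    counted with a sliding window over all sequences."""
    counts = Counter()
    total = 0
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
            total += 1
    if total == 0:
        return {}
    return {kmer: n / total for kmer, n in counts.items()}


def top_discriminative_hexamers(promoters, non_promoters, top_n=10):
    """Rank hexamers by |freq in promoters - freq in non-promoters|.

    Assumption: absolute frequency difference as the score; the
    published algorithm may weight or filter hexamers differently.
    """
    f_pos = hexamer_frequencies(promoters)
    f_neg = hexamer_frequencies(non_promoters)
    kmers = set(f_pos) | set(f_neg)
    scored = [(abs(f_pos.get(k, 0.0) - f_neg.get(k, 0.0)), k) for k in kmers]
    # Highest score first; ties broken alphabetically for determinism.
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [kmer for _, kmer in scored[:top_n]]
```

In a pipeline like the one described, the frequencies of the selected hexamers in each sequence would then serve as the feature vector passed to a classifier such as an SVM.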

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central