Noise-aware Learning from Web-crawled Image-Text Data for Image Captioning
Image captioning is one of the straightforward tasks that can take advantage
of large-scale web-crawled data which provides rich knowledge about the visual
world for a captioning model. However, since web-crawled data contains
image-text pairs that are aligned at different levels, the inherent noises
(e.g., misaligned pairs) make it difficult to learn a precise captioning model.
Although a filtering strategy can effectively remove noisy data, it
leads to a decrease in learnable knowledge and sometimes brings about a new
problem of data deficiency. To take the best of both worlds, we propose a
noise-aware learning framework, which learns rich knowledge from the whole
web-crawled data while being less affected by the noises. This is achieved by
the proposed quality controllable model, which is learned using alignment
levels of the image-text pairs as an additional control signal during training.
The alignment-conditioned training allows the model to generate high-quality,
well-aligned captions by simply setting the control signal to the desired
alignment level at inference time. Through in-depth analysis, we show that our
controllable captioning model is effective in handling noise. In addition, with
two tasks of zero-shot captioning and text-to-image retrieval using generated
captions (i.e., self-retrieval), we also demonstrate our model can produce
high-quality captions in terms of descriptiveness and distinctiveness. Code is
available at https://github.com/kakaobrain/noc
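The alignment-conditioned training described above can be illustrated with a minimal sketch. It assumes that each image-text pair already has a scalar alignment score (for example, an image-text similarity), that scores are bucketed into equal-count levels, and that the level is passed to the captioner as a prefix token. The function names, the `<align_k>` token format, and the example scores are hypothetical and are not taken from the paper or its released code.

```python
from bisect import bisect_right


def make_bin_edges(scores, num_levels):
    """Return num_levels - 1 cut points splitting scores into equal-count buckets."""
    ordered = sorted(scores)
    n = len(ordered)
    return [ordered[(i * n) // num_levels] for i in range(1, num_levels)]


def alignment_level(score, edges):
    """Map a score to a discrete level in 1..len(edges) + 1 (higher = better aligned)."""
    return bisect_right(edges, score) + 1


def add_control_token(caption, level):
    """Prefix the caption with a hypothetical control token for its alignment level."""
    return f"<align_{level}> {caption}"


# Hypothetical alignment scores for six web-crawled image-text pairs.
scores = [0.12, 0.35, 0.18, 0.41, 0.27, 0.09]
edges = make_bin_edges(scores, num_levels=3)
levels = [alignment_level(s, edges) for s in scores]

# Training: every pair is kept, and each caption carries its own level.
train_text = add_control_token("a dog on a beach", levels[1])

# Inference: request the highest level to ask for a well-aligned caption.
inference_prefix = add_control_token("", len(edges) + 1).strip()
```

In this sketch no pair is discarded: noisy pairs are still used for training but are tagged with a low level, which is how conditioning differs from filtering.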
Channel and timeslot co-scheduling with minimal channel switching for data aggregation in MWSNs.
Collision-free transmission and efficient data transfer between nodes can be achieved through a set of channels in multichannel wireless sensor networks (MWSNs). When using multiple channels, we have to carefully consider channel interference, channel and time slot (resource) optimization, channel switching delay, and energy consumption. Since sensor nodes operate on low battery power, the energy consumed in channel switching becomes an important challenge. In this paper, we propose channel and time slot scheduling for minimal channel switching in MWSNs, while achieving efficient and collision-free transmission between nodes. The proposed scheme constructs a duty-cycled tree while reducing the amount of channel switching. As a next step, collision-free time slots are assigned to every node based on the minimal data collection delay. The experimental results demonstrate the validity of our scheme: it reduces the amount of channel switching by 17.5%, reduces energy consumption for channel switching by 28%, and reduces the schedule length by 46%, as compared to the existing schemes.
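The time slot assignment step can be sketched with a simple greedy rule on an aggregation tree. This is not the paper's algorithm: it assumes that a node transmits once, only after it has received from all of its children, that siblings must use distinct slots because they share a receiver, and that nodes with different parents never interfere (for example, because they use different channels). The function name `assign_slots` and the example tree are hypothetical.

```python
def assign_slots(parent):
    """Greedy slot assignment on an aggregation tree.

    parent maps each node to its parent; the sink maps to None.
    Returns (slots, schedule_length), where slots[node] is the slot in
    which the node transmits its aggregated data to its parent.
    """
    children = {node: [] for node in parent}
    root = None
    for node, p in parent.items():
        if p is None:
            root = node
        else:
            children[p].append(node)

    slots = {}

    def visit(node):
        # ready[c] is the last slot in which child c is still receiving.
        ready = {c: visit(c) for c in children[node]}
        last = 0
        # Serve children that are ready earliest first; siblings get distinct slots.
        for c in sorted(children[node], key=lambda c: (ready[c], c)):
            last = max(ready[c], last) + 1
            slots[c] = last
        return last

    return slots, visit(root)


# Hypothetical five-node tree: c and d report to a; a and b report to the sink.
tree = {"sink": None, "a": "sink", "b": "sink", "c": "a", "d": "a"}
slots, length = assign_slots(tree)
```

Here `b` and `c` share slot 1 because they have different parents, while `a` waits until slot 3, after both of its children have reported.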
CloudIoT-based Jukebox Platform: a music player for mobile users in Café
Content services have been provided to people in a variety of ways. A jukebox service is a type of content streaming that provides automated music playing: the user inserts a coin and presses a play button, and the jukebox automatically selects and plays the record. The disc jockey (DJ) in a Korean cafeteria (café) received content requests from customers and played them through the speakers in the store. In this paper, we propose a service platform that reinvents the Korean café DJ in an integrated environment of IoT and cloud computing. A user in a store can request content (music, video, and messages) through the service platform. The content is provided through the public screen and speaker in the store where the user is located. This allows people in the same store to enjoy the content together. The user information and usage history are collected and managed in the cloud, so users can receive customized services regardless of the store. We compare our platform to existing services. The performance evaluation shows that the proposed platform can efficiently provide content to users and adapts to IoT-Cloud integrated environments.
NICE 2023 Zero-shot Image Captioning Challenge
In this report, we introduce the NICE project (https://nice.lgresearch.ai/) and
share the results and outcomes of the NICE challenge 2023. This project is
designed to challenge the
computer vision community to develop robust image captioning models that
advance the state-of-the-art both in terms of accuracy and fairness. Through
the challenge, the image captioning models were tested using a new evaluation
dataset that includes a large variety of visual concepts from many domains.
There was no specific training data provided for the challenge, and therefore
the challenge entries were required to adapt to new types of image descriptions
that had not been seen during training. This report includes information on the
newly proposed NICE dataset, evaluation methods, challenge results, and
technical details of top-ranking entries. We expect that the outcomes of the
challenge will contribute to the improvement of AI models on various
vision-language tasks.
Comment: Tech report, project page https://nice.lgresearch.ai
Structural Motion Grammar for Universal Use of Leap Motion: Amusement and Functional Contents Focused
Motions for the Leap Motion controller are not standardized, even though its use is spreading in media content. Each piece of content defines its own motions, thereby creating confusion for users. Therefore, to alleviate user inconvenience, this study categorized commonly used motions into Amusement and Functional Contents and defined a Structural Motion Grammar that can be used universally based on this classification. To this end, the Motion Lexicon, a fundamental motion vocabulary, was defined, and an algorithm that enables real-time recognition of the Structural Motion Grammar was developed. Moreover, the proposed method was verified by user evaluation and quantitative comparison tests.
Machine-Learning-Based Clinical Biomarker Using Cell-Free DNA for Hepatocellular Carcinoma (HCC)
(1) Background: Hepatocellular carcinoma (HCC) is one of the leading causes of cancer-related death worldwide. Although various serum enzymes have been utilized for the diagnosis and prognosis of HCC, the currently available biomarkers lack the sensitivity needed to detect HCC at early stages and accurately predict treatment responses. (2) Methods: We utilized our highly sensitive cell-free DNA (cfDNA) detection system, in combination with a machine learning algorithm, to provide a platform for improved diagnosis and prognosis of HCC. (3) Results: cfDNA, specifically alpha-fetoprotein (AFP) expression in captured cfDNA, demonstrated the highest accuracy for diagnosing malignancies among the serum/plasma biomarkers used in this study, including AFP, aspartate aminotransferase, alanine aminotransferase, albumin, alkaline phosphatase, and bilirubin. The diagnostic/prognostic capability of cfDNA was further improved by establishing a cfDNA score (cfDHCC), which integrated the total plasma cfDNA levels and cfAFP-DNA expression into a single score using machine learning algorithms. (4) Conclusion: The cfDHCC score demonstrated significantly improved accuracy in determining the pathological features of HCC and predicting patients’ survival outcomes compared to the other biomarkers. The results presented herein reveal that our cfDNA capture/analysis platform is a promising approach to effectively utilize cfDNA as a biomarker for the diagnosis and prognosis of HCC.