121 research outputs found

    Multi-theme sentiment analysis with sentiment shifting

    Get PDF
    Business reviews contain rich sentiment on multiple themes, disclosing more interesting information than the overall polarities of documents. In fine-grained sentiment analysis, given any segment of text, we are interested not only in the overall polarity of that segment but also in the sentiment words that play the major roles. However, sentiment analysis at the word level poses significant challenges due to the complexity of reviews, the inconsistency of sentiment across themes, and the sentiment shifting caused by linguistic patterns known as contextual valence shifters. To simultaneously resolve the multi-theme and sentiment-shifting dilemmas, this paper proposes a unified explainable sentiment analysis model, MTSA, which enables both classification of sentiment polarity and discovery of quantified sentiment-shifting patterns. MTSA formulates multi-theme sentiment by learning embeddings (i.e., vector representations) for both themes and words, and derives the shifter effect learning algorithm by modeling the shifted sentiment in a logistic regression model. Extensive experiments have been conducted on Yelp business reviews and IMDB movie reviews. The improvement in sentiment polarity classification demonstrates the effectiveness of MTSA at rectifying word feature representations of reviews, and the human evaluation shows its successful discovery of multi-theme sentiment words and automatic quantification of the effects of contextual valence shifters.
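    The following toy sketch illustrates the flavor of the approach described above: theme-conditioned word contributions, a multiplicative effect for contextual valence shifters, and a logistic-regression layer for polarity. All names, the element-wise gating scheme, and the fixed shifter values are illustrative assumptions, not MTSA's actual formulation.

        import numpy as np

        rng = np.random.default_rng(0)
        dim = 8
        vocab = {"food": 0, "great": 1, "slow": 2, "service": 3}
        themes = {"food": rng.normal(size=dim), "service": rng.normal(size=dim)}
        word_vecs = rng.normal(size=(len(vocab), dim))
        shifter_effect = {"not": -1.0, "very": 1.5}   # learned in the paper; fixed here for illustration
        w = rng.normal(size=dim)                      # logistic-regression weights

        def segment_polarity(tokens, theme):
            """Score a text segment under one theme, applying valence shifters."""
            feat = np.zeros(dim)
            scale = 1.0
            for tok in tokens:
                if tok in shifter_effect:             # a shifter rescales the next sentiment word
                    scale = shifter_effect[tok]
                elif tok in vocab:
                    # theme-conditioned word contribution (element-wise gating is an assumption)
                    feat += scale * word_vecs[vocab[tok]] * themes[theme]
                    scale = 1.0
            return 1.0 / (1.0 + np.exp(-w @ feat))    # probability of positive polarity

        print(segment_polarity("the service was not great".split(), "service"))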

    Enable Language Models to Implicitly Learn Self-Improvement From Data

    Full text link
    Large Language Models (LLMs) have demonstrated remarkable capabilities in open-ended text generation tasks. However, the inherent open-ended nature of these tasks implies that there is always room for improvement in the quality of model responses. To address this challenge, various approaches have been proposed to enhance the performance of LLMs. There has been a growing focus on enabling LLMs to self-improve their response quality, thereby reducing the reliance on extensive human annotation efforts for collecting diverse and high-quality training data. Recently, prompting-based methods have been widely explored among self-improvement methods owing to their effectiveness, efficiency, and convenience. However, those methods usually require explicitly and thoroughly written rubrics as inputs to LLMs. It is expensive and challenging to manually derive and provide all necessary rubrics for a complex real-world improvement goal (e.g., being more helpful and less harmful). To this end, we propose an ImPlicit Self-ImprovemenT (PIT) framework that implicitly learns the improvement goal from human preference data. PIT only requires preference data that are used to train reward models, without extra human effort. Specifically, we reformulate the training objective of reinforcement learning from human feedback (RLHF): instead of maximizing response quality for a given input, we maximize the quality gap of the response conditioned on a reference response. In this way, PIT is implicitly trained with the improvement goal of better aligning with human preferences. Experiments on two real-world datasets and one synthetic dataset show that our method significantly outperforms prompting-based methods.
    Comment: 28 pages, 5 figures, 4 tables
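    A minimal sketch of the reformulated objective described above: rather than scoring a response in isolation, the reward is the quality gap of the response over a reference response for the same input. The reward_model callable and the toy scorer are stand-in assumptions, not PIT's trained models.

        def gap_reward(reward_model, x, y, y_ref):
            """RLHF-style reward reformulated as the quality gap over a reference response."""
            return reward_model(x, y) - reward_model(x, y_ref)

        # toy usage with a stand-in scorer (word overlap with the prompt)
        toy_scorer = lambda x, y: float(len(set(x.split()) & set(y.split())))
        print(gap_reward(toy_scorer,
                         "how do I sort a list in python",
                         "call sorted() on the list, or list.sort() to sort in place",
                         "you can sort it"))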

    Augmentation with Projection: Towards an Effective and Efficient Data Augmentation Paradigm for Distillation

    Full text link
    Knowledge distillation is one of the primary methods of transferring knowledge from large to small models. However, it requires massive task-specific data, which may not be feasible in many real-world applications. Data augmentation methods such as representation interpolation, token replacement, or augmentation with models are applied to tackle this problem. However, these data augmentation methods either potentially cause shifts in decision boundaries (representation interpolation), are not expressive enough (token replacement), or introduce too much computational overhead (augmentation with models). To this end, we propose AugPro (Augmentation with Projection), an effective and efficient data augmentation method for distillation. Our method builds on top of representation interpolation augmentation methods to maintain the diversity of expressions and converts the augmented data to tokens to avoid shifting decision boundaries. It uses simple operations that come with little computational overhead. The results on multiple GLUE tasks show that our method can improve distillation performance by a large margin at a low time cost. Code is available at https://github.com/google-research/google-research/tree/master/augpro.
    Comment: 20 pages, 5 figures. Accepted by ICLR 202
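    The sketch below illustrates the augmentation-with-projection paradigm described above under simple assumptions (a toy embedding table and mixup-style interpolation); it is not the released AugPro code. Interpolated embeddings are projected back to their nearest vocabulary tokens so the augmented example is again discrete text.

        import numpy as np

        rng = np.random.default_rng(0)
        vocab = ["good", "bad", "movie", "plot", "boring", "great"]
        emb = rng.normal(size=(len(vocab), 16))       # toy embedding table

        def augment_with_projection(ids_a, ids_b, lam=0.5):
            # representation interpolation between two token sequences of equal length
            mixed = lam * emb[ids_a] + (1.0 - lam) * emb[ids_b]
            # projection step: snap each interpolated vector to its nearest token embedding
            dists = ((mixed[:, None, :] - emb[None, :, :]) ** 2).sum(-1)
            return dists.argmin(axis=1)

        a = np.array([0, 2])   # "good movie"
        b = np.array([4, 3])   # "boring plot"
        print([vocab[i] for i in augment_with_projection(a, b)])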

    Analysis of General Network Coding Conditions and Design of a Free-Ride-Oriented Routing Metric

    Full text link

    Latent factor transition for dynamic collaborative filtering

    Get PDF
    User preferences change over time, and capturing such changes is essential for developing accurate recommender systems. Despite its importance, only a few works in collaborative filtering have addressed this issue. In this paper, we consider evolving preferences and model user dynamics by introducing and learning a transition matrix for each user's latent vectors between consecutive time windows. Intuitively, the transition matrix for a user summarizes the time-invariant pattern of that user's evolution. We first extend the conventional probabilistic matrix factorization and then improve upon this solution through its fully Bayesian model. These solutions take advantage of the model complexity and scalability of conventional Bayesian matrix factorization, yet adapt dynamically to users' evolving preferences. We evaluate the effectiveness of these solutions through empirical studies on six large-scale real-life data sets.
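    A toy sketch of the transition idea described above (illustrative notation and dimensions, not the paper's fully Bayesian implementation): a per-user transition matrix maps the user's latent vector from one time window to the next, and ratings are predicted with the usual matrix-factorization dot product.

        import numpy as np

        rng = np.random.default_rng(0)
        k = 5                                             # latent dimensionality (assumption)
        u_prev = rng.normal(size=k)                       # user factors in window t-1
        B_u = np.eye(k) + 0.1 * rng.normal(size=(k, k))   # per-user transition matrix
        v_item = rng.normal(size=k)                       # item latent factors

        u_t = B_u @ u_prev                                # evolve the user's preferences
        predicted_rating = u_t @ v_item                   # standard MF prediction at time t
        print(predicted_rating)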

    Annotating Search Results from Web Databases

    Full text link