
714 research outputs found

AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks

Author: Gan Zhe
He Xiaodong
Huang Qiuyuan
Huang Xiaolei
Xu Tao
Zhang Han
Zhang Pengchuan
Publication date: 28/11/2017

In this paper, we propose an Attentional Generative Adversarial Network (AttnGAN) that allows attention-driven, multi-stage refinement for fine-grained text-to-image generation. With a novel attentional generative network, the AttnGAN can synthesize fine-grained details at different subregions of the image by paying attention to the relevant words in the natural language description. In addition, a deep attentional multimodal similarity model is proposed to compute a fine-grained image-text matching loss for training the generator. The proposed AttnGAN significantly outperforms the previous state of the art, boosting the best reported inception score by 14.14% on the CUB dataset and 170.25% on the more challenging COCO dataset. A detailed analysis is also performed by visualizing the attention layers of the AttnGAN. For the first time, it shows that the layered attentional GAN is able to automatically select the condition at the word level for generating different parts of the image.

arXiv.org e-Print Archive

Crossref

Token Imbalance Adaptation for Radiology Report Generation

Author: Huang I-Chan
Huang Xiaolei
Wu Yuexin
Publication date: 18/04/2023

Imbalanced token distributions naturally exist in text documents, leading neural language models to overfit on frequent tokens. The token imbalance may dampen the robustness of radiology report generators, as complex medical terms appear less frequently but reflect more medical information. In this study, we demonstrate how current state-of-the-art models fail to generate infrequent tokens on two standard benchmark datasets (IU X-RAY and MIMIC-CXR) of radiology report generation. However, no prior study has proposed methods to adapt infrequent tokens for text generators fed with medical images. To solve the challenge, we propose the Token Imbalance Adapter (TIMER), aiming to improve generation robustness on infrequent tokens. The model automatically leverages token imbalance through an unlikelihood loss and dynamically optimizes generation processes to augment infrequent tokens. We compare our approach with multiple state-of-the-art methods on the two benchmarks. Experiments demonstrate the effectiveness of our approach in enhancing model robustness both overall and on infrequent tokens. Our ablation analysis shows that our reinforcement learning method has a major effect in adapting token imbalance for radiology report generation. Comment: Accepted by CHIL202
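The abstract above mentions an unlikelihood loss without giving its form. The following is a minimal sketch of the generic token-level unlikelihood objective (a likelihood term for the target token plus a penalty on a set of negative candidates), not the TIMER implementation. The function name `token_loss`, the weight `alpha`, and the choice of treating frequent tokens as negative candidates are assumptions made for illustration only.

```python
import math

def token_loss(probs, target, negatives, alpha=1.0, eps=1e-12):
    """Loss for one decoding step (illustrative sketch, not the paper's code).

    probs     -- list of next-token probabilities, indexed by token id
    target    -- id of the reference token
    negatives -- ids of tokens to discourage (assumed here: frequent tokens)
    alpha     -- hypothetical weight on the unlikelihood term
    """
    # Standard likelihood term: reward probability mass on the reference token.
    likelihood = -math.log(max(probs[target], eps))
    # Unlikelihood term: penalise probability mass on the negative candidates.
    unlikelihood = -sum(
        math.log(max(1.0 - probs[c], eps)) for c in negatives if c != target
    )
    return likelihood + alpha * unlikelihood
```

With three tokens where token 0 is frequent and token 2 is the rare reference token, calling `token_loss([0.7, 0.2, 0.1], target=2, negatives=[0])` adds a penalty for the mass placed on token 0, so a model that shifts probability away from the frequent token receives a lower loss.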

arXiv.org e-Print Archive

Enriching Unsupervised User Embedding via Medical Concepts

Author: Dernoncourt Franck
Dredze Mark
Huang Xiaolei
Publication date: 01/01/2022

Clinical notes in Electronic Health Records (EHR) provide richly documented patient information for inferring phenotypes in disease diagnosis and for studying patient characteristics in cohort selection. Unsupervised user embedding aims to encode patients into fixed-length vectors without human supervision. Medical concepts extracted from the clinical notes contain rich connections between patients and their clinical categories. However, existing unsupervised approaches to user embedding from clinical notes do not explicitly incorporate medical concepts. In this study, we propose a concept-aware unsupervised user embedding that jointly leverages text documents and medical concepts from two clinical corpora, MIMIC-III and Diabetes. We evaluate user embeddings on both extrinsic and intrinsic tasks, including phenotype classification, in-hospital mortality prediction, patient retrieval, and patient relatedness. Experiments on the two clinical corpora show that our approach exceeds unsupervised baselines and that incorporating medical concepts can significantly improve the baseline performance. Comment: accepted at ACM CHIL 2022. a revision for section reforma

arXiv.org e-Print Archive

University of Memphis Digital Commons

Research on Safety Investment Decision Evaluation and Optimization of Network Booking Taxi Platform Enterprise based on Subjective-Objective Assessment Method

Author: Huang Qilong
Liu Xiaolei
Sui Ling
Publication venue: Faculty of Mechanical Engineering in Slavonski Brod; Faculty of Electrical Engineering, Computer Science and Information Technology Osijek; Faculty of Civil Engineering in Osijek
Publication date: 01/01/2023

This study addresses the current imbalance between investment in and returns from the safe operation of Network Booking Taxi Platform Enterprises (NBTPE). It selects a representative NBTPE in the domestic travel sector and applies the System Dynamics method to analyze the impact of internal and external safety inputs, from which it derives a graph of the safety input pattern. Combining a subjective weighting method, represented by the analytical hierarchy process, with an objective weighting method, represented by the entropy weight method, the study proposes a comprehensive Subjective-Objective Assessment Method for determining a reasonable proportion for each safety input cost, and evaluates the feasibility and reasonableness of the method using linear regularization. The study concludes that enterprises need to increase safety investment in equipment and facilities; it also measures the proportion of investment in different links and makes suggestions for optimizing the current proportion of safety investment in NBTPE. This study provides support for optimizing the safety investment ratio of platform companies and improving the efficiency of safety management.
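The entropy weight method named in the abstract above is a standard way to derive objective criterion weights from data. The sketch below shows the textbook computation on a small non-negative decision matrix; it is not the study's own code. The function names, the example matrix, and in particular the multiplicative rule in `combine_weights` for merging subjective and objective weights are assumptions, since the abstract does not state how the two sets of weights are combined.

```python
import math

def entropy_weights(matrix):
    """Objective weights per column (criterion) of a non-negative matrix.

    Rows are alternatives, columns are criteria. Columns whose values vary
    more across rows carry more information and receive larger weights.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    k = 1.0 / math.log(n_rows)
    divergences = []
    for j in range(n_cols):
        column = [row[j] for row in matrix]
        total = sum(column)
        shares = [value / total for value in column]
        # Normalised Shannon entropy; 0 * log(0) is treated as 0.
        entropy = -k * sum(p * math.log(p) for p in shares if p > 0)
        # Clamp guards against tiny negative values from rounding.
        divergences.append(max(0.0, 1.0 - entropy))
    total_div = sum(divergences)
    if total_div == 0:
        # Every column is uniform: fall back to equal weights.
        return [1.0 / n_cols] * n_cols
    return [d / total_div for d in divergences]

def combine_weights(subjective, objective):
    """Hypothetical multiplicative synthesis of two weight vectors."""
    products = [s * o for s, o in zip(subjective, objective)]
    total = sum(products)
    return [p / total for p in products]
```

For example, in `entropy_weights([[1, 10], [1, 20], [1, 70]])` the first column is identical across rows and so receives almost no weight, while the second column receives nearly all of it. Subjective weights from an analytical hierarchy process could then be passed to `combine_weights` together with this result.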

HRČAK - Portal of Croatian Scientific and Professional Journals

Hrčak - Portal of scientific journals of Croatia


Metadata Matters: Adaptation Methods For Robust Document Classification

Author: Huang Xiaolei
Publication venue: University of Colorado Boulder
Publication date: 18/11/2020

Metadata implicitly embedded in documents, such as time, demographic factors and user interests, can cause language variations and affect the performance of document classifiers. For example, language shifts over periods of time, and males and females express sentiment differently. However, models for document classification, the automatic categorization of documents into categories, typically ignore document metadata. In this thesis, we focus on two types of document metadata, temporality and user factors. We propose to use domain adaptation by treating each metadata attribute as a domain (e.g., gender domains: male vs. female), aiming to integrate temporality and user factors into document classifiers and improve classification performance. First, we propose temporality adaptation that explicitly incorporates time into the representation learning process via feature augmentation and diachronic word embedding. The feature augmentation method aims to learn time-independent feature weights for document classifiers. We then develop an end-to-end time-adapted model with the diachronic word embedding under a time-driven framework. Second, we propose user factor adaptation that models demographic attributes and user interests using multitask learning. To model demographic attributes, document classifiers jointly predict demographic factors and document categories. We further develop a multitask user embedding that jointly learns language, user behaviors and user interests. We examine and visualize the impacts of temporality and user factors at the word, topic, semantic and classifier levels. The benefits of adapting demographic attributes motivate us to examine whether domain adaptation can reduce demographic biases. We release a multilingual hate speech corpus with author-level demographic labels. We examine demographic variations of user language and demographic biases of document classifiers.
Following this, to reduce demographic bias, we apply a feature augmentation method to learn demographic-independent classifiers.

CU Scholar Institutional Repository