Generative Adversarial Networks (GANs): Challenges, Solutions, and Future Directions
Generative Adversarial Networks (GANs) are a novel class of deep generative
models that have recently gained significant attention. GANs learn complex and
high-dimensional distributions implicitly over images, audio, and data.
However, there exist major challenges in the training of GANs, i.e., mode
collapse, non-convergence and instability, due to inappropriate design of
network architecture, use of objective function and selection of optimization
algorithm. Recently, to address these challenges, several solutions for better
design and optimization of GANs have been investigated based on techniques of
re-engineered network architectures, new objective functions and alternative
optimization algorithms. To the best of our knowledge, there is no existing
survey that has particularly focused on broad and systematic developments of
these solutions. In this study, we perform a comprehensive survey of the
advancements in GANs design and optimization solutions proposed to handle GANs
challenges. We first identify key research issues within each design and
optimization technique and then propose a new taxonomy to structure solutions
by key research issues. In accordance with the taxonomy, we provide a detailed
discussion on different GANs variants proposed within each solution and their
relationships. Finally, based on the insights gained, we present the promising
research directions in this rapidly growing field. Comment: 42 pages, Figure 13, Table
Socializing the Semantic Gap: A Comparative Survey on Image Tag Assignment, Refinement and Retrieval
Where previous reviews on content-based image retrieval emphasize what can
be seen in an image to bridge the semantic gap, this survey considers what
people tag about an image. A comprehensive treatise of three closely linked
problems, i.e., image tag assignment, refinement, and tag-based image retrieval
is presented. While existing works vary in terms of their targeted tasks and
methodology, they rely on the key functionality of tag relevance, i.e.
estimating the relevance of a specific tag with respect to the visual content
of a given image and its social context. By analyzing what information a
specific method exploits to construct its tag relevance function and how such
information is exploited, this paper introduces a taxonomy to structure the
growing literature, understand the ingredients of the main works, clarify their
connections and differences, and recognize their merits and limitations. For a
head-to-head comparison of state-of-the-art methods, a new experimental
protocol is presented, with training sets containing 10k, 100k and 1m images
and an evaluation on three test sets, contributed by various research groups.
Eleven representative works are implemented and evaluated. Putting all this
together, the survey aims to provide an overview of the past and foster
progress for the near future. Comment: to appear in ACM Computing Survey
Peeking into the other half of the glass : handling polarization in recommender systems.
This dissertation is about filtering and discovering information online while using recommender systems. In the first part of our research, we study the phenomenon of polarization and its impact on filtering and discovering information. Polarization is a social phenomenon with serious real-life consequences, particularly on social media. Thus, it is important to understand how machine learning algorithms, especially recommender systems, behave in polarized environments. We study polarization within the context of the users' interactions with a space of items and how this affects recommender systems. We first formalize the concept of polarization based on item ratings and then relate it to the item reviews, when available. We then propose a domain-independent data science pipeline to automatically detect polarization using the ratings rather than the properties typically used to detect polarization, such as an item's content or social network topology. We perform an extensive comparison of polarization measures on several benchmark data sets and show that our polarization detection framework can detect different degrees of polarization and outperforms existing measures in capturing an intuitive notion of polarization. We also investigate and uncover certain peculiar patterns that are characteristic of environments where polarization emerges: a machine learning algorithm finds it easier to learn discriminating models in polarized environments. The models quickly learn to keep each user in the safety of their preferred viewpoint, essentially giving rise to filter bubbles and making them easier to learn. After quantifying the extent of polarization in current recommender system benchmark data, we propose new counter-polarization approaches for existing collaborative filtering recommender systems, focusing particularly on the state-of-the-art models based on Matrix Factorization.
Our work represents an essential step toward the new research area concerned with quantifying, detecting and counteracting polarization in human-generated data and machine learning algorithms. We also make a theoretical analysis of how polarization affects learning latent factor models, and how counter-polarization affects these models. In the second part of our dissertation, we investigate the problem of discovering related information by recommendation of tags on social media micro-blogging platforms. Real-time micro-blogging services such as Twitter have recently witnessed exponential growth, with millions of active web users who generate billions of micro-posts daily to share information, opinions and personal viewpoints. However, these posts are inherently noisy and unstructured because they could be in any format, which makes them difficult to organize for the purpose of retrieving relevant information. One way to solve this problem is using hashtags, which are quickly becoming the standard approach for annotation of various information on social media, such that varied posts about the same or related topic are annotated with the same hashtag. However, hashtags are not used in a consistent manner and, most importantly, are completely optional to use. This makes them unreliable as the sole mechanism for searching for relevant information. We investigate mechanisms for consolidating the hashtag space using recommender systems. Our methods are general enough that they can be used for hashtag annotation in various social media services such as Twitter, as well as for general item recommendations on systems that rely on implicit user interest data such as e-learning and news sites, or explicit user ratings, such as e-commerce and online entertainment sites. To conclude, we propose a methodology to extract stories based on two types of hashtag co-occurrence graphs.
Our research in hashtag recommendation was able to exploit the textual content that is available as part of user messages or posts, and thus resulted in hybrid recommendation strategies. Using content within this context can bridge polarization boundaries. However, when content is not available, is missing, or is unreliable, as in the case of platforms that are rich in multimedia and multilingual posts, the content option becomes less powerful, and pure collaborative filtering regains its important role, along with the challenges of polarization.
Learning General Purpose Distributed Sentence Representations via Large Scale Multi-task Learning
A lot of the recent success in natural language processing (NLP) has been
driven by distributed vector representations of words trained on large amounts
of text in an unsupervised manner. These representations are typically used as
general purpose features for words across a range of NLP problems. However,
extending this success to learning representations of sequences of words, such
as sentences, remains an open problem. Recent work has explored unsupervised as
well as supervised learning techniques with different training objectives to
learn general purpose fixed-length sentence representations. In this work, we
present a simple, effective multi-task learning framework for sentence
representations that combines the inductive biases of diverse training
objectives in a single model. We train this model on several data sources with
multiple training objectives on over 100 million sentences. Extensive
experiments demonstrate that sharing a single recurrent sentence encoder across
weakly related tasks leads to consistent improvements over previous methods. We
present substantial improvements in the context of transfer learning and
low-resource settings using our learned general-purpose representations. Comment: Accepted at ICLR 201
Recent Progress in Image Deblurring
This paper comprehensively reviews the recent development of image
deblurring, including non-blind/blind, spatially invariant/variant deblurring
techniques. Indeed, these techniques share the same objective of inferring a
latent sharp image from one or several corresponding blurry images, while the
blind deblurring techniques are also required to derive an accurate blur
kernel. Considering the critical role of image restoration in modern imaging
systems to provide high-quality images under complex environments such as
motion, undesirable lighting conditions, and imperfect system components, image
deblurring has attracted growing attention in recent years. From the viewpoint
of how to handle the ill-posedness, which is a crucial issue in deblurring
tasks, existing methods can be grouped into five categories: Bayesian inference
framework, variational methods, sparse representation-based methods,
homography-based modeling, and region-based methods. In spite of achieving a
certain level of development, image deblurring, especially the blind case, is
limited in its success by complex application conditions which make the blur
kernel hard to obtain and spatially variant. We provide a holistic
understanding and deep insight into image deblurring in this review. An
analysis of the empirical evidence for representative methods, practical
issues, as well as a discussion of promising future directions are also
presented. Comment: 53 pages, 17 figure