VISIR: visual and semantic image label refinement
The social media explosion has populated the Internet with a wealth of images. There are two existing paradigms for image retrieval: 1) content-based image retrieval (CBIR), which has traditionally used visual features for similarity search (e.g., SIFT features), and 2) tag-based image retrieval (TBIR), which has relied on user tagging (e.g., Flickr tags). CBIR now gains semantic expressiveness through advances in deep-learning-based detection of visual labels. TBIR benefits from query-and-click logs to automatically infer more informative labels. However, learning-based tagging still yields noisy labels and is restricted to concrete objects, missing out on generalizations and abstractions. Click-based tagging is limited to terms that appear in the textual context of an image or in queries that lead to a click. This paper addresses the above limitations by semantically refining and expanding the labels suggested by learning-based object detection. We consider the semantic coherence between the labels for different objects, leverage lexical and commonsense knowledge, and cast the label assignment into a constrained optimization problem solved by an integer linear program. Experiments show that our method, called VISIR, improves the quality of state-of-the-art visual labeling tools like LSDA and YOLO.
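As a rough illustration of casting label assignment as a constrained 0/1 optimization, the toy sketch below picks a subset of candidate labels that maximizes detector confidence plus pairwise coherence, subject to a label budget and mutual-exclusion constraints. It is not VISIR's actual formulation: the function name `select_labels`, the scores, and the constraints are hypothetical, and the problem is solved by brute-force enumeration rather than an ILP solver, which is only feasible for a handful of labels.

```python
from itertools import combinations, product


def select_labels(conf, coherence, exclusive, max_labels):
    """Pick the label subset with the highest total score.

    conf:       label -> detector confidence (hypothetical values)
    coherence:  frozenset({a, b}) -> bonus when both labels are chosen
    exclusive:  set of frozenset({a, b}) pairs that must not co-occur
    max_labels: upper bound on the number of chosen labels
    """
    labels = sorted(conf)
    best_score, best_set = float("-inf"), ()
    # Enumerate every 0/1 assignment; an ILP solver would replace this loop.
    for bits in product((0, 1), repeat=len(labels)):
        chosen = tuple(l for l, b in zip(labels, bits) if b)
        if len(chosen) > max_labels:
            continue  # violates the budget constraint
        pairs = [frozenset(p) for p in combinations(chosen, 2)]
        if any(p in exclusive for p in pairs):
            continue  # violates a mutual-exclusion constraint
        score = sum(conf[l] for l in chosen)
        score += sum(coherence.get(p, 0.0) for p in pairs)
        if score > best_score:
            best_score, best_set = score, chosen
    return set(best_set), best_score


# Hypothetical candidates: "lemon" and "tennis ball" compete for one object.
conf = {"tennis racket": 0.8, "tennis ball": 0.6, "lemon": 0.7}
coherence = {frozenset({"tennis racket", "tennis ball"}): 0.5}
exclusive = {frozenset({"tennis ball", "lemon"})}

chosen, score = select_labels(conf, coherence, exclusive, max_labels=2)
print(chosen, round(score, 2))
```

In this toy instance the coherence bonus lets the lower-confidence "tennis ball" win over "lemon", which is the kind of effect semantic coherence between labels is meant to produce.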
Discrete deep learning for fast content-aware recommendation
The cold-start problem and recommendation efficiency have been regarded as two crucial challenges in recommender systems. In this paper, we propose a hashing-based deep learning framework called Discrete Deep Learning (DDL), to map users and items to Hamming space, where a user's preference for an item can be efficiently calculated by Hamming distance, and this computation scheme significantly improves the efficiency of online recommendation. Besides, DDL unifies the user-item interaction information and the item content information to overcome the issues of data sparsity and cold-start. To be more specific, to integrate content information into our DDL framework, a deep learning model, Deep Belief Network (DBN), is applied to extract effective item representation from the item content information. Besides, the framework imposes balance and irrelevant constraints on binary codes to derive compact but informative binary codes. Due to the discrete constraints in DDL, we propose an efficient alternating optimization method consisting of iteratively solving a series of mixed-integer programming subproblems. Extensive experiments have been conducted to evaluate the performance of our DDL framework on two different Amazon datasets, and the experimental results demonstrate the superiority of DDL over the state-of-the-art methods regarding online recommendation efficiency and cold-start recommendation accuracy.
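To make the Hamming-space scoring concrete, here is a minimal sketch, assuming binary codes have already been learned and are stored as Python integers. The 8-bit codes, item names, and the helper names `hamming_distance` and `rank_items` are made up for illustration; learning the codes (the DBN and the alternating optimization) is not shown.

```python
from typing import Dict, List


def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits between two binary codes stored as ints."""
    return bin(a ^ b).count("1")


def rank_items(user_code: int, item_codes: Dict[str, int]) -> List[str]:
    """Order items by increasing Hamming distance to the user's code.

    A smaller distance is treated as a stronger predicted preference;
    ties are broken alphabetically to keep the ranking deterministic.
    """
    return sorted(
        item_codes,
        key=lambda item: (hamming_distance(user_code, item_codes[item]), item),
    )


# Hypothetical 8-bit codes for one user and three items.
user = 0b10110010
items = {"book": 0b10110011, "lamp": 0b01001101, "pen": 0b10100000}

ranking = rank_items(user, items)
print(ranking)  # ['book', 'pen', 'lamp']
```

Because the distance reduces to an XOR followed by a bit count, scoring a user against many items is cheap compared with real-valued inner products, which is the efficiency argument behind hashing-based recommendation.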
Adversarial Domain Adaptation for Duplicate Question Detection
We address the problem of detecting duplicate questions in forums, which is
an important step towards automating the process of answering new questions. As
finding and annotating such potential duplicates manually is very tedious and
costly, automatic methods based on machine learning are a viable alternative.
However, many forums do not have annotated data, i.e., questions labeled by
experts as duplicates, and thus a promising solution is to use domain
adaptation from another forum that has such annotations. Here we focus on
adversarial domain adaptation, deriving important findings about when it
performs well and what properties of the domains are important in this regard.
Our experiments with StackExchange data show an average improvement of 5.6%
over the best baseline across multiple pairs of domains.
Comment: EMNLP 2018 short paper - camera ready. 8 pages
How to Perform Reproducible Experiments in the ELLIOT Recommendation Framework: Data Processing, Model Selection, and Performance Evaluation
Recommender Systems have been shown to be an effective way to alleviate the over-choice problem and provide
accurate and tailored recommendations. However, the impressive number of proposed recommendation
algorithms, splitting strategies, evaluation protocols, metrics, and tasks, has made rigorous experimental
evaluation particularly challenging. ELLIOT is a comprehensive recommendation framework that aims
to run and reproduce an entire experimental pipeline by processing a simple configuration file. The
framework loads, filters, and splits the data considering a vast set of strategies. Then, it optimizes
hyperparameters for several recommendation algorithms, selects the best models, compares them with
the baselines, computes metrics spanning from accuracy to beyond-accuracy, bias, and fairness, and
conducts statistical analysis. The aim is to provide researchers with a tool to ease all the experimental
evaluation phases (and make them reproducible), from data reading to results collection. ELLIOT is
freely available on GitHub at https://github.com/sisinflab/ellio
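The idea of driving the filter and split stages from a single configuration can be sketched in a few lines of plain Python. This does not use ELLIOT's API or its configuration schema: the keys `min_interactions`, `test_ratio`, and `seed`, the helper functions, and the toy data are all hypothetical and only illustrate why a fixed configuration makes the pipeline repeatable.

```python
import random
from collections import defaultdict


def filter_users(interactions, min_interactions):
    """Keep only interactions of users with at least `min_interactions` records."""
    counts = defaultdict(int)
    for user, _ in interactions:
        counts[user] += 1
    return [(u, i) for u, i in interactions if counts[u] >= min_interactions]


def holdout_split(interactions, test_ratio, seed):
    """Random hold-out split; the fixed seed makes the split reproducible."""
    rng = random.Random(seed)
    shuffled = list(interactions)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical configuration and (user, item) interaction data.
config = {"min_interactions": 2, "test_ratio": 0.25, "seed": 42}
data = [
    ("u1", "a"), ("u1", "b"), ("u1", "c"),
    ("u2", "a"),
    ("u3", "b"), ("u3", "c"), ("u3", "d"), ("u3", "a"),
]

kept = filter_users(data, config["min_interactions"])
train, test = holdout_split(kept, config["test_ratio"], config["seed"])
print(len(kept), len(train), len(test))
```

Rerunning the script with the same configuration yields the same train and test sets, so every later stage (tuning, evaluation, statistical tests) sees identical data.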
A Survey on Cross-domain Recommendation: Taxonomies, Methods, and Future Directions
Traditional recommendation systems are faced with two long-standing
obstacles, namely, data sparsity and cold-start problems, which promote the
emergence and development of Cross-Domain Recommendation (CDR). The core idea
of CDR is to leverage information collected from other domains to alleviate the
two problems in one domain. Over the last decade, much effort has been
devoted to cross-domain recommendation. Recently, with the development of deep
learning and neural networks, a large number of methods have emerged. However,
there is a limited number of systematic surveys on CDR, especially regarding
the latest proposed methods as well as the recommendation scenarios and
recommendation tasks they address. In this survey paper, we first propose a
two-level taxonomy of cross-domain recommendation which classifies different
recommendation scenarios and recommendation tasks. We then introduce and
summarize existing cross-domain recommendation approaches under different
recommendation scenarios in a structured manner. We also organize commonly used
datasets. We conclude this survey by providing several potential research
directions in this field.