264 research outputs found
Learning structure and schemas from heterogeneous domains in networked systems: a survey
The rapidly growing number of digital documents in various formats, and the possibility of accessing them through internet-based technologies in distributed environments, have led to the need for solid methods to properly organize and structure documents in large digital libraries and repositories. Specifically, the extremely large size of document collections makes it impossible to organize such documents manually. Additionally, most of the documents exist in an unstructured form and do not follow any schema. Therefore, research efforts in this direction are being dedicated to automatically inferring structure and schemas. This is essential in order to better organize huge collections as well as to effectively and efficiently retrieve documents in heterogeneous domains in networked systems. This paper presents a survey of the state-of-the-art methods for inferring structure from documents and schemas in networked environments. The survey is organized around the most important application domains, namely bio-informatics, sensor networks, social networks, P2P systems, automation and control, transportation, and privacy preservation, for which we analyze recent developments in dealing with unstructured data. Peer Reviewed. Postprint (published version
Latent Relational Metric Learning via Memory-based Attention for Collaborative Ranking
This paper proposes a new neural architecture for collaborative ranking with
implicit feedback. Our model, LRML (Latent Relational Metric Learning),
is a novel metric learning approach for recommendation. More specifically,
instead of simple push-pull mechanisms between user and item pairs, we propose
to learn latent relations that describe each user-item interaction. This helps
to alleviate the potential geometric inflexibility of existing metric learning
approaches. This enables not only better performance but also a greater extent
of modeling capability, allowing our model to scale to a larger number of
interactions. In order to do so, we employ an augmented memory module and learn
to attend over these memory blocks to construct latent relations. The
memory-based attention module is controlled by the user-item interaction,
making the learned relation vector specific to each user-item pair. Hence, this
can be interpreted as learning an exclusive and optimal relational translation
for each user-item interaction. The proposed architecture demonstrates
state-of-the-art performance across multiple recommendation benchmarks. LRML
outperforms other metric learning models in terms of Hits@10 and
nDCG@10 on large datasets such as Netflix and MovieLens20M. Moreover,
qualitative studies also demonstrate evidence that our proposed model is able
to infer and encode explicit sentiment, temporal and attribute information
despite being only trained on implicit feedback. As such, this ascertains the
ability of LRML to uncover hidden relational structure within implicit
datasets. Comment: WWW 201
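The abstract above describes attending over memory blocks, controlled by the user-item interaction, to build a relation vector for each pair. A minimal sketch of that idea follows. The abstract does not give the formulas, so the element-wise product used as the joint embedding, the dot-product key scores, the translation-style distance, and all function and parameter names are assumptions made for illustration, not the authors' implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def latent_relation(user, item, keys, memories):
    """Build a relation vector specific to one user-item pair.

    Assumption: the pair is summarized by an element-wise product,
    scored against each key by a dot product, and the attention
    weights mix the memory slots into one relation vector.
    """
    joint = [u * v for u, v in zip(user, item)]
    scores = [sum(j * k for j, k in zip(joint, key)) for key in keys]
    weights = softmax(scores)
    dim = len(memories[0])
    return [sum(w * slot[d] for w, slot in zip(weights, memories))
            for d in range(dim)]


def relational_distance(user, item, relation):
    """Assumed score: Euclidean distance between user + relation and item."""
    return math.sqrt(sum((u + r - v) ** 2
                         for u, r, v in zip(user, relation, item)))
```

Under these assumptions, a smaller `relational_distance` means the learned relation translates the user closer to the item, so candidate items would be ranked by ascending distance.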
Recommender systems fairness evaluation via generalized cross entropy
Fairness in recommender systems has been considered with respect
to sensitive attributes of users (e.g., gender, race) or items (e.g., revenue
in a multistakeholder setting). Regardless, the concept has been
commonly interpreted as some form of equality – i.e., the degree to
which the system is meeting the information needs of all its users in
an equal sense. In this paper, we argue that fairness in recommender
systems does not necessarily imply equality, but instead it should
consider a distribution of resources based on merits and needs. We
present a probabilistic framework based on generalized cross entropy
to evaluate fairness of recommender systems under this perspective,
where we show that the proposed framework is flexible and explanatory
by allowing the incorporation of domain knowledge (through an ideal
fair distribution) that can help to understand which item or user aspects
a recommendation algorithm is over- or under-representing.
Results on two real-world datasets show the merits of the proposed
evaluation framework in terms of both user and item fairness.
This work was supported in part by the Center for Intelligent Information
Retrieval and in part by project TIN2016-80630-P (MINECO
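To make the idea of comparing a recommender's output against an ideal fair distribution concrete, here is a minimal sketch. The abstract does not state the formula, so this uses one common form of generalized cross entropy as an assumption: (sum over groups of p_f^beta * p_m^(1 - beta), minus 1) divided by beta * (1 - beta), where p_f is the ideal fair distribution and p_m is the distribution observed for the model. The function name, the default `beta`, and the example groups are hypothetical.

```python
def generalized_cross_entropy(fair, observed, beta=2.0):
    """Divergence of an observed distribution from an ideal fair one.

    Assumed form: (sum_j fair_j**beta * observed_j**(1 - beta) - 1)
                  / (beta * (1 - beta)).
    Returns 0.0 when the distributions match; values further from 0
    indicate a larger departure from the fair distribution.
    """
    if beta in (0.0, 1.0):
        raise ValueError("beta must differ from 0 and 1")
    if len(fair) != len(observed):
        raise ValueError("distributions must have the same length")
    if any(p <= 0.0 for p in observed):
        raise ValueError("observed probabilities must be positive")
    total = sum((pf ** beta) * (pm ** (1.0 - beta))
                for pf, pm in zip(fair, observed))
    return (total - 1.0) / (beta * (1.0 - beta))


# Hypothetical example: two item groups, with equal exposure as the ideal.
fair = [0.5, 0.5]
skewed = [0.8, 0.2]  # observed share of recommendations per group
score = generalized_cross_entropy(fair, skewed)
```

The fair distribution is the place where domain knowledge enters: it need not be uniform, and could instead encode merit or need per group.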
Clustering high dimensional data using subspace and projected clustering algorithms
Problem statement: Clustering has a number of techniques that have been
developed in statistics, pattern recognition, data mining, and other fields.
Subspace clustering enumerates clusters of objects in all subspaces of a
dataset. It tends to produce many overlapping clusters. Approach: Subspace
clustering and projected clustering are research areas for clustering in high
dimensional spaces. In this research we experiment with three clustering-oriented
algorithms: PROCLUS, P3C and STATPC. Results: In general, PROCLUS performs
better in terms of computation time and produces the fewest
unclustered data points, while STATPC outperforms PROCLUS and P3C in the accuracy of
both cluster points and relevant attributes found. Conclusions/Recommendations:
In this study, we analyze in detail the properties of different data clustering
methods. Comment: 9 pages, 6 figure
Design of an Interface for Page Rank Calculation using Web Link Attributes Information
This paper deals with Web Structure Mining and different Structure Mining Algorithms such as Page Rank, HITS, Trust Rank and Sel-HITS. The functioning of these algorithms is discussed. An incremental algorithm for calculating PageRank using an interface has been formulated. This algorithm makes use of Web Link Attributes Information as key parameters and has been implemented using the Visibility and Position of a Link. The application of a Web Structure Mining Algorithm in an Academic Search Application is also discussed. The present work can be a useful input to Web Users, Faculty, Students and Web Administrators in a University Environment. Keywords: HITS, Page Rank, Sel-HITS, Structure Mining
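As an illustration of how link attributes could enter a PageRank computation, the sketch below runs a standard power-iteration PageRank in which each outgoing link carries a weight. It is not the incremental algorithm from the paper: the per-link weight is a hypothetical stand-in for attributes such as visibility and position, and the function name, parameters, and example graph are assumptions.

```python
def weighted_pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank with weighted outgoing links.

    links maps each source page to {target page: link weight}.
    A page's rank is split among its targets in proportion to the
    link weights; pages without outgoing links spread rank evenly.
    """
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    nodes = sorted(nodes)
    count = len(nodes)
    rank = {node: 1.0 / count for node in nodes}

    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / count for node in nodes}
        for source in nodes:
            targets = links.get(source, {})
            total_weight = sum(targets.values())
            if total_weight == 0:
                # Dangling page: distribute its rank uniformly.
                share = damping * rank[source] / count
                for node in nodes:
                    new_rank[node] += share
            else:
                for target, weight in targets.items():
                    new_rank[target] += (damping * rank[source]
                                         * weight / total_weight)
        rank = new_rank
    return rank


# Hypothetical graph: the link a -> b is weighted higher than a -> c,
# for example because it is more prominent on the page.
example = {"a": {"b": 3.0, "c": 1.0}, "b": {"a": 1.0}, "c": {"a": 1.0}}
ranks = weighted_pagerank(example)
```

In this example graph, page `b` ends up ranked above page `c` because it receives the larger share of `a`'s rank.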
A Survey on Cross-domain Recommendation: Taxonomies, Methods, and Future Directions
Traditional recommendation systems are faced with two long-standing
obstacles, namely, data sparsity and cold-start problems, which promote the
emergence and development of Cross-Domain Recommendation (CDR). The core idea
of CDR is to leverage information collected from other domains to alleviate the
two problems in one domain. Over the last decade, much effort has been
devoted to cross-domain recommendation. Recently, with the development of deep
learning and neural networks, a large number of methods have emerged. However,
there is a limited number of systematic surveys on CDR, especially regarding
the latest proposed methods as well as the recommendation scenarios and
recommendation tasks they address. In this survey paper, we first propose a
two-level taxonomy of cross-domain recommendation which classifies different
recommendation scenarios and recommendation tasks. We then introduce and
summarize existing cross-domain recommendation approaches under different
recommendation scenarios in a structured manner. We also organize commonly used
datasets. We conclude this survey by providing several potential research
directions in this field