Recommender Systems
The ongoing rapid expansion of the Internet greatly increases the necessity
of effective recommender systems for filtering the abundant information.
Extensive research on recommender systems is conducted by a broad range of
communities including social and computer scientists, physicists, and
interdisciplinary researchers. Despite substantial theoretical and practical
achievements, unification and comparison of different approaches are lacking,
which impedes further advances. In this article, we review recent developments
in recommender systems and discuss the major challenges. We compare and
evaluate available algorithms and examine their roles in future
developments. In addition to algorithms, physical aspects are described to
illustrate macroscopic behavior of recommender systems. Potential impacts and
future directions are discussed. We emphasize that recommendation has great
scientific depth and combines diverse research fields, which makes it of
interest to physicists as well as interdisciplinary researchers.
Comment: 97 pages, 20 figures (to appear in Physics Reports)
Pairwise gene GO-based measures for biclustering of high-dimensional expression data
Background: Biclustering algorithms search for groups of genes that share the same
behavior under a subset of samples in gene expression data. Nowadays, the biological
knowledge available in public repositories can be used to drive these algorithms to
find biclusters composed of functionally coherent groups of genes. On the other hand,
a distance among genes can be defined according to their information stored in Gene
Ontology (GO). Gene pairwise GO semantic similarity measures report a value for each
pair of genes which establishes their functional similarity. A scatter search-based
algorithm that optimizes a merit function that integrates GO information is studied in
this paper. This merit function uses a term that addresses the information through a GO
measure.
Results: The effect of two possible different gene pairwise GO measures on the
performance of the algorithm is analyzed. Firstly, three well known yeast datasets with
approximately one thousand genes are studied. Secondly, a group of human
datasets related to clinical data of cancer is also explored by the algorithm. Most of
these data are high-dimensional datasets composed of a huge number of genes. The
resultant biclusters reveal groups of genes linked by the same functionality when the
search procedure is driven by one of the proposed GO measures. Furthermore, a
qualitative biological study of a group of biclusters shows their relevance from a cancer
disease perspective.
Conclusions: It can be concluded that the integration of biological information
improves the performance of the biclustering process. The two different GO measures
studied show an improvement in the results obtained for the yeast dataset. However, if
datasets are composed of a huge number of genes, only one of them really improves
the algorithm performance. This second case constitutes a clear option to explore
interesting datasets from a clinical point of view.
Ministerio de Economía y Competitividad TIN2014-55894-C2-
A Survey on Event-based News Narrative Extraction
Narratives are fundamental to our understanding of the world, providing us
with a natural structure for knowledge representation over time. Computational
narrative extraction is a subfield of artificial intelligence that makes heavy
use of information retrieval and natural language processing techniques.
Despite the importance of computational narrative extraction, relatively little
scholarly work exists on synthesizing previous research and strategizing future
research in the area. In particular, this article focuses on extracting news
narratives from an event-centric perspective. Extracting narratives from news
data has multiple applications in understanding the evolving information
landscape. This survey presents an extensive study of research in the area of
event-based news narrative extraction. In particular, we screened over 900
articles that yielded 54 relevant articles. These articles are synthesized and
organized by representation model, extraction criteria, and evaluation
approaches. Based on the reviewed studies, we identify recent trends, open
challenges, and potential research lines.
Comment: 37 pages, 3 figures, to be published in the journal ACM CSU
Improved collaborative filtering using clustering and association rule mining on implicit data
Recommender systems have recently become more significant because of their ability to support decisions on appropriate choices. Collaborative Filtering (CF) is the most successful and most widely applied technique in the design of recommender systems: items are recommended to an active user based on the past rating records of like-minded users. Unfortunately, CF may lead to poor recommendations when user ratings on items are very sparse (an insufficient number of ratings) in comparison with the huge number of users and items in the user-item matrix. When user ratings on items are lacking, implicit feedback is used to profile a user's item preferences. Implicit feedback can indicate users' preferences by providing more evidence and information through observations of users' behaviors. Data mining techniques, which are the focus of this research, can predict a user's future behavior without item evaluation and can also analyze the user's preferences. To investigate the state of research in CF and implicit feedback, a systematic literature review has been conducted on published studies in these topic areas. To investigate the user activities that influence a recommender system based on the CF technique, a critical observation of public recommendation datasets has been carried out. To overcome the data sparsity problem, this research applies users' implicit interaction records with items and processes massive data efficiently by employing association rule mining (the Apriori algorithm). It uses item repetition within a transaction as an input for association rule mining, which can achieve high recommendation accuracy. To do this, a modified preprocessing step has been employed to discover similar interest patterns among users. In addition, a clustering technique (hierarchical clustering) has been used to reduce the size of the data and the dimensionality of the item space, and thereby to improve the performance of association rule mining.
Then, similarities between items based on their features have been computed to make recommendations. Experiments have been conducted on a public dataset, the Million Song dataset, and the results have been compared with basic CF and with other extended versions of CF, including K-Means Clustering, Hybrid Representation, and Probabilistic Learning. The experimental results demonstrate that the proposed technique improves on the basic CF technique by an average of 20% in terms of the Precision, Recall, and F-measure metrics. It also outperforms the other extended versions of CF (an average improvement of 15% in terms of the Precision and Recall metrics), even when the data is very sparse.
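As a rough illustration of the association rule idea mentioned in this abstract, the sketch below computes support and confidence for item pairs from implicit interaction records. It is not the authors' pipeline: it skips the full Apriori candidate generation, the item-repetition preprocessing, and the clustering step. The function name `pair_rules`, the thresholds, and the listening sessions are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def pair_rules(transactions, min_support=0.5, min_confidence=0.6):
    """Return sorted rules (antecedent, consequent, support, confidence) over item pairs."""
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for t in transactions:
        items = sorted(set(t))  # ignore repeats within one transaction
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # fraction of transactions containing both items
        if support < min_support:
            continue
        # Each frequent pair can yield a rule in either direction.
        for x, y in ((a, b), (b, a)):
            confidence = count / item_counts[x]  # estimate of P(y | x)
            if confidence >= min_confidence:
                rules.append((x, y, support, confidence))
    return sorted(rules)


# Hypothetical implicit feedback: songs played together in one session.
sessions = [
    ["song_a", "song_b", "song_c"],
    ["song_a", "song_b"],
    ["song_a", "song_c"],
    ["song_b", "song_d"],
]
rules = pair_rules(sessions)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent}: support={support:.2f}, confidence={confidence:.2f}")
```

A rule such as `song_c -> song_a` can then be read as "users who played `song_c` in a session also tended to play `song_a`", which needs no explicit ratings.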
A Hybrid Social Network-based Collaborative Filtering Method for Personalized Manufacturing Service Recommendation
Nowadays, social network-based collaborative filtering (CF) methods are widely applied to recommend suitable products to consumers by combining trust relationships and similarities in the preference ratings among past users. However, these types of methods are rarely used for recommending manufacturing services. Hence, this study has developed a hybrid social network-based CF method for recommending personalized manufacturing services. The trustworthy enterprises and three types of similar enterprises with different features were considered as the four influential components for calculating the predicted ratings of candidate services. The stochastic approach for link structure analysis (SALSA) was adopted to select the top K trustworthy enterprises while also considering their reputation propagation on the enterprise social network. The predicted ratings of candidate services were computed by using an extended user-based CF method in which the particle swarm optimization (PSO) algorithm was leveraged to optimize the weights of the four components, thus making service recommendation more objective. Finally, an evaluation experiment showed that the proposed method is more accurate than the traditional user-based CF method.
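For orientation, the traditional user-based CF baseline that this abstract compares against can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's hybrid method: it uses Pearson similarity over co-rated items and a mean-centered weighted average, and it omits the trust component, SALSA, and the PSO-tuned weights. The enterprise and service names and the ratings are hypothetical.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation over the items both users rated (0.0 if undefined)."""
    common = sorted(set(a) & set(b))
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[i] for i in common) / len(common)
    mean_b = sum(b[i] for i in common) / len(common)
    num = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    den = sqrt(sum((a[i] - mean_a) ** 2 for i in common)) * sqrt(
        sum((b[i] - mean_b) ** 2 for i in common)
    )
    return num / den if den else 0.0


def predict(ratings, user, item):
    """Predict a rating as the user's mean plus similarity-weighted neighbor deviations."""
    target = ratings[user]
    user_mean = sum(target.values()) / len(target)
    num = den = 0.0
    for other, theirs in ratings.items():
        if other == user or item not in theirs:
            continue
        sim = pearson(target, theirs)
        if sim <= 0:  # keep only positively correlated (like-minded) neighbors
            continue
        other_mean = sum(theirs.values()) / len(theirs)
        num += sim * (theirs[item] - other_mean)
        den += sim
    # Fall back to the user's own mean when no usable neighbor exists.
    return user_mean + num / den if den else user_mean


# Hypothetical ratings of manufacturing services by enterprises.
ratings = {
    "enterprise_1": {"service_1": 5, "service_2": 4},
    "enterprise_2": {"service_1": 5, "service_2": 4, "service_3": 5},
    "enterprise_3": {"service_1": 1, "service_2": 2, "service_3": 1},
}
predicted = predict(ratings, "enterprise_1", "service_3")
print(f"predicted rating: {predicted:.2f}")
```

Here only `enterprise_2` agrees with `enterprise_1` on the co-rated services, so the prediction follows its above-average rating of `service_3`; `enterprise_3` is negatively correlated and is ignored.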