688 research outputs found

    Progresses and Challenges in Link Prediction

    Link prediction is a paradigmatic problem in network science that aims to estimate the likelihood that unobserved links exist, based on the known topology. After a brief introduction to the standard problem and the metrics of link prediction, this Perspective summarizes representative progress on local similarity indices, link predictability, network embedding, matrix completion, ensemble learning and other topics, mainly drawn from thousands of related publications in the last decade. Finally, this Perspective outlines some long-standing challenges for future studies. Comment: 45 pages, 1 tabl
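The abstract does not say which local similarity indices the Perspective covers. As a minimal sketch, assuming three widely used ones (common neighbours, Jaccard, and resource allocation), each unobserved node pair in an undirected graph can be scored as follows. The function name and the toy graph are hypothetical.

```python
from itertools import combinations


def local_similarity_scores(adjacency):
    """Score every non-adjacent node pair with three local similarity indices.

    adjacency: dict mapping node -> set of neighbouring nodes (undirected graph).
    Returns a dict mapping (u, v) -> {index name: score}.
    """
    scores = {}
    for u, v in combinations(sorted(adjacency), 2):
        if v in adjacency[u]:
            continue  # link already observed, nothing to predict
        common = adjacency[u] & adjacency[v]
        union = adjacency[u] | adjacency[v]
        scores[(u, v)] = {
            "common_neighbors": len(common),
            "jaccard": len(common) / len(union) if union else 0.0,
            # Resource allocation: each shared neighbour z contributes 1 / degree(z).
            "resource_allocation": sum(1.0 / len(adjacency[z]) for z in common),
        }
    return scores


# Toy undirected graph (hypothetical data); only the pair (a, d) is unlinked.
toy_graph = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c"},
}
toy_scores = local_similarity_scores(toy_graph)
```

A higher score is read as a higher likelihood that the missing link exists; ranking all unlinked pairs by one index gives the predicted links.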

    Interaction-aware Factorization Machines for Recommender Systems

    Factorization Machine (FM) is a widely used supervised learning approach that effectively models feature interactions. Despite the successful application of FM and its many deep learning variants, treating every feature interaction equally may degrade performance. For example, the interactions of a useless feature may introduce noise; the importance of a feature may also differ when it interacts with different features. In this work, we propose a novel model named Interaction-aware Factorization Machine (IFM) by introducing the Interaction-Aware Mechanism (IAM), which comprises the feature aspect and the field aspect, to learn flexible interactions on two levels. The feature aspect learns feature interaction importance via an attention network, while the field aspect learns the feature interaction effect as a parametric similarity between the feature interaction vector and the corresponding field interaction prototype. IFM introduces more structured control and learns feature interaction importance in a stratified manner, which allows more leverage in tuning the interactions at both the feature-wise and field-wise levels. In addition, we give a more generalized architecture and propose the Interaction-aware Neural Network (INN) and DeepIFM to capture higher-order interactions. To further improve both the performance and efficiency of IFM, a sampling scheme is developed to select interactions based on field aspect importance. Experimental results on two well-known datasets show the superiority of the proposed models over state-of-the-art methods.
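For context, the sketch below shows the standard second-order FM score that IFM builds on, not IFM itself. It assumes the usual formulation in which every feature i has a latent vector V[i] and each pair (i, j) contributes the dot product of V[i] and V[j] times x[i] * x[j], with every pair weighted equally. The function names and toy parameters are hypothetical.

```python
def fm_predict(x, w0, w, V):
    """Standard second-order FM score using the O(n * k) reformulation.

    x: feature values (length n), w0: bias, w: linear weights (length n),
    V: n latent vectors of length k, one per feature.
    """
    n, k = len(x), len(V[0])
    linear = w0 + sum(w[i] * x[i] for i in range(n))
    pairwise = 0.0
    for f in range(k):
        sum_vx = sum(V[i][f] * x[i] for i in range(n))
        sum_v2x2 = sum((V[i][f] * x[i]) ** 2 for i in range(n))
        # 0.5 * ((sum of terms)^2 - sum of squared terms) equals the sum over pairs i < j.
        pairwise += 0.5 * (sum_vx ** 2 - sum_v2x2)
    return linear + pairwise


def fm_predict_naive(x, w0, w, V):
    """Same score computed pair by pair; every interaction gets equal weight."""
    n = len(x)
    score = w0 + sum(w[i] * x[i] for i in range(n))
    for i in range(n):
        for j in range(i + 1, n):
            dot = sum(a * b for a, b in zip(V[i], V[j]))
            score += dot * x[i] * x[j]
    return score


# Toy parameters (hypothetical values).
x = [1.0, 0.0, 2.0]
w0, w = 0.5, [0.1, 0.2, 0.3]
V = [[1.0, 2.0], [3.0, 4.0], [0.5, -1.0]]
fm_score = fm_predict(x, w0, w, V)
```

The pair-by-pair loop in `fm_predict_naive` makes the equal weighting explicit; per the abstract, IFM replaces it with learned importance at the feature and field levels.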

    Statistical Significance of the Netflix Challenge

    Inspired by the legacy of the Netflix contest, we provide an overview of what has been learned, from our own efforts and those of others, concerning the problems of collaborative filtering and recommender systems. The data set consists of about 100 million movie ratings (from 1 to 5 stars) involving some 480 thousand users and some 18 thousand movies; the associated ratings matrix is about 99% sparse. The goal is to predict the ratings that users will give to movies; systems which can do this accurately have significant commercial applications, particularly on the world wide web. We discuss, in some detail, approaches to "baseline" modeling, singular value decomposition (SVD), as well as kNN (nearest neighbor) and neural network models; temporal effects, cross-validation issues, ensemble methods and other considerations are discussed as well. We compare existing models in a search for new models, and also discuss the mission-critical issues of penalization and parameter shrinkage which arise when the dimensions of a parameter space reach into the millions. Although much work on such problems has been carried out by the computer science and machine learning communities, our goal here is to address a statistical audience, and to provide a primarily statistical treatment of the lessons that have been learned from this remarkable set of data. Comment: Published at http://dx.doi.org/10.1214/11-STS368 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org
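The "about 99% sparse" figure can be checked from the rounded counts in the abstract. The sketch below does that arithmetic and also illustrates one simple form of parameter shrinkage, pulling an item's mean rating toward the global mean. The shrinkage formula and its `strength` parameter are illustrative assumptions, not the paper's exact estimator.

```python
def matrix_sparsity(num_ratings, num_users, num_items):
    """Fraction of user-item cells that hold no rating."""
    return 1.0 - num_ratings / (num_users * num_items)


def shrunken_mean(ratings, global_mean, strength):
    """Pull an item's mean rating toward the global mean.

    `strength` acts like a count of pseudo-ratings at the global mean,
    so items with few ratings are pulled harder (hypothetical parameter).
    """
    return (global_mean * strength + sum(ratings)) / (strength + len(ratings))


# Rounded counts quoted in the abstract: ~100 million ratings, ~480 thousand users, ~18 thousand movies.
netflix_sparsity = matrix_sparsity(100_000_000, 480_000, 18_000)
print(f"sparsity: {netflix_sparsity:.4f}")
```

With these rounded counts only about 1.2% of the 8.64 billion user-movie cells are filled, which is consistent with the stated figure.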

    Drug Target Interaction Prediction Using Machine Learning Techniques – A Review

    Drug discovery is a key process, given the rising and ubiquitous demand for medication to stay in good health throughout one's life. Drugs are small molecules that inhibit or activate the function of a protein, offering patients a host of therapeutic benefits. Drug design is the inventive process of finding new medication, based on targets or proteins. Identifying new drugs takes time and money, and this is where computer-aided drug design helps cut both. Drug design requires a drug target, which is a protein, and a drug compound, between which an interaction is established. Interaction, in this context, refers to the process of discovering protein binding sites, which are protein pockets that bind with drugs. Pockets are regions on a protein macromolecule that bind to drug molecules. Researchers have been working to determine new Drug Target Interactions (DTI), predicting whether or not a given drug molecule will bind to a target. Machine learning (ML) techniques help establish the interaction between drugs and their targets, using computer-aided drug design. This paper aims to better explore ML techniques for DTI prediction and to boost future research. Qualitative and quantitative analyses of ML techniques show that several have been applied to predict DTIs, employing a range of classifiers. Though DTI prediction improves with negative drug target pairs (DTP), the lack of true negative DTPs has led to the use of a particular dataset of drugs and targets. Using dynamic DTPs improves DTI prediction. Little attention has so far been paid to developing a new classifier for DTI classification, and there is, unquestionably, a need for better ones.

    Mining and Analyzing the Academic Network

    Social network research has attracted the interest of many researchers, not only in analyzing online social networking applications, such as Facebook and Twitter, but also in providing comprehensive services in the scientific research domain. We define an Academic Network as a social network which integrates scientific factors, such as authors, papers, affiliations and publishing venues, and their relationships, such as co-authorship among authors and citations among papers. By mining and analyzing the academic network, we can provide users with comprehensive services such as searching for research experts, published papers and conferences, as well as detecting research communities or the evolution of hot research topics. We can also provide recommendations to users on with whom to collaborate, whom to cite and where to submit. In this dissertation, we investigate two main tasks that have fundamental applications in academic network research. In the first, we address the problem of expertise retrieval, also known as expert finding or ranking, in which we identify and return a ranked list of researchers, based upon their estimated expertise or reputation, in response to user-specified queries. In the second, we address the problem of research action recommendation (prediction), specifically, the tasks of publishing venue recommendation, citation recommendation and coauthor recommendation. For both tasks, our principal goal is to effectively mine and integrate heterogeneous information and thereby develop well-functioning ranking or recommender systems.
For the task of expertise retrieval, we first proposed or applied three modified versions of PageRank-like algorithms to citation network analysis; we then proposed an enhanced author-topic model that simultaneously models citation and publishing venue information; finally, we incorporated a pair-wise learning-to-rank algorithm into the traditional topic modeling process, and further improved the model by integrating groups of author-specific features. For the task of research action recommendation, we first proposed an improved neighborhood-based collaborative filtering approach for publishing venue recommendation; we then applied our proposed enhanced author-topic model and demonstrated its effectiveness in both cited author prediction and publishing venue prediction; finally, we proposed an extended latent factor model that can jointly model several relations in an academic environment in a unified way and verified its performance in four recommendation tasks: author-co-authorship, author-paper citation, paper-paper citation and paper-venue submission. Extensive experiments conducted on large-scale real-world data sets demonstrated the superiority of our proposed models over other existing state-of-the-art methods.
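The abstract does not describe how the three PageRank-like variants are modified. As a point of reference only, here is a minimal sketch of plain PageRank by power iteration on a toy citation graph. The damping factor of 0.85, the iteration count, and the paper identifiers are assumptions.

```python
def pagerank(citations, damping=0.85, iterations=50):
    """Plain PageRank by power iteration.

    citations: dict mapping a paper to the list of papers it cites.
    Returns a dict mapping each paper to its rank; the ranks sum to 1.
    """
    nodes = sorted(set(citations) | {c for cited in citations.values() for c in cited})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            cited = citations.get(node, [])
            if cited:
                share = damping * rank[node] / len(cited)
                for target in cited:
                    new_rank[target] += share
            else:
                # Dangling paper (cites nothing): spread its rank uniformly.
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Toy citation graph (hypothetical): p1 and p2 both cite p3.
toy_ranks = pagerank({"p1": ["p3"], "p2": ["p3"], "p3": []})
```

In this toy graph the twice-cited paper p3 receives the highest rank; aggregating paper ranks per author would be one simple route from such scores to an expert ranking.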

    Algoritmo Híbrido de Recomendação (Hybrid Recommendation Algorithm)

    In the current technological era, more and more information is available on the internet, but much of it is not relevant. This creates a need for ways to filter information and so reduce the time spent collecting what is useful. Recommender systems, or recommenders, are an appealing answer, because they personalise searches and help users make better-informed choices. A recommender system's purpose is to recommend relevant items to users, and to do that it requires information about both users and items, so that it can better organise and categorise them. There are several types of recommenders, each with its own strengths and weaknesses. Hybrid recommenders combine characteristics of the different types so that each type suppresses, or at least limits, the weaknesses of the others. An important weakness of recommender systems occurs when the system does not have enough information to make a recommendation. This is known as the Cold Start problem, and it takes two forms: User Cold Start, where the missing information concerns a user, and Item Cold Start, where it concerns an item. This thesis focuses on User Cold Start, specifically on building a hybrid recommender system able to handle that situation. The approach presented combines customer segmentation with association rules. The goal is first to find the users most similar to a Cold Start user and then, from the items rated by those similar users, to recommend the most relevant ones, obtained through association rules.
The hybrid algorithm presented in this thesis finds and classifies all types of users. When a user in a Cold Start situation is looking for recommendations, the system finds items to recommend by applying association rules to the items rated by users in the same group as the Cold Start user, crossing those rules with the few items rated by the Cold Start user, and making its recommendations based on the result.
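The abstract gives only the outline of the algorithm, so the following is a rough sketch under stated assumptions rather than the thesis's implementation. It assumes each user already carries a segment label (how segments are formed is not shown), mines only one-to-one association rules filtered by confidence, and uses hypothetical function names, a hypothetical threshold, and toy data.

```python
from collections import Counter
from itertools import permutations


def mine_pair_rules(baskets, min_confidence):
    """Mine one-to-one rules a -> b with confidence = count(a and b) / count(a)."""
    item_count = Counter()
    pair_count = Counter()
    for basket in baskets:
        item_count.update(basket)
        pair_count.update(permutations(sorted(basket), 2))
    rules = {}
    for (a, b), n_ab in pair_count.items():
        confidence = n_ab / item_count[a]
        if confidence >= min_confidence:
            rules.setdefault(a, []).append((b, confidence))
    return rules


def recommend_for_cold_start(user_segment, user_items, segments, rated_items,
                             min_confidence=0.6, top_n=3):
    """Recommend items to a Cold Start user from rules mined in the user's segment."""
    # Item sets rated by established users in the same segment.
    baskets = [rated_items[u] for u, s in segments.items() if s == user_segment]
    rules = mine_pair_rules(baskets, min_confidence)
    best = {}
    # Cross the rules with the few items the Cold Start user has rated.
    for item in user_items:
        for consequent, confidence in rules.get(item, []):
            if consequent not in user_items:
                best[consequent] = max(best.get(consequent, 0.0), confidence)
    ranked = sorted(best.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


# Toy data (hypothetical): three users in one segment, one in another.
segments = {"u1": "young", "u2": "young", "u3": "young", "u4": "senior"}
rated_items = {"u1": {"A", "B"}, "u2": {"A", "B", "C"}, "u3": {"A", "C"}, "u4": {"A", "D"}}
recommendations = recommend_for_cold_start("young", {"A"}, segments, rated_items)
```

Restricting rule mining to the user's own segment is what keeps item D, rated only in the other segment, out of the recommendations here.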