4,552 research outputs found

    Combining privileged information to improve context-aware recommender systems

    A recommender system is an information filtering technology that can be used to predict preference ratings of items (products, services, movies, etc.) and/or to output a ranking of items that are likely to be of interest to the user. Context-aware recommender systems (CARS) learn and predict the tastes and preferences of users by incorporating available contextual information in the recommendation process. One of the major challenges in context-aware recommender systems research is the lack of automatic methods to obtain contextual information for these systems. To address this, in this paper we propose to use contextual information from topic hierarchies of the items (web pages) to improve the performance of context-aware recommender systems. The topic hierarchies are constructed by an extension of the LUPI-based Incremental Hierarchical Clustering method that considers three types of information: traditional bag-of-words (technical information), and the combination of named entities (privileged information I) with domain terms (privileged information II). We evaluated the contextual information in four context-aware recommender systems. Different weights were assigned to each type of information. The empirical results demonstrated that topic hierarchies with the combination of the two kinds of privileged information can provide better recommendations. FAPESP (grant #2010/20564-8, #2012/13830-9, and #2013/16039-3, São Paulo Research Foundation (FAPESP)) CAPE
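
    The abstract above mentions assigning different weights to three types of information. The following is a minimal sketch of one way such a weighting could work, assuming each document is described by one term-count vector per view and that views are compared with cosine similarity. It is not the LUPI-based Incremental Hierarchical Clustering extension itself; the view names, the function names, and the weight values are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def combined_similarity(doc_a, doc_b, weights):
    """Weighted average of per-view cosine similarities.

    doc_a, doc_b: dicts mapping a view name to a term-count vector.
    weights: dict mapping the same view names to non-negative weights
             (hypothetical values; the abstract does not report them).
    """
    total = sum(weights.values())
    if total == 0:
        raise ValueError("at least one weight must be positive")
    score = sum(w * cosine(doc_a[view], doc_b[view])
                for view, w in weights.items())
    return score / total


# Toy documents with three assumed views.
page_1 = {
    "bag_of_words": Counter({"clustering": 2, "topic": 1}),
    "named_entities": Counter({"FAPESP": 1}),
    "domain_terms": Counter({"recommender system": 1}),
}
page_2 = {
    "bag_of_words": Counter({"clustering": 2, "topic": 1}),
    "named_entities": Counter({"CAPES": 1}),
    "domain_terms": Counter({"recommender system": 1}),
}
weights = {"bag_of_words": 0.5, "named_entities": 0.25, "domain_terms": 0.25}
similarity = combined_similarity(page_1, page_2, weights)
```

    In this toy case the two pages agree on the bag-of-words and domain-term views and disagree on named entities, so the combined score falls below 1 by the share of weight given to the named-entity view.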

    Named entities as privileged information for hierarchical text clustering

    Text clustering is a text mining task which is often used to aid the organization, knowledge extraction, and exploratory search of text collections. Automatic text clustering has become essential as the volume and variety of digital text documents increase, whether in social networks and the Web or inside organizations. This paper explores the use of named entities as privileged information in a hierarchical clustering process, so as to improve cluster quality and interpretation. We carried out an experimental evaluation on three text collections (one written in Portuguese and two written in English), and the results show that named entities can be applied as privileged information to power clustering solutions in dynamic text collection scenarios. FAPESP (grant #2010/20564-8, #2012/13830-9, #2013/14757-6 and #2013/16039-3

    A chemo-centric view of human health and disease

    Efforts to compile the phenotypic effects of drugs and environmental chemicals offer the opportunity to adopt a chemo-centric view of human health that does not require detailed mechanistic information. Here we consider thousands of chemicals and analyse the relationship of their structures with adverse and therapeutic responses. Our study includes molecules related to the aetiology of 934 health-threatening conditions and used to treat 835 diseases. We first identify chemical moieties that could be independently associated with each phenotypic effect. Using these fragments, we build accurate predictors for approximately 400 clinical phenotypes, finding many privileged and liable structures. Finally, we connect two diseases if they relate to similar chemical structures. The resulting networks of human conditions are able to predict disease comorbidities, as well as identifying potential drug side effects and opportunities for drug repositioning, and show a remarkable coincidence with clinical observations
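
    The step of connecting two diseases when they relate to similar chemical structures can be illustrated with a small sketch. It assumes each disease is represented by a set of associated chemical fragments and that similarity is measured with the Jaccard index against a fixed threshold. Both choices are assumptions made for illustration, since the abstract does not state the similarity measure; the fragment labels and disease names are placeholders, not data from the study.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def disease_network(fragments_by_disease, threshold=0.5):
    """Return edges (disease_1, disease_2, score) for pairs whose
    fragment sets have a Jaccard index of at least `threshold`.

    The threshold value is hypothetical.
    """
    edges = []
    for d1, d2 in combinations(sorted(fragments_by_disease), 2):
        score = jaccard(fragments_by_disease[d1], fragments_by_disease[d2])
        if score >= threshold:
            edges.append((d1, d2, score))
    return edges


# Placeholder data: fragment identifiers are made up for the example.
fragments_by_disease = {
    "disease_A": {"frag_1", "frag_2", "frag_3"},
    "disease_B": {"frag_2", "frag_3", "frag_4"},
    "disease_C": {"frag_9"},
}
edges = disease_network(fragments_by_disease)
```

    Here only disease_A and disease_B share enough fragments to be linked; disease_C stays isolated.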

    Applying multi-view based metadata in personalized ranking for recommender systems

    In this paper, we propose a multi-view based technique for extracting metadata from unstructured textual content, to be applied in recommendation algorithms based on latent factors. The solution aims to reduce the intense and time-consuming human effort needed to identify, collect and label descriptions of the items. Our proposal uses an unsupervised learning method to construct topic hierarchies with named entity recognition as privileged information. We evaluate the technique using different recommendation algorithms, and show that better accuracy is obtained when additional information about items is considered. São Paulo Research Foundation (FAPESP) (Grants 2012/13830-9, 2013/16039-3, 2013/22547-1) CAPE

    Privileged information for hierarchical document clustering: a metric learning approach

    Traditional hierarchical text clustering methods assume that the documents are represented only by “technical information”, i.e., keywords, phrases, expressions and named entities that can be directly extracted from the texts. However, in many scenarios there is additional, valuable information about the documents which is usually disregarded during the clustering task, such as user-validated tags, annotations and comments from experts, dictionaries and domain ontologies. Recently, Vapnik introduced a new learning paradigm, called LUPI (Learning Using Privileged Information), which allows the incorporation of this additional (privileged) information in a supervised learning setting. We investigated the incorporation of privileged information in an unsupervised setting. The key idea in our proposed approach is to extract important relationships among documents represented in the privileged information space and use them to learn a more accurate metric for text clustering in the technical information space. A thorough experimental evaluation indicates that incorporating privileged information through metric learning significantly improves hierarchical clustering accuracy. São Paulo Research Foundation (FAPESP) (grants 2010/20564-8, 2011/17366-2, 2011/19850-9, 2012/13830-9, 2013/16039-3, 2013/22547-1) PROPP/UFMS CAPES CNP
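
    The key idea above, using relationships found in the privileged space to shape a metric in the technical space, can be sketched with a deliberately simple heuristic. The code assumes dense numeric vectors, labels document pairs as similar or dissimilar by thresholding their squared distance in the privileged space, and then weights each technical feature by how much more it varies across dissimilar pairs than across similar ones. This diagonal weighting is a stand-in chosen for brevity, not the metric learning method of the paper; all function names, the `near` threshold, and the toy vectors are hypothetical.

```python
from itertools import combinations


def sq_diffs(x, y):
    """Per-feature squared differences between two vectors."""
    return [(a - b) ** 2 for a, b in zip(x, y)]


def learn_diagonal_metric(technical, privileged, near=1.0, eps=1e-3):
    """Learn one non-negative weight per technical feature.

    A pair is 'similar' when its squared distance in the privileged
    space is at most `near`, otherwise 'dissimilar'. Each weight is the
    ratio of the mean squared difference over dissimilar pairs to that
    over similar pairs, normalised so the weights sum to 1.
    """
    dim = len(technical[0])
    sim_sum, dis_sum = [0.0] * dim, [0.0] * dim
    n_sim = n_dis = 0
    for i, j in combinations(range(len(technical)), 2):
        diffs = sq_diffs(technical[i], technical[j])
        if sum(sq_diffs(privileged[i], privileged[j])) <= near:
            sim_sum = [s + d for s, d in zip(sim_sum, diffs)]
            n_sim += 1
        else:
            dis_sum = [s + d for s, d in zip(dis_sum, diffs)]
            n_dis += 1
    weights = []
    for s, d in zip(sim_sum, dis_sum):
        mean_sim = s / n_sim if n_sim else 0.0
        mean_dis = d / n_dis if n_dis else 0.0
        weights.append(mean_dis / (mean_sim + eps))
    total = sum(weights)
    return [w / total for w in weights] if total else [1.0 / dim] * dim


def weighted_distance(x, y, weights):
    """Squared Euclidean distance under the learned diagonal metric."""
    return sum(w * d for w, d in zip(weights, sq_diffs(x, y)))


# Toy data: feature 0 follows the privileged grouping, feature 1 is noise.
technical = [[0, 0], [0, 1], [1, 0], [1, 1]]
privileged = [[0], [0], [5], [5]]
weights = learn_diagonal_metric(technical, privileged)
```

    The resulting `weighted_distance` could then replace the plain Euclidean distance in any standard hierarchical clustering routine that operates on the technical vectors alone, which matters because privileged information is typically unavailable for new documents.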