222 research outputs found

    Addressing Item-Cold Start Problem in Recommendation Systems using Model Based Approach and Deep Learning

    Traditional recommendation systems rely on past usage data to generate new recommendations. These approaches fail to generate sensible recommendations for users and items that are new to the system, because no information about their past interactions is available. In this paper, we propose a solution to the item cold-start problem that uses a model-based approach and recent advances in deep learning. In particular, we use a latent factor model for recommendation, and predict the latent factors from items' descriptions using a convolutional neural network when they cannot be obtained from usage data. Latent factors obtained by applying matrix factorization to the available usage data are used as ground truth to train the convolutional neural network. To create latent factor representations for new items, the convolutional neural network uses their textual descriptions. The experimental results show that the proposed approach significantly outperforms several baseline estimators.
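The pipeline described in this abstract (factorize usage data, then learn to predict item factors from descriptions) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: a plain linear map trained by gradient descent stands in for the convolutional neural network, and the ratings, vocabulary, and function names are all hypothetical.

```python
import random

def factorize(ratings, n_users, n_items, k=2, epochs=300, lr=0.02, reg=0.02, seed=0):
    """SGD matrix factorization on (user, item, rating) triples."""
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(P[u][f] * Q[i][f] for f in range(k))
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

def fit_text_to_factors(features, targets, epochs=2000, lr=0.1):
    """Fit a linear map W (k x d) from description features to item factors.

    Assumption: this linear regressor replaces the CNN used in the paper.
    The item factors from matrix factorization serve as regression targets.
    """
    d, k = len(features[0]), len(targets[0])
    W = [[0.0] * d for _ in range(k)]
    for _ in range(epochs):
        for x, y in zip(features, targets):
            for f in range(k):
                err = y[f] - sum(W[f][j] * x[j] for j in range(d))
                for j in range(d):
                    W[f][j] += lr * err * x[j]
    return W

def predict_factors(W, x):
    """Predict latent factors for an item from its description features."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

# Hypothetical toy data: (user, item, rating) triples for three "warm" items.
ratings = [(0, 0, 5), (0, 1, 4), (0, 2, 1),
           (1, 0, 4), (1, 1, 5), (1, 2, 1),
           (2, 0, 1), (2, 1, 2), (2, 2, 5)]
# Bag-of-words features over a made-up vocabulary ["space", "robot", "romance"],
# one row per warm item.
descriptions = [[1, 1, 0], [1, 0, 0], [0, 0, 1]]

P, Q = factorize(ratings, n_users=3, n_items=3)
W = fit_text_to_factors(descriptions, Q)

# A new item with no ratings, described by the words "robot" and "romance".
q_new = predict_factors(W, [0, 1, 1])
scores = [sum(p * q for p, q in zip(P[u], q_new)) for u in range(3)]
print(scores)
```

The new item is scored for each user with the same dot product used for warm items, which is what lets a latent factor model absorb items that have no usage history.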

    Traitement du signal audio-visuel et visiophone personne libre (Audio-visual signal processing and the "free person" videophone)

    Visual and acoustic information is at the heart of (tele)communication between people. The face is the main source of information. Motion detection and skin-tone detection techniques delimit regions of interest where faces may be found. A neural network detects the face and provides its position and scale. The detected face is tracked in real time by a motorized camera and by an acoustic antenna that generates a steerable lobe. Image capture and sound capture are thus permanently centered on the user, who is free to move and free of any specific equipment. The audio-visual signal processing is integrated into LISTEN, a demonstrator of the "personne libre" ("free person") videophone.

    Large Scale Application of Neural Network Based Semantic Role Labeling for Automated Relation Extraction from Biomedical Texts

    To reduce the increasing amount of time spent on literature search in the life sciences, several methods for automated knowledge extraction have been developed. Co-occurrence based approaches can deal with large text corpora like MEDLINE in an acceptable time but are not able to extract any specific type of semantic relation. Semantic relation extraction methods based on syntax trees, on the other hand, are computationally expensive and the interpretation of the generated trees is difficult. Several natural language processing (NLP) approaches for the biomedical domain exist focusing specifically on the detection of a limited set of relation types. For systems biology, generic approaches for the detection of a multitude of relation types which in addition are able to process large text corpora are needed but the number of systems meeting both requirements is very limited. We introduce the use of SENNA (“Semantic Extraction using a Neural Network Architecture”), a fast and accurate neural network based Semantic Role Labeling (SRL) program, for the large scale extraction of semantic relations from the biomedical literature. A comparison of processing times of SENNA and other SRL systems or syntactical parsers used in the biomedical domain revealed that SENNA is the fastest Proposition Bank (PropBank) conforming SRL program currently available. 89 million biomedical sentences were tagged with SENNA on a 100 node cluster within three days. The accuracy of the presented relation extraction approach was evaluated on two test sets of annotated sentences resulting in precision/recall values of 0.71/0.43. We show that the accuracy as well as processing speed of the proposed semantic relation extraction approach is sufficient for its large scale application on biomedical text. The proposed approach is highly generalizable regarding the supported relation types and appears to be especially suited for general-purpose, broad-scale text mining systems. 
    The presented approach bridges the gap between fast, co-occurrence-based approaches lacking semantic relations and highly specialized and computationally demanding NLP approaches.
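The figures quoted in this abstract can be put on a common footing with a little arithmetic. The sketch below derives an F1 score from the reported precision/recall and an average per-node throughput from the reported cluster run. Neither derived number appears in the abstract; both are computed here for illustration, and because the run finished "within three days", the throughput is a lower bound.

```python
# Values reported in the abstract.
precision, recall = 0.71, 0.43
sentences = 89_000_000
nodes = 100
days = 3  # "within three days", so the true duration may be shorter

# Harmonic mean of precision and recall (derived, not reported in the abstract).
f1 = 2 * precision * recall / (precision + recall)

# Average sentences per node per second, assuming all nodes ran the full three days.
seconds = days * 24 * 60 * 60
per_node_per_second = sentences / (nodes * seconds)

print(f"F1 = {f1:.3f}")
print(f"throughput = {per_node_per_second:.2f} sentences/node/s (lower bound)")
```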

    Deep Memory Networks for Attitude Identification

    We consider the task of identifying attitudes towards a given set of entities from text. Conventionally, this task is decomposed into two separate subtasks: target detection, which identifies whether each entity is mentioned in the text, either explicitly or implicitly, and polarity classification, which classifies the exact sentiment towards an identified entity (the target) as positive, negative, or neutral. Instead, we show that attitude identification can be solved with an end-to-end machine learning architecture, in which the two subtasks are interleaved by a deep memory network. In this way, signals produced in target detection provide clues for polarity classification, and conversely, the predicted polarity provides feedback to the identification of targets. Moreover, the treatments for the set of targets also influence each other: the learned representations may share the same semantics for some targets but vary for others. The proposed deep memory network, the AttNet, outperforms methods that do not consider the interactions between the subtasks or those among the targets, including conventional machine learning methods and the state-of-the-art deep learning models. Comment: Accepted to WSDM'1

    Deep Learning of Representations: Looking Forward

    Deep learning research aims at discovering learning algorithms that discover multiple levels of distributed representations, with higher levels representing more abstract concepts. Although the study of deep learning has already led to impressive theoretical results, learning algorithms and breakthrough experiments, several challenges lie ahead. This paper proposes to examine some of these challenges, centering on the questions of scaling deep learning algorithms to much larger models and datasets, reducing optimization difficulties due to ill-conditioning or local minima, designing more efficient and powerful inference and sampling procedures, and learning to disentangle the factors of variation underlying the observed data. It also proposes a few forward-looking research directions aimed at overcoming these challenges

    The supernova rate in local galaxy clusters

    We report a measurement of the supernova (SN) rates (Ia and core-collapse) in galaxy clusters based on the 136 SNe of the sample described in Cappellaro et al. (1999) and Mannucci et al. (2005). Early-type cluster galaxies show a type Ia SN rate (0.066 SNuM) similar to that obtained by Sharon et al. (2007) and more than 3 times larger than that in field early-type galaxies (0.019 SNuM). This difference has a 98% statistical confidence level. We examine many possible observational biases which could affect the rate determination, and conclude that none of them is likely to significantly alter the results. We investigate how the rate is related to several properties of the parent galaxies, and find that cluster membership, morphology and radio power all affect the SN rate, while galaxy mass has no measurable effect. The increased rate may be due to galaxy interactions in clusters, inducing either the formation of young stars or a different evolution of the progenitor binary systems. We present the first measurement of the core-collapse SN rate in cluster late-type galaxies, which turns out to be comparable to the rate in field galaxies. This suggests that no large systematic difference in the initial mass function exists between the two environments. Comment: MNRAS, revised version after referee's comment

    An incremental dual nu-support vector regression algorithm

    © 2018, Springer International Publishing AG, part of Springer Nature. Support vector regression (SVR) has been a hot research topic for several years, as it is an effective regression learning algorithm. Early studies on SVR mostly focused on solving large-scale problems. Nowadays, an increasing number of researchers are focusing on incremental SVR algorithms. However, these incremental SVR algorithms cannot handle uncertain data, which are very common in real life, because they require the data in the training examples to be precise. Therefore, to handle the incremental regression problem with uncertain data, an incremental dual nu-support vector regression algorithm (dual-v-SVR) is proposed. In the algorithm, a dual-v-SVR formulation is first designed to handle the uncertain data; we then design two special adjustments that enable the dual-v-SVR model to learn incrementally: incremental adjustment and decremental adjustment. Finally, the experimental results demonstrate that the incremental dual-v-SVR algorithm is an efficient incremental algorithm that is not only capable of solving the incremental regression problem with uncertain data, but also faster than batch or other incremental SVR algorithms.

    The stellar populations of early-type galaxies -- II. The effects of environment and mass

    The degree of influence that environment and mass have on the stellar populations of early-type galaxies is uncertain. In this paper we present the results of a spectroscopic analysis of the stellar populations of early-type galaxies aimed at addressing this question. The sample of galaxies is drawn from four clusters, with <z> = 0.04, and their surrounding structure extending to ~10R_{vir}. We find that the distributions of the absorption-line strengths and the stellar population parameters age, metallicity and alpha-element abundance ratio do not differ significantly between the clusters and their outskirts, but the tight correlations found between these quantities and velocity dispersion within the clusters are weaker in their outskirts. All three stellar population parameters of cluster galaxies are positively correlated with velocity dispersion. Galaxies in clusters form a homogeneous class of objects that have similar distributions of line-strengths and stellar population parameters, and follow similar scaling relations regardless of cluster richness or morphology. We estimate the intrinsic scatter of the Gaussian distribution of metallicities to be 0.3 dex, while that of the alpha-element abundance ratio is 0.07 dex. The e-folding time of the exponential distribution of galaxy ages is estimated to be 900 Myr. The intrinsic scatters of the metallicity and alpha-element abundance ratio distributions can almost entirely be accounted for by the correlations with velocity dispersion and the intrinsic scatter about these relations. This implies that a galaxy's mass plays the major role in determining its stellar population. Comment: 20 pages, 12 figures, 5 tables, accepted by MNRA

    Detecting Remote Evolutionary Relationships among Proteins by Large-Scale Semantic Embedding

    Virtually every molecular biologist has searched a protein or DNA sequence database to find sequences that are evolutionarily related to a given query. Pairwise sequence comparison methods (i.e., measures of similarity between query and target sequences) provide the engine for sequence database search and have been the subject of 30 years of computational research. For the difficult problem of detecting remote evolutionary relationships between protein sequences, the most successful pairwise comparison methods involve building local models (e.g., profile hidden Markov models) of protein sequences. However, recent work in massive data domains like web search and natural language processing demonstrates the advantage of exploiting the global structure of the data space. Motivated by this work, we present a large-scale algorithm called ProtEmbed, which learns an embedding of protein sequences into a low-dimensional “semantic space.” Evolutionarily related proteins are embedded in close proximity, and additional pieces of evidence, such as 3D structural similarity or class labels, can be incorporated into the learning process. We find that ProtEmbed achieves superior accuracy to widely used pairwise sequence methods like PSI-BLAST and HHSearch for remote homology detection; it also outperforms our previous RankProp algorithm, which incorporates global structure in the form of a protein similarity network. Finally, the ProtEmbed embedding space can be visualized, both at the global level and local to a given query, yielding intuition about the structure of protein sequence space.
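Once sequences live in an embedding space where related proteins sit close together, a database search reduces to ranking stored vectors by proximity to the query vector. The sketch below illustrates only that retrieval step. The abstract does not say which proximity measure ProtEmbed uses, so cosine similarity is an assumption here, and the protein names and two-dimensional vectors are made up.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_by_similarity(query, database):
    """Return database keys ordered from most to least similar to the query."""
    return sorted(database, key=lambda name: cosine(query, database[name]), reverse=True)

# Hypothetical embeddings; a real system would learn these from sequence data.
database = {
    "protA": [0.9, 0.1],
    "protB": [0.0, 1.0],
    "protC": [-1.0, 0.0],
}
ranking = rank_by_similarity([1.0, 0.0], database)
print(ranking)
```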