38 research outputs found

    From Ontology to Semantic Similarity: Calculation of Ontology-Based Semantic Similarity

    Advances in high-throughput experimental techniques in the past decade have enabled an explosive increase in omics data, while effective organization, interpretation, and exchange of these data require standard and controlled vocabularies in the domain of biological and biomedical studies. Ontologies, as abstract description systems for domain-specific knowledge composition, have hence received increasing attention in computational biology and bioinformatics. In particular, many applications relying on domain ontologies require quantitative measures of relationships between terms in the ontologies, making it indispensable to develop computational methods for deriving ontology-based semantic similarity between terms. Nevertheless, with a variety of methods available, choosing a suitable method for a specific application becomes a problem. With this understanding, we review a majority of existing methods that rely on ontologies to calculate semantic similarity between terms. We classify existing methods into five categories: methods based on semantic distance, methods based on information content, methods based on properties of terms, methods based on ontology hierarchy, and hybrid methods. We summarize the characteristics of each category, with emphasis on the basic notions, advantages, and disadvantages of these methods. Further, we extend our review to software tools implementing these methods and applications using these methods.
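
    As an illustration of the "methods based on information content" category named in this abstract, the sketch below computes a Resnik-style similarity: the information content of the most informative common ancestor of two terms. The abstract does not name a specific measure, so the choice of this one is an assumption, and the toy ontology, term names, and annotation counts are all hypothetical.

```python
import math

# Hypothetical toy ontology: each term maps to its parent terms (is-a links).
PARENTS = {
    "root": [],
    "metabolism": ["root"],
    "signaling": ["root"],
    "lipid_metabolism": ["metabolism"],
    "sugar_metabolism": ["metabolism"],
}

# Hypothetical number of annotations made directly to each term.
DIRECT_COUNTS = {
    "root": 0,
    "metabolism": 2,
    "signaling": 4,
    "lipid_metabolism": 3,
    "sugar_metabolism": 1,
}


def ancestors(term):
    """Return the term itself together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content(term):
    """IC(term) = -log p(term), where p counts annotations to the term
    and to every descendant of the term."""
    total = sum(DIRECT_COUNTS.values())
    freq = sum(c for t, c in DIRECT_COUNTS.items() if term in ancestors(t))
    return -math.log(freq / total)


def resnik_similarity(term_a, term_b):
    """Information content of the most informative common ancestor."""
    common = ancestors(term_a) & ancestors(term_b)
    return max(information_content(t) for t in common)


# Siblings under "metabolism" share a more specific ancestor than
# terms that only meet at the root.
close = resnik_similarity("lipid_metabolism", "sugar_metabolism")
far = resnik_similarity("lipid_metabolism", "signaling")
print(close > far)
```

    In this toy setup the two metabolism terms share the ancestor "metabolism", whose information content is positive, whereas a metabolism term and "signaling" share only the root, whose information content is zero.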

    CDMF: A Deep Learning Model based on Convolutional and Dense-layer Matrix Factorization for Context-Aware Recommendation

    We propose a novel deep neural network based recommendation model named Convolutional and Dense-layer Matrix Factorization (CDMF) for context-aware recommendation, which combines multi-source information from item descriptions and tag information. CDMF adopts a convolutional neural network to extract hidden features from the item description, treated as a document, and then fuses them with tag information via a fully connected layer, thus generating a comprehensive feature vector. Building on matrix factorization, CDMF makes rating predictions from the fused information of both users and items. Experiments on a real dataset show that the proposed deep learning model clearly outperforms state-of-the-art recommendation methods.

    Constructing a gene semantic similarity network for the inference of disease genes

    Motivation: The inference of genes that are truly associated with inherited human diseases from a set of candidates resulting from genetic linkage studies has been one of the most challenging tasks in human genetics. Although several computational approaches have been proposed to prioritize candidate genes relying on protein-protein interaction (PPI) networks, these methods can usually cover less than half of known human genes.
    Results: We propose to rely on the biological process domain of the gene ontology to construct a gene semantic similarity network and then use the network to infer disease genes. We show that the constructed network covers about 50% more genes than a typical PPI network. By analyzing the gene semantic similarity network with the PPI network, we show that gene pairs tend to have higher semantic similarity scores if the corresponding proteins are closer to each other in the PPI network. By analyzing the gene semantic similarity network with a phenotype similarity network, we show that semantic similarity scores of genes associated with similar diseases are significantly different from those of genes selected at random, and that genes with higher semantic similarity scores tend to be associated with diseases with higher phenotype similarity scores. We further use the gene semantic similarity network with a random walk with restart model to infer disease genes. Through a series of large-scale leave-one-out cross-validation experiments, we show that the gene semantic similarity network can achieve not only higher coverage but also higher accuracy than the PPI network in the inference of disease genes.
    Contact: [email protected]
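
    The random walk with restart model mentioned in this abstract can be sketched as follows. This is a generic, minimal version of the technique and not the authors' implementation: the 4-gene network, its weights, the seed choice, and the restart probability of 0.5 are hypothetical values chosen for illustration.

```python
def random_walk_with_restart(weights, seeds, restart_prob=0.5,
                             tol=1e-10, max_iter=10000):
    """Iterate p <- (1 - r) * W p + r * p0 until convergence.

    weights: symmetric matrix (list of lists) of edge weights, for
             example pairwise similarity scores between genes.
    seeds:   indices of known disease genes; the walker restarts there.
    Returns one steady-state score per node.
    """
    n = len(weights)
    # Column sums are used to turn weights into transition probabilities.
    col_sums = [sum(weights[i][j] for i in range(n)) for j in range(n)]

    # Initial distribution: uniform over the seed nodes.
    p0 = [0.0] * n
    for s in seeds:
        p0[s] = 1.0 / len(seeds)

    p = p0[:]
    for _ in range(max_iter):
        nxt = []
        for i in range(n):
            flow = sum(weights[i][j] / col_sums[j] * p[j]
                       for j in range(n) if col_sums[j] > 0)
            nxt.append((1 - restart_prob) * flow + restart_prob * p0[i])
        # Stop when the L1 change between iterations is negligible.
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Hypothetical network: genes 0-1-2-3 connected in a chain.
network = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
scores = random_walk_with_restart(network, seeds=[0])

# Rank the non-seed candidates by their steady-state score.
ranking = sorted((g for g in range(4) if g != 0), key=lambda g: -scores[g])
print(ranking)
```

    Candidates that are closer to the seed gene in the network receive higher scores, which is the property a prioritization scheme of this kind relies on.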

    Exploring the Influencing Factors of IP Film Rating by Sentiment Analysis and GMM

    Recently, intellectual property (IP) film has become an important form of entertainment, and its rating has become the focus of quality evaluation. However, existing research seldom studies the factors that influence rating. In this paper, we use sentiment analysis and the generalized method of moments (GMM) to explore the factors that affect IP film rating. We draw on production, broadcast, genre, and audience feedback to construct six explanatory variables: actor influence, screenwriter participation, broadcast time, broadcast platform, genre, and adaptation satisfaction. We use the LLC, IPS, and Sargan tests to check variable stability and model specification. From the regression results for 134 IP films obtained by sample filtering, we derive the impact of each influencing factor on the rating. We find that short-term historical rating, actor influence, adaptation satisfaction, and screenwriter participation positively affect current rating, while long-term historical rating has a negative impact on current rating. In addition, broadcast time and broadcast platform have a positive impact on IP film rating, and genre has only a weak impact on rating. Our work provides advice for IP film producers, prompting them to improve quality by emphasizing celebrity effects and author participation.

    Does Daily Travel Pattern Disclose People’s Preference?

    Existing studies normally focus on extracting temporal or periodic patterns of people’s daily travel for location-based services. However, businesses pay much more attention to people’s characteristics and preferences. How to capture these characteristics from daily travel patterns is therefore an interesting question. To address it, we first develop two basic measures in terms of repetitiveness of travel and then two advanced measures that capture people’s activity of daily travel and the colorfulness of their lifestyle, respectively. Combining historical trajectories with real-time positions from a location-based social network (LBSN), namely Foursquare, we conduct a statistical analysis of people’s travel patterns in US cities. Finally, we illustrate people’s profiles of travel patterns and lifestyles. Results show that people’s preferences can be inferred from the developed activity and colorfulness measures. These findings suggest that the proposed measures can be effectively adopted by researchers for travel pattern analysis and preference analysis, and can further give suggestions to individuals for location-based decision making.