
    Graph augmentation learning

    Graph Augmentation Learning (GAL) provides outstanding solutions for graph learning when handling incomplete data, noisy data, etc. Numerous GAL methods have been proposed for graph-based applications such as social network analysis and traffic flow forecasting. However, the underlying reasons for the effectiveness of these GAL methods are still unclear. As a consequence, how to choose the optimal graph augmentation strategy for a given application scenario remains a black box. Scholars lack a systematic, comprehensive, and experimentally validated guideline for GAL. Therefore, in this survey, we review GAL techniques in depth at the macro (graph), meso (subgraph), and micro (node/edge) levels. We further illustrate in detail how GAL enhances data quality and model performance. The aggregation mechanism of augmentation strategies and graph learning models is also discussed for different application scenarios, i.e., data-specific, model-specific, and hybrid scenarios. To better show the advantages of GAL, we experimentally validate the effectiveness and adaptability of different GAL strategies in different downstream tasks. Finally, we share our insights on several open issues of GAL, including heterogeneity, spatio-temporal dynamics, scalability, and generalization. © 2022 ACM

    Representation Learning for Attributed Multiplex Heterogeneous Network

    Network embedding (or graph embedding) has been widely used in many real-world applications. However, existing methods mainly focus on networks with single-typed nodes/edges and cannot scale well to handle large networks. Many real-world networks consist of billions of nodes and edges of multiple types, and each node is associated with different attributes. In this paper, we formalize the problem of embedding learning for the Attributed Multiplex Heterogeneous Network and propose a unified framework to address this problem. The framework supports both transductive and inductive learning. We also give a theoretical analysis of the proposed framework, showing its connection with previous works and proving its better expressiveness. We conduct systematic evaluations of the proposed framework on four different genres of challenging datasets: Amazon, YouTube, Twitter, and Alibaba. Experimental results demonstrate that with the learned embeddings from the proposed framework, we can achieve statistically significant improvements (e.g., 5.99-28.23% lift by F1 scores; p<<0.01, t-test) over previous state-of-the-art methods for link prediction. The framework has also been successfully deployed on the recommendation system of a worldwide leading e-commerce company, Alibaba Group. Results of the offline A/B tests on product recommendation further confirm the effectiveness and efficiency of the framework in practice. Comment: Accepted to KDD 2019. Website: https://sites.google.com/view/gatn

    A comparison study on feature selection of DNA structural properties for promoter prediction

    Background: Promoter prediction is an integral step in understanding gene regulation and annotating genomes. Traditional promoter analysis is mainly based on sequence compositional features. Recently, many kinds of structural features have been employed in promoter prediction. However, given the high dimensionality and the risk of overfitting, it is infeasible to use all available features for promoter prediction. It is therefore necessary to choose appropriate features for the prediction task. Results: This paper conducts an extensive comparison study on feature selection of DNA structural properties for promoter prediction. Firstly, to examine whether promoters possess some special structures, we carry out a systematic comparison among the profiles of thirteen structural features on promoter and non-promoter sequences. Secondly, we investigate the correlations between these structural features and promoter sequences. Thirdly, both filter and wrapper methods are used to select appropriate feature subsets from thirteen different kinds of structural features for promoter prediction, and the predictive power of the selected feature subsets is evaluated. Finally, we compare the prediction performance of the feature subsets selected in this paper with nine existing promoter prediction approaches. Conclusions: Experimental results show that the structural features are differentially correlated with promoters. Specifically, DNA-bending stiffness, DNA denaturation, and energy-related features are highly correlated with promoters. The predictive power for promoter sequences differs greatly among structural features. Selecting the relevant features can significantly improve the accuracy of promoter prediction.
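The filter step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a simple correlation-based filter (ranking each feature by the absolute Pearson correlation with the promoter/non-promoter label); the abstract does not say which filter criterion the authors used. The function names `pearson` and `filter_select`, the feature names, and the toy values are all hypothetical and invented purely for illustration.

```python
from statistics import mean, pstdev


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = mean(xs), mean(ys)
    cov = mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])
    sx, sy = pstdev(xs), pstdev(ys)
    if sx == 0 or sy == 0:
        return 0.0  # a constant feature carries no linear signal
    return cov / (sx * sy)


def filter_select(features, labels, k):
    """Rank features by |correlation with the label| and keep the top k.

    features: dict mapping a feature name to one value per sequence
    labels:   1 for promoter, 0 for non-promoter (hypothetical encoding)
    """
    scores = {name: abs(pearson(vals, labels)) for name, vals in features.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k]


# Toy values, invented for illustration only (not data from the paper).
features = {
    "bending_stiffness": [0.9, 0.8, 0.2, 0.1],
    "denaturation": [0.7, 0.9, 0.3, 0.2],
    "noise_feature": [0.5, 0.1, 0.5, 0.1],
}
labels = [1, 1, 0, 0]

selected = filter_select(features, labels, k=2)
print(selected)
```

A filter like this scores each feature independently of any classifier, which is what distinguishes it from a wrapper method, where candidate subsets are scored by training and evaluating a predictor.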

    Deep Learning for Sensor-based Human Activity Recognition: Overview, Challenges and Opportunities

    The vast proliferation of sensor devices and the Internet of Things enables applications of sensor-based activity recognition. However, substantial challenges can affect the performance of a recognition system in practical scenarios. Recently, as deep learning has demonstrated its effectiveness in many areas, many deep methods have been investigated to address the challenges in activity recognition. In this study, we present a survey of state-of-the-art deep learning methods for sensor-based human activity recognition. We first introduce the multi-modality of the sensory data and provide information on public datasets that can be used for evaluation in different challenge tasks. We then propose a new taxonomy to structure the deep methods by challenges. Challenges and challenge-related deep methods are summarized and analyzed to form an overview of the current research progress. At the end of this work, we discuss the open issues and provide some insights for future directions.

    Neuroimaging of structural pathology and connectomics in traumatic brain injury: Toward personalized outcome prediction.

    Recent contributions to the body of knowledge on traumatic brain injury (TBI) favor the view that multimodal neuroimaging using structural and functional magnetic resonance imaging (MRI and fMRI, respectively) as well as diffusion tensor imaging (DTI) has excellent potential to identify novel biomarkers and predictors of TBI outcome. This is particularly the case when such methods are appropriately combined with volumetric/morphometric analysis of brain structures and with the exploration of TBI-related changes in brain network properties at the level of the connectome. In this context, our present review summarizes recent developments on the roles of these two techniques in the search for novel structural neuroimaging biomarkers that have TBI outcome prognostication value. The themes being explored cover notable trends in this area of research, including (1) the role of advanced MRI processing methods in the analysis of structural pathology, (2) the use of brain connectomics and network analysis to identify outcome biomarkers, and (3) the application of multivariate statistics to predict outcome using neuroimaging metrics. The goal of the review is to draw the community's attention to these recent advances in TBI outcome prediction methods and to encourage the development of new methodologies whereby structural neuroimaging can be used to identify biomarkers of TBI outcome.

    A Systematic Prediction of Multiple Drug-Target Interactions from Chemical, Genomic, and Pharmacological Data

    In silico prediction of drug-target interactions from heterogeneous biological data can advance our system-level search for drug molecules and therapeutic targets, an effort that has not yet reached full fruition. In this work, we report a systematic approach that efficiently integrates chemical, genomic, and pharmacological information for drug targeting and discovery on a large scale, based on two powerful methods, Random Forest (RF) and Support Vector Machine (SVM). The performance of the derived models was evaluated and verified with internal five-fold cross-validation and four external independent validations. The optimal models show impressive prediction performance for drug-target interactions, with a concordance of 82.83%, a sensitivity of 81.33%, and a specificity of 93.62%. The consistency of the performance of the RF and SVM models demonstrates the reliability and robustness of the obtained models. In addition, the validated models were employed to systematically predict known/unknown drugs and targets involving enzymes, ion channels, GPCRs, and nuclear receptors, which can be further mapped to functional ontologies such as target-disease associations and target-target interaction networks. This approach is expected to help fill the existing gap between chemical genomics and network pharmacology and thus accelerate the drug discovery process.
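For readers unfamiliar with the three metrics quoted in the abstract above, the following minimal sketch shows how they are commonly computed from binary confusion-matrix counts. It assumes that "concordance" means overall accuracy, (TP + TN) / total, which the abstract does not state explicitly. The function name `confusion_metrics` and the counts used are hypothetical and are not taken from the paper.

```python
def confusion_metrics(tp, fn, tn, fp):
    """Return (concordance, sensitivity, specificity) as fractions in [0, 1].

    tp/fn: interacting drug-target pairs predicted correctly/incorrectly
    tn/fp: non-interacting pairs predicted correctly/incorrectly
    """
    positives = tp + fn
    negatives = tn + fp
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative example")
    concordance = (tp + tn) / (positives + negatives)  # assumed: overall accuracy
    sensitivity = tp / positives  # true positive rate
    specificity = tn / negatives  # true negative rate
    return concordance, sensitivity, specificity


# Hypothetical counts chosen for easy arithmetic, not results from the paper.
concordance, sensitivity, specificity = confusion_metrics(tp=80, fn=20, tn=90, fp=10)
print(f"concordance={concordance:.2%} "
      f"sensitivity={sensitivity:.2%} specificity={specificity:.2%}")
```

Under this definition, concordance is a weighted average of sensitivity and specificity, so it always lies between the two.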

    Evaluating social science and humanities knowledge production: An exploratory analysis of dynamics in science systems

    Knowledge is gaining increasing importance in modern-day society as a factor of production and, ultimately, growth. This article explores the dynamics in university knowledge production and its effect on the state of university-industry-policy exchange in the Netherlands. Science systems are said to be in transformation. The university has evolved from performing conventional research and educational functions to serving as an innovation-promoting knowledge hub; dynamics that have received mixed reactions. The social sciences and humanities (SSH) take a special position, insofar as their focus seems primarily to be placed on conventional research and educational functions, and not directly on (commercial) valorization. Societal changes are, however, pressing for a reconsideration of the role of SSH. In our article, we distinguish between three important new movements that seem to be affecting SSH. It is believed that these movements, which are already having an impact today, will considerably influence SSH in the future. These developments are further differentiation, synthesis between the various subdisciplines of SSH and the natural sciences, and shifts in paradigms. The aims of this article are twofold: (1) to assess what is believed to be the most likely development of SSH by means of discovering relevant subsets of factors influencing university knowledge production; and (2) to discover whether the knowledge production factors show characteristics of a general development similar to the "Mode 2" concept. A systematic qualitative database was created by means of 22 semi-structured personal interviews with key representatives from business, university and the policy sector. Our explanatory framework employs an artificial intelligence method, i.e. rough set analysis. On the basis of these results, we find that a small minority of the respondents prefers a closer relationship of SSH to society, government and industry, and other institutional centers of authority, whilst interdisciplinarity in particular is regarded as having an overall positive influence on the future of SSH in the Netherlands. Consequently, the idea of a clear distinction between Mode 1 and Mode 2 knowledge production, i.e. traditional knowledge and knowledge produced in the context of application, is not supported by our data. © 2009 Interdisciplinary Centre for Comparative Research in the Social Sciences and ICCR Foundation

    Deep Learning for Genomics: A Concise Overview

    Advancements in genomic research such as high-throughput sequencing techniques have driven modern genomic studies into "big data" disciplines. This data explosion is constantly challenging conventional methods used in genomics. In parallel with the urgent demand for robust algorithms, deep learning has succeeded in a variety of fields such as vision, speech, and text processing. Yet genomics entails unique challenges to deep learning since we are expecting from deep learning a superhuman intelligence that explores beyond our knowledge to interpret the genome. A powerful deep learning model should rely on insightful utilization of task-specific knowledge. In this paper, we briefly discuss the strengths of different deep learning models from a genomic perspective so as to fit each particular task with a proper deep architecture, and remark on practical considerations of developing modern deep learning architectures for genomics. We also provide a concise review of deep learning applications in various aspects of genomic research, as well as pointing out potential opportunities and obstacles for future genomics applications. Comment: Invited chapter for Springer Book: Handbook of Deep Learning Application