81 research outputs found

    Joint embedding in Hierarchical distance and semantic representation learning for link prediction

    The link prediction task aims to predict missing entities or relations in a knowledge graph and is essential for downstream applications. Existing well-known models address this task mainly by representing knowledge graph triplets in either a distance space or a semantic space. However, they cannot fully capture the information of head and tail entities, nor do they make good use of hierarchical-level information. In this paper, we therefore propose a novel knowledge graph embedding model for the link prediction task, HIE, which models each triplet (h, r, t) in a distance measurement space and a semantic measurement space simultaneously. Moreover, HIE is introduced into a hierarchical-aware space to leverage rich hierarchical information about entities and relations for better representation learning. Specifically, we apply a distance transformation operation to the head entity in the distance space to obtain the tail entity, instead of using translation-based or rotation-based approaches. Experimental results on four real-world datasets show that HIE outperforms several existing state-of-the-art knowledge graph embedding methods on the link prediction task and handles complex relations accurately. Comment: Submitted to Big Data research one year ag

    ICD Coding from Clinical Text Using Multi-Filter Residual Convolutional Neural Network

    Automated ICD coding, which assigns International Classification of Disease codes to patient visits, has attracted much research attention because it can save time and labor in billing. The previous state-of-the-art model used a single convolutional layer to build document representations for predicting ICD codes. However, the lengths and grammar of the text fragments that are closely related to ICD coding vary considerably across documents. A flat, fixed-length convolutional architecture may therefore be unable to learn good document representations. In this paper, we propose a Multi-Filter Residual Convolutional Neural Network (MultiResCNN) for ICD coding. The innovations of our model are twofold: it uses a multi-filter convolutional layer to capture text patterns of different lengths, and a residual convolutional layer to enlarge the receptive field. We evaluated the effectiveness of our model on the widely used MIMIC dataset. On the full code set of MIMIC-III, our model outperformed the state-of-the-art model in 4 out of 6 evaluation metrics. On the top-50 code set of MIMIC-III and the full code set of MIMIC-II, our model outperformed all existing and state-of-the-art models in all evaluation metrics. The code is available at https://github.com/foxlf823/Multi-Filter-Residual-Convolutional-Neural-Network

    Interpretable bilinear attention network with domain adaptation improves drug-target prediction

    Predicting drug-target interaction is key for drug discovery. Recent deep learning-based methods show promising performance, but two challenges remain: (i) how to explicitly model and learn local interactions between drugs and targets for better prediction and interpretation; (ii) how to generalize prediction performance to novel drug-target pairs from a different distribution. In this work, we propose DrugBAN, a deep bilinear attention network (BAN) framework with domain adaptation that explicitly learns pair-wise local interactions between drugs and targets and adapts to out-of-distribution data. DrugBAN works on drug molecular graphs and target protein sequences to perform prediction, with conditional domain adversarial learning to align learned interaction representations across different distributions for better generalization to novel drug-target pairs. Experiments on three benchmark datasets under both in-domain and cross-domain settings show that DrugBAN achieves the best overall performance against five state-of-the-art baselines. Moreover, visualizing the learned bilinear attention map provides interpretable insights from prediction results. Comment: 16 pages, 6 figure

    Hierarchical Grammar-Induced Geometry for Data-Efficient Molecular Property Prediction

    The prediction of molecular properties is a crucial task in material and drug discovery. The potential benefits of deep learning techniques are reflected in the wealth of recent literature. Still, these techniques face a common challenge in practice: labeled data are limited by the cost of manual extraction from the literature and of laborious experimentation. In this work, we propose a data-efficient property predictor that uses a learnable hierarchical molecular grammar capable of generating molecules from grammar production rules. Such a grammar induces an explicit geometry on the space of molecular graphs, which provides an informative prior on molecular structural similarity. Property prediction is performed using graph neural diffusion over the grammar-induced geometry. On both small and large datasets, our evaluation shows that this approach outperforms a wide spectrum of baselines, including supervised and pre-trained graph neural networks. We include a detailed ablation study and further analysis of our solution, showing its effectiveness in cases with extremely limited data. Code is available at https://github.com/gmh14/Geo-DEG. Comment: 22 pages, 10 figures; ICML 202

    On the quantitative analysis of Deep Belief Networks

    Deep Belief Networks (DBNs) are generative models that contain many layers of hidden variables. Efficient greedy algorithms for learning and approximate inference have allowed these models to be applied successfully in many application domains. The main building block of a DBN is a bipartite undirected graphical model called a restricted Boltzmann machine (RBM). Due to the presence of the partition function, model selection, complexity control, and exact maximum likelihood learning in RBMs are intractable. We show that Annealed Importance Sampling (AIS) can be used to efficiently estimate the partition function of an RBM, and we present a novel AIS scheme for comparing RBMs with different architectures. We further show how an AIS estimator, along with approximate inference, can be used to estimate a lower bound on the log-probability that a DBN model with multiple hidden layers assigns to the test data. This is, to our knowledge, the first step towards obtaining quantitative results that would allow us to directly assess the performance of Deep Belief Networks as generative models of data.
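The AIS idea mentioned in this abstract can be illustrated with a minimal sketch for a tiny binary RBM. This is not the scheme from the paper: the uniform base distribution, the linearly spaced inverse temperatures, and all function names (`ais_log_z`, `exact_log_z`, and the helpers) are assumptions made for illustration. The brute-force `exact_log_z` is included only as a check, and is feasible only because the model is tiny.

```python
import itertools
import math
import random


def log1pexp(x):
    """Numerically stable log(1 + exp(x))."""
    return x + math.log1p(math.exp(-x)) if x > 0 else math.log1p(math.exp(x))


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def hidden_input(v, W, c, j):
    """Total input to hidden unit j given visible state v."""
    return c[j] + sum(W[i][j] * v[i] for i in range(len(v)))


def log_p_star(v, W, b, c, beta):
    """Log unnormalized marginal of v with all parameters scaled by beta
    (hidden units are summed out analytically)."""
    total = beta * sum(bi * vi for bi, vi in zip(b, v))
    for j in range(len(c)):
        total += log1pexp(beta * hidden_input(v, W, c, j))
    return total


def gibbs_step(v, W, b, c, beta, rng):
    """One block-Gibbs sweep (h given v, then v given h) at inverse temperature beta."""
    nh = len(c)
    h = [1 if rng.random() < sigmoid(beta * hidden_input(v, W, c, j)) else 0
         for j in range(nh)]
    new_v = []
    for i in range(len(b)):
        act = b[i] + sum(W[i][j] * h[j] for j in range(nh))
        new_v.append(1 if rng.random() < sigmoid(beta * act) else 0)
    return new_v


def ais_log_z(W, b, c, n_runs=100, n_betas=100, seed=0):
    """AIS estimate of log Z, annealing from beta = 0 (uniform) to beta = 1 (target)."""
    rng = random.Random(seed)
    nv, nh = len(b), len(c)
    log_z0 = (nv + nh) * math.log(2.0)  # at beta = 0 every joint state has weight 1
    log_weights = []
    for _ in range(n_runs):
        v = [rng.randint(0, 1) for _ in range(nv)]  # exact sample from the uniform base
        log_w = 0.0
        for k in range(1, n_betas + 1):
            prev_beta, beta = (k - 1) / n_betas, k / n_betas
            # Importance-weight update, then a transition that leaves p_beta invariant.
            log_w += log_p_star(v, W, b, c, beta) - log_p_star(v, W, b, c, prev_beta)
            v = gibbs_step(v, W, b, c, beta, rng)
        log_weights.append(log_w)
    m = max(log_weights)
    mean_w = sum(math.exp(lw - m) for lw in log_weights) / n_runs
    return log_z0 + m + math.log(mean_w)


def exact_log_z(W, b, c):
    """Brute-force log Z over all 2**nv visible states (tiny models only)."""
    terms = [log_p_star(list(v), W, b, c, 1.0)
             for v in itertools.product((0, 1), repeat=len(b))]
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))
```

For a model with 3 visible and 2 hidden units, `ais_log_z(W, b, c)` can be compared directly with `exact_log_z(W, b, c)`; the exact sum grows as 2**nv, which is the intractability the abstract refers to.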

    Théorie de l’évidence pour suivi de visage

    This paper addresses real-time face detection and tracking with a video camera from the standpoint of evidential fusion. The method relies on a simple, fast learning stage based on a supervised initialization. The transferable belief model is used to compensate for the incompleteness of the prior face model, which stems from the non-exhaustive training set. The algorithm works in two steps. The detection phase synthesizes an evidential face model by converting the attributes of the Viola and Jones face detector into belief functions and merging them with colour mass functions that model a skin-tone detector operating in an original chromatic space obtained by a logarithmic transformation. To fuse the dependent colour sources, we propose a compromise operator inspired by the Denœux cautious rule. In the tracking phase, the pignistic probabilities derived from the face model guarantee compatibility between the belief and probability frameworks. They feed a classical particle filter that tracks the face in real time. We analyse the influence of the parameters of the evidential model on tracking quality.
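The pignistic probabilities mentioned in this abstract come from the standard pignistic transformation of the transferable belief model, which splits the mass of each focal set equally among its elements. A minimal sketch follows; the function name `pignistic`, the dictionary representation of the mass function, and the example labels are assumptions for illustration and do not reproduce the face model of the paper.

```python
def pignistic(masses):
    """Pignistic transformation of a mass function.

    `masses` maps frozenset focal elements to their mass. Mass on the empty
    set (conflict) is renormalized away:
    BetP(x) = sum over A containing x of m(A) / (|A| * (1 - m(empty))).
    """
    empty_mass = masses.get(frozenset(), 0.0)
    if empty_mass >= 1.0:
        raise ValueError("all mass is on the empty set; BetP is undefined")
    norm = 1.0 - empty_mass
    bet = {}
    for focal, mass in masses.items():
        if not focal:
            continue  # the empty set carries no hypothesis to share mass with
        share = mass / (len(focal) * norm)
        for hypothesis in focal:
            bet[hypothesis] = bet.get(hypothesis, 0.0) + share
    return bet


# Hypothetical two-class frame: mass on "face", on ignorance, and on conflict.
example = {
    frozenset({"face"}): 0.5,
    frozenset({"face", "background"}): 0.3,
    frozenset(): 0.2,
}
bet_p = pignistic(example)
```

The resulting `bet_p` is an ordinary probability distribution over the frame, which is what allows it to be used as input to a probabilistic tracker such as a particle filter.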

    Inversion pour image texturée : déconvolution myope non supervisée, choix de modèles, déconvolution-segmentation

    This thesis addresses a series of inverse problems of major importance in image processing (image segmentation, model choice, parameter estimation, deconvolution) in the context of textured images. In all of these problems the observations are indirect, i.e., the textured images are degraded by blur and noise. The contributions of this work fall into three main classes: modeling, methodological and algorithmic. From the modeling standpoint, the contribution is a new non-Gaussian model for textures. The Fourier coefficients of the textured images are modeled by a Scale Mixture of Gaussians random field. The power spectral density of the texture has a parametric form, driven by a set of parameters that encode the texture characteristics. The methodological contribution is threefold and consists in solving three image processing problems that had not previously been tackled in the context of indirect observations of textured images. All the proposed methods are Bayesian, unsupervised, and based on exploiting the information encoded in the a posteriori law. The first method is devoted to the myopic deconvolution of a textured image and the estimation of its parameters. The second achieves joint model selection and model parameter estimation from an indirect observation of a textured image. The third addresses joint deconvolution and segmentation of an image composed of several textured regions, while simultaneously estimating the hyperparameters (signal level and noise level) and the parameters of each constituent texture. Finally, the algorithmic contribution is a new, efficient version of the Metropolis-Hastings algorithm, with a directional component of the proposal function based on the "Newton direction" and the Fisher information matrix. This directional component allows efficient exploration of the parameter space and consequently increases the convergence speed of the algorithm. To summarize, this work presents a series of methods to solve three image processing problems in the context of blurry and noisy textured images. Moreover, we present two connected contributions, one regarding the texture models and one meant to enhance the performance of the samplers employed for all three methods.
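The idea of a Metropolis-Hastings proposal with a directional component can be sketched with a simpler stand-in: a Langevin-type (MALA) proposal that drifts along the plain gradient of the log-target. The thesis instead uses a Newton direction built from the Fisher information matrix, which this sketch does not implement. The function name `mala`, the one-dimensional Gaussian target, and the step size are assumptions for illustration.

```python
import math
import random


def mala(log_p, grad_log_p, x0, step, n_samples, seed=0):
    """Metropolis-adjusted Langevin sampler for a one-dimensional target.

    The proposal is a Gaussian centred on x + 0.5 * step**2 * grad_log_p(x),
    i.e. a random walk with a drift along the gradient direction.
    Returns the list of samples and the acceptance rate.
    """
    rng = random.Random(seed)

    def log_q(x_to, x_from):
        # Log proposal density q(x_to | x_from), up to a constant that cancels.
        mean = x_from + 0.5 * step ** 2 * grad_log_p(x_from)
        return -((x_to - mean) ** 2) / (2.0 * step ** 2)

    x = x0
    samples = []
    accepted = 0
    for _ in range(n_samples):
        proposal = x + 0.5 * step ** 2 * grad_log_p(x) + step * rng.gauss(0.0, 1.0)
        # Metropolis-Hastings ratio with the asymmetric proposal correction.
        log_alpha = (log_p(proposal) + log_q(x, proposal)
                     - log_p(x) - log_q(proposal, x))
        if rng.random() < math.exp(min(0.0, log_alpha)):
            x = proposal
            accepted += 1
        samples.append(x)
    return samples, accepted / n_samples


# Hypothetical target: a unit-variance Gaussian centred at 2.
samples, rate = mala(
    log_p=lambda x: -0.5 * (x - 2.0) ** 2,
    grad_log_p=lambda x: -(x - 2.0),
    x0=0.0,
    step=0.9,
    n_samples=20000,
)
```

Replacing the gradient drift with a curvature-scaled direction would move this sketch closer to the Newton-type proposal described in the abstract, at the cost of evaluating second-order information at each step.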