Link prediction in drug-target interactions network using similarity indices.
BACKGROUND: In silico drug-target interaction (DTI) prediction plays an integral role in drug repositioning: the discovery of new uses for existing drugs. One popular method of drug repositioning is network-based DTI prediction, which uses complex network theory to predict DTIs from a drug-target network. Currently, most network-based DTI prediction relies on machine learning methods such as Restricted Boltzmann Machines (RBM) or Support Vector Machines (SVM). These methods require additional information about the characteristics of drugs, targets and DTIs, such as chemical structure, genome sequence, binding types and causes of interactions, and do not perform satisfactorily when such information is unavailable. We propose a new, alternative method for DTI prediction that attempts to solve this problem using only network topology information. RESULTS: We compare our method for DTI prediction against the well-known RBM approach. We show that when applied to the MATADOR database, our approach based on node neighborhoods yields higher precision for high-ranking predictions than RBM when no information regarding DTI types is available. CONCLUSION: This demonstrates that approaches based purely on network topology provide a more suitable approach to DTI prediction in the many real-life situations where little or no prior knowledge is available about the characteristics of drugs, targets, or their interactions.
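The abstract does not list the specific similarity indices used, so the sketch below only illustrates the general idea of topology-only scoring on a bipartite drug-target graph: a candidate pair is scored by how strongly the drug's target neighbourhood overlaps with those of drugs already linked to the target. The drug and target names are made up.

```python
# Minimal sketch of topology-only DTI scoring on a bipartite drug-target graph.
# Illustrative only: the paper's exact similarity indices are not reproduced here.
import networkx as nx

G = nx.Graph()
known_dtis = [("drugA", "target1"), ("drugA", "target2"),
              ("drugB", "target2"), ("drugB", "target3"),
              ("drugC", "target1")]
G.add_edges_from(known_dtis)

def neighbourhood_score(graph, drug, target):
    """Score a candidate (drug, target) link by the Jaccard overlap between the
    drug's target set and the target sets of drugs already bound to the target."""
    drug_targets = set(graph[drug])
    score = 0.0
    for other_drug in graph[target]:          # drugs already linked to the target
        other_targets = set(graph[other_drug])
        union = drug_targets | other_targets
        if union:
            score += len(drug_targets & other_targets) / len(union)
    return score

# Rank unobserved drug-target pairs by this score (higher = more likely DTI).
print(neighbourhood_score(G, "drugC", "target2"))
```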
Bimodal network architectures for automatic generation of image annotation from text
Medical image analysis practitioners have embraced big data methodologies. This has created a need for large annotated datasets. The source of big data is typically large image collections and clinical reports recorded for these images. In many cases, however, building algorithms aimed at segmentation and detection of disease requires a training dataset with markings of the areas of interest on the image that match the described anomalies. This process of annotation is expensive and needs the involvement of clinicians. In this work we propose two separate deep neural network architectures for automatic marking of a region of interest (ROI) on the image best representing a finding location, given a textual report or a set of keywords. One architecture consists of LSTM and CNN components and is trained end to end with images, matching text, and markings of ROIs for those images. The output layer estimates the coordinates of the vertices of a polygonal region. The second architecture uses a network pre-trained on a large dataset of the same image types for learning feature representations of the findings of interest. We show that for a variety of findings from chest X-ray images, both proposed architectures learn to estimate the ROI, as validated by clinical annotations. There is a clear advantage obtained from the architecture with the pre-trained imaging network: the centroids of the ROIs marked by this network were on average at a distance equivalent to 5.1% of the image width from the centroids of the ground truth ROIs.
Comment: Accepted to MICCAI 2018, LNCS 1107
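As a rough illustration of the first architecture described above (not the authors' exact model), the sketch below fuses a CNN image encoder with an LSTM text encoder and regresses the coordinates of a fixed number of polygon vertices; all layer sizes and the vertex count are illustrative assumptions.

```python
# Hedged sketch of a bimodal text+image model that regresses polygon vertices
# for an ROI. Not the paper's architecture; sizes are illustrative.
import torch
import torch.nn as nn

class BimodalROIRegressor(nn.Module):
    def __init__(self, vocab_size, embed_dim=128, hidden_dim=256, n_vertices=8):
        super().__init__()
        self.n_vertices = n_vertices
        # CNN branch encodes the image into a feature vector.
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # LSTM branch encodes the report text / keywords.
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        # Fused features are mapped to (x, y) coordinates of the polygon vertices.
        self.head = nn.Linear(32 + hidden_dim, n_vertices * 2)

    def forward(self, image, tokens):
        img_feat = self.cnn(image)                    # (B, 32)
        _, (h_n, _) = self.lstm(self.embed(tokens))   # h_n: (1, B, hidden_dim)
        fused = torch.cat([img_feat, h_n[-1]], dim=1)
        return self.head(fused).view(-1, self.n_vertices, 2)

model = BimodalROIRegressor(vocab_size=5000)
coords = model(torch.randn(2, 1, 224, 224), torch.randint(0, 5000, (2, 20)))
print(coords.shape)  # torch.Size([2, 8, 2])
```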
Unsupervised Declarative Knowledge Induction for Constraint-Based Learning of Information Structure in Scientific Documents
Inferring the information structure of scientific documents is useful for many NLP applications. Existing approaches to this task require substantial human effort. We propose a framework for constraint learning that reduces human involvement considerably. Our model uses topic models to identify latent topics and their key linguistic features in input documents, induces constraints from this information and maps sentences to their dominant information structure categories through a constrained unsupervised model. When the induced constraints are combined with a fully unsupervised model, the resulting model challenges existing lightly supervised feature-based models as well as unsupervised models that use manually constructed declarative knowledge. Our results demonstrate that useful declarative knowledge can be learned from data with very limited human involvement.
This is the final published version. It first appeared at https://tacl2013.cs.columbia.edu/ojs/index.php/tacl/article/view/472
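The abstract only names the topic-model step, so the sketch below shows that step in isolation: discover latent topics over sentences and treat each topic's top terms as candidate lexical constraints that could bias an unsupervised sentence classifier. The sentences and the suggested category mapping are invented for illustration; the paper's constraint-induction procedure is not reproduced.

```python
# Hedged sketch: LDA topics over sentences, with top terms per topic taken as
# candidate (soft) constraints. Data and mapping are illustrative only.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

sentences = [
    "We propose a new method for parsing scientific text.",
    "The system was evaluated on three benchmark datasets.",
    "Results show a significant improvement over the baseline.",
    "Previous work has focused on supervised approaches.",
]

vectorizer = CountVectorizer(stop_words="english")
X = vectorizer.fit_transform(sentences)

lda = LatentDirichletAllocation(n_components=3, random_state=0).fit(X)
terms = vectorizer.get_feature_names_out()

# Top terms per latent topic serve as induced constraints, e.g. a topic
# containing "results", "improvement" might be tied to a Results-like category.
for k, weights in enumerate(lda.components_):
    top = [terms[i] for i in weights.argsort()[::-1][:5]]
    print(f"topic {k}: {top}")
```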
Neural networks for open and closed Literature-based Discovery
Funder: Cambridge Commonwealth, European and International Trust; funder-id: http://dx.doi.org/10.13039/501100003343Funder: St. Edmund’s College, University of Cambridge; funder-id: http://dx.doi.org/10.13039/501100005705Literature-based Discovery (LBD) aims to discover new knowledge automatically from large collections of literature. Scientific literature is growing at an exponential rate, making it difficult for researchers to stay current in their discipline and easy to miss knowledge necessary to advance their research. LBD can facilitate hypothesis testing and generation and thus accelerate scientific progress. Neural networks have demonstrated improved performance on LBD-related tasks but are yet to be applied to it. We propose four graph-based, neural network methods to perform open and closed LBD. We compared our methods with those used by the state-of-the-art LION LBD system on the same evaluations to replicate recently published findings in cancer biology. We also applied them to a time-sliced dataset of human-curated peer-reviewed biological interactions. These evaluations and the metrics they employ represent performance on real-world knowledge advances and are thus robust indicators of approach efficacy. In the first experiments, our best methods performed 2-4 times better than the baselines in closed discovery and 2-3 times better in open discovery. In the second, our best methods performed almost 2 times better than the baselines in open discovery. These results are strong indications that neural LBD is potentially a very effective approach for generating new scientific discoveries from existing literature. The code for our models and other information can be found at: https://github.com/cambridgeltl/nn_for_LBD
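To make the closed-discovery setting concrete, the sketch below ranks candidate intermediate concepts B between a start concept A and an end concept C using node embeddings. It illustrates the general idea only, not the paper's specific graph-based neural architectures; the embeddings and concept names are made up.

```python
# Hedged sketch of closed literature-based discovery with node embeddings.
import numpy as np

rng = np.random.default_rng(0)
concepts = ["A_drug", "B1_gene", "B2_pathway", "B3_protein", "C_disease"]
emb = {c: rng.normal(size=64) for c in concepts}   # stand-in node embeddings

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def closed_discovery(a, c, candidates):
    """Score each candidate B by the minimum of its similarities to A and C,
    so a good B must relate to both endpoints."""
    scores = {b: min(cosine(emb[a], emb[b]), cosine(emb[c], emb[b]))
              for b in candidates}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

print(closed_discovery("A_drug", "C_disease", ["B1_gene", "B2_pathway", "B3_protein"]))
```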
Schur index and line operators
4d SCFTs and their invariants can often be enriched by non-local BPS operators. In this paper we study the flavored Schur index of several types of N = 2 SCFTs with and without line operators, using a series of new integration formulae for elliptic functions and Eisenstein series. We demonstrate how to evaluate analytically the Schur index for a series of class- theories and the SO(7) theory. For all class- theories we obtain closed-form expressions for the SU(2) Wilson line index, and the 't Hooft line index in some simple cases. We also observe a relation between the line operator index and the characters of the associated chiral algebras. Wilson line indices for some other low-rank gauge theories are also studied.
Comment: 72 pages, 9 figures, 5 tables
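For orientation, the standard definition of the flavored Schur limit of the 4d N = 2 superconformal index is recalled below; the paper's new integration formulae for elliptic functions and Eisenstein series are not reproduced here.

```latex
\begin{equation}
  \mathcal{I}_{\text{Schur}}(q;\,b)
    \;=\; \operatorname{Tr}_{\mathcal{H}[S^3]} \, (-1)^{F} \, q^{\,E - R}
          \prod_{i} b_i^{\,f_i},
\end{equation}
% where $E$ is the conformal dimension, $R$ the Cartan of the $SU(2)_R$
% R-symmetry, $F$ the fermion number, and $b_i$ fugacities for the Cartan
% generators $f_i$ of the flavor symmetry. Inserting a line operator replaces
% the trace by one over the corresponding defect Hilbert space.
```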
Neural networks for link prediction in realistic biomedical graphs: a multi-dimensional evaluation of graph embedding-based approaches.
Background: Link prediction in biomedical graphs has several important applications, including Drug-Target Interaction (DTI) prediction, Protein-Protein Interaction (PPI) prediction and Literature-Based Discovery (LBD). It can be done using a classifier to output the probability of link formation between nodes. Recently several works have used neural networks to create node representations which allow rich inputs to neural classifiers. Preliminary work has been done on this and reports promising results. However, these studies did not use realistic settings such as time-slicing, evaluate performance with comprehensive metrics, or explain when or why neural network methods outperform. We investigated how inputs from four node representation algorithms affect the performance of a neural link predictor on random- and time-sliced biomedical graphs of real-world sizes (∼6 million edges) containing information relevant to DTI, PPI and LBD. We compared the performance of the neural link predictor to that of established baselines and report performance across five metrics.
Results: In random- and time-sliced experiments, when the neural network methods were able to learn good node representations and there was a negligible number of disconnected nodes, those approaches outperformed the baselines. In the smallest graph (∼15,000 edges) and in larger graphs with approximately 14% disconnected nodes, baselines such as Common Neighbours proved a justifiable choice for link prediction. At low recall levels (∼0.3) the approaches were mostly equal, but at higher recall levels across all nodes and in average performance at individual nodes, neural network approaches were superior. Analysis showed that neural network methods performed well on links between nodes with no previous common neighbours, potentially the most interesting links. Additionally, while neural network methods benefit from large amounts of data, they require considerable amounts of computational resources to utilise them.
Conclusions: Our results indicate that when there is enough data for the neural network methods to use and there is a negligible number of disconnected nodes, those approaches outperform the baselines. At low recall levels the approaches are mostly equal, but at higher recall levels and in average performance at individual nodes, neural network approaches are superior. Performance at nodes without common neighbours, which indicate more unexpected and perhaps more useful links, accounts for this.
This work was supported by the Medical Research Council [grant number MR/M013049/1] and the Cambridge Commonwealth, European and International Trust.
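The contrast drawn above between neighbourhood baselines and embedding-based neural predictors can be sketched as follows; the graph, embeddings and classifier settings are illustrative only, not the paper's experimental setup.

```python
# Hedged sketch: Common Neighbours baseline vs. a neural link predictor over
# concatenated node embeddings. Illustrative data and settings only.
import networkx as nx
import numpy as np
from sklearn.neural_network import MLPClassifier

G = nx.karate_club_graph()                       # stand-in biomedical graph
rng = np.random.default_rng(0)
emb = {n: rng.normal(size=32) for n in G.nodes}  # stand-in node representations

def common_neighbours(u, v):
    return len(set(G[u]) & set(G[v]))            # baseline score

# Toy training set: existing edges as positives, sampled non-edges as negatives.
pos = list(G.edges)
neg = list(nx.non_edges(G))[: len(pos)]
X = np.array([np.concatenate([emb[u], emb[v]]) for u, v in pos + neg])
y = np.array([1] * len(pos) + [0] * len(neg))

clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0).fit(X, y)

u, v = 0, 33
print("common neighbours:", common_neighbours(u, v))
print("neural score:", clf.predict_proba(np.concatenate([emb[u], emb[v]])[None])[0, 1])
```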
A Comparison and User-based Evaluation of Models of Textual Information Structure in the Context of Cancer Risk Assessment
BACKGROUND: Many practical tasks in biomedicine require accessing specific types of information in scientific literature; e.g. information about the results or conclusions of the study in question. Several schemes have been developed to characterize such information in scientific journal articles. For example, a simple section-based scheme assigns individual sentences in abstracts under sections such as Objective, Methods, Results and Conclusions. Some schemes of textual information structure have proved useful for biomedical text mining (BIO-TM) tasks (e.g. automatic summarization). However, user-centered evaluation in the context of real-life tasks has been lacking. METHODS: We take three schemes of different type and granularity - those based on section names, Argumentative Zones (AZ) and Core Scientific Concepts (CoreSC) - and evaluate their usefulness for a real-life task which focuses on biomedical abstracts: Cancer Risk Assessment (CRA). We annotate a corpus of CRA abstracts according to each scheme, develop classifiers for automatic identification of the schemes in abstracts, and evaluate both the manual and automatic classifications directly as well as in the context of CRA. RESULTS: Our results show that for each scheme, the majority of categories appear in abstracts, although two of the schemes (AZ and CoreSC) were developed originally for full journal articles. All the schemes can be identified in abstracts relatively reliably using machine learning. Moreover, when cancer risk assessors are presented with scheme-annotated abstracts, they find relevant information significantly faster than when presented with unannotated abstracts, even when the annotations are produced using an automatic classifier. Interestingly, in this user-based evaluation the coarse-grained scheme based on section names proved nearly as useful for CRA as the finest-grained CoreSC scheme. CONCLUSIONS: We have shown that existing schemes aimed at capturing information structure of scientific documents can be applied to biomedical abstracts and can be identified in them automatically with an accuracy which is high enough to benefit a real-life task in biomedicine.
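A minimal sketch of the simplest of the three schemes, a section-name style sentence classifier for abstracts (Objective / Methods / Results / Conclusions), is shown below. The training sentences are invented for illustration; the paper's AZ and CoreSC classifiers and its CRA corpus are not reproduced here.

```python
# Hedged sketch of a section-name sentence classifier; illustrative data only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

train_sentences = [
    "The aim of this study was to assess carcinogenic risk.",
    "Rats were exposed to the compound for 90 days.",
    "Tumour incidence increased significantly in treated animals.",
    "These findings suggest the compound is a potential carcinogen.",
]
train_labels = ["Objective", "Methods", "Results", "Conclusions"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(train_sentences, train_labels)

print(clf.predict(["Mice were given daily oral doses of the test substance."]))
```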
I2V-Adapter: A General Image-to-Video Adapter for Diffusion Models
Text-guided image-to-video (I2V) generation aims to generate a coherent video that preserves the identity of the input image and semantically aligns with the input prompt. Existing methods typically augment pretrained text-to-video (T2V) models by either concatenating the image with noised video frames channel-wise before being fed into the model or injecting the image embedding produced by pretrained image encoders in cross-attention modules. However, the former approach often necessitates altering the fundamental weights of pretrained T2V models, thus restricting the model's compatibility within the open-source communities and disrupting the model's prior knowledge. Meanwhile, the latter typically fails to preserve the identity of the input image. We present I2V-Adapter to overcome such limitations. I2V-Adapter adeptly propagates the unnoised input image to subsequent noised frames through a cross-frame attention mechanism, maintaining the identity of the input image without any changes to the pretrained T2V model. Notably, I2V-Adapter only introduces a few trainable parameters, significantly alleviating the training cost, and also ensures compatibility with existing community-driven personalized models and control tools. Moreover, we propose a novel Frame Similarity Prior to balance the motion amplitude and the stability of generated videos through two adjustable control coefficients. Our experimental results demonstrate that I2V-Adapter is capable of producing high-quality videos. This performance, coupled with its agility and adaptability, represents a substantial advancement in the field of I2V, particularly for personalized and controllable applications.
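In the spirit of the cross-frame attention mechanism described above, the sketch below lets each frame's latent tokens query key/value projections of the first (unnoised) frame and adds the result as a residual, so only the small adapter projections would be trained. This is an illustration under those stated assumptions, not the actual I2V-Adapter implementation.

```python
# Hedged sketch of a cross-frame attention adapter; not the released I2V-Adapter code.
import torch
import torch.nn as nn

class CrossFrameAdapter(nn.Module):
    def __init__(self, dim=320):
        super().__init__()
        self.to_q = nn.Linear(dim, dim)   # trainable adapter projections
        self.to_k = nn.Linear(dim, dim)
        self.to_v = nn.Linear(dim, dim)
        self.out = nn.Linear(dim, dim)

    def forward(self, frame_tokens, first_frame_tokens):
        # frame_tokens: (B, F, N, D) latent tokens for each video frame
        # first_frame_tokens: (B, N, D) tokens from the unnoised input image
        q = self.to_q(frame_tokens)                       # (B, F, N, D)
        k = self.to_k(first_frame_tokens).unsqueeze(1)    # (B, 1, N, D)
        v = self.to_v(first_frame_tokens).unsqueeze(1)
        attn = torch.softmax(q @ k.transpose(-1, -2) / q.shape[-1] ** 0.5, dim=-1)
        out = self.out(attn @ v)                          # propagate image identity
        return frame_tokens + out                         # residual: base model untouched

adapter = CrossFrameAdapter()
frames = torch.randn(1, 16, 64, 320)      # 16 frames, 64 spatial tokens each
first = torch.randn(1, 64, 320)
print(adapter(frames, first).shape)       # torch.Size([1, 16, 64, 320])
```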
Recent Advances in Hypertrophic Cardiomyopathy: A System Review
Hypertrophic cardiomyopathy (HCM) is a common genetic cardiovascular disease, present in 1 in 500 of the general population, and is the most frequent cause of sudden death in young people (including trained athletes) as well as a cause of heart failure and stroke. HCM shows autosomal dominant inheritance and is associated with a large number of mutations in genes encoding proteins of the cardiac sarcomere. Over the last 20 years, the recognition, diagnosis, and treatment of HCM have improved dramatically. Moreover, recent advances in genomic medicine, the growing amount of data from genotype-phenotype correlation studies, and newly identified pathways for HCM are improving our understanding of its diagnosis, mechanisms, and treatment. In this chapter, we aim to outline the symptoms, complications, and diagnosis of HCM; update pathogenic variants (including miRNAs); review current treatment of HCM; and discuss efforts to study HCM using induced pluripotent stem cell–derived cardiomyocytes and gene editing technologies. The authors ultimately hope that this chapter will stimulate further research, drive novel discoveries, and contribute to precision medicine in the diagnosis and therapy of HCM.