Structural determinants of mutability across cancer genomes
Cancer is a group of diseases which are characterised and actuated by somatic mutations. In cancer the distribution of mutations across the genome is inhomogeneous, with genomic and epigenomic features influencing mutational patterns. Previous studies have indicated that chromatin organization and replication time domains are correlated with and thus predictive of this variation. Here the role of alternative DNA structures was investigated across a multitude of whole-genome sequenced cancers.
Sequences that are predisposed to fold into alternative DNA structures can be identified from the primary DNA sequence of the human genome and are collectively known as non-B DNA motifs. More specifically, these include Z-DNA, G-quadruplexes, inverted repeats that can fold into cruciforms and hairpins, direct and short tandem repeats that can mediate the formation of slipped structures, and a subset of mirror repeats that fold into intramolecular triple-stranded DNA, also known as H-DNA.
A systematic investigation of the association between each of those non-B DNA motifs and mutability was performed across thousands of whole genome sequenced tumours from different tissues. Non-B DNA motifs were more mutable than the surrounding regions and were found to be determinants of mutability across cancer types. Additionally, they could be used to predict variation in mutational density genome-wide. Exposed structural components and physical properties of non-B DNA motifs influenced the likelihood of mutagenesis, indicating that secondary structures are possibly causally implicated in mutagenesis. Furthermore, non-B DNA motifs increased the likelihood of recurrent mutations in the genome, which has direct implications for the identification of driver mutations in non-coding regions.
A detailed characterisation of indel mutagenesis was performed across the different cancer types. The analysis indicated the roles of different non-B DNA motif categories as well as sequence homologies in indel mutagenesis. In particular, sequence characteristics of a subset of non-B DNA motifs significantly influenced their relative mutational enrichment at specific indel categories. Finally, a method was developed to quantify replication and transcription strand asymmetries at indels systematically for the first time. As a result, mutational processes that are causally implicated in strand asymmetries at indels were identified and analysed. These included mismatch repair and transcription-coupled nucleotide excision repair, both of which contributed to the observed transcriptional strand asymmetries for indels. Wellcome Trus
Pretrained Transformers for Text Ranking: BERT and Beyond
The goal of text ranking is to generate an ordered list of texts retrieved
from a corpus in response to a query. Although the most common formulation of
text ranking is search, instances of the task can also be found in many natural
language processing applications. This survey provides an overview of text
ranking with neural network architectures known as transformers, of which BERT
is the best-known example. The combination of transformers and self-supervised
pretraining has been responsible for a paradigm shift in natural language
processing (NLP), information retrieval (IR), and beyond. In this survey, we
provide a synthesis of existing work as a single point of entry for
practitioners who wish to gain a better understanding of how to apply
transformers to text ranking problems and researchers who wish to pursue work
in this area. We cover a wide range of modern techniques, grouped into two
high-level categories: transformer models that perform reranking in multi-stage
architectures and dense retrieval techniques that perform ranking directly.
There are two themes that pervade our survey: techniques for handling long
documents, beyond typical sentence-by-sentence processing in NLP, and
techniques for addressing the tradeoff between effectiveness (i.e., result
quality) and efficiency (e.g., query latency, model and index size). Although
transformer architectures and pretraining techniques are recent innovations,
many aspects of how they are applied to text ranking are relatively well
understood and represent mature techniques. However, there remain many open
research questions, and thus in addition to laying out the foundations of
pretrained transformers for text ranking, this survey also attempts to
prognosticate where the field is heading.
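The multi-stage reranking architecture mentioned in this abstract can be illustrated with a minimal sketch. Everything here is a toy assumption made for illustration: the function names `first_stage`, `rerank`, and `toy_scorer` are hypothetical, term overlap stands in for a real first-stage retriever, and `toy_scorer` is only a placeholder for the transformer model that would score query-document pairs in practice.

```python
def first_stage(query, corpus, k):
    """Cheap candidate generation: rank documents by number of shared terms.

    This is a stand-in for a real first-stage retriever, not the survey's method.
    """
    q_terms = set(query.lower().split())
    scored = [(len(q_terms & set(doc.lower().split())), i)
              for i, doc in enumerate(corpus)]
    # Highest overlap first; ties broken by document index.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for score, i in scored[:k] if score > 0]


def toy_scorer(query, doc):
    """Placeholder for a transformer reranker: fraction of document terms in the query."""
    q_terms = set(query.lower().split())
    d_terms = doc.lower().split()
    return sum(term in q_terms for term in d_terms) / len(d_terms)


def rerank(query, corpus, candidates, scorer):
    """Second stage: reorder only the candidates using the more expensive scorer."""
    return sorted(candidates, key=lambda i: scorer(query, corpus[i]), reverse=True)


corpus = [
    "a long survey about transformers and text processing and ranking of many things",
    "text ranking",
    "cooking pasta at home",
]
query = "text ranking transformers"

candidates = first_stage(query, corpus, k=2)         # cheap pass over the whole corpus
ranked = rerank(query, corpus, candidates, toy_scorer)  # costly pass over k candidates only
print(candidates, ranked)
```

The design point is that the expensive scorer only sees the `k` candidates, which is one way the effectiveness versus efficiency tradeoff described above shows up in practice.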
Computational intelligence approaches to robotics, automation, and control [Volume guest editors]
No abstract available
Proceedings, MSVSCC 2016
Proceedings of the 10th Annual Modeling, Simulation & Visualization Student Capstone Conference held on April 14, 2016 at VMASC in Suffolk, Virginia
Intrinsically Disordered Proteins and Chronic Diseases
This book is an embodiment of a series of articles that were published as part of a Special Issue of Biomolecules. It is dedicated to exploring the role of intrinsically disordered proteins (IDPs) in various chronic diseases. The main goal of the articles is to describe recent progress in elucidating the mechanisms by which IDPs cause various human diseases, such as cancer, cardiovascular disease, amyloidosis, neurodegenerative diseases, diabetes, and genetic diseases, to name a few. Contributed by leading investigators in the field, this compendium serves as a valuable resource for researchers, clinicians, postdoctoral fellows, and graduate students.
An in-silico study: Investigating small molecule modulators of bio-molecular interactions
Small molecule inhibitors are commonly used against protein targets that assist in the spread of diseases such as AIDS, cancer and deadly forms of influenza. Despite drug companies spending millions on R&D, the number of drugs that pass clinical trials is limited due to difficulties in engineering optimal non-covalent interactions. As many protein targets have the ability to rapidly evolve resistance, there is an urgent need for methods that rapidly identify effective new compounds.
The thermodynamic driving force behind most biochemical reactions is the Gibbs free energy (ΔG°), which contains opposing dynamic and structural components known as the entropy (ΔS°) and enthalpy (ΔH°), respectively: ΔG° = ΔH° - TΔS°. Traditionally, drug design focussed on complementing the shape of an inhibitor to the binding cavity to optimise ΔG° favourability. However, this approach neglects the entropic contribution, and phenomena such as Entropy-Enthalpy Compensation (EEC) often result in favourable bonding interactions not improving ΔG°, due to entropic unfavourability. Similarly, attempts to optimise inhibitor entropy can also have unpredictable results. Experimental methods such as ITC report on global thermodynamics, but have difficulties identifying the underlying molecular rationale for measured values. However, computational techniques do not suffer from the same limitations.
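As a worked illustration of ΔG° = ΔH° - TΔS° and of entropy-enthalpy compensation, the sketch below compares two ligands. All numbers are hypothetical values chosen for illustration (they are not data from this work), and the function name `gibbs_free_energy` and the temperature of 298.15 K are assumptions.

```python
def gibbs_free_energy(delta_h, delta_s, temperature=298.15):
    """Return ΔG° = ΔH° - TΔS°.

    delta_h: enthalpy change in kJ/mol
    delta_s: entropy change in kJ/(mol·K)
    temperature: absolute temperature in K (298.15 K assumed by default)
    """
    return delta_h - temperature * delta_s


# Hypothetical ligand A: moderate enthalpy gain, small entropic penalty.
dg_a = gibbs_free_energy(delta_h=-40.0, delta_s=-0.020)

# Hypothetical ligand B: 10 kJ/mol more favourable enthalpy (stronger bonding),
# but a larger entropic penalty that offsets most of that gain.
dg_b = gibbs_free_energy(delta_h=-50.0, delta_s=-0.050)

print(f"ligand A: ΔG° = {dg_a:.2f} kJ/mol")
print(f"ligand B: ΔG° = {dg_b:.2f} kJ/mol")
print(f"net improvement from 10 kJ/mol of extra enthalpy: {dg_a - dg_b:.2f} kJ/mol")
```

With these assumed values, an enthalpy improvement of 10 kJ/mol changes ΔG° by only about 1 kJ/mol, which is the kind of compensation the paragraph above describes.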
MUP-I can promiscuously bind panels of hydrophobic ligands that possess incremental structural differences. Thus, small perturbations to the system can be studied through various in silico approaches. This work analyses the trends exhibited across these panels by examining the dynamic component via the calculation of per-unit entropies of protein, ligand and solvent. Two new methods were developed to assess the translational and rotational contributions to TΔS°, and a protocol was created to study ligand internalisation. Synthesising this information with structural data obtained from spatial data on the binding cavity, intermolecular contacts and H-bond analysis allowed detailed molecular rationale for the global thermodynamic signatures to be derived.
GPU PERFORMANCE MODELLING AND OPTIMIZATION
Ph.D (NUS-TU/E Joint Ph.D)
Computational Image Analysis For Axonal Transport, Phenotypic Profiling, And Digital Pathology
Recent advances in fluorescent probes, microscopy, and imaging platforms have revolutionized biology and medicine, generating multi-dimensional image datasets at unprecedented scales. Traditional, low-throughput methods of image analysis are inadequate to handle the increased “volume, velocity, and variety” that characterize the realm of big data. Thus, biomedical imaging requires a new set of tools, which include advanced computer vision and machine learning algorithms. In this work, we develop computational image analysis solutions to biological questions at the level of single molecules, cells, and tissues. At the molecular level, we dissect the regulation of dynein-dynactin transport initiation using in vitro reconstitution, single-particle tracking, super-resolution microscopy, live-cell imaging in neurons, and computational modeling. We show that at least two mechanisms regulate dynein transport initiation in neurons: (1) cytoplasmic linker proteins, which are regulated by phosphorylation, increase the capture radius around the microtubule, thus reducing the time cargo spends in a diffusive search; and (2) a spatial gradient of tyrosinated alpha-tubulin enriched in the distal axon increases the affinity of dynein-dynactin for microtubules. Together, these mechanisms support a multi-modal recruitment model where interacting layers of regulation provide efficient, robust, and spatiotemporal control of transport initiation. At the cellular level, we develop and train deep residual convolutional neural networks on a large and diverse set of cellular microscopy images. Then, we apply networks trained for one task as deep feature extractors for unsupervised phenotypic profiling in a different task. We show that neural networks trained on one dataset encode robust image phenotypes that are sufficient to cluster subcellular structures by type and separate drug compounds by the mechanism of action, without additional training, supporting the strength and flexibility of this approach. Future applications include phenotypic profiling in image-based screens, where clustering genetic or drug treatments by image phenotypes may reveal novel relationships among genetic or pharmacologic pathways. Finally, at the tissue level, we apply deep learning pipelines in digital pathology to segment cardiac tissue and classify clinical heart failure using whole-slide images of cardiac histopathology. Together, these results demonstrate the power and promise of computational image analysis, computer vision, and deep learning in biological image analysis.