12,157 research outputs found
The physicist's guide to one of biotechnology's hottest new topics: CRISPR-Cas
Clustered regularly interspaced short palindromic repeats (CRISPR) and
CRISPR-associated proteins (Cas) constitute a multi-functional, constantly
evolving immune system in bacteria and archaea cells. A heritable, molecular
memory is generated of phage, plasmids, or other mobile genetic elements that
attempt to attack the cell. This memory is used to recognize and interfere with
subsequent invasions from the same genetic elements. This versatile prokaryotic
tool has also been used to advance applications in biotechnology. Here we
review a large body of CRISPR-Cas research to explore themes of evolution and
selection, population dynamics, horizontal gene transfer, specific and
cross-reactive interactions, cost and regulation, non-immunological CRISPR
functions that boost host cell robustness, as well as applicable mechanisms for
efficient and specific genetic engineering. We offer future directions that can
be addressed by the physics community. Physical understanding of the CRISPR-Cas
system will advance uses in biotechnology, such as developing cell lines and
animal models, cell labeling and information storage, combatting antibiotic
resistance, and human therapeutics. Comment: 75 pages, 15 figures, Physical Biology (2018)
Functional Diversity and Structural Disorder in the Human Ubiquitination Pathway
The ubiquitin-proteasome system plays a central role in cellular regulation and protein quality control (PQC). The system is built as a pyramid of increasing complexity, with two E1 (ubiquitin activating), a few dozen E2 (ubiquitin conjugating) and several hundred E3 (ubiquitin ligase) enzymes. By collecting and analyzing E3 sequences from the KEGG BRITE database and literature, we assembled a coherent dataset of 563 human E3s and analyzed their various physical features. We found an increase in structural disorder of the system with multiple disorder predictors (IUPred - E1: 5.97%, E2: 17.74%, E3: 20.03%). E3s that can bind E2 and substrate simultaneously (single subunit E3, ssE3) have significantly higher disorder (22.98%) than E3s in which E2 binding (multi RING-finger, mRF, 0.62%), scaffolding (6.01%) and substrate binding (adaptor/substrate recognition subunits, 17.33%) functions are separated. In ssE3s, the disorder was localized in the substrate/adaptor binding domains, whereas the E2-binding RING/HECT-domains were structured. To demonstrate the involvement of disorder in E3 function, we applied normal modes and molecular dynamics analyses to show how a disordered and highly flexible linker in human CBL (an E3 that acts as a regulator of several tyrosine kinase-mediated signalling pathways) facilitates long-range conformational changes bringing substrate and E2-binding domains towards each other and thus assisting in ubiquitin transfer. E3s with multiple interaction partners (as evidenced by data in STRING) also possess elevated levels of disorder (hubs, 22.90% vs. non-hubs, 18.36%). Furthermore, a search in PDB uncovered 21 distinct human E3 interactions, in 7 of which the disordered region of E3s undergoes induced folding (or mutual induced folding) in the presence of the partner.
In conclusion, our data highlight the primary role of structural disorder in the functions of E3 ligases that manifests itself in the substrate/adaptor binding functions as well as the mechanism of ubiquitin transfer by long-range conformational transitions. © 2013 Bhowmick et al.
Evaluation of the current knowledge limitations in breast cancer research: a gap analysis
BACKGROUND
A gap analysis was conducted to determine which areas of breast cancer research, if targeted by researchers and funding bodies, could produce the greatest impact on patients.
METHODS
Fifty-six Breast Cancer Campaign grant holders and prominent UK breast cancer researchers participated in a gap analysis of current breast cancer research. Before, during and following the meeting, groups in seven key research areas participated in cycles of presentation, literature review and discussion. Summary papers were prepared by each group and collated into this position paper highlighting the research gaps, with recommendations for action.
RESULTS
Gaps were identified in all seven themes. General barriers to progress were lack of financial and practical resources, and poor collaboration between disciplines. Critical gaps in each theme included: (1) genetics (knowledge of genetic changes, their effects and interactions); (2) initiation of breast cancer (how developmental signalling pathways cause ductal elongation and branching at the cellular level and influence stem cell dynamics, and how their disruption initiates tumour formation); (3) progression of breast cancer (deciphering the intracellular and extracellular regulators of early progression, tumour growth, angiogenesis and metastasis); (4) therapies and targets (understanding who develops advanced disease); (5) disease markers (incorporating intelligent trial design into all studies to ensure new treatments are tested in patient groups stratified using biomarkers); (6) prevention (strategies to prevent oestrogen-receptor negative tumours and the long-term effects of chemoprevention for oestrogen-receptor positive tumours); (7) psychosocial aspects of cancer (the use of appropriate psychosocial interventions, and the personal impact of all stages of the disease among patients from a range of ethnic and demographic backgrounds).
CONCLUSION
Through recommendations to address these gaps with future research, the long-term benefits to patients will include: better estimation of risk in families with breast cancer and strategies to reduce risk; better prediction of drug response and patient prognosis; improved tailoring of treatments to patient subgroups and development of new therapeutic approaches; earlier initiation of treatment; more effective use of resources for screening populations; and an enhanced experience for people with or at risk of breast cancer and their families. The challenge to funding bodies and researchers in all disciplines is to focus on these gaps and to drive advances in knowledge into improvements in patient care.
TargetMine, an Integrated Data Warehouse for Candidate Gene Prioritisation and Target Discovery
Prioritising candidate genes for further experimental characterisation is a
non-trivial challenge in drug discovery and biomedical research in general. An
integrated approach that combines results from multiple data types is best
suited for optimal target selection. We developed TargetMine, a data warehouse
for efficient target prioritisation. TargetMine utilises the InterMine
framework, with new data models such as protein-DNA interactions integrated in a
novel way. It enables complicated searches that are difficult to perform with
existing tools and it also offers integration of custom annotations and in-house
experimental data. We proposed an objective protocol for target prioritisation
using TargetMine and set up a benchmarking procedure to evaluate its
performance. The results show that the protocol can identify known
disease-associated genes with high precision and coverage. A demonstration
version of TargetMine is available at http://targetmine.nibio.go.jp/
Type III Secretion Effectors with Arginine N-Glycosyltransferase Activity
Type III secretion systems are used by many Gram-negative bacterial pathogens to inject proteins, known as effectors, into the cytosol of host cells. These virulence factors interfere with a diverse array of host signal transduction pathways and cellular processes. Many effectors have catalytic activities to promote post-translational modifications of host proteins. This review focuses on a family of effectors with glycosyltransferase activity that catalyze addition of N-acetyl-D-glucosamine to specific arginine residues in target proteins, leading to reduced NF-κB pathway activation and impaired host cell death. This family includes NleB from Citrobacter rodentium, NleB1 and NleB2 from enteropathogenic and enterohemorrhagic Escherichia coli, and SseK1, SseK2, and SseK3 from Salmonella enterica. First, we place these effectors in the general framework of the glycosyltransferase superfamily and in the particular context of the role of glycosylation in bacterial pathogenesis. Then, we provide detailed information about currently known members of this family, their role in virulence, and their targets.
Funding: Spanish Ministerio de Economía, Industria y Competitividad, Agencia Estatal de Investigación, and the European Regional Development Fund, grant number SAF2016‐75365‐R; European Union’s Horizon 2020 e Marie Skłodowska‐Curie grant agreement No 84262
Computational analysis and prediction of protein-RNA interactions
Protein-RNA interactions are essential for many important processes including all phases of protein production, regulation of gene expression, and replication and assembly of many viruses. This dissertation has two related goals: 1) predicting RNA-binding sites in proteins from protein sequence, structure, and conservation information, and 2) characterizing protein-RNA interactions.
We present several machine learning classifiers for predicting RNA-binding sites in proteins based on the protein sequence, protein structure, and conservation information. Our first classifier uses only amino acid sequence information as input and predicts RNA-binding sites with an area under the receiver operator characteristic curve (AUC) of 0.74. Using the neighboring amino acids in the protein structure improves prediction performance over using sequence alone. We show that using evolutionary information in the form of position specific scoring matrices provides a further significant improvement in predictions. Finally, we create an ensemble classifier that combines the predictions of the sequence, structure, and PSSM based classifiers and gives the best prediction performance, with an AUC of 0.81.
We construct the Protein-RNA Interaction Database, PRIDB, a comprehensive collection of all protein-RNA complexes in the PDB. PRIDB focuses on characterizing the molecular interaction at the protein-RNA interface in terms of van der Waals contacts, direct hydrogen bonds, and water-mediated hydrogen bonds. We perform an extensive analysis of the RNA-binding characteristics of a non-redundant dataset of 181 proteins to determine general characteristics of protein-RNA binding sites. We find that the overall interaction propensities for Watson-Crick paired nucleotides and non Watson-Crick paired nucleotides are very similar, with the propensities for amino acids binding to single stranded nucleotides showing more differences. We find that van der Waals contacts are more numerous than hydrogen bonds and amino acids interact with RNA through their side chain atoms more frequently than their main chain atoms. We also find that contacts to the RNA base are not as frequent as contacts to the RNA backbone.
Together, the prediction and characterization presented in this dissertation have increased our understanding of how proteins and RNA interact.
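The AUC figures quoted in this abstract (0.74 for sequence alone, 0.81 for the ensemble) can be read as the probability that a randomly chosen binding residue receives a higher score than a randomly chosen non-binding residue. The sketch below computes AUC that way and combines several classifiers by averaging their scores. The averaging rule, the function names, and the toy scores are assumptions made for illustration; the abstract does not say how the ensemble combines its members, and none of the numbers below come from the dissertation.

```python
from typing import List, Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_ensemble(*score_lists: Sequence[float]) -> List[float]:
    """Hypothetical ensemble rule: mean of the per-residue scores."""
    return [sum(column) / len(column) for column in zip(*score_lists)]


if __name__ == "__main__":
    # Toy data only: 1 = RNA-binding residue, 0 = non-binding residue.
    labels = [1, 1, 0, 0, 1, 0]
    sequence_scores = [0.9, 0.4, 0.5, 0.2, 0.6, 0.7]
    structure_scores = [0.8, 0.6, 0.3, 0.4, 0.5, 0.2]
    pssm_scores = [0.7, 0.7, 0.4, 0.1, 0.8, 0.3]
    combined = average_ensemble(sequence_scores, structure_scores, pssm_scores)
    print("sequence only:", roc_auc(labels, sequence_scores))
    print("ensemble:", roc_auc(labels, combined))
```

The pairwise loop is quadratic in the number of residues, which is fine for a demonstration; a rank-based formulation would be the usual choice for large datasets.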
Ab initio RNA folding
RNA molecules are essential cellular machines performing a wide variety of
functions for which a specific three-dimensional structure is required. Over
the last several years, experimental determination of RNA structures through
X-ray crystallography and NMR seems to have reached a plateau in the number of
structures resolved each year, but as more and more RNA sequences are being
discovered, the need for structure prediction tools to complement experimental data
is strong. Theoretical approaches to RNA folding have been developed since the
late nineties when the first algorithms for secondary structure prediction
appeared. Over the last 10 years a number of prediction methods for 3D
structures have been developed, first based on bioinformatics and data-mining,
and more recently based on a coarse-grained physical representation of the
systems. In this review we are going to present the challenges of RNA structure
prediction and the main ideas behind bioinformatic approaches and physics-based
approaches. We will focus on the description of the more recent physics-based
phenomenological models and on how they are built to include the specificity of
the interactions of RNA bases, whose role is critical in folding. Through
examples from different models, we will point out the strengths of
physics-based approaches, which are able not only to predict equilibrium
structures, but also to investigate dynamical and thermodynamical behavior, and
the open challenges to include more key interactions ruling RNA folding. Comment: 28 pages, 18 figures
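For readers unfamiliar with secondary structure prediction, mentioned above as the starting point of theoretical RNA folding, a minimal sketch of base-pair maximization by dynamic programming (in the style of the Nussinov algorithm) follows. It is not one of the physics-based models the review describes: it ignores energies and 3D geometry entirely. The function name, the minimum hairpin loop length of 3, and the inclusion of G-U wobble pairs are assumptions chosen for illustration.

```python
def nussinov_max_pairs(seq: str, min_loop: int = 3) -> int:
    """Maximum number of nested base pairs in seq.

    min_loop is the minimum number of unpaired bases enclosed by a pair.
    """
    allowed = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"),
               ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] = best pair count for the subsequence seq[i..j], inclusive.
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: base j stays unpaired
            # case 2: base j pairs with some k, leaving >= min_loop bases between
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in allowed:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]


if __name__ == "__main__":
    print(nussinov_max_pairs("GGGAAAUCC"))
```

The table is filled in order of increasing subsequence length, so every entry a cell depends on is already computed; the running time is cubic in the sequence length.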
The future of laboratory medicine - A 2014 perspective.
Predicting the future is a difficult task. Not surprisingly, there are many examples and assumptions that have proved to be wrong. This review surveys the many predictions, beginning in 1887, about the future of laboratory medicine and its sub-specialties such as clinical chemistry and molecular pathology. It provides a commentary on the accuracy of the predictions and offers opinions on emerging technologies, economic factors and social developments that may play a role in shaping the future of laboratory medicine.
One-Class Classification: Taxonomy of Study and Review of Techniques
One-class classification (OCC) algorithms aim to build classification models
when the negative class is either absent, poorly sampled or not well defined.
This unique situation constrains the learning of efficient classifiers by
defining the class boundary with knowledge of the positive class alone. The OCC
problem has been considered and applied under many research themes, such as
outlier/novelty detection and concept learning. In this paper we present a
unified view of the general problem of OCC by presenting a taxonomy of study
for OCC problems, which is based on the availability of training data,
algorithms used and the application domains applied. We further delve into each
of the categories of the proposed taxonomy and present a comprehensive
literature review of the OCC algorithms, techniques and methodologies with a
focus on their significance, limitations and applications. We conclude our
paper by discussing some open research problems in the field of OCC and present
our vision for future research. Comment: 24 pages + 11 pages of references, 8 figures
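To make the one-class setting concrete, here is a minimal sketch of a classifier trained on positive examples only: it stores the centroid of the training points and accepts a new point if it lies within a distance threshold learned from the training set. This is an illustrative baseline, not a method taken from the paper; the class name, the quantile parameter, and the toy data are all assumptions.

```python
import math
from typing import List, Sequence


class CentroidOneClass:
    """Accepts points that lie close to the centroid of the positive class."""

    def __init__(self, quantile: float = 0.95) -> None:
        # Fraction of training points the boundary should enclose (assumed default).
        self.quantile = quantile
        self.center: List[float] = []
        self.radius = 0.0

    def fit(self, positives: Sequence[Sequence[float]]) -> "CentroidOneClass":
        if not positives:
            raise ValueError("need at least one positive example")
        dim = len(positives[0])
        count = len(positives)
        self.center = [sum(p[d] for p in positives) / count for d in range(dim)]
        distances = sorted(math.dist(p, self.center) for p in positives)
        index = min(count - 1, int(self.quantile * count))
        self.radius = distances[index]
        return self

    def predict(self, point: Sequence[float]) -> bool:
        """True if the point falls inside the learned boundary."""
        return math.dist(point, self.center) <= self.radius


if __name__ == "__main__":
    # Toy positive-only training set; no negative examples are ever seen.
    model = CentroidOneClass().fit([(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)])
    print(model.predict((0.5, 0.5)), model.predict((5.0, 5.0)))
```

Because no negative examples are available, the boundary is set purely from the spread of the positive class, which is the constraint the abstract describes.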
Corpus annotation for mining biomedical events from literature
BACKGROUND
Advanced Text Mining (TM) such as semantic enrichment of papers, event or relation extraction, and intelligent Question Answering have increasingly attracted attention in the bio-medical domain. For such attempts to succeed, text annotation from the biological point of view is indispensable. However, due to the complexity of the task, semantic annotation has never been tried on a large scale, apart from relatively simple term annotation.
RESULTS
We have completed a new type of semantic annotation, event annotation, which is an addition to the existing annotations in the GENIA corpus. The corpus has already been annotated with POS (Parts of Speech), syntactic trees, terms, etc. The new annotation was made on half of the GENIA corpus, consisting of 1,000 Medline abstracts. It contains 9,372 sentences in which 36,114 events are identified. The major challenges during event annotation were (1) to design a scheme of annotation which meets specific requirements of text annotation, (2) to achieve biology-oriented annotation which reflects biologists' interpretation of text, and (3) to ensure the homogeneity of annotation quality across annotators. To meet these challenges, we introduced new concepts such as Single-facet Annotation and Semantic Typing, which have collectively contributed to successful completion of a large scale annotation.
CONCLUSION
The resulting event-annotated corpus is the largest and one of the best in quality among similar annotation efforts. We expect it to become a valuable resource for NLP (Natural Language Processing)-based TM in the bio-medical domain.