180 research outputs found
IKZF1 Deletions with COBL Breakpoints Are Not Driven by RAG-Mediated Recombination Events in Acute Lymphoblastic Leukemia
IKZF1 deletion (ΔIKZF1) is an important predictor of relapse in both childhood and adult B-cell precursor acute lymphoblastic leukemia (B-ALL). Previously, we revealed that COBL is a hotspot for breakpoints in leukemia and could promote IKZF1 deletions. Through an international collaboration, we provide a detailed genetic and clinical picture of B-ALL with COBL rearrangements (COBL-r). Patients with B-ALL and IKZF1 deletion (n = 133) were included. IKZF1 ∆1-8 were associated with large alterations within chromosome 7: monosomy 7 (18%), isochromosome 7q (10%), 7p loss (19%), and interstitial deletions (53%). The latter included COBL-r, which were found in 12% of the IKZF1 ∆1-8 cohort. Patients with COBL-r are mostly classified as intermediate cytogenetic risk and frequently harbor ETV6, PAX5, CDKN2A/B deletions. Overall, 56% of breakpoints were located within COBL intron 5. Cryptic recombination signal sequence motifs were broadly distributed within the sequence of COBL, and no enrichment for the breakpoint cluster region was found. In summary, a diverse spectrum of alterations characterizes ΔIKZF1 and they also include deletion breakpoints within COBL. We confirmed that COBL is a hotspot associated with ΔIKZF1, but these rearrangements are not driven by RAG-mediated recombination
Composite structural motifs of binding sites for delineating biological functions of proteins
Most biological processes are described as a series of interactions between
proteins and other molecules, and interactions are in turn described in terms
of atomic structures. To annotate protein functions as sets of interaction
states at atomic resolution, and thereby to better understand the relation
between protein interactions and biological functions, we conducted exhaustive
all-against-all atomic structure comparisons of all known binding sites for
ligands including small molecules, proteins and nucleic acids, and identified
recurring elementary motifs. By integrating the elementary motifs associated
with each subunit, we defined composite motifs which represent
context-dependent combinations of elementary motifs. It is demonstrated that
function similarity can be better inferred from composite motif similarity
compared to the similarity of protein sequences or of individual binding sites.
By integrating the composite motifs associated with each protein function, we
define meta-composite motifs each of which is regarded as a time-independent
diagrammatic representation of a biological process. It is shown that
meta-composite motifs provide richer annotations of biological processes than
sequence clusters. The present results serve as a basis for bridging atomic
structures to higher-order biological phenomena by classification and
integration of binding site structures.
Comment: 34 pages, 7 figures
The role of viral genomics in understanding COVID-19 outbreaks in long-term care facilities
We reviewed all genomic epidemiology studies on COVID-19 in long-term care facilities (LTCFs) that had been published to date. We found that staff and residents were usually infected with identical, or near identical, SARS-CoV-2 genomes. Outbreaks usually involved one predominant cluster, and the same lineages persisted in LTCFs despite infection control measures. Outbreaks were most commonly due to single or few introductions followed by a spread rather than a series of seeding events from the community into LTCFs. The sequencing of samples taken consecutively from the same individuals at the same facilities showed the persistence of the same genome sequence, indicating that the sequencing technique was robust over time. When combined with local epidemiology, genomics allowed probable transmission sources to be better characterised. The transmission between LTCFs was detected in multiple studies. The mortality rate among residents was high in all facilities, regardless of the lineage. Bioinformatics methods were inadequate in a third of the studies reviewed, and reproducing the analyses was difficult because sequencing data were not available in many facilities
FLORA: a novel method to predict protein function from structure in diverse superfamilies
Predicting protein function from structure remains an active area of interest, particularly for the structural genomics initiatives where a substantial number of structures are initially solved with little or no functional characterisation. Although global structure comparison methods can be used to transfer functional annotations, the relationship between fold and function is complex, particularly in functionally diverse superfamilies that have evolved through different secondary structure embellishments to a common structural core. The majority of prediction algorithms employ local templates built on known or predicted functional residues. Here, we present a novel method (FLORA) that automatically generates structural motifs associated with different functional sub-families (FSGs) within functionally diverse domain superfamilies. Templates are created purely on the basis of their specificity for a given FSG, and the method makes no prior prediction of functional sites, nor assumes specific physico-chemical properties of residues. FLORA is able to accurately discriminate between homologous domains with different functions and substantially outperforms (a 2–3 fold increase in coverage at low error rates) popular structure comparison methods and a leading function prediction method. We benchmark FLORA on a large data set of enzyme superfamilies from all three major protein classes (α, β, αβ) and demonstrate the functional relevance of the motifs it identifies. We also provide novel predictions of enzymatic activity for a large number of structures solved by the Protein Structure Initiative. Overall, we show that FLORA is able to effectively detect functionally similar protein domain structures by purely using patterns of structural conservation of all residues
Spatial growth rate of emerging SARS-CoV-2 lineages in England, September 2020–December 2021
This paper uses a robust method of spatial epidemiological analysis to assess the spatial growth rate of multiple lineages of SARS-CoV-2 in the local authority areas of England, September 2020-December 2021. Using the genomic surveillance records of the COVID-19 Genomics UK (COG-UK) Consortium, the analysis identifies a substantial (7.6-fold) difference in the average rate of spatial growth of 37 sample lineages, from the slowest (Delta AY.4.3) to the fastest (Omicron BA.1). Spatial growth of the Omicron (B.1.1.529 and BA) variant was found to be 2.81× faster than the Delta (B.1.617.2 and AY) variant and 3.76× faster than the Alpha (B.1.1.7 and Q) variant. In addition to AY.4.2 (a designated variant under investigation, VUI-21OCT-01), three Delta sublineages (AY.43, AY.98 and AY.120) were found to display a statistically faster rate of spatial growth than the parent lineage and would seem to merit further investigation. We suggest that the monitoring of spatial growth rates is a potentially valuable adjunct to outbreak response procedures for emerging SARS-CoV-2 variants in a defined population
DODO: an efficient orthologous genes assignment tool based on domain architectures. Domain based ortholog detection
Background: Orthologs are genes derived from the same ancestral gene locus after speciation events. Orthologous proteins usually have similar sequences and perform comparable biological functions. Ortholog identification is therefore useful for annotating newly sequenced genomes. With the rapidly increasing number of sequenced genomes, constructing or updating ortholog relationships between all genomes requires considerable effort and computation time. In addition, elucidating ortholog relationships between distantly related genomes is challenging because of their lower sequence similarity. An efficient ortholog detection method that can handle a large number of distantly related genomes is therefore desirable. Results: In this study, an efficient ortholog detection pipeline, DODO (DOmain based Detection of Orthologs), is built on the basis of domain architectures. Because domain composition is usually directly related to protein function, DODO can facilitate ortholog detection across distantly related genomes. DODO works in two main steps. Starting from domain information, it first assigns proteins to groups according to their domain architectures and then identifies orthologs within those groups at much reduced complexity. DODO is shown to detect orthologs between two genomes in considerably less time than the traditional reciprocal best hits method, and the advantage is more pronounced when a large number of genomes are analyzed. The output of DODO is highly comparable with other known ortholog databases. Conclusions: DODO provides a new, efficient pipeline for detecting orthologs in a large number of genomes. In addition, a database established with DODO is easier to maintain and can be updated with relatively little effort. The DODO pipeline can be downloaded from http://140.109.42.19:16080/dodo_web/home.htm
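The two-step strategy described in the DODO abstract above (group proteins by domain architecture, then search for orthologs only within matching groups) can be illustrated with a minimal sketch. This is not the DODO implementation: the abstract does not say how orthologs are chosen inside a group, so the sketch assumes reciprocal best hits restricted to each group. The function names, the toy genomes, and the `score` similarity function are all hypothetical.

```python
from collections import defaultdict


def group_by_architecture(proteins):
    """Step 1: bucket protein IDs by their ordered domain architecture."""
    groups = defaultdict(list)
    for protein_id, domains in proteins.items():
        groups[tuple(domains)].append(protein_id)
    return groups


def reciprocal_best_hits(ids_a, ids_b, score):
    """Assumed step 2: keep pairs that are each other's best-scoring hit."""
    best_for_a = {a: max(ids_b, key=lambda b: score(a, b)) for a in ids_a}
    best_for_b = {b: max(ids_a, key=lambda a: score(a, b)) for b in ids_b}
    return [(a, b) for a, b in best_for_a.items() if best_for_b[b] == a]


def find_orthologs(genome_a, genome_b, score):
    """Compare only proteins that share the same domain architecture."""
    groups_a = group_by_architecture(genome_a)
    groups_b = group_by_architecture(genome_b)
    pairs = []
    for architecture in groups_a.keys() & groups_b.keys():
        pairs.extend(
            reciprocal_best_hits(groups_a[architecture], groups_b[architecture], score)
        )
    return sorted(pairs)


# Toy input (hypothetical): protein ID -> ordered tuple of domain names.
genome_a = {"a1": ("Kinase", "SH2"), "a2": ("Kinase", "SH2"), "a3": ("PDZ",)}
genome_b = {
    "b1": ("Kinase", "SH2"),
    "b2": ("Kinase", "SH2"),
    "b3": ("PDZ",),
    "b4": ("WD40",),
}

# Hypothetical pairwise similarity scores; unlisted pairs score 0.
similarity = {
    ("a1", "b1"): 90, ("a1", "b2"): 40,
    ("a2", "b1"): 50, ("a2", "b2"): 80,
    ("a3", "b3"): 70,
}


def score(a, b):
    return similarity.get((a, b), 0)


pairs = find_orthologs(genome_a, genome_b, score)
print(pairs)
```

Restricting the comparison to proteins with identical architectures is what reduces the number of pairwise comparisons; in this toy case `b4` is never scored against anything because no protein in `genome_a` shares its architecture.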
Emergence and maintenance of actionable genetic drivers at medulloblastoma relapse
BACKGROUND: 90% of tumors) and established genetic drivers (e.g. SHH/WNT/P53 mutations; 60% of rMB events) were maintained from diagnosis. Critically, acquired and maintained rMB events converged on targetable pathways which were significantly enriched at relapse (e.g. DNA damage-signaling) and specific events (e.g. 3p loss) predicted survival post-relapse. CONCLUSIONS: rMB is defined by the emergence of novel events and pathways, in concert with selective maintenance of established genetic drivers. Together, these define the actionable genetic landscape of rMB and provide a basis for improved clinical management and development of stratified therapeutics, across disease-course
CLIMB-COVID: continuous integration supporting decentralised sequencing for SARS-CoV-2 genomic surveillance.
Funder: Wellcome Trust
In response to the ongoing SARS-CoV-2 pandemic in the UK, the COVID-19 Genomics UK (COG-UK) consortium was formed to rapidly sequence SARS-CoV-2 genomes as part of a national-scale genomic surveillance strategy. The network consists of universities, academic institutes, regional sequencing centres and the four UK Public Health Agencies. We describe the development and deployment of CLIMB-COVID, an encompassing digital infrastructure to address the challenge of collecting and integrating both genomic sequencing data and sample-associated metadata produced across the COG-UK network.