
    Comprehensive genome analysis of 203 genomes provides structural genomics with new insights into protein family space

    We present an analysis of 203 completed genomes in the Gene3D resource (including 17 eukaryotes), which demonstrates that the number of protein families is continually expanding over time and that singleton-sequences appear to be an intrinsic part of the genomes. A significant proportion of the proteomes can be assigned to fewer than 6000 well-characterized domain families with the remaining domain-like regions belonging to a much larger number of small uncharacterized families that are largely species specific. Our comprehensive domain annotation of 203 genomes enables us to provide more accurate estimates of the number of multi-domain proteins found in the three kingdoms of life than previous calculations. We find that 67% of eukaryotic sequences are multi-domain compared with 56% of sequences in prokaryotes. By measuring the domain coverage of genome sequences, we show that the structural genomics initiatives should aim to provide structures for less than a thousand structurally uncharacterized Pfam families to achieve reasonable structural annotation of the genomes. However, in large families, additional structures should be determined as these would reveal more about the evolution of the family and enable a greater understanding of how function evolves

    Gene3D: modelling protein structure, function and evolution

    The Gene3D release 4 database and web portal provide a combined structural, functional and evolutionary view of the protein world. It is focussed on providing structural annotation for protein sequences without structural representatives, including the complete proteome sets of over 240 different species. The protein sequences have also been clustered into whole-chain families so as to aid functional prediction. The structural annotation is generated using HMM models based on the CATH domain families; CATH is a repository for manually deduced protein domains. Amongst the changes from the last publication are: the addition of over 100 genomes and the UniProt sequence database, domain data from Pfam, metabolic pathway and functional data from COGs, KEGG and GO, and protein–protein interaction data from MINT and BIND. The website has been rebuilt to allow more sophisticated querying and the data returned is presented in a clearer format with greater functionality. Furthermore, all data can be downloaded in a simple XML format, allowing users to carry out complex investigations at their own computers

    The CATH Domain Structure Database and related resources Gene3D and DHS provide comprehensive domain family information for genome analysis

    The CATH database of protein domain structures (http://www.biochem.ucl.ac.uk/bsm/cath/) currently contains 43 229 domains classified into 1467 superfamilies and 5107 sequence families. Each structural family is expanded with sequence relatives from GenBank and completed genomes, using a variety of efficient sequence search protocols and reliable thresholds. This extended CATH protein family database contains 616 470 domain sequences classified into 23 876 sequence families. This results in the significant expansion of the CATH HMM model library to include models built from the CATH sequence relatives, giving a 10% increase in coverage for detecting remote homologues. An improved Dictionary of Homologous superfamilies (DHS) (http://www.biochem.ucl.ac.uk/bsm/dhs/) containing specific sequence, structural and functional information for each superfamily in CATH considerably assists manual validation of homologues. Information on sequence relatives in CATH superfamilies, GenBank and completed genomes is presented in the CATH associated DHS and Gene3D resources. Domain partnership information can be obtained from Gene3D (http://www.biochem.ucl.ac.uk/bsm/cath/Gene3D/). A new CATH server has been implemented (http://www.biochem.ucl.ac.uk/cgi-bin/cath/CathServer.pl) providing automatic classification of newly determined sequences and structures using a suite of rapid sequence and structure comparison methods. The statistical significance of matches is assessed and links are provided to the putative superfamily or fold group to which the query sequence or structure is assigned

    Understanding the Concentration Dependence of Viral Capsid Assembly Kinetics - the Origin of the Lag Time and Identifying the Critical Nucleus Size

    The kinetics for the assembly of viral proteins into a population of capsids can be measured in vitro with size exclusion chromatography or dynamic light scattering, but extracting mechanistic information from these studies is challenging. For example, it is not straightforward to determine the critical nucleus size or the elongation time (the time required for a nucleated partial capsid to grow to completion). We show that, for two theoretical models of capsid assembly, the critical nucleus size can be determined from the concentration dependence of the assembly reaction half-life and the elongation time is revealed by the length of the lag phase. Furthermore, we find that the system becomes kinetically trapped when nucleation becomes fast compared to elongation. Implications of this constraint for determining elongation mechanisms from experimental assembly data are discussed. Comment: Submitted to Biophysical Journal

    Dynamic Pathways for Viral Capsid Assembly

    We develop a class of models with which we simulate the assembly of particles into T1 capsid-like objects using Newtonian dynamics. By simulating assembly for many different values of system parameters, we vary the forces that drive assembly. For some ranges of parameters, assembly is facile, while for others, assembly is dynamically frustrated by kinetic traps corresponding to malformed or incompletely formed capsids. Our simulations sample many independent trajectories at various capsomer concentrations, allowing for statistically meaningful conclusions. Depending on subunit (i.e., capsomer) geometries, successful assembly proceeds by several mechanisms involving binding of intermediates of various sizes. We discuss the relationship between these mechanisms and experimental evaluations of capsid assembly processes. Comment: 13 pages, 13 figures. Submitted to Biophys.

    The search campaign to identify and Image the Philae Lander on the surface of comet 67P/Churyumov-Gerasimenko

    On the 12th of November 2014, the Rosetta Philae Lander descended to make the first soft touchdown on the surface of a comet, comet 67P/Churyumov-Gerasimenko. That soft touchdown did occur, but because its two harpoons failed to fire, Philae bounced and travelled across the comet, making contact with the surface twice more before finally landing in a shaded rocky location somewhere on the southern hemisphere of the comet. The search campaign, led by ESA, involved multiple teams across Europe, with a wide range of techniques used in support of it. The campaign continued through 2015, when a prime candidate on the surface was identified, and on into 2016, ending on the 2nd of September 2016 when a definitive and conclusive image was taken of the lander on the surface of the comet, confirming the prime candidate to indeed be Philae

    Planetary Defense Ground Zero: MASCOT's View on the Rocks - an Update between First Images and Sample Return

    At 01:57:20 UTC on October 3rd, 2018, after 3½ years of cruise aboard the JAXA spacecraft HAYABUSA2 and about 3 months in the vicinity of its target, the MASCOT lander was separated successfully from an altitude of 41 m. After a free-fall of only ~5m51s MASCOT made first contact with C-type near-Earth and potentially hazardous asteroid (162173) Ryugu, by hitting a big boulder. MASCOT then bounced for ~11m3s, in the process already gathering valuable information on mechanical properties of the surface before it came to rest. It was able to perform science measurements at 3 different locations on the surface of Ryugu and took many images of its spectacular pitch-black landscape. MASCOT’s payload suite was designed to investigate the fine-scale structure, multispectral reflectance, thermal characteristics and magnetic properties of the surface. Somewhat unexpectedly, MASCOT encountered very rugged terrain littered with large surface boulders. Observing in-situ, it confirmed the absence of fine particles and dust as already implied by the remote sensing instruments aboard the HAYABUSA2 spacecraft. After some 17h of operations, MASCOT’s mission ended with the last communication contact as it followed Ryugu’s rotation beyond the horizon as seen from HAYABUSA2. Soon after, its primary battery was depleted. We present a broad overview of the recent scientific results of the MASCOT mission from separation through descent, landing and in-situ investigations on Ryugu until the end of its operation and relate them to the needs of planetary defense interactions with asteroids. We also recall the agile, responsive and sometimes serendipitous creation of MASCOT, the two-year rush of building and delivering it to JAXA’s HAYABUSA2 spacecraft in time for launch, and the four years of in-flight operations and on-ground testing to make the most of the brief on-surface mission

    The Application of a Thermal Mathematical Model during Lander Operations on the surfaces of Small bodies

    Landing on small bodies like asteroids and comets is an extremely challenging operation. The unknown thermal and physical properties of the surface after a multi-annual journey in deep space require a high degree of flexibility for operations planning and execution during the landing and subsequent on-surface mission phases. The presentation describes the planning and analysis loop used to establish and execute the operations for two of these landers, the comet lander Philae on 67P/Churyumov–Gerasimenko and the asteroid lander Mascot on Ryugu, by the DLR Lander Control Centre in Cologne in collaboration with the other mission partners. This planning and analysis loop, focusing on power and thermal aspects, is mainly built around the Thermal Mathematical Model (TMM) of the respective lander interacting with an Operations Planning Tool, which, for example, translates the planned operation into dissipation profiles for all relevant nodes of the TMM. Moreover, as one of the reasons for thermal analysis during operation is the life extension of the batteries, the TMM is prepared with a detailed electrical and thermal model for the batteries and their management and, in the case of Philae, it also interacts with a Solar Array Illumination and Power Prediction tool. In addition, to plan long-term surface operations, the TMM needs to include a dedicated and flexibly adaptable comet or asteroid surface model because, as seen especially during the post-landing operations of Philae, the impact of the real environmental conditions found after landing is considerable. On the other hand, the TMM needs to support its alternating application between moving and resting on-surface configurations as required by Mascot, the Mobile Asteroid Surface Scout. The presentation shows details of the necessary TMM preparation in order to optimize its application during real-time operation

    BioMap: Gene Family based Integration of Heterogeneous Biological Databases Using AutoMed Metadata

    This paper presents an extensible architecture that can be used to support the integration of biological data sets. Biological research frequently requires this kind of synthesis. However, the data models on which biological data sets have been constructed are heterogeneous and difficult to use together. Our architecture uses the AutoMed data integration toolkit to store the schemas of data sources, together with the transformation from these schemas into a global integrated schema. The transformation encompasses two parts: the incremental construction of a global schema which unifies the various data source schemas, and the identification of semantically identical labels for entities. Entities in the unified resource are integrated using PFScape. This categorises the entities into clusters based on sequence similarity, allowing the use of family information in the annotation of expression data and experimental target selection.