117 research outputs found

    Using neural networks and evolutionary information in decoy discrimination for protein tertiary structure prediction

    Background: We present a novel method of protein fold decoy discrimination using machine learning, more specifically using neural networks. Here, decoy discrimination is represented as a machine learning problem, where neural networks are used to learn the native-like features of protein structures using a set of positive and negative training examples. A set of native protein structures provides the positive training examples, while negative training examples are simulated decoy structures obtained by reversing the sequences of native structures. Various features are extracted from the training dataset of positive and negative examples and used as inputs to the neural networks. Results: The best performing neural network is the one whose input comprises PSI-BLAST [1] profiles of residue pairs, pairwise distances and the relative solvent accessibilities of the residues. This neural network is the best among all methods tested in discriminating the native structure from a set of decoys for all decoy datasets tested. Conclusion: This method is demonstrated to be viable, and furthermore evolutionary information is successfully used in the neural networks to improve decoy discrimination.

    Properties of Graphene: A Theoretical Perspective

    In this review, we provide an in-depth description of the physics of monolayer and bilayer graphene from a theorist's perspective. We discuss the physical properties of graphene in an external magnetic field, reflecting the chiral nature of the quasiparticles near the Dirac point with a Landau level at zero energy. We address the unique integer quantum Hall effects, the role of electron correlations, and the recent observation of the fractional quantum Hall effect in monolayer graphene. The quantum Hall effect in bilayer graphene is fundamentally different from that of a monolayer, reflecting the unique band structure of this system. The theory of transport in the absence of an external magnetic field is discussed in detail, along with the role of disorder studied in various theoretical models. We highlight the differences and similarities between monolayer and bilayer graphene, and focus on thermodynamic properties such as the compressibility, the plasmon spectra, the weak localization correction, quantum Hall effect, and optical properties. Confinement of electrons in graphene is nontrivial due to Klein tunneling. We review various theoretical and experimental studies of quantum confined structures made from graphene. The band structure of graphene nanoribbons and the role of the sublattice symmetry, edge geometry and the size of the nanoribbon on the electronic and magnetic properties are very active areas of research, and a detailed review of these topics is presented. Also, the effects of substrate interactions, adsorbed atoms, lattice defects and doping on the band structure of finite-sized graphene systems are discussed. We also include a brief description of graphane, a gapped material obtained from graphene by attaching hydrogen atoms to each carbon atom in the lattice. Comment: 189 pages. submitted in Advances in Physic

    TMFoldRec: a statistical potential-based transmembrane protein fold recognition tool.

    BACKGROUND: Transmembrane proteins (TMPs) are key components of signal transduction, cell-cell adhesion, and energy and material transport into and out of cells. A deep understanding of these processes requires structure determination of transmembrane proteins. However, due to technical difficulties, only a few transmembrane protein structures have been determined experimentally. Large-scale genomic sequencing provides increasing amounts of sequence information on the proteins and whole proteomes of living organisms, posing a challenge for bioinformatics: how structural information can be derived from a sequence. RESULTS: Here, we present a novel method, TMFoldRec, for fold prediction of membrane segments in transmembrane proteins. TMFoldRec, which is based on statistical potentials, was tested on a benchmark set containing 124 TMP chains from the PDBTM database. Using a 10-fold jackknife method, the native folds were correctly identified in 77 % of the cases. This accuracy exceeds that of state-of-the-art methods. In addition, a key feature of the TMFoldRec algorithm is the ability to estimate the reliability of the prediction and to decide, with an accuracy of 70 %, whether the obtained lowest-energy structure is the native one. CONCLUSION: These results imply that the membrane-embedded parts of TMPs dictate the TM structures rather than the soluble parts. Moreover, the reliability scores attached to predictions make our algorithm applicable to proteome-wide analyses. AVAILABILITY: The program is available upon request for academic use.

    Exploring Fold Space Preferences of New-born and Ancient Protein Superfamilies

    The evolution of proteins is one of the fundamental processes that has delivered the diversity and complexity of life we see around ourselves today. While we tend to define protein evolution in terms of sequence level mutations, insertions and deletions, it is hard to translate these processes to a more complete picture incorporating a polypeptide's structure and function. By considering how protein structures change over time we can gain an entirely new appreciation of their long-term evolutionary dynamics. In this work we seek to identify how populations of proteins at different stages of evolution explore their possible structure space. We apply an annotation of superfamily age to this space and explore the relationship between these ages and a diverse set of properties pertaining to a superfamily's sequence, structure and function. We note several marked differences between the populations of newly evolved and ancient structures, such as in their length distributions, secondary structure content and tertiary packing arrangements. In particular, many of these differences suggest a less elaborate structure for newly evolved superfamilies when compared with their ancient counterparts. We show that the structural preferences we report are not a residual effect of a more fundamental relationship with function. Furthermore, we demonstrate the robustness of our results, using significant variation in the algorithm used to estimate the ages. We present these age estimates as a useful tool to analyse protein populations. In particular, we apply this in a comparison of domains containing greek key or jelly roll motifs.

    Structural Constraints Identified with Covariation Analysis in Ribosomal RNA

    Covariation analysis is used to identify those positions with similar patterns of sequence variation in an alignment of RNA sequences. These constraints on the evolution of two positions are usually associated with a base pair in a helix. While mutual information (MI) has been used to accurately predict an RNA secondary structure and a few of its tertiary interactions, early studies revealed that phylogenetic event counting methods are more sensitive and provide extra confidence in the prediction of base pairs. We developed a novel and powerful phylogenetic events counting method (PEC) for quantifying positional covariation with the Gutell lab’s new RNA Comparative Analysis Database (rCAD). The PEC and MI-based methods each identify unique base pairs, and jointly identify many other base pairs. In total, both methods in combination with an N-best and helix-extension strategy identify the maximal number of base pairs. While covariation methods have effectively and accurately predicted RNA secondary structure, only a few tertiary structure base pairs have been identified. Analyses presented herein and at the Gutell lab’s Comparative RNA Web (CRW) Site reveal that the majority of these latter base pairs do not covary with one another. However, covariation analysis does reveal a weaker although significant covariation between sets of nucleotides that are in proximity in the three-dimensional RNA structure. This reveals that covariation analysis identifies other types of structural constraints beyond the two nucleotides that form a base pair.

    Transcriptional Regulation of Glucose Sensors in Pancreatic β-Cells and Liver: An Update

    Pancreatic β-cells and the liver play a key role in glucose homeostasis. After a meal or in a state of hyperglycemia, glucose is transported into the β-cells or hepatocytes where it is metabolized. In the β-cells, glucose is metabolized to increase the ATP:ADP ratio, resulting in the secretion of insulin stored in vesicles. In the hepatocytes, glucose is metabolized to CO2 or fatty acids, or stored as glycogen. In these cells, solute carrier family 2 (SLC2A2) and glucokinase play a key role in sensing and taking up glucose. Dysfunction of these proteins results in hyperglycemia, which is one of the characteristics of type 2 diabetes mellitus (T2DM). Thus, studies on the molecular mechanisms of their transcriptional regulation are important in understanding the pathogenesis of T2DM and combating it. In this paper, we review recent progress on the gene regulation of glucose sensors in the liver and β-cells.

    Accurate Protein Structure Annotation through Competitive Diffusion of Enzymatic Functions over a Network of Local Evolutionary Similarities

    High-throughput Structural Genomics yields many new protein structures without known molecular function. This study aims to uncover these missing annotations by globally comparing select functional residues across the structural proteome. First, Evolutionary Trace Annotation, or ETA, identifies which proteins have local evolutionary and structural features in common; next, these proteins are linked together into a proteomic network of ETA similarities; then, starting from proteins with known functions, competing functional labels diffuse link-by-link over the entire network. Every node is thus assigned a likelihood z-score for every function, and the most significant one at each node wins and defines its annotation. In high-throughput controls, this competitive diffusion process recovered enzyme activity annotations with 99% and 97% accuracy at half-coverage for the third and fourth Enzyme Commission (EC) levels, respectively. This corresponds to false positive rates 4-fold lower than nearest-neighbor and 5-fold lower than sequence-based annotations. In practice, experimental validation of the predicted carboxylesterase activity in a protein from Staphylococcus aureus illustrated the effectiveness of this approach in the context of an increasingly drug-resistant microbe. This study further links molecular function to a small number of evolutionarily important residues recognizable by Evolutionary Tracing and it points to the specificity and sensitivity of functional annotation by competitive global network diffusion. A web server is at http://mammoth.bcm.tmc.edu/networks

    Gaia Early Data Release 3: Summary of the contents and survey properties

    ABSTRACT: Context. We present the early installment of the third Gaia data release, Gaia EDR3, consisting of astrometry and photometry for 1.8 billion sources brighter than magnitude 21, complemented with the list of radial velocities from Gaia DR2. Aims. A summary of the contents of Gaia EDR3 is presented, accompanied by a discussion on the differences with respect to Gaia DR2 and an overview of the main limitations which are present in the survey. Recommendations are made on the responsible use of Gaia EDR3 results. Methods. The raw data collected with the Gaia instruments during the first 34 months of the mission have been processed by the Gaia Data Processing and Analysis Consortium and turned into this early third data release, which represents a major advance with respect to Gaia DR2 in terms of astrometric and photometric precision, accuracy, and homogeneity. Results. Gaia EDR3 contains celestial positions and the apparent brightness in G for approximately 1.8 billion sources. For 1.5 billion of those sources, parallaxes, proper motions, and the (GBP − GRP) colour are also available. The passbands for G, GBP, and GRP are provided as part of the release. For ease of use, the 7 million radial velocities from Gaia DR2 are included in this release, after the removal of a small number of spurious values. New radial velocities will appear as part of Gaia DR3. Finally, Gaia EDR3 represents an updated materialisation of the celestial reference frame (CRF) in the optical, the Gaia-CRF3, which is based solely on extragalactic sources. The creation of the source list for Gaia EDR3 includes enhancements that make it more robust with respect to high proper motion stars, and the disturbing effects of spurious and partially resolved sources. The source list is largely the same as that for Gaia DR2, but it does feature new sources and there are some notable changes. The source list will not change for Gaia DR3. Conclusions. Gaia EDR3 represents a significant advance over Gaia DR2, with parallax precisions increased by 30 per cent, proper motion precisions increased by a factor of 2, and the systematic errors in the astrometry suppressed by 30-40% for the parallaxes and by a factor ~2.5 for the proper motions. The photometry also features increased precision, but above all much better homogeneity across colour, magnitude, and celestial position. A single passband for G, GBP, and GRP is valid over the entire magnitude and colour range, with no systematics above the 1% level.
    The Gaia mission and data processing have financially been supported by ; the Spanish Ministry of Economy (MINECO/FEDER, UE) through grants ESP2016-80079-C2-1-R, ESP2016-80079-C2-2-R, RTI2018-095076-B-C21, RTI2018-095076-B-C22, BES-2016-078499, and BES-2017-083126 and the Juan de la Cierva formación 2015 grant FJCI-2015-2671, the Spanish Ministry of Education, Culture, and Sports through grant FPU16/03827, the Spanish Ministry of Science and Innovation (MICINN) through grant AYA2017-89841P for project “Estudio de las propiedades de los fósiles estelares en el entorno del Grupo Local” and through grant TIN2015-65316-P for project “Computación de Altas Prestaciones VII

    Pulsations in main sequence OBAF-type stars

    CONTEXT: The third Gaia data release provides photometric time series covering 34 months for about 10 million stars. For many of those stars, a characterisation in Fourier space and their variability classification are also provided. This paper focuses on intermediate- to high-mass (IHM) main sequence pulsators (M ≥ 1.3 M⊙) of spectral types O, B, A, or F, known as β Cep, slowly pulsating B (SPB), δ Sct, and γ Dor stars. These stars are often multi-periodic and display low amplitudes, making them challenging targets to analyse with sparse time series. AIMS: We investigate the extent to which the sparse Gaia DR3 data can be used to detect OBAF-type pulsators and discriminate them from other types of variables. We aim to probe the empirical instability strips and compare them with theoretical predictions. The most populated variability class is that of the δ Sct variables. For these stars, we aim to confirm their empirical period-luminosity (PL) relation, and verify the relation between their oscillation amplitude and rotation. METHODS: All datasets used in this analysis are part of the Gaia DR3 data release. The photometric time series were used to perform a Fourier analysis, while the global astrophysical parameters necessary for the empirical instability strips were taken from the Gaia DR3 gspphot tables, and the v sin i data were taken from the Gaia DR3 esphs tables. The δ Sct PL relation was derived using the same photometric parallax method as the one recently used to establish the PL relation for classical Cepheids using Gaia data. RESULTS: We show that for nearby OBAF-type pulsators, the Gaia DR3 data are precise and accurate enough to pinpoint them in the Hertzsprung-Russell (HR) diagram. We find empirical instability strips covering broader regions than theoretically predicted. In particular, our study reveals the presence of fast rotating gravity-mode pulsators outside the strips, as well as the co-existence of rotationally modulated variables inside the strips as reported before in the literature. We derive an extensive period–luminosity relation for δ Sct stars and provide evidence that the relation features different regimes depending on the oscillation period. We demonstrate how stellar rotation attenuates the amplitude of the dominant oscillation mode of δ Sct stars. CONCLUSIONS: The Gaia DR3 time-series photometry already allows for the detection of the dominant (non-)radial oscillation mode in about 100 000 intermediate- and high-mass dwarfs across the entire sky. This detection capability will increase as the time series becomes longer, allowing the additional delivery of frequencies and amplitudes of secondary pulsation modes.

    Gaia Data Release 3: Mapping the asymmetric disc of the Milky Way

    With the most recent Gaia data release the number of sources with complete 6D phase space information (position and velocity) has increased to well over 33 million stars, while stellar astrophysical parameters are provided for more than 470 million sources, in addition to the identification of over 11 million variable stars. Using the astrophysical parameters and variability classifications provided in Gaia DR3, we select various stellar populations to explore and identify non-axisymmetric features in the disc of the Milky Way in both configuration and velocity space. Using more than 580 thousand sources identified as hot OB stars, together with 988 known open clusters younger than 100 million years, we map the spiral structure associated with star formation 4-5 kpc from the Sun. We select over 2800 Classical Cepheids younger than 200 million years, which show spiral features extending as far as 10 kpc from the Sun in the outer disc. We also identify more than 8.7 million sources on the red giant branch (RGB), of which 5.7 million have line-of-sight velocities, allowing the velocity field of the Milky Way to be mapped as far as 8 kpc from the Sun, including the inner disc. The spiral structure revealed by the young populations is consistent with recent results using Gaia EDR3 astrometry and source lists based on near infrared photometry, showing the Local (Orion) arm to be at least 8 kpc long, and an outer arm consistent with what is seen in HI surveys, which seems to be a continuation of the Perseus arm into the third quadrant. Meanwhile, the subset of RGB stars with velocities clearly reveals the large scale kinematic signature of the bar in the inner disc, as well as evidence of streaming motions in the outer disc that might be associated with spiral arms or bar resonances. (abridged)