
15 research outputs found

Composite graphs showing the following: Descriptor variation along the regions before, at and after the analysed PSSE; the reliability value (or % of helical structure at each locus) and the p-value for the descriptor: Number of contacts, type “HBMM”. Data are drawn from the datamart containing PSSEs of length = 12 AARs; the consensus definition of a helix element is from “PDB-DSSP-Stride”, and the redundancy is 70% similarity at the sequence level.

Author: Goran Neshich (9940)
Inácio Henrique Yano (5497472)
Ivan Mazoni (516011)
José Augusto Salim (5497481)
José Gilberto Jardine (5497475)
Luiz César Borro (5497478)


FigShare

The KS test applied to sliding windows for helices of all sizes using single-parameter analysis.

Author: Goran Neshich (9940)
Inácio Henrique Yano (5497472)
Ivan Mazoni (516011)
José Augusto Salim (5497481)
José Gilberto Jardine (5497475)
Luiz César Borro (5497478)

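As a rough illustration of a sliding-window Kolmogorov-Smirnov comparison, the sketch below computes the two-sample KS statistic D (the largest gap between two empirical cumulative distribution functions) for each window position along a descriptor profile. It assumes the comparison is between values inside the window and values outside it; the function names, the toy profile and the window size are hypothetical, and no p-value is derived from D here.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return the empirical CDF of a sample as a callable F(x)."""
    ordered = sorted(sample)
    n = len(ordered)

    def cdf(x):
        # Fraction of observations less than or equal to x.
        return bisect_right(ordered, x) / n

    return cdf


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: max |F_a(x) - F_b(x)| over observed values."""
    cdf_a, cdf_b = ecdf(sample_a), ecdf(sample_b)
    points = sorted(set(sample_a) | set(sample_b))
    return max(abs(cdf_a(x) - cdf_b(x)) for x in points)


def sliding_ks(profile, window):
    """KS statistic of inside-window vs outside-window values per start position."""
    scores = []
    for start in range(len(profile) - window + 1):
        inside = profile[start:start + window]
        outside = profile[:start] + profile[start + window:]
        scores.append(ks_statistic(inside, outside))
    return scores


# Hypothetical averaged descriptor profile: positions 4..7 mimic a helix.
profile = [0.1, 0.2, 0.1, 0.2, 0.9, 1.0, 0.9, 1.0, 0.2, 0.1, 0.2, 0.1]
ks_scores = sliding_ks(profile, window=4)
best_start = max(range(len(ks_scores)), key=ks_scores.__getitem__)
print(best_start, ks_scores[best_start])
```

In this toy profile the window that exactly covers the elevated region separates the two samples completely, so D reaches its maximum there.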

FigShare

Differences in the variation behaviour of two selected descriptors around α-helices (solid lines) and β-strands (dotted lines).

Author: Goran Neshich (9940)
Inácio Henrique Yano (5497472)
Ivan Mazoni (516011)
José Augusto Salim (5497481)
José Gilberto Jardine (5497475)
Luiz César Borro (5497478)

The plots present the behaviour of: A) EP@Cα average values for 1811 α-helices and 7773 β-strands in (α+β)+(α/β) proteins; B) HBMM_WNASurf average values for α-helices and β-strands in (α+β)+(α/β) proteins. The average number of this contact type is higher in and around α-helices than in and around β-strands. As shown, the signal patterns clearly differ in both A and B.

FigShare

Variation in the number of descriptors that passed both the normal distribution test and the no mutual correlation test for different helix sizes.

Author: Goran Neshich (9940)
Inácio Henrique Yano (5497472)
Ivan Mazoni (516011)
José Augusto Salim (5497481)
José Gilberto Jardine (5497475)
Luiz César Borro (5497478)


FigShare

Study of specific nanoenvironments containing α-helices in all-α and (α+β)+(α/β) proteins

Author: Goran Neshich (9940)
Inácio Henrique Yano (5497472)
Ivan Mazoni (516011)
José Augusto Salim (5497481)
José Gilberto Jardine (5497475)
Luiz César Borro (5497478)

Protein secondary structure elements (PSSEs) such as α-helices, β-strands, and turns are the primary building blocks of the tertiary protein structure. Our primary interest here is to reveal the characteristics of the nanoenvironment formed by both PSSEs and their surrounding amino acid residues (AARs), which might contribute to the general understanding of how proteins fold. The characteristics of such nanoenvironments must be specific to each secondary structure element, and we have set our goal here to gather the fullest possible description of the α-helical nanoenvironment. In general, this postulate (the existence of specific nanoenvironments for specific protein substructures/neighbourhoods/regions with distinct functionality) was already successfully explored and confirmed for some protein regions, such as protein-protein interfaces and enzyme catalytic sites. Consequently, PSSEs were the obvious next choice for gathering further evidence that specific nanoenvironments (having characteristics fully describable by means of structural and physicochemical descriptors) do exist for the corresponding intraprotein regions. The nanoenvironment of α-helices (nEoαH) is defined as any region of the protein where this secondary structure element type is detected. The nEoαH, therefore, includes not only the α-helix amino acid residues but also the residues immediately around the α-helix. The hypothesis that motivated this work is that it might in fact be possible to detect a postulated “signal” or “signature” that distinguishes the specific location of α-helices. This “signal” must be discernible by tracking how the values of physical, chemical, physicochemical, structural and geometric descriptors immediately before (or after) the PSSE differ from those in the region along the α-helices. The search for this specific nanoenvironment “signal” was made possible by aligning previously selected α-helices of equal length.
Afterward, we calculated the average value, standard deviation and mean square error at each aligned residue position for each selected descriptor. We applied Student’s t-test, the Kolmogorov-Smirnov test and MANOVA statistical tests to the dataset constructed as described above, and the results confirmed that the hypothesized “signal”/“signature” both exists and is identifiable, and that it is capable of distinguishing the presence of an α-helix inside the specific nanoenvironment, contextualized as a specific region within the whole protein. However, such a conclusion might rarely be reached if only one descriptor is considered at a time. A more accurate signal with broader coverage is achieved only if one applies multivariate analysis, which means that several descriptors (usually approximately 10 descriptors) should be considered at the same time. To a limited extent (up to a maximum of 15% of cases), such a conclusion is also possible with only a single descriptor, and the conclusion is also possible in general for up to 50–80% of cases when no fewer than 5 nonlinear descriptors are selected and considered. Using all the descriptors considered in this work, provided all assumptions about data characteristics for this analysis are met, multivariate analysis regularly reached a coverage and accuracy above 90%. Understanding how secondary structure elements are formed and maintained within a protein structure could enable a more detailed understanding of how proteins reach their final 3D structure and, consequently, their function. Likewise, this knowledge may also improve the tools used to determine how good a structure is by comparing the “signal” around a selected PSSE with the one obtained from the best (resolution and quality wise) protein structures available.
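The per-position statistics described above can be sketched as follows. This is a minimal illustration, assuming each helix (with its flanks) has already been cut to the same length so that column i always refers to the same aligned position; the function name and the toy values are hypothetical, and the standard error of the mean (SEM) is computed here as the spread measure, which is an assumption rather than the authors' exact definition.

```python
from math import sqrt
from statistics import mean, stdev


def per_position_stats(aligned_values):
    """Return (mean, sample SD, SEM) for every aligned residue position.

    aligned_values: list of equal-length lists, one list of descriptor
    values per positionally aligned helix.
    """
    n_positions = len(aligned_values[0])
    if any(len(row) != n_positions for row in aligned_values):
        raise ValueError("all aligned segments must have the same length")

    stats = []
    for pos in range(n_positions):
        column = [row[pos] for row in aligned_values]
        avg = mean(column)
        sd = stdev(column)            # sample standard deviation (n - 1)
        sem = sd / sqrt(len(column))  # standard error of the mean
        stats.append((avg, sd, sem))
    return stats


# Hypothetical descriptor values for three aligned segments of length 3.
toy_segments = [
    [1.0, 2.0, 3.0],
    [3.0, 2.0, 5.0],
    [2.0, 2.0, 4.0],
]
position_stats = per_position_stats(toy_segments)
for position, (avg, sd, sem) in enumerate(position_stats):
    print(position, avg, sd, sem)
```

Averaging column by column is what makes equal-length alignment necessary: segments of different lengths would place helix and non-helix residues in the same column.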

FigShare

The Protein Secondary Structure Sting Analyzer (PS3A) panels contain four types of plots.

Author: Goran Neshich (9940)
Inácio Henrique Yano (5497472)
Ivan Mazoni (516011)
José Augusto Salim (5497481)
José Gilberto Jardine (5497475)
Luiz César Borro (5497478)

In the case shown, 987 α-helices that are 15 amino acid residues long were examined from the datamart in which we removed 70% of the redundancy at the whole protein sequence level, and all instances of α-helices were taken from both all-α and (α+β)+(α/β) proteins. The consensus definition used to determine the presence of an α-helical structure within proteins was PDB-DSSP-Stride, the most rigorous one. The total number of such proteins is indicated in the Supporting Information in Figure B in S1 File (http://www.plosone.org/article/info:doi/10.1371/journal.pone.0200018#pone.0200018.s001). Plots produced by the PS3A software: A) XY plot for average values (± SEM) for the selected descriptor: electrostatic potential at the α-carbon atom (CA). Negative numbers along the x-axis indicate locations to the left of the N-terminal of the examined/central PSSE, and positive ones follow its C-terminal end. B) The degree of occupancy per AAR position or “reliability”, which is the estimate of how accurately the signal may be observed in A) above. This estimate is only based on how many amino acid residues are present at any location of the positional alignment of the PSSE. The maximum value (100% reliability) is assumed for the ensemble of studied samples along the PSSE. Outside the PSSE, the reliability is usually lower than 100%. C) The sequence logo presents which amino acid type is more frequently found at each positional alignment location, basically indicating the consensus sequence of the PSSE for a selected length (also shown at the bottom part of the logo). The amino acid position numbers (shown on the upper part of the plot) follow the same convention described for A) above. D) The ECDF curve shows how the descriptor average values inside the PSSE region are different from the corresponding values outside the selected PSSE.
All of these plots (for each selected PSSE length, type of protein and redundancy level) may be accessed at https://www.ps3a.cbi.cnptia.embrapa.br/.
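The “reliability” in panel B can be read as a per-position occupancy count. The sketch below is one possible way to compute it, assuming each aligned row uses `None` where the chain has no residue at that position (for example, because the helix sits near a chain terminus); the function name, the `None` convention and the toy rows are hypothetical.

```python
def reliability(aligned_rows):
    """Percentage of aligned rows that have a residue at each position.

    aligned_rows: equal-length lists of one-letter residue codes, with
    None marking positions where no residue exists in that chain.
    """
    n_rows = len(aligned_rows)
    n_positions = len(aligned_rows[0])
    return [
        100.0 * sum(row[pos] is not None for row in aligned_rows) / n_rows
        for pos in range(n_positions)
    ]


# Hypothetical alignment: columns 2 and 3 stand for the PSSE itself,
# the remaining columns for the flanking regions.
rows = [
    [None, "A", "L", "E", "K"],
    ["G", "S", "L", "E", None],
    [None, None, "I", "D", "R"],
    ["P", "T", "L", "E", "K"],
]
occupancy = reliability(rows)
print(occupancy)  # [50.0, 75.0, 100.0, 100.0, 75.0]
```

Positions inside the PSSE are occupied in every row by construction, which is why they reach 100%, while the flanks drop below that value.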

FigShare

Comparison of the average values of 8 descriptors, normalized by the inverse coefficient of variation (that is, by dividing the parameter values by the corresponding standard deviation), and calculated for regions inside (17 AARs) and outside the PSSE.

Author: Goran Neshich (9940)
Inácio Henrique Yano (5497472)
Ivan Mazoni (516011)
José Augusto Salim (5497481)
José Gilberto Jardine (5497475)
Luiz César Borro (5497478)

The following descriptors are likely to show the postulated “signal” (the differences between the inside and outside descriptor values per position are higher than 1): 1. Hbmm, 15. Hbmm_WNADist, 29. Hbmm_WNASurf, 61. Number_Unused_Contact_WNADist, 62. Number_Unused_Contact_WNASurf, 63. Dihedral_Angle_PHI, 64. Dihedral_Angle_PSI, 66. Density. The two shadowed descriptors are expected to show differences, as these descriptors are basically part of the definition of the investigated PSSE.
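The threshold rule above can be illustrated with a small sketch. It assumes that “normalized” means dividing by the sample standard deviation taken over all inside and outside values of a descriptor together, which is only one possible reading of “the corresponding standard deviation”; the function name, the descriptor labels in the toy data and the numbers are hypothetical.

```python
from statistics import mean, stdev


def normalized_difference(inside, outside):
    """|mean(inside) - mean(outside)| in units of the overall sample SD."""
    sd = stdev(inside + outside)  # assumption: SD over all values together
    if sd == 0:
        return 0.0
    return abs(mean(inside) - mean(outside)) / sd


# Hypothetical descriptor values inside and outside a PSSE.
toy_descriptors = {
    "descriptor_with_signal": ([4.0, 5.0, 6.0], [1.0, 2.0, 3.0]),
    "descriptor_without_signal": ([1.0, 2.0, 3.0], [1.5, 2.0, 2.5]),
}

# Keep only descriptors whose normalized difference exceeds 1.
flagged = [
    name
    for name, (inside, outside) in toy_descriptors.items()
    if normalized_difference(inside, outside) > 1.0
]
print(flagged)
```

Dividing by the standard deviation puts descriptors with very different units on a common scale, so a single cut-off of 1 can be applied to all of them.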

FigShare

The p-value of Student's t-test evaluation for a selected descriptor value along the “sliding window” for positionally aligned PSSE sequences.

Author: Goran Neshich (9940)
Inácio Henrique Yano (5497472)
Ivan Mazoni (516011)
José Augusto Salim (5497481)
José Gilberto Jardine (5497475)
Luiz César Borro (5497478)

The coverage of the sequence containing a PSSE is from the N- to the C-terminal ends (± 32 AARs). Each sequence includes the PSSE plus 32 residues before its N-terminal and 32 residues after its C-terminal. The “sliding window” size in this particular case is the same as the selected PSSE length (12 AARs). Student’s t-test is used for each position of the sliding window. This test measures how much the data inside the “sliding window” differ from the data outside the window. The p-values are shown along the y-axis. A p-value that approaches zero in any particular region means that within this region, the descriptor values differ from the values outside the region in a statistically significant manner. The arrows indicate the direction of movement for the “sliding window box” (shown here before, at and after the PSSE), and the solid arrow indicates the exact position of the N-terminal of the PSSE. Shadowed boxes indicate the size of the sliding window placed at three specific positions. The region with a p-value approximating zero coincides with the positional alignment of the α-helix that has the exact same size. The sharp invagination around AAR position 52 is not as representative (too short compared to the PSSE under investigation) as the one directly on top and over the whole analysed PSSE.
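The sliding-window procedure can be sketched in a few lines. The example below computes the pooled-variance (Student) t statistic for inside-window versus outside-window values at every window position; converting the statistic into a p-value requires the t distribution with n_a + n_b − 2 degrees of freedom, which the Python standard library does not provide, so that step is left out. The function names, the toy profile and the window size of 4 are hypothetical stand-ins for the 12-residue window and ±32-residue flanks described above.

```python
from math import copysign, inf, sqrt
from statistics import mean, variance


def student_t(sample_a, sample_b):
    """Pooled-variance two-sample t statistic (each sample needs >= 2 values)."""
    n_a, n_b = len(sample_a), len(sample_b)
    pooled_var = (
        (n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)
    ) / (n_a + n_b - 2)
    std_err = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    diff = mean(sample_a) - mean(sample_b)
    if std_err == 0:
        # Degenerate case: no spread in either sample.
        return 0.0 if diff == 0 else copysign(inf, diff)
    return diff / std_err


def sliding_t(profile, window):
    """|t| of inside-window vs outside-window values for each start position."""
    scores = []
    for start in range(len(profile) - window + 1):
        inside = profile[start:start + window]
        outside = profile[:start] + profile[start + window:]
        scores.append(abs(student_t(inside, outside)))
    return scores


# Hypothetical averaged descriptor profile: positions 4..7 mimic the PSSE.
profile = [0.1, 0.2, 0.1, 0.2, 0.9, 1.0, 0.9, 1.0, 0.2, 0.1, 0.2, 0.1]
t_scores = sliding_t(profile, window=4)
peak_start = max(range(len(t_scores)), key=t_scores.__getitem__)
print(peak_start, round(t_scores[peak_start], 2))
```

A large |t| corresponds to a small p-value, so the peak in this toy profile plays the same role as the region where the plotted p-value approaches zero.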

FigShare

Grouping of same-length α-helices using consensus definitions based on the PDB, DSSP and Stride classifications.

Author: Goran Neshich (9940)
Inácio Henrique Yano (5497472)
Ivan Mazoni (516011)
José Augusto Salim (5497481)
José Gilberto Jardine (5497475)
Luiz César Borro (5497478)

There are four possible consensus groups. (A) PDB-DSSP-Stride: when the secondary structure element starts and finishes at the same corresponding amino acid residue location and hence, has the same length according to the PDB, DSSP and Stride definitions. (B), (C) and (D) when the secondary structure elements start but do NOT finish at the same amino acid residue, as defined by one of the three criteria used: PDB-DSSP, PDB-Stride, and DSSP-Stride definitions, respectively.

FigShare

An example of an α-helix (in a specific (α+β) protein) and its nanoenvironment: The synthetic gene encoded DcpS bound to the inhibitor DG157493 (3bl9.pdb) has fourteen α-helices, and each helix has its own nanoenvironment.

Author: Goran Neshich (9940)
Inácio Henrique Yano (5497472)
Ivan Mazoni (516011)
José Augusto Salim (5497481)
José Gilberto Jardine (5497475)
Luiz César Borro (5497478)

Highlighted inside the transparent spheres is an α-helix (ribbon, purple). The nanoenvironment includes the amino acid residues of the α-helix and the amino acid residues around the helix that are within reach of the probing sphere, whose radius was previously selected. The pre- and postregions (extension by 32 AARs each) are not shown here for the sake of clarity of the basic definition.
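The probing-sphere definition can be illustrated with a simple distance filter. This sketch assumes one representative coordinate per residue (for instance the Cα atom) and plain Euclidean distance from any helix residue; the function name, the coordinates and the 5.0 radius are hypothetical, since the description above does not state the radius used.

```python
from math import dist


def nanoenvironment(coords, helix_ids, radius):
    """Residue ids of the helix plus all residues within `radius` of it.

    coords: dict mapping residue id -> (x, y, z) of a representative atom
    (assumed here to be the C-alpha atom).
    """
    helix = set(helix_ids)
    members = set(helix)
    for res_id, xyz in coords.items():
        if res_id in helix:
            continue
        # A residue belongs if a sphere centred on any helix residue reaches it.
        if any(dist(xyz, coords[h]) <= radius for h in helix):
            members.add(res_id)
    return sorted(members)


# Hypothetical coordinates: residues 1-3 form the helix.
toy_coords = {
    1: (0.0, 0.0, 0.0),
    2: (1.5, 0.0, 0.0),
    3: (3.0, 0.0, 0.0),
    10: (3.0, 4.0, 0.0),   # 4.0 away from residue 3
    11: (30.0, 0.0, 0.0),  # far from every helix residue
}
environment = nanoenvironment(toy_coords, helix_ids=[1, 2, 3], radius=5.0)
print(environment)  # [1, 2, 3, 10]
```

A larger radius pulls in more neighbouring residues, so the chosen value directly controls how wide the nanoenvironment is.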

FigShare