Modified z-score vectors can be interpreted to determine direction of feature-wise change.
<p>We show the modified z-score vector for TSR1 as a heat map (more information in the “Visualization of Results” section in the Methods) and representative images of TSR1 for the wild-type and the <i>Δrpd3</i> screens. TSR1 has strongly positive modified z-scores in mother features for distance between proteins, distance to protein mass centre, distance to mass centre, and distance to bud neck, and negative ones in features for distance to cell periphery. Thus, we expect the former features to be larger in the wild-type than in the <i>Δrpd3</i> and the latter to be larger in the <i>Δrpd3</i> than in the wild-type. In other words, we expect the GFP in <i>Δrpd3</i> mother cells to be denser, closer to the nuclei, closer to the bud neck, and further from the cell periphery relative to the wild-type. This interpretation is consistent with the localization patterns seen in the images.</p>
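As a rough illustration of how the sign of such a vector can be read, the following minimal sketch computes modified z-scores for one feature. It assumes the common median/MAD definition with the scaling constant 0.6745; the function name and the toy data are hypothetical, and the exact form used by the authors is the one given in their Methods.

```python
import numpy as np

def modified_z_scores(values):
    """Modified z-score of each entry relative to the median and MAD of `values`.

    Assumption: the common median/MAD form with constant 0.6745,
    which may differ from the definition in the authors' Methods.
    """
    values = np.asarray(values, dtype=float)
    median = np.median(values)
    mad = np.median(np.abs(values - median))
    if mad == 0:
        raise ValueError("MAD is zero; the modified z-score is undefined")
    return 0.6745 * (values - median) / mad

# Hypothetical wild-type minus mutant differences for one feature across five genes.
# A positive score means the feature is larger in the wild-type than in the mutant.
diffs = np.array([0.1, -0.2, 0.0, 0.15, 3.0])
scores = modified_z_scores(diffs)
```

Under this convention, the last gene in the toy data stands out with a large positive score, which would be read as the feature being larger in the wild-type.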
Curated examples from the top 20 ranked genes from the mean profile.
<p>Representative images from the wild-type and <i>Δrpd3</i> screens are shown for 4 newly found genes exhibiting localization changes, along with a description of the change and the rank of the gene using the mean profile. In addition, we show two examples of false positives in the top 20 ranked genes. Cell segmentations are outlined in blue, with mother-bud associations shown as white circles. The regions crossed out in white are artifacts discarded by the image analysis software.</p>
Example of a false positive generated by outliers in single-cell data.
<p>ADD37 is reported as a top-ranked gene for localization change using mean profiles despite having no obvious phenotype. Panel A shows that the feature for distance between proteins is very strongly negative in bud cells in bin 3. The single-cell data for this bin are shown in B, where a single cell has a disproportionately high value for this feature, skewing the mean. Looking up this cell in the micrograph in C shows that the cell is either mis-segmented or expressing the protein differently from other cells. Feature abbreviations can be found in the “Visualization of Results” section in the Methods.</p>
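The effect of one outlying cell on a mean profile can be sketched as follows. The cell values, the 10% cut fraction, and the helper name `truncated_mean` are assumptions chosen for illustration, not values taken from the study.

```python
import numpy as np

def truncated_mean(values, cut=0.1):
    """Mean after dropping the lowest and highest `cut` fraction of values."""
    ordered = np.sort(np.asarray(values, dtype=float))
    k = int(cut * ordered.size)  # number of cells dropped at each end
    trimmed = ordered[k:ordered.size - k] if k > 0 else ordered
    return trimmed.mean()

# Hypothetical feature values for ten cells in one bin; the last cell is an outlier.
cells = np.array([1.0, 1.1, 0.9, 1.0, 1.2, 0.8, 1.0, 1.1, 0.9, 25.0])

summary = {
    "mean": cells.mean(),                              # pulled up by the outlier
    "truncated_mean": truncated_mean(cells, cut=0.1),  # outlier dropped
    "median": np.median(cells),                        # unaffected by the outlier
}
```

In this toy bin the plain mean is more than three times the median, while the truncated mean stays close to the typical cell.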
Fraction of known localization changes retrieved against the log<sub>10</sub> of fraction of all genes retrieved for the <i>k</i>NN method with mean, truncated mean and median profiling.
<p>We also show the naïve method and the X = Y curve for comparison.</p>
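A curve of this kind could be computed from a ranked gene list along the following lines. This is a minimal sketch: the function name, the gene identifiers, and the set of known changes are hypothetical.

```python
import math

def retrieval_curve(ranked_genes, known_changes):
    """Return (log10 fraction of all genes retrieved, fraction of known changes retrieved)
    after each position in the ranking."""
    known = set(known_changes)
    if not known:
        raise ValueError("at least one known localization change is required")
    total = len(ranked_genes)
    found = 0
    points = []
    for i, gene in enumerate(ranked_genes, start=1):
        if gene in known:
            found += 1
        points.append((math.log10(i / total), found / len(known)))
    return points

# Hypothetical ranking of ten genes, two of which are known localization changes.
ranked = ["g1", "g2", "g3", "g4", "g5", "g6", "g7", "g8", "g9", "g10"]
curve = retrieval_curve(ranked, {"g1", "g4"})
```

A method that ranks known changes early reaches a high retrieved fraction while the x-coordinate is still strongly negative.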
Introduction of Premature Stop Codons as an Evolutionary Strategy To Rescue Signaling Network Function
The cellular concentrations of key components of signaling networks are tightly regulated, as deviations from their optimal ranges can have negative effects on signaling function. For example, overexpression of the yeast mating pathway mitogen-activated protein kinase (MAPK) Fus3 decreases pathway output, in part by sequestering individual components away from functional multiprotein complexes. Using a synthetic biology approach, we investigated potential mechanisms by which selection could compensate for a decrease in signaling activity caused by overexpression of Fus3. We overexpressed a library of random mutants of Fus3 and used cell sorting to select variants that rescued mating pathway activity. Our results uncovered that one remarkable way in which selection can compensate for protein overexpression is by introducing premature stop codons at permitted positions. Because of the low efficiency with which premature stop codons are read through, the resulting cellular concentration of active Fus3 returns to values within the range required for proper signaling. Our results underscore the importance of interpreting genotypic variation at the systems rather than at the individual gene level, as mutations can have opposite effects on protein and network function.
Met4 binding by Cbf1 and Met31 is dependent upon ordering but not orientation.
<p>We scanned all intergenic regions in yeast with the models presented in <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0001199#pone-0001199-g001" target="_blank">Fig. 1</a> with different orientations and orderings relative to the gene start point. We then ranked all genes in the genome based on the strength of their strongest upstream binding site, and we present here the corresponding expression changes as determined by microarrays. The experiments that each column represents correspond to those in <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0001199#pone-0001199-g006" target="_blank">Fig. 6</a>. For columns 1–9 (marked with a gray box) we expect regulated genes to have increased expression and therefore to be red. For columns 10–13 (marked with a blue box) we expect regulated genes to have decreased expression and therefore to be green. Since the Met31 matrix is asymmetric, it could bind with two different orientations. Those circles labeled “Met31” have the same orientation as the Met31 logo in <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0001199#pone-0001199-g001" target="_blank">Fig. 1</a>. Those circles labeled “Inverted” have the opposite orientation (see <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0001199#pone-0001199-g005" target="_blank">Fig. 5</a>). The optimal combination in the lower right corner allows for either orientation of Met31. The arrow signifies the gene start. The average expression change for the top 30 genes was calculated for each combination of sites for both the induced (columns 1 to 9) and repressed (columns 10 to 13) experiments and is reported next to the respective columns.</p>
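The ranking step can be illustrated with a simplified sketch that scores each upstream region with a single position weight matrix and optionally also scans the reverse complement to allow either orientation. It does not model the ordering of Cbf1 and Met31 sites, and the toy matrix, sequences, and function names are all hypothetical rather than the authors' models.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def reverse_complement(seq):
    """Reverse complement of a DNA sequence written with A, C, G, T."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def best_site_score(seq, matrix, both_orientations=False):
    """Highest weight-matrix score over all windows of `seq`.

    `matrix` is a list of per-position dicts mapping base -> score.
    With `both_orientations=True` the reverse complement is scanned as well.
    """
    width = len(matrix)
    strands = [seq]
    if both_orientations:
        strands.append(reverse_complement(seq))
    best = float("-inf")
    for strand in strands:
        for start in range(len(strand) - width + 1):
            score = sum(matrix[i][strand[start + i]] for i in range(width))
            best = max(best, score)
    return best

def rank_genes(upstream, matrix, both_orientations=False):
    """Genes sorted by the score of their strongest upstream site, best first."""
    scores = {gene: best_site_score(seq, matrix, both_orientations)
              for gene, seq in upstream.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Toy 3-position matrix with consensus ACG (hypothetical, not the Met31 model).
matrix = [
    {"A": 2, "C": -1, "G": -1, "T": -1},
    {"A": -1, "C": 2, "G": -1, "T": -1},
    {"A": -1, "C": -1, "G": 2, "T": -1},
]
upstream = {"geneA": "TTACGTT", "geneB": "TTCGTTT", "geneC": "TTTTTTT"}
ranking = rank_genes(upstream, matrix)
```

In the toy data, `geneB` carries the consensus only on the opposite strand, so its best score rises when both orientations are allowed.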
The Met4 activation model based on our analysis.
<p>We summarize here the spacing, ordering, and orientation constraints we used to define functional Met4 binding sites. Since Met31 can bind with either orientation, we show logos for both Met31 orientations. The distances between each set of Cbf1 and Met31 sites were plotted with red boxes on a cosine wave for 23 high-ranking genes to show helical preferences. The arrow represents the translational start, and the allowed distance between the Met4 stabilization complex and the translational start is written above it. The expression data on the right is what was predicted by this model, and is described in <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0001199#pone-0001199-g006" target="_blank">Fig. 6</a>.</p>
Cbf1 and Met31 sequence logos.
<p>Sequence logos were made as described in <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0001199#s2" target="_blank">Materials and Methods</a>. The height of each letter is proportional to the frequency of that base at that position. The height of the letter stack is the information content at that position. The cosine wave represents the helical twist of B-form DNA. The sequence logos were generated using the standard Delila programs <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0001199#pone.0001199-Schneider1" target="_blank">[19]</a>, <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0001199#pone.0001199-Schneider3" target="_blank">[21]</a>.</p>
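The relationship between base frequencies, stack height, and letter height can be sketched as below. This is a simplified calculation that assumes equiprobable background bases (a maximum of 2 bits per position) and omits any small-sample correction; the function name and counts are hypothetical, and it is not the Delila implementation.

```python
import math

def logo_heights(counts):
    """Information content (bits) at one position and the height of each letter.

    `counts` maps each base to its count at this position.
    Assumes a uniform background over A, C, G, T and no small-sample correction.
    """
    total = sum(counts.values())
    freqs = {base: count / total for base, count in counts.items()}
    entropy = -sum(f * math.log2(f) for f in freqs.values() if f > 0)
    info = 2.0 - entropy  # height of the whole letter stack
    heights = {base: f * info for base, f in freqs.items()}  # letter height is frequency times stack height
    return info, heights

# Hypothetical position where A and G each occur in half of the aligned sites.
info, heights = logo_heights({"A": 5, "C": 0, "G": 5, "T": 0})
```

A fully conserved position gives a stack of 2 bits, and a position where all four bases are equally frequent gives a stack of 0 bits.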
Met4 binding is within 450 bases of the gene start, but not within 100 bases.
<p>We varied the allowed distance between the Met31 binding site and the gene start point in our models, and quantified how this spacing constraint affected our ability to predict microarray expression data. A) We plotted the average expression change of the top 30 hits in the genome for different maximum spacings from the gene start. The top line corresponds to data from experiments where we expected increased expression (columns 1 to 9 in B), and the lower line is from experiments where we expected decreased expression (columns 10 to 13 in B). The microarray data that correspond to our gene ranking are shown in B. The conditions for each column in the microarrays correspond to the labeled columns in <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0001199#pone-0001199-g006" target="_blank">Fig. 6</a>.</p>
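The spacing sweep described in this caption can be sketched as follows, under the simplifying assumption that each gene has one candidate site with a known score and distance to the gene start. All names and numbers are hypothetical; only the default of 30 top hits comes from the caption.

```python
def mean_change_of_top_hits(sites, expression_change, max_distance,
                            min_distance=0, top_n=30):
    """Average expression change of the best-scoring genes whose site lies
    within [min_distance, max_distance] bases of the gene start.

    `sites` maps gene -> (site score, distance to gene start).
    Returns None when no gene satisfies the spacing constraint.
    """
    allowed = {gene: score for gene, (score, distance) in sites.items()
               if min_distance <= distance <= max_distance}
    top = sorted(allowed, key=allowed.get, reverse=True)[:top_n]
    if not top:
        return None
    return sum(expression_change[gene] for gene in top) / len(top)

# Hypothetical site scores, distances (bases), and expression changes.
sites = {"g1": (9.0, 150), "g2": (8.0, 500), "g3": (7.0, 300), "g4": (6.0, 50)}
change = {"g1": 2.0, "g2": 0.0, "g3": 1.0, "g4": -1.0}

constrained = mean_change_of_top_hits(sites, change, max_distance=450,
                                      min_distance=100, top_n=2)
unconstrained = mean_change_of_top_hits(sites, change, max_distance=600, top_n=2)
```

Repeating the call over a range of `max_distance` values would give one point per spacing, analogous to the lines plotted in panel A.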
Unsupervised Clustering of Subcellular Protein Expression Patterns in High-Throughput Microscopy Images Reveals Protein Complexes and Functional Relationships between Proteins
<div><p>Protein subcellular localization has been systematically characterized in budding yeast using fluorescently tagged proteins. Based on the fluorescence microscopy images, the subcellular localization of many proteins can be classified automatically using supervised machine learning approaches that have been trained to recognize predefined image classes based on statistical features. Here, we present an unsupervised analysis of protein expression patterns in a set of high-resolution, high-throughput microscope images. Our analysis is based on 7 biologically interpretable features which are evaluated on automatically identified cells, and whose cell-stage dependency is captured by a continuous model for cell growth. We show that it is possible to identify most previously identified localization patterns in a cluster analysis based on these features and that similarities between the inferred expression patterns contain more information about protein function than can be explained by a previous manual categorization of subcellular localization. Furthermore, the inferred cell stage associated with each fluorescence measurement allows us to visualize large groups of proteins entering the bud at specific stages of bud growth. These correspond to proteins localized to organelles, revealing that the organelles must be entering the bud in a stereotypical order. We also identify and organize a smaller group of proteins that show subtle differences in the way they move around the bud during growth. Our results suggest that biologically interpretable features based on explicit models of cell morphology will yield unprecedented power for pattern discovery in high-resolution, high-throughput microscopy images.</p></div>