
    PROTEOFORMER 2.0 : further developments in the ribosome profiling-assisted proteogenomic hunt for new proteoforms

    PROTEOFORMER is a pipeline that enables the automated processing of data derived from ribosome profiling (RIBO-seq, i.e. the sequencing of ribosome-protected mRNA fragments). Genome-wide ribosome occupancies lead to the delineation of data-specific translation product candidates, and these can improve mass spectrometry-based identification. Since its first publication, different upgrades, new features and extensions have been added to the PROTEOFORMER pipeline. Some of the most important upgrades include P-site offset calculation during mapping, comprehensive data pre-exploration, the introduction of two alternative proteoform calling strategies and extended pipeline output features. These novelties are illustrated by analyzing ribosome profiling data of human HCT116 and Jurkat cells. The different proteoform calling strategies are used alongside one another and in the end combined with reference sequences from UniProt. Matching mass spectrometry data are searched against this extended search space with MaxQuant. Overall, besides annotated proteoforms, this pipeline leads to the identification and validation of different categories of new proteoforms, including translation products of up- and downstream open reading frames, 5′- and 3′-extended and truncated proteoforms, single amino acid variants, splice variants and translation products of so-called noncoding regions. Further, proof-of-concept is reported for the improvement of spectrum matching by including Prosit, a deep neural network strategy that adds extra fragmentation spectrum intensity features to the analysis. In the light of ribosome profiling-driven proteogenomics, it is shown that this allows validating the spectrum matches of newly identified proteoforms with elevated stringency. These updates and novel conclusions provide new insights and lessons for the ribosome profiling-based proteogenomic research field.
More practical information on the pipeline, raw code, the user manual (README) and explanations on the different modes of availability can be found at the GitHub repository of PROTEOFORMER: https://github.com/Biobix/proteoformer

    Generating high quality libraries for DIA MS with empirically corrected peptide predictions.

    Data-independent acquisition approaches typically rely on experiment-specific spectrum libraries, requiring offline fractionation and tens to hundreds of injections. We demonstrate a library generation workflow that leverages fragmentation and retention time prediction to build libraries containing every peptide in a proteome, and then refines those libraries with empirical data. Our method specifically enables rapid, experiment-specific library generation for non-model organisms, which we demonstrate using the malaria parasite Plasmodium falciparum, and non-canonical databases, which we show by detecting missense variants in HeLa.

    Building ProteomeTools based on a complete synthetic human proteome.

    We describe ProteomeTools, a project building molecular and digital tools from the human proteome to facilitate biomedical research. Here we report the generation and multimodal liquid chromatography-tandem mass spectrometry analysis of >330,000 synthetic tryptic peptides representing essentially all canonical human gene products, and we exemplify the utility of these data in several applications. The resource (available at http://www.proteometools.org) will be extended to >1 million peptides, and all data will be shared with the community via ProteomicsDB and ProteomeXchange.

    The Role of Parental Self-Efficacy in the Home Learning Environment of Kindergarten Children

    Parental self-efficacy is an essential predictor of beneficial parenting practices, parenting skills, and positive child development (Albanese et al., 2019; Ardelt & Eccles, 2001; P. K. Coleman & Karraker, 2000; T. L. Jones & Prinz, 2005; Schuengel & Oosterman, 2019; Stievenart & Martinez Perez, 2020; Verhage et al., 2013; Wilson et al., 2014; Wittkowski et al., 2017). It describes the parents’ belief in their ability to influence the child and its environment in such a way that it supports child development (Ardelt & Eccles, 2001). Parental self-efficacy as a parental belief (Sigel & McGillicuddy–De Lisi, 2002) is part of the home learning environment. The home learning environment has proven to be an important factor for beneficial child development and later school performance (Kluczniok et al., 2013; Lehrl et al., 2012; Sammons et al., 2015; Tamis-LeMonda et al., 2017). Studies indicate that the home learning environment can be structured into structural family characteristics (e.g., socio-economic background or family language), beliefs (e.g., parental self-efficacy), and processes or process quality (e.g., parent-child activities), whereby the processes have a direct effect on child development (Anders et al., 2011; Kluczniok et al., 2013; NICHD Early Child Care Research Network, 2003). This thesis follows the structure of the home learning environment, called the home learning environment model, and presents the interrelationships of its components. In the first study, the construct of parental self-efficacy is investigated in more detail. The construct of parental self-efficacy and, in particular, its content-specificity is not well understood: parental self-efficacy can refer either to parents’ general perception of how well they judge themselves in their role as parents or to a specific parental task.
To investigate this, it was tested whether (a) general and task-related parental self-efficacy could be assessed separately or (b) be mapped in a hierarchical model. Results indicate that general and task-related parental self-efficacy are separate dimensions. Furthermore, general and task-related parental self-efficacy were tested for differences according to family characteristics. Results suggested that parents with a non-German family language experienced lower general parental self-efficacy and perceived themselves to be less self-efficacious in caring for a sick child. Parents with a university degree felt more efficacious in communicating responsible media use but less efficacious in caring for a sick child than parents who did not have a university degree. The second study investigated the relationship of parental self-efficacy with family characteristics and home learning activities of native-born German parents and parents with a Turkish immigration background. Little is known (a) about the relationships between structural characteristics, parental self-efficacy, and home learning activities, especially for Turkish immigrant families with average educational levels and income, and (b) whether parental self-efficacy and home learning activities and their relationship are affected by the parents’ immigration background. Results showed that parental self-efficacy and the educational level, but not the immigration background, were significant predictors of home learning activities. The immigration background was related to the number of home learning activities via parental self-efficacy. However, there was no direct relationship between the immigration background and the home learning activities. This indicates the importance of parental self-efficacy for home learning activities regardless of the immigration background. Surprisingly, parents with a Turkish immigration background felt significantly more self-efficacious than native-born German parents.
The third study investigated the relationship of parental self-efficacy and home learning activities with child outcomes at the children’s transition from preschool to elementary school. The interplay between parental self-efficacy, home learning activities, and preschool children’s socio-emotional and language skills has not yet been investigated. By linking these variables, this study went beyond previous research that concentrated on relationships between two factors. Findings indicate that the more self-efficacious parents felt, the more home learning activities they offered, and the higher they rated their children’s language skills at age 5. Moreover, parents who felt more efficacious in supporting their children’s language skills also described their children as having fewer socio-emotional problems. Also, parents whose children were about to transition from preschool to elementary school did not undertake significantly more school-preparatory home learning activities than parents of children who were not about to enter elementary school. This thesis contributes to a better understanding of the structure of parental self-efficacy in terms of the relationship between different levels of measurement. Furthermore, this thesis was able to show that parents with an immigration background do not generally perceive themselves as less self-efficacious in parenting their children, but that other family characteristics and the context are also decisive for this relationship. Finally, parental self-efficacy emerged as a significant predictor of the number of home learning activities, emphasizing the importance of parental self-efficacy for improving the home learning environment.

    Parental self-efficacy in relation to family characteristics

    Parental self-efficacy (PSE) is an essential predictor of parenting practices and child development. The content-specificity of PSE is not well understood: previous studies are based on measures of either general or task-specific parental self-efficacy, but not on measures of both constructs. Thus, we do not know how the two constructs are related. With data from the “AQuaFam” study, we compared four-factor models to investigate the structure of PSE. The main question was whether (1) task-specific and general PSE could be assessed separately or (2) be mapped in a hierarchical model with task-specific PSE factors and a superordinate factor of general PSE. A chi-square test showed no significant model improvement, indicating that general and task-specific PSE are separate dimensions. US studies suggest that low-income parents, migrants, or parents with a lower educational status experience lower PSE. To adequately support these parents, we need to know whether differences according to families’ background characteristics occur in task-specific and general PSE beliefs. We tested general PSE and PSE in four parenting tasks for differences according to families’ background characteristics. Parents with a university degree felt more self-efficacious in communicating responsible media use than parents without a university degree. Parents with a non-German family language felt less self-efficacious in communicating responsible media use, in caring for a sick child, and in their general PSE than parents with German as a family language. The results of the group differences are discussed in the context of how to support different parent groups.

    ProteomicsML: An Online Platform for Community-Curated Datasets and Tutorials for Machine Learning in Proteomics

    Dataset acquisition and curation are often the hardest and most time-consuming parts of a machine learning endeavor. This is especially true for proteomics-based LC-IM-MS datasets, due to the high-throughput data structure with high levels of noise and complexity between raw and machine learning-ready formats. While predictive proteomics is a field on the rise, when predicting peptide behavior in LC-IM-MS setups, each lab often uses unique and complex data processing pipelines in order to maximize performance, at the cost of accessibility and reproducibility. For this reason we introduce ProteomicsML, an online resource for proteomics-based datasets and tutorials across most of the currently explored physicochemical peptide properties. This community-driven resource makes it simple to access data in easy-to-process formats, and contains easy-to-follow tutorials that allow new users to interact with even the most advanced algorithms in the field. ProteomicsML provides datasets that are useful for comparing state-of-the-art (SOTA) machine learning algorithms, as well as providing introductory material for teachers and newcomers to the field alike. The platform is freely available on https://www.proteomicsml.org/ and we welcome the entire proteomics community to contribute to the project at https://github.com/proteomicsml/

    Spectral prediction features as a solution for the search space size problem in proteogenomics

    Proteogenomics approaches often struggle with the distinction between true and false peptide-to-spectrum matches as the database size grows. However, features extracted from tandem mass spectrometry intensity predictors can enhance the peptide identification rate and can provide extra confidence for peptide-to-spectrum matching in a proteogenomics context. To that end, features from the spectral intensity pattern predictors MS2PIP and Prosit were combined with the canonical scores from MaxQuant in the Percolator postprocessing tool for protein sequence databases constructed out of ribosome profiling and nanopore RNA-Seq analyses. The presented results provide evidence that this approach enhances both the identification rate and the validation stringency in a proteogenomic setting.
