13 research outputs found
Pipelines and Systems for Threshold-Avoiding Quantification of LC-MS/MS Data
The accurate processing of complex liquid chromatography coupled to tandem mass spectrometry (LC-MS/MS) data from biological samples is a major challenge for metabolomics, proteomics, and related approaches. Here, we present the pipelines and systems for threshold-avoiding quantification (PASTAQ) LC-MS/MS preprocessing toolset, which allows highly accurate quantification of data-dependent acquisition LC-MS/MS datasets. PASTAQ performs compound quantification using single-stage (MS1) data and implements novel algorithms for high-performance and accurate quantification, retention time alignment, feature detection, and linking annotations from multiple identification engines. PASTAQ offers straightforward parameterization and automatic generation of quality control plots for data and preprocessing assessment. This design results in smaller variance when analyzing replicates of proteomes mixed with known ratios and allows the detection of peptides over a larger dynamic concentration range compared to widely used proteomics preprocessing tools. The performance of the pipeline is also demonstrated in a biological human serum dataset for the identification of gender-related proteins.</p
Clusterwise Peak Detection and Filtering Based on Spatial Distribution to Efficiently Mine Mass Spectrometry Imaging Data
Mass spectrometry imaging (MSI) has the potential to reveal the localization of thousands of biomolecules such as metabolites and lipids in tissue sections. The increase in both mass and spatial resolution of today's instruments brings on considerable challenges in terms of data processing; accurately extracting meaningful signals from the large data sets generated by MSI without losing information that could be clinically relevant is one of the most fundamental tasks of analysis software. Ion images of the biomolecules are generated by visualizing their intensities in 2-D space using mass spectra collected across the tissue section. The intensities are often calculated by summing each compound's signal between predefined sets of borders (bins) in the m/z dimension. This approach, however, can result in mixed signals from different compounds in the same bin or splitting the signal from one compound between two adjacent bins, leading to low quality ion images. To remedy this problem, we propose a novel data processing approach. Our approach consists of a sensitive peak detection method able to discover both faint and localized signals by utilizing clusterwise kernel density estimates (KDEs) of peak distributions. We show that our method can recall more ground-truth molecules, molecule fragments, and isotopes than existing methods based on binning. Furthermore, it automatically detects previously reported molecular ions of lipids, including those close in m/z, in an experimental data set
MSIWarp : A General Approach to Mass Alignment in Mass Spectrometry Imaging
Mass spectrometry imaging (MSI) is a technique that provides comprehensive molecular information with high spatial resolution from tissue. Today, there is a strong push toward sharing data sets through public repositories in many research fields where MSI is commonly applied; yet, there is no standardized protocol for analyzing these data sets in a reproducible manner. Shifts in the mass-to-charge ratio (m/z) of molecular peaks present a major obstacle that can make it impossible to distinguish one compound from another. Here, we present a label-free m/z alignment approach that is compatible with multiple instrument types and makes no assumptions on the sample's molecular composition. Our approach, MSIWarp (https://github.com/horvatovichlab/MSIWarp), finds an m/z recalibration function by maximizing a similarity score that considers both the intensity and m/z position of peaks matched between two spectra. MSIWarp requires only centroid spectra to find the recalibration function and is thereby readily applicable to almost any MSI data set. To deal with particularly misaligned or peak-sparse spectra, we provide an option to detect and exclude spurious peak matches with a tailored random sample consensus (RANSAC) procedure. We evaluate our approach with four publicly available data sets from both time-of-flight (TOF) and Orbitrap instruments and demonstrate up to 88% improvement in m/z alignment
Discovery of the first genome-wide significant risk loci for attention deficit/hyperactivity disorder
Attention deficit/hyperactivity disorder (ADHD) is a highly heritable childhood behavioral disorder affecting 5% of children and 2.5% of adults. Common genetic variants contribute substantially to ADHD susceptibility, but no variants have been robustly associated with ADHD. We report a genome-wide association meta-analysis of 20,183 individuals diagnosed with ADHD and 35,191 controls that identifies variants surpassing genome-wide significance in 12 independent loci, finding important new information about the underlying biology of ADHD. Associations are enriched in evolutionarily constrained genomic regions and loss-of-function intolerant genes and around brain-expressed regulatory marks. Analyses of three replication studies: a cohort of individuals diagnosed with ADHD, a self-reported ADHD sample and a meta-analysis of quantitative measures of ADHD symptoms in the population, support these findings while highlighting study-specific differences on genetic overlap with educational attainment. Strong concordance with GWAS of quantitative population measures of ADHD symptoms supports that clinical diagnosis of ADHD is an extreme expression of continuous heritable traits
Discovery of the first genome-wide significant risk loci for attention deficit/hyperactivity disorder
Attention deficit/hyperactivity disorder (ADHD) is a highly heritable childhood behavioral disorder affecting 5% of children and 2.5% of adults. Common genetic variants contribute substantially to ADHD susceptibility, but no variants have been robustly associated with ADHD. We report a genome-wide association meta-analysis of 20,183 individuals diagnosed with ADHD and 35,191 controls that identifies variants surpassing genome-wide significance in 12 independent loci, finding important new information about the underlying biology of ADHD. Associations are enriched in evolutionarily constrained genomic regions and loss-of-function intolerant genes and around brain-expressed regulatory marks. Analyses of three replication studies: a cohort of individuals diagnosed with ADHD, a self-reported ADHD sample and a meta-analysis of quantitative measures of ADHD symptoms in the population, support these findings while highlighting study-specific differences on genetic overlap with educational attainment. Strong concordance with GWAS of quantitative population measures of ADHD symptoms supports that clinical diagnosis of ADHD is an extreme expression of continuous heritable traits
Recommended from our members
Discovery of the first genome-wide significant risk loci for attention deficit/hyperactivity disorder.
Attention deficit/hyperactivity disorder (ADHD) is a highly heritable childhood behavioral disorder affecting 5% of children and 2.5% of adults. Common genetic variants contribute substantially to ADHD susceptibility, but no variants have been robustly associated with ADHD. We report a genome-wide association meta-analysis of 20,183 individuals diagnosed with ADHD and 35,191 controls that identifies variants surpassing genome-wide significance in 12 independent loci, finding important new information about the underlying biology of ADHD. Associations are enriched in evolutionarily constrained genomic regions and loss-of-function intolerant genes and around brain-expressed regulatory marks. Analyses of three replication studies: a cohort of individuals diagnosed with ADHD, a self-reported ADHD sample and a meta-analysis of quantitative measures of ADHD symptoms in the population, support these findings while highlighting study-specific differences on genetic overlap with educational attainment. Strong concordance with GWAS of quantitative population measures of ADHD symptoms supports that clinical diagnosis of ADHD is an extreme expression of continuous heritable traits
Discovery of the first genome-wide significant risk loci for attention deficit/hyperactivity disorder
Attention deficit/hyperactivity disorder (ADHD) is a highly heritable childhood behavioral disorder affecting 5% of children and 2.5% of adults. Common genetic variants contribute substantially to ADHD susceptibility, but no variants have been robustly associated with ADHD. We report a genome-wide association meta-analysis of 20,183 individuals diagnosed with ADHD and 35,191 controls that identifies variants surpassing genome-wide significance in 12 independent loci, finding important new information about the underlying biology of ADHD. Associations are enriched in evolutionarily constrained genomic regions and loss-of-function intolerant genes and around brain-expressed regulatory marks. Analyses of three replication studies: a cohort of individuals diagnosed with ADHD, a self-reported ADHD sample and a meta-analysis of quantitative measures of ADHD symptoms in the population, support these findings while highlighting study-specific differences on genetic overlap with educational attainment. Strong concordance with GWAS of quantitative population measures of ADHD symptoms supports that clinical diagnosis of ADHD is an extreme expression of continuous heritable traits