222 research outputs found

    Semi-automated assembly of high-quality diploid human reference genomes

    The current human reference genome, GRCh38, represents over 20 years of effort to generate a high-quality assembly, which has benefitted society [1,2]. However, it still has many gaps and errors, and does not represent a biological genome as it is a blend of multiple individuals [3,4]. Recently, a high-quality telomere-to-telomere reference, CHM13, was generated with the latest long-read technologies, but it was derived from a hydatidiform mole cell line with a nearly homozygous genome [5]. To address these limitations, the Human Pangenome Reference Consortium formed with the goal of creating high-quality, cost-effective, diploid genome assemblies for a pangenome reference that represents human genetic diversity [6]. Here, in our first scientific report, we determined which combination of current genome sequencing and assembly approaches yields the most complete and accurate diploid genome assembly with minimal manual curation. Approaches that used highly accurate long reads and parent-child data with graph-based haplotype phasing during assembly outperformed those that did not. Developing a combination of the top-performing methods, we generated our first high-quality diploid reference assembly, containing only approximately four gaps per chromosome on average, with most chromosomes within ±1% of the length of CHM13. Nearly 48% of protein-coding genes have non-synonymous amino acid changes between haplotypes, and centromeric regions showed the highest diversity. Our findings serve as a foundation for assembling near-complete diploid human genomes at scale for a pangenome reference to capture global genetic variation from single nucleotides to structural rearrangements.

    Prenatal exposures and exposomics of asthma

    This review examines the causal investigation of preclinical development of childhood asthma using exposomic tools. We examine the current state of knowledge regarding early-life exposure to non-biogenic indoor air pollution and the developmental modulation of the immune system. We examine how metabolomics technologies could aid not only in identifying biomarkers of a particular asthma phenotype, but also in understanding the mechanisms underlying the immunopathologic process. Within such a framework, we propose alternate components of exposomic investigation of asthma in which the exposome represents a reiterative investigative process of targeted biomarker identification, validation through computational systems biology and physical sampling of environmental media.

    Optical map guided genome assembly

    Background: The long reads produced by third generation sequencing technologies have significantly boosted the results of genome assembly but still, genome-wide assemblies solely based on read data cannot be produced. Thus, for example, optical mapping data has been used to further improve genome assemblies but it has mostly been applied in a post-processing stage after contig assembly. Results: We propose OpticalKermit, which directly integrates genome-wide optical maps into contig assembly. We show how genome-wide optical maps can be used to localize reads on the genome and then we adapt the Kermit method, which originally incorporated genetic linkage maps to the miniasm assembler, to use this information in contig assembly. Our experimental results show that incorporating genome-wide optical maps to the contig assembly of miniasm increases NGA50 while the number of misassemblies decreases or stays the same. Furthermore, when compared to the Canu assembler, OpticalKermit produces an assembly with almost three times higher NGA50 with a lower number of misassemblies on real A. thaliana reads. Conclusions: OpticalKermit successfully incorporates optical mapping data directly to contig assembly of eukaryotic genomes. Our results show that this is a promising approach to improve the contiguity of genome assemblies.
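The abstract above reports contiguity as NGA50. As a rough illustration of that family of metrics, the sketch below computes N50 and NG50 from a list of block lengths. It is not the authors' evaluation code; the function name `nx50` and the example lengths are hypothetical. NGA50 is usually obtained by applying the same calculation to aligned blocks (contigs split at misassemblies) against the reference genome length, which this sketch assumes has already been done upstream.

```python
def nx50(lengths, total=None):
    """Return the largest length L such that blocks of length >= L
    together cover at least half of `total`.

    With total=None the sum of the blocks is used (N50).
    With total set to the reference genome size this is NG50, or
    NGA50 when `lengths` are aligned blocks rather than raw contigs.
    Returns 0 if the blocks cover less than half of `total`.
    """
    if total is None:
        total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point issues with total / 2.
        if 2 * running >= total:
            return length
    return 0


# Hypothetical block lengths (in kb), for illustration only.
blocks = [80, 70, 50, 40, 30, 20]
n50 = nx50(blocks)              # half of the assembled total (290)
ng50 = nx50(blocks, total=400)  # half of an assumed 400 kb genome
```

Using the genome size rather than the assembly size as the denominator is what makes NG50-style values comparable between assemblers that produce different total lengths.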

    A proposal for a coordinated effort for the determination of brainwide neuroanatomical connectivity in model organisms at a mesoscopic scale

    In this era of complete genomes, our knowledge of neuroanatomical circuitry remains surprisingly sparse. Such knowledge is, however, critical both for basic and clinical research into brain function. Here we advocate for a concerted effort to fill this gap, through systematic, experimental mapping of neural circuits at a mesoscopic scale of resolution suitable for comprehensive, brain-wide coverage, using injections of tracers or viral vectors. We detail the scientific and medical rationale and briefly review existing knowledge and experimental techniques. We define a set of desiderata, including brain-wide coverage; validated and extensible experimental techniques suitable for standardization and automation; centralized, open access data repository; compatibility with existing resources, and tractability with current informatics technology. We discuss a hypothetical but tractable plan for mouse, additional efforts for the macaque, and technique development for human. We estimate that the mouse connectivity project could be completed within five years with a comparatively modest budget. Comment: 41 pages

    Assessing Recent Smoking Status by Measuring Exhaled Carbon Monoxide Levels

    The main expectation of applying proteomics technologies to clinical questions is the discovery of disease-related biomarkers. Despite technological advances that increase proteome coverage and depth to meet this expectation, the number of biomarkers generated for clinical use is small. One reason is that potential biomarkers are often false discoveries. Small sample sizes, in combination with patient sample heterogeneity, increase the risk of false discoveries. To be able to extract relevant biological information from such data, high demands are put on the experimental design and the use of sensitive and quantitatively accurate technologies. The overall aim of this thesis was to apply quantitative proteomics methods for biomarker discovery in clinical samples. A method for reducing bias by controlling for individual variation in smoking habits is described in paper I. The aim of the method was objective assessment of recent smoking in clinical studies on inflammatory responses. In paper II, the proteome of alveolar macrophages obtained from smoking subjects with and without the inflammatory lung disease chronic obstructive pulmonary disease (COPD) was quantified by two-dimensional gel electrophoresis (2-DE). A gender-focused analysis showed protein level differences within the female group, with down-regulation of the lysosomal pathway and up-regulation of the oxidative pathway in COPD patients. Paper III, a mass spectrometry-based proteomics analysis of tumour samples, contributes to the molecular understanding of vulvar squamous cell carcinoma (VSCC) and we identified a high-risk patient subgroup of HPV-negative tumours based on the expression of four proteins, further suggesting that this subgroup is characterized by an altered ubiquitin-proteasome signalling pathway. Paper III describes a data analysis workflow for the extraction of biological information from quantitative mass spectrometry-based proteomics data. High patient-to-patient tumour proteome variability was addressed by using pathway profiling on individual tumour data, followed by comparison of pathway association ranks in a multivariate analysis. We show that pathway data on individual tumour level can detect subpopulations of patients and identify pathways of specific importance in pre-defined clinical groups by the use of multivariate statistics. In paper IV, the potentials and limits of quantitative mass spectrometry on clinical samples were evaluated by defining the quantitative accuracy of isobaric labels and label-free quantification. Quantification by isobaric labels in combination with pI pre-fractionation showed a lower limit of quantification (LOQ) than a label-free analysis without pI pre-fractionation, and 6-plex TMT was more sensitive than 8-plex iTRAQ. Precursor mixing measured by isolation interference (MS1 interference) is more closely linked to the quantitative accuracy of isobaric labels than reporter ion interference (MS2 interference). Based on that, we could define recommendations for how much isolation interference can be accepted; in our data, <30% isolation interference had little effect on the quantitative accuracy. In conclusion, getting biological knowledge from proteomics studies requires a careful study design, control of possible confounding factors and the use of clinical data to identify disease subtypes. Further, to be able to draw conclusions from the data, the analysis requires accurate quantitative data and robust statistical tools to detect significant protein alterations. Methods around these issues are developed and discussed in this thesis.

    Exhaled carbon monoxide in asthmatics: a meta-analysis

    Background: The non-invasive assessment of airway inflammation is potentially advantageous in asthma management. Exhaled carbon monoxide (eCO) measurement is cheap and has been proposed to reflect airway inflammation and oxidative stress but current data are conflicting. The purpose of this meta-analysis is to determine whether eCO is elevated in asthmatics, is regulated by steroid treatment and reflects disease severity and control. Methods: A systematic search for English language articles published between 1997 and 2009 was performed using Medline, Embase and Cochrane databases. Observational studies comparing eCO in non-smoking asthmatics and healthy subjects or asthmatics before and after steroid treatment were included. Data were independently extracted by two investigators and analyzed to generate weighted mean differences using either a fixed or random effects meta-analysis depending upon the degree of heterogeneity. Results: 18 studies were included in the meta-analysis. The eCO level was significantly higher in asthmatics as compared to healthy subjects and in intermittent asthma as compared to persistent asthma. However, eCO could not distinguish between steroid-treated asthmatics and steroid-free patients nor separate controlled and partly-controlled asthma from uncontrolled asthma in cross-sectional studies. In contrast, eCO was significantly reduced following a course of corticosteroid treatment. Conclusions: eCO is elevated in asthmatics but levels only partially reflect disease severity and control. eCO might be a potentially useful non-invasive biomarker of airway inflammation and oxidative stress in nonsmoking asthmatics.

    The MeerKAT international GHz tiered extragalactic exploration (MIGHTEE) survey

    The MIGHTEE large survey project will survey four of the most well-studied extragalactic deep fields, totalling 20 square degrees to µJy sensitivity at Giga-Hertz frequencies, as well as an ultra-deep image of a single ∼1 deg² MeerKAT pointing. The observations will provide radio continuum, spectral line and polarisation information. As such, MIGHTEE, along with the excellent multi-wavelength data already available in these deep fields, will allow a range of science to be achieved. Specifically, MIGHTEE is designed to significantly enhance our understanding of (i) the evolution of AGN and star-formation activity over cosmic time, as a function of stellar mass and environment, free of dust obscuration; (ii) the evolution of neutral hydrogen in the Universe and how this neutral gas eventually turns into stars after moving through the molecular phase, and how efficiently this can fuel AGN activity; (iii) the properties of cosmic magnetic fields and how they evolve in clusters, filaments and galaxies. MIGHTEE will reach similar depth to the planned SKA all-sky survey, and thus will provide a pilot to the cosmology experiments that will be carried out by the SKA over a much larger survey volume.

    A Novel Cre Recombinase Imaging System for Tracking Lymphotropic Virus Infection In Vivo

    BACKGROUND: Detection, isolation, and identification of individual virus infected cells during long term infection are critical to advance our understanding of mechanisms of pathogenesis for latent/persistent viruses. However, current approaches to study these viruses in vivo have been hampered by low sensitivity and effects of cell-type on expression of viral encoded reporter genes. We have designed a novel Cre recombinase (Cre)-based murine system to overcome these problems, and thereby enable tracking and isolation of individual in vivo infected cells. METHODOLOGY/PRINCIPAL FINDINGS: Murine gammaherpesvirus 68 (MHV-68) was used as a prototypic persistent model virus. A Cre expressing recombinant virus was constructed and characterised. The virus is attenuated both in lytic virus replication, producing ten-fold lower lung virus titres than wild type virus, and in the establishment of latency. However, despite this limitation, when the sEGFP7 mouse line containing a Cre-activated enhanced green fluorescent protein (EGFP) was infected with the Cre expressing virus, sites of latent and persistent virus infection could be identified within B cells and macrophages of the lymphoid system on the basis of EGFP expression. Importantly, the use of the sEGFP7 mouse line which expresses high levels of EGFP allowed individual virus positive cells to be purified by FACSorting. Virus gene expression could be detected in these cells. Low numbers of EGFP positive cells could also be detected in the bone marrow. CONCLUSIONS/SIGNIFICANCE: The use of this novel Cre-based virus/mouse system allowed identification of individual latently infected cells in vivo and may be useful for the study and long-term monitoring of other latent/persistent virus infections.

    Using breath carbon monoxide to validate self-reported tobacco smoking in remote Australian Indigenous communities

    Background: This paper examines the specificity and sensitivity of a breath carbon monoxide (BCO) test and optimum BCO cutoff level for validating self-reported tobacco smoking in Indigenous Australians in Arnhem Land, Northern Territory (NT). Methods: In a sample of 400 people (≥16 years) interviewed about tobacco use in three communities, both self-reported smoking and BCO data were recorded for 309 study participants. Of these, 249 reported smoking tobacco within the preceding 24 hours, and 60 reported they had never smoked or had not smoked tobacco for ≥6 months. The sample was opportunistically recruited using quotas to reflect age and gender balances in the communities where the combined Indigenous populations comprised 1,104 males and 1,215 females (≥16 years). Local Indigenous research workers assisted researchers in interviewing participants and facilitating BCO tests using a portable hand-held analyzer. Results: A BCO cutoff of ≥7 parts per million (ppm) provided good agreement between self-report and BCO (96.0% sensitivity, 93.3% specificity). An alternative cutoff of ≥5 ppm increased sensitivity from 96.0% to 99.6% with no change in specificity (93.3%). With data for two self-reported nonsmokers who also reported that they smoked cannabis removed from the analysis, specificity increased to 96.6%. Conclusion: In these disadvantaged Indigenous populations, where data describing smoking are few, testing for BCO provides a practical, noninvasive, and immediate method to validate self-reported smoking. In further studies of tobacco smoking in these populations, cannabis use should be considered where self-reported nonsmokers show high BCO.
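The sensitivity and specificity figures in the abstract above can be checked with a short calculation. The sketch below is illustrative only: the abstract reports group sizes (249 self-reported smokers, 60 self-reported nonsmokers) and percentages, but not the underlying counts, so the true-positive and true-negative counts used here (239, 248, 56) are back-calculated assumptions chosen to be consistent with the reported percentages. The sketch also assumes that both excluded cannabis users were among the nonsmokers who tested above the cutoff.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of self-reported smokers at or above the BCO cutoff."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of self-reported nonsmokers below the BCO cutoff."""
    return true_neg / (true_neg + false_pos)


SMOKERS = 249      # reported in the abstract
NONSMOKERS = 60    # reported in the abstract

# Assumed counts, inferred from the reported percentages (not reported directly).
TP_7PPM = 239      # smokers with BCO >= 7 ppm
TP_5PPM = 248      # smokers with BCO >= 5 ppm
TN = 56            # nonsmokers below the cutoff (same at both cutoffs)

sens_7 = sensitivity(TP_7PPM, SMOKERS - TP_7PPM)
sens_5 = sensitivity(TP_5PPM, SMOKERS - TP_5PPM)
spec = specificity(TN, NONSMOKERS - TN)

# Assumption: the two excluded cannabis users were both false positives,
# leaving 58 nonsmokers, 2 of whom tested above the cutoff.
spec_no_cannabis = specificity(TN, NONSMOKERS - TN - 2)

for label, value in [
    ("sensitivity at >=7 ppm", sens_7),
    ("sensitivity at >=5 ppm", sens_5),
    ("specificity", spec),
    ("specificity, cannabis users removed", spec_no_cannabis),
]:
    print(f"{label}: {100 * value:.1f}%")
```

Under these assumed counts the rounded values agree with the percentages reported in the abstract, which suggests the inferred counts are plausible, though they remain a reconstruction.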