1,005 research outputs found

    High-Performance approaches for Phylogenetic Placement, and its application to species and diversity quantification

    In recent years, advances in high-throughput gene sequencing, combined with the continuing exponential growth and availability of computational resources, have led to fundamentally new analytical approaches in biology. It is now possible to comprehensively sequence the genetic content of entire communities of organisms from individual environmental samples. Such methods are particularly relevant to microbiology. Microbiology was previously largely limited to the study of those microbes that could be cultivated in the laboratory (i.e., in vitro), which covers only a small fraction of the diversity found in nature. In contrast, high-throughput sequencing now enables the direct capture of the genetic sequences of a microbiome as it occurs in its natural environment (i.e., in situ). A typical goal of microbiome studies is the taxonomic classification of the sequences contained in a sample (query sequences). Phylogenetic methods are commonly used to determine detailed taxonomic relationships between query sequences and trusted reference sequences from already classified organisms. Because of the high volume (10^6 to 10^9) of query sequences that can be generated from a microbiome sample via high-throughput sequencing, accurate phylogenetic tree reconstruction is no longer computationally feasible. Moreover, the sequencing technologies in common use today produce comparatively short sequences with limited phylogenetic signal, which makes phylogenies inferred from these sequences unstable. Another typical goal of microbiome studies is quantifying the diversity within a sample or between multiple samples. Phylogenetic methods are commonly used for this as well.
Often these methods require the inference of a phylogenetic tree that comprises either all sequences or a clustered subset of them. As with taxonomic identification, analyses based on this kind of tree inference can yield inaccurate results and/or be computationally infeasible. In contrast to comprehensive phylogenetic inference, phylogenetic placement is a method that determines the phylogenetic context of a query sequence within an established reference tree. This procedure typically treats the reference tree as immutable, i.e., the reference tree is not altered before, during, or after the placement of a sequence. This allows the phylogenetic placement of a sequence to be performed in linear time with respect to the size of the reference tree. Combined with taxonomic information about the reference sequences, phylogenetic placement thus enables the taxonomic identification of a sequence. In addition, phylogenetic placement enables a variety of further analysis methods that, for example, make it possible to associate the compositions of human microbiomes with clinical-diagnostic properties. In this dissertation I present my work on the design, implementation, and improvement of EPA-ng, a high-performance implementation of phylogenetic placement under the maximum likelihood model. EPA-ng was designed to scale to billions of query sequences and to run on thousands of cores in shared- and distributed-memory systems. EPA-ng also increases processing speed on a single core by up to 30-fold compared with its direct competitors.
Recently, we introduced an additional method for EPA-ng that enables placement in substantially larger reference trees. For this, we use an active memory management approach that trades reduced memory consumption for longer execution times. In addition, I present a massively parallel approach to quantifying the diversity of a sample, based on the results of phylogenetic placement. This software, called SCRAPP, combines current methods for maximum-likelihood-based phylogenetic inference with methods for molecular species delimitation. The result is a distribution of species counts on the edges of a reference tree for a given sample. Furthermore, I describe a novel approach to clustering placement results, which allows the user to reduce the computational effort.

    Phylogenetic Analysis of SARS-CoV-2 Data Is Difficult

    Numerous studies covering some aspects of SARS-CoV-2 data analyses are being published on a daily basis, including a regularly updated phylogeny on nextstrain.org. Here, we review the difficulties of inferring reliable phylogenies using the example of a data snapshot comprising a quality-filtered subset of 8,736 out of all 16,453 virus sequences available on May 5, 2020 from gisaid.org. We find that it is difficult to infer a reliable phylogeny on these data due to the large number of sequences in conjunction with the low number of mutations. We further find that rooting the inferred phylogeny with some degree of confidence either via the bat and pangolin outgroups or by applying novel computational methods on the ingroup phylogeny does not appear to be credible. Finally, an automatic classification of the current sequences into subclasses using the mPTP tool for molecular species delimitation is also, as might be expected, not possible, as the sequences are too closely related. We conclude that, although the application of phylogenetic methods to disentangle the evolution and spread of COVID-19 provides some insight, results of phylogenetic analyses, in particular those conducted under the default settings of current phylogenetic inference tools, as well as downstream analyses on the inferred phylogenies, should be considered and interpreted with extreme caution.

    Long-read metabarcoding of the eukaryotic rDNA operon to phylogenetically and taxonomically resolve environmental diversity

    High‐throughput DNA metabarcoding of amplicon sizes below 500 bp has revolutionized the analysis of environmental microbial diversity. However, these short regions contain limited phylogenetic signal, which makes it impractical to use environmental DNA in full phylogenetic inferences. This lesser phylogenetic resolution of short amplicons may be overcome by new long‐read sequencing technologies. To test this idea, we amplified soil DNA and used PacBio Circular Consensus Sequencing (CCS) to obtain an ~4500‐bp region spanning most of the eukaryotic small subunit (18S) and large subunit (28S) ribosomal DNA genes. We first treated the CCS reads with a novel curation workflow, generating 650 high‐quality operational taxonomic units (OTUs) containing the physically linked 18S and 28S regions. To assign taxonomy to these OTUs, we developed a phylogeny‐aware approach based on the 18S region that showed greater accuracy and sensitivity than similarity‐based methods. The taxonomically annotated OTUs were then combined with available 18S and 28S reference sequences to infer a well‐resolved phylogeny spanning all major groups of eukaryotes, allowing us to accurately derive the evolutionary origin of environmental diversity. A total of 1,019 sequences were included, of which a majority (58%) corresponded to the new long environmental OTUs. The long reads also allowed us to directly investigate the relationships among environmental sequences themselves, which represents a key advantage over the placement of short reads on a reference phylogeny. Together, our results show that long amplicons can be treated in a full phylogenetic framework to provide greater taxonomic resolution and a robust evolutionary perspective to environmental DNA.

    The ATLAS3D project - XXIX : The new look of early-type galaxies and surrounding fields disclosed by extremely deep optical images

    Date of Acceptance: 25/09/2014. Galactic archaeology based on star counts is instrumental to reconstruct the past mass assembly of Local Group galaxies. The development of new observing techniques and data reduction, coupled with the use of sensitive large field of view cameras, now allows us to pursue this technique in more distant galaxies exploiting their diffuse low surface brightness (LSB) light. As part of the ATLAS3D project, we have obtained with the MegaCam camera at the Canada-France-Hawaii Telescope extremely deep, multiband images of nearby early-type galaxies (ETGs). We present here a catalogue of 92 galaxies from the ATLAS3D sample, which are located in low- to medium-density environments. The observing strategy and data reduction pipeline, which achieve a gain of several magnitudes in the limiting surface brightness with respect to classical imaging surveys, are presented. The size and depth of the survey are compared to other recent deep imaging projects. The paper highlights the capability of LSB-optimized surveys at detecting new prominent structures that change the apparent morphology of galaxies. The intrinsic limitations of deep imaging observations are also discussed, among them the contamination of the stellar haloes of galaxies by extended ghost reflections, and the cirrus emission from Galactic dust. The detection and systematic census of fine structures that trace the present and past mass assembly of ETGs are among the prime goals of the project. We provide specific examples of each type of observed structure - tidal tails, stellar streams and shells - and explain how they were identified and classified. We give an overview of the initial results. The detailed statistical analysis will be presented in future papers. Peer reviewed. Final Accepted Version.

    The EUropean Network of National Schizophrenia Networks Studying Gene-Environment Interactions (EU-GEI): Incidence and First-Episode Case-Control Programme.

    PURPOSE: The EUropean Network of National Schizophrenia Networks Studying Gene-Environment Interactions (EU-GEI) study contains an unparalleled wealth of comprehensive data that allows for testing hypotheses about (1) variations in incidence within and between countries, including by urbanicity and minority ethnic groups; and (2) the role of multiple environmental and genetic risk factors, and their interactions, in the development of psychotic disorders. METHODS: Between 2010 and 2015, we identified 2774 incident cases of psychotic disorders during 12.9 million person-years at risk, across 17 sites in 6 countries (UK, The Netherlands, France, Spain, Italy, and Brazil). Of the 2774 incident cases, 1130 cases were assessed in detail and form the case sample for case-control analyses. Across all sites, 1497 controls were recruited and assessed. We collected data on an extensive range of exposures and outcomes, including demographic, clinical (e.g. premorbid adjustment), social (e.g. childhood and adult adversity, cannabis use, migration, discrimination), cognitive (e.g. IQ, facial affect processing, attributional biases), and biological (DNA via blood sample/cheek swab). We describe the methodology of the study and some descriptive results, including representativeness of the cohort. CONCLUSIONS: This resource constitutes the largest and most extensive incidence and case-control study of psychosis ever conducted. The EU-GEI Study is funded by grant agreement HEALTH-F2-2010-241909 (Project EU-GEI) from the European Community’s Seventh Framework Programme, and grant 2012/0417-0 from the São Paulo Research Foundation.

    Jumping to conclusions, general intelligence, and psychosis liability: findings from the multi-centre EU-GEI case-control study.

    BACKGROUND: The 'jumping to conclusions' (JTC) bias is associated with both psychosis and general cognition but their relationship is unclear. In this study, we set out to clarify the relationship between the JTC bias, IQ, psychosis and polygenic liability to schizophrenia and IQ. METHODS: A total of 817 first episode psychosis patients and 1294 population-based controls completed assessments of general intelligence (IQ), and JTC, and provided blood or saliva samples from which we extracted DNA and computed polygenic risk scores for IQ and schizophrenia. RESULTS: The estimated proportion of the total effect of case/control differences on JTC mediated by IQ was 79%. Schizophrenia polygenic risk score was non-significantly associated with a higher number of beads drawn (B = 0.47, 95% CI -0.21 to 1.16, p = 0.17); whereas IQ PRS (B = 0.51, 95% CI 0.25-0.76, p < 0.001) significantly predicted the number of beads drawn, and was thus associated with reduced JTC bias. The JTC was more strongly associated with the higher level of psychotic-like experiences (PLEs) in controls, including after controlling for IQ (B = -1.7, 95% CI -2.8 to -0.5, p = 0.006), but did not relate to delusions in patients. CONCLUSIONS: Our findings suggest that the JTC reasoning bias in psychosis might not be a specific cognitive deficit but rather a manifestation or consequence of general cognitive impairment. In the general population, by contrast, the JTC bias is related to PLEs, independent of IQ. The work has the potential to inform interventions targeting cognitive biases in early psychosis. EU HEALTH-F2-2009-24190

    Facial Emotion Recognition in Psychosis and Associations With Polygenic Risk for Schizophrenia: Findings From the Multi-Center EU-GEI Case-Control Study

    BACKGROUND AND HYPOTHESIS: Facial Emotion Recognition is a key domain of social cognition associated with psychotic disorders as a candidate intermediate phenotype. In this study, we set out to investigate global and specific facial emotion recognition deficits in first-episode psychosis, and whether polygenic liability to psychotic disorders is associated with facial emotion recognition. STUDY DESIGN: 828 First Episode Psychosis (FEP) patients and 1308 population-based controls completed assessments of the Degraded Facial Affect Recognition Task (DFAR) and a subsample of 524 FEP and 899 controls provided blood or saliva samples from which we extracted DNA, performed genotyping and computed polygenic risk scores for schizophrenia (SZ), bipolar disorder (BD), and major depressive disorder (MD). STUDY RESULTS: A worse ability to globally recognize facial emotion expressions was found in patients compared with controls [B = -1.5 (0.6), 95% CI -2.7 to -0.3], with evidence for stronger effects on negative emotions (fear [B = -3.3 (1.1), 95% CI -5.3 to -1.2] and anger [B = -2.3 (1.1), 95% CI -4.6 to -0.1]) than on happiness [B = 0.3 (0.7), 95% CI -1 to 1.7]. Pooling all participants, and controlling for confounds including case/control status, facial anger recognition was associated significantly with Schizophrenia Polygenic Risk Score (SZ PRS) [B = -3.5 (1.7), 95% CI -6.9 to -0.2]. CONCLUSIONS: Psychosis is associated with impaired recognition of fear and anger, and higher SZ PRS is associated with worse facial anger recognition. Our findings provide evidence that facial emotion recognition of anger might play a role as an intermediate phenotype for psychosis.

    Transdiagnostic dimensions of psychopathology at first episode psychosis: findings from the multinational EU-GEI study.

    BACKGROUND: The value of the nosological distinction between non-affective and affective psychosis has frequently been challenged. We aimed to investigate the transdiagnostic dimensional structure and associated characteristics of psychopathology at First Episode Psychosis (FEP). Regardless of diagnostic categories, we expected that positive symptoms occurred more frequently in ethnic minority groups and in more densely populated environments, and that negative symptoms were associated with indices of neurodevelopmental impairment. METHOD: This study included 2182 FEP individuals recruited across six countries, as part of the EUropean network of national schizophrenia networks studying Gene-Environment Interactions (EU-GEI) study. Symptom ratings were analysed using multidimensional item response modelling in Mplus to estimate five theory-based models of psychosis. We used multiple regression models to examine demographic and context factors associated with symptom dimensions. RESULTS: A bifactor model, composed of one general factor and five specific dimensions of positive, negative, disorganization, manic and depressive symptoms, best represented associations among ratings of psychotic symptoms. Positive symptoms were more common in ethnic minority groups. Urbanicity was associated with a higher score on the general factor. Men presented with more negative and fewer depressive symptoms than women. Early age-at-first-contact with psychiatric services was associated with higher scores on negative, disorganized, and manic symptom dimensions. CONCLUSIONS: Our results suggest that the bifactor model of psychopathology holds across diagnostic categories of non-affective and affective psychosis at FEP, and demographic and context determinants map onto general and specific symptom dimensions. These findings have implications for tailoring symptom-specific treatments and inform research into the mood-psychosis spectrum.

    Daily use of high-potency cannabis is associated with more positive symptoms in first-episode psychosis patients: the EU-GEI case-control study.

    BACKGROUND: Daily use of high-potency cannabis has been reported to carry a high risk for developing a psychotic disorder. However, the evidence is mixed on whether any pattern of cannabis use is associated with a particular symptomatology in first-episode psychosis (FEP) patients. METHOD: We analysed data from 901 FEP patients and 1235 controls recruited across six countries, as part of the European Network of National Schizophrenia Networks Studying Gene-Environment Interactions (EU-GEI) study. We used item response modelling to estimate two bifactor models, which included general and specific dimensions of psychotic symptoms in patients and psychotic experiences in controls. The associations between these dimensions and cannabis use were evaluated using linear mixed-effects models analyses. RESULTS: In patients, there was a linear relationship between the positive symptom dimension and the extent of lifetime exposure to cannabis, with daily users of high-potency cannabis having the highest score (B = 0.35; 95% CI 0.14-0.56). Moreover, negative symptoms were more common among patients who never used cannabis compared with those with any pattern of use (B = -0.22; 95% CI -0.37 to -0.07). In controls, psychotic experiences were associated with current use of cannabis but not with the extent of lifetime use. Neither patients nor controls presented differences in depressive dimension related to cannabis use. 
CONCLUSIONS: Our findings provide the first large-scale evidence that FEP patients with a history of daily use of high-potency cannabis present with more positive and fewer negative symptoms, compared with those who never used cannabis or used low-potency types. The work was supported by: Clinician Scientist Medical Research Council fellowship (project reference MR/M008436/1) to MDF; the National Institute for Health Research (NIHR) Collaboration for Leadership in Applied Health Research and Care South London at King's College Hospital NHS Foundation Trust to DQ; DFG Heisenberg professorship (no. 389624707) to UR; and the National Institute for Health Research (NIHR) Biomedical Research Centre for Mental Health at South London and Maudsley NHS Foundation Trust and King’s College London. The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR or the Department of Health and Social Care. The EU-GEI Project is funded by the European Community’s Seventh Framework Programme under grant agreement No. HEALTH-F2-2010-241909 (Project EU-GEI). The Brazilian study was funded by the São Paulo Research Foundation under grant number 2012/0417-0.

    The relationship between genetic liability, childhood maltreatment, and IQ: findings from the EU-GEI multicentric case-control study

    This study investigated whether the association between childhood maltreatment and cognition among psychosis patients and community controls was partially accounted for by genetic liability for psychosis. Patients with first-episode psychosis (N = 755) and unaffected controls (N = 1219) from the EU-GEI study were assessed for childhood maltreatment, intelligence quotient (IQ), family history of psychosis (FH), and polygenic risk score for schizophrenia (SZ-PRS). Controlling for FH and SZ-PRS did not attenuate the association between childhood maltreatment and IQ in cases or controls. Findings suggest that these expressions of genetic liability cannot account for the lower levels of cognition found among adults maltreated in childhood.