
    Binary Black Hole Astrophysics with Gravitational Waves

    Gravitational waves (GWs) have quickly emerged as powerful, indispensable tools for studying gravity in the strong-field regime and high-energy astrophysical phenomena since they were first directly detected by the Laser Interferometer Gravitational-Wave Observatory (LIGO) on September 14, 2015. Over the course of this dissertation work, gravitational-wave astronomy has begun to mature, going from 11 GW observations when I began to 90 at the time of writing, just before the next observing run begins. As the network of GW observatories continues to grow and these observations become a regular occurrence, the entire population of merging compact objects observed with GWs will provide a unique probe of the astrophysics of their formation and evolution, along with the cosmic expansion of the universe. In this dissertation I present four studies that I have led using GWs to better understand the astrophysics of the most frequently detected GW source to date, binary black holes (BBHs). We first present a novel data-driven technique to look for deviations from modeled gravitational waveforms in the data, coherent across the network of observatories, along with an analysis of the first gravitational-wave transient catalog (GWTC-1). The following three studies present three different approaches to modeling populations of BBHs, using parametric, semi-parametric and non-parametric models. The first of these studies uses a parametric model that imposes a gap in the mass distribution of black holes, looking for evidence of effects caused by pair-instability supernovae. The second study introduces a semi-parametric model that aims to combine the benefits of parametric and non-parametric methods by imposing a flexible perturbation on an underlying, simpler parametric description. This study was among the first data-driven studies to reveal possible structure in the mass distribution of BBHs using GWTC-2, namely an additional peak at 10 M⊙. The final study introduces a novel non-parametric model for hierarchically inferring the population properties of GW sources and performs the most comprehensive data-driven study of the BBH population to date. It is also the first to use non-parametric models to simultaneously infer the distributions of BBH masses, spins and redshifts. This dissertation contains previously published and unpublished material.
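The semi-parametric approach can be pictured as a simple parametric shape multiplied by a flexible correction. The following minimal Python sketch, assuming a power-law base and a cubic-spline perturbation, illustrates that idea; the function name, parameter defaults and knot placement are illustrative choices, not the models analysed in the dissertation.

```python
import numpy as np
from scipy.interpolate import CubicSpline

def bbh_mass_density(m, alpha=2.5, m_min=5.0, m_max=100.0,
                     knots=np.linspace(5.0, 100.0, 10),
                     perturbation=np.zeros(10)):
    """Toy semi-parametric primary-mass density: a truncated power law
    (parametric base) multiplied by an exponentiated spline (perturbation)."""
    spline = CubicSpline(knots, perturbation)
    inside = (m >= m_min) & (m <= m_max)
    unnormalized = np.where(inside, m ** (-alpha) * np.exp(spline(m)), 0.0)
    grid = np.linspace(m_min, m_max, 4000)          # numerical normalization
    norm = np.trapz(grid ** (-alpha) * np.exp(spline(grid)), grid)
    return unnormalized / norm

# Example: perturbation values peaking near 10 solar masses add a bump on top
# of the smooth power law, mimicking the kind of structure a flexible model
# can reveal in the data.
knots = np.linspace(5.0, 100.0, 10)
bump = 0.8 * np.exp(-0.5 * ((knots - 10.0) / 3.0) ** 2)
masses = np.linspace(5.0, 100.0, 500)
density = bbh_mass_density(masses, perturbation=bump)
```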

    The consolidated European synthesis of CH₄ and N₂O emissions for the European Union and United Kingdom: 1990–2019

    Knowledge of the spatial distribution of greenhouse gas (GHG) fluxes and their temporal variability, as well as the attribution of fluxes to natural and anthropogenic processes, is essential for monitoring progress in mitigating anthropogenic emissions under the Paris Agreement and for informing its global stocktake. This study provides a consolidated synthesis of CH₄ and N₂O emissions using bottom-up (BU) and top-down (TD) approaches for the European Union and UK (EU27 + UK) and updates earlier syntheses (Petrescu et al., 2020, 2021). The work integrates updated emission inventory data, process-based model results, data-driven sector model results and inverse modeling estimates, and it extends the previous 1990–2017 period to 2019. BU and TD products are compared with European national greenhouse gas inventories (NGHGIs) reported by parties under the United Nations Framework Convention on Climate Change (UNFCCC) in 2021. Uncertainties in NGHGIs, as reported to the UNFCCC by the EU and its member states, are also included in the synthesis. Variations in estimates produced with other methods, such as atmospheric inversion models (TD) or spatially disaggregated inventory datasets (BU), arise from diverse sources, including within-model uncertainty related to parameterization as well as structural differences between models. When comparing NGHGIs with the other approaches, the activities included are a key source of bias between estimates, e.g., the separation of anthropogenic and natural fluxes, which in atmospheric inversions is sensitive to the prior geospatial distribution of emissions. For CH₄ emissions, over the updated 2015–2019 period, which covers a sufficiently robust number of overlapping estimates, and most importantly the NGHGIs, the anthropogenic BU approaches are directly comparable, accounting for mean emissions of 20.5 Tg CH₄ yr⁻¹ (EDGARv6.0, last year 2018) and 18.4 Tg CH₄ yr⁻¹ (GAINS, last year 2015), close to the NGHGI estimate of 17.5 ± 2.1 Tg CH₄ yr⁻¹. TD inversions give higher emission estimates, as they also detect natural emissions. Over the same period, high-resolution regional TD inversions report a mean emission of 34 Tg CH₄ yr⁻¹, while coarser-resolution global-scale TD inversions yield 23 and 24 Tg CH₄ yr⁻¹ inferred from GOSAT and surface (SURF) network atmospheric measurements, respectively. Natural emissions from peatlands and mineral soils (from the JSBACH–HIMMELI model), rivers, lakes and reservoirs, geological sources, and biomass burning together amount to about 8 Tg CH₄ yr⁻¹ and could account for the gap between the NGHGIs and the inversions. For N₂O emissions, over the 2015–2019 period, both BU products (EDGARv6.0 and GAINS) report a mean value of anthropogenic emissions of 0.9 Tg N₂O yr⁻¹, close to the NGHGI data (0.8 Tg N₂O yr⁻¹, ±55 %). Over the same period, the mean of TD global and regional inversions was 1.4 Tg N₂O yr⁻¹ (excluding TOMCAT, which reported no data). The TD and BU comparison method defined in this study can be operationalized for future annual updates of the CH₄ and N₂O budgets at the national and EU27 + UK scales. Future comparability will be enhanced by further steps involving analysis at finer temporal resolution and estimation of emissions over intra-annual timescales, which is of great importance for CH₄ and N₂O and may help identify sector contributions to the divergence between prior and posterior estimates at the annual and/or inter-annual scale.
Even though the comparison between CH₄ and N₂O inversion estimates and NGHGIs is currently highly uncertain because of the large spread in the inversion results, TD inversions inferred from atmospheric observations represent the most independent data against which inventory totals can be compared. With anticipated improvements in atmospheric modeling and observations, as well as in the modeling of natural fluxes, TD inversions may arguably emerge as the most powerful tool for verifying emission inventories for CH₄, N₂O and other GHGs. The referenced datasets related to the figures are visualized at https://doi.org/10.5281/zenodo.7553800 (Petrescu et al., 2023).
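As rough bookkeeping for the comparison above, the snippet below recomputes the inversion-minus-inventory gaps from the rounded means quoted in this abstract and the share nominally covered by the roughly 8 Tg CH₄ yr⁻¹ of natural fluxes; the variable and function names are illustrative, and the numbers are only the values cited here.

```python
# Rough bookkeeping with the CH4 numbers quoted above (Tg CH4 per year,
# 2015-2019 means). Values come from the abstract; names are illustrative.
nghgi_anthropogenic = 17.5                    # EU27+UK national inventories
td_inversions = {"regional (mean)": 34.0, "global GOSAT": 23.0, "global SURF": 24.0}
natural_fluxes = 8.0                          # peatlands/soils, inland waters, geological, fires

def inversion_gap(td_total, inventory=nghgi_anthropogenic, natural=natural_fluxes):
    """Return the inversion-minus-inventory gap and the share covered by natural fluxes."""
    gap = td_total - inventory
    return gap, natural / gap

for name, total in td_inversions.items():
    gap, share = inversion_gap(total)
    print(f"{name}: gap = {gap:.1f} Tg CH4/yr, natural fluxes cover {share:.0%}")
```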

    Data- og ekspertdreven variabelseleksjon for prediktive modeller i helsevesenet: mot økt tolkbarhet i underbestemte maskinlæringsproblemer [Data- and expert-driven variable selection for predictive models in healthcare: towards increased interpretability in underdetermined machine learning problems]

    Modern data acquisition techniques in healthcare generate large collections of data from multiple sources, such as novel diagnosis and treatment methodologies. Some concrete examples are electronic healthcare record systems, genomics, and medical images. This leads to situations with often unstructured, high-dimensional, heterogeneous patient cohort data, where classical statistical methods may not be sufficient for optimal utilization of the data and informed decision-making. Instead, investigating such data structures with modern machine learning techniques promises to improve the understanding of patient health issues and may provide a better platform for informed decision-making by clinicians. Key requirements for this purpose include (a) sufficiently accurate predictions and (b) model interpretability. Achieving both aspects in parallel is difficult, particularly for datasets with few patients, which are common in the healthcare domain. In such cases, machine learning models encounter mathematically underdetermined systems and may overfit easily on the training data. An important approach to overcome this issue is feature selection, i.e., determining a subset of informative features from the original set of features with respect to the target variable. While potentially raising the predictive performance, feature selection fosters model interpretability by identifying a low number of relevant model parameters, helping to better understand the underlying biological processes that lead to health issues. Interpretability requires that feature selection is stable, i.e., small changes in the dataset do not lead to changes in the selected feature set. A concept to address instability is ensemble feature selection, i.e., the process of repeating the feature selection multiple times on subsets of samples of the original dataset and aggregating the results in a meta-model. This thesis presents two approaches for ensemble feature selection, which are tailored towards high-dimensional data in healthcare: the Repeated Elastic Net Technique for feature selection (RENT) and the User-Guided Bayesian Framework for feature selection (UBayFS). While RENT is purely data-driven and builds upon elastic net regularized models, UBayFS is a general framework for ensembles with the capability to include expert knowledge in the feature selection process via prior weights and side constraints. A case study modeling the overall survival of cancer patients compares these novel feature selectors and demonstrates their potential in clinical practice. Beyond the selection of single features, UBayFS also allows for selecting whole feature groups (feature blocks) acquired from multiple data sources, such as those mentioned above. Importance quantification of such feature blocks plays a key role in tracing information about the target variable back to the acquisition modalities. Such information on feature block importance may lead to positive effects on the use of human, technical, and financial resources if systematically integrated into the planning of patient treatment by excluding the acquisition of non-informative features. Since a generalization of feature importance measures to block importance is not trivial, this thesis also investigates and compares approaches for feature block importance rankings. This thesis demonstrates that high-dimensional datasets from multiple data sources in the medical domain can be successfully tackled by the presented approaches for feature selection.
Experimental evaluations demonstrate favorable predictive performance, stability, and interpretability of the results, which carries high potential for better data-driven decision support in clinical practice.
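A minimal sketch of the ensemble feature-selection idea underlying RENT, assuming a scikit-learn environment: elastic-net models are fitted on repeated random subsamples and features are retained when their selection frequency exceeds a threshold. The function name, defaults and aggregation rule are simplifications for illustration, not the published RENT or UBayFS implementations.

```python
import numpy as np
from sklearn.linear_model import ElasticNet
from sklearn.utils import resample

def ensemble_feature_selection(X, y, n_repeats=100, subsample=0.8,
                               alpha=0.1, l1_ratio=0.5, min_frequency=0.6):
    """Fit elastic-net models on repeated random subsamples and keep features
    whose coefficients are non-zero in at least `min_frequency` of the fits."""
    n_samples, n_features = X.shape
    counts = np.zeros(n_features)
    for seed in range(n_repeats):
        X_sub, y_sub = resample(X, y, n_samples=int(subsample * n_samples),
                                random_state=seed)
        model = ElasticNet(alpha=alpha, l1_ratio=l1_ratio, max_iter=5000)
        counts += (model.fit(X_sub, y_sub).coef_ != 0).astype(float)
    frequency = counts / n_repeats
    selected = np.where(frequency >= min_frequency)[0]
    return selected, frequency

# Usage: selected, freq = ensemble_feature_selection(X_train, y_train)
```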

    The consolidated European synthesis of CH₄ and N₂O emissions for the European Union and United Kingdom: 1990–2019

    Funding Information: We thank Aurélie Paquirissamy, Géraud Moulas and the ARTTIC team for the great managerial support offered during the project. FAOSTAT statistics are produced and disseminated with the support of its member countries to the FAO regular budget. Annual, gap-filled and harmonized NGHGI uncertainty estimates for the EU and its member states were provided by the EU GHG inventory team (European Environment Agency and its European Topic Centre on Climate Change Mitigation). Most top-down inverse simulations referred to in this paper rely, for the derivation of optimized flux fields, on observational data provided by surface stations that are part of networks like ICOS (datasets: 10.18160/P7E9-EKEA, Integrated Non-CO₂ Observing System, 2018a, and 10.18160/B3Q6-JKA0, Integrated Non-CO₂ Observing System, 2018b), AGAGE, NOAA (Obspack Globalview CH₄: 10.25925/20221001, Schuldt et al., 2017), CSIRO and/or WMO GAW. We thank all station PIs and their organizations for providing these valuable datasets. We acknowledge the work of other members of the EDGAR group (Edwin Schaaf, Jos Olivier) and the outstanding scientific contribution to the VERIFY project of Peter Bergamaschi. Timo Vesala thanks ICOS-Finland, University of Helsinki. The TM5-CAMS inversions are available from https://atmosphere.copernicus.eu (last access: June 2022); Arjo Segers acknowledges support from the Copernicus Atmosphere Monitoring Service, implemented by the European Centre for Medium-Range Weather Forecasts on behalf of the European Commission (grant no. CAMS2_55). This research has been supported by the European Commission, Horizon 2020 Framework Programme (VERIFY, grant no. 776810). Ronny Lauerwald received support from the CLAND Convergence Institute. Prabir Patra received support from the Environment Research and Technology Development Fund (grant no. JPMEERF20182002) of the Environmental Restoration and Conservation Agency of Japan. Pierre Regnier received financial support from the H2020 project ESM2025 – Earth System Models for the Future (grant no. 101003536). David Bastviken received support from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation program (METLAKE, grant no. 725546). Greet Janssens-Maenhout received support from the European Union's Horizon 2020 research and innovation program (CoCO₂, grant no. 958927). Tuula Aalto received support from the Academy of Finland (grant nos. 351311 and 345531). Sönke Zaehle received support from the ERC consolidator grant QUINCY (grant no. 647204). Peer reviewed. Publisher PDF.

    Landscape genetic analysis of population structure and barriers to gene flow in boreal woodland caribou (Rangifer tarandus caribou)

    This study examines patterns of population genetic structure and gene flow in boreal woodland caribou (Rangifer tarandus caribou), which are experiencing declining population sizes across North America. Compared to previous studies, I used fine-scale landscape genetic analyses with intensive sampling to identify genetic subdivisions within a single range and the anthropogenic and natural drivers of genetic discontinuity. The Brightsand Range of Ontario supports one of the southernmost boreal woodland caribou populations and contains both actively managed and unmanaged forests. This range provided a unique opportunity to examine the drivers of population subdivision using fecal DNA samples (n = 788) previously obtained from non-invasive surveys. I used 12 microsatellite markers to investigate genetic diversity, identify patterns of genetic structure, and delineate barriers to gene flow. I found high connectivity among most sites, with low but significant population genetic substructure (Fst = 0.009, p < 0.001). A Mantel test identified a weak pattern of isolation by distance, and genetic clustering algorithms failed to identify a biologically meaningful pattern of population substructure. MEMGENE analysis and multiple regression analysis based on univariate resistances in CIRCUITSCAPE indicated that wildfires acted as a barrier to gene flow, with sites separated by burned areas showing higher genetic differentiation than expected from isolation by distance alone. POPGRAPH analysis identified genetically isolated sites within the managed portion of the range, and CIRCUITSCAPE analysis showed that the managed portion of the range is highly fragmented and contains limited connectivity corridors, whereas the unmanaged portion had high connectivity throughout. Overall, this study suggests that boreal woodland caribou are weakly genetically differentiated across the Brightsand Range, with isolation by distance and isolation by resistance contributing to variation in allele frequencies. However, while genetic differentiation was weak, conservation efforts will be required within the managed forest area to reduce the loss of genetic diversity by improving landscape connectivity.
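For context on the isolation-by-distance testing mentioned above, the sketch below shows a generic permutation-based Mantel test relating a genetic distance matrix to a geographic distance matrix; it is a textbook illustration with hypothetical names, not the exact analysis pipeline used in this study.

```python
import numpy as np

def mantel_test(genetic_dist, geographic_dist, n_permutations=9999, seed=0):
    """Permutation-based Mantel test: correlate the upper triangles of two
    square distance matrices, then jointly permute rows/columns of one matrix
    to build a null distribution for the correlation."""
    rng = np.random.default_rng(seed)
    n = genetic_dist.shape[0]
    upper = np.triu_indices(n, k=1)

    def corr(a, b):
        return np.corrcoef(a[upper], b[upper])[0, 1]

    observed = corr(genetic_dist, geographic_dist)
    null = np.empty(n_permutations)
    for i in range(n_permutations):
        p = rng.permutation(n)
        null[i] = corr(genetic_dist[np.ix_(p, p)], geographic_dist)
    p_value = (np.sum(null >= observed) + 1) / (n_permutations + 1)
    return observed, p_value
```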

    Molecular signals of arms race evolution between RNA viruses and their hosts

    Viruses are intracellular parasites that hijack their hosts’ cellular machinery to replicate themselves. This creates an evolutionary “arms race” between hosts and viruses, in which the former develop mechanisms to restrict viral infection and the latter evolve ways to circumvent these molecular barriers. In this thesis, I explore examples of this virus–host molecular interplay, focusing on events in the evolutionary histories of both viruses and hosts. The thesis begins by examining how recombination, the exchange of genetic material between related viruses, expands the genomic diversity of the Sarbecovirus subgenus, which includes SARS-CoV, responsible for the 2002 SARS epidemic, and SARS-CoV-2, responsible for the COVID-19 pandemic. On the host side, I examine the evolutionary interaction between RNA viruses and two interferon-stimulated genes expressed in hosts. First, I show how the 2′-5′-oligoadenylate synthetase 1 (OAS1) gene of horseshoe bats (Rhinolophoidea), the reservoir hosts of sarbecoviruses, lost its anti-coronaviral activity at the base of this bat superfamily. By reconstructing the OAS1 protein of the Rhinolophoidea common ancestor, I first validate the loss of antiviral function and then highlight the implications of this event for the virus–host association between sarbecoviruses and horseshoe bats. Second, I focus on the evolution of the human butyrophilin subfamily 3 member A3 (BTN3A3) gene, which restricts infection by avian influenza A viruses (IAVs). The evolutionary analysis reveals that BTN3A3’s anti-IAV function was gained within the primates and that specific amino acid substitutions in the IAV NP protein are required to evade human BTN3A3 activity. Gains of BTN3A3-evasion-conferring substitutions correlate with all major human IAV pandemics and epidemics, making these NP residues key markers of the transmissibility potential of IAVs to humans. In the final part of the thesis, I present a novel approach for evaluating dinucleotide compositional biases in virus genomes. Applying this metric to the Flaviviridae virus family uncovers how ancestral host shifts of these viruses correlate with adaptive shifts in their genomes’ dinucleotide representation. Collectively, the contents of this thesis extend our understanding of how viruses interact with their hosts along their intertwined evolution and provide insights into virus host switching and pandemic preparedness.
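As background to the dinucleotide analysis described above, the sketch below computes the classical observed/expected dinucleotide odds ratios for a genome sequence; this is the standard baseline measure, not the novel metric introduced in the thesis, and the function name is hypothetical.

```python
from collections import Counter
from itertools import product

def dinucleotide_odds_ratios(sequence):
    """Observed/expected ratio f(XY) / (f(X) * f(Y)) for all 16 dinucleotides."""
    seq = sequence.upper()
    mono = Counter(seq)
    di = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    n_mono, n_di = len(seq), len(seq) - 1
    ratios = {}
    for x, y in product("ACGT", repeat=2):
        expected = (mono[x] / n_mono) * (mono[y] / n_mono)
        observed = di[x + y] / n_di
        ratios[x + y] = observed / expected if expected > 0 else float("nan")
    return ratios

# Example: CpG ("CG") suppression is a classic compositional bias in many
# vertebrate-infecting RNA virus genomes.
# print(dinucleotide_odds_ratios(genome_string)["CG"])
```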

    Exemplars as a least-committed alternative to dual-representations in learning and memory

    Despite some notable counterexamples, the theoretical and empirical exchange between the fields of learning and memory is limited. In an attempt to promote further theoretical exchange, I explored how learning and memory may be conceptualized as distinct algorithms that operate on the same representations of past experiences. I review representational and process assumptions in learning and memory, using evaluative conditioning and false recognition as examples, and identify important similarities in the theoretical debates. Based on this review, I identify global matching memory models and their exemplar representation as a promising candidate for a common representational substrate that satisfies the principle of least commitment. I then present two cases in which exemplar-based global matching models, which take characteristics of the stimulus material and context into account, suggest parsimonious explanations for empirical dissociations in evaluative conditioning and false recognition in long-term memory. These explanations suggest reinterpretations of findings that are commonly taken as evidence for dual-representation models. Finally, I report that the same approach also provides a natural unitary account of false recognition in short-term memory, a finding that challenges the assumption that short-term memory is insulated from long-term memory. Taken together, this work illustrates the broad explanatory scope and the integrative yet parsimonious potential of exemplar-based global matching models.
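To give a concrete feel for global matching over exemplar traces, the following schematic, MINERVA 2-style sketch sums nonlinearly transformed probe-trace similarities into a single echo intensity; the coding scheme, normalization and parameter values are illustrative assumptions rather than a fitted model from this work.

```python
import numpy as np

def echo_intensity(probe, traces, power=3):
    """Global matching over stored exemplar traces: each trace contributes its
    (sign-preserving) similarity to the probe raised to an odd power, and the
    summed activation serves as the familiarity/evaluation signal."""
    probe = np.asarray(probe, dtype=float)
    traces = np.asarray(traces, dtype=float)
    similarities = traces @ probe / probe.size       # simple normalized match
    activations = np.sign(similarities) * np.abs(similarities) ** power
    return activations.sum()

# Example with +1 / -1 / 0 feature coding: the probe matches the first trace
# exactly and the others only partially, yielding a high echo intensity.
stored = np.array([[1, -1, 0, 1], [1, 1, 0, -1], [-1, -1, 1, 1]])
print(echo_intensity([1, -1, 0, 1], stored))
```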