Unsupervised Keyword Extraction from Polish Legal Texts
In this work, we present an application of the recently proposed unsupervised keyword extraction algorithm RAKE to a corpus of Polish legal texts from the field of public procurement. RAKE is essentially a language- and domain-independent method. Its only language-specific input is a stoplist containing a set of non-content words. The performance of the method depends heavily on the choice of this stoplist, which should be adapted to the domain. We therefore complement the RAKE algorithm with an automatic approach to selecting non-content words, based on the statistical properties of term distributions.
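A minimal sketch of the RAKE idea, assuming a small hand-picked stoplist rather than the automatically selected one described above: candidate phrases are the maximal runs of non-stopwords, and each phrase is scored by summing degree(w)/frequency(w) over its words, as in the original RAKE formulation.

```python
import re
from collections import defaultdict

def rake_keywords(text, stopwords):
    """Minimal RAKE-style extraction: split the text into candidate
    phrases at stopword boundaries, then score each phrase by the sum
    of degree(w)/frequency(w) over its words."""
    words = re.findall(r"[a-zA-Z]+", text.lower())
    phrases, current = [], []
    for w in words:
        if w in stopwords:
            if current:
                phrases.append(tuple(current))
                current = []
        else:
            current.append(w)
    if current:
        phrases.append(tuple(current))
    # degree(w) = freq(w) + co-occurrences with other words in its phrases
    freq, cooc = defaultdict(int), defaultdict(int)
    for p in phrases:
        for w in p:
            freq[w] += 1
            cooc[w] += len(p) - 1
    scored = {p: sum((cooc[w] + freq[w]) / freq[w] for w in p) for p in phrases}
    return sorted(scored.items(), key=lambda kv: -kv[1])

stop = {"the", "of", "a", "is", "in", "for", "to", "and"}
top = rake_keywords("the award of a public procurement contract in the contract notice", stop)
```

Longer phrases accumulate higher degree scores, which is why RAKE favours multi-word terms; with a poorly adapted stoplist, content words get absorbed into the boundaries and candidate phrases fragment, which is exactly the sensitivity the abstract describes.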
Evaluating model simulations of twentieth-century sea-level rise. Part II: regional sea-level changes
Twentieth-century regional sea level changes are estimated from 12 climate models from phase 5 of the Coupled Model Intercomparison Project (CMIP5). The output of the CMIP5 climate model simulations was used to calculate the global and regional sea level changes associated with dynamic sea level, atmospheric loading, glacier mass changes, and ice sheet surface mass balance contributions. The contributions from groundwater depletion, reservoir storage, and dynamic ice sheet mass changes are estimated from observations, as they are not simulated by climate models. All contributions are summed, including the glacial isostatic adjustment (GIA) contribution, and compared to observational estimates from 27 tide gauge records over the twentieth century (1900–2015). A general agreement is found between the simulated sea level and the tide gauge records in terms of interannual to multidecadal variability over 1900–2015, but climate models tend to systematically underestimate the observed sea level trends, particularly in the first half of the twentieth century. The corrections based on attributable biases between observations and models that were identified in Part I of this two-part paper result in an improved explanation of the spatial variability in observed sea level trends by climate models. Climate models show that the spatial variability in sea level trends observed by tide gauge records is dominated by the GIA and steric contributions over 1900–2015. Climate models also show that it is important to include all contributions to sea level changes, as they cause significant local deviations; note, for example, the groundwater depletion around India, which is responsible for the low twentieth-century sea level rise in that region.
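The bookkeeping in the summation step can be sketched with entirely synthetic numbers; the component trends below are illustrative, not the paper's values:

```python
import numpy as np

# Synthetic annual sea-level contributions (mm) at one tide-gauge site,
# 1900-2015. All component trends are invented for illustration.
years = np.arange(1900, 2016)
rng = np.random.default_rng(0)
contributions = {
    "steric + dynamic": 0.8 * (years - 1900) + rng.normal(0.0, 3.0, years.size),
    "glaciers": 0.4 * (years - 1900),
    "ice-sheet SMB": 0.1 * (years - 1900),
    "groundwater depletion": -0.05 * (years - 1900),  # can be negative locally
    "GIA": 0.2 * (years - 1900),
}
# Sum all contributions, then extract the linear trend (mm/yr) that would
# be compared against the tide-gauge record at this site.
total = sum(contributions.values())
trend_mm_per_yr = np.polyfit(years, total, 1)[0]
```

Dropping any one dictionary entry shifts the recovered trend by that component's rate, which is the point the abstract makes about local deviations such as groundwater depletion around India.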
N-body simulations of gravitational dynamics
We describe the astrophysical and numerical basis of N-body simulations, both of collisional stellar systems (dense star clusters and galactic centres) and collisionless stellar dynamics (galaxies and large-scale structure). We explain and discuss the state-of-the-art algorithms used for these quite different regimes, attempt to give a fair critique, and point out possible directions of future improvement and development. We briefly touch upon the history of N-body simulations and their most important results.
Comment: invited review (28 pages), to appear in European Physical Journal Plus
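A minimal direct-summation sketch of the collisional regime the review discusses: O(N²) pairwise forces with Plummer softening, advanced with a kick-drift-kick leapfrog integrator (units with G = 1; all values illustrative):

```python
import numpy as np

def accelerations(pos, mass, eps=1e-3):
    """Direct-summation gravitational accelerations, O(N^2) pairwise,
    with Plummer softening eps to tame close encounters (G = 1)."""
    d = pos[None, :, :] - pos[:, None, :]   # d[i, j] = pos[j] - pos[i]
    r2 = (d ** 2).sum(axis=-1) + eps ** 2
    np.fill_diagonal(r2, np.inf)            # exclude self-interaction
    return (mass[None, :, None] * d / r2[..., None] ** 1.5).sum(axis=1)

def leapfrog(pos, vel, mass, dt, steps):
    """Kick-drift-kick leapfrog: second-order accurate and symplectic."""
    acc = accelerations(pos, mass)
    for _ in range(steps):
        vel += 0.5 * dt * acc
        pos += dt * vel
        acc = accelerations(pos, mass)
        vel += 0.5 * dt * acc
    return pos, vel

# Sanity check: equal-mass binary on a circular orbit, separation d = 1.
# Each body orbits the centre of mass at radius 0.5 with speed
# v = sqrt(G*m/(2*d)) = sqrt(0.5).
mass = np.array([1.0, 1.0])
pos = np.array([[-0.5, 0.0, 0.0], [0.5, 0.0, 0.0]])
v = np.sqrt(0.5)
vel = np.array([[0.0, -v, 0.0], [0.0, v, 0.0]])
pos, vel = leapfrog(pos, vel, mass, dt=0.01, steps=1000)
```

The O(N²) cost of this direct sum is what drives the tree, fast-multipole, and particle-mesh methods covered for the collisionless regime, where softening replaces two-body relaxation entirely.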
Statistical Language Modelling
Grammar-based natural language processing has reached a level where it can 'understand' language to a limited degree in restricted domains. For example, it is possible to parse textual material very accurately and assign semantic relations to parts of sentences. An alternative approach originates from the work of Shannon over half a century ago [41], [42]. This approach assigns probabilities to linguistic events, where mathematical models are used to represent statistical knowledge. Once models are built, we decide which event is more likely than the others according to their probabilities. Although statistical methods currently use a very impoverished representation of speech and language (typically finite state), it is possible to train the underlying models from large amounts of data. Importantly, such statistical approaches often produce useful results. Statistical approaches seem especially well suited to spoken language, which is often spontaneous or conversational and not readily amenable to standard grammar-based approaches.
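The core of the statistical approach can be sketched as a bigram model: count events in a corpus, turn counts into probabilities, and ask which continuation is more likely. Add-one (Laplace) smoothing is used here, and the toy corpus is invented for illustration:

```python
from collections import Counter

def train_bigram(corpus):
    """Bigram language model with add-one (Laplace) smoothing:
    P(w2 | w1) = (count(w1, w2) + 1) / (count(w1) + V)."""
    tokens = corpus.split()
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    vocab = len(unigrams)

    def prob(w1, w2):
        return (bigrams[(w1, w2)] + 1) / (unigrams[w1] + vocab)

    return prob

p = train_bigram("the cat sat on the mat the cat ran")
# 'which event is more likely': compare candidate continuations of "the"
more_likely = p("the", "cat") > p("the", "sat")
```

This is exactly the "very impoverished representation (typically finite state)" the abstract mentions, yet it is trainable from raw text, which is why such models scale where hand-built grammars do not.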
The OpenMolcas Web: A Community-Driven Approach to Advancing Computational Chemistry
The developments of the open-source OpenMolcas chemistry software environment since spring 2020 are described, with a focus on novel functionalities accessible in the stable branch of the package or via interfaces with other packages. These developments span a wide range of topics in computational chemistry and are presented in thematic sections: electronic structure theory, electronic spectroscopy simulations, analytic gradients and molecular structure optimizations, ab initio molecular dynamics, and other new features. This report offers an overview of the chemical phenomena and processes OpenMolcas can address, while showing that OpenMolcas is an attractive platform for state-of-the-art atomistic computer simulations.
Integrating sequence and array data to create an improved 1000 Genomes Project haplotype reference panel
A major use of the 1000 Genomes Project (1000GP) data is genotype imputation in genome-wide association studies (GWAS). Here we develop a method to estimate haplotypes from low-coverage sequencing data that can take advantage of single-nucleotide polymorphism (SNP) microarray genotypes on the same samples. First, the SNP array data are phased to build a backbone (or 'scaffold') of haplotypes across each chromosome. We then phase the sequence data 'onto' this haplotype scaffold. This approach can take advantage of relatedness between sequenced and non-sequenced samples to improve accuracy. We use this method to create a new 1000GP haplotype reference set for use by the human genetic community. Using a set of validation genotypes at SNP and bi-allelic indel sites, we show that these haplotypes have lower genotype discordance and improved imputation performance into downstream GWAS samples, especially at low-frequency variants. © 2014 Macmillan Publishers Limited. All rights reserved.
Design and construction of the MicroBooNE detector
This paper describes the design and construction of the MicroBooNE liquid argon time projection chamber and associated systems. MicroBooNE is the first phase of the Short Baseline Neutrino program, located at Fermilab, and will utilize the capabilities of liquid argon detectors to examine a rich assortment of physics topics. In this document, details of design specifications, assembly procedures, and acceptance tests are reported.
Measurement of the longitudinal diffusion of ionization electrons in the MicroBooNE detector
Accurate knowledge of electron transport properties is vital to understanding the information provided by liquid argon time projection chambers (LArTPCs). Ionization electron drift-lifetime, local electric field distortions caused by positive ion accumulation, and electron diffusion can all significantly impact the measured signal waveforms. This paper presents a measurement of the effective longitudinal electron diffusion coefficient, D_L, in MicroBooNE at the nominal electric field strength of 273.9 V/cm. Historically, this measurement has been made in LArTPC prototype detectors; this represents the first measurement in a large-scale (85-tonne active volume) LArTPC operating in a neutrino beam, and the largest dataset ever used for this measurement. Using a sample of ∼70,000 through-going cosmic ray muon tracks tagged with MicroBooNE's cosmic ray tagger system, we measure D_L = 3.74 +0.28/−0.29 cm²/s.
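The relation behind such a measurement is that longitudinal diffusion broadens ionization pulses linearly in drift time, σ_t²(t) = σ₀² + (2 D_L / v_d²) t, so D_L follows from a straight-line fit of squared pulse width against drift time. A sketch with synthetic, noise-free numbers; the drift velocity value is an assumption for illustration, not a number taken from the paper:

```python
import numpy as np

# sigma_t^2(t) = sigma_0^2 + (2 * D_L / v_d^2) * t
# => D_L = 0.5 * slope * v_d^2, from a linear fit of width^2 vs drift time.
v_d = 0.1098        # assumed drift velocity, cm/us (~1.1 mm/us, illustrative)
D_L_true = 3.74e-6  # cm^2/us  (= 3.74 cm^2/s)

t = np.linspace(0.0, 2300.0, 50)                # drift times, us
sigma2 = 1.0 + (2 * D_L_true / v_d**2) * t      # squared widths, us^2

slope, intercept = np.polyfit(t, sigma2, 1)
D_L = 0.5 * slope * v_d**2 * 1e6                # convert cm^2/us -> cm^2/s
```

In a real analysis each (t, σ²) point comes from binned waveform widths of tagged muon tracks, and the fit's slope uncertainty propagates directly into the quoted error on D_L.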
Riociguat treatment in patients with chronic thromboembolic pulmonary hypertension: Final safety data from the EXPERT registry
Objective: The soluble guanylate cyclase stimulator riociguat is approved for the treatment of adult patients with pulmonary arterial hypertension (PAH) and inoperable or persistent/recurrent chronic thromboembolic pulmonary hypertension (CTEPH) following Phase
- …