
    Building a biomedical tokenizer using the token lattice design pattern and the adapted Viterbi algorithm

    Abstract: Background: Tokenization is an important component of language processing, yet there is no widely accepted tokenization method for English texts, including biomedical texts. Apart from rule-based techniques, tokenization in the biomedical domain has been treated as a classification task: classifier-based biomedical tokenizers split or join textual objects to form tokens. The idiosyncratic nature of each biomedical tokenizer's output complicates adoption and reuse, and there is generally little guidance on how to apply an existing tokenizer to a new domain (or subdomain). We identify and complete a novel tokenizer design pattern and suggest a systematic approach to tokenizer creation. We implement a tokenizer based on our design pattern that combines regular expressions and machine learning; our machine learning approach differs from the previous split-join classification approaches. We evaluate our approach against three other tokenizers on the task of tokenizing biomedical text. Results: MedPost and our adapted Viterbi tokenizer performed best, with accuracies of 92.9% and 92.4%, respectively. Conclusions: Our evaluation supports the claim that the design pattern and guidelines are a viable approach to tokenizer construction, producing tokenizers that match leading custom-built tokenizers in a particular domain. The evaluation also demonstrates that ambiguous tokenizations can be disambiguated through POS tagging, and that POS tag sequences and training data have a significant impact on proper text tokenization.
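
    A minimal Python sketch of the token-lattice idea described above (not the paper's implementation): regular expressions propose overlapping candidate tokens, and a Viterbi-style dynamic program picks the best-scoring path through the lattice. The candidate patterns and the scoring function are illustrative stand-ins; in the paper the path scores come from POS tag sequences learned from training data.

        import re
        from math import log

        # Illustrative candidate-token patterns; each regular expression proposes
        # alternative token spans, so the same text region can appear in the
        # lattice more than once (e.g. "T-cell" vs. "T", "-", "cell").
        CANDIDATE_PATTERNS = [
            r"[A-Za-z]+(?:-[A-Za-z]+)+",   # hyphenated compound kept whole
            r"[A-Za-z]+",                  # plain alphabetic token
            r"\d+(?:\.\d+)?",              # integer or decimal number
            r"[^\sA-Za-z\d]",              # single punctuation character
        ]

        def build_lattice(text):
            """Map each start offset to the set of candidate tokens beginning there."""
            lattice = {}
            for pattern in CANDIDATE_PATTERNS:
                for m in re.finditer(pattern, text):
                    lattice.setdefault(m.start(), set()).add(m.group())
            return lattice

        def score(token):
            # Toy arc score favouring fewer, longer tokens. In the paper this role
            # is played by POS-tag-sequence probabilities from a trained tagger.
            return log(len(token) + 1) - 1.0

        def viterbi_tokenize(text, lattice):
            """Best-scoring token path through the lattice (adapted Viterbi)."""
            best = {0: (0.0, [])}              # offset -> (path score, tokens so far)
            for i in range(len(text)):
                if i not in best:
                    continue
                base, path = best[i]
                if text[i].isspace():          # whitespace ends a token, costs nothing
                    if i + 1 not in best or base > best[i + 1][0]:
                        best[i + 1] = (base, path)
                    continue
                for token in lattice.get(i, ()):
                    j, s = i + len(token), base + score(token)
                    if j not in best or s > best[j][0]:
                        best[j] = (s, path + [token])
            return best[len(text)][1]

        text = "IL-2 activates T-cell receptors."
        print(viterbi_tokenize(text, build_lattice(text)))
        # ['IL', '-', '2', 'activates', 'T-cell', 'receptors', '.']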

    Visualizing Software Architecture Evolution Using Change-Sets

    When trying to understand the evolution of a software system it can be useful to visualize the evolution of the system's architecture. Existing tools for viewing architectural evolution assume that what a user is interested in can be described in an unbroken sequence of time, for example the changes over the last six months. We present an alternative approach that provides a lightweight method for examining the net effect of any set of changes on a system's architecture. We also present Motive, a prototype tool that implements this approach, and demonstrate how it can be used to answer questions about software evolution by describing case studies we conducted on two Java systems.
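
    The change-set idea lends itself to a small illustration. The Python sketch below is hypothetical and not Motive's implementation: it models an architectural view as a set of module-dependency edges and a change-set as edges added and removed, so the net effect of any selection of change-sets, not necessarily contiguous in time, is just the additions and removals that do not cancel out.

        from collections import Counter

        # Hypothetical representation: a change-set records the dependency edges
        # (module_a, module_b) it adds to and removes from the architectural view.
        class ChangeSet:
            def __init__(self, added=(), removed=()):
                self.added = set(added)
                self.removed = set(removed)

        def net_effect(change_sets):
            """Net architectural change of an arbitrary collection of change-sets.

            The collection may be any subset of the history (e.g. 'all changes by
            one developer'); additions and removals of the same edge cancel out.
            """
            balance = Counter()
            for cs in change_sets:
                for edge in cs.added:
                    balance[edge] += 1
                for edge in cs.removed:
                    balance[edge] -= 1
            added = {e for e, n in balance.items() if n > 0}
            removed = {e for e, n in balance.items() if n < 0}
            return added, removed

        # Example: an edge introduced and later removed within the selection
        # contributes nothing to the net view.
        history = [
            ChangeSet(added={("ui", "db")}),
            ChangeSet(added={("ui", "cache")}, removed={("ui", "db")}),
        ]
        print(net_effect(history))   # ({('ui', 'cache')}, set())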

    Improving dissolve spatial operations in a simple feature model

    This paper presents an algorithm to improve the performance of a spatial operation called 'dissolve', widely used in Geographic Information Systems (GIS) through spatial database systems. In simple feature models (lacking persistent topology), executing some common spatial operations requires a large amount of system resources. Such operations occur, for example, in the 'OpenGIS Simple Features for SQL' protocol (SFS), a client-server interoperability standard defined by the Open Geospatial Consortium, Inc. (OGC). The specific spatial operation studied in this paper, 'dissolve', is carried out using the union spatial operator defined by the OGC and consists of removing the boundaries between adjacent polygons. The proposed algorithm substantially improves the performance of this spatial operation, requiring between 100 and 1000 times fewer resources, and thereby enables the database server to carry out the operation on huge datasets containing up to millions of geometries. To check and validate the algorithm, a new open source software package (PGAT) has been developed. This project was carried out at the University of Victoria (British Columbia, Canada) thanks to a grant awarded by the Secretaria de Estado de Universidades e Investigacion del Ministerio de Educacion y Ciencia, Spain (Ref. 2006-0264). Martínez Llario, J.C.; Weber-Jahnke, J.H.; Coll-Aliaga, E. (2009). Improving dissolve spatial operations in a simple feature model. Advances in Engineering Software 40(3):170-175. doi:10.1016/j.advengsoft.2008.03.014
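
    As a concrete illustration of the dissolve semantics (not the paper's optimized algorithm or the PGAT package), the Python sketch below groups features by an attribute and merges each group with a single n-ary union using the Shapely library, which removes the boundaries between adjacent polygons.

        from collections import defaultdict
        from shapely.geometry import Polygon
        from shapely.ops import unary_union

        def dissolve(features, key):
            """features: iterable of (attributes: dict, geometry: Polygon) pairs."""
            groups = defaultdict(list)
            for attrs, geom in features:
                groups[attrs[key]].append(geom)
            # One n-ary union per group is already much cheaper than folding the
            # group with repeated pairwise unions.
            return {value: unary_union(geoms) for value, geoms in groups.items()}

        features = [
            ({"region": "A"}, Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])),
            ({"region": "A"}, Polygon([(1, 0), (2, 0), (2, 1), (1, 1)])),  # adjacent to the first
            ({"region": "B"}, Polygon([(5, 5), (6, 5), (6, 6), (5, 6)])),
        ]
        dissolved = dissolve(features, "region")
        print(dissolved["A"].area)  # 2.0: the shared edge at x=1 has been removed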

    Roadmap on photonic, electronic and atomic collision physics: I. Light-matter interaction

    We publish three Roadmaps on photonic, electronic and atomic collision physics in order to celebrate the 60th anniversary of the ICPEAC conference. Roadmap I focuses on light-matter interaction. In this area, studies of ultrafast electronic and molecular dynamics have been growing rapidly with the advent of new light sources such as attosecond lasers and X-ray free electron lasers. In parallel, experiments with established synchrotron radiation sources and femtosecond lasers using cutting-edge detection schemes are revealing new scientific insights that were previously out of reach, and the relevant theories are also being rapidly developed. Target samples for photon-impact experiments are expanding from atoms and small molecules to complex systems such as biomolecules, fullerenes, clusters and solids. This Roadmap looks back along the road, explaining the development of these fields, and looks forward, collecting contributions from twenty leading groups in the field.

    Production of 4He and anti-4He in Pb-Pb collisions at √s_NN = 2.76 TeV at the LHC

    Results on the production of 4He and anti-4He nuclei in Pb-Pb collisions at √s_NN = 2.76 TeV in the rapidity range |y| < 1, using the ALICE detector, are presented in this paper. The rapidity densities corresponding to 0-10% central events are found to be dN/dy(4He) = (0.8 ± 0.4 (stat) ± 0.3 (syst)) × 10^-6 and dN/dy(anti-4He) = (1.1 ± 0.4 (stat) ± 0.2 (syst)) × 10^-6, respectively. This is in agreement with the statistical thermal model expectation assuming the same chemical freeze-out temperature (T_chem = 156 MeV) as for light hadrons. The measured anti-4He/4He ratio is 1.4 ± 0.8 (stat) ± 0.5 (syst). (C) 2018 Published by Elsevier B.V.
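
    A back-of-the-envelope check of the quoted ratio from the two rapidity densities, assuming the uncertainties on the two yields are uncorrelated; the published systematic uncertainty on the ratio accounts for correlations, so the naive propagation below is only approximate.

        from math import sqrt

        # dN/dy for 0-10% central events, as quoted above: (value, stat, syst)
        he4      = (0.8e-6, 0.4e-6, 0.3e-6)   # 4He
        anti_he4 = (1.1e-6, 0.4e-6, 0.2e-6)   # anti-4He

        ratio = anti_he4[0] / he4[0]
        stat  = ratio * sqrt((anti_he4[1] / anti_he4[0]) ** 2 + (he4[1] / he4[0]) ** 2)
        syst  = ratio * sqrt((anti_he4[2] / anti_he4[0]) ** 2 + (he4[2] / he4[0]) ** 2)
        print(f"anti-4He/4He = {ratio:.2f} +/- {stat:.2f} (stat) +/- {syst:.2f} (syst)")
        # -> 1.38 +/- 0.85 (stat) +/- 0.57 (syst), consistent with the quoted
        #    1.4 +/- 0.8 (stat) +/- 0.5 (syst) given rounding and neglected correlations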

    Azimuthal anisotropy of charged jet production in √s_NN = 2.76 TeV Pb-Pb collisions

    We present measurements of the azimuthal dependence of charged jet production in central and semi-central √s_NN = 2.76 TeV Pb-Pb collisions with respect to the second harmonic event plane, quantified as v_2^{ch jet}. Jet finding is performed employing the anti-k_T algorithm with a resolution parameter R = 0.2 using charged tracks from the ALICE tracking system. The contribution of the azimuthal anisotropy of the underlying event is taken into account event-by-event. The remaining (statistical) region-to-region fluctuations are removed on an ensemble basis by unfolding the jet spectra for different event plane orientations independently. Significant non-zero v_2^{ch jet} is observed in semi-central collisions (30-50% centrality) for 20 < p_T^{ch jet} < 90 GeV/c. The azimuthal dependence of the charged jet production is similar to the dependence observed for jets comprising both charged and neutral fragments, and compatible with measurements of the v_2 of single charged particles at high p_T. Good agreement between the data and predictions from JEWEL, an event generator simulating parton shower evolution in the presence of a dense QCD medium, is found in semi-central collisions. (C) 2015 CERN for the benefit of the ALICE Collaboration. Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
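
    The event-plane definition behind v_2^{ch jet} can be illustrated with a short Python sketch using toy numbers (not ALICE data, and without the underlying-event subtraction and unfolding described above): v_2^obs = <cos(2(phi_jet - Psi_2))>, corrected by the event-plane resolution.

        import numpy as np

        def jet_v2(phi_jets, psi2_events, resolution_r2):
            """phi_jets, psi2_events: per-jet azimuth and the event plane of its event."""
            v2_obs = np.mean(np.cos(2.0 * (np.asarray(phi_jets) - np.asarray(psi2_events))))
            return v2_obs / resolution_r2

        # Toy usage: jets drawn preferentially along the event-plane direction
        # give a positive v2.
        rng = np.random.default_rng(0)
        psi2 = rng.uniform(0, np.pi, size=10_000)
        phi = psi2 + rng.normal(0, 0.8, size=psi2.size)   # jets correlated with the plane
        print(jet_v2(phi, psi2, resolution_r2=0.7))        # ~0.4 for this toy input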

    Forward-central two-particle correlations in p-Pb collisions at √s_NN = 5.02 TeV

    Two-particle angular correlations between trigger particles in the forward pseudorapidity range (2.5 2GeV/c. (C) 2015 CERN for the benefit of the ALICE Collaboration. Published by Elsevier B.V.

    Event-shape engineering for inclusive spectra and elliptic flow in Pb-Pb collisions at √s_NN = 2.76 TeV


    Pseudorapidity and transverse-momentum distributions of charged particles in proton-proton collisions at √s = 13 TeV

    The pseudorapidity (η) and transverse-momentum (p_T) distributions of charged particles produced in proton-proton collisions are measured at the centre-of-mass energy √s = 13 TeV. The pseudorapidity distribution in |η| < 1.8 is reported for inelastic events and for events with at least one charged particle in |η| < 1. The pseudorapidity density of charged particles produced in the pseudorapidity region |η| < 0.5 is 5.31 ± 0.18 and 6.46 ± 0.19 for the two event classes, respectively. The transverse-momentum distribution of charged particles is measured in the range 0.15 < p_T < 20 GeV/c and |η| < 0.8 for events with at least one charged particle in |η| < 1. The evolution of the transverse-momentum spectra of charged particles is also investigated as a function of event multiplicity. The results are compared with calculations from the PYTHIA and EPOS Monte Carlo generators. (C) 2015 CERN for the benefit of the ALICE Collaboration. Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
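
    For reference, the two measured variables are simple functions of a track's momentum components; the Python sketch below is illustrative only, since the measurement itself involves acceptance and efficiency corrections and the event-class selection described above.

        import math

        def pseudorapidity(px, py, pz):
            """eta = -ln tan(theta/2), written in terms of momentum components."""
            p = math.sqrt(px * px + py * py + pz * pz)
            return 0.5 * math.log((p + pz) / (p - pz))

        def transverse_momentum(px, py):
            return math.hypot(px, py)

        # A 1 GeV/c-pT track at 45 degrees to the beam axis: eta ~ 0.88
        print(pseudorapidity(1.0, 0.0, 1.0), transverse_momentum(1.0, 0.0))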

    Elliptic flow of muons from heavy-flavour hadron decays at forward rapidity in Pb-Pb collisions at √s_NN = 2.76 TeV

    The elliptic flow, v_2, of muons from heavy-flavour hadron decays at forward rapidity (2.5 < y < 4) is measured in Pb-Pb collisions at √s_NN = 2.76 TeV with the ALICE detector at the LHC. The scalar product, two- and four-particle Q-cumulant and Lee-Yang zeros methods are used. The dependence of the v_2 of muons from heavy-flavour hadron decays on the collision centrality, in the range 0-40%, and on transverse momentum, p_T, is studied in the interval 3 < p_T < 10 GeV/c. A positive v_2 is observed with the scalar product and two-particle Q-cumulant methods in semi-central collisions (10-20% and 20-40% centrality classes) for the p_T interval from 3 to about 5 GeV/c, with a significance larger than 3σ, based on the combination of statistical and systematic uncertainties. The v_2 magnitude tends to decrease towards more central collisions and with increasing p_T. It becomes compatible with zero in the interval 6 < p_T < 10 GeV/c. The results are compared to models describing the interaction of heavy quarks and open heavy-flavour hadrons with the high-density medium formed in high-energy heavy-ion collisions. (C) 2015 CERN for the benefit of the ALICE Collaboration. Published by Elsevier B.V.
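
    Of the methods listed, the two-particle Q-cumulant has a particularly compact form, sketched below in Python with toy data and ignoring the track weighting, multiplicity weighting and non-flow suppression of the real analysis: per event Q_2 = sum_j exp(i 2 phi_j), the two-particle correlator is <2> = (|Q_2|^2 - M) / (M(M-1)), and v_2{2} = sqrt(<<2>>) after averaging over events.

        import numpy as np

        def v2_two_particle_cumulant(events):
            """events: iterable of per-event arrays of particle azimuthal angles."""
            correlators = []
            for phi in events:
                phi = np.asarray(phi)
                m = phi.size
                if m < 2:
                    continue
                q2 = np.sum(np.exp(2j * phi))
                correlators.append((np.abs(q2) ** 2 - m) / (m * (m - 1)))
            return np.sqrt(np.mean(correlators))

        # Toy usage: particles drawn from dN/dphi ~ 1 + 2*v2*cos(2*phi) (event plane
        # at zero) should give back roughly the input v2.
        rng = np.random.default_rng(1)
        v2_in = 0.05
        events = []
        for _ in range(2000):
            phi = rng.uniform(-np.pi, np.pi, size=500)
            keep = rng.uniform(0, 1 + 2 * v2_in, size=phi.size) < 1 + 2 * v2_in * np.cos(2 * phi)
            events.append(phi[keep])
        print(v2_two_particle_cumulant(events))   # ~0.05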