975 research outputs found

    Is ChatGPT a game changer for geocoding -- a benchmark for geocoding address parsing techniques

    The remarkable success of GPT models across various tasks, including toponymy recognition, motivates us to assess the performance of the GPT-3 model in the geocoding address parsing task. To ensure that the evaluation more accurately mirrors performance in real-world scenarios with diverse user input qualities, and to address the pressing need for a 'gold standard' evaluation dataset for geocoding systems, we introduce a benchmark dataset of low-quality address descriptions synthesized from human input patterns mined from the actual input logs of a production geocoding system. This dataset covers 21 different input errors and variations; contains over 239,000 address records uniquely selected from streets across all 50 U.S. states and D.C.; and consists of three subsets to be used as training, validation, and testing sets. Building on this, we train the GPT-3 model and gauge its performance in extracting address components, contrasting it with transformer-based and LSTM-based models. The evaluation results indicate that the Bidirectional LSTM-CRF model achieved the best performance, ahead of the transformer-based models and the GPT-3 model. The transformer-based models produce results very close to those of the Bidirectional LSTM-CRF model. The GPT-3 model, though trailing in performance, shows potential in the address parsing task with few-shot examples, with room for improvement through additional fine-tuning. We open source the code and data of this benchmark so that researchers can use it for future model development or extend it to evaluate similar tasks, such as document geocoding.
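
To make the idea of synthesizing low-quality address input concrete, here is a minimal sketch in Python. The two perturbations shown (abbreviating a street suffix and deleting one character) are hypothetical illustrations chosen for this example; the abstract does not enumerate the benchmark's 21 error types, and the function names and sample address are assumptions, not part of the published code.

```python
# Hypothetical perturbations for illustration only; they are NOT the
# 21 error types defined by the benchmark described above.
SUFFIX_ABBREVIATIONS = {"Street": "St", "Avenue": "Ave", "Boulevard": "Blvd"}


def abbreviate_suffix(address: str) -> str:
    """Replace a spelled-out street suffix with a common abbreviation."""
    return " ".join(SUFFIX_ABBREVIATIONS.get(tok, tok) for tok in address.split())


def drop_character(address: str, index: int) -> str:
    """Simulate a typo by deleting the character at position `index`."""
    return address[:index] + address[index + 1:]


clean = "123 Main Street Springfield IL 62701"  # made-up sample address
print(abbreviate_suffix(clean))
print(drop_character(clean, 5))
```

A real pipeline would presumably sample perturbations according to their observed frequency in the input logs; this sketch applies each one deterministically so the effect is easy to inspect.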

    Bargaining in the Shadow of Eminent Domain: Valuing and Apportioning Condemnation Awards Between Landlord and Tenant

    Who has a constitutionally protected property interest when the government condemns land subject to a lease? Is it the landlord? The tenant? Or do both parties have property rights that entitle them to compensation? Further, how should the size of the total condemnation award be determined? Should we value the property rights of the landlord and the tenant separately and sum them? Or should we value the entire parcel as if it were an undivided fee simple and apportion the award between the landlord and the tenant? If the condemnation award is based on the value of a fee simple and apportioned, who should make this division? Is this an issue of constitutional law as to which the courts have the final say? Or do the principles of constitutional law enunciated by the courts merely provide default rules, i.e., rules that apply only if the parties fail to address the issue of compensation in the lease? In this article, we offer a normative framework for answering these questions. Our approach evolved by working backwards. We started with the question of how to apportion condemnation awards between landlord and tenant. Why, we asked, should courts do the division? Why not let the parties do it themselves? Insofar as commercial leases are concerned, all the prerequisites for efficient bargaining would seem to be present here: a small number of parties (two), an established vehicle for conducting the negotiations (the lease), and both parties typically represented by counsel. Furthermore, provided the issue is addressed in the lease – before condemnation takes place – there should be no problem of ex post strategic behavior.

    An effective and efficient approach for manually improving geocoded data

    Background: The process of geocoding produces output coordinates of varying degrees of quality. Previous studies have revealed that simply excluding records with low-quality geocodes from analysis can introduce significant bias, but depending on the number and severity of the inaccuracies, their inclusion may also lead to bias. Little quantitative research has been presented on the cost and/or effectiveness of correcting geocodes through manual interactive processes, so the most cost-effective methods for improving geocoded data are unclear. The present work investigates the time and effort required to correct geocodes contained in five health-related datasets that represent examples of data commonly used in Health GIS. Results: Geocode correction was attempted on five health-related datasets containing a total of 22,317 records. The complete processing of these data took 11.4 weeks (427 hours), averaging 69 seconds of processing time per record. Overall, the geocodes associated with 12,280 (55%) of the records were successfully improved, taking 95 seconds of processing time per corrected record on average across all five datasets. Geocode correction improved the overall match rate (the number of successful matches out of the total attempted) from 79.3% to 95%. The spatial shift between the locations of the original successfully matched geocodes and their corrected counterparts averaged 9.9 km per corrected record. After geocode correction, the numbers of city-accuracy and USPS ZIP code-accuracy geocodes were reduced from 10,959 and 1,031 to 6,284 and 200, respectively, while the number of building-centroid-accuracy geocodes increased from 0 to 2,261. Conclusion: The results indicate that manual geocode correction using a web-based interactive approach is a feasible and cost-effective method for improving the quality of geocoded data. The level of effort required varies depending on the type of data geocoded. These results can be used to choose between data improvement options (e.g., manual intervention, pseudocoding/geo-imputation, field GPS readings).
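
As a quick consistency check on the figures reported in this abstract, the following Python sketch recomputes the average processing time per record and the share of records improved from the stated totals. The variable names are assumptions introduced for illustration; only the three input numbers come from the abstract.

```python
# Totals as reported in the abstract above.
records_total = 22_317      # records on which correction was attempted
hours_total = 427           # total processing time in hours
records_improved = 12_280   # records whose geocodes were improved

# Average processing time per attempted record, in seconds.
seconds_per_record = hours_total * 3600 / records_total

# Share of attempted records that were successfully improved.
share_improved = records_improved / records_total

print(f"{seconds_per_record:.1f} s per record")
print(f"{share_improved:.1%} of records improved")
```

Both values round to the reported 69 seconds per record and 55%. The 95 seconds per corrected record is described as an average across the five datasets, so it is not expected to follow from these totals alone.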

    Probing leptoquark production at IceCube

    We emphasize the inelasticity distribution of events detected at the IceCube neutrino telescope as an important tool for revealing new physics. This is possible because the unique energy resolution at this facility allows one to separately assign the energy fractions for emergent muons and taus in neutrino interactions. As a particular example, we explore the possibility of probing the second- and third-generation leptoquark parameter space (coupling and mass). We show that production of leptoquarks with masses ≳ 250 GeV and diagonal generation couplings of O(1) can be directly tested if the cosmic neutrino flux is at the Waxman-Bahcall level. Comment: Matching version to be published in Phys. Rev.

    Prefrontal Cortex Modulation during Anticipation of Working Memory Demands as Revealed by Magnetoencephalography

    During the anticipation of task demands, frontal control is involved in the assembly of stimulus-response mappings based on current goals. It is not clear whether prefrontal modulations occur in higher-order cortical regions, likely reflecting cognitive anticipation processes. The goal of this paper was to investigate prefrontal modulation during anticipation of upcoming working memory demands as revealed by magnetoencephalography (MEG). Twenty healthy volunteers underwent MEG while they performed a variation of the Sternberg Working Memory (WM) task. Beta band (14–30 Hz) SAM (Synthetic Aperture Magnetometry) analysis was performed. During the preparatory periods there was an increase in beta power (event-related synchronization) bilaterally in the dorsolateral prefrontal cortex (DLPFC), and in the left inferior prefrontal gyrus, left parietal, and temporal areas. Our results provide support for the hypothesis that, during preparatory states, the prefrontal cortex is important for biasing higher-order brain regions that are going to be engaged in the upcoming task.

    Family Unification, Exotic States and Magnetic Monopoles

    The embedding in SU(4)xSU(3)xSU(3) of the well-studied gauge groups SU(4)xSU(2)xSU(2) and SU(3)xSU(3)xSU(3) naturally leads to family unification as opposed to simple family replication. An inescapable consequence is the predicted existence of (exotic) color-singlet states that carry fractional electric charge. The corresponding magnetic monopoles carry multiple Dirac magnetic charge, can be relatively light (∼10^7–10^13 GeV), and may be present in the galaxy not far below the Parker bound. Comment: 10 pages, Late

    The Neuroscience Information Framework: A Data and Knowledge Environment for Neuroscience

    With support from the Institutes and Centers forming the NIH Blueprint for Neuroscience Research, we have designed and implemented a new initiative for integrating access to and use of Web-based neuroscience resources: the Neuroscience Information Framework. The Framework arises from the expressed need of the neuroscience community for neuroinformatic tools and resources to aid scientific inquiry, builds upon prior development of neuroinformatics by the Human Brain Project and others, and directly derives from the Society for Neuroscience’s Neuroscience Database Gateway. Partnered with the Society, its Neuroinformatics Committee, and volunteer consultant-collaborators, our multi-site consortium has developed: (1) a comprehensive, dynamic inventory of Web-accessible neuroscience resources, (2) an extended and integrated terminology describing resources and contents, and (3) a framework accepting and aiding concept-based queries. Evolving instantiations of the Framework may be viewed at http://nif.nih.gov, http://neurogateway.org, and other sites as they come online.

    Using thermal UAV imagery to model distributed debris thicknesses and sub-debris melt rates on debris-covered glaciers

    Supraglacial debris cover regulates the melt rates of many glaciers in mountainous regions around the world, thereby modifying the availability and quality of downstream water resources. However, the influence of supraglacial debris is often poorly represented within glaciological models, due to the absence of a technique to provide high-precision, spatially continuous measurements of debris thickness. Here, we use high-resolution UAV-derived thermal imagery, in conjunction with local meteorological data, visible UAV imagery and vertically profiled debris temperature time series, to model the spatially distributed debris thickness across a portion of Llaca Glacier in the Cordillera Blanca of Peru. Based on our results, we simulate daily sub-debris melt rates over a 3-month period during 2019. We demonstrate that, by effectively calibrating the radiometric thermal imagery and accounting for temporal and spatial variations in meteorological variables during UAV surveys, thermal UAV data can be used to more precisely represent the highly heterogeneous patterns of debris thickness and sub-debris melt on debris-covered glaciers. Additionally, our results indicate a mean sub-debris melt rate nearly three times greater than the mean melt rate simulated from satellite-derived debris thicknesses, emphasising the importance of acquiring further high-precision debris thickness data for the purposes of investigating glacier-scale melt processes, calibrating regional melt models and improving the accuracy of runoff predictions

    The influence of abiotic factors on the presence of European corn borer (Ostrinia nubilalis Hübner)

    Field experiments with a natural population of European corn borer (ECB) were conducted over three growing seasons (2012-2014) at the Agricultural Institute in Osijek. The aim of this study was to determine the effect of different levels of irrigation and nitrogen fertilization and of genotype on the occurrence of European corn borer and the damage its larvae cause to maize plants, as well as the relationship between larval feeding and the nitrogen concentration, silicon concentration, and C/N ratio in the maize leaf. At the end of each growing season, ear weight (g), stalk tunnel length (cm), ear shank damage (cm), the number of larvae in the stalk, the number of larvae in the ear shank, and the total number of larvae per plant were recorded. In 2014, which had lower temperatures and more precipitation than the other two years studied, a significantly lower ECB attack was observed. Pheromone traps showed that the Z-type European corn borer is dominant in eastern Slavonia. Higher soil water content reduced plant damage, while higher levels of fertilization increased damage from larval feeding. The hybrids differed in their resistance to larval damage: the C4 hybrid was the most resistant, while C1 was the most susceptible. Nitrogen and silicon concentrations in the maize leaf were negatively correlated, as were nitrogen concentration and the C/N ratio. Hybrid resistance did not depend exclusively on nitrogen and silicon concentrations, although most hybrids showed greater damage at higher nitrogen concentrations and less damage at higher silicon concentrations.

    Use of attribute association error probability estimates to evaluate quality of medical record geocodes

    BACKGROUND: The utility of patient attributes associated with the spatiotemporal analysis of medical records lies not just in their values but also the strength of association between them. Estimating the extent to which a hierarchy of conditional probability exists between patient attribute associations such as patient identifying fields, patient and date of diagnosis, and patient and address at diagnosis is fundamental to estimating the strength of association between patient and geocode, and patient and enumeration area. We propose a hierarchy for the attribute associations within medical records that enable spatiotemporal relationships. We also present a set of metrics that store attribute association error probability (AAEP), to estimate error probability for all attribute associations upon which certainty in a patient geocode depends. METHODS: A series of experiments were undertaken to understand how error estimation could be operationalized within health data and what levels of AAEP in real data reveal themselves using these methods. Specifically, the goals of this evaluation were to (1) assess if the concept of our error assessment techniques could be implemented by a population-based cancer registry; (2) apply the techniques to real data from a large health data agency and characterize the observed levels of AAEP; and (3) demonstrate how detected AAEP might impact spatiotemporal health research. RESULTS: We present an evaluation of AAEP metrics generated for cancer cases in a North Carolina county. We show examples of how we estimated AAEP for selected attribute associations and circumstances. We demonstrate the distribution of AAEP in our case sample across attribute associations, and demonstrate ways in which disease registry specific operations influence the prevalence of AAEP estimates for specific attribute associations. 
CONCLUSIONS: The effort to detect and store estimates of AAEP is worthwhile because of the increase in confidence fostered by the attribute-association-level approach to the assessment of uncertainty in patient geocodes, relative to existing geocoding-related uncertainty metrics.