Is ChatGPT a game changer for geocoding -- a benchmark for geocoding address parsing techniques
The remarkable success of GPT models across various tasks, including toponym
recognition, motivates us to assess the performance of the GPT-3 model on the
geocoding address parsing task. To ensure that the evaluation more accurately
mirrors performance in real-world scenarios with diverse user input qualities
and to address the pressing need for a 'gold standard' evaluation dataset for
geocoding systems, we introduce a benchmark dataset of low-quality address
descriptions synthesized from human input patterns mined from actual input
logs of a geocoding system in production. This dataset covers 21 different input
errors and variations; contains over 239,000 address records uniquely
selected from streets across all 50 U.S. states and D.C.; and consists of three
subsets to be used as training, validation, and testing sets. Building on this,
we train and gauge the performance of the GPT-3 model in extracting address
components, contrasting it with transformer-based and LSTM-based
models. The evaluation results indicate that the Bidirectional LSTM-CRF model
achieves the best performance, with the transformer-based models producing
closely comparable results. The GPT-3 model, though trailing in
performance, showcases potential in the address parsing task with few-shot
examples, exhibiting room for improvement with additional fine-tuning. We
open-source the code and data of this benchmark so that researchers can use it
for future model development or extend it to evaluate similar tasks, such as
document geocoding.
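To illustrate what synthesizing low-quality addresses from error patterns can look like, here is a minimal Python sketch that injects a few error types into a clean address string. The injector functions and error types are assumptions for illustration only; they do not reproduce the benchmark's actual 21 error categories.

```python
import random

# Hypothetical error injectors, loosely inspired by the idea of synthesizing
# low-quality addresses from real input-error patterns.
def drop_zip(addr: str) -> str:
    """Remove a trailing 5-digit ZIP code, if present."""
    head, _, tail = addr.rpartition(" ")
    return head if len(tail) == 5 and tail.isdigit() else addr

def abbreviate(addr: str) -> str:
    """Replace common street-type words with USPS-style abbreviations."""
    for full, abbr in {"Street": "St", "Avenue": "Ave", "Boulevard": "Blvd"}.items():
        addr = addr.replace(full, abbr)
    return addr

def transpose_typo(addr: str, rng: random.Random) -> str:
    """Swap one pair of adjacent characters to mimic a keyboard slip."""
    i = rng.randrange(len(addr) - 1)
    return addr[:i] + addr[i + 1] + addr[i] + addr[i + 2:]

def synthesize_low_quality(addr: str, seed: int = 0) -> str:
    """Apply a random subset of error injectors to a clean address."""
    rng = random.Random(seed)
    for inject in (drop_zip, abbreviate):
        if rng.random() < 0.5:
            addr = inject(addr)
    return transpose_typo(addr, rng)

print(synthesize_low_quality("1600 Pennsylvania Avenue NW, Washington, DC 20500"))
```

A parser evaluated against addresses degraded this way sees inputs closer to what real users type into a production geocoder than clean reference addresses would provide.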
Bargaining in the Shadow of Eminent Domain: Valuing and Apportioning Condemnation Awards Between Landlord and Tenant
Who has a constitutionally protected property interest when the government condemns land subject to a lease? Is it the landlord? The tenant? Or do both parties have property rights that entitle them to compensation? Further, how should the size of the total condemnation award be determined? Should we value the property rights of the landlord and the tenant separately and sum? Or should we value the entire parcel as if it were an undivided fee simple and apportion the award between the landlord and the tenant? If the condemnation award is based on the value of a fee simple and apportioned, who should make this division? Is this an issue of constitutional law as to which the courts have the final say? Or do the principles of constitutional law enunciated by the courts merely provide default rules, i.e., rules that apply only if the parties fail to address the issue of compensation in the lease?
In this article, we offer a normative framework for answering these questions. Our approach evolved by working backwards. We started with the question of how to apportion condemnation awards between landlord and tenant. Why, we asked, should courts do the division? Why not let the parties do it themselves? Insofar as commercial leases are concerned, all the prerequisites for efficient bargaining would seem to be present here: a small number of parties (two), an established vehicle for conducting the negotiations (the lease), and both parties typically represented by counsel. Furthermore, provided the issue is addressed in the lease – before condemnation takes place – there should be no problem of ex post strategic behavior.
An effective and efficient approach for manually improving geocoded data
BACKGROUND: The process of geocoding produces output coordinates of varying degrees of quality. Previous studies have revealed that simply excluding records with low-quality geocodes from analysis can introduce significant bias, but depending on the number and severity of the inaccuracies, their inclusion may also lead to bias. Little quantitative research has been presented on the cost and/or effectiveness of correcting geocodes through manual interactive processes, so the most cost-effective methods for improving geocoded data are unclear. The present work investigates the time and effort required to correct geocodes contained in five health-related datasets that represent examples of data commonly used in Health GIS. RESULTS: Geocode correction was attempted on five health-related datasets containing a total of 22,317 records. The complete processing of these data took 11.4 weeks (427 hours), averaging 69 seconds of processing time per record. Overall, the geocodes associated with 12,280 (55%) of the records were successfully improved, taking an average of 95 seconds of processing time per corrected record across all five datasets. Geocode correction improved the overall match rate (the number of successful matches out of the total attempted) from 79.3% to 95%. The spatial shift between the locations of the original successfully matched geocodes and their corrected counterparts averaged 9.9 km per corrected record. After geocode correction, the number of city- and USPS ZIP code-accuracy geocodes was reduced from 10,959 and 1,031 to 6,284 and 200, respectively, while the number of building-centroid-accuracy geocodes increased from 0 to 2,261. CONCLUSION: The results indicate that manual geocode correction using a web-based interactive approach is a feasible and cost-effective method for improving the quality of geocoded data. The level of effort required varies depending on the type of data geocoded. These results can be used to choose between data improvement options (e.g., manual intervention, pseudocoding/geo-imputation, field GPS readings).
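The headline summary statistics here follow directly from the reported counts; a minimal sketch recomputing two of them from the figures given in the abstract (the raw datasets themselves are not used):

```python
# Counts taken from the abstract's reported totals.
total_records = 22_317   # records across all five datasets
corrected = 12_280       # records whose geocodes were improved
total_hours = 427.0      # total interactive processing time

# Mean processing time per record, in seconds.
seconds_per_record = total_hours * 3600 / total_records

# Share of records whose geocodes were successfully improved.
share_corrected = corrected / total_records

print(f"{seconds_per_record:.0f} s per record")      # ≈ 69 s
print(f"{share_corrected:.0%} of records improved")  # ≈ 55%
```

Reproducing summary figures this way is a quick sanity check when comparing manual-correction effort against alternatives such as geo-imputation.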
Probing leptoquark production at IceCube
We emphasize the inelasticity distribution of events detected at the IceCube
neutrino telescope as an important tool for revealing new physics. This is
possible because the unique energy resolution at this facility allows one to
separately assign the energy fractions of emergent muons and taus in neutrino
interactions. As a particular example, we explore the possibility of probing
second and third generation leptoquark parameter space (coupling and mass). We
show that production of leptoquarks with masses \agt 250 GeV and diagonal
generation couplings of O(1) can be directly tested if the cosmic neutrino flux
is at the Waxman-Bahcall level. Comment: Matching version to be published in Phys. Rev.
Prefrontal Cortex Modulation during Anticipation of Working Memory Demands as Revealed by Magnetoencephalography
During the anticipation of task demands, frontal control is involved in assembling stimulus-response mappings based on current goals. It remains unclear whether prefrontal modulations extend to higher-order cortical regions, likely reflecting cognitive anticipation processes. The goal of this paper was to investigate prefrontal modulation during anticipation of upcoming working memory demands, as revealed by magnetoencephalography (MEG). Twenty healthy volunteers underwent MEG while they performed a variation of the Sternberg Working Memory (WM) task. Beta-band (14–30 Hz) SAM (Synthetic Aperture Magnetometry) analysis was performed. During the preparatory periods there was an increase in beta power (event-related synchronization) in the dorsolateral prefrontal cortex (DLPFC) bilaterally, the left inferior prefrontal gyrus, and left parietal and temporal areas. Our results support the hypothesis that, during preparatory states, the prefrontal cortex is important for biasing higher-order brain regions that will be engaged in the upcoming task.
Family Unification, Exotic States and Magnetic Monopoles
The embedding in SU(4)xSU(3)xSU(3) of the well-studied gauge groups
SU(4)xSU(2)xSU(2) and SU(3)xSU(3)xSU(3) naturally leads to family unification
as opposed to simple family replication. An inescapable consequence is the
predicted existence of (exotic) color-singlet states that carry fractional
electric charge. The corresponding magnetic monopoles carry multiple Dirac
magnetic charge, can be relatively light (\sim 10^{7}-10^{13} GeV), and may be
present in the galaxy not far below the Parker bound. Comment: 10 pages, LaTeX.
The Neuroscience Information Framework: A Data and Knowledge Environment for Neuroscience
With support from the Institutes and Centers forming the NIH Blueprint for Neuroscience Research, we have designed and implemented a new initiative for integrating access to and use of Web-based neuroscience resources: the Neuroscience Information Framework. The Framework arises from the expressed need of the neuroscience community for neuroinformatic tools and resources to aid scientific inquiry, builds upon prior development of neuroinformatics by the Human Brain Project and others, and directly derives from the Society for Neuroscience’s Neuroscience Database Gateway. Partnered with the Society, its Neuroinformatics Committee, and volunteer consultant-collaborators, our multi-site consortium has developed: (1) a comprehensive, dynamic inventory of Web-accessible neuroscience resources, (2) an extended and integrated terminology describing resources and contents, and (3) a framework accepting and aiding concept-based queries. Evolving instantiations of the Framework may be viewed at http://nif.nih.gov, http://neurogateway.org, and other sites as they come online.
Using thermal UAV imagery to model distributed debris thicknesses and sub-debris melt rates on debris-covered glaciers
Supraglacial debris cover regulates the melt rates of many glaciers in mountainous regions around the world, thereby modifying the availability and quality of downstream water resources. However, the influence of supraglacial debris is often poorly represented within glaciological models, due to the absence of a technique to provide high-precision, spatially continuous measurements of debris thickness. Here, we use high-resolution UAV-derived thermal imagery, in conjunction with local meteorological data, visible UAV imagery and vertically profiled debris temperature time series, to model the spatially distributed debris thickness across a portion of Llaca Glacier in the Cordillera Blanca of Peru. Based on our results, we simulate daily sub-debris melt rates over a 3-month period during 2019. We demonstrate that, by effectively calibrating the radiometric thermal imagery and accounting for temporal and spatial variations in meteorological variables during UAV surveys, thermal UAV data can be used to more precisely represent the highly heterogeneous patterns of debris thickness and sub-debris melt on debris-covered glaciers. Additionally, our results indicate a mean sub-debris melt rate nearly three times greater than the mean melt rate simulated from satellite-derived debris thicknesses, emphasising the importance of acquiring further high-precision debris thickness data for the purposes of investigating glacier-scale melt processes, calibrating regional melt models and improving the accuracy of runoff predictions.
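The abstract does not specify the melt model used. Purely as an illustration of why debris thickness matters for melt, here is a sketch of a common simplified hyperbolic (Østrem-style) relation, in which sub-debris melt falls off with increasing debris thickness; the characteristic-thickness parameter below is an assumed value, not one from this study.

```python
def sub_debris_melt(clean_ice_melt_mm_day: float,
                    debris_thickness_m: float,
                    h_star: float = 0.05) -> float:
    """Illustrative hyperbolic melt reduction:
    melt = clean-ice melt * h* / (h* + h),
    where h* is a characteristic debris thickness (assumed, ~5 cm here).
    Thin debris barely suppresses melt; thick debris insulates the ice."""
    return clean_ice_melt_mm_day * h_star / (h_star + debris_thickness_m)

# Thicker debris strongly suppresses melt relative to thin debris.
print(sub_debris_melt(40.0, 0.02))  # thin (2 cm) debris layer
print(sub_debris_melt(40.0, 0.50))  # thick (50 cm) debris layer
```

A distributed debris-thickness map, such as one derived from thermal UAV imagery, would feed a per-pixel thickness into a relation of this kind (or a fuller energy-balance model) to produce spatially distributed melt rates.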
The influence of abiotic factors on the presence of European corn borer (Ostrinia nubilalis Hübner)
Field experiments with a natural population of ECB were conducted over three growing seasons (2012-2014) at the Agricultural Institute in Osijek.
The aim of this study was to determine the effect of different levels of irrigation and nitrogen fertilization, and of various genotypes, on the occurrence of European corn borer larvae and the damage they cause to maize plants, as well as the relationship between larval feeding and nitrogen and silicon content and the C/N ratio. At the end of each growing season, ear weight (g), tunnel length in the stalk (cm), ear-shank damage (cm), the number of larvae in the corn stalk, the number of larvae in the ear shank, and the total number of larvae per plant were recorded. In 2014, with lower temperatures and higher precipitation compared to the other two years studied, a significantly lower ECB attack was recorded. Pheromone traps showed the dominance of the Z-type European corn borer in the area of eastern Slavonia. Damage from larvae decreased with increasing soil water content, while larval feeding activity increased with increasing nitrogen fertilization. The hybrids differed in their resistance to larval damage: genotype C4 was the most resistant, while C1 was the most susceptible. The concentrations of nitrogen and silicon in the maize leaf were negatively correlated, as were nitrogen concentration and the C/N ratio. Hybrid resistance did not depend solely on nitrogen and silicon concentrations, although most hybrids showed greater damage at higher nitrogen concentrations and less damage at higher silicon concentrations.
Use of attribute association error probability estimates to evaluate quality of medical record geocodes
BACKGROUND: The utility of patient attributes associated with the spatiotemporal analysis of medical records lies not just in their values but also in the strength of the associations between them. Estimating the extent to which a hierarchy of conditional probability exists among patient attribute associations, such as patient identifying fields, patient and date of diagnosis, and patient and address at diagnosis, is fundamental to estimating the strength of association between patient and geocode, and between patient and enumeration area. We propose a hierarchy for the attribute associations within medical records that enable spatiotemporal relationships. We also present a set of metrics that store attribute association error probability (AAEP) to estimate error probability for all attribute associations upon which certainty in a patient geocode depends. METHODS: A series of experiments were undertaken to understand how error estimation could be operationalized within health data and what levels of AAEP reveal themselves in real data using these methods. Specifically, the goals of this evaluation were to (1) assess whether our error assessment techniques could be implemented by a population-based cancer registry; (2) apply the techniques to real data from a large health data agency and characterize the observed levels of AAEP; and (3) demonstrate how detected AAEP might impact spatiotemporal health research. RESULTS: We present an evaluation of AAEP metrics generated for cancer cases in a North Carolina county. We show examples of how we estimated AAEP for selected attribute associations and circumstances. We demonstrate the distribution of AAEP in our case sample across attribute associations, and show ways in which registry-specific operations influence the prevalence of AAEP estimates for specific attribute associations.
CONCLUSIONS: The effort to detect and store estimates of AAEP is worthwhile because of the increased confidence fostered by the attribute-association-level approach to assessing uncertainty in patient geocodes, relative to existing geocoding-related uncertainty metrics.
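The intuition behind a hierarchy of attribute associations can be sketched as follows. This is an illustration, not the paper's actual metric definitions: if a patient geocode depends on a chain of associations (e.g., patient to identity, patient to diagnosis date, patient to address, address to geocode), each with an estimated error probability, and the errors are assumed independent, then the confidence in the geocode is the product of the per-association success probabilities.

```python
# Illustrative confidence propagation over a chain of attribute
# associations; the four example probabilities below are made up.
def geocode_confidence(aaep_chain: list[float]) -> float:
    """aaep_chain: estimated error probability for each attribute
    association the geocode depends on. Assumes independent errors."""
    confidence = 1.0
    for p_error in aaep_chain:
        confidence *= (1.0 - p_error)
    return confidence

# e.g. identity, diagnosis-date, address, and geocoding associations
print(geocode_confidence([0.01, 0.02, 0.05, 0.03]))  # ≈ 0.894
```

Under this framing, a single weak link (a high-AAEP association) dominates the overall uncertainty in the geocode, which is why estimating error at the level of individual attribute associations is more informative than a single geocode-quality flag.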