
    Fusing Data with Correlations

    Many applications rely on Web data and extraction systems to accomplish knowledge-driven tasks. Web information is not curated, so many sources provide inaccurate or conflicting information. Moreover, extraction systems introduce additional noise to the data. We wish to automatically distinguish correct data from erroneous data in order to create a cleaner set of integrated data. Previous work has shown that a naïve voting strategy that trusts data provided by the majority, or by at least a certain number of sources, may not work well in the presence of copying between the sources. However, correlation between sources can be much broader than copying: sources may provide data from complementary domains (negative correlation), extractors may focus on different types of information (negative correlation), and extractors may apply common rules in extraction (positive correlation, without copying). In this paper we present novel techniques for modeling correlations between sources and applying them in truth finding. Comment: Sigmod'201

    Absorbent products for urinary/faecal incontinence: a comparative evaluation of key product designs

    Background: The UK health service, nursing homes and public spend around £94 million per year on incontinence pads (absorbent products) to contain urine and/or faeces, but the research base for making informed choices between different product designs is very weak. Objectives: The aim of this trial was to compare the performance and cost-effectiveness of the key absorbent product designs to provide a more solid basis for guiding selection and purchase. A further aim was to carry out the first stage in the development of a quality of life instrument for measuring the impact of absorbent product use on users' lives. Design: The work involved three clinical trials focusing on the three biggest market sectors. Each trial had a similar crossover design in which each participant tested all products within their group in random order. Settings, participants and methods: In Trial 1, 85 women with light urinary incontinence living in the community tested three products from each of the four design categories available (total of 12 test products): disposable inserts (pads); menstrual pads; washable pants with integral pad; and washable inserts. In Trial 2a, 85 moderate/heavily incontinent adults (urinary or urinary/faecal) living in the community (49 men and 36 women) tested three (or two) products from each of the five design categories available (total of 14 test products): disposable inserts (with mesh pants); disposable diapers (nappies); disposable pull-ups (similar to toddlers' trainer pants); disposable T-shaped diapers (nappies with waist-band); and washable diapers. All products were provided in a daytime and a (mostly more absorbent) night-time variant. In these first two trials, the test products were selected on the basis of data from pilot studies. In Trial 2b, 100 moderate/heavily incontinent adults (urinary or urinary/faecal) living in 10 nursing homes (27 men and 73 women) evaluated one product from each of the four disposable design categories from Trial 2a. 
Products were selected on the basis of product performance in Trial 2a and, again, daytime and night-time variants were provided. The first phase of work to develop a quality of life tool for measuring the impact of using different pad designs was carried out by interviewing participants from Trials 1 and 2a. Outcome measures: Product performance was characterised using validated questionnaires, which asked the participants (in Trials 1 and 2a) or carers (all participants in Trial 2b, except for the few who could report for themselves) to evaluate various aspects of pad performance (leakage, ease of putting on, discreetness, etc.) using a five-point scale (very good–very poor) at the end of the week (or 2 weeks for Trial 2b) of product testing. In addition, participants/carers were asked to save individual used pads in bags for weighing and to indicate the severity of any leakage from them on a three-point scale (none, a little, a lot). These data were used to determine differences in leakage performance. Numbers of laundry items and pads used were recorded to estimate costs, and skin health changes were recorded by the participant or by the researchers (Trial 2b). At the end of testing, participants were interviewed and ranked their preferences (with and without costs), stated the acceptability of each design (highly acceptable–totally unacceptable) and recorded their overall opinion on a visual analogue scale (VAS) of 0–100 points (worst design–best design). This VAS score was used with product costs to estimate cost-effectiveness. In addition, a timed pad changing exercise was conducted with 10 women from Trial 2b to determine any differences between product designs. Results: Results presented are for statistically and clinically significant findings.

    Document Filtering for Long-tail Entities

    Filtering relevant documents with respect to entities is an essential task in the context of knowledge base construction and maintenance. It entails processing a time-ordered stream of documents that might be relevant to an entity in order to select only those that contain vital information. State-of-the-art approaches to document filtering for popular entities are entity-dependent: they rely on, and are trained on, the differentiating features of each specific entity. Moreover, these approaches tend to use so-called extrinsic information, such as Wikipedia page views and related entities, which is typically available only for popular head entities. Entity-dependent approaches based on such signals are therefore ill-suited as filtering methods for long-tail entities. In this paper we propose a document filtering method for long-tail entities that is entity-independent and thus also generalizes to unseen or rarely seen entities. It is based on intrinsic features, i.e., features that are derived from the documents in which the entities are mentioned. We propose a set of features that capture informativeness, entity-saliency, and timeliness. In particular, we introduce features based on entity aspect similarities, relation patterns, and temporal expressions and combine these with standard features for document filtering. Experiments following the TREC KBA 2014 setup on a publicly available dataset show that our model is able to improve the filtering performance for long-tail entities over several baselines. Results of applying the model to unseen entities are promising, indicating that the model is able to learn the general characteristics of a vital document. The overall performance across all entities (i.e., not just long-tail entities) improves upon the state-of-the-art without depending on any entity-specific training data. Comment: CIKM2016, Proceedings of the 25th ACM International Conference on Information and Knowledge Management. 201

    Patterns of depredation in the Hawai‘i deep-set longline fishery informed by fishery and false killer whale behavior

    False killer whales (Pseudorca crassidens) depredate bait and catch in the Hawai‘i-based deep-set longline fishery, and as a result, this species is hooked or entangled more than any other cetacean in this fishery. We analyzed data collected by fisheries observers and from satellite-linked transmitters deployed on false killer whales to identify patterns of odontocete depredation that could help fishermen avoid overlap with whales. Odontocete depredation was observed on ~6% of deep-set hauls across the fleet from 2004 to 2018. Model outcomes from binomial GAMMs suggested coarse patterns, for example, higher rates of depredation in winter, at lower latitudes, and with higher fishing effort. However, explanatory power was low, and no covariates were identified that could be used in a predictive context. The best indicator of depredation was the occurrence of depredation on a previous set of the same vessel. We identified spatiotemporal scales of this repeat depredation to provide guidance to fishermen on how far to move or how long to wait to reduce the probability of repeated interactions. The risk of depredation decreased with both space and time from a previous occurrence, with the greatest benefits achieved by moving ~400 km or waiting ~9 d, which reduced the occurrence of depredation from 18% to 9% (a 50% reduction). Fishermen moved a median 46 km and waited 4.7 h following an observed depredation interaction, which our analysis suggests is unlikely to lead to large reductions in risk. Satellite-tagged pelagic false killer whales moved up to 75 km in 4 h and 335 km in 24 h, suggesting that they can likely keep pace with longline vessels for at least four hours and likely longer. We recommend fishermen avoid areas of known depredation or bycatch by moving as far and as quickly as practical, especially within a day or two of the depredation or bycatch event. 
We also encourage captains to communicate depredation and bycatch occurrence to enable other vessels to similarly avoid high-risk areas.

    Reserves and trade jointly determine exposure to food supply shocks

    While a growing proportion of global food consumption is obtained through international trade, there is an ongoing debate on whether this increased reliance on trade benefits or hinders food security, and specifically, the ability of global food systems to absorb shocks due to local or regional losses of production. This paper introduces a model that simulates the short-term response to a food supply shock originating in a single country, which is partly absorbed through decreases in domestic reserves and consumption, and partly transmitted through the adjustment of trade flows. By applying the model to publicly available data for the cereals commodity group over a 17-year period, we find that differential outcomes of supply shocks simulated through this time period are driven not only by the intensification of trade, but just as importantly by changes in the distribution of reserves. Our analysis also identifies countries where trade dependency may accentuate the risk of food shortages from foreign production shocks; such risk could be reduced by increasing domestic reserves or importing food from a diversity of suppliers that possess their own reserves. This simulation-based model provides a framework to study the short-term, nonlinear and out-of-equilibrium response of trade networks to supply shocks, and could be applied to specific scenarios of environmental or economic perturbations.

    Dynamics of two laterally coupled semiconductor lasers: strong- and weak-coupling theory.

    Copyright © 2008 The American Physical Society. The stability and nonlinear dynamics of two semiconductor lasers coupled side to side via evanescent waves are investigated by using three different models. In the composite-cavity model, the coupling between the lasers is accurately taken into account by calculating electric field profiles (composite-cavity modes) of the whole coupled-laser system. A bifurcation analysis of the composite-cavity model uncovers how different types of dynamics, including stationary phase-locking, periodic, quasiperiodic, and chaotic intensity oscillations, are organized. In the individual-laser model, the coupling between individual lasers is introduced phenomenologically with ad hoc coupling terms. Comparison with the composite-cavity model reveals drastic differences in the dynamics. To identify the causes of these differences, we derive a coupled-laser model with coupling terms which are consistent with the solution of the wave equation and the relevant boundary conditions. This coupled-laser model reproduces the dynamics of the composite-cavity model under weak-coupling conditions.

    Global Carbon Budget 2015

    Accurate assessment of anthropogenic carbon dioxide (CO2) emissions and their redistribution among the atmosphere, ocean, and terrestrial biosphere is important to better understand the global carbon cycle, support the development of climate policies, and project future climate change. Here we describe data sets and a methodology to quantify all major components of the global carbon budget, including their uncertainties, based on the combination of a range of data, algorithms, statistics, and model estimates and their interpretation by a broad scientific community. We discuss changes compared to previous estimates as well as consistency within and among components, alongside methodology and data limitations. CO2 emissions from fossil fuels and industry (E-FF) are based on energy statistics and cement production data, while emissions from land-use change (E-LUC), mainly deforestation, are based on combined evidence from land-cover-change data, fire activity associated with deforestation, and models. The global atmospheric CO2 concentration is measured directly and its rate of growth (G(ATM)) is computed from the annual changes in concentration. The mean ocean CO2 sink (S-OCEAN) is based on observations from the 1990s, while the annual anomalies and trends are estimated with ocean models. The variability in S-OCEAN is evaluated with data products based on surveys of ocean CO2 measurements. The global residual terrestrial CO2 sink (S-LAND) is estimated by the difference of the other terms of the global carbon budget and compared to results of independent dynamic global vegetation models forced by observed climate, CO2, and land-cover change (some including nitrogen-carbon interactions). We compare the mean land and ocean fluxes and their variability to estimates from three atmospheric inverse methods for three broad latitude bands. 
All uncertainties are reported as +/- 1 sigma, reflecting the current capacity to characterise the annual estimates of each component of the global carbon budget. For the last decade available (2005-2014), E-FF was 9.0 +/- 0.5 GtC yr(-1), E-LUC was 0.9 +/- 0.5 GtC yr(-1), G(ATM) was 4.4 +/- 0.1 GtC yr(-1), S-OCEAN was 2.6 +/- 0.5 GtC yr(-1), and S-LAND was 3.0 +/- 0.8 GtC yr(-1). For the year 2014 alone, E-FF grew to 9.8 +/- 0.5 GtC yr(-1), 0.6% above 2013, continuing the growth trend in these emissions, albeit at a slower rate compared to the average growth of 2.2% yr(-1) that took place during 2005-2014. Also, for 2014, E-LUC was 1.1 +/- 0.5 GtC yr(-1), G(ATM) was 3.9 +/- 0.2 GtC yr(-1), S-OCEAN was 2.9 +/- 0.5 GtC yr(-1), and S-LAND was 4.1 +/- 0.9 GtC yr(-1). G(ATM) was lower in 2014 compared to the past decade (2005-2014), reflecting a larger S-LAND for that year. The global atmospheric CO2 concentration reached 397.15 +/- 0.10 ppm averaged over 2014. For 2015, preliminary data indicate that the growth in E-FF will be near or slightly below zero, with a projection of -0.6 [range of -1.6 to +0.5] %, based on national emissions projections for China and the USA, and projections of gross domestic product corrected for recent changes in the carbon intensity of the global economy for the rest of the world. From this projection of E-FF and assumed constant E-LUC for 2015, cumulative emissions of CO2 will reach about 555 +/- 55 GtC (2035 +/- 205 GtCO(2)) for 1870-2015, about 75% from E-FF and 25% from E-LUC. This living data update documents changes in the methods and data sets used in this new carbon budget compared with previous publications of this data set (Le Quere et al., 2015, 2014, 2013). All observations presented here can be downloaded from the Carbon Dioxide Information Analysis Center (doi: 10.3334/CDIAC/GCP_2015).