
    Quality and complexity measures for data linkage and deduplication

    Summary. Deduplicating one data set or linking several data sets are increasingly important tasks in the data preparation steps of many data mining projects. The aim of such linkages is to match all records relating to the same entity. Research interest in this area has increased in recent years, with techniques originating from statistics, machine learning, information retrieval, and database research being combined and applied to improve the linkage quality, as well as to increase performance and efficiency when linking or deduplicating very large data sets. Different measures have been used to characterise the quality and complexity of data linkage algorithms, and several new metrics have been proposed. An overview of the issues involved in measuring data linkage and deduplication quality and complexity is presented in this chapter. It is shown that measures in the space of record pair comparisons can produce deceptive quality results. Various measures are discussed and recommendations are given on how to assess data linkage and deduplication quality and complexity. Key words: data or record linkage, data integration and matching, deduplication, data mining pre-processing, quality and complexity measures
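The claim that measures in the space of record pair comparisons can be deceptive can be illustrated with a small sketch. The counts below are hypothetical and are not taken from the chapter; the function name `pair_space_measures` is likewise an assumption made for illustration. When deduplicating n records there are n(n-1)/2 candidate pairs, almost all of which are true non-matches, so accuracy is dominated by true negatives, whereas precision, recall and the f-measure ignore them.

```python
def pair_space_measures(n_records, true_pos, false_pos, false_neg):
    """Quality measures for deduplication, computed over all record pairs.

    All counts are hypothetical inputs chosen for illustration only.
    """
    # Deduplicating n records compares every unordered pair once.
    total_pairs = n_records * (n_records - 1) // 2
    # Every pair that is not a TP, FP or FN is a true negative.
    true_neg = total_pairs - true_pos - false_pos - false_neg

    accuracy = (true_pos + true_neg) / total_pairs
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    f_measure = 2 * precision * recall / (precision + recall)
    return {
        "total_pairs": total_pairs,
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Hypothetical linkage result: 300 correct matches, 200 false matches,
# 200 missed matches among 10,000 records.
measures = pair_space_measures(
    n_records=10_000, true_pos=300, false_pos=200, false_neg=200
)
for name, value in measures.items():
    print(f"{name}: {value}")
```

Under these assumed counts the accuracy is very close to 1 even though four in ten declared matches are wrong and four in ten true matches are missed, which is why measures that exclude true negatives tend to be more informative here.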

    Heterogeneous reaction of ClONO₂ with TiO₂ and SiO₂ aerosol particles: implications for stratospheric particle injection for climate engineering

    Deliberate injection of aerosol particles into the stratosphere is a potential climate engineering scheme. Particles injected into the stratosphere would scatter solar radiation back to space, thereby reducing the temperature at the Earth's surface and hence the impacts of global warming. Minerals such as TiO₂ or SiO₂ are among the potentially suitable aerosol materials for stratospheric particle injection due to their greater light-scattering ability than stratospheric sulfuric acid particles. However, the heterogeneous reactivity of mineral particles towards trace gases important for stratospheric chemistry largely remains unknown, precluding reliable assessment of their impacts on stratospheric ozone, which is of key environmental significance. In this work we have investigated for the first time the heterogeneous hydrolysis of ClONO₂ on TiO₂ and SiO₂ aerosol particles at room temperature and at different relative humidities (RHs), using an aerosol flow tube. The uptake coefficient, γ(ClONO₂), on TiO₂ was ∼1.2 × 10⁻³ at 7 % RH and remained unchanged at 33 % RH, and increased for SiO₂ from ∼2 × 10⁻⁴ at 7 % RH to ∼5 × 10⁻⁴ at 35 % RH, reaching a value of ∼6 × 10⁻⁴ at 59 % RH. We have also examined the impacts of a hypothetical TiO₂ injection on stratospheric chemistry using the UKCA (United Kingdom Chemistry and Aerosol) chemistry–climate model, in which heterogeneous hydrolysis of N₂O₅ and ClONO₂ on TiO₂ particles is considered. A TiO₂ injection scenario with a solar-radiation scattering effect very similar to the eruption of Mt Pinatubo was constructed. It is found that, compared to the eruption of Mt Pinatubo, TiO₂ injection causes less ClOₓ activation and less ozone destruction in the lowermost stratosphere, while reduced depletion of N₂O₅ and NOₓ in the middle stratosphere results in decreased ozone levels. Overall, no significant difference in the vertically integrated ozone abundances is found between TiO₂ injection and the eruption of Mt Pinatubo. Future work required to further assess the impacts of TiO₂ injection on stratospheric chemistry is also discussed.
    Financial support provided by EPSRC grant EP/I01473X/1 and the Isaac Newton Trust (Trinity College, University of Cambridge, UK) is acknowledged. We thank NCAS-CMS for modelling support. Model integrations have been performed using the ARCHER UK National Supercomputing Service. We acknowledge the ERC for support through the ACCI project (project number: 267760). M. J. Tang would like to thank the CAS Pioneer Hundred Talents programme and State Key Laboratory of Organic Geochemistry for providing starting grants

    The International Collaboration for Research methods Development in Oncology (CReDO) workshops: shaping the future of global oncology research

    Low-income and middle-income countries (LMICs) have a disproportionately high burden of cancer and cancer mortality. The unique barriers to optimum cancer care in these regions necessitate context-specific research. The conduct of research in LMICs has several challenges, not least of which is a paucity of formal training in research methods. Building capacity by training early career researchers is essential to improve research output and cancer outcomes in LMICs. The International Collaboration for Research methods Development in Oncology (CReDO) workshop is an initiative by the Tata Memorial Centre and the National Cancer Grid of India to address gaps in research training and increase capacity in oncology research. Since 2015, there have been five CReDO workshops, which have trained more than 250 oncologists from India and other countries in clinical research methods and protocol development. Participants from all oncology and allied fields were represented at these workshops. Protocols developed included clinical trials, comparative effectiveness studies, health services research, and observational studies, and many of these protocols were particularly relevant to cancer management in LMICs. A follow-up of these participants in 2020 elicited an 88% response rate and showed that 42% of participants had made progress with their CReDO protocols, and 73% had initiated other research protocols and published papers. In this Policy Review, we describe the challenges to research in LMICs, as well as the evolution, structure, and impact of CReDO and other similar workshops on global oncology research

    Meteorological Controls on Local and Regional Volcanic Ash Dispersal

    Volcanic ash has the capacity to impact human health, livestock, crops and infrastructure, including international air traffic. For recent major eruptions, information on the volcanic ash plume has been combined with relatively coarse-resolution meteorological model output to provide simulations of regional ash dispersal, with reasonable success on the scale of hundreds of kilometres. However, to predict and mitigate these impacts locally, significant improvements in modelling capability are required. Here, we present results from a dynamic meteorological-ash-dispersion model configured with sufficient resolution to represent local topographic and convectively-forced flows. We focus on an archetypal volcanic setting, Soufrière, St Vincent, and use the exceptional historical records of the 1902 and 1979 eruptions to challenge our simulations. We find that the evolution and characteristics of ash deposition on St Vincent and nearby islands can be accurately simulated when the wind shear associated with the trade wind inversion and topographically-forced flows are represented. The wind shear plays a primary role and topographic flows a secondary role on ash distribution on local to regional scales. We propose a new explanation for the downwind ash deposition maxima, commonly observed in volcanic eruptions, as resulting from the detailed forcing of mesoscale meteorology on the ash plume

    Blending of animal colour patterns by hybridization

    Biologists have long been fascinated by the amazing diversity of animal colour patterns. Despite much interest, the underlying evolutionary and developmental mechanisms contributing to their rich variety remain largely unknown, especially the vivid and complex colour patterns seen in vertebrates. Here, we show that complex and camouflaged animal markings can be formed by the 'blending' of simple colour patterns. A mathematical model predicts that crossing between animals having inverted spot patterns (for example, 'light spots on a dark background' and 'dark spots on a light background') will necessarily result in hybrid offspring that have camouflaged labyrinthine patterns as 'blended' intermediate phenotypes. We confirmed the broad applicability of the model prediction by empirical examination of natural and artificial hybrids of salmonid fish. Our results suggest an unexplored evolutionary process by means of 'pattern blending', as one of the possible mechanisms underlying colour pattern diversity and hybrid speciation

    Host Reproductive Phenology Drives Seasonal Patterns of Host Use in Mosquitoes

    Seasonal shifts in host use by mosquitoes from birds to mammals drive the timing and intensity of annual epidemics of mosquito-borne viruses, such as West Nile virus, in North America. The biological mechanism underlying these shifts has been a matter of debate, with hypotheses falling into two camps: (1) the shift is driven by changes in host abundance, or (2) the shift is driven by seasonal changes in the foraging behavior of mosquitoes. Here we explored the idea that seasonal changes in host use by mosquitoes are driven by temporal patterns of host reproduction. We investigated the relationship between seasonal patterns of host use by mosquitoes and host reproductive phenology by examining a seven-year dataset of blood meal identifications from a site in Tuskegee National Forest, Alabama USA and data on reproduction from the most commonly utilized endothermic (white-tailed deer, great blue heron, yellow-crowned night heron) and ectothermic (frogs) hosts. Our analysis revealed that feeding on each host peaked during periods of reproductive activity. Specifically, mosquitoes utilized herons in the spring and early summer, during periods of peak nest occupancy, whereas deer were fed upon most during the late summer and fall, the period corresponding to the peak in births for deer. For frogs, however, feeding on early- and late-season breeders paralleled peaks in male vocalization. We demonstrate for the first time that seasonal patterns of host use by mosquitoes track the reproductive phenology of the hosts. Peaks in relative mosquito feeding on each host during reproductive phases are likely the result of increased tolerance and decreased vigilance to attacking mosquitoes by nestlings and brooding adults (avian hosts), quiescent young (avian and mammalian hosts), and mate-seeking males (frogs)

    Gender, school and academic year differences among Spanish university students at high-risk for developing an eating disorder: An epidemiologic study

    Background: The aim of this study was to assess the magnitude of the university population at high risk of developing an eating disorder and the prevalence of unhealthy eating attitudes and behaviours amongst groups at risk; gender, school or academic year differences were also explored. Methods: A cross-sectional study based on self-report was used to screen university students at high risk for an eating disorder. The sample comprised 2551 university students between the ages of 18 and 26 years, enrolled in 13 schools. The instruments included a socio-demographic questionnaire, the Eating Disorders Inventory (EDI), the Body Shape Questionnaire (BSQ), the Symptom Check List 90-R (SCL-90-R), and the Self-Esteem Scale (RSE). The sample design was a non-proportional stratified sample by academic year and school. The prevalence rate was estimated controlling for academic year and school. Logistic regression analysis was used to investigate adjusted associations between gender, school and academic year. Results: Female students reported unhealthy weight-control behaviours such as dieting, laxative use or self-induced vomiting to lose weight more often than males. A total of 6% of the females had a BMI of 17.5 or less, and 2.5% had amenorrhea for 3 or more months. In contrast, a higher proportion of males (11.6%) reported binge eating behaviour. The prevalence rate of students at high risk for an eating disorder was 14.9% (11.6–18) for males and 20.8% (18.7–22.8) for females, according to an overall cut-off point on the EDI questionnaire. Prevalence rates presented statistically significant differences by gender (p < 0.001) but not by school or academic year. Conclusion: The prevalence of eating disorder risk in university students is high and is associated with unhealthy weight-control practices; similar results have been found in previous studies using cut-off points in questionnaires. These results may be taken into account to encourage early detection and greater awareness of the need to seek treatment, in order to improve diagnosis among students on university campuses.