13 research outputs found

    UvA-DARE (Digital Academic Repository) Uncertain Data Integration Using Functional Dependencies

    Abstract. Data integration systems are crucial for applications that need to provide a uniform interface to a set of autonomous and heterogeneous data sources. However, setting up a full data integration system for many application contexts, e.g. web and scientific data management, requires significant human effort which prevents it from being really scalable. In this paper, we propose IFD (Integration based on Functional Dependencies), a pay-as-you-go data integration system that allows integrating a given set of data sources, as well as incrementally integrating additional sources. IFD takes advantage of the background knowledge implied within functional dependencies for matching the source schemas. Our system is built on a probabilistic data model that allows capturing the uncertainty in data integration systems. Our performance evaluation results show significant performance gains of our approach in terms of recall and precision compared to the baseline approaches. They confirm the importance of functional dependencies and also the contribution of using a probabilistic data model in improving the quality of schema matching. The analytical study and experiments show that IFD scales well

    Global, regional, and national cancer incidence, mortality, years of life lost, years lived with disability, and disability-Adjusted life-years for 29 cancer groups, 1990 to 2017 : A systematic analysis for the global burden of disease study

    Importance: Cancer and other noncommunicable diseases (NCDs) are now widely recognized as a threat to global development. The latest United Nations high-level meeting on NCDs reaffirmed this observation and also highlighted the slow progress in meeting the 2011 Political Declaration on the Prevention and Control of Noncommunicable Diseases and the third Sustainable Development Goal. Lack of situational analyses, priority setting, and budgeting have been identified as major obstacles in achieving these goals. All of these have in common that they require information on the local cancer epidemiology. The Global Burden of Disease (GBD) study is uniquely poised to provide these crucial data. Objective: To describe cancer burden for 29 cancer groups in 195 countries from 1990 through 2017 to provide data needed for cancer control planning. Evidence Review: We used the GBD study estimation methods to describe cancer incidence, mortality, years lived with disability, years of life lost, and disability-Adjusted life-years (DALYs). Results are presented at the national level as well as by Socio-demographic Index (SDI), a composite indicator of income, educational attainment, and total fertility rate. We also analyzed the influence of the epidemiological vs the demographic transition on cancer incidence. Findings: In 2017, there were 24.5 million incident cancer cases worldwide (16.8 million without nonmelanoma skin cancer [NMSC]) and 9.6 million cancer deaths. The majority of cancer DALYs came from years of life lost (97%), and only 3% came from years lived with disability. The odds of developing cancer were the lowest in the low SDI quintile (1 in 7) and the highest in the high SDI quintile (1 in 2) for both sexes. In 2017, the most common incident cancers in men were NMSC (4.3 million incident cases); tracheal, bronchus, and lung (TBL) cancer (1.5 million incident cases); and prostate cancer (1.3 million incident cases). 
The most common causes of cancer deaths and DALYs for men were TBL cancer (1.3 million deaths and 28.4 million DALYs), liver cancer (572000 deaths and 15.2 million DALYs), and stomach cancer (542000 deaths and 12.2 million DALYs). For women in 2017, the most common incident cancers were NMSC (3.3 million incident cases), breast cancer (1.9 million incident cases), and colorectal cancer (819000 incident cases). The leading causes of cancer deaths and DALYs for women were breast cancer (601000 deaths and 17.4 million DALYs), TBL cancer (596000 deaths and 12.6 million DALYs), and colorectal cancer (414000 deaths and 8.3 million DALYs). Conclusions and Relevance: The national epidemiological profiles of cancer burden in the GBD study show large heterogeneities, which are a reflection of different exposures to risk factors, economic settings, lifestyles, and access to care and screening. The GBD study can be used by policy makers and other stakeholders to develop and improve national and local cancer control in order to achieve the global targets and improve equity in cancer care. © 2019 American Medical Association. All rights reserved. Peer reviewed

    Résolution d'entités dans des données probabilistes

    Entity resolution (ER) is the problem of identifying duplicate tuples, i.e. tuples that represent the same real-world entity. The ER problem arises in many real-life applications. These range from news aggregation websites, which must identify news items that cover the same story to avoid presenting one story to the user several times, to the integration of two companies' customer databases after a merger, where identifying the tuples that refer to the same customer is crucial. Because of its diverse applications, the ER problem has been formulated in different ways in the literature. The two main ER-related problem formulations are: 1) identity resolution, and 2) deduplication. In identity resolution, the aim is to find the duplicate(s) of a given tuple in a given database, while in deduplication, the aim is to find groups of duplicate tuples in a given database and merge them in order to increase the quality of the database itself. The ER problem is, however, not limited to deterministic (ordinary) databases; it also arises in applications that deal with probabilistic databases, i.e. databases in which each tuple or attribute value is associated with a probability value that indicates, for instance, its confidence level. In this thesis, we study the ER problem in probabilistic databases. More specifically, we study the following problems: 1) identity resolution in probabilistic data, 2) identity resolution in distributed probabilistic data, 3) deduplication in probabilistic data, and 4) schema matching in a fully automated setting.

    Entity Resolution for Probabilistic Data

    Entity resolution is the problem of identifying the tuples that represent the same real-world entity. In this paper, we address the problem of entity resolution over probabilistic data (ERPD), which arises in many applications that have to deal with probabilistic data. To deal with the ERPD problem, we distinguish between two classes of similarity functions, i.e. context-free and context-sensitive. We propose a PTIME algorithm for context-free similarity functions, and a Monte Carlo approximation algorithm for context-sensitive similarity functions. We also propose improvements over our proposed algorithms. We validated our algorithms through experiments over both synthetic and real datasets. Our extensive performance evaluation shows the effectiveness of our algorithms.
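
    To make the Monte Carlo idea in this abstract concrete, here is a minimal sketch of estimating the probability that two uncertain tuples match by sampling possible worlds. It is not the paper's algorithm: the data model (a tuple as a list of `(value, probability)` alternatives), the names `sample_value` and `match_probability`, and the pairwise `similar` predicate are all assumptions made for illustration. In particular, a truly context-sensitive similarity function would also depend on the other tuples in the database, which this sketch ignores.

```python
import random

def sample_value(alternatives, rng):
    """Draw one alternative from a list of (value, probability) pairs."""
    values = [v for v, _ in alternatives]
    weights = [p for _, p in alternatives]
    return rng.choices(values, weights=weights, k=1)[0]

def match_probability(t1, t2, similar, samples=10_000, seed=0):
    """Estimate P(similar(t1, t2)) by sampling one value per tuple per trial.

    t1, t2: hypothetical uncertain tuples, each a list of (value, probability).
    similar: boolean predicate on two sampled values (an assumption here).
    """
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    hits = 0
    for _ in range(samples):
        if similar(sample_value(t1, rng), sample_value(t2, rng)):
            hits += 1
    return hits / samples

# Usage: one uncertain customer name compared with a certain one.
uncertain = [("john smith", 0.5), ("joan smyth", 0.5)]
certain = [("john smith", 1.0)]
estimate = match_probability(uncertain, certain, lambda a, b: a == b)
```

    With equality as the predicate, the estimate should land close to 0.5, the probability mass of the matching alternative; more samples tighten the estimate at the cost of run time.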

    Pay-As-You-Go Data Integration Using Functional Dependencies

    Setting up a full data integration system for many application contexts, e.g. web and scientific data management, requires significant human effort which prevents it from being really scalable. In this paper, we propose IFD (Integration based on Functional Dependencies), a pay-as-you-go data integration system that allows integrating a given set of data sources, as well as incrementally integrating additional sources. IFD takes advantage of the background knowledge implied within functional dependencies for matching the source schemas. Our system is built on a probabilistic data model that allows capturing the uncertainty in data integration systems. Our performance evaluation results show significant performance gains of our approach in terms of recall and precision compared to the baseline approaches. They confirm the importance of functional dependencies and also the contribution of using a probabilistic data model in improving the quality of schema matching. The analytical study and experiments show that IFD scales well

    Digital image watermarking using discrete cosine transformation based linear modulation

    Abstract. The proportion of multimedia traffic in data networks has grown substantially as a result of advancements in IT. It has therefore become necessary to address the following challenges in protecting multimedia data: preventing unauthorized disclosure of sensitive data, tracking down the origin of a leak, ensuring that no alterations can be made without permission, and safeguarding intellectual property for digital assets. Watermarking is a technique developed to address these challenges by transferring data securely over the network. The main goal of invisible watermarking is to exchange data covertly, so that the message is not discovered by a third party. The objective of this work is to develop a digital image watermarking method using discrete cosine transformation based linear modulation. The paper proposes an invisible watermarking method that embeds information into the transform domain of grey-scale images. The method embeds a stego-text into the least significant bit (LSB) of the Discrete Cosine Transformation (DCT) coefficient by using a linear modulation algorithm. Stego-texts of different sizes are embedded ten times within the images; after embedding, the stego-image is immune to different kinds of attack, such as salt-and-pepper noise, rotation, cropping, and JPEG compression with different criteria. The proposed method is tested using four benchmark images. To evaluate the embedding effect, PSNR, NC and BER are calculated. The outcomes show that the proposed approach is practical and robust; the obtained results are promising and do not raise any suspicion. In addition, it has a large capacity, and its results are imperceptible, especially when 1 bit/block is embedded
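
    For readers unfamiliar with the evaluation metrics named in this abstract, the sketch below shows the standard textbook definitions of PSNR (peak signal-to-noise ratio) and BER (bit error rate). It is an illustration only, not the authors' code: the function names, the list-of-rows image representation, and the 8-bit peak value of 255 are assumptions. NC (normalized correlation) is left out because its normalisation differs between papers and the abstract does not say which form was used.

```python
import math

def psnr(original, marked, peak=255):
    """PSNR in dB between two equal-sized grey-scale images (lists of rows)."""
    flat_o = [p for row in original for p in row]
    flat_m = [p for row in marked for p in row]
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(flat_o, flat_m)) / len(flat_o)
    if mse == 0:
        return math.inf  # identical images: no distortion at all
    return 10 * math.log10(peak ** 2 / mse)

def ber(sent_bits, recovered_bits):
    """Fraction of watermark bits that differ after extraction."""
    errors = sum(a != b for a, b in zip(sent_bits, recovered_bits))
    return errors / len(sent_bits)

# Usage with toy data (hypothetical values, not from the paper).
cover = [[100, 100], [100, 100]]
stego = [[101, 99], [101, 99]]
quality_db = psnr(cover, stego)
error_rate = ber([1, 0, 1, 1], [1, 1, 1, 0])
```

    A higher PSNR means the watermark disturbs the image less, while a lower BER means more of the embedded message survives an attack.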

    Impact of the COVID-19 pandemic on patients with paediatric cancer in low-income, middle-income and high-income countries: a multicentre, international, observational cohort study

    OBJECTIVES: Paediatric cancer is a leading cause of death for children. Children in low-income and middle-income countries (LMICs) were four times more likely to die than children in high-income countries (HICs). This study aimed to test the hypothesis that the COVID-19 pandemic had affected the delivery of healthcare services worldwide, and exacerbated the disparity in paediatric cancer outcomes between LMICs and HICs. DESIGN: A multicentre, international, collaborative cohort study. SETTING: 91 hospitals and cancer centres in 39 countries providing cancer treatment to paediatric patients between March and December 2020. PARTICIPANTS: Patients were included if they were under the age of 18 years, and newly diagnosed with or undergoing active cancer treatment for Acute lymphoblastic leukaemia, non-Hodgkin's lymphoma, Hodgkin lymphoma, Wilms' tumour, sarcoma, retinoblastoma, gliomas, medulloblastomas or neuroblastomas, in keeping with the WHO Global Initiative for Childhood Cancer. MAIN OUTCOME MEASURE: All-cause mortality at 30 days and 90 days. RESULTS: 1660 patients were recruited. 219 children had changes to their treatment due to the pandemic. Patients in LMICs were primarily affected (n=182/219, 83.1%). Relative to patients with paediatric cancer in HICs, patients with paediatric cancer in LMICs had 12.1 (95% CI 2.93 to 50.3) and 7.9 (95% CI 3.2 to 19.7) times the odds of death at 30 days and 90 days, respectively, after presentation during the COVID-19 pandemic (p<0.001). After adjusting for confounders, patients with paediatric cancer in LMICs had 15.6 (95% CI 3.7 to 65.8) times the odds of death at 30 days (p<0.001). CONCLUSIONS: The COVID-19 pandemic has affected paediatric oncology service provision. It has disproportionately affected patients in LMICs, highlighting and compounding existing disparities in healthcare systems globally that need addressing urgently. 
However, many patients with paediatric cancer continued to receive their normal standard of care. This speaks to the adaptability and resilience of healthcare systems and healthcare workers globally

    The global, regional, and national burden of colorectal cancer and its attributable risk factors in 195 countries and territories, 1990-2017: a systematic analysis for the Global Burden of Disease Study 2017

    Safiri S, Sepanlou SG, Ikuta KS, et al. The global, regional, and national burden of colorectal cancer and its attributable risk factors in 195 countries and territories, 1990-2017: a systematic analysis for the Global Burden of Disease Study 2017. LANCET GASTROENTEROLOGY & HEPATOLOGY. 2019;4(12):913-933. Background Data about the global, regional, and country-specific variations in the levels and trends of colorectal cancer are required to understand the impact of this disease and the trends in its burden to help policy makers allocate resources. Here we provide a status report on the incidence, mortality, and disability caused by colorectal cancer in 195 countries and territories between 1990 and 2017. Methods Vital registration, sample vital registration, verbal autopsy, and cancer registry data were used to generate incidence, death, and disability-adjusted life-year (DALY) estimates of colorectal cancer at the global, regional, and national levels. We also determined the association between development levels and colorectal cancer age-standardised DALY rates, and calculated DALYs attributable to risk factors that had evidence of causation with colorectal cancer. All of the estimates are reported as counts and age-standardised rates per 100 000 person-years, with some estimates also presented by sex and 5-year age groups. Findings In 2017, there were 1.8 million (95% UI 1.8-1.9) incident cases of colorectal cancer globally, with an age-standardised incidence rate of 23.2 (22.7-23.7) per 100 000 person-years that increased by 9.5% (4.5-13.5) between 1990 and 2017. Globally, colorectal cancer accounted for 896 000 (876 300-915 700) deaths in 2017, with an age-standardised death rate of 11.5 (11.3-11.8) per 100 000 person-years, which decreased between 1990 and 2017 (-13.5% [-18.4 to -10.0]). 
Colorectal cancer was also responsible for 19.0 million (18.5-19.5) DALYs globally in 2017, with an age-standardised rate of 235.7 (229.7-242.0) DALYs per 100 000 person-years, which decreased between 1990 and 2017 (-14.5% [-20.4 to -10.3]). Slovakia, the Netherlands, and New Zealand had the highest age-standardised incidence rates in 2017. Greenland, Hungary, and Slovakia had the highest age-standardised death rates in 2017. Numbers of incident cases and deaths were higher among males than females up to the ages of 80-84 years, with the highest rates observed in the oldest age group (>= 95 years) for both sexes in 2017. There was a non-linear association between the Socio-demographic Index and the Healthcare Access and Quality Index and age-standardised DALY rates. In 2017, the three largest contributors to DALYs at the global level, for both sexes, were diet low in calcium (20.5% [12.9-28.9]), alcohol use (15.2% [12.1-18.3]), and diet low in milk (14.3% [5.1-24.8]). Interpretation There is substantial global variation in the burden of colorectal cancer. Although the overall colorectal cancer age-standardised death rate has been decreasing at the global level, the increasing age-standardised incidence rate in most countries poses a major public health challenge across the world. The results of this study could be useful for policy makers to carry out cost-effective interventions and to reduce exposure to modifiable risk factors, particularly in countries with high incidence or increasing burden. Copyright (C) 2019 The Author(s). Published by Elsevier Ltd

    Twelve-month observational study of children with cancer in 41 countries during the COVID-19 pandemic

    Childhood cancer is a leading cause of death. It is unclear whether the COVID-19 pandemic has impacted childhood cancer mortality. In this study, we aimed to establish all-cause mortality rates for childhood cancers during the COVID-19 pandemic and determine the factors associated with mortality