15 research outputs found

    Soccer Hooligans, Ethnic Nationalism and Political Economy in Bulgaria


    Robin Hood and Matthew Effects: Differential Privacy Has Disparate Impact on Synthetic Data

    Generative models trained with Differential Privacy (DP) can be used to generate synthetic data while minimizing privacy risks. We analyze the impact of DP on these models vis-a-vis underrepresented classes/subgroups of data, specifically studying: 1) the size of classes/subgroups in the synthetic data, and 2) the accuracy of classification tasks run on them. We also evaluate the effect of various levels of imbalance and privacy budgets. Our analysis uses three state-of-the-art DP models (PrivBayes, DP-WGAN, and PATE-GAN) and shows that DP yields opposite size distributions in the generated synthetic data. It affects the gap between the majority and minority classes/subgroups: in some cases by reducing it (a "Robin Hood" effect) and, in others, by increasing it (a "Matthew" effect). Either way, this leads to (similar) disparate impacts on the accuracy of classification tasks on the synthetic data, disproportionately affecting the underrepresented subparts of the data. Consequently, when training models on synthetic data, one might incur the risk of treating different subpopulations unevenly, leading to unreliable or unfair conclusions.

    A rare case of Ewing's sarcoma in a 4-year-old child treated by tumor endoprosthetics using 3D printing

    Orthopedic oncology surgery often requires, by its very nature, precise and often extensive resections of bone and soft tissue involved in or near the tumor mass. One of the most recent and promising innovations is 3D printing technology, whose main advantage in this field is patient specificity, which is essential in an operation that requires high precision and maximum respect for the individuality of the patient's bones and soft tissues. Material and methods: In the present report, we present a 4-year-old boy diagnosed with Ewing's sarcoma involving ¾ of the right tibia. In another medical facility, he was offered amputation. Our team decided to use a 3D-printed, double-growing tumor megaendoprosthesis from the Czech company Prospon. To reinsert the muscle groups to the endoprosthesis, we used a LARS textile tube attached to the femoral and tibial components of the endoprosthesis. A vascular surgeon also participated in the team. The patellar ligament was reinserted to the tibial component, and myoplasty was additionally performed with the medial head of the m. gastrocnemius. Intraoperatively, we lengthened the lower limb by 1.5 cm to delay the upcoming staged lengthening. Results: The postoperative period was uneventful, with sutures removed on the 12th postoperative day. A tutor orthosis was worn for 3 weeks. Active physiotherapy was started after removal of the orthosis, 21 days after surgery. Conclusion: Our goal is to perform a total revision at the end of skeletal growth, if possible, and replace the current implant with a non-growing tumor megaendoprosthesis, provided there are no near or distant metastases and the patient survives long term. Future expectations are that non-invasive lengthening mechanisms or a biological approach will be able to meet the special needs of this population.

    On the Challenges of Deploying Privacy-Preserving Synthetic Data in the Enterprise

    Generative AI technologies are gaining unprecedented popularity, causing a mix of excitement and apprehension through their remarkable capabilities. In this paper, we study the challenges associated with deploying synthetic data, a subfield of Generative AI. Our focus centers on enterprise deployment, with an emphasis on privacy concerns caused by the vast amount of personal and highly sensitive data. We identify 40+ challenges and systematize them into five main groups -- i) generation, ii) infrastructure & architecture, iii) governance, iv) compliance & regulation, and v) adoption. Additionally, we discuss a strategic and systematic approach that enterprises can employ to effectively address the challenges and achieve their goals by establishing trust in the implemented solutions. Comment: Accepted to the 1st Workshop on Challenges in Deployable Generative AI, part of ICML 202

    Computer Simulations of Air Quality and Bio-Climatic Indices for the City of Sofia

    Air pollution is responsible for many adverse effects on human beings. Thermal discomfort, on the other hand, can overload the human body and eventually provoke health implications due to heat imbalance. Methods: The aim of the presented work is to study the behavior of two bio-climatic indices and the statistical characteristics of the air quality index for the city of Sofia, the capital of Bulgaria, for the period 2008–2014. The study is based on WRF-CMAQ model system simulations with a spatial resolution of 1 km. Air quality is estimated by the air quality index (AQI), which takes into account the influence of different pollutants, and thermal conditions are estimated by two indices, for hot and cold weather respectively. It was found that the recurrence of both the heat and cold index categories and of the air quality categories has a heterogeneous spatial distribution and well-manifested diurnal and seasonal variability. In all situations, only O3 and PM10 are the dominant pollutants, that is, those that determine the AQI category. It was found that AQI1, AQI2, and AQI3, which fall in the “Low” band, have the highest recurrence during the different seasons, up to more than 70% in some places and situations. The recurrence of AQI10 (very high) is rather small, no more than 5%, and concentrated in small areas, mostly in the city center. The Heat index category “Danger” never appears, and the Heat index category “Extreme caution” appears only in spring and summer, with a highest recurrence of less than 5% in the city center. For the Wind-chill index, the category “Very High Risk” never appears, and the category “High Risk” appears with a frequency of about 1–2%. The above leads to the conclusion that, from the point of view of both bioclimatic and air quality indices, the human health risks in the city of Sofia are not as high.

    Editorial for the Special Issue “Atmospheric Composition and Regional Climate Studies in Bulgaria”

    The Special Issue “Atmospheric composition and regional climate studies in Bulgaria” is focused on the following two problems, which are of great societal and scientific importance: [...]

    On Utility and Privacy in Synthetic Genomic Data

    The availability of genomic data is essential to progress in biomedical research, personalized medicine, etc. However, its extreme sensitivity makes it problematic, if not outright impossible, to publish or share it. As a result, several initiatives have been launched to experiment with synthetic genomic data, e.g., using generative models to learn the underlying distribution of the real data and generate artificial datasets that preserve its salient characteristics without exposing it. This paper provides the first evaluation of both the utility and the privacy protection of six state-of-the-art models for generating synthetic genomic data. We assess the performance of the synthetic data on several common tasks, such as allele population statistics and linkage disequilibrium. We then measure privacy through the lens of membership inference attacks, i.e., inferring whether a record was part of the training data. Our experiments show that no single approach to generating synthetic genomic data yields both high utility and strong privacy across the board. Also, the size and nature of the training dataset matter. Moreover, while some combinations of datasets and models produce synthetic data with distributions close to the real data, there often are target data points that are vulnerable to membership inference. Looking forward, our techniques can be used by practitioners to assess the risks of deploying synthetic genomic data in the wild and serve as a benchmark for future work.

    Understanding how Differentially Private Generative Models Spend their Privacy Budget

    Generative models trained with Differential Privacy (DP) are increasingly used to produce synthetic data while reducing privacy risks. Navigating their specific privacy-utility tradeoffs makes it challenging to determine which models would work best for specific settings/tasks. In this paper, we fill this gap in the context of tabular data by analyzing how DP generative models distribute privacy budgets across rows and columns, arguably the main source of utility degradation. We examine the main factors contributing to how privacy budgets are spent, including underlying modeling techniques, DP mechanisms, and data dimensionality. Our extensive evaluation of both graphical and deep generative models sheds light on the distinctive features that render them suitable for different settings and tasks. We show that graphical models distribute the privacy budget horizontally and thus cannot handle relatively wide datasets while the performance on the task they were optimized for monotonically increases with more data. Deep generative models spend their budget per iteration, so their behavior is less predictable with varying dataset dimensions but could perform better if trained on more features. Also, low levels of privacy (ϵ ≥ 100) could help some models generalize, achieving better results than without applying DP.