16 research outputs found
Improve Robustness of Eye Disease Detection by including Learnable Probabilistic Discrete Latent Variables into Machine Learning Models
Ocular diseases, ranging from diabetic retinopathy to glaucoma, present a
significant public health challenge due to their prevalence and potential for
causing vision impairment. Early and accurate diagnosis is crucial for
effective treatment and management. In recent years, deep learning models have
emerged as powerful tools for analysing medical images, including ocular
imaging. However, challenges persist in model interpretability and uncertainty
estimation, which are critical for clinical decision-making. This study
introduces a novel application of GFlowOut, leveraging the probabilistic
framework of Generative Flow Networks (GFlowNets) to learn the posterior
distribution over dropout masks, for the classification and analysis of ocular
diseases using eye fundus images. We develop a robust and generalizable method
that integrates GFlowOut with ResNet18 and ViT backbones to identify various
ocular conditions. This study employs a unique set of
dropout masks - none, random, bottomup, and topdown - to enhance model
performance in analyzing ocular images. Our results demonstrate that the
bottomup GFlowOut mask significantly improves accuracy, outperforming the
traditional dropout approach.
Comment: This is a work in progress
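The core idea of sampling dropout masks from a learned posterior can be illustrated with a minimal sketch. This is a hedged illustration only: plain NumPy with fixed toy logits stands in for the GFlowNet machinery, which in GFlowOut actually learns the per-unit logits (conditioned on the input in the bottomup variant); all names here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_dropout_mask(logits, rng):
    """Sample a binary dropout mask from per-unit Bernoulli
    keep-probabilities given by sigmoid(logits)."""
    probs = 1.0 / (1.0 + np.exp(-logits))
    return (rng.random(logits.shape) < probs).astype(float)

# In GFlowOut these logits come from a learned posterior over masks;
# here they are fixed toy values favouring the first two units.
logits = np.array([2.0, 2.0, -2.0, 0.0])
mask = sample_dropout_mask(logits, rng)

h = np.array([1.0, -0.5, 3.0, 0.2])   # toy layer activations
dropped = h * mask                     # apply the sampled mask
print(mask, dropped)
```

The contrast with standard dropout is that the keep-probabilities are learned per unit rather than fixed globally.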
A New Dataset and Method for Creativity Assessment Using the Alternate Uses Task
Creativity ratings by humans for the alternate uses task (AUT) tend to be subjective and inefficient. To automate the scoring process of the AUT, previous literature suggested using semantic distance from non-contextual models. In this paper, we extend this line of research by including contextual semantic models and, more importantly, exploring the feasibility of predicting creativity ratings with supervised discriminative machine learning models. Based on a newly collected dataset, our results show that supervised models can successfully classify between creative and non-creative responses even with unbalanced data, and can generalise well to out-of-domain unseen prompts.
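The semantic-distance baseline mentioned above can be sketched as follows, assuming prompt and response are already embedded as vectors (the embeddings below are toy values; a real system would obtain them from a contextual or non-contextual language model):

```python
import numpy as np

def semantic_distance(u, v):
    """1 minus cosine similarity between two embedding vectors:
    a larger distance suggests a less conventional (more creative) use."""
    u, v = np.asarray(u, float), np.asarray(v, float)
    return 1.0 - (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))

# Toy embeddings for an AUT prompt (e.g. "brick") and two responses.
prompt = [1.0, 0.0, 0.0]
mundane = [0.9, 0.1, 0.0]    # close to the prompt: low distance
creative = [0.1, 0.7, 0.7]   # far from the prompt: high distance
print(semantic_distance(prompt, mundane))
print(semantic_distance(prompt, creative))
```

The paper's contribution is to go beyond such unsupervised distances and train supervised classifiers on human ratings.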
Performance Optimization for Federated Person Re-identification via Benchmark Analysis
Federated learning is a privacy-preserving machine learning technique that
learns a shared model across decentralized clients. It can alleviate privacy
concerns of person re-identification, an important computer vision task. In
this work, we apply federated learning to person re-identification
(FedReID) and optimize its performance affected by statistical heterogeneity in
the real-world scenario. We first construct a new benchmark to investigate the
performance of FedReID. This benchmark consists of (1) nine datasets with
different volumes sourced from different domains to simulate the heterogeneous
situation in reality, (2) two federated scenarios, and (3) an enhanced
federated algorithm for FedReID. The benchmark analysis shows that the
client-edge-cloud architecture, represented by the federated-by-dataset
scenario, performs better than the client-server architecture in FedReID. It
also reveals the bottlenecks of FedReID under the real-world scenario,
including poor performance of large datasets caused by unbalanced weights in
model aggregation and challenges in convergence. Then we propose two
optimization methods: (1) To address the unbalanced weight problem, we propose
a new method to dynamically change the weights according to the scale of model
changes in clients in each training round; (2) To facilitate convergence, we
adopt knowledge distillation to refine the server model with knowledge
generated from client models on a public dataset. Experiment results
demonstrate that our strategies can achieve much better convergence with
superior performance on all datasets. We believe that our work will inspire the
community to further explore the implementation of federated learning on more
computer vision tasks in real-world scenarios.
Comment: ACMMM'2
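A minimal sketch of the first optimization, re-weighting clients by the scale of their model changes rather than by dataset size (the paper's exact weighting scheme may differ; `aggregate_by_change` and the 1-D toy "models" below are illustrative assumptions):

```python
import numpy as np

def aggregate_by_change(client_models, prev_global):
    """Federated aggregation that weights each client by the L2 norm
    of its model change since the last round, instead of by its
    dataset size, so large datasets do not automatically dominate."""
    changes = np.array([np.linalg.norm(m - prev_global)
                        for m in client_models])
    weights = changes / changes.sum()
    new_global = sum(w * m for w, m in zip(weights, client_models))
    return new_global, weights

# Toy parameter vectors: client 1 changed much more than client 0.
prev = np.zeros(3)
clients = [np.array([0.1, 0.0, 0.0]), np.array([1.0, 1.0, 1.0])]
new_global, weights = aggregate_by_change(clients, prev)
print(weights, new_global)
```

The second optimization, knowledge distillation on a public dataset, would then refine `new_global` against soft labels produced by the client models.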
Harvard Eye Fairness: A Large-Scale 3D Imaging Dataset for Equitable Eye Diseases Screening and Fair Identity Scaling
Fairness or equity in machine learning is profoundly important for societal
well-being, but limited public datasets hinder its progress, especially in the
area of medicine. Medicine is undeniably one of the most important
application areas for fairness learning. Currently, no large-scale
public medical datasets with 3D imaging data for fairness learning are
available, while 3D imaging data in modern clinics are standard tests for
disease diagnosis. In addition, existing medical fairness datasets are
repurposed from other tasks, and therefore they typically include at most
three demographic identity attributes (age, gender, and race) for fairness
modeling. To address this gap, we introduce our Eye Fairness
dataset (Harvard-EF) with 30,000 subjects, covering three major eye diseases,
age-related macular degeneration, diabetic retinopathy, and glaucoma, which
together affect 380 million patients globally. Our Harvard-EF dataset includes both
2D fundus photos and 3D optical coherence tomography scans with six demographic
identity attributes including age, gender, race, ethnicity, preferred language,
and marital status. We also propose a fair identity scaling (FIS) approach
combining group and individual scaling to improve model fairness. Compared
with various state-of-the-art fairness learning methods, our FIS approach
achieves superior performance on the racial, gender, and ethnicity fairness
tasks with both 2D and 3D imaging data, demonstrating the utility of our
Harvard-EF dataset for fairness learning. To facilitate fairness comparisons
between different models, we propose performance-scaled disparity measures,
which can be used to compare model fairness accounting for overall performance
levels. The dataset and code are publicly accessible via
https://ophai.hms.harvard.edu/datasets/harvard-ef30k
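One plausible reading of a performance-scaled disparity measure is sketched below for illustration (the paper's exact definition may differ; the function name and the subgroup accuracies are hypothetical):

```python
def performance_scaled_disparity(group_accuracies):
    """Gap between the best- and worst-performing identity groups,
    divided by mean accuracy, so that models at different overall
    performance levels can be compared on fairness alone."""
    overall = sum(group_accuracies) / len(group_accuracies)
    return (max(group_accuracies) - min(group_accuracies)) / overall

# Two hypothetical models with per-subgroup accuracies:
print(performance_scaled_disparity([0.80, 0.80, 0.80]))  # no disparity
print(performance_scaled_disparity([0.90, 0.80, 0.70]))
```

Dividing by overall performance is what distinguishes such a measure from a raw accuracy gap: a 10-point gap matters more in a weaker model.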
A Comprehensive Survey on Database Management System Fuzzing: Techniques, Taxonomy and Experimental Comparison
Database Management System (DBMS) fuzzing is an automated testing technique
aimed at detecting errors and vulnerabilities in DBMSs by generating, mutating,
and executing test cases. It not only reduces the time and cost of manual
testing but also enhances detection coverage, providing valuable assistance in
developing commercial DBMSs. Existing fuzzing surveys mainly focus on
general-purpose software. However, DBMSs differ in their internal structure,
input/output, and test objectives, requiring specialized
fuzzing strategies. Therefore, this paper focuses on DBMS fuzzing and provides
a comprehensive review and comparison of the methods in this field. We first
introduce the fundamental concepts. Then, we systematically define a general
fuzzing procedure and decompose and categorize existing methods. Furthermore,
we classify existing methods from the testing objective perspective, covering
various components in DBMSs. For representative works, more detailed
descriptions are provided to analyze their strengths and limitations. To
objectively evaluate the performance of each method, we present an open-source
DBMS fuzzing toolkit, OpenDBFuzz. Based on this toolkit, we conduct a detailed
experimental comparative analysis of existing methods and finally discuss
future research directions.
Comment: 34 pages, 22 figures
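The generate-mutate-execute loop that defines DBMS fuzzing can be sketched as follows (a stub function stands in for a real DBMS under test, and the mutation set is a toy example; real fuzzers use far richer, often grammar- or coverage-guided mutations):

```python
import random

# Toy syntactic mutations over SQL strings.
MUTATIONS = [
    lambda q: q.replace("=", "<>", 1),
    lambda q: q + " ORDER BY 1",
    lambda q: q.replace("SELECT", "SELECT DISTINCT", 1),
]

def fuzz(seed_queries, execute, rounds, rng=None):
    """Mutate seed queries and collect every test case that makes
    the DBMS under test raise an error."""
    rng = rng or random.Random(0)
    failures = []
    for _ in range(rounds):
        seed = rng.choice(seed_queries)
        for mutation in MUTATIONS:      # try each mutation on the seed
            q = mutation(seed)
            try:
                execute(q)
            except Exception:
                failures.append(q)
    return failures

# Stub "DBMS" that rejects DISTINCT, simulating a parser bug.
def fake_execute(query):
    if "DISTINCT" in query:
        raise RuntimeError("simulated DBMS bug")

found = fuzz(["SELECT a FROM t WHERE a = 1"], fake_execute, 5)
print(len(found), "failing queries")
```

A real harness would additionally validate results (e.g. via differential testing) to catch wrong answers, not just crashes.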