6 research outputs found

Cancer-Net PCa-Data: An Open-Source Benchmark Dataset for Prostate Cancer Clinical Decision Support using Synthetic Correlated Diffusion Imaging Data

Author: Gunraj Hayden
Tai Chi-en Amy
Wong Alexander
Publication date: 20/11/2023

The recent introduction of synthetic correlated diffusion (CDI^s) imaging has demonstrated significant potential in the realm of clinical decision support for prostate cancer (PCa). CDI^s is a new form of magnetic resonance imaging (MRI) designed to characterize tissue characteristics through the joint correlation of diffusion signal attenuation across different Brownian motion sensitivities. Despite the performance improvement, the CDI^s data for PCa has not been previously made publicly available. In our commitment to advance research efforts for PCa, we introduce Cancer-Net PCa-Data, an open-source benchmark dataset of volumetric CDI^s imaging data of PCa patients. Cancer-Net PCa-Data consists of CDI^s volumetric images from a patient cohort of 200 patient cases, along with full annotations (gland masks, tumor masks, and PCa diagnosis for each tumor). We also analyze the demographic and label region diversity of Cancer-Net PCa-Data for potential biases. Cancer-Net PCa-Data is the first-ever public dataset of CDI^s imaging data for PCa, and is a part of the global open-source initiative dedicated to advancement in machine learning and imaging research to aid clinicians in the global fight against cancer.

arXiv.org e-Print Archive

COVIDx CXR-4: An Expanded Multi-Institutional Open-Source Benchmark Dataset for Chest X-ray Image-Based Computer-Aided COVID-19 Diagnostics

Author: Gunraj Hayden
Tai Chi-en Amy
Wong Alexander
Wu Yifan
Publication date: 29/11/2023

The global ramifications of the COVID-19 pandemic remain significant, exerting persistent pressure on nations even three years after its initial outbreak. Deep learning models have shown promise in improving COVID-19 diagnostics but require larger and more diverse datasets to improve performance. In this paper, we introduce COVIDx CXR-4, an expanded multi-institutional open-source benchmark dataset for chest X-ray image-based computer-aided COVID-19 diagnostics. COVIDx CXR-4 expands significantly on the previous COVIDx CXR-3 dataset by increasing the total patient cohort size by more than 2.66 times, resulting in 84,818 images from 45,342 patients across multiple institutions. We provide extensive analysis of the diversity of the patient demographics, imaging metadata, and disease distributions to highlight potential dataset biases. To the best of the authors' knowledge, COVIDx CXR-4 is the largest and most diverse open-source COVID-19 CXR dataset and is made publicly available as part of an open initiative to advance research to aid clinicians against the COVID-19 disease.

arXiv.org e-Print Archive

Double-Condensing Attention Condenser: Leveraging Attention in Deep Learning to Detect Skin Cancer from Skin Lesion Images

Author: Czarnecki Chris
Janes Elizabeth
Tai Chi-en Amy
Wong Alexander
Publication date: 20/11/2023

Skin cancer is the most common type of cancer in the United States and is estimated to affect one in five Americans. Recent advances have demonstrated strong performance on skin cancer detection, as exemplified by state-of-the-art performance in the SIIM-ISIC Melanoma Classification Challenge; however, these solutions leverage ensembles of complex deep neural architectures requiring immense storage and compute costs, and therefore may not be tractable. A recent movement for TinyML applications is integrating Double-Condensing Attention Condensers (DC-AC) into a self-attention neural network backbone architecture to allow for faster and more efficient computation. This paper explores leveraging an efficient self-attention structure to detect skin cancer in skin lesion images and introduces a deep neural network design with DC-AC customized for skin cancer detection from skin lesion images. The final model is publicly available as a part of a global open-source initiative dedicated to accelerating advancement in machine learning to aid clinicians in the fight against cancer.

arXiv.org e-Print Archive

Cancer-Net PCa-Gen: Synthesis of Realistic Prostate Diffusion Weighted Imaging Data via Anatomic-Conditional Controlled Latent Diffusion

Author: Chen Yuhao
Gunraj Hayden
Sridhar Aditya
Tai Chi-en Amy
Wong Alexander
Publication date: 30/11/2023

In Canada, prostate cancer is the most common form of cancer in men and accounted for 20% of new cancer cases for this demographic in 2022. Due to recent successes in leveraging machine learning for clinical decision support, there has been significant interest in the development of deep neural networks for prostate cancer diagnosis, prognosis, and treatment planning using diffusion weighted imaging (DWI) data. A major challenge hindering widespread adoption in clinical use is the poor generalization of such networks due to the scarcity of large-scale, diverse, balanced prostate imaging datasets for training them. In this study, we explore the efficacy of latent diffusion for generating realistic prostate DWI data through the introduction of an anatomic-conditional controlled latent diffusion strategy. To the best of the authors' knowledge, this is the first study to leverage conditioning for synthesis of prostate cancer imaging. Experimental results show that the proposed strategy, which we call Cancer-Net PCa-Gen, enhances synthesis of diverse prostate images through controllable tumour locations and better anatomical and textural fidelity. These crucial features make it well-suited for augmenting real patient data, enabling neural networks to be trained on a more diverse and comprehensive data distribution. The Cancer-Net PCa-Gen framework and sample images have been made publicly available at https://www.kaggle.com/datasets/deetsadi/cancer-net-pca-gen-dataset as a part of a global open-source initiative dedicated to accelerating advancement in machine learning to aid clinicians in the fight against cancer.

arXiv.org e-Print Archive

NutritionVerse: Empirical Study of Various Dietary Intake Estimation Approaches

Author: Chen Yuhao
Keller Heather
Keller Matthew
Kirkpatrick Sharon
Markham Olivia
Nair Saeejith
Parmar Krish
Tai Chi-en Amy
Wong Alexander
Wu Yifan
Xi Pengcheng
Publication date: 14/09/2023

Accurate dietary intake estimation is critical for informing policies and programs to support healthy eating, as malnutrition has been directly linked to decreased quality of life. However, self-reporting methods such as food diaries suffer from substantial bias. Other conventional dietary assessment techniques and emerging alternative approaches such as mobile applications incur high time costs and may necessitate trained personnel. Recent work has focused on using computer vision and machine learning to automatically estimate dietary intake from food images, but the lack of comprehensive datasets with diverse viewpoints, modalities, and food annotations hinders the accuracy and realism of such methods. To address this limitation, we introduce NutritionVerse-Synth, the first large-scale dataset of 84,984 photorealistic synthetic 2D food images with associated dietary information and multimodal annotations (including depth images, instance masks, and semantic masks). Additionally, we collect a real image dataset, NutritionVerse-Real, containing 889 images of 251 dishes to evaluate realism. Leveraging these novel datasets, we develop and benchmark NutritionVerse, an empirical study of various dietary intake estimation approaches, including indirect segmentation-based and direct prediction networks. We further fine-tune models pretrained on synthetic data with real images to provide insights into the fusion of synthetic and real data. Finally, we release both datasets (NutritionVerse-Synth, NutritionVerse-Real) on https://www.kaggle.com/nutritionverse/datasets as part of an open initiative to accelerate machine learning for dietary sensing.

arXiv.org e-Print Archive

Foodverse: A Dataset of 3D Food Models for Nutritional Intake Estimation

Author: Chen Yuhao
Keller Matthew E
Kerrigan Mattie
Nair Saeejith
Xi Pengcheng
Tai Chi-en Amy
Wong Alexander
Publication venue: University of Waterloo (Waterloo, Ontario, Canada)
Publication date: 10/05/2023

77% of adults over 50 want to age in place today, presenting a major challenge of ensuring adequate nutritional intake. Recent advancements in machine learning and computer vision show promise for automated tracking methods, but these methods require a large, high-quality dataset to perform accurately. Existing datasets consist of 2D images with discretely sampled camera views, which are unrepresentative of the varied angles and image quality of photos taken by older individuals. By leveraging view synthesis for 3D models, an infinite number of 2D images can be generated for any given viewpoint or camera angle. In this paper, we develop a methodology for collecting high-quality 3D models of food items with a particular focus on speed and consistency, and introduce Foodverse, a large-scale, high-quality, high-resolution multimodal dataset of 52 3D food models, together with their associated weight, food name, language description, and nutritional value. We also demonstrate 2D view synthesis using these 3D food models.

Waterloo Library Journal Publishing Service (University of Waterloo, Canada)