
5 research outputs found

Overview of the feature disentanglement modeling approach.

Author: Aaron Y. Lee (7255835)
Anthony Ortiz (13919422)
Anusua Trivedi (13919419)
Caleb Robinson (8345691)
Jayashree Kalpathy-Cramer (837590)
Jocelyn Desbiens (13919425)
Juan M. Lavista Ferres (13919431)
Marian Blazes (9568501)
Pavan K. Bhatraju (11438695)
Rahul Dodhia (13919428)
Sunil Gupta (309664)
W. Conrad Liles (8842715)
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 06/10/2022

We propose to learn a model that simultaneously predicts the class label and the domain label for a given chest X-ray (CXR) image. The parameters of the model are updated to extract representations that contain information about the class label but not about the domain label.
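The description above does not say how the two predictions are combined during training. As a minimal sketch, assuming an adversarial formulation in which the feature extractor is rewarded when the domain head fails, the per-image losses could look like the following. The function names, the weighting factor `lam`, and the sign-flipped domain term are illustrative assumptions, not details taken from the paper.

```python
import math


def cross_entropy(probs, label):
    """Negative log-likelihood of the true label under the predicted probabilities."""
    return -math.log(max(probs[label], 1e-12))


def disentanglement_losses(class_probs, class_label, domain_probs, domain_label, lam=1.0):
    """Return (encoder_loss, domain_head_loss) for a single image.

    Assumption: the encoder minimizes the class loss while *maximizing* the
    domain loss (scaled by the hypothetical weight `lam`), so its features
    keep class information and shed domain information. The domain head
    itself still minimizes the ordinary domain loss.
    """
    task_loss = cross_entropy(class_probs, class_label)
    domain_loss = cross_entropy(domain_probs, domain_label)
    encoder_loss = task_loss - lam * domain_loss
    return encoder_loss, domain_loss


# Three disease classes, five sub-datasets (domains); the domain head is at chance here.
encoder_loss, domain_head_loss = disentanglement_losses(
    class_probs=[0.7, 0.2, 0.1], class_label=0,
    domain_probs=[0.2, 0.2, 0.2, 0.2, 0.2], domain_label=3,
)
```

Under this assumed objective, the encoder loss is lower when the domain head is confused than when it identifies the sub-dataset confidently, which is the behavior the description asks for.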

The Francis Crick Institute

UMAP projections of features learned from models trained with and without feature disentanglement on unmasked imagery.

Author: Aaron Y. Lee (7255835)
Anthony Ortiz (13919422)
Anusua Trivedi (13919419)
Caleb Robinson (8345691)
Jayashree Kalpathy-Cramer (837590)
Jocelyn Desbiens (13919425)
Juan M. Lavista Ferres (13919431)
Marian Blazes (9568501)
Pavan K. Bhatraju (11438695)
Rahul Dodhia (13919428)
Sunil Gupta (309664)
W. Conrad Liles (8842715)
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 06/10/2022

Each point represents a CXR from the COVIDx dataset. The top row colors points by their domain label (the sub-dataset of COVIDx they belong to), while the bottom row colors points by their disease label. Without feature disentanglement, the learned representations easily separate the sub-datasets, even though the model was not trained for this task. With feature disentanglement, the learned representations do not clearly separate the sub-datasets.

The Francis Crick Institute

Results showing within-dataset class performance, within-dataset domain performance, and out-of-sample class performance from training models with the COVIDx dataset.

Author: Aaron Y. Lee (7255835)
Anthony Ortiz (13919422)
Anusua Trivedi (13919419)
Caleb Robinson (8345691)
Jayashree Kalpathy-Cramer (837590)
Jocelyn Desbiens (13919425)
Juan M. Lavista Ferres (13919431)
Marian Blazes (9568501)
Pavan K. Bhatraju (11438695)
Rahul Dodhia (13919428)
Sunil Gupta (309664)
W. Conrad Liles (8842715)
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 06/10/2022

The task performance (task AUC and task accuracy) shows how well classifiers distinguish between the “Normal”, “Pneumonia”, and “COVID-19+” disease labels, while the domain performance (domain AUC) shows how well classifiers distinguish which sub-dataset an image belongs to. We report AUC values as averages of the one-vs-all binary AUCs between all classes, and accuracy (ACC) as the average accuracy over all classes. In all cases, class performance (both within-dataset and out-of-sample) is reported from the classifier trained on within-dataset samples, while domain performance is reported from an additional classifier trained to predict domain labels on top of the learned representations, z′, as a measure of how much domain information the representation contains. We observe that feature disentanglement decreases within-dataset domain performance, as expected, and increases out-of-sample class performance; that is, it improves generalization.
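The two reported metrics can be sketched in plain Python as follows. This is a minimal illustration, not the authors' evaluation code: the function names and toy data are made up, and "average accuracy over all classes" is read here as the mean of per-class accuracies, which is one plausible interpretation.

```python
def binary_auc(scores, positives):
    """AUC as the probability that a positive sample outranks a negative one (ties count 0.5)."""
    pos = [s for s, p in zip(scores, positives) if p]
    neg = [s for s, p in zip(scores, positives) if not p]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def macro_ovr_auc(prob_rows, labels, n_classes):
    """Average of the one-vs-all binary AUCs, one per class."""
    aucs = []
    for c in range(n_classes):
        scores = [row[c] for row in prob_rows]   # predicted probability of class c
        positives = [y == c for y in labels]     # class c versus all other classes
        aucs.append(binary_auc(scores, positives))
    return sum(aucs) / n_classes


def mean_per_class_accuracy(prob_rows, labels, n_classes):
    """Assumed reading of ACC: accuracy within each class, averaged over classes."""
    accs = []
    for c in range(n_classes):
        rows = [row for row, y in zip(prob_rows, labels) if y == c]
        correct = sum(1 for row in rows if max(range(n_classes), key=row.__getitem__) == c)
        accs.append(correct / len(rows))
    return sum(accs) / n_classes


# Toy predictions over three classes (0, 1, 2); the fourth sample is misclassified.
toy_probs = [
    [0.8, 0.1, 0.1],
    [0.6, 0.3, 0.1],
    [0.2, 0.7, 0.1],
    [0.7, 0.2, 0.1],
    [0.1, 0.2, 0.7],
    [0.2, 0.2, 0.6],
]
toy_labels = [0, 0, 1, 1, 2, 2]
toy_auc = macro_ovr_auc(toy_probs, toy_labels, n_classes=3)
toy_acc = mean_per_class_accuracy(toy_probs, toy_labels, n_classes=3)
```

Averaging per class rather than per sample keeps a large class from dominating the score, which matters when disease labels are unevenly distributed across sub-datasets.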

The Francis Crick Institute

Dataset overview.

Author: Aaron Y. Lee (7255835)
Anthony Ortiz (13919422)
Anusua Trivedi (13919419)
Caleb Robinson (8345691)
Jayashree Kalpathy-Cramer (837590)
Jocelyn Desbiens (13919425)
Juan M. Lavista Ferres (13919431)
Marian Blazes (9568501)
Pavan K. Bhatraju (11438695)
Rahul Dodhia (13919428)
Sunil Gupta (309664)
W. Conrad Liles (8842715)
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 06/10/2022

Counts of disease label type per dataset. The COVIDx dataset is made up of 5 sub-datasets and the CC-CCII dataset is used as a held-out test set.

The Francis Crick Institute

Results showing how well logistic regression classifiers can identify which sub-dataset a CXR is from within the COVIDx dataset, and how well classifiers can identify which dataset a “COVID-19+” CXR is from across both the COVIDx and CC-CCII datasets.

Author: Aaron Y. Lee (7255835)
Anthony Ortiz (13919422)
Anusua Trivedi (13919419)
Caleb Robinson (8345691)
Jayashree Kalpathy-Cramer (837590)
Jocelyn Desbiens (13919425)
Juan M. Lavista Ferres (13919431)
Marian Blazes (9568501)
Pavan K. Bhatraju (11438695)
Rahul Dodhia (13919428)
Sunil Gupta (309664)
W. Conrad Liles (8842715)
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 06/10/2022

We report AUC values as averages of the one-vs-all binary AUCs between all classes, and accuracy (ACC) as the average accuracy over all classes. We observe that the representations generated by the classifiers, even from masked/equalized inputs, contain enough information to accurately identify the sources of the imagery in both cases.
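The probing idea, fitting a logistic regression on frozen representations to see whether the data source can be recovered, can be sketched as below. This is a toy illustration under stated assumptions: the training procedure (full-batch gradient descent), the hyperparameters, the function names, and the two-dimensional features are all hypothetical and stand in for whatever setup the authors used.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logits_for(weights, biases, x):
    """Linear scores, one per domain."""
    return [sum(w_j * x_j for w_j, x_j in zip(w, x)) + b for w, b in zip(weights, biases)]


def fit_linear_probe(features, domains, n_domains, lr=0.5, epochs=200):
    """Fit multinomial logistic regression with full-batch gradient descent."""
    dim, n = len(features[0]), len(features)
    weights = [[0.0] * dim for _ in range(n_domains)]
    biases = [0.0] * n_domains
    for _ in range(epochs):
        grad_w = [[0.0] * dim for _ in range(n_domains)]
        grad_b = [0.0] * n_domains
        for x, d in zip(features, domains):
            probs = softmax(logits_for(weights, biases, x))
            for k in range(n_domains):
                err = probs[k] - (1.0 if k == d else 0.0)  # gradient of cross-entropy w.r.t. logit k
                grad_b[k] += err / n
                for j in range(dim):
                    grad_w[k][j] += err * x[j] / n
        for k in range(n_domains):
            biases[k] -= lr * grad_b[k]
            for j in range(dim):
                weights[k][j] -= lr * grad_w[k][j]
    return weights, biases


def probe_accuracy(weights, biases, features, domains):
    """Fraction of samples whose highest-scoring domain matches the true domain."""
    hits = 0
    for x, d in zip(features, domains):
        scores = logits_for(weights, biases, x)
        hits += max(range(len(scores)), key=scores.__getitem__) == d
    return hits / len(features)


# Hypothetical representations that leak the source: each domain occupies its own region.
leaky_features = [[0.0, 1.0], [0.1, 0.9], [1.0, 0.0], [0.9, 0.1]]
source_labels = [0, 0, 1, 1]
w, b = fit_linear_probe(leaky_features, source_labels, n_domains=2)
leaky_accuracy = probe_accuracy(w, b, leaky_features, source_labels)
```

The sketch scores the probe on its own training samples only to stay short; a real probe would be evaluated on held-out representations. High probe accuracy indicates that the representation still encodes which dataset an image came from.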

The Francis Crick Institute