MoCaE: Mixture of Calibrated Experts Significantly Improves Object Detection
We propose an extremely simple and highly effective approach to faithfully
combine different object detectors to obtain a Mixture of Experts (MoE) that
has a superior accuracy to the individual experts in the mixture. We find that
naively combining these experts in a similar way to the well-known Deep
Ensembles (DEs), does not result in an effective MoE. We identify the
incompatibility between the confidence score distribution of different
detectors to be the primary reason for such failure cases. Therefore, to
construct the MoE, our proposal is to first calibrate each individual detector
against a target calibration function. Then, filter and refine all the
predictions from different detectors in the mixture. We term this approach as
MoCaE and demonstrate its effectiveness through extensive experiments on object
detection, instance segmentation and rotated object detection tasks.
Specifically, MoCaE improves (i) three strong object detectors on COCO test-dev;
(ii) instance segmentation methods on the challenging long-tailed LVIS dataset;
and (iii) all existing rotated object detectors on the DOTA dataset,
establishing a new state-of-the-art (SOTA). Code will be made public.
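The calibrate-then-aggregate idea in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the target calibration function, so each calibrator here is a stand-in monotone function supplied by the caller, plain greedy NMS stands in for the "filter and refine" step, and the names `iou`, `nms` and `mixture` are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
Det = Tuple[Box, float]                  # (box, confidence score)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(dets: Sequence[Det], iou_thr: float = 0.5) -> List[Det]:
    """Greedy NMS: keep the highest-scoring box among overlapping ones."""
    kept: List[Det] = []
    for det in sorted(dets, key=lambda d: d[1], reverse=True):
        if all(iou(det[0], k[0]) < iou_thr for k in kept):
            kept.append(det)
    return kept


def mixture(
    per_detector: Sequence[Sequence[Det]],
    calibrators: Sequence[Callable[[float], float]],
    iou_thr: float = 0.5,
) -> List[Det]:
    """Calibrate each detector's scores, pool all detections, then run NMS."""
    pooled: List[Det] = []
    for dets, calibrate in zip(per_detector, calibrators):
        pooled.extend((box, calibrate(score)) for box, score in dets)
    return nms(pooled, iou_thr)


# Hypothetical example: detector A is over-confident, detector B is not.
det_a = [((0.0, 0.0, 10.0, 10.0), 0.9)]
det_b = [((1.0, 1.0, 11.0, 11.0), 0.6)]
identity = lambda s: s
naive = mixture([det_a, det_b], [identity, identity])
calibrated = mixture([det_a, det_b], [lambda s: 0.5 * s, identity])
```

With identity calibrators the over-confident detector always wins the overlap; once its scores are rescaled, the other detector's box survives NMS, which is the score-incompatibility failure mode the abstract describes.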
Localization Recall Precision (LRP): A New Performance Metric for Object Detection
Average precision (AP), the area under the recall-precision (RP) curve, is
the standard performance measure for object detection. Despite its wide
acceptance, it has a number of shortcomings, the most important of which are
(i) the inability to distinguish very different RP curves, and (ii) the lack of
directly measuring bounding box localization accuracy. In this paper, we
propose 'Localization Recall Precision (LRP) Error', a new metric which we
specifically designed for object detection. LRP Error is composed of three
components related to localization, false negative (FN) rate and false positive
(FP) rate. Based on LRP, we introduce the 'Optimal LRP', the minimum achievable
LRP error representing the best achievable configuration of the detector in
terms of recall-precision and the tightness of the boxes. In contrast to AP,
which considers precisions over the entire recall domain, Optimal LRP
determines the 'best' confidence score threshold for a class, which balances
the trade-off between localization and recall-precision. In our experiments, we
show that, for state-of-the-art (SOTA) object detectors, Optimal LRP provides
richer and more discriminative information than AP. We also demonstrate that
the best confidence score thresholds vary significantly among classes and
detectors. Moreover, we present LRP results of a simple online video object
detector which uses a SOTA still image object detector and show that the
class-specific optimized thresholds increase the accuracy against the common
approach of using a general threshold for all classes. At
https://github.com/cancam/LRP we provide the source code that can compute LRP
for the PASCAL VOC and MSCOCO datasets. Our source code can easily be adapted
to other datasets as well.
Comment: to appear in ECCV 201
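The abstract describes LRP Error and Optimal LRP only in words. The sketch below assumes the form in which LRP Error is usually stated: the localization error of each true positive, (1 - IoU) / (1 - tau), is summed with the false positive and false negative counts and divided by the total number of true positives, false positives and false negatives. The function names, the input layout and the convention of returning 0.0 for an empty case are assumptions made for illustration; the linked repository is the authoritative implementation.

```python
from typing import List, Optional, Tuple


def lrp_error(tp_ious: List[float], n_fp: int, n_fn: int, tau: float = 0.5) -> float:
    """LRP Error from the IoUs of true positives plus FP and FN counts."""
    total = len(tp_ious) + n_fp + n_fn
    if total == 0:
        return 0.0  # assumed convention: nothing to detect, nothing detected
    localization = sum((1.0 - q) / (1.0 - tau) for q in tp_ious)
    return (localization + n_fp + n_fn) / total


def optimal_lrp(
    dets: List[Tuple[float, Optional[float]]], n_gt: int, tau: float = 0.5
) -> Tuple[float, float]:
    """Scan score thresholds and return (minimum LRP Error, best threshold).

    Each detection is (score, IoU with its matched ground truth); an IoU of
    None means the detection matched nothing and counts as a false positive.
    """
    # Baseline: threshold above every score, so all ground truths are missed.
    best = (lrp_error([], 0, n_gt, tau), float("inf"))
    for thr in sorted({score for score, _ in dets}):
        kept = [q for score, q in dets if score >= thr]
        tp = [q for q in kept if q is not None and q >= tau]
        n_fp = len(kept) - len(tp)
        n_fn = n_gt - len(tp)
        err = lrp_error(tp, n_fp, n_fn, tau)
        if err < best[0]:
            best = (err, thr)
    return best


# Hypothetical detections for one class with two ground-truth objects.
example = [(0.9, 0.9), (0.8, 0.7), (0.3, None), (0.2, None)]
best_error, best_threshold = optimal_lrp(example, n_gt=2)
```

In this toy case the error is lowest at a threshold of 0.8, where both objects are found and both low-scoring false positives are discarded; running the same scan per class gives the class-specific thresholds the abstract refers to.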
Segment, Select, Correct: A Framework for Weakly-Supervised Referring Segmentation
Referring Image Segmentation (RIS) - the problem of identifying objects in
images through natural language sentences - is a challenging task currently
mostly solved through supervised learning. However, while collecting referred
annotation masks is a time-consuming process, the few existing
weakly-supervised and zero-shot approaches fall significantly short in
performance compared to fully-supervised learning ones. To bridge the
performance gap without mask annotations, we propose a novel weakly-supervised
framework that tackles RIS by decomposing it into three steps: obtaining
instance masks for the object mentioned in the referencing instruction
(segment), using zero-shot learning to select a potentially correct mask for
the given instruction (select), and bootstrapping a model which allows for
fixing the mistakes of zero-shot selection (correct). In our experiments, using
only the first two steps (zero-shot segment and select) outperforms other
zero-shot baselines by as much as 19%, while our full method improves upon this
much stronger baseline and sets the new state-of-the-art for weakly-supervised
RIS, reducing the gap between the weakly-supervised and fully-supervised
methods in some cases from around 33% to as little as 14%. Code is available at
https://github.com/fgirbal/segment-select-correct
Learning associations between clinical information and motion-based descriptors using a large scale MR-derived cardiac motion atlas
The availability of large scale databases containing imaging and non-imaging
data, such as the UK Biobank, represents an opportunity to improve our
understanding of healthy and diseased bodily function. Cardiac motion atlases
provide a space of reference in which the motion fields of a cohort of subjects
can be directly compared. In this work, a cardiac motion atlas is built from
cine MR data from the UK Biobank (~ 6000 subjects). Two automated quality
control strategies are proposed to reject subjects with insufficient image
quality. Based on the atlas, three dimensionality reduction algorithms are
evaluated to learn data-driven cardiac motion descriptors, and statistical
methods used to study the association between these descriptors and non-imaging
data. Results show a positive correlation between the atlas motion descriptors
and body fat percentage, basal metabolic rate, hypertension, smoking status and
alcohol intake frequency. The proposed method outperforms ejection fraction, the
most commonly used descriptor of cardiac function, in identifying changes in
cardiac function due to these known cardiovascular risk factors. In conclusion,
this work represents a framework for further
investigation of the factors influencing cardiac health.
Comment: 2018 International Workshop on Statistical Atlases and Computational
Modeling of the Hear
What makes and breaks safety fine-tuning? A mechanistic study
Safety fine-tuning helps align Large Language Models (LLMs) with human preferences for their safe deployment. To better understand the underlying factors that make models safe via safety fine-tuning, we design a synthetic data generation framework that captures salient aspects of an unsafe input by modeling the interaction between the task the model is asked to perform (e.g., "design") versus the specific concepts the task is asked to be performed upon (e.g., a "cycle" vs. a "bomb"). Using this, we investigate three well-known safety fine-tuning methods (supervised safety fine-tuning, direct preference optimization, and unlearning) and provide significant evidence demonstrating that these methods minimally transform MLP weights to specifically align unsafe inputs into the weights' null space. This yields a clustering of inputs based on whether the model deems them safe or not. Correspondingly, when an adversarial input (e.g., a jailbreak) is provided, its activations are closer to safer samples, leading to the model processing such an input as if it were safe. We validate our findings, wherever possible, on real-world models, specifically Llama-2 7B and Llama-3 8B.
Localization recall precision (LRP): A new performance metric for object detection
Average precision (AP), the area under the recall-precision (RP) curve, is the standard performance measure for object detection. Despite its wide acceptance, it has a number of shortcomings, the most important of which are (i) the inability to distinguish very different RP curves, and (ii) the lack of directly measuring bounding box localization accuracy. In this paper, we propose “Localization Recall Precision (LRP) Error”, a new metric specifically designed for object detection. LRP Error is composed of three components related to localization, false negative (FN) rate and false positive (FP) rate. Based on LRP, we introduce the “Optimal LRP” (oLRP), the minimum achievable LRP error representing the best achievable configuration of the detector in terms of recall-precision and the tightness of the boxes. In contrast to AP, which considers precisions over the entire recall domain, oLRP determines the “best” confidence score threshold for a class, which balances the trade-off between localization and recall-precision. In our experiments, we show that oLRP provides richer and more discriminative information than AP. We also demonstrate that the best confidence score thresholds vary significantly among classes and detectors. Moreover, we present LRP results of a simple online video object detector and show that the class-specific optimized thresholds increase the accuracy against the common approach of using a general threshold for all classes. Our experiments demonstrate that LRP is more competent than AP in capturing the performance of detectors. Our source code for the PASCAL VOC and MSCOCO datasets is provided at https://github.com/cancam/LRP
Comparative efficacy of topical tetraVisc versus lidocaine gel in cataract surgery
Background:
To compare the clinical efficacy of lidocaine 2% with tetracaine 0.5% for cataract surgery.
Methods:
In a randomized, multi-surgeon, controlled clinical trial, 122 consecutive cataract cases eligible for topical anesthesia were randomly assigned to receive lidocaine 2% gel (1 ml) or tetracaine solution 0.5% (TetraVisc, 0.5 ml) before clear corneal phacoemulsification. Main outcome measure was visual analog scale (0 to 10), which was used to measure intra-operative pain. Secondary outcome measures included patients' discomfort due to tissue manipulation and surgeon graded patients' cooperation. Duration of surgery and intra-operative complications were also recorded.
Results:
The mean age in TetraVisc (TV) group was 70.4 years and in the lidocaine gel group (LG) it was 70.6 years (p = 0.89). Patient reported mean intra-operative pain scores by visual analog scale were 0.70 ± 0.31 in TV group and 1.8 ± 0.4 in LG group (P < 0.001). Mean patient cooperation was also marginally better in the TV group (8.3 ± 0.3) compared to LG group (8.4 ± 0.6) (P = 0.25). 96% of patients in TV group showed intra-operative corneal clarity compared to 91% in LG group. TV group had less (1 out of 61 patients, 1.6%) intra-operative complications than LG group (3 out of 61 patients, 4.8%). No anesthesia related complications were noted in either group.
Conclusion:
Topical TetraVisc solution was superior to lidocaine 2% gel for pain control in patients undergoing clear corneal phacoemulsification. Lidocaine 2% gel is similar to TetraVisc in patient comfort and surgeon satisfaction.
Trial Registration:
Clinical trials number: ISRCTN78374774
Second-look PET-CT following an initial incomplete PET-CT response to (chemo)radiotherapy for head and neck squamous cell carcinoma
OBJECTIVES:
The limited positive predictive value of an incomplete response on PET-CT following (chemo)radiotherapy for head and neck squamous cell carcinoma (HNSCC) means that the optimal management strategy remains uncertain. The aim of the study is to assess the utility of a 'second-look' interval PET-CT.
METHODS:
Patients with HNSCC who were treated with (chemo)radiotherapy between 2008 and 2017 and underwent (i) baseline and (ii) response assessment PET-CT and (iii) second-look PET-CT following incomplete (positive or equivocal scan) response were included. Endpoints were conversion rate to complete response (CR) and test characteristics of the second-look PET-CT.
RESULTS:
Five hundred sixty-two patients with HNSCC underwent response assessment PET-CT at a median of 17 weeks post-radiotherapy. Following an incomplete response on PET-CT, 40 patients underwent a second-look PET-CT at a median of 13 weeks (range 6-25) from the first response PET-CT. Thirty-four out of 40 (85%) patients had oropharyngeal carcinoma. Twenty-four out of 40 (60%) second-look PET-CT scans converted to a complete locoregional response. The primary tumour conversion rate was 15/27 (56%) and the lymph node conversion rate was 14/19 (74%). The sensitivity, specificity, positive predictive value and negative predictive value (NPV) of the second-look PET-CT were 75%, 75%, 25% and 96% for the primary tumour and 100%, 92%, 40% and 100% for lymph nodes. There were no cases of progression following conversion to CR in the primary site or lymph nodes.
CONCLUSIONS:
The majority of patients who undergo a second-look PET-CT convert to a CR. The NPV of a second-look PET-CT is high, suggesting the potential to avoid surgical intervention.
KEY POINTS:
• PET-CT is a useful tool for response assessment following (chemo)radiotherapy for head and neck squamous cell carcinoma.
• An incomplete response on PET-CT has a limited positive predictive value and optimal management is uncertain.
• These data show that with a 'second-look' interval PET-CT, the majority of patients convert to a complete metabolic response. When there is doubt about clinical and radiological response, a 'second-look' PET-CT can be used to spare patients unnecessary surgical intervention.
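For readers less familiar with the test characteristics quoted in this abstract, the sketch below shows how sensitivity, specificity, positive predictive value and negative predictive value follow from a 2x2 confusion table. The counts used are purely illustrative and are not taken from the study, whose underlying table is not given here; the function name `diagnostic_metrics` is hypothetical.

```python
import math
from typing import Dict


def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Standard test characteristics from a 2x2 confusion table."""

    def ratio(num: int, den: int) -> float:
        # Undefined when the denominator is zero, so report NaN.
        return num / den if den else math.nan

    return {
        "sensitivity": ratio(tp, tp + fn),  # diseased cases the scan flags
        "specificity": ratio(tn, tn + fp),  # disease-free cases the scan clears
        "ppv": ratio(tp, tp + fp),          # positive scans that are true disease
        "npv": ratio(tn, tn + fn),          # negative scans that are truly clear
    }


# Illustrative counts only (NOT study data).
metrics = diagnostic_metrics(tp=8, fp=2, fn=1, tn=29)
```

A high NPV, as reported in the abstract, means that a negative second-look scan is rarely followed by residual disease, which is the basis for the suggestion that surgery might be avoided.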
Mediator Condensates Localize Signaling Factors to Key Cell Identity Genes
The gene expression programs that define the identity of each cell are controlled by master transcription factors (TFs) that bind cell-type-specific enhancers, as well as signaling factors, which bring extracellular stimuli to these enhancers. Recent studies have revealed that master TFs form phase-separated condensates with the Mediator coactivator at super-enhancers. Here, we present evidence that signaling factors for the WNT, TGF-β, and JAK/STAT pathways use their intrinsically disordered regions (IDRs) to enter and concentrate in Mediator condensates at super-enhancers. We show that the WNT coactivator β-catenin interacts both with components of condensates and DNA-binding factors to selectively occupy super-enhancer-associated genes. We propose that the cell-type specificity of the response to signaling is mediated in part by the IDRs of the signaling factors, which cause these factors to partition into condensates established by the master TFs and Mediator at genes with prominent roles in cell identity.