ANAESTHETIC MANAGEMENT OF A PATIENT WITH PREVIOUS LOBECTOMY POSTED FOR EMERGENCY MODIFIED RADICAL MASTOIDECTOMY
Pulmonary disease is a risk factor for several respiratory complications during the perioperative period. Here we present the case of a middle-aged man who underwent modified radical mastoidectomy 23 years after a left-sided lobectomy, to illustrate the salient anaesthetic considerations in this scenario.
Keywords: Lobectomy, Mastoidectomy, Anaesthesia
Omni3D: A Large Benchmark and Model for 3D Object Detection in the Wild
Recognizing scenes and objects in 3D from a single image is a longstanding
goal of computer vision with applications in robotics and AR/VR. For 2D
recognition, large datasets and scalable solutions have led to unprecedented
advances. In 3D, existing benchmarks are small in size and approaches
specialize in few object categories and specific domains, e.g. urban driving
scenes. Motivated by the success of 2D recognition, we revisit the task of 3D
object detection by introducing a large benchmark, called Omni3D. Omni3D
re-purposes and combines existing datasets, resulting in 234k images annotated
with more than 3 million instances across 98 categories. 3D detection at such
scale is challenging due to variations in camera intrinsics and the rich
diversity of scene and object types. We propose a model, called Cube R-CNN,
designed to generalize across camera and scene types with a unified approach.
We show that Cube R-CNN outperforms prior works on the larger Omni3D and
existing benchmarks. Finally, we prove that Omni3D is a powerful dataset for 3D
object recognition and show that it improves single-dataset performance and can
accelerate learning on new smaller datasets via pre-training.
Comment: CVPR 2023. Project website: https://omni3d.garrickbrazil.com
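As a brief aside on the camera-intrinsics variation mentioned above, the sketch below is a hedged illustration (not code from Omni3D or Cube R-CNN, and the intrinsics values are made up) of standard pinhole projection: the same metric-scale box lands on very different pixels under different cameras, which is the variation a cross-dataset 3D detector has to account for.

```python
# Standard pinhole projection of a 3D box's corners into the image plane.
# Intrinsics and box values are illustrative, not taken from Omni3D.
import numpy as np

K = np.array([[600.0,   0.0, 320.0],   # fx, 0, cx
              [  0.0, 600.0, 240.0],   # 0, fy, cy
              [  0.0,   0.0,   1.0]])

# Eight corners of an axis-aligned 1 m cube centered 5 m in front of the camera.
corners = np.array([[x, y, z]
                    for x in (-0.5, 0.5)
                    for y in (-0.5, 0.5)
                    for z in (4.5, 5.5)])

uvw = (K @ corners.T).T          # apply the intrinsics
uv = uvw[:, :2] / uvw[:, 2:3]    # perspective divide -> pixel coordinates

# Changing fx, fy, cx, cy moves these pixel coordinates even though the
# underlying metric box is identical, so a detector trained across datasets
# must condition on (or normalize away) the camera parameters.
print(uv.round(1))
```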
FACET: Fairness in Computer Vision Evaluation Benchmark
Computer vision models have known performance disparities across attributes
such as gender and skin tone. This means during tasks such as classification
and detection, model performance differs for certain classes based on the
demographics of the people in the image. These disparities have been shown to
exist, but until now there has not been a unified approach to measure these
differences for common use-cases of computer vision models. We present a new
benchmark named FACET (FAirness in Computer Vision EvaluaTion), a large,
publicly available evaluation set of 32k images for some of the most common
vision tasks - image classification, object detection and segmentation. For
every image in FACET, we hired expert reviewers to manually annotate
person-related attributes such as perceived skin tone and hair type, manually
draw bounding boxes, and label fine-grained person-related classes such as disc
jockey or guitarist. In addition, we use FACET to benchmark state-of-the-art
vision models and present a deeper understanding of potential performance
disparities and challenges across sensitive demographic attributes. With the
exhaustive annotations collected, we probe models using single demographic
attributes as well as combinations of attributes with an intersectional approach
(e.g. hair color and perceived skin tone). Our results show that
classification, detection, segmentation, and visual grounding models exhibit
performance disparities across demographic attributes and intersections of
attributes. These harms suggest that not all people represented in datasets
receive fair and equitable treatment in these vision tasks. We hope current and
future results using our benchmark will contribute to fairer, more robust
vision models. FACET is available publicly at https://facet.metademolab.com
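To make the notion of disaggregated and intersectional measurement concrete, here is a minimal sketch; the column names and numbers are hypothetical and do not reflect FACET's actual annotation schema or results.

```python
# Hypothetical sketch of disaggregated evaluation over demographic
# attributes and their intersections; columns and values are illustrative,
# not FACET's actual schema.
import pandas as pd

# One row per annotated person: whether the model's prediction was correct,
# plus perceived-attribute labels from reviewers.
results = pd.DataFrame({
    "correct":   [1, 0, 1, 1, 0, 1],
    "skin_tone": ["light", "light", "dark", "dark", "dark", "light"],
    "hair_type": ["curly", "straight", "curly", "curly", "straight", "straight"],
})

# Accuracy disaggregated by a single demographic attribute.
per_tone = results.groupby("skin_tone")["correct"].mean()

# Intersectional accuracy (e.g. skin tone x hair type).
per_intersection = results.groupby(["skin_tone", "hair_type"])["correct"].mean()

# The gap between the best- and worst-performing groups is one simple
# disparity measure.
disparity = per_intersection.max() - per_intersection.min()
print(per_tone, per_intersection, disparity, sep="\n")
```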
Segment Anything
We introduce the Segment Anything (SA) project: a new task, model, and
dataset for image segmentation. Using our efficient model in a data collection
loop, we built the largest segmentation dataset to date (by far), with over 1
billion masks on 11M licensed and privacy-respecting images. The model is
designed and trained to be promptable, so it can transfer zero-shot to new
image distributions and tasks. We evaluate its capabilities on numerous tasks
and find that its zero-shot performance is impressive -- often competitive with
or even superior to prior fully supervised results. We are releasing the
Segment Anything Model (SAM) and corresponding dataset (SA-1B) of 1B masks and
11M images at https://segment-anything.com to foster research into foundation
models for computer vision.
Comment: Project web page: https://segment-anything.com
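For readers who want to try the promptable interface, a minimal point-prompt sketch follows, assuming the released `segment_anything` Python package and a downloaded ViT-H checkpoint; the image and checkpoint paths are illustrative.

```python
# Minimal point-prompt sketch, assuming `pip install segment-anything`
# and a locally downloaded ViT-H checkpoint (path is illustrative).
import cv2
import numpy as np
from segment_anything import SamPredictor, sam_model_registry

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)

image = cv2.cvtColor(cv2.imread("example.jpg"), cv2.COLOR_BGR2RGB)
predictor.set_image(image)  # embed the image once; prompts are then cheap

# A single foreground point prompt at pixel (x, y); label 1 = foreground.
masks, scores, logits = predictor.predict(
    point_coords=np.array([[500, 375]]),
    point_labels=np.array([1]),
    multimask_output=True,  # return several candidate masks
)
best_mask = masks[np.argmax(scores)]  # boolean HxW mask with the highest score
```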
Learning 3D Object Shape and Layout without 3D Supervision
A 3D scene consists of a set of objects, each with a shape and a layout
giving their position in space. Understanding 3D scenes from 2D images is an
important goal, with applications in robotics and graphics. While there have
been recent advances in predicting 3D shape and layout from a single image,
most approaches rely on 3D ground truth for training which is expensive to
collect at scale. We overcome these limitations and propose a method that
learns to predict 3D shape and layout for objects without any ground truth
shape or layout information: instead we rely on multi-view images with 2D
supervision which can more easily be collected at scale. Through extensive
experiments on 3D Warehouse, Hypersim, and ScanNet we demonstrate that our
approach scales to large datasets of realistic images, and compares favorably
to methods relying on 3D ground truth. On Hypersim and ScanNet, where reliable 3D ground truth is not available, our approach outperforms supervised approaches trained on smaller and less diverse datasets.
Comment: CVPR 2022. Project page: https://gkioxari.github.io/usl
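As a toy illustration of why multi-view 2D supervision can stand in for 3D ground truth (this is not the paper's training objective), the sketch below recovers a 3D point purely by minimizing a 2D reprojection loss across two views; the camera poses and pixel observations are made up.

```python
# Toy illustration (not the paper's method): 2D observations from several
# views constrain a 3D quantity -- here a single 3D point -- through a
# differentiable reprojection loss. All values are made up.
import torch

K = torch.tensor([[500., 0., 320.],
                  [0., 500., 240.],
                  [0., 0., 1.]])  # shared pinhole intrinsics

def project(p_cam):
    """Project a 3D point in camera coordinates to pixel coordinates."""
    uvw = K @ p_cam
    return uvw[:2] / uvw[2]

# Two views: (rotation, translation, observed pixel of the point).
views = [
    (torch.eye(3), torch.zeros(3), torch.tensor([345.0, 252.5])),
    (torch.eye(3), torch.tensor([-1., 0., 0.]), torch.tensor([220.0, 252.5])),
]

point = torch.tensor([0., 0., 5.], requires_grad=True)  # initial 3D guess
opt = torch.optim.Adam([point], lr=0.02)
for _ in range(500):
    loss = sum(((project(R @ point + t) - uv) ** 2).sum() for R, t, uv in views)
    opt.zero_grad()
    loss.backward()
    opt.step()

print(point.detach())  # moves toward roughly (0.2, 0.1, 4.0), consistent with both views
```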