Correction to Figures: A Reply to Hwang and Peli (2014)
In Hwang and Peli (2014), a few errors occurred in computing the angular disparities. When the computational errors are corrected, the direction of the peripheral depth distortion (the difference between the angular disparities in real-world 3D viewing and in S3D viewing) is reversed, so peripheral depth is perceived as expanded, not compressed. This reply points out the errors and provides the corrected figures. Correcting these errors does not affect the general conclusion that S3D viewed on a single-screen display induces peripheral depth distortion, which may be a cause of visually induced motion sickness.
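The quantity at issue, the angular disparity between a fixated point and a target at a different distance, follows from standard viewing geometry. A minimal sketch using the textbook vergence-angle definition (our illustration, not the authors' corrected computation):

```python
import numpy as np

def vergence_angle(distance_m, ipd_m=0.065):
    # Angle (radians) subtended at the two eyes by a point at the
    # given viewing distance, for an interocular distance ipd_m.
    return 2.0 * np.arctan(ipd_m / (2.0 * distance_m))

def angular_disparity(d_fixated_m, d_target_m, ipd_m=0.065):
    # Relative disparity between a fixated point and a target point;
    # positive (crossed) when the target is nearer than fixation.
    return vergence_angle(d_target_m, ipd_m) - vergence_angle(d_fixated_m, ipd_m)

# Example: fixating at 2.0 m, peripheral target at 1.8 m.
print(np.degrees(angular_disparity(2.0, 1.8)))  # ~0.21 deg, crossed
```

The peripheral distortion in question is the difference between this quantity computed for real-world viewing geometry and for the geometry implied by an S3D display; the sign of that difference determines whether peripheral depth appears expanded or compressed.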
Exploring Cycle Consistency Learning in Interactive Volume Segmentation
Automatic medical volume segmentation often lacks clinical accuracy, necessitating further refinement. In this work, we interactively approach medical volume segmentation via two decoupled modules: interaction-to-segmentation and segmentation propagation. Given a medical volume, a user first segments a slice (or several slices) via the interaction module and then propagates the segmentation(s) to the remaining slices. The user may repeat this process multiple times until a sufficiently high volume segmentation quality is achieved. However, due to the lack of human correction during propagation, segmentation errors are prone to accumulate in the intermediate slices and may lead to sub-optimal performance. To alleviate this issue, we propose a simple yet effective cycle consistency loss that regularizes an intermediate segmentation by referencing the accurate segmentation in the starting slice. To this end, we introduce a backward segmentation path that propagates the intermediate segmentation back to the starting slice using the same propagation network. With cycle consistency training, the propagation network is better regularized than in standard forward-only training approaches. Evaluation results on the challenging AbdomenCT-1K and OAI-ZIB datasets demonstrate the effectiveness of our method.
Code: https://github.com/uncbiag/iSegFormer/tree/v2
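A minimal sketch of such a cycle consistency term, assuming a generic propagation network with the (hypothetical) signature propagate(ref_image, ref_mask, target_image) -> mask logits:

```python
import torch
import torch.nn.functional as F

def cycle_consistency_loss(propagate, start_img, start_mask, mid_img):
    # Forward path: carry the accurate starting-slice mask to an
    # intermediate slice.
    mid_mask = torch.sigmoid(propagate(start_img, start_mask, mid_img))
    # Backward path: propagate the intermediate prediction back to the
    # starting slice with the SAME network.
    back_logits = propagate(mid_img, mid_mask, start_img)
    # The round trip should reproduce the trusted starting segmentation.
    return F.binary_cross_entropy_with_logits(back_logits, start_mask)
```

Added to the usual forward supervision, this term penalizes the propagation network whenever errors accumulated on an intermediate slice prevent it from recovering the known starting-slice mask.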
Disguise without Disruption: Utility-Preserving Face De-Identification
With the rise of cameras and smart sensors, humanity generates an exponentially growing amount of data. This valuable information, including underrepresented cases such as AI in medical settings, can fuel new deep-learning tools. However, data scientists must prioritize privacy for the individuals in these untapped datasets, especially for images or videos with faces, which are prime targets for identification methods. Proposed solutions to de-identify such images often compromise non-identifying facial attributes relevant to downstream tasks. In this paper, we introduce Disguise, a novel algorithm that seamlessly de-identifies facial images while ensuring the usability of the modified data. Unlike previous approaches, our solution is firmly grounded in the domains of differential privacy and ensemble-learning research. Our method extracts and substitutes the depicted identities with synthetic ones, generated using variational mechanisms to maximize obfuscation and non-invertibility. Additionally, we leverage supervision from a mixture-of-experts to disentangle and preserve other utility attributes. We extensively evaluate our method on multiple datasets, demonstrating a higher de-identification rate and superior consistency in various downstream tasks compared to prior approaches.
Accepted at AAAI 2024 (paper + supplementary material).
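A schematic of the two competing objectives, with id_encoder and the per-attribute experts standing in for pretrained networks (all names are hypothetical; this is one reading of the abstract, not the released implementation):

```python
import torch
import torch.nn.functional as F

def disguise_objective(id_encoder, experts, original, anonymized):
    # Obfuscation: the anonymized face's identity embedding should not
    # resemble the original identity.
    id_sim = F.cosine_similarity(id_encoder(original),
                                 id_encoder(anonymized), dim=-1)
    loss_id = F.relu(id_sim).mean()
    # Utility preservation: each expert (e.g., age, expression, gaze)
    # should predict the same attributes before and after the swap.
    loss_utility = sum(F.mse_loss(expert(anonymized),
                                  expert(original).detach())
                       for expert in experts)
    return loss_id + loss_utility
```

Non-invertibility would come from how the synthetic identity is produced (sampled from a variational mechanism rather than derived deterministically from the original), which this loss-only sketch does not show.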
Stereoscopic 3D geometric distortions analyzed from the viewer's point of view.
Stereoscopic 3D (S3D) geometric distortions can be introduced by mismatches among image capture, display, and viewing configurations. In previous work on S3D geometric models, geometric distortions have been analyzed from a third-person perspective based on the binocular depth cue (i.e., binocular disparity). A third-person perspective differs from what the viewer sees, since monocular depth cues (e.g., linear perspective, occlusion, and shadows) differ across perspectives. However, depth perception in a 3D space involves both monocular and binocular depth cues, so geometric distortions predicted solely from the binocular depth cue cannot describe what a viewer really perceives. In this paper, we combine geometric models and retinal disparity models to analyze geometric distortions from the viewer's perspective, where both monocular and binocular depth cues are considered. Results show that binocular and monocular depth cues conflict in a geometrically distorted S3D space. Moreover, user-initiated head translations away from the optimal viewing position in conventional S3D displays can also introduce geometric distortions that are inconsistent with our natural 3D viewing condition. The inconsistency of depth cues in a dynamic scene may be a source of visually induced motion sickness.
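The binocular part of such an analysis reduces to intersecting the two eye-to-screen rays. The sketch below (our simplification, restricted to the horizontal plane) shows how a lateral head offset h shears the fused point sideways while the disparity-defined depth stays fixed, a deformation a rigid real-world scene would never undergo:

```python
def fused_point(x_left, x_right, h=0.0, eye_sep=0.065, screen_dist=2.0):
    # Eyes at x = h -/+ eye_sep/2; the screen plane at depth screen_dist
    # carries the left/right image points x_left and x_right.
    e_left = h - eye_sep / 2.0
    parallax = x_right - x_left
    # Depth of the ray intersection (independent of the head offset h).
    z = screen_dist * eye_sep / (eye_sep - parallax)
    # Lateral position along the left-eye ray at that depth.
    x = e_left + (x_left - e_left) * z / screen_dist
    return x, z

# A point rendered 0.5 m behind a 2 m screen, straight ahead of the
# intended (centered) viewer; moving the head shears it laterally:
for h in (0.0, 0.1, 0.2):
    print(h, fused_point(-0.0065, 0.0065, h))
# x drifts from 0.0 to -0.05 m while z stays at 2.5 m
```

Monocular cues rendered on the flat screen do not follow this shear, giving one concrete source of the binocular/monocular conflict described above.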
Correcting geometric distortions in stereoscopic 3D imaging.
Motion in a distorted virtual 3D space may cause visually induced motion sickness. Geometric distortions in stereoscopic 3D can result from mismatches among image capture, display, and viewing parameters. Three pairs of potential mismatches are considered: 1) camera separation vs. eye separation, 2) camera field of view (FOV) vs. screen FOV, and 3) camera convergence distance (i.e., the distance from the cameras to the point where their optical axes intersect) vs. screen distance from the observer. The effect of the viewer's head position (i.e., lateral head offset from the screen center) is also considered. The geometric model is expressed as a function of the camera convergence distance, the ratios of the three parameter pairs, and the offset of the head position. We analyze the impacts of these five variables separately, as well as their interactions, on geometric distortions. The model offers insight into the various distortions and leads to methods whereby the user can minimize geometric distortions caused by some parameter-pair mismatches by adjusting other parameter pairs. For example, in postproduction, viewers can correct for a mismatch between camera separation and eye separation by adjusting their distance from the real screen and changing the effective camera convergence distance.
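Under a simplified parallel-camera model (our assumption for illustration; the paper's model is more general), the mapping from scene depth to perceived depth makes the parameter-pair interactions concrete:

```python
def perceived_distance(Z, cam_sep, conv_dist, focal, mag,
                       eye_sep=0.065, view_dist=2.0):
    # Parallel cameras with horizontal image translation: on-screen
    # parallax is zero at the convergence distance and uncrossed
    # (positive) for scene points beyond it.
    parallax = mag * focal * cam_sep * (1.0 / conv_dist - 1.0 / Z)
    # Perceived distance of the fused point; parallax approaching
    # eye_sep pushes the point toward infinity.
    return eye_sep * view_dist / (eye_sep - parallax)

eye_sep, view_dist, conv, focal, cam_sep = 0.065, 2.0, 2.0, 0.035, 0.065
mag = eye_sep * conv / (focal * cam_sep)  # magnification matching all pairs
for Z in (1.0, 1.5, 2.0, 3.0):
    matched = perceived_distance(Z, cam_sep, conv, focal, mag)
    doubled = perceived_distance(Z, 2 * cam_sep, conv, focal, mag)
    print(f"Z={Z:4.1f}  matched={matched:5.2f}  doubled_cam_sep={doubled:5.2f}")
```

With matched parameters the mapping is the identity (perceived depth equals scene depth); doubling the camera separation expands depth about the screen plane (1 m maps to 0.67 m and 3 m to 6.0 m), a distortion that, per the model, can be traded off against the other parameter pairs.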
Learning Local Neighboring Structure for Robust 3D Shape Representation
Mesh is a powerful data structure for 3D shapes, and representation learning for 3D meshes is important in many computer vision and graphics applications. The recent success of convolutional neural networks (CNNs) for structured data (e.g., images) suggests the value of adapting their insights to 3D shapes. However, 3D shape data are irregular, since each node's neighbors are unordered. Various graph neural networks for 3D shapes have been developed with isotropic filters or predefined local coordinate systems to overcome this node inconsistency, but both choices limit representation power. In this paper, we propose a local structure-aware anisotropic convolutional operation (LSA-Conv) that learns an adaptive weighting matrix for each node according to its local neighboring structure and applies shared anisotropic filters. In fact, the learnable weighting matrix is similar to the attention matrix in the Random Synthesizer, a recently proposed Transformer model for natural language processing (NLP). Comprehensive experiments demonstrate that our model produces significant improvements in 3D shape reconstruction compared to state-of-the-art methods.
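A minimal sketch of how such an operation could look for registered meshes with a fixed K-neighbor ring per node (our reading of the abstract, not the authors' reference implementation):

```python
import torch
import torch.nn as nn

class LSAConv(nn.Module):
    # Each node owns a learnable soft alignment over its K unordered
    # neighbors, so one shared anisotropic filter applies at every node.
    def __init__(self, num_nodes, k, c_in, c_out):
        super().__init__()
        self.align = nn.Parameter(torch.randn(num_nodes, k, k) * 0.01)
        self.filt = nn.Parameter(torch.randn(k, c_in, c_out) * 0.01)

    def forward(self, x, neighbors):
        # x: (B, N, C_in); neighbors: (N, K) long tensor of ring indices.
        feats = x[:, neighbors]                        # (B, N, K, C_in)
        # The per-node weighting softly "sorts" the unordered neighbors.
        feats = torch.einsum('nkj,bnjc->bnkc',
                             self.align.softmax(dim=-1), feats)
        # One shared weight matrix per neighbor slot: anisotropic filter.
        return torch.einsum('bnkc,kcd->bnd', feats, self.filt)
```

Like the attention matrix in a Random Synthesizer, the align parameters are learned directly rather than computed from queries and keys, which suits a fixed mesh topology where every node has the same ring size.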