101 research outputs found

    Seeing a Rose in Five Thousand Ways

    What is a rose, visually? A rose comprises its intrinsics, including the distribution of geometry, texture, and material specific to its object category. With knowledge of these intrinsic properties, we may render roses of different sizes and shapes, in different poses, and under different lighting conditions. In this work, we build a generative model that learns to capture such object intrinsics from a single image, such as a photo of a bouquet. Such an image includes multiple instances of an object type. These instances all share the same intrinsics, but appear different due to a combination of variance within these intrinsics and differences in extrinsic factors, such as pose and illumination. Experiments show that our model successfully learns object intrinsics (distribution of geometry, texture, and material) for a wide range of objects, each from a single Internet image. Our method achieves superior results on multiple downstream tasks, including intrinsic image decomposition, shape and image generation, view synthesis, and relighting. Comment: Project page: https://cs.stanford.edu/~yzzhang/projects/rose

    Stanford-ORB: A Real-World 3D Object Inverse Rendering Benchmark

    We introduce Stanford-ORB, a new real-world 3D Object inverse Rendering Benchmark. Recent advances in inverse rendering have enabled a wide range of real-world applications in 3D content generation, moving rapidly from research and commercial use cases to consumer devices. While the results continue to improve, there is no real-world benchmark that can quantitatively assess and compare the performance of various inverse rendering methods. Existing real-world datasets typically only consist of the shape and multi-view images of objects, which are not sufficient for evaluating the quality of material recovery and object relighting. Methods capable of recovering material and lighting often resort to synthetic data for quantitative evaluation, which on the other hand does not guarantee generalization to complex real-world environments. We introduce a new dataset of real-world objects captured under a variety of natural scenes with ground-truth 3D scans, multi-view images, and environment lighting. Using this dataset, we establish the first comprehensive real-world evaluation benchmark for object inverse rendering tasks from in-the-wild scenes, and compare the performance of various existing methods. Comment: NeurIPS 2023 Datasets and Benchmarks Track. The first two authors contributed equally to this work. Project page: https://stanfordorb.github.io

    CTVIS: Consistent Training for Online Video Instance Segmentation

    The discrimination of instance embeddings plays a vital role in associating instances across time for online video instance segmentation (VIS). Instance embedding learning is directly supervised by the contrastive loss computed upon the contrastive items (CIs), which are sets of anchor/positive/negative embeddings. Recent online VIS methods leverage CIs sourced from one reference frame only, which we argue is insufficient for learning highly discriminative embeddings. Intuitively, a possible strategy to enhance CIs is replicating the inference phase during training. To this end, we propose a simple yet effective training strategy, called Consistent Training for Online VIS (CTVIS), which is devoted to aligning the training and inference pipelines in terms of building CIs. Specifically, CTVIS constructs CIs by drawing on the momentum-averaged embedding and memory bank storage mechanisms used at inference, and by adding noise to the relevant embeddings. Such an extension allows a reliable comparison between embeddings of current instances and the stable representations of historical instances, thereby conferring an advantage in modeling VIS challenges such as occlusion, re-identification, and deformation. Empirically, CTVIS outstrips the SOTA VIS models by up to +5.0 points on three VIS benchmarks, including YTVIS19 (55.1% AP), YTVIS21 (50.1% AP) and OVIS (35.5% AP). Furthermore, we find that pseudo-videos transformed from images can train robust models surpassing fully-supervised ones. Comment: Accepted by ICCV 2023. The code is available at https://github.com/KainingYing/CTVI

    Achieving large super-elasticity through changing relative easiness of deformation modes in Ti-Nb-Mo alloy by ultra-grain refinement

    Large super-elasticity approaching its theoretically expected value was achieved in a Ti-13.3Nb-4.6Mo alloy with an ultrafine-grained β-phase. In-situ synchrotron radiation X-ray diffraction analysis revealed that the dominant yielding mechanism changed from dislocation slip to martensitic transformation when the β-grain size was decreased to the sub-micrometer range. The difference in grain-size dependence between the critical stress to initiate dislocation slip and that to initiate martensitic transformation, reflected in the transition of yielding behavior, was considered to be the main reason for the large super-elasticity in the ultrafine-grained specimen. The present study clarified that ultra-grain refinement down to the sub-micrometer scale made dislocation slip more difficult than martensitic transformation, leading to excellent super-elasticity close to the theoretical limit in the β-Ti alloy.

    ZeroNVS: Zero-Shot 360-Degree View Synthesis from a Single Real Image

    We introduce a 3D-aware diffusion model, ZeroNVS, for single-image novel view synthesis for in-the-wild scenes. While existing methods are designed for single objects with masked backgrounds, we propose new techniques to address challenges introduced by in-the-wild multi-object scenes with complex backgrounds. Specifically, we train a generative prior on a mixture of data sources that capture object-centric, indoor, and outdoor scenes. To address issues from data mixture such as depth-scale ambiguity, we propose a novel camera conditioning parameterization and normalization scheme. Further, we observe that Score Distillation Sampling (SDS) tends to truncate the distribution of complex backgrounds during distillation of 360-degree scenes, and propose "SDS anchoring" to improve the diversity of synthesized novel views. Our model sets a new state-of-the-art result in LPIPS on the DTU dataset in the zero-shot setting, even outperforming methods specifically trained on DTU. We further adapt the challenging Mip-NeRF 360 dataset as a new benchmark for single-image novel view synthesis, and demonstrate strong performance in this setting. Our code and data are at http://kylesargent.github.io/zeronvs/ Comment: 17 page

    Relationship between motor performance and cortical activity of older neurological disorder patients with dyskinesia using fNIRS: A systematic review

    Background: Neurological disorders with dyskinesia seriously affect older people’s daily activities; the impairment is associated not only with degeneration or injury of the musculoskeletal or nervous system but also with the complex linkage between them. This study reviews the relationship between motor performance and cortical activity in older patients with typical neurological disorders with dyskinesia during walking and balance tasks. Methods: The Scopus, PubMed, and Web of Science databases were searched. Articles describing gait or balance performance and cortical activity in older patients with Parkinson’s disease (PD), multiple sclerosis, or stroke, measured using functional near-infrared spectroscopy, were screened by the reviewers. A total of 23 full-text articles were included for review, following an initial yield of 377 studies. Results: Participants were mostly PD patients, the prefrontal cortex was the most common region of interest, walking was the most common motor task, and four studies were interventional. Seven studies used statistical methods to interpret the relationship between motor performance and cortical activation. Motor performance and cortical activation were simultaneously affected under difficult walking and balance task conditions. The concurrent changes in motor performance and cortical activation in the reviewed studies included both same-direction and different-direction changes. Conclusion: Most of the reviewed studies reported poor motor performance and increased cortical activation in older patients with PD, stroke, or multiple sclerosis. Only external motor performance measures, such as step speed, were analyzed. The study designs and results were not comprehensive or in-depth. Walking training or physiotherapy lasting more than 5 weeks can improve motor function as well as cortical activation in PD and stroke patients. Thus, further study with more statistical analysis of the relationship between motor performance and activation of the motor-related cortex is needed. More intervention studies covering different types and programs of sports training are also needed.

    Cortical morphological heterogeneity of schizophrenia and its relationship with glutamatergic receptor variations

    Background: Recent genetic evidence implicates glutamatergic-receptor variations in schizophrenia. Glutamatergic excess during early life in people with schizophrenia may cause excitotoxicity and produce structural deficits in the brain. Cortical thickness and gyrification are reduced in schizophrenia, but only a subgroup of patients exhibits such structural deficits. We delineate the structural variations among unaffected siblings and patients with schizophrenia and study the role of key glutamate-receptor polymorphisms in these variations. Methods: Gaussian Mixture Model clustering was applied to the cortical thickness and gyrification data of 114 patients, 112 healthy controls, and 42 unaffected siblings to identify subgroups. The distribution of glutamate-receptor (GRM3, GRIN2A, and GRIA1) and voltage-gated calcium channel (CACNA1C) variations across the MRI-based subgroups was studied. Clinical symptoms and cognition were compared between patient subgroups. Results: We observed “hypogyric,” “impoverished-thickness,” and “supra-normal” subgroups of patients, with higher negative symptom burden and poorer verbal fluency in the hypogyric subgroup and notable functional deterioration in the impoverished-thickness subgroup. Compared to healthy subjects, the hypogyric subgroup had significant GRIN2A and GRM3 variations and the impoverished-thickness subgroup had CACNA1C variations, while the supra-normal group had no differences. Conclusions: Disrupted gyrification and thickness can be traced to glutamatergic receptor and voltage-gated calcium channel dysfunction, respectively, in schizophrenia. This raises the question of whether MRI-based multimetric subtyping may be relevant for clinical trials of agents affecting the glutamatergic system.