108 research outputs found

    Generating High-Resolution 3D Faces and Bodies Using VQ-VAE-2 with PixelSNAIL Networks on 2D Representations

    Modeling and representing 3D shapes of the human body and face is a prominent field due to its applications in the healthcare, clothing, and movie industries. In our work, we tackled the problem of 3D face and body synthesis by reducing 3D meshes to 2D image representations. We show that the face can naturally be modeled on a 2D grid. At the same time, for more challenging 3D body geometries, we proposed a novel non-bijective 3D–2D conversion method representing the 3D body mesh as a plurality of rendered projections on the 2D grid. Then, we trained a state-of-the-art vector-quantized variational autoencoder (VQ-VAE-2) to learn a latent representation of 2D images and fit a PixelSNAIL autoregressive model to sample novel synthetic meshes. We evaluated our method versus a classical one based on principal component analysis (PCA) by sampling from the empirical cumulative distribution of the PCA scores. We used the empirical distributions of two commonly used metrics, specificity and diversity, to quantitatively demonstrate that the synthetic faces generated with our method are statistically closer to real faces when compared with the PCA ones. Our experiment on the 3D body geometry requires further research to match the test set statistics but shows promising results.

    LXL: LiDAR Excluded Lean 3D Object Detection with 4D Imaging Radar and Camera Fusion

    As an emerging technology and a relatively affordable device, the 4D imaging radar has already been confirmed effective in performing 3D object detection in autonomous driving. Nevertheless, the sparsity and noisiness of 4D radar point clouds hinder further performance improvement, and in-depth studies about its fusion with other modalities are lacking. On the other hand, most of the camera-based perception methods transform the extracted image perspective view features into the bird's-eye view geometrically via "depth-based splatting" proposed in Lift-Splat-Shoot (LSS), and some researchers exploit other modalities such as LiDARs or ordinary automotive radars for enhancement. Recently, a few works have applied the "sampling" strategy for image view transformation, showing that it outperforms "splatting" even without image depth prediction. However, the potential of "sampling" is not fully unleashed. In this paper, we investigate the "sampling" view transformation strategy on the camera and 4D imaging radar fusion-based 3D object detection. In the proposed model, LXL, predicted image depth distribution maps and radar 3D occupancy grids are utilized to aid image view transformation, called "radar occupancy-assisted depth-based sampling". Experiments on VoD and TJ4DRadSet datasets show that the proposed method outperforms existing 3D object detection methods by a significant margin without bells and whistles. Ablation studies demonstrate that our method performs the best among different enhancement settings.

    Probing Interface of Perovskite Oxide Using Surface-specific Terahertz Spectroscopy

    The surface/interface species in perovskite oxides play an essential role in many novel emergent physical phenomena and chemical processes. With low eigen-energy in the terahertz region, such species at buried interfaces remain poorly understood due to the lack of feasible experimental techniques. Here, we show that vibrational resonances and two-dimensional electron gas at the interface can be characterized using surface-specific nonlinear spectroscopy in the terahertz range. This technique uses an intra-pulse difference frequency mixing (DFM) process, which is allowed only at the surface/interface of a medium with inversion symmetry. Sub-monolayer sensitivity can be achieved using the state-of-the-art detection scheme for the terahertz emission from the surface/interface. As a demonstration, a Drude-like nonlinear response from the two-dimensional electron gas emerging at the LaAlO3/SrTiO3 or Al2O3/SrTiO3 interface was successfully observed. Meanwhile, the interfacial vibrational spectrum of the ferroelectric soft mode of SrTiO3 at 2.8 THz, which was polarized by the surface field in the interfacial region, was also obtained. The corresponding surface/interface potential, which is a key parameter for SrTiO3-based interface superconductivity and photocatalysis, can now be determined optically via quantitative analysis of the polarized phonon spectrum. The interfacial species with resonant frequencies in the THz region revealed by our method provide more insight into the physical properties of complex oxides. Comment: arXiv admin note: substantial text overlap with arXiv:2207.1461

    Building High-accuracy Multilingual ASR with Gated Language Experts and Curriculum Training

    We propose gated language experts and curriculum training to enhance multilingual transformer transducer models without requiring language identification (LID) input from users during inference. Our method incorporates a gating mechanism and LID loss, enabling transformer experts to learn language-specific information. By combining gated transformer experts with shared transformer layers, we construct multilingual transformer blocks and utilize linear experts to effectively regularize the joint network. The curriculum training scheme leverages LID to guide the gated experts in improving their respective language performance. Experimental results on a bilingual task involving English and Spanish demonstrate significant improvements, with average relative word error reductions of 12.5% and 7.3% compared to the baseline bilingual and monolingual models, respectively. Notably, our method achieves performance comparable to the upper-bound model trained and inferred with oracle LID. Extending our approach to trilingual, quadrilingual, and pentalingual models reveals similar advantages to those observed in the bilingual models, highlighting its ease of extension to multiple languages.
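
    The "relative word error reduction" quoted in this abstract is conventionally computed as the drop in word error rate (WER) divided by the baseline WER. A minimal sketch of that arithmetic follows; the function name and the input numbers are hypothetical and are not taken from the paper.

```python
def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Return the relative word error rate reduction as a percentage.

    Both arguments are WERs in the same unit (e.g. percent).
    """
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


# Hypothetical WERs chosen only for illustration (not results from the paper):
# a baseline at 10.0% WER and an improved model at 9.0% WER.
reduction = relative_wer_reduction(baseline_wer=10.0, new_wer=9.0)
print(f"{reduction:.1f}% relative reduction")
```

    A relative figure is a different quantity from the absolute WER difference: in the hypothetical case above, the absolute drop is 1.0 point while the relative reduction is 10%.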

    Large Quantum Anomalous Hall Effect in Spin-Orbit Proximitized Rhombohedral Graphene

    The quantum anomalous Hall effect (QAHE) is a robust topological phenomenon that features quantized Hall resistance at zero magnetic field. Here we report the observation of the QAHE in a rhombohedral pentalayer graphene/monolayer WS2 heterostructure. Distinct from all existing QAHE systems, this system has neither magnetic element nor moiré superlattice effect. The QAH states emerge at charge neutrality and feature Chern numbers C = ±5 at temperatures up to about 1.5 K. This large QAHE in our system arises from the synergy of the electron correlation effect in intrinsic flat bands of pentalayer graphene, the gate-tuning effect that breaks the layer/spin-degeneracy, and the proximity-induced Ising spin-orbit-coupling (SOC) effect that further lifts the valley-degeneracy. Our experiment demonstrates the great potential of crystalline two-dimensional materials for intertwined electron correlation and band topology physics, and points to engineering chiral Majorana edge states towards topological quantum computation.
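
    For orientation, the quantized Hall resistance of a state with Chern number C follows the standard relation below. The worked value uses the von Klitzing constant h/e^2 and is a textbook calculation, not a number reported in the abstract.

```latex
\[
  |R_{xy}| = \frac{h}{|C|\,e^{2}}, \qquad
  \frac{h}{e^{2}} \approx 25812.807\ \Omega
  \;\Longrightarrow\;
  |R_{xy}| \approx \frac{25812.807\ \Omega}{5} \approx 5162.6\ \Omega
  \quad \text{for } |C| = 5 .
\]
```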

    Assembly and comparison of two closely related Brassica napus genomes

    As an increasing number of plant genome sequences become available, it is clear that gene content varies between individuals, and the challenge arises to predict the gene content of a species. However, genome comparison is often confounded by variation in assembly and annotation. Differentiating between true gene absence and variation in assembly or annotation is essential for the accurate identification of conserved and variable genes in a species. Here, we present the de novo assembly of the B. napus cultivar Tapidor and comparison with an improved assembly of the Brassica napus cultivar Darmor-bzh. Both cultivars were annotated using the same method to allow comparison of gene content. We identified genes unique to each cultivar and differentiated these from artefacts due to variation in the assembly and annotation. We demonstrate that using a common annotation pipeline can result in different gene predictions, even for closely related cultivars, and repeat regions which collapse during assembly impact whole genome comparison. After accounting for differences in assembly and annotation, we demonstrate that the genome of Darmor-bzh contains a greater number of genes than the genome of Tapidor. Our results are the first step towards comparison of the true differences between B. napus genomes and highlight the potential sources of error in future production of a B. napus pangenome.

    Insight-HXMT observations of Swift J0243.6+6124 during its 2017-2018 outburst

    The recently discovered neutron star transient Swift J0243.6+6124 has been monitored by the Hard X-ray Modulation Telescope (Insight-HXMT). Based on the obtained data, we investigate the broadband spectrum of the source throughout the outburst. We estimate the broadband flux of the source and search for a possible cyclotron line in the broadband spectrum. No evidence of line-like features is, however, found up to 150 keV. In the absence of any cyclotron line in its energy spectrum, we estimate the magnetic field of the source based on the observed spin evolution of the neutron star by applying two accretion torque models. In both cases, we get consistent results with $B \sim 10^{13}$ G, $D \sim 6$ kpc and a peak luminosity of $>10^{39}$ erg s$^{-1}$, which makes the source the first Galactic ultraluminous X-ray source hosting a neutron star. Comment: publishe