178 research outputs found

    Real-time Model Predictive Control and System Identification Using Differentiable Physics Simulation

    Developing robot controllers in a simulated environment is advantageous, but transferring the controllers to the target environment presents challenges, often referred to as the "sim-to-real gap". We present a method for continuous improvement of modeling and control after deploying the robot to a dynamically changing target environment. We develop a differentiable physics simulation framework that performs online system identification and optimal control simultaneously, using the incoming observations from the target environment in real time. To ensure robust system identification against noisy observations, we devise an algorithm to assess the confidence of our estimated parameters, using numerical analysis of the dynamic equations. To ensure real-time optimal control, we adaptively schedule the optimization window in the future so that the optimized actions can be replenished faster than they are consumed, while staying as up-to-date with new sensor information as possible. The constant re-planning based on a constantly improved model allows the robot to swiftly adapt to the changing environment and utilize real-world data in the most sample-efficient way. Thanks to a fast differentiable physics simulator, the optimization for both system identification and control can be solved efficiently for robots operating in real time. We demonstrate our method on a set of examples in simulation and show that our results are favorable compared to baseline methods.

    Multiscale Superpixel Structured Difference Graph Convolutional Network for VL Representation

    Within the multimodal field, the key to integrating vision and language lies in establishing a good alignment strategy. Recently, benefiting from the success of self-supervised learning, significant progress has been made in multimodal semantic representation based on pre-trained models for vision and language. However, there is still room for improvement in visual semantic representation. The lack of spatial semantic coherence and the vulnerability to noise make it challenging for current pixel- or patch-based methods to accurately extract complex scene boundaries. To this end, this paper develops superpixels as a comprehensive, compact representation of learnable image data, which effectively reduces the number of visual primitives for subsequent processing by clustering perceptually similar pixels. To mine more precise topological relations, we propose a Multiscale Difference Graph Convolutional Network (MDGCN). It parses the entire image as a fine-to-coarse hierarchical structure of constituent visual patterns, and captures multiscale features by progressively merging adjacent superpixels as graph nodes. Moreover, we predict the differences between adjacent nodes through the graph structure, facilitating key information aggregation of graph nodes to reason about actual semantic relations. Afterward, we design a multi-level fusion rule in a bottom-up manner to avoid deviations in understanding by learning complementary spatial information at different regional scales. Our proposed method applies well to learning multiple downstream tasks. Extensive experiments demonstrate that our method is competitive with other state-of-the-art methods in visual reasoning. Our code will be released upon publication.

    Double Dome and Reemergence of Superconductivity in Pristine 6R-TaS2 under Pressure

    Investigating the implications of interlayer coupling on superconductivity is essential for comprehending the intrinsic mechanisms of high-temperature superconductors. Van der Waals heterojunctions have attracted extensive research due to their exotic interlayer coupling. Here, we present a natural heterojunction superconductor, 6R-TaS2, that demonstrates a double dome of superconductivity in addition to the reemergence of superconductivity under high pressure. Our first-principles calculations show that the first dome of superconductivity in 6R-TaS2 can be attributed to changes in interlayer coupling and charge transfer. The second superconducting dome and the reemergence of superconductivity can be ascribed to changes in the density of states (DOS) resulting from Fermi surface reconstruction, in which the DOS of the T-layer and the S p-orbitals plays a crucial role. We report the first observation in TMDs of non-metallic atoms playing a dominant role in the reemergence of superconductivity, and of the influence of two Lifshitz transitions on superconducting properties.

    A Survey on Multimodal Large Language Models

    The Multimodal Large Language Model (MLLM) has recently become a rising research hotspot; it uses powerful Large Language Models (LLMs) as a brain to perform multimodal tasks. The surprising emergent capabilities of MLLMs, such as writing stories based on images and OCR-free math reasoning, are rare in traditional methods, suggesting a potential path to artificial general intelligence. In this paper, we aim to trace and summarize the recent progress of MLLMs. First, we present the formulation of MLLM and delineate its related concepts. Then, we discuss the key techniques and applications, including Multimodal Instruction Tuning (M-IT), Multimodal In-Context Learning (M-ICL), Multimodal Chain of Thought (M-CoT), and LLM-Aided Visual Reasoning (LAVR). Finally, we discuss existing challenges and point out promising research directions. Because the era of MLLMs has only just begun, we will keep updating this survey and hope it can inspire more research. An associated GitHub link collecting the latest papers is available at https://github.com/BradyFU/Awesome-Multimodal-Large-Language-Models.

    Biodiversity and activity of the gut microbiota across the life history of the insect herbivore Spodoptera littoralis

    Microbes that live inside insects play critical roles in host nutrition, physiology, and behavior. Although Lepidoptera (butterflies and moths) are one of the most diverse insect taxa, their microbial symbionts are little studied, particularly during metamorphosis. Here, using ribosomal tag pyrosequencing of DNA and RNA, we investigated the biodiversity and activity of gut microbiotas across the holometabolous life cycle of Spodoptera littoralis, a notorious agricultural pest worldwide. Proteobacteria and Firmicutes dominate but undergo a structural “metamorphosis” in tandem with their host. Enterococcus, Pantoea and Citrobacter were abundant and active in early-instar larvae, while Clostridia increased in late-instar larvae. Interestingly, only enterococci persisted through metamorphosis. Female adults harbored high proportions of Enterococcus, Klebsiella and Pantoea, whereas males largely shifted to Klebsiella. Comparative functional analysis with PICRUSt indicated that the early-instar larval microbiome was more enriched for genes involved in cell motility and carbohydrate metabolism, whereas in late-instar larvae amino acid, cofactor and vitamin metabolism increased. Genes involved in energy and nucleotide metabolism were abundant in pupae. The female adult microbiome was enriched for genes relevant to energy metabolism, while an increase in the replication and repair pathway was observed in males. Understanding the metabolic activity of these herbivore-associated microbial symbionts may assist the development of novel pest-management strategies.

    Lumen Contour Segmentation in IVOCT Based on N-type CNN

    Automatic segmentation of the lumen contour plays an important role in medical imaging and diagnosis; it is the first step towards evaluating the morphology of the vessels under analysis and identifying possible atherosclerotic lesions. Because quantitative information can only be obtained with segmentation, novel methods have appeared that can be successfully applied to intravascular optical coherence tomography (IVOCT) images. This paper proposes a new end-to-end neural network (N-Net) for automatic lumen segmentation in IVOCT images, using a deep neural network based on multi-scale features. The architecture of the N-Net contains a multi-scale input layer, an N-type convolutional network layer and a cross-entropy loss function. The multi-scale input layer in the proposed N-Net is designed to avoid the loss of information caused by pooling in the traditional U-Net and also enriches the detailed information in each layer. The N-type convolutional network is proposed as the framework of the whole deep architecture. Finally, the loss function guarantees the degree of fidelity between the output of the proposed method and the manually labeled output. To enlarge the training set, data augmentation is also introduced. We evaluated our method in terms of loss, accuracy, recall, Dice similarity coefficient, Jaccard similarity coefficient and specificity. The experimental results presented in this paper demonstrate the superior performance of the proposed N-Net architecture, compared with some existing networks, in enhancing the precision of automatic lumen segmentation and increasing the detailed information at the edges of the vascular lumen.
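Two of the evaluation metrics named in this abstract, the Dice and Jaccard similarity coefficients, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function name `dice_and_jaccard` and the toy masks are hypothetical, and masks are assumed to be flattened lists of 0/1 labels.

```python
def dice_and_jaccard(pred, truth):
    """Return (Dice, Jaccard) for two equal-length binary masks of 0/1 values."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    intersection = sum(p * t for p, t in zip(pred, truth))
    pred_sum = sum(pred)
    truth_sum = sum(truth)
    union = pred_sum + truth_sum - intersection
    if union == 0:
        # Both masks are empty: treat them as a perfect match (a convention).
        return 1.0, 1.0
    dice = 2 * intersection / (pred_sum + truth_sum)
    jaccard = intersection / union
    return dice, jaccard


# Hypothetical toy masks (1 = lumen pixel, 0 = background).
predicted = [0, 1, 1, 1, 0, 0]
manual = [0, 0, 1, 1, 1, 0]
dice, jaccard = dice_and_jaccard(predicted, manual)
print(f"Dice = {dice:.3f}, Jaccard = {jaccard:.3f}")
```

The two coefficients are monotonically related by J = D / (2 - D), so they rank segmentations identically; reporting both is mostly a matter of convention.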

    Controllable Multi-Objective Re-ranking with Policy Hypernetworks

    Multi-stage ranking pipelines have become widely used strategies in modern recommender systems, where the final stage aims to return a ranked list of items that balances a number of requirements such as user preference, diversity, and novelty. Linear scalarization is arguably the most widely used technique to merge multiple requirements into one optimization objective, by summing up the requirements with certain preference weights. Existing final-stage ranking methods often adopt a static model where the preference weights are determined during offline training and kept unchanged during online serving. Whenever the preference weights need to be modified, the model has to be re-trained, which is inefficient in both time and resources. Meanwhile, the most appropriate weights may vary greatly for different groups of target users or at different time periods (e.g., during holiday promotions). In this paper, we propose a framework called controllable multi-objective re-ranking (CMR), which incorporates a hypernetwork to generate parameters for a re-ranking model according to different preference weights. In this way, CMR can adapt the preference weights to environmental changes in an online manner, without retraining the models. Moreover, we classify practical business-oriented tasks into four main categories and seamlessly incorporate them in a newly proposed re-ranking model based on an Actor-Evaluator framework, which serves as a reliable real-world testbed for CMR. Offline experiments based on a dataset collected from the Taobao App showed that CMR improved several popular re-ranking models by using them as underlying models. Online A/B tests also demonstrated the effectiveness and trustworthiness of CMR.
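The linear scalarization described in this abstract can be sketched as follows. This is only an illustration of weighted-sum scoring, not the CMR hypernetwork; the item ids, objective names, scores and weights are all made up, and sorting items independently is a simplification of list-wise re-ranking.

```python
def scalarize(scores, weights):
    """Weighted sum of per-objective scores (linear scalarization)."""
    return sum(weights[name] * scores[name] for name in weights)


def rerank(items, weights):
    """Sort items by their scalarized score, highest first."""
    return sorted(items, key=lambda item: scalarize(item["scores"], weights), reverse=True)


# Hypothetical candidate items with made-up per-objective scores.
items = [
    {"id": "a", "scores": {"preference": 0.9, "diversity": 0.1, "novelty": 0.2}},
    {"id": "b", "scores": {"preference": 0.5, "diversity": 0.8, "novelty": 0.6}},
    {"id": "c", "scores": {"preference": 0.7, "diversity": 0.4, "novelty": 0.9}},
]

preference_only = {"preference": 1.0, "diversity": 0.0, "novelty": 0.0}
diversity_heavy = {"preference": 0.2, "diversity": 0.6, "novelty": 0.2}

print([item["id"] for item in rerank(items, preference_only)])  # ['a', 'c', 'b']
print([item["id"] for item in rerank(items, diversity_heavy)])  # ['b', 'c', 'a']
```

Changing the weight vector reorders the list here without touching any model, because the per-objective scores are fixed inputs. In a learned re-ranker the weights are baked into training, which is the limitation the abstract says the hypernetwork addresses.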

    Compound Attention and Neighbor Matching Network for Multi-contrast MRI Super-resolution

    Multi-contrast magnetic resonance imaging (MRI) reflects information about human tissue from different perspectives and has many clinical applications. By utilizing the complementary information among different modalities, multi-contrast super-resolution (SR) of MRI can achieve better results than single-image super-resolution. However, existing methods of multi-contrast MRI SR have the following shortcomings that may limit their performance. First, existing methods either simply concatenate the reference and degraded features or exploit global feature-matching between them, both of which are unsuitable for multi-contrast MRI SR. Second, although many recent methods employ transformers to capture long-range dependencies in the spatial dimension, they neglect that self-attention in the channel dimension is also important for low-level vision tasks. To address these shortcomings, we propose a novel network architecture with compound attention and neighbor matching (CANM-Net) for multi-contrast MRI SR. The compound self-attention mechanism effectively captures the dependencies in both the spatial and channel dimensions, and the neighborhood-based feature-matching modules match degraded features with adjacent reference features and then fuse them to obtain high-quality images. We conduct experiments on SR tasks using the IXI, fastMRI, and real-world scanning datasets. The CANM-Net outperforms state-of-the-art approaches in both retrospective and prospective experiments. Moreover, the robustness study in our work shows that the CANM-Net still achieves good performance when the reference and degraded images are imperfectly registered, indicating good potential in clinical applications. Comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.

    Mutation analysis of the WNT4 gene in Han Chinese women with premature ovarian failure

    BACKGROUND: The WNT4 gene plays an important role in female sex determination and differentiation. It also contributes to the maintenance of the ovaries and the survival of follicles. METHODS: We sequenced the coding region and splice sites of WNT4 in 145 Han Chinese women with premature ovarian failure (POF) and 200 healthy controls. RESULTS: Only one novel variation, in exon 2 (195C > T), was detected among the women with POF. However, this synonymous variation did not result in a change in the amino acid sequence (65 Asp > Asp). No further variants were found in any of the samples. CONCLUSION: Although we found no evidence that WNT4 is a disease-causing gene in POF, this study is the first attempt to investigate the possible role of WNT4 in Han Chinese women with POF.
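The arithmetic behind the reported variant (195C > T, 65 Asp > Asp) can be checked with a short sketch. The helper `codon_of` is hypothetical, and the reference codon GAC is an inference rather than a value read from the WNT4 sequence: in the standard genetic code Asp is encoded only by GAT and GAC, so an Asp codon with C in its third position must be GAC.

```python
# Both codons encode aspartic acid (Asp) in the standard genetic code.
ASP_CODONS = {"GAT", "GAC"}


def codon_of(cds_position):
    """Map a 1-based coding-sequence position to (codon number, position within codon)."""
    index = cds_position - 1
    return index // 3 + 1, index % 3 + 1


codon_number, offset = codon_of(195)
print(codon_number, offset)  # 65 3

# Inferred reference codon (assumption, see the note above).
reference_codon = "GAC"
variant_codon = reference_codon[: offset - 1] + "T" + reference_codon[offset:]
print(variant_codon)  # GAT

# Synonymous: both codons translate to Asp.
print(reference_codon in ASP_CODONS and variant_codon in ASP_CODONS)  # True
```

Position 195 is the third base of codon 65, which agrees with the abstract's statement that the change leaves residue 65 as Asp.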