179 research outputs found

    RevColV2: Exploring Disentangled Representations in Masked Image Modeling

    Masked image modeling (MIM) has become a prevalent pre-training setup for vision foundation models and attains promising performance. Despite its success, existing MIM methods discard the decoder network during downstream applications, which results in inconsistent representations between pre-training and fine-tuning and can hamper downstream task performance. In this paper, we propose a new architecture, RevColV2, which tackles this issue by keeping the entire autoencoder architecture during both pre-training and fine-tuning. The main body of RevColV2 contains bottom-up columns and top-down columns, between which information is reversibly propagated and gradually disentangled. This design gives our architecture a useful property: it maintains disentangled low-level and semantic information at the end of the network in MIM pre-training. Our experimental results suggest that a foundation model with decoupled features can achieve competitive performance across multiple downstream vision tasks such as image classification, semantic segmentation, and object detection. For example, after intermediate fine-tuning on the ImageNet-22K dataset, RevColV2-L attains 88.4% top-1 accuracy on ImageNet-1K classification and 58.6 mIoU on ADE20K semantic segmentation. With an extra teacher and a large-scale dataset, RevColV2-L achieves 62.1 box AP on COCO detection and 60.4 mIoU on ADE20K semantic segmentation. Code and models are released at https://github.com/megvii-research/RevCo

    Camouflaged Object Detection with Feature Grafting and Distractor Aware

    The task of Camouflaged Object Detection (COD) aims to accurately segment camouflaged objects that are integrated into the environment, which is more challenging than ordinary detection because the textures of the target and background are visually indistinguishable. In this paper, we propose a novel Feature Grafting and Distractor Aware network (FDNet) to handle the COD task. Specifically, we use a CNN and a Transformer to encode multi-scale images in parallel. To better exploit the advantages of the two encoders, we design a cross-attention-based Feature Grafting Module to graft features extracted from the Transformer branch into the CNN branch, after which the features are aggregated in the Feature Fusion Module. A Distractor Aware Module is designed to explicitly model the two possible distractors in the COD task to refine the coarse camouflage map. We also propose the largest artificial camouflaged object dataset, named ACOD2K, which contains 2000 images with annotations. We conducted extensive experiments on four widely used benchmark datasets and the ACOD2K dataset. The results show that our method significantly outperforms other state-of-the-art methods. The code and the ACOD2K dataset will be available at https://github.com/syxvision/FDNet. Comment: ICME2023 paper

    Polarization-insensitive silicon nitride arrayed waveguide grating

    Next-generation passive optical networks require integrated, polarization-insensitive wavelength-division multiplexing solutions, for which the recently emerging low-loss silicon nitride nanophotonic platforms hold great potential. A novel polarization-insensitive arrayed waveguide grating (AWG) built with silicon nitride waveguides is presented in this Letter. Polarization insensitivity is obtained when both the channel spacing and the center wavelength of the two orthogonal polarization states (i.e., the TE and TM waveguide modes) are simultaneously aligned. In our design, the channel spacing alignment between the polarization states is obtained by optimizing the geometry of the arrayed waveguides, whereas the central wavelength polarization insensitivity is obtained by splitting the two polarization states and adjusting their angle of incidence at the input star coupler to compensate for the polarization mode dispersion of the AWG. A 100 GHz 1×8 wavelength-division multiplexer with crosstalk levels below −16 dB is demonstrated experimentally.

    Hypergraph Transformer for Skeleton-based Action Recognition

    Skeleton-based action recognition aims to predict human actions given human joint coordinates with skeletal interconnections. To model such off-grid data points and their co-occurrences, Transformer-based formulations would be a natural choice. However, Transformers still lag behind state-of-the-art methods using graph convolutional networks (GCNs). Transformers assume that the input is permutation-invariant and homogeneous (partially alleviated by positional encoding), which ignores an important characteristic of skeleton data, i.e., bone connectivity. Furthermore, each type of body joint has a clear physical meaning in human motion, i.e., motion retains an intrinsic relationship regardless of the joint coordinates, which is not explored in Transformers. In fact, certain re-occurring groups of body joints are often involved in specific actions, such as the subconscious hand movement for keeping balance. Vanilla attention is incapable of describing such underlying relations that are persistent and beyond pair-wise. In this work, we aim to exploit these unique aspects of skeleton data to close the performance gap between Transformers and GCNs. Specifically, we propose a new self-attention (SA) extension, named Hypergraph Self-Attention (HyperSA), to incorporate inherently higher-order relations into the model. The K-hop relative positional embeddings are also employed to take bone connectivity into account. We name the resulting model Hyperformer, and it achieves comparable or better performance w.r.t. accuracy and efficiency than state-of-the-art GCN architectures on NTU RGB+D, NTU RGB+D 120, and Northwestern-UCLA datasets. On the largest NTU RGB+D 120 dataset, the significantly improved performance reached by our Hyperformer demonstrates the underestimated potential of Transformer models in this field
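The K-hop relative positional embeddings mentioned above depend on how many bones separate two joints. The following is a minimal sketch of that one ingredient only, assuming the hop count is the shortest-path distance on the skeleton graph and that pairs farther apart than K share a single overflow bucket. The function names `hop_distances` and `clip_to_k` and the five-joint toy skeleton are hypothetical and are not taken from the Hyperformer code.

```python
from collections import deque


def hop_distances(num_joints, bones):
    """Shortest-path hop count between every pair of joints (BFS per joint).

    `bones` is a list of undirected (joint_a, joint_b) index pairs.
    Unreachable pairs keep the value -1.
    """
    adjacency = [[] for _ in range(num_joints)]
    for a, b in bones:
        adjacency[a].append(b)
        adjacency[b].append(a)

    dist = [[-1] * num_joints for _ in range(num_joints)]
    for start in range(num_joints):
        dist[start][start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adjacency[u]:
                if dist[start][v] < 0:  # not visited yet
                    dist[start][v] = dist[start][u] + 1
                    queue.append(v)
    return dist


def clip_to_k(dist, k):
    """Map hop counts to embedding indices 0..k, with k + 1 as the overflow bucket.

    Assumption: pairs farther than k hops (or unreachable) share index k + 1.
    """
    return [[d if 0 <= d <= k else k + 1 for d in row] for row in dist]


# Hypothetical toy skeleton: a short chain with one branch at joint 1.
toy_bones = [(0, 1), (1, 2), (1, 3), (3, 4)]
toy_dist = hop_distances(5, toy_bones)
toy_index = clip_to_k(toy_dist, k=2)
```

The resulting index matrix could be used to look up one learned embedding per hop bucket. How the paper combines these embeddings with attention scores is not specified in the abstract and is not modeled here.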

    Metal-bonded Atomic Layers of Transition Metal Carbides (MXenes)

    Although two-dimensional transition metal carbides and nitrides (MXenes) have fantastic physical and chemical properties as well as wide applications, it remains challenging to produce stable MXenes due to their rapid structural degradation. Here, unique metal-bonded atomic layers of transition metal carbides with high stabilities are produced via a simple topological reaction between chlorine-terminated MXenes and selected metals, where the metals not only remove Cl terminations but also efficiently bond with adjacent atomic MXene slabs, driven by the symmetry of MAX phases. The films constructed from Al-bonded Ti₃C₂Clₓ atomic layers show high oxidation resistance up to 400 degrees centigrade and a low sheet resistance of 9.3 ohm per square. Coupled with the multi-layer structure, the Al-bonded Ti₃C₂Clₓ film displays a significantly improved EMI shielding capability, with a total shielding effectiveness value of 39 dB at a low thickness of 3.1 microns, outperforming pure Ti₃C₂Clₓ film.

    Calibrating LLM-Based Evaluator

    Recent advancements in large language models (LLMs) on language modeling and emergent capabilities make them a promising reference-free evaluator of natural language generation quality, and a competent alternative to human evaluation. However, because these models are often closed-source or computationally demanding to host and tune, there is little established practice for further calibrating an off-the-shelf LLM-based evaluator towards better human alignment. In this work, we propose AutoCalibrate, a multi-stage, gradient-free approach to automatically calibrate and align an LLM-based evaluator toward human preference. Instead of explicitly modeling human preferences, we first implicitly encompass them within a set of human labels. Then, an initial set of scoring criteria is drafted by the language model itself, leveraging in-context learning on different few-shot examples. To further calibrate this set of criteria, we select the best performers and re-draft them with self-refinement. Our experiments on multiple text quality evaluation datasets illustrate a significant improvement in correlation with expert evaluation through calibration. Our comprehensive qualitative analysis conveys insightful intuitions and observations on the essence of effective scoring criteria. Comment: 22 pages, 11 figures

    Oil price uncertainty and stock price informativeness: Evidence from investment-price sensitivity in China

    We study the effects of oil price uncertainty (OPU) on stock price informativeness based on investment-price sensitivity. Using Chinese stocks from 2008 to 2021, we find a negative relationship between OPU and the strength of Tobin's q (a standardized measure of prices) for predicting investment opportunities. This finding is likely due to the crowding out of informed investors rather than the financial constraints brought by a higher cost of capital. Investment-price sensitivity also decreases more among firms with less competition, higher sales volatility, and lower analyst attention. Moreover, the reduction in investment-price sensitivity is more concentrated in public utilities, agriculture & livestock, and industry than in the real estate or commerce industries. These findings indicate that OPU decreases the acquisition of firm-related information and, consequently, price informativeness for future investment decisions.

    Trends and predictions in the physical shape of Chinese preschool children from 2000 to 2020

    Objective: To explore physical shape changes in preschool children from 2000 to 2020, and forecast development trends over the next 10 years. Method: The grey GM(1,1) prediction model was used to fit the physical shape indicators of preschool children in China from 2000 to 2020, and then the longitudinal change trend of physical shape was compared and analyzed. Finally, the development trend of physical shape in China in 2025 and 2030 was predicted. Results: (1) During the period from 2000 to 2020, the height, weight and chest circumference of Chinese preschool children all increased rapidly. Specifically, the weight of male and female children increased by 1.8 kg and 1.6 kg, their chest circumference increased by 1.6 cm and 1.5 cm, respectively, and the height of both increased by 3.6 cm. Among these indicators, the older the age, the greater the growth rate. It is expected that all the indicators will continue to grow rapidly over the next 10 years, but the growth rate will slow. (2) From 2000 to 2020, the growth rate of weight was higher than that of height, and BMI showed an increasing trend. The obesity detection rates in boys and girls increased by 5.6 and 2.8%, respectively. Over the next 10 years, the incidence of obesity is expected to increase by 3.8% in boys and 2.8% in girls. (3) Improvement in the growth and development of preschool children in China is correlated to some degree with the rapid growth of China’s economy, less physical activity, education and other factors. Conclusion: Over the past 20 years, the growth and nutritional status of Chinese preschoolers have improved dramatically, but overweight and obesity remain. Overweight and obesity rates are expected to continue to increase rapidly over the next 10 years, particularly among boys, and effective measures should be taken to control the obesity epidemic.
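For readers unfamiliar with the grey GM(1,1) model named in the Method, the following is a minimal sketch of its standard textbook form: accumulate the series, fit the development coefficient and grey input by least squares on the background values, then difference the fitted accumulated series to forecast. The function names `gm11_fit` and `gm11_predict` and the sample numbers are illustrative assumptions; they are not the study's code or data.

```python
import numpy as np


def gm11_fit(x0):
    """Estimate the GM(1,1) parameters (a, b) from a positive series x0.

    a is the development coefficient and b the grey input in
    x0(k) + a * z1(k) = b, where z1 is the mean of consecutive
    values of the accumulated series x1.
    """
    x0 = np.asarray(x0, dtype=float)
    x1 = np.cumsum(x0)                       # accumulated generating series
    z1 = 0.5 * (x1[1:] + x1[:-1])            # background values
    B = np.column_stack((-z1, np.ones_like(z1)))
    a, b = np.linalg.lstsq(B, x0[1:], rcond=None)[0]
    return a, b


def gm11_predict(x0, steps):
    """Return fitted values for x0 followed by `steps` forecasts.

    Assumes the fitted a is nonzero (a constant series would give a = 0).
    """
    x0 = np.asarray(x0, dtype=float)
    a, b = gm11_fit(x0)
    k = np.arange(len(x0) + steps)
    # Time response of the accumulated series, then inverse accumulation.
    x1_hat = (x0[0] - b / a) * np.exp(-a * k) + b / a
    x0_hat = np.empty_like(x1_hat)
    x0_hat[0] = x0[0]
    x0_hat[1:] = np.diff(x1_hat)
    return x0_hat


# Made-up, equally spaced measurements (NOT values from the study).
sample = [100.0, 105.0, 110.25, 115.7625, 121.550625]
forecast = gm11_predict(sample, steps=2)
```

The last two entries of `forecast` extend the series two intervals ahead. GM(1,1) suits short, roughly exponential series, which is presumably why it is applied to a handful of survey years, though the abstract does not state the reasoning.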

    Characterization of the Metabolic Fate of Datura metel Seed Extract and Its Main Constituents in Rats

    Datura metel L. has been frequently used in Chinese traditional medicine. However, little is known about the chemical composition and in vivo metabolism of its seeds. In this study, using the strategy “chemical analysis, metabolism of single representative compounds, and metabolism of extract at clinical dosage” that we propose here, 42 constituents were characterized from D. metel seed water extract. Furthermore, the metabolic pathways of 13 representative bioactive compounds of D. metel seeds were studied in rats after the oral administration of D. metel seed water extract at a clinical dosage (0.15 g/kg). These included three withanolides, two withanolide glucosides, four amides, one indole, one triterpenoid, one steroid, and one sesquiterpenoid, and with regard to phase I metabolism, hydroxylation, (de)methylation, and dehydrogenation reactions were dominant. Furthermore, the metabolism of D. metel seed water extract provided to rats at a clinical dosage was investigated by liquid chromatography-tandem mass spectrometry based on the above metabolic pathways. Sixty-one compounds were detected in plasma, 83 in urine, and 76 in fecal samples. Among them, withanolides exhibited higher plasma exposure than the other types. To our knowledge, this is the first systematic study on the chemical profiling and metabolite identification of D. metel seeds, including all compounds instead of single constituents.

    Aggregation Pheromone for an Invasive Mussel Consists of a Precise Combination of Three Common Purines.

    Most marine benthic invertebrates have a pelagic larval phase, after which they settle preferentially on or near conspecific adults, forming aggregations. Although settlement pheromones from conspecific adults have been implicated as critical drivers of aggregation for more than 30 years, surprisingly few have been unambiguously identified. Here we show that in the invasive dreissenid mussel Mytilopsis sallei (an ecological and economic pest), three common purines (adenosine, inosine, and hypoxanthine) released from adults in a synergistic and precise ratio (1:1.125:3.25) serve as an aggregation pheromone by inducing conspecific larval settlement and metamorphosis. Our results demonstrate that simple common metabolites can function as species-specific pheromones when present in precise combinations. This study provides important insights into our understanding of the ecology and communication processes of invasive organisms and indicates that the combination and ratio of purines might be critical for purine-based signaling systems that are fundamental and widespread in nature