68 research outputs found
Open-Set Image Tagging with Multi-Grained Text Supervision
In this paper, we introduce the Recognize Anything Plus Model (RAM++), an
open-set image tagging model effectively leveraging multi-grained text
supervision. Previous approaches (e.g., CLIP) primarily utilize global text
supervision paired with images, leading to sub-optimal performance in
recognizing multiple individual semantic tags. In contrast, RAM++ seamlessly
integrates individual tag supervision with global text supervision, all within
a unified alignment framework. This integration not only ensures efficient
recognition of predefined tag categories, but also enhances generalization
capabilities for diverse open-set categories. Furthermore, RAM++ employs large
language models (LLMs) to convert semantically constrained tag supervision into
more expansive tag description supervision, thereby enriching the scope of
open-set visual description concepts. Comprehensive evaluations on various
image recognition benchmarks demonstrate that RAM++ outperforms existing
state-of-the-art (SOTA) open-set image tagging models in most respects.
Specifically, for predefined commonly used tag categories, RAM++ showcases 10.2
mAP and 15.4 mAP enhancements over CLIP on OpenImages and ImageNet. For
open-set categories beyond the predefined ones, RAM++ records improvements of
5.0 mAP and 6.4 mAP over CLIP and RAM, respectively, on OpenImages. For diverse
human-object interaction phrases, RAM++ achieves 7.8 mAP and 4.7 mAP
improvements on the HICO benchmark. Code, datasets and pre-trained models are
available at https://github.com/xinyu1205/recognize-anything.
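The gains above are reported in mAP (mean average precision). As a rough illustration of how such a score can be computed for multi-label tagging, here is a minimal sketch in plain Python. It assumes the non-interpolated definition of average precision and averages over tag categories; individual benchmarks may use a different interpolation or evaluation protocol, and the function names are illustrative, not taken from the RAM++ code base.

```python
from typing import List, Sequence


def average_precision(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Non-interpolated AP for one tag.

    Images are ranked by descending score; AP is the mean of precision@k
    taken at every rank k where a positive image appears.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precisions = []
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precisions.append(hits / rank)
    # A tag with no positive images contributes 0 here (an assumption of this sketch).
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(score_matrix: List[List[float]],
                           label_matrix: List[List[int]]) -> float:
    """mAP in percent: AP averaged over tag columns.

    score_matrix[i][c] is the model score of tag c for image i;
    label_matrix[i][c] is 1 if tag c is present in image i, else 0.
    """
    num_tags = len(score_matrix[0])
    aps = []
    for c in range(num_tags):
        column_scores = [row[c] for row in score_matrix]
        column_labels = [row[c] for row in label_matrix]
        aps.append(average_precision(column_scores, column_labels))
    return 100.0 * sum(aps) / len(aps)
```

Under this convention, a statement such as "10.2 mAP over CLIP" reads as a difference of 10.2 percentage points between the two models' mAP scores.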
Tag2Text: Guiding Vision-Language Model via Image Tagging
This paper presents Tag2Text, a vision language pre-training (VLP) framework,
which introduces image tagging into vision-language models to guide the
learning of visual-linguistic features. In contrast to prior works, which
utilize object tags either manually labeled or automatically detected with a
limited detector, our approach uses tags parsed from the text paired with each
image to learn an image tagger while also providing guidance to vision-language
models. Tag2Text can therefore utilize large-scale annotation-free image tags
in accordance with image-text pairs, and provides more diverse tag categories
beyond objects. As a result, Tag2Text achieves a superior image tag recognition
ability by exploiting fine-grained text information. Moreover, by leveraging
tagging guidance, Tag2Text effectively enhances the performance of
vision-language models on both generation-based and alignment-based tasks.
Across a wide range of downstream benchmarks, Tag2Text achieves
state-of-the-art or competitive results with similar model sizes and data
scales, demonstrating the efficacy of the proposed tagging guidance.
ELUCID IV: Galaxy Quenching and its Relation to Halo Mass, Environment, and Assembly Bias
We examine the quenched fraction of central and satellite galaxies as a
function of galaxy stellar mass, halo mass, and the matter density of their
large scale environment. Matter densities are inferred from our ELUCID
simulation, a constrained simulation of the local Universe sampled by SDSS, while
halo masses and central/satellite classification are taken from the galaxy
group catalog of Yang et al. The quenched fraction for the total population
increases systematically with the three quantities. We find that the
'environmental quenching efficiency', which quantifies the quenched fraction as
a function of halo mass, is independent of stellar mass. This independence is
the origin of the stellar mass-independence of density-based quenching
efficiency, found in previous studies. Considering centrals and satellites
separately, we find that the two populations follow similar correlations of
quenching efficiency with halo mass and stellar mass, suggesting that they have
experienced similar quenching processes in their host halo. We demonstrate that
satellite quenching alone cannot account for the environmental quenching
efficiency of the total galaxy population and the difference between the two
populations found previously mainly arises from the fact that centrals and
satellites of the same stellar mass reside, on average, in halos of different
mass. After removing these halo-mass and stellar-mass effects, there remains a
weak, but significant, residual dependence on environmental density, which is
eliminated when halo assembly bias is taken into account. Our results therefore
indicate that halo mass is the prime environmental parameter that regulates the
quenching of both centrals and satellites.
Comment: 21 pages, 16 figures, submitted to Ap
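The abstract does not spell out how the quenching efficiency is computed. A minimal sketch, assuming the definition commonly used in the quenching literature: the excess quenched fraction in a given environment bin relative to a reference bin, normalised by the star-forming fraction in the reference bin, at fixed stellar mass. The function names and the choice of reference bin are illustrative assumptions and may differ from what the paper uses.

```python
from typing import Iterable


def quenched_fraction(is_quenched: Iterable[bool]) -> float:
    """Fraction of galaxies in a bin that are flagged as quenched."""
    flags = list(is_quenched)
    if not flags:
        raise ValueError("bin contains no galaxies")
    return sum(flags) / len(flags)


def quenching_efficiency(f_q_env: float, f_q_ref: float) -> float:
    """Assumed definition: (f_q_env - f_q_ref) / (1 - f_q_ref).

    f_q_env: quenched fraction in the bin of interest (e.g. a halo-mass bin).
    f_q_ref: quenched fraction in the reference bin at the same stellar mass
             (e.g. the lowest halo-mass or lowest-density bin).
    """
    if not 0.0 <= f_q_ref < 1.0:
        raise ValueError("reference quenched fraction must lie in [0, 1)")
    return (f_q_env - f_q_ref) / (1.0 - f_q_ref)
```

With this normalisation, the efficiency can be read as the fraction of galaxies that would be star-forming in the reference bin but are quenched in the bin of interest, which is why it can be compared across stellar masses.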
Modeling seismic wave propagation in the Loess Plateau using a viscoacoustic wave equation with explicitly expressed quality factor
The thick Quaternary loess on the Loess Plateau of China produces strong seismic attenuation, resulting in weak reflections from subsurface exploration targets. Accurately simulating the seismic wavefield in the Loess Plateau is important for guiding subsequent data processing and interpretation. We present a 2D/3D wavefield simulation method for the Loess Plateau using a viscoacoustic wave equation with an explicitly expressed quality factor. To take into account the effect of the irregular surface, we use a vertically deformed grid to represent the topography, and solve the viscoacoustic wave equation in a regular computational domain that conforms to the topographic surface. Grid deformation introduces partial derivatives such as ∂vx/∂z and ∂vy/∂z into the wave equation, which are difficult to compute accurately using the traditional staggered-grid finite-difference method. To mitigate this issue, a finite-difference scheme based on a fully staggered grid is adopted to solve the viscoacoustic wave equation. Numerical experiments for a simple layered model and 2D/3D realistic Loess Plateau models demonstrate the feasibility and adaptability of the proposed method. The 3D modeling results show amplitude and waveform characteristics comparable to the field data acquired from the Chinese Loess Plateau, suggesting good performance of the proposed modeling method.
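To see why a vertically deformed grid brings in the extra derivatives mentioned above, the chain rule can be written out for a 2D case. This is a sketch under assumed notation: $(\xi, \zeta)$ are the regular computational coordinates, only the vertical coordinate is deformed, and the specific mapping used in the paper is not given in the abstract.

```latex
% Assumed mapping: horizontal coordinate unchanged, vertical coordinate deformed
\begin{aligned}
\xi &= x, \qquad \zeta = \zeta(x, z), \\
\frac{\partial v_x}{\partial x}
  &= \frac{\partial v_x}{\partial \xi}
   + \frac{\partial \zeta}{\partial x}\,\frac{\partial v_x}{\partial \zeta}, \\
\frac{\partial v_z}{\partial z}
  &= \frac{\partial \zeta}{\partial z}\,\frac{\partial v_z}{\partial \zeta}.
\end{aligned}
```

The term $\partial v_x / \partial \zeta$ is a vertical derivative of the horizontal particle velocity. On a standard staggered grid, $v_x$ is stored at points offset horizontally from the pressure nodes, so a centred vertical difference of $v_x$ does not land on a pressure node without interpolation. A fully staggered grid stores the fields at additional offset positions, which makes such mixed derivatives available where they are needed.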
Recognize Anything: A Strong Image Tagging Model
We present the Recognize Anything Model (RAM): a strong foundation model for
image tagging. RAM can recognize any common category with high accuracy. RAM
introduces a new paradigm for image tagging, leveraging large-scale image-text
pairs for training instead of manual annotations. The development of RAM
comprises four key steps. Firstly, annotation-free image tags are obtained at
scale through automatic text semantic parsing. Subsequently, a preliminary
model is trained for automatic annotation by unifying the caption and tagging
tasks, supervised by the original texts and parsed tags, respectively. Thirdly,
a data engine is employed to generate additional annotations and clean
incorrect ones. Lastly, the model is retrained with the processed data and
fine-tuned using a smaller but higher-quality dataset. We evaluate the tagging
capabilities of RAM on numerous benchmarks and observe impressive zero-shot
performance, significantly outperforming CLIP and BLIP. Remarkably, RAM even
surpasses fully supervised approaches and is competitive with the Google API.
We are releasing RAM at https://recognize-anything.github.io/ to foster the
advancement of large models in computer vision.
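The first of the four steps, obtaining annotation-free tags from captions, can be illustrated with a toy example. The sketch below uses plain vocabulary matching as a stand-in for the semantic parsing described in the abstract; the vocabulary, function names, and sample caption are all hypothetical and are not taken from the RAM code base.

```python
import re
from typing import Iterable, List, Set, Tuple

# Hypothetical tag vocabulary; the real system's categories are not listed here.
TAG_VOCABULARY: Set[str] = {"dog", "frisbee", "grass", "park", "running"}


def parse_tags(caption: str, vocabulary: Set[str] = TAG_VOCABULARY) -> List[str]:
    """Return the vocabulary tags that occur as words in the caption.

    Simple word matching only: no lemmatisation, synonyms, or phrase handling,
    all of which a real semantic parser would need.
    """
    tokens = re.findall(r"[a-z]+", caption.lower())
    return sorted({token for token in tokens if token in vocabulary})


def build_training_triplets(
    pairs: Iterable[Tuple[str, str]]
) -> List[Tuple[str, str, List[str]]]:
    """Turn (image_id, caption) pairs into (image_id, caption, tags) triplets.

    The caption can supervise a captioning head and the parsed tags a tagging head.
    """
    return [(image_id, caption, parse_tags(caption)) for image_id, caption in pairs]
```

For instance, `parse_tags("A dog is running after a frisbee on the grass.")` yields `['dog', 'frisbee', 'grass', 'running']`, with no manual labeling involved.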
Old Age Protection in the Context of Rural Development
This study examines the potential of rural communities to generate and allocate resources for rural old age support in the context of decreasing family resources and inadequate state provision. In-depth interviews with elderly people, their families, community leaders and government officials in three villages, each located in a different province, provide clear evidence on existing local institutional arrangements for rural old age support and on the role of both government and community in organising such programmes. They confirm the potential of rural communities to generate and distribute resources for old age support, offering communities opportunities for social inclusion through fair flows of resources, promoting social harmony and stability, and accelerating economic growth. The findings imply that policymakers need to link state efforts for old age protection to rural community development and to encourage grassroots efforts in old age support.
Precision measurements of $A_1^n$ in the deep inelastic regime
We have performed precision measurements of the double-spin virtual-photon asymmetry $A_1$ on the neutron in the deep inelastic scattering regime, using an open-geometry, large-acceptance spectrometer and a longitudinally and transversely polarized $^3$He target. Our data cover a wide kinematic range $0.277 \le x \le 0.548$ at an average $Q^2$ value of $3.078~(\mathrm{GeV}/c)^2$, doubling the available high-precision neutron data in this $x$ range. We have combined our results with world data on proton targets to make a leading-order extraction of the ratio of polarized-to-unpolarized parton distribution functions for up quarks and for down quarks in the same kinematic range. Our data are consistent with a previous observation of an $A_1^n$ zero crossing near $x = 0.5$. We find no evidence of a transition to a positive slope in $(\Delta d + \Delta\bar{d})/(d + \bar{d})$ up to $x = 0.548$.
Growth mechanisms for spherical mixed hydroxide agglomerates prepared by co-precipitation method: a case of Ni1/3Co1/3Mn1/3(OH)2
Spherical Ni1/3Co1/3Mn1/3(OH)2 agglomerates were synthesized by the co-precipitation method in the presence of ammonia. The results show that the growth of spherical agglomerates proceeds in three stages: (1) nucleation and anisotropic growth of single crystals; (2) agglomeration of polycrystalline crystallites, themselves formed from single-crystal grains, as primary particles to form embryonic agglomerates; and (3) formation, growth and consolidation of spherical agglomerates or particles through agglomeration of embryonic agglomerates, continued growth of individual crystals in the agglomerates, and further attachment of primary particles. The first two stages are very fast, while the last stage takes almost the entire process to complete. The main reason for the anisotropic growth of the Ni1/3Co1/3Mn1/3(OH)2 crystal is that the crystal surface energies E(001), E(100), E(101) and E(102) differ, with E(001) being the highest. The morphology of the final spherical agglomerates is explained by partial re-crystallization of contacting primary particles. The growth process of the spherical agglomerates was examined by X-ray diffraction, scanning electron microscopy, transmission electron microscopy, and calculation of crystal surface energies using density functional theory.
Comparison of alternative remediation technologies for recycled gravel contaminated with heavy metals
To evaluate the effects of different remediation methods on recycled gravel contaminated with heavy metals, three immobilization agents (monopotassium phosphate, lime, nano-iron) and two mobilization agents (glyphosate, humic acid (HA)) were studied and compared. Results indicated that nano-iron powder was more effective at immobilizing Zn, Cu, Pb and Cd. Meanwhile, glyphosate showed a higher mobilization effect than HA, with removal rates of about 66.7% for Cd, more than 80% for Cr, Cu and Zn, and the highest removal percentage of 85.9% for Cr. After mobilization by glyphosate, the leaching rates of Zn, Cu and Cr were about 0.8%, and below 0.2% for Pb and Cd. The leaching rates after nano-iron powder treatment were 1.18% for Zn, 0.96% for Cr, 0.61% for Cu and 0.45% for Pb, while Cd was not detected. The formation and disappearance of metal (Zn/Cu/Cr/Pb/Cd) compounds were confirmed through X-ray diffraction and scanning electron microscopy analyses of crystalline phases and morphological surface structures.