10 research outputs found

    Open-World Weakly-Supervised Object Localization

    While remarkable success has been achieved in weakly-supervised object localization (WSOL), current frameworks are not capable of locating objects of novel categories in open-world settings. To address this issue, we are the first to introduce a new weakly-supervised object localization task called OWSOL (Open-World Weakly-Supervised Object Localization). During training, all labeled data comes from known categories, while both known and novel categories exist in the unlabeled data. To handle such data, we propose a novel paradigm of contrastive representation co-learning using both labeled and unlabeled data to generate a complete G-CAM (Generalized Class Activation Map) for object localization, without the requirement of bounding box annotation. As no class label is available for the unlabeled data, we conduct clustering over the full training set and design a novel multiple semantic centroids-driven contrastive loss for representation learning. We re-organize two widely used datasets, i.e., ImageNet-1K and iNatLoc500, and propose OpenImages150 to serve as evaluation benchmarks for OWSOL. Extensive experiments demonstrate that the proposed method can surpass all baselines by a large margin. We believe that this work can shift closed-set localization towards the open-world setting and serve as a foundation for subsequent works. Code will be released at https://github.com/ryylcc/OWSOL

    Recognize Anything: A Strong Image Tagging Model

    We present the Recognize Anything Model (RAM): a strong foundation model for image tagging. RAM can recognize any common category with high accuracy. RAM introduces a new paradigm for image tagging, leveraging large-scale image-text pairs for training instead of manual annotations. The development of RAM comprises four key steps. Firstly, annotation-free image tags are obtained at scale through automatic text semantic parsing. Subsequently, a preliminary model is trained for automatic annotation by unifying the caption and tagging tasks, supervised by the original texts and parsed tags, respectively. Thirdly, a data engine is employed to generate additional annotations and clean incorrect ones. Lastly, the model is retrained with the processed data and fine-tuned using a smaller but higher-quality dataset. We evaluate the tagging capabilities of RAM on numerous benchmarks and observe impressive zero-shot performance, significantly outperforming CLIP and BLIP. Remarkably, RAM even surpasses fully supervised methods and exhibits competitive performance with the Google API. We are releasing RAM at https://recognize-anything.github.io/ to foster the advancement of large models in computer vision.

    Congenital cataract: prevalence and surgery age at Zhongshan Ophthalmic Center (ZOC).

    Congenital cataract (CC) is the primary cause of treatable childhood blindness. Population-based assessments of prevalence and surgery age of CC, which are critical for improving management strategies, have been unavailable in China until now. We conducted a hospital-based, cross-sectional study of the hospital charts of CC patients younger than 18 years old from January 2005 to December 2010 at Zhongshan Ophthalmic Center (ZOC) in Guangzhou, China. Residence, gender, age at surgery, hospitalization time, and the presence of other ocular abnormalities were extracted and statistically analyzed in different subgroups. The search identified 1314 patients diagnosed with CC from a total of 136154 hospitalizations, which accounted for 2.39% of all the cataract in-patients and 1.06% of the total in-patients over the six-year study period. Of the identified CC patients, 9.2% had ≥ 2 hospitalizations due to the necessity of additional surgeries, and the overall ratio of boys to girls was 1.75 ∶ 1. Based on a subgroup analysis according to age, patients 2-6 years old constituted the highest proportion (29.22%) of all hospitalized CC patients, and those 13-18 years old constituted the lowest proportion (13.47%) of the total number. The average age at surgery was 27.62 ± 23.36 months, but CC patients ≤ 6 years old (especially ≤ 6 months old) became increasingly prevalent throughout the 6-year study period. A total of 276 cases (20.93%) of CC were associated with one or more other ocular abnormalities; the highest incidence rates were observed for exotropia (6.24%), nystagmus (6.16%), and refractive error (3.65%). In conclusion, CC patients accounted for 2.39% of all cataract in-patients in a review of 6 years of hospitalization charts from ZOC. The age at the time of surgery decreased over the 6-year study period, which probably reflects the continuing improvement of public awareness of children's eye care in China.