
    Environmental Justice in Greater Los Angeles: Impacts of Spatial and Ethnic Factors on Residents' Socioeconomic and Health Status

    Environmental justice holds that all people should be protected from disproportionate impacts of environmental hazards. Despite this aspiration, social and environmental inequalities persist throughout greater Los Angeles. Previous research has identified and mapped pollutant levels, demographic information, and the population's socioeconomic status and health issues; nevertheless, the complex interrelationships between these factors remain unclear. To close this knowledge gap, we first measured spatial centrality using sDNA software. These data were then integrated with socioeconomic and health data from CalEnviroScreen, with the census tract as the unit of analysis. Finally, structural equation modeling (SEM) was used to explore direct, indirect, and total effects among variables. The results show that the White population tends to reside in more segregated areas and lives closer to green space, contributing to higher housing stability, greater financial security, and higher educational attainment. In contrast, people of color, especially Latinx residents, experience the opposite of these environmental benefits. Spatial centrality exhibits a significant indirect effect on environmental justice by influencing ethnic composition and pollution levels. Moreover, green space accessibility significantly influences environmental justice via pollution. These findings can assist decision-makers in creating a more inclusive society and curtailing social segregation for all individuals.
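    A minimal sketch of the analysis pipeline described above, assuming the sDNA centrality scores and CalEnviroScreen indicators have already been merged into one census-tract table. The file name, column names, and structural paths below are illustrative assumptions, not the authors' actual specification; the Python SEM library semopy stands in for whatever software the study used.

```python
import pandas as pd
from semopy import Model

# Hypothetical census-tract table merging sDNA centrality scores with
# CalEnviroScreen socioeconomic, pollution, and health indicators.
tracts = pd.read_csv("tracts_with_centrality_and_calenviroscreen.csv")

# Illustrative structural model: centrality -> ethnic composition and pollution,
# which in turn affect socioeconomic outcomes (paths are assumptions, not the paper's).
model_desc = """
pollution ~ centrality + green_access
pct_latinx ~ centrality
housing_burden ~ pollution + pct_latinx
poverty ~ pollution + pct_latinx
"""

sem = Model(model_desc)
sem.fit(tracts)
print(sem.inspect())  # direct-effect estimates; indirect/total effects follow from path products
```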

    Turning a CLIP Model into a Scene Text Detector

    The recent large-scale Contrastive Language-Image Pretraining (CLIP) model has shown great potential in various downstream tasks by leveraging pretrained vision and language knowledge. Scene text, which contains rich textual and visual information, has an inherent connection with a model like CLIP. Recently, pretraining approaches based on vision-language models have made effective progress in the field of text detection. In contrast to these works, this paper proposes a new method, termed TCM, focusing on Turning the CLIP Model directly for text detection without a pretraining process. We demonstrate the advantages of the proposed TCM as follows: (1) The underlying principle of our framework can be applied to improve existing scene text detectors. (2) It facilitates the few-shot training capability of existing methods; e.g., using 10% of the labeled data, we significantly improve the performance of the baseline method by an average of 22% in terms of F-measure on 4 benchmarks. (3) By turning the CLIP model into existing scene text detection methods, we further achieve promising domain adaptation ability. The code will be publicly released at https://github.com/wenwenyu/TCM. Comment: CVPR 2023
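    As a rough illustration of the general idea (not TCM's actual architecture), the sketch below shows how a pretrained CLIP vision encoder can serve as the backbone of a simple scene text detection head. The head design, patch-level scoring, and checkpoint name are assumptions for illustration only.

```python
import torch
import torch.nn as nn
from transformers import CLIPVisionModel

class CLIPTextDetector(nn.Module):
    """Illustrative detector: frozen-style CLIP vision backbone + tiny patch-level head."""
    def __init__(self, clip_name="openai/clip-vit-base-patch32"):
        super().__init__()
        self.backbone = CLIPVisionModel.from_pretrained(clip_name)
        hidden = self.backbone.config.hidden_size
        # Hypothetical head: predicts a text/no-text logit for every image patch.
        self.head = nn.Sequential(nn.Linear(hidden, 256), nn.ReLU(), nn.Linear(256, 1))

    def forward(self, pixel_values):
        feats = self.backbone(pixel_values=pixel_values).last_hidden_state  # (B, 1+P, H)
        patch_feats = feats[:, 1:, :]      # drop the [CLS] token, keep patch tokens
        return self.head(patch_feats)      # (B, P, 1) text-region logits

detector = CLIPTextDetector()
logits = detector(torch.randn(1, 3, 224, 224))  # dummy 224x224 image batch
```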

    To Explain or Not to Explain: A Study on the Necessity of Explanations for Autonomous Vehicles

    Explainable AI, in the context of autonomous systems such as self-driving cars, has drawn broad interest from researchers. Recent studies have found that providing explanations for an autonomous vehicle's actions has many benefits (e.g., increased trust and acceptance), but they put little emphasis on when an explanation is needed and how the content of the explanation changes with context. In this work, we investigate in which scenarios people need explanations and how the critical degree of explanation shifts with situations and driver types. Through a user experiment, we ask participants to evaluate how necessary an explanation is and measure the impact on their trust in self-driving cars in different contexts. We also present a self-driving explanation dataset with first-person explanations and associated measures of necessity for 1103 video clips, augmenting the Berkeley Deep Drive Attention dataset. Additionally, we propose a learning-based model that predicts how necessary an explanation is for a given situation in real time, using camera data as input. Our research reveals that driver type and context dictate whether an explanation is necessary and what is helpful for improved interaction and understanding. Comment: 9.5 pages, 7 figures, submitted to UIST202
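    The sketch below is a minimal, assumption-laden stand-in for the kind of learning-based necessity predictor described: a pretrained CNN backbone with a regression head that scores, per camera frame, how necessary an explanation is. The backbone choice, single-frame input, and sigmoid scaling are illustrative, not the paper's actual model or training setup.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18, ResNet18_Weights

class NecessityPredictor(nn.Module):
    """Illustrative frame-level regressor for explanation necessity (assumed design)."""
    def __init__(self):
        super().__init__()
        backbone = resnet18(weights=ResNet18_Weights.DEFAULT)
        backbone.fc = nn.Identity()           # reuse 512-d ImageNet features
        self.backbone = backbone
        self.head = nn.Linear(512, 1)         # scalar necessity score

    def forward(self, frames):                # frames: (B, 3, 224, 224) camera input
        return torch.sigmoid(self.head(self.backbone(frames)))

model = NecessityPredictor()
score = model(torch.randn(1, 3, 224, 224))    # higher -> explanation more necessary
```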

    Looking and Listening: Audio Guided Text Recognition

    Text recognition in the wild is a long-standing problem in computer vision. Driven by end-to-end deep learning, recent studies suggest that vision and language processing are effective for scene text recognition. Yet, handling edit errors such as additions, deletions, or replacements remains the main challenge for existing approaches. In fact, the content of the text and its audio naturally correspond to each other, i.e., a single character error may result in a clearly different pronunciation. In this paper, we propose AudioOCR, a simple yet effective probabilistic audio decoder for mel-spectrogram sequence prediction that guides scene text recognition; it participates only in the training phase and brings no extra cost during the inference stage. The underlying principle of AudioOCR can be easily applied to existing approaches. Experiments using 7 previous scene text recognition methods on 12 existing regular, irregular, and occluded benchmarks demonstrate that our proposed method brings consistent improvements. More importantly, our experiments show that AudioOCR generalizes to more challenging scenarios, including recognizing non-English text, out-of-vocabulary words, and text with various accents. Code will be available at https://github.com/wenwenyu/AudioOCR.
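    The following sketch illustrates the training-only auxiliary-supervision idea in general terms: a small decoder predicts mel-spectrogram frames from the recognizer's sequence features, contributes an extra loss during training, and is discarded at inference. The shapes, GRU decoder, L1 loss, and loss weight are assumptions, not AudioOCR's exact design.

```python
import torch
import torch.nn as nn

class AudioAuxiliaryHead(nn.Module):
    """Training-only decoder that predicts mel-spectrogram frames from visual features."""
    def __init__(self, feat_dim=256, n_mels=80):
        super().__init__()
        self.decoder = nn.GRU(feat_dim, 256, batch_first=True)
        self.proj = nn.Linear(256, n_mels)

    def forward(self, visual_feats):               # (B, T, feat_dim) recognizer features
        out, _ = self.decoder(visual_feats)
        return self.proj(out)                      # (B, T, n_mels) predicted mel frames

# During training, predicted mel frames are compared against the audio of the
# ground-truth text; at inference the head is simply dropped, adding no cost.
head = AudioAuxiliaryHead()
visual_feats = torch.randn(2, 25, 256)             # stand-in recognizer features
mel_targets = torch.randn(2, 25, 80)               # stand-in mel-spectrogram targets
audio_loss = nn.functional.l1_loss(head(visual_feats), mel_targets)
recognition_loss = torch.tensor(0.0)               # placeholder for the base recognizer's loss
total_loss = recognition_loss + 0.1 * audio_loss   # weighted auxiliary term (weight assumed)
```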

    ID Embedding as Subtle Features of Content and Structure for Multimodal Recommendation

    Multimodal recommendation aims to model user and item representations comprehensively with the involvement of multimedia content for effective recommendations. Existing research has shown that combining (user- and item-) ID embeddings with multimodal salient features benefits recommendation performance, indicating the value of IDs. However, the literature lacks a thorough analysis of ID embeddings in terms of feature semantics. In this paper, we revisit the value of ID embeddings for multimodal recommendation and conduct a thorough study of their semantics, which we recognize as subtle features of content and structure. We then propose a novel recommendation model that incorporates ID embeddings to enhance the semantic features of both content and structure. Specifically, we put forward a hierarchical attention mechanism to incorporate ID embeddings in modality fusion, coupled with contrastive learning, to enhance content representations. Meanwhile, we propose a lightweight graph convolutional network for each modality that amalgamates neighborhood and ID embeddings to improve structural representations. Finally, the content and structure representations are combined to form the final item embedding for recommendation. Extensive experiments on three real-world datasets (Baby, Sports, and Clothing) demonstrate the superiority of our method over state-of-the-art multimodal recommendation methods and the effectiveness of fine-grained ID embeddings.
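    A minimal sketch, under stated assumptions, of the lightweight per-modality graph convolution idea: ID embeddings and projected modality features are summed and propagated LightGCN-style over a normalized user-item adjacency. The dimensions, dense adjacency, and layer averaging are illustrative choices, not the paper's exact model.

```python
import torch
import torch.nn as nn

class LightModalityGCN(nn.Module):
    """Illustrative per-modality GCN mixing ID embeddings with modality features."""
    def __init__(self, num_nodes, modal_dim, id_dim=64):
        super().__init__()
        self.id_emb = nn.Embedding(num_nodes, id_dim)
        self.modal_proj = nn.Linear(modal_dim, id_dim)   # map modality features into ID space

    def forward(self, norm_adj, modal_feats, layers=2):
        # norm_adj: (N, N) symmetrically normalized adjacency (dense here for simplicity)
        x = self.id_emb.weight + self.modal_proj(modal_feats)
        outs = [x]
        for _ in range(layers):                          # parameter-free propagation, LightGCN style
            x = norm_adj @ x
            outs.append(x)
        return torch.stack(outs).mean(dim=0)             # layer-averaged node representations

gcn = LightModalityGCN(num_nodes=100, modal_dim=512)
adj = torch.eye(100)                                     # stand-in normalized adjacency
reps = gcn(adj, torch.randn(100, 512))                   # structural representations per node
```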

    Consecutive Insulator-Metal-Insulator Phase Transitions of Vanadium Dioxide by Hydrogen Doping

    We report modulation of a reversible phase transition in VO2 films by hydrogen doping. A metallic phase and a new insulating phase are successively observed at room temperature as the doping concentration increases. It is suggested that the polarized charges from the doped hydrogen atoms play an important role: these charges gradually occupy the V3d-O2p hybridized orbitals and consequently modulate the filling of the VO2 crystal's conduction band-edge states, which eventually evolve into new valence band-edge states. This demonstrates the exceptional sensitivity of the electronic properties of VO2 to electron concentration and orbital occupancy, providing key information on the phase transition mechanism. Comment: 16 pages, 4 figures