Structure-Consistent Weakly Supervised Salient Object Detection with Local Saliency Coherence
Sparse labels have been attracting much attention in recent years. However,
the performance gap between weakly supervised and fully supervised salient
object detection methods is huge, and most previous weakly supervised works
adopt complex training methods with many bells and whistles. In this work, we
propose a one-round end-to-end training approach for weakly supervised salient
object detection via scribble annotations without pre/post-processing
operations or extra supervision data. Since scribble labels fail to offer
detailed salient regions, we propose a local coherence loss to propagate the
labels to unlabeled regions based on image features and pixel distance, so as
to predict integral salient regions with complete object structures. We design
a saliency structure consistency loss as a self-consistent mechanism to ensure
consistent saliency maps are predicted with different scales of the same image
as input, which could be viewed as a regularization technique to enhance the
model generalization ability. Additionally, we design an aggregation module
(AGGM) to better integrate high-level features, low-level features and global
context information for the decoder to aggregate various information. Extensive
experiments show that our method achieves a new state-of-the-art performance on
six benchmarks (e.g. for the ECSSD dataset: $F_\beta = 0.8995$, $E_\xi = 0.9079$
and MAE $= 0.0489$), with an average gain of 4.60% for F-measure, 2.05% for
E-measure and 1.88% for MAE over the previous best method on this task. Source
code is available at http://github.com/siyueyu/SCWSSOD.
Comment: Accepted by AAAI202
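The scale-consistency idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact form of the loss, so a plain mean absolute difference is assumed here, and `downscale2x`, `consistency_loss` and the `predict` callable are hypothetical names used only for illustration.

```python
def downscale2x(m):
    """Average-pool a 2D map (list of rows) by a factor of 2."""
    h, w = len(m) // 2, len(m[0]) // 2
    return [
        [
            (m[2 * i][2 * j] + m[2 * i][2 * j + 1]
             + m[2 * i + 1][2 * j] + m[2 * i + 1][2 * j + 1]) / 4.0
            for j in range(w)
        ]
        for i in range(h)
    ]


def consistency_loss(predict, image):
    """Mean absolute difference between the downscaled full-size prediction
    and the prediction made directly on the downscaled image.

    `predict` is any callable mapping a 2D map to a saliency map of the
    same size (assumption: a stand-in for the network).
    """
    full_down = downscale2x(predict(image))   # predict, then shrink
    small = predict(downscale2x(image))       # shrink, then predict
    diffs = [
        abs(a - b)
        for row_a, row_b in zip(full_down, small)
        for a, b in zip(row_a, row_b)
    ]
    return sum(diffs) / len(diffs)


def identity(m):
    """A predictor that is perfectly consistent across scales."""
    return m


def threshold(m):
    """A predictor whose output changes with scale."""
    return [[1.0 if v > 0.5 else 0.0 for v in row] for row in m]


image = [[1.0, 0.0], [0.0, 0.0]]
loss_identity = consistency_loss(identity, image)
loss_threshold = consistency_loss(threshold, image)
```

A predictor that commutes with rescaling incurs zero loss, while the thresholding predictor is penalized because its output differs depending on whether the image is shrunk before or after prediction.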
A Deeper Look at Autonomous Vehicle Ethics: An Integrative Ethical Decision-Making Framework to Explain Moral Pluralism
The autonomous vehicle (AV) is one of the first commercialized AI-embedded robots to make autonomous decisions. Despite technological advancements, unavoidable AV accidents that result in life-and-death consequences cannot be completely eliminated. The emerging social concern of how an AV should make ethical decisions during unavoidable accidents is referred to as the moral dilemma of AV, which has prompted heated discussions among various stakeholders. However, research gaps remain in explainable AV ethical decision-making processes that predict how AVs can make moral decisions that are acceptable from the AV users’ perspectives. This study addresses the key question: What factors affect ethical behavioral intentions in the AV moral dilemma? To answer this question, this study draws theories from multidisciplinary research fields to propose the “Integrative ethical decision-making framework for the AV moral dilemma.” The framework includes four interdependent ethical decision-making stages: AV moral dilemma issue framing, intuitive moral reasoning, rational moral reasoning, and ethical behavioral intention making. Further, the framework includes variables (e.g., perceived moral intensity, individual factors, and personal moral philosophies) that influence the ethical decision-making process. For instance, the framework explains that AV users from Eastern cultures will tend to endorse a situationist ethics position (high idealism and high relativism), which holds that ethical decisions are relative to context, more than AV users from Western cultures. This proposition is derived from the link between individual factors and personal moral philosophy. Moreover, the framework proposes a dual-process theory, which explains that both intuitive and rational moral reasoning are integral processes of ethical decision-making during the AV moral dilemma.
Further, this framework posits that the ethical behavioral intentions that lead to decisions in the AV moral dilemma are not fixed, but depend on how an individual perceives the seriousness of the situation, which is shaped by their personal moral philosophy. This framework provides a step-by-step explanation of how pluralistic ethical decision-making occurs, reducing the abstractness of AV moral reasoning processes
Discriminative Triad Matching and Reconstruction for Weakly Referring Expression Grounding
In this paper, we tackle the weakly-supervised referring expression
grounding task: localizing a referent object in an image according
to a query sentence, where the mapping between image regions and queries is
not available during the training stage. In traditional methods, an object
region that best matches the referring expression is picked out, and then the
query sentence is reconstructed from the selected region, where the
reconstruction difference serves as the loss for back-propagation. The existing
methods, however, conduct both the matching and the reconstruction
approximately as they ignore the fact that the matching correctness is unknown.
To overcome this limitation, a discriminative triad is designed here as the
basis of the solution, through which a query can be converted into one or
multiple discriminative triads in a very scalable way. Based on the
discriminative triad, we further propose the triad-level matching and
reconstruction modules, which are lightweight yet effective for
weakly-supervised training, making the model three times lighter and faster than the
previous state-of-the-art methods. One important merit of our work is its
superior performance despite the simple and neat design. Specifically, the
proposed method achieves a new state-of-the-art accuracy when evaluated on
RefCOCO (39.21%), RefCOCO+ (39.18%) and RefCOCOg (43.24%) datasets, which is
4.17%, 4.08% and 7.8% higher than the previous best, respectively.
Comment: TPAM
Class Activation Map Calibration for Weakly Supervised Semantic Segmentation
Image-level weakly supervised semantic segmentation (WSSS) has received substantial attention due to its cost-effective annotation process. In WSSS, Class Activation Maps (CAMs) generated via classifier weights tend to focus on the most discriminative region, while the CAMs derived from class prototypes are significantly enhanced to cover more complete regions. However, the prototype CAMs still exhibit limitations such as incomplete localization maps on target objects and the presence of background noise. In this paper, we propose a novel WSSS framework called Classifier-Prototype Mutual Calibration (CPMC) that leverages the characteristics of both classifier and prototype CAMs to address the above issues. Specifically, an iterative refinement strategy based on context feature dependency is applied to refine the original classifier CAMs, which helps to generate improved prototype CAMs. Subsequently, local prototypes are constructed based on the false negative regions and false positive regions extracted from the previous two CAMs, which contribute to completing missing parts of the target object and suppressing background noise respectively. Therefore, CPMC can alleviate the aforementioned issues. Extensive experimental results on standard WSSS benchmarks (PASCAL VOC and MS COCO) show that our method significantly improves the quality of CAMs and achieves state-of-the-art performance. Our source code will be released
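For readers unfamiliar with classifier-weight CAMs, the following minimal sketch shows the standard construction: a per-class weighted sum of feature channels, followed by ReLU and max-normalization. It does not reproduce CPMC or its prototype CAMs; the function name `class_activation_map` and the toy inputs are assumptions made for illustration.

```python
def class_activation_map(features, weights):
    """Compute a classifier-weight CAM for a single class.

    features: list of K channels, each an H x W list of rows.
    weights:  list of K classifier weights for the class of interest.
    Returns an H x W map with negatives clipped and values scaled to [0, 1].
    """
    h, w = len(features[0]), len(features[0][0])
    # Weighted sum over channels at every spatial location, then ReLU.
    cam = [
        [
            max(0.0, sum(wk * fk[i][j] for wk, fk in zip(weights, features)))
            for j in range(w)
        ]
        for i in range(h)
    ]
    # Normalize by the peak activation so that the maximum is 1.
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[v / peak for v in row] for row in cam]
    return cam


# Toy example: two 2 x 2 feature channels and one class.
features = [
    [[1.0, 0.0], [0.0, 2.0]],
    [[0.0, 1.0], [0.0, 0.0]],
]
weights = [1.0, -1.0]
cam = class_activation_map(features, weights)
```

In the toy example the strongest response sits at the bottom-right location, which illustrates why such maps tend to concentrate on the most discriminative region rather than the whole object.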
Cross-frame feature-saliency mutual reinforcing for weakly supervised video salient object detection
- …