Artificial Intelligence (AI) offers a promising approach to automating neonatal pain assessment, improving consistency and objectivity in clinical decision-making. However, differences between how humans and AI models perceive and explain pain-related features present challenges for adoption. In this study, we introduce a region-based explanation framework that improves interpretability and agreement between explainable AI (XAI) methods and human assessments. Alongside this, we present a multi-metric evaluation protocol that jointly considers robustness, faithfulness, and agreement to support informed explainer selection. Applied to neonatal pain classification, our approach reveals several key insights: region-based explanations are more intuitive and stable than pixel-based methods, leading to higher consensus among explainer ensembles; both humans and machines focus on central facial features, such as the nose, mouth, and eyes; agreement is higher in "pain" cases than in "no-pain" cases, likely due to clearer visual cues; and robustness correlates positively with agreement, while higher faithfulness can reduce pixel-level consensus. Our findings highlight the value of region-based evaluation and multi-perspective analysis for improving the transparency and reliability of AI systems in clinical settings. We hope that this framework can support clinicians in better understanding model decisions, enabling more informed trust and integration of AI support in neonatal care.
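To illustrate the core idea of region-based explanation and agreement scoring, the following is a minimal sketch, not the paper's actual implementation: the region boxes, the `region_scores` helper, and the `region_agreement` function are hypothetical, and in practice region masks would come from facial landmark detection rather than fixed coordinates.

```python
import numpy as np
from scipy.stats import spearmanr

# Hypothetical facial regions given as (row_start, row_end, col_start, col_end)
# boxes on a 100x100 face crop; a real pipeline would derive these from landmarks.
REGIONS = {
    "left_eye":  (20, 40, 15, 45),
    "right_eye": (20, 40, 55, 85),
    "nose":      (40, 60, 35, 65),
    "mouth":     (60, 80, 30, 70),
}

def region_scores(attribution_map, regions=REGIONS):
    """Aggregate a pixel-level attribution map (H x W) into per-region scores
    by averaging absolute attribution inside each region box."""
    return {
        name: float(np.abs(attribution_map[r0:r1, c0:c1]).mean())
        for name, (r0, r1, c0, c1) in regions.items()
    }

def region_agreement(map_a, map_b, regions=REGIONS):
    """Spearman rank correlation between two explainers' region-level rankings,
    one simple way to quantify agreement at the region (rather than pixel) level."""
    names = sorted(regions)
    a = [region_scores(map_a, regions)[n] for n in names]
    b = [region_scores(map_b, regions)[n] for n in names]
    rho, _ = spearmanr(a, b)
    return rho

# Example with two synthetic attribution maps standing in for two XAI methods.
rng = np.random.default_rng(0)
saliency_a = rng.random((100, 100))
saliency_b = saliency_a + 0.1 * rng.random((100, 100))  # a slightly perturbed variant
print(region_agreement(saliency_a, saliency_b))
```

Aggregating attributions over anatomically meaningful regions before comparing explainers is what makes the consensus measure less sensitive to pixel-level noise, which is consistent with the abstract's observation that region-based explanations yield higher agreement than pixel-based ones.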