Egocentric Image Captioning for Privacy-Preserved Passive Dietary Intake Monitoring
Camera-based passive dietary intake monitoring can continuously
capture the eating episodes of a subject, recording rich visual information
such as the type and volume of food being consumed, as well as the subject's
eating behaviours. However, no existing method incorporates these visual
cues to provide a comprehensive context of dietary intake from passive
recording (e.g., whether the subject is sharing food with others, what food
the subject is eating, and how much food is left in the bowl). Privacy is
also a major concern when egocentric wearable cameras are used for capture.
In this paper, we propose a privacy-preserved and secure solution (i.e.,
egocentric image captioning) for dietary assessment with passive monitoring,
which unifies food recognition, volume estimation, and scene understanding.
By converting images into rich text descriptions, nutritionists can assess
individual dietary intake from the captions rather than the original images,
reducing the risk of privacy leakage. To this end, an egocentric dietary
image captioning dataset has been built, consisting of in-the-wild images
captured by head-worn and chest-worn cameras in field studies in Ghana. A
novel transformer-based architecture is designed to caption egocentric
dietary images, and comprehensive experiments have been conducted to evaluate
its effectiveness and justify its design. To the best of our knowledge, this
is the first work to apply image captioning to dietary intake assessment in
real-life settings
- …
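The abstract describes an encoder-decoder pipeline: a visual encoder extracts features from an egocentric image, and a transformer decoder generates a caption that nutritionists can read instead of the image. The following is a minimal, generic sketch of that pattern, not the authors' architecture; the stand-in convolutional encoder, all layer sizes, and the toy vocabulary size are illustrative assumptions.

```python
import torch
import torch.nn as nn

class CaptionerSketch(nn.Module):
    """Generic image captioner: visual encoder + causal transformer decoder."""

    def __init__(self, vocab_size=32, d_model=64, nhead=4, num_layers=2):
        super().__init__()
        # Stand-in visual encoder: maps a 3x64x64 image to an 8x8 grid of
        # d_model-dimensional patch features (a pretrained CNN/ViT backbone
        # would be used in practice).
        self.encoder = nn.Sequential(
            nn.Conv2d(3, d_model, kernel_size=8, stride=8),
            nn.ReLU(),
        )
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerDecoderLayer(d_model, nhead, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, images, tokens):
        # images: (B, 3, 64, 64); tokens: (B, T) previously generated word ids
        feats = self.encoder(images).flatten(2).transpose(1, 2)  # (B, 64, d)
        tgt = self.embed(tokens)                                 # (B, T, d)
        # Causal mask so each position attends only to earlier tokens.
        mask = nn.Transformer.generate_square_subsequent_mask(tokens.size(1))
        out = self.decoder(tgt, feats, tgt_mask=mask)
        return self.head(out)  # (B, T, vocab): next-token scores per position

model = CaptionerSketch()
logits = model(torch.randn(2, 3, 64, 64), torch.zeros(2, 5, dtype=torch.long))
print(tuple(logits.shape))
```

At inference time, a caption would be produced autoregressively: start from a begin-of-sentence token, repeatedly feed the generated prefix back in, and take the arg-max (or beam-search) over the final position's vocabulary scores until an end token is emitted.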