Scraping social media photos posted in Kenya and elsewhere to detect and analyze food types
Monitoring population-level changes in diet could be useful for education and for implementing interventions to improve health. Research has shown that data from social media sources can be used for monitoring dietary behavior. We propose a scrape-by-location methodology to create food image datasets from Instagram posts, and used it to collect 3.56 million images over a period of 20 days in March 2019. We also propose a scrape-by-keywords methodology and used it to scrape ∼30,000 images and their captions covering 38 Kenyan food types. We publish two datasets of 104,000 and 8,174 image/caption pairs, respectively. With the first dataset, Kenya104K, we train a Kenyan Food Classifier, called KenyanFC, to distinguish Kenyan food from non-food images posted in Kenya. We used the second dataset, KenyanFood13, to train a classifier KenyanFTR, short for Kenyan Food Type Recognizer, to recognize 13 popular food types in Kenya. KenyanFTR is a multimodal deep neural network that identifies 13 types of Kenyan foods using both images and their corresponding captions. Experiments show that the average top-1 accuracy of KenyanFC is 99% over 10,400 tested Instagram images, and that of KenyanFTR is 81% over 8,174 tested data points. Ablation studies show that three of the 13 food types are particularly difficult to categorize based on image content alone, and that adding caption analysis to image analysis yields a classifier that is 9 percentage points more accurate than one relying only on images. Our food trend analysis revealed that cakes and roasted meats were the most popular foods in photographs on Instagram in Kenya in March 2019.
Accepted manuscript.
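The abstract describes KenyanFTR as a multimodal network that fuses image and caption signals. A minimal late-fusion sketch of that idea is below; it is not the authors' architecture, and the projection matrices stand in for real image and text encoders (e.g. a CNN and a language model), which are assumptions here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for trained encoders: fixed random projections
# mapping each modality's feature vector into a shared joint space.
W_img = rng.normal(size=(2048, 64))   # image features -> joint space
W_txt = rng.normal(size=(768, 64))    # caption features -> joint space
W_cls = rng.normal(size=(128, 13))    # fused features -> 13 food types

def classify(img_feat: np.ndarray, txt_feat: np.ndarray) -> int:
    """Late fusion: project each modality, concatenate, then classify."""
    fused = np.concatenate([img_feat @ W_img, txt_feat @ W_txt])
    logits = fused @ W_cls
    return int(np.argmax(logits))

pred = classify(rng.normal(size=2048), rng.normal(size=768))
assert 0 <= pred < 13
```

The reported 9-percentage-point gain from captions corresponds to the difference between this two-branch model and one that drops the text branch entirely.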
An Improved Encoder-Decoder Framework for Food Energy Estimation
Dietary assessment is essential to maintaining a healthy lifestyle. Automatic image-based dietary assessment is a growing field of research due to the increasing prevalence of image-capturing devices (e.g., mobile phones). In this work, we estimate food energy from a single monocular image, a difficult task because the energy information present in an image is limited and hard to extract. To do so, we employ an improved encoder-decoder framework for energy estimation: the encoder transforms the image into a representation embedded with food energy information in an easier-to-extract format, from which the decoder then extracts the energy information. To implement our method, we compile a high-quality food image dataset, verified by registered dietitians, containing eating scene images, food-item segmentation masks, and ground-truth calorie values. Our method improves upon previous caloric estimation methods by over 10% and 30 kCal in terms of MAPE and MAE, respectively.
Comment: Accepted for MADiMa'23 in ACM Multimedia.
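The encoder-decoder idea above can be sketched with toy stand-ins: an "encoder" that maps the image into a non-negative intermediate map where energy is easy to aggregate, and a "decoder" that pools it into a single kCal value. Both functions are illustrative assumptions, not the paper's networks.

```python
import numpy as np

rng = np.random.default_rng(1)

def encoder(image: np.ndarray) -> np.ndarray:
    """Stand-in encoder: transform the image into a non-negative
    'energy distribution map'. A real implementation would be a
    learned convolutional network."""
    return np.maximum(image - 0.5, 0.0)   # toy ReLU-style transform

def decoder(energy_map: np.ndarray) -> float:
    """Stand-in decoder: pool the intermediate map into one kCal value."""
    return float(energy_map.sum())

image = rng.uniform(0.0, 1.0, size=(8, 8))   # fake single monocular image
kcal = decoder(encoder(image))
assert kcal >= 0.0
```

The design point is the split itself: the hard regression problem is factored into an image-to-representation step and a much simpler representation-to-scalar step.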
Diffusion Model with Clustering-based Conditioning for Food Image Generation
Image-based dietary assessment serves as an efficient and accurate solution for recording and analyzing nutrition intake using eating-occasion images as input. Deep learning-based techniques are commonly used to perform image analysis such as food classification, segmentation, and portion size estimation, and they rely on large amounts of annotated food images for training. However, this data dependency poses a significant barrier to real-world applications, because acquiring a substantial, diverse, and balanced set of food images can be challenging. One potential solution is to use synthetic food images for data augmentation. Although existing work has explored generative adversarial network (GAN)-based structures for generation, the quality of synthetic food images remains subpar. In addition, while diffusion-based generative models have shown promising results for general image generation tasks, generating food images can be challenging due to the substantial intra-class variance. In this paper, we investigate the generation of synthetic food images based on a conditional diffusion model and propose an effective clustering-based training framework, named ClusDiff, for generating high-quality and representative food images. The proposed method is evaluated on the Food-101 dataset and shows improved performance compared with existing image generation works. We also demonstrate that the synthetic food images generated by ClusDiff can help address the severe class imbalance issue in long-tailed food classification on the VFN-LT dataset.
Comment: Accepted for the 31st ACM International Conference on Multimedia, 8th International Workshop on Multimedia Assisted Dietary Management (MADiMa 2023).
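One plausible reading of clustering-based conditioning, sketched below under stated assumptions: cluster the image embeddings within a class, then condition the diffusion model on (class label, cluster id) so each mode of the intra-class variance gets its own conditioning signal. The tiny k-means here is illustrative only; the paper's actual conditioning mechanism may differ.

```python
import numpy as np

rng = np.random.default_rng(2)

def kmeans_labels(feats: np.ndarray, k: int, iters: int = 10) -> np.ndarray:
    """Tiny k-means: return a cluster id per sample."""
    centers = feats[rng.choice(len(feats), size=k, replace=False)]
    for _ in range(iters):
        dists = ((feats[:, None, :] - centers[None]) ** 2).sum(-1)
        labels = np.argmin(dists, axis=1)
        for c in range(k):
            if (labels == c).any():
                centers[c] = feats[labels == c].mean(axis=0)
    return labels

# Stand-in image embeddings for one food class; each image would then be
# conditioned on (class label, cluster id) during diffusion training.
feats = rng.normal(size=(100, 16))
cluster_ids = kmeans_labels(feats, k=4)
assert set(int(c) for c in cluster_ids) <= set(range(4))
```

At sampling time, drawing cluster ids in proportion to the training distribution would keep the generated set representative of each class's visual modes.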
Smartphone-based Calorie Estimation From Food Image Using Distance Information
Personal assistive systems for diet control can play a vital role in combating obesity. As smartphones have become inseparable companions for a large number of people around the world, designing a smartphone-based system is perhaps the best choice at the moment. Using such a system, people can take an image of their food right before eating and learn the calorie content based on the food items on the plate. In this paper, we propose a simple method that ensures both user flexibility and high accuracy at the same time. The proposed system captures food images with a fixed posture and estimates the volume of the food using simple geometry. Real-world experiments on arbitrarily chosen food items show that the proposed system works well for both regular and liquid food items.
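The "simple geometry" step typically rests on the pinhole-camera relation: an object spanning w pixels at distance d covers w·d/f in real units, where f is the focal length in pixels. A sketch of that pipeline follows; the cylinder shape model, the numbers, and the energy density are all illustrative assumptions, not the paper's calibration.

```python
import math

def real_width_cm(pixel_width: float, distance_cm: float,
                  focal_px: float) -> float:
    """Pinhole-camera relation: real size = pixel size * distance / focal."""
    return pixel_width * distance_cm / focal_px

def cylinder_volume_ml(diameter_cm: float, height_cm: float) -> float:
    """Approximate the food pile as a cylinder; 1 cm^3 = 1 ml."""
    r = diameter_cm / 2.0
    return math.pi * r * r * height_cm

# Hypothetical numbers: food region 300 px wide, phone held 30 cm away
# (known from the fixed capture posture), focal length 1500 px.
diameter = real_width_cm(300, 30, 1500)      # 300 * 30 / 1500 = 6.0 cm
volume = cylinder_volume_ml(diameter, 3.0)   # assumed 3 cm pile height
kcal = volume * 1.2                          # assumed energy density, kcal/ml
assert abs(diameter - 6.0) < 1e-9
```

The fixed capture posture is what makes this workable: it pins down the camera-to-plate distance, the one quantity a single monocular image cannot supply on its own.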
Multi-Stage Hierarchical Food Classification
Food image classification serves as a fundamental and critical step in image-based dietary assessment, facilitating nutrient-intake analysis from captured food images. However, existing work in food classification predominantly focuses on predicting 'food types', which do not contain direct nutritional composition information. This limitation arises from the inherent discrepancies in nutrition databases, which associate each 'food item' with its respective information. Therefore, in this work we aim to classify food items to align with the nutrition database. To this end, we first introduce the VFN-nutrient dataset by annotating each food image in VFN with a food item that includes nutritional composition information. Such annotation of food items, being more discriminative than food types, creates a hierarchical structure within the dataset. However, since the food item annotations are based solely on nutritional composition information, they do not always show visual relations with each other, which poses significant challenges when applying deep learning-based techniques for classification. To address this issue, we propose a multi-stage hierarchical framework for food item classification that iteratively clusters and merges food items during training, allowing the deep model to extract image features that are discriminative across labels. Our method is evaluated on the VFN-nutrient dataset and achieves promising results compared with existing work in terms of both food-type and food-item classification.
Comment: Accepted for ACM MM 2023 MADiMa.
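The iterative clustering-and-merging idea can be sketched as grouping food items whose mean image embeddings fall within a distance threshold, then training the next stage on the coarser merged labels. This union-find toy is an assumption about the mechanism, not the paper's algorithm.

```python
import numpy as np

def coarse_labels(item_means: np.ndarray, threshold: float) -> np.ndarray:
    """Union food items whose mean embeddings are within `threshold`;
    each connected group becomes one coarse label for the next stage."""
    n = len(item_means)
    parent = list(range(n))

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if np.linalg.norm(item_means[i] - item_means[j]) < threshold:
                parent[find(i)] = find(j)

    roots = [find(i) for i in range(n)]
    remap = {r: k for k, r in enumerate(dict.fromkeys(roots))}
    return np.array([remap[r] for r in roots])

# Two visually similar items plus one distinct item (toy embeddings).
means = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]])
labels = coarse_labels(means, threshold=1.0)
assert labels[0] == labels[1] and labels[0] != labels[2]
```

Repeating this with embeddings from each newly trained model is what makes the scheme multi-stage: the groupings track what the current model can actually distinguish.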