
    SSDB-Net: a single-step dual branch network for weakly supervised semantic segmentation of food images

    Food image segmentation, a critical task in food and nutrition research, promotes the development of application domains such as calorie and nutrition estimation, food recommender systems, and daily food monitoring systems. Most current research focuses on food/non-food segmentation, which simply separates food regions from the background. In contrast, semantic food segmentation identifies the specific food ingredients in a food image and provides more detailed and accurate information such as object location, shape, and class. This task is more challenging but also more meaningful, because the same food may appear with completely different colours, shapes, and textures in different dishes, and it has correspondingly received less research attention. From the implementation perspective, most previous research relies on deep learning methods trained on pixel-level labelled data; however, annotating pixel-level labels incurs extremely high labour costs. In this paper, a novel single-step dual branch network (SSDB-Net) is proposed to achieve weakly supervised semantic food segmentation. To our knowledge, this is the first work to propose weakly supervised semantic food segmentation from image-level labels based on convolutional neural networks (CNNs), and it may serve as a benchmark for future food segmentation research. Our proposed method achieved an mIoU of 14.79% over the 104 categories of the FoodSeg103 dataset, compared with 11.49% for the state-of-the-art WSSS method applied in the food domain.
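    The mIoU reported above is the standard semantic-segmentation metric: intersection-over-union computed per class, then averaged over the classes present. A minimal sketch of a typical computation follows (illustrative only, not the authors' evaluation code; the class count mirrors FoodSeg103's 104 categories):

        import numpy as np

        def mean_iou(pred, target, num_classes):
            # pred and target are integer class maps of identical shape.
            ious = []
            for c in range(num_classes):
                pred_c, target_c = pred == c, target == c
                union = np.logical_or(pred_c, target_c).sum()
                if union == 0:        # class absent from both maps: skip it
                    continue
                intersection = np.logical_and(pred_c, target_c).sum()
                ious.append(intersection / union)
            return float(np.mean(ious))

        # Toy 64x64 class maps over 104 categories (103 ingredients + background)
        pred = np.random.randint(0, 104, size=(64, 64))
        target = np.random.randint(0, 104, size=(64, 64))
        print(f"mIoU: {mean_iou(pred, target, num_classes=104):.4f}")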

    Vision-language models boost food composition compilation

    Nutrition information plays a pillar role in clinical dietary practice, precision nutrition, and the food industry. Currently, food composition compilation serves as the standard paradigm for estimating food nutrition information from food ingredient information. Within this paradigm, however, conventional approaches are laborious and highly dependent on the experience of data managers; they cannot keep pace with the dynamic consumer market, resulting in lagging and missing nutrition data. Earlier machine learning methods either could not fully understand food ingredient statement information or ignored the characteristics of food images. To this end, we developed a novel vision-language AI model, UMDFood-VL, which uses front-of-package labeling and product images to accurately estimate food composition profiles. To drive the training of such a large model, we established UMDFood-90k, the most comprehensive multimodal food database to date. The UMDFood-VL model significantly outperformed convolutional neural networks (CNNs) and recurrent neural networks (RNNs) on a variety of nutrition value estimations. For instance, it achieved a macro-AUCROC of up to 0.921 for fat value estimation, which satisfies the practical requirements of food composition compilation. This performance sheds light on generalizing to other food- and nutrition-related data compilation tasks and catalyzes the evolution of other food applications. (Comment: 31 pages, 5 figures)
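    The macro-AUCROC cited for fat estimation is the one-vs-rest ROC AUC averaged with equal weight over classes. A minimal sketch with scikit-learn is shown below; the three fat bins and the random scores are made up for illustration and are not the paper's actual setup:

        import numpy as np
        from sklearn.metrics import roc_auc_score

        # Hypothetical setup: fat content discretised into 3 bins (low/medium/high).
        y_true = np.array([0, 2, 1, 2, 0, 1, 2, 0])        # true fat bins
        y_score = np.random.dirichlet(np.ones(3), size=8)  # per-class probabilities

        # One-vs-rest AUC per class, averaged with equal class weight ("macro").
        macro_auc = roc_auc_score(y_true, y_score, multi_class="ovr", average="macro")
        print(f"macro-AUCROC: {macro_auc:.3f}")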

    The Nutritional Content of Meal Images in Free-Living Conditions: Automatic Assessment with goFOOD™.

    A healthy diet can help to prevent or manage many important conditions and diseases, particularly obesity, malnutrition, and diabetes. Recent advancements in artificial intelligence and smartphone technologies have enabled applications to conduct automatic nutritional assessment from meal images, providing a convenient, efficient, and accurate method for continuous diet evaluation. We now extend the goFOOD™ automatic system to perform food segmentation, recognition, and volume estimation, as well as calorie and macronutrient estimation, from single images captured by a smartphone. To assess our system's performance, we conducted a feasibility study with 50 participants from Switzerland. We recorded their meals for one day, after which dietitians carried out a 24-hour recall, and we retrospectively analysed the collected images to assess the nutritional content of the meals. By comparing our results with the dietitians' estimations, we demonstrated that the newly introduced system matches the energy and macronutrient estimation performance of the previous method while requiring only a single image instead of two. The system can be applied in real-life scenarios and can easily be used to assess dietary intake. It could help individuals gain a better understanding of their dietary consumption, serve as a valuable resource for dietitians, and contribute to nutritional research.
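    Pipelines of this kind typically convert each segmented food's estimated volume into mass via a density lookup, and then into energy via macronutrient composition (Atwater factors of 4/4/9 kcal per gram for carbohydrate, protein, and fat). The sketch below illustrates that arithmetic with made-up lookup values; it is not goFOOD's actual database or pipeline:

        # Hypothetical lookup table: density in g/ml, macronutrients in g per 100 g.
        FOOD_DB = {
            "rice":    {"density": 0.90, "carbs": 28.0, "protein": 2.7, "fat": 0.3},
            "chicken": {"density": 1.05, "carbs": 0.0, "protein": 27.0, "fat": 3.6},
        }
        ATWATER_KCAL_PER_G = {"carbs": 4.0, "protein": 4.0, "fat": 9.0}

        def estimate_nutrients(food, volume_ml):
            # Convert an estimated volume to mass, macronutrients, and energy.
            item = FOOD_DB[food]
            mass_g = volume_ml * item["density"]
            macros = {k: mass_g * item[k] / 100.0 for k in ATWATER_KCAL_PER_G}
            kcal = sum(macros[k] * ATWATER_KCAL_PER_G[k] for k in macros)
            return mass_g, macros, kcal

        mass, macros, kcal = estimate_nutrients("rice", volume_ml=150.0)
        print(f"rice: {mass:.0f} g, {kcal:.0f} kcal, macros: {macros}")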

    Power efficient machine learning-based hardware architectures for biomedical applications

    The future of critical health diagnosis will involve intelligent, smart devices that are low-cost, wearable, and lightweight, requiring low-power, energy-efficient hardware platforms. Various machine learning models, such as deep learning architectures, have been employed to design intelligent healthcare systems. However, deploying these sophisticated devices in real-time embedded systems with limited hardware resources and power budgets is complex, because achieving a high accuracy rate demands high computational power. This creates a significant gap between the advancement of computing technology and the associated device technologies for healthcare applications. In this work, power-efficient machine learning-based digital hardware design techniques are introduced to realize a compact design solution while maintaining optimal prediction accuracy. Two hardware design approaches, DeepSAC and SABiNN, are proposed and analyzed: DeepSAC is a shift-accumulator-based technique, whereas SABiNN is a 2's complement-based binarized digital hardware technique. Neural network models such as feedforward networks, convolutional neural networks, residual networks, and other popular machine learning and deep neural networks are selected to benchmark the proposed architectures. Various deep compression techniques, such as pruning, n-bit (n = 8, 16) integer quantization, and binarization of hyper-parameters, are also employed. These models reduced power consumption by 5x and model size by 13x, and improved model latency. To demonstrate these models in a biomedical application, a sleep apnea (SA) detection device for adults was developed to detect SA events in real time. The system takes two physiological signals as input, an ECG signal from chest movement and an SpO2 measurement from a pulse oximeter, to predict the occurrence of SA episodes. In the training phase, actual patient data are used, and the network model is converted into the proposed hardware models to achieve medically backed accuracy. After reaching an acceptable accuracy of 88 percent, all parameters are extracted for inference on the edge. In the inference phase, the extracted parameters are validated on reconfigurable hardware for model precision and power consumption before being translated onto silicon. This research implements the final model on CMOS platforms using 130 nm and 180 nm commercial CMOS processes.
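    The core idea behind the binarized (SABiNN-style) path can be seen in a few lines: once weights are binarized to {-1, +1}, every multiply in a multiply-accumulate reduces to passing the activation through or negating it, and negation in 2's complement is just invert-and-add-one, so no hardware multiplier is needed. The Python sketch below is an illustrative reconstruction of that principle, not the authors' implementation:

        import numpy as np

        def binarize(weights):
            # Binarize real-valued weights to {-1, +1} via the sign function.
            return np.where(weights >= 0, 1, -1).astype(np.int32)

        def binarized_dot(x, w_bin):
            # Multiplier-free dot product: x * (+1) passes through, while
            # x * (-1) becomes the 2's-complement negation ~x + 1.
            acc = np.int32(0)
            for xi, wi in zip(x, w_bin):
                acc += xi if wi == 1 else (~xi + np.int32(1))
            return acc

        x = np.array([3, -2, 5, 1], dtype=np.int32)
        w = np.array([0.7, -0.4, 0.1, -0.9])
        print(binarized_dot(x, binarize(w)))   # 3 + 2 + 5 - 1 = 9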