208 research outputs found

    Scraping social media photos posted in Kenya and elsewhere to detect and analyze food types

    Full text link
    Monitoring population-level changes in diet could be useful for education and for implementing interventions to improve health. Research has shown that data from social media sources can be used for monitoring dietary behavior. We propose a scrape-by-location methodology to create food image datasets from Instagram posts. We used it to collect 3.56 million images over a period of 20 days in March 2019. We also propose a scrape-by-keywords methodology and used it to scrape ∼30,000 images and their captions of 38 Kenyan food types. We publish two datasets of 104,000 and 8,174 image/caption pairs, respectively. With the first dataset, Kenya104K, we train a Kenyan Food Classifier, called KenyanFC, to distinguish Kenyan food from non-food images posted in Kenya. We used the second dataset, KenyanFood13, to train a classifier KenyanFTR, short for Kenyan Food Type Recognizer, to recognize 13 popular food types in Kenya. The KenyanFTR is a multimodal deep neural network that can identify 13 types of Kenyan foods using both images and their corresponding captions. Experiments show that the average top-1 accuracy of KenyanFC is 99% over 10,400 tested Instagram images and of KenyanFTR is 81% over 8,174 tested data points. Ablation studies show that three of the 13 food types are particularly difficult to categorize based on image content only and that adding analysis of captions to the image analysis yields a classifier that is 9 percent points more accurate than a classifier that relies only on images. Our food trend analysis revealed that cakes and roasted meats were the most popular foods in photographs on Instagram in Kenya in March 2019.Accepted manuscrip

    Recognition of Activities of Daily Living with Egocentric Vision: A Review.

    Get PDF
    Video-based recognition of activities of daily living (ADLs) is being used in ambient assisted living systems in order to support the independent living of older people. However, current systems based on cameras located in the environment present a number of problems, such as occlusions and a limited field of view. Recently, wearable cameras have begun to be exploited. This paper presents a review of the state of the art of egocentric vision systems for the recognition of ADLs following a hierarchical structure: motion, action and activity levels, where each level provides higher semantic information and involves a longer time frame. The current egocentric vision literature suggests that ADLs recognition is mainly driven by the objects present in the scene, especially those associated with specific tasks. However, although object-based approaches have proven popular, object recognition remains a challenge due to the intra-class variations found in unconstrained scenarios. As a consequence, the performance of current systems is far from satisfactory

    Face Quality Estimation and Its Correlation to Demographic and Non-Demographic Bias in Face Recognition

    Full text link
    Face quality assessment aims at estimating the utility of a face image for the purpose of recognition. It is a key factor to achieve high face recognition performances. Currently, the high performance of these face recognition systems come with the cost of a strong bias against demographic and non-demographic sub-groups. Recent work has shown that face quality assessment algorithms should adapt to the deployed face recognition system, in order to achieve highly accurate and robust quality estimations. However, this could lead to a bias transfer towards the face quality assessment leading to discriminatory effects e.g. during enrolment. In this work, we present an in-depth analysis of the correlation between bias in face recognition and face quality assessment. Experiments were conducted on two publicly available datasets captured under controlled and uncontrolled circumstances with two popular face embeddings. We evaluated four state-of-the-art solutions for face quality assessment towards biases to pose, ethnicity, and age. The experiments showed that the face quality assessment solutions assign significantly lower quality values towards subgroups affected by the recognition bias demonstrating that these approaches are biased as well. This raises ethical questions towards fairness and discrimination which future works have to address.Comment: Accepted at IJCB202

    Integration of Context Information through Probabilistic Ontological Knowledge into Image Classification

    Get PDF
    The use of ontological knowledge to improve classification results is a promising line of research. The availability of a probabilistic ontology raises the possibility of combining the probabilities coming from the ontology with the ones produced by a multi-class classifier that detects particular objects in an image. This combination not only provides the relations existing between the different segments, but can also improve the classification accuracy. In fact, it is known that the contextual information can often give information that suggests the correct class. This paper proposes a possible model that implements this integration, and the experimental assessment shows the effectiveness of the integration, especially when the classifier’s accuracy is relatively low. To assess the performance of the proposed model, we designed and implemented a simulated classifier that allows a priori decisions of its performance with sufficient precision

    Fine-grained action recognition by motion saliency and mid-level patches

    Get PDF
    Effective extraction of human body parts and operated objects participating in action is the key issue of fine-grained action recognition. However, most of the existing methods require intensive manual annotation to train the detectors of these interaction components. In this paper, we represent videos by mid-level patches to avoid the manual annotation, where each patch corresponds to an action-related interaction component. In order to capture mid-level patches more exactly and rapidly, candidate motion regions are extracted by motion saliency. Firstly, the motion regions containing interaction components are segmented by a threshold adaptively calculated according to the saliency histogram of the motion saliency map. Secondly, we introduce a mid-level patch mining algorithm for interaction component detection, with object proposal generation and mid-level patch detection. The object proposal generation algorithm is used to obtain multi-granularity object proposals inspired by the idea of the Huffman algorithm. Based on these object proposals, the mid-level patch detectors are trained by K-means clustering and SVM. Finally, we build a fine-grained action recognition model using a graph structure to describe relationships between the mid-level patches. To recognize actions, the proposed model calculates the appearance and motion features of mid-level patches and the binary motion cooperation relationships between adjacent patches in the graph. Extensive experiments on the MPII cooking database demonstrate that the proposed method gains better results on fine-grained action recognition
    • …
    corecore