38 research outputs found

    Scraping social media photos posted in Kenya and elsewhere to detect and analyze food types

    Monitoring population-level changes in diet could be useful for education and for implementing interventions to improve health. Research has shown that data from social media sources can be used for monitoring dietary behavior. We propose a scrape-by-location methodology to create food image datasets from Instagram posts. We used it to collect 3.56 million images over a period of 20 days in March 2019. We also propose a scrape-by-keywords methodology and used it to scrape ∼30,000 images and their captions of 38 Kenyan food types. We publish two datasets of 104,000 and 8,174 image/caption pairs, respectively. With the first dataset, Kenya104K, we train a Kenyan Food Classifier, called KenyanFC, to distinguish Kenyan food from non-food images posted in Kenya. We used the second dataset, KenyanFood13, to train a classifier KenyanFTR, short for Kenyan Food Type Recognizer, to recognize 13 popular food types in Kenya. The KenyanFTR is a multimodal deep neural network that can identify 13 types of Kenyan foods using both images and their corresponding captions. Experiments show that the average top-1 accuracy of KenyanFC is 99% over 10,400 tested Instagram images and of KenyanFTR is 81% over 8,174 tested data points. Ablation studies show that three of the 13 food types are particularly difficult to categorize based on image content only and that adding analysis of captions to the image analysis yields a classifier that is 9 percentage points more accurate than a classifier that relies only on images. Our food trend analysis revealed that cakes and roasted meats were the most popular foods in photographs on Instagram in Kenya in March 2019. Accepted manuscript

    Light field image processing : overview and research issues

    Light field (LF) imaging first appeared in the computer graphics community with the goal of photorealistic 3D rendering [1]. Motivated by a variety of potential applications in various domains (e.g., computational photography, augmented reality, light field microscopy, medical imaging, 3D robotics, particle image velocimetry), imaging from real light fields has recently gained in popularity, at both the research and industrial levels. Peer-reviewed

    Instanceeasytl: an improved transfer-learning method for EEG-based cross-subject fatigue detection

    Electroencephalogram (EEG) is an effective indicator for the detection of driver fatigue. Due to the significant differences in EEG signals across subjects, and the difficulty of collecting sufficient EEG samples for analysis during driving, detecting fatigue across subjects using EEG signals remains a challenge. EasyTL is a transfer-learning model that has demonstrated better performance in the field of image recognition, but has not yet been applied in cross-subject EEG-based applications. In this paper, we propose an improved EasyTL-based classifier, the InstanceEasyTL, to perform EEG-based analysis for cross-subject fatigue mental-state detection. Experimental results show that InstanceEasyTL not only requires less EEG data, but also obtains better accuracy and robustness than EasyTL, as well as existing machine-learning models such as Support Vector Machine (SVM), Transfer Component Analysis (TCA), Geodesic Flow Kernel (GFK), and Domain-adversarial Neural Networks (DANN).

    Network Capacity Bound for Personalized PageRank in Multimodal Networks

    In a former paper the concept of Bipartite PageRank was introduced and a theorem on the limit of authority flowing between nodes for personalized PageRank was generalized. In this paper we extend those results to multimodal networks. In particular, we introduce a hypergraph type that may be used for describing a multimodal network where a hyperlink connects nodes from each of the modalities. We introduce a generalisation of PageRank for such graphs and define the respective random walk model that can be used for computations. We finally state and prove theorems on the limit of outflow of authority for cases where individual modalities have identical and distinct damping factors. Comment: 28 pages. arXiv admin note: text overlap with arXiv:1702.0373
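For readers unfamiliar with the underlying quantity, the following is a minimal sketch of ordinary personalized PageRank on a plain directed graph, computed by power iteration. It is not the paper's multimodal hypergraph generalisation. The function name, the graph encoding, and the convention that `damping` is the probability of following a link (rather than jumping back to the preferred nodes) are all assumptions made for illustration.

```python
def personalized_pagerank(out_links, preference, damping=0.85, iterations=200):
    """Power iteration for personalized PageRank (illustrative sketch).

    out_links:  dict mapping each node to a list of nodes it links to;
                every target must also appear as a key.
    preference: dict mapping nodes to jump probabilities that sum to 1.
    damping:    assumed here to be the probability of following a link.
    """
    nodes = list(out_links)
    rank = {n: preference.get(n, 0.0) for n in nodes}
    for _ in range(iterations):
        # Random jump: authority re-enters only through the preferred nodes.
        nxt = {n: (1.0 - damping) * preference.get(n, 0.0) for n in nodes}
        for n in nodes:
            targets = out_links[n]
            if targets:
                # Split the damped authority evenly across out-links.
                share = damping * rank[n] / len(targets)
                for t in targets:
                    nxt[t] += share
            else:
                # Dangling node: return its authority to the preferred nodes.
                for m in nodes:
                    nxt[m] += damping * rank[n] * preference.get(m, 0.0)
        rank = nxt
    return rank


# Two nodes linking to each other, with all random jumps going to "a".
ranks = personalized_pagerank({"a": ["b"], "b": ["a"]}, {"a": 1.0})
```

In this toy graph the authority that reaches "b" can only arrive over the single link from "a", which is the kind of flow between preferred and non-preferred nodes that the abstract's theorems place a limit on.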

    A Decoding-Complexity and Rate-Controlled Video-Coding Algorithm for HEVC

    Video playback on mobile consumer electronic (CE) devices is plagued by fluctuations in the network bandwidth and by limitations in processing and energy availability at the individual devices. Seen as a potential solution, the state-of-the-art adaptive streaming mechanisms address the first aspect, yet the efficient control of the decoding-complexity and the energy use when decoding the video remain unaddressed. The end-users’ quality of experience (QoE), however, depends on the capability to adapt the bit streams to both these constraints (i.e., network bandwidth and device’s energy availability). As a solution, this paper proposes an encoding framework that is capable of generating video bit streams with arbitrary bit rates and decoding-complexity levels using a decoding-complexity–rate–distortion model. The proposed algorithm allocates rate and decoding-complexity levels across frames and coding tree units (CTUs) and adaptively derives the CTU-level coding parameters to achieve their imposed targets with minimal distortion. The experimental results reveal that the proposed algorithm can achieve the target bit rate and the decoding-complexity with 0.4% and 1.78% average errors, respectively, for multiple bit rate and decoding-complexity levels. The proposed algorithm also demonstrates a stable frame-wise rate and decoding-complexity control capability when achieving a decoding-complexity reduction of 10.11 (%/dB). The resultant decoding-complexity reduction translates into an overall energy-consumption reduction of up to 10.52 (%/dB) for a 1 dB peak signal-to-noise ratio (PSNR) quality loss compared to the HM 16.0 encoded bit streams.

    HemoKinect: A Microsoft Kinect V2 Based Exergaming Software to Supervise Physical Exercise of Patients with Hemophilia

    Patients with hemophilia need to strictly follow exercise routines to minimize their risk of bleeding in joints, known as hemarthrosis. This paper introduces and validates a new exergaming software tool called HemoKinect that keeps track of exercises using Microsoft Kinect V2's body tracking capabilities. The software has been developed in C++ and MATLAB. The Kinect SDK V2.0 libraries have been used to obtain 3D joint positions from the Kinect color and depth sensors. By performing angle calculations and center-of-mass (COM) estimations using these joint positions, HemoKinect can evaluate the following exercises: elbow flexion/extension, knee flexion/extension (squat), step climb (ankle exercise) and multi-directional balance based on COM. The software generates reports and progress graphs and is able to send the results directly to the physician via email. Exercises have been validated with 10 controls and eight patients. HemoKinect successfully registered elbow and knee exercises, while displaying real-time joint angle measurements. Additionally, steps were successfully counted in up to 78% of the cases. Regarding balance, differences were found in the scores according to the difficulty level and direction. HemoKinect represents a significant leap forward in terms of exergaming applicability to rehabilitation of patients with hemophilia, allowing remote supervision.
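The angle and center-of-mass calculations mentioned in this abstract can be illustrated with a minimal sketch. It is written in Python rather than the C++ and MATLAB used by the authors, it does not call the Kinect SDK, and the function names, sample coordinates, and segment masses are hypothetical. It only shows the standard geometry: the angle at a joint from three 3D positions, and a mass-weighted average of points.

```python
import math


def joint_angle(a, b, c):
    """Angle in degrees at point b between segments b->a and b->c.

    For an elbow, a, b, c could be shoulder, elbow and wrist positions.
    """
    u = [a[i] - b[i] for i in range(3)]
    v = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("joint positions must be distinct")
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cos_theta = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cos_theta))


def center_of_mass(points, masses):
    """Mass-weighted average of 3D points (masses are assumed inputs)."""
    total = sum(masses)
    return tuple(
        sum(m * p[i] for p, m in zip(points, masses)) / total
        for i in range(3)
    )


# Hypothetical positions in metres: a right angle at the elbow.
elbow = joint_angle((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0))
com = center_of_mass([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)], [1.0, 1.0])
```

A flexion/extension repetition could then be counted by tracking this angle over time against thresholds, though the abstract does not describe how HemoKinect does so.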

    A Cost-Effective Person-Following System for Assistive Unmanned Vehicles with Deep Learning at the Edge

    The vital statistics of the last century highlight a sharp increase in the average age of the world population, with a consequent growth in the number of older people. Service robotics applications have the potential to provide systems and tools that support autonomous, self-sufficient older adults in their homes in everyday life, thereby avoiding the need for third parties to monitor them. In this context, we propose a cost-effective modular solution to detect and follow a person in an indoor, domestic environment. We exploited the latest advancements in deep learning optimization techniques, and we compared different neural network accelerators to provide a robust and flexible person-following system at the edge. Our proposed cost-effective and power-efficient solution is fully integrable with pre-existing navigation stacks and creates the foundations for the development of fully autonomous and self-contained service robotics applications.

    An exploration of the performances achievable by combining unsupervised background subtraction algorithms

    Background subtraction (BGS) is a common choice for performing motion detection in video. Hundreds of BGS algorithms are released every year, but combining them to detect motion remains largely unexplored. We found that combination strategies make it possible to capitalize on this massive number of available BGS algorithms, and offer significant room for performance improvement. In this paper, we explore sets of performances achievable by 6 strategies combining, pixelwise, the outputs of 26 unsupervised BGS algorithms, on the CDnet 2014 dataset, both in the ROC space and in terms of the F1 score. The chosen strategies are representative of a large panel of strategies, including both deterministic and non-deterministic ones, voting and learning. In our experiments, we compare our results with the state-of-the-art combinations IUTIS-5 and CNN-SFC, and report six conclusions, among which is the existence of an important gap between the performances of the individual algorithms and the best performances achievable by combining them. Applications et Recherche pour une Intelligence Artificielle de Confiance (ARIAC
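As a rough illustration of what a pixelwise combination looks like, the sketch below applies simple majority voting to binary foreground masks and scores the result with F1. The abstract does not name its six strategies, so majority voting is an assumed example of a deterministic voting strategy, and the function names and toy masks are hypothetical.

```python
def majority_vote(masks):
    """Combine binary masks pixelwise: foreground if more than half agree.

    masks: list of equally sized 2D lists holding 0 (background) or 1.
    """
    n = len(masks)
    rows, cols = len(masks[0]), len(masks[0][0])
    return [
        [1 if 2 * sum(m[r][c] for m in masks) > n else 0 for c in range(cols)]
        for r in range(rows)
    ]


def f1_score(predicted, truth):
    """F1 = 2*TP / (2*TP + FP + FN), counted over all pixels."""
    tp = fp = fn = 0
    for pred_row, true_row in zip(predicted, truth):
        for p, t in zip(pred_row, true_row):
            if p == 1 and t == 1:
                tp += 1
            elif p == 1 and t == 0:
                fp += 1
            elif p == 0 and t == 1:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy 2x2 outputs from three hypothetical BGS algorithms.
masks = [
    [[1, 0], [1, 1]],
    [[1, 0], [0, 1]],
    [[0, 0], [1, 1]],
]
combined = majority_vote(masks)
score = f1_score(combined, [[1, 0], [0, 1]])
```

Learning-based strategies would replace the fixed vote threshold with a rule fitted to labelled frames, which is outside the scope of this sketch.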