6 research outputs found

    CuisineNet: Food Attributes Classification using Multi-scale Convolution Network

    Diversity of food and its attributes reflects the culinary habits of peoples from different countries. This paper therefore addresses the problem of identifying the food culture of people around the world, and its flavor, by classifying two main food attributes: cuisine and flavor. A deep learning model based on multi-scale convolutional networks is proposed for extracting more accurate features from input images. An aggregation of multi-scale convolution layers with different kernel sizes is also used to weight the features resulting from the different scales. In addition, a joint loss function based on the Negative Log Likelihood (NLL) is used to fit the model's probabilities to the multi-labeled classes in this multi-modal classification task. Furthermore, this work provides a new dataset for food attributes, called Yummly48K, extracted from the popular food website Yummly. Our model is assessed on the constructed Yummly48K dataset. The experimental results show that our proposed method yields 65% and 62% average F1 scores on the validation and test sets, outperforming the state-of-the-art models.
    Comment: 8 pages, submitted to CCIA 2018
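    As a rough illustration of the two mechanisms this abstract describes (aggregation of parallel convolutions with different kernel sizes, and a joint NLL loss over two attribute heads), here is a minimal PyTorch sketch. All layer sizes, kernel sizes, class counts, and names are assumptions for illustration, not the authors' actual architecture.

        # Hypothetical sketch, not the paper's exact model.
        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class MultiScaleBlock(nn.Module):
            """Parallel convolutions with different kernel sizes whose
            outputs are aggregated with learned per-scale weights."""
            def __init__(self, in_ch, out_ch, kernel_sizes=(3, 5, 7)):
                super().__init__()
                self.branches = nn.ModuleList(
                    [nn.Conv2d(in_ch, out_ch, k, padding=k // 2) for k in kernel_sizes]
                )
                # One scalar weight per scale, softmax-normalized before aggregation.
                self.scale_weights = nn.Parameter(torch.zeros(len(kernel_sizes)))

            def forward(self, x):
                w = torch.softmax(self.scale_weights, dim=0)
                return sum(wi * F.relu(b(x)) for wi, b in zip(w, self.branches))

        class CuisineFlavorNet(nn.Module):
            """Shared multi-scale trunk with one head per food attribute.
            Class counts are placeholders, not taken from the paper."""
            def __init__(self, n_cuisines=10, n_flavors=5):
                super().__init__()
                self.features = nn.Sequential(
                    MultiScaleBlock(3, 32), nn.MaxPool2d(2),
                    MultiScaleBlock(32, 64), nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                )
                self.cuisine_head = nn.Linear(64, n_cuisines)
                self.flavor_head = nn.Linear(64, n_flavors)

            def forward(self, x):
                h = self.features(x)
                return self.cuisine_head(h), self.flavor_head(h)

        def joint_nll_loss(cuisine_logits, flavor_logits, cuisine_y, flavor_y):
            # Joint loss: sum of the per-attribute negative log likelihoods.
            return (F.nll_loss(F.log_softmax(cuisine_logits, dim=1), cuisine_y)
                    + F.nll_loss(F.log_softmax(flavor_logits, dim=1), flavor_y))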

    Retinal Optic Disc Segmentation using Conditional Generative Adversarial Network

    This paper proposes a retinal image segmentation method based on a conditional Generative Adversarial Network (cGAN) to segment the optic disc. The proposed model consists of two successive networks: a generator and a discriminator. The generator learns to map the observed input (i.e., a retinal fundus color image) to the output (i.e., a binary mask). The discriminator then acts as a learned loss function for this mapping, comparing the ground truth and the predicted output while observing the input image as a condition. Experiments were performed on two publicly available datasets: DRISHTI GS1 and RIM-ONE. The proposed model outperformed state-of-the-art methods, achieving Jaccard and Dice coefficients of around 0.96 and 0.98, respectively. Moreover, image segmentation is performed in less than a second on a recent GPU.
    Comment: 8 pages, submitted to the 21st International Conference of the Catalan Association for Artificial Intelligence (CCIA 2018)
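    The generator/discriminator interplay the abstract outlines can be made concrete with a short training-step sketch in PyTorch. Here G and D are placeholder networks, D is assumed to output logits for concatenated (image, mask) pairs, and the L1 regression term is a common pix2pix-style addition that the abstract does not specify; none of this is the paper's exact recipe.

        # Hedged sketch of one cGAN training step for segmentation.
        import torch
        import torch.nn.functional as F

        def cgan_step(G, D, opt_g, opt_d, image, true_mask, l1_weight=100.0):
            """D scores (image, mask) pairs conditioned on the input image;
            G tries to fool D while staying close to the ground-truth mask."""
            fake_mask = G(image)

            # Discriminator: push real pairs toward 1 and fake pairs toward 0.
            d_real = D(torch.cat([image, true_mask], dim=1))
            d_fake = D(torch.cat([image, fake_mask.detach()], dim=1))
            loss_d = (
                F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real))
                + F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake))
            )
            opt_d.zero_grad(); loss_d.backward(); opt_d.step()

            # Generator: adversarial term plus an (assumed) L1 term to the mask.
            d_fake = D(torch.cat([image, fake_mask], dim=1))
            loss_g = (
                F.binary_cross_entropy_with_logits(d_fake, torch.ones_like(d_fake))
                + l1_weight * F.l1_loss(fake_mask, true_mask)
            )
            opt_g.zero_grad(); loss_g.backward(); opt_g.step()
            return loss_d.item(), loss_g.item()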

    Learning Brightness Transfer Functions for the Joint Recovery of Illumination Changes and Optical Flow

    The increasing importance of outdoor applications such as driver assistance systems or video surveillance tasks has recently triggered the development of optical flow methods that aim to perform robustly under uncontrolled illumination. Most of these methods are based on patch-based features such as the normalized cross correlation, the census transform, or the rank transform. They achieve their robustness by locally discarding both absolute brightness and contrast. In this paper, we follow an alternative strategy: instead of discarding potentially important image information, we propose a novel variational model that jointly estimates both illumination changes and optical flow. The key idea is to parametrize the illumination changes in terms of basis functions that are learned from training data. While such basis functions allow for a meaningful representation of illumination effects, they also help to distinguish real illumination changes from motion-induced brightness variations when supplemented by additional smoothness constraints. Experiments on the KITTI benchmark show the clear benefits of our approach. Not only do they demonstrate that it is possible to obtain meaningful basis functions; they also show state-of-the-art results for robust optical flow estimation.
    Document type: Part of book or chapter of book
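    The core idea, parametrizing brightness transfer functions (BTFs) as a mean plus a learned linear basis, can be sketched in NumPy as below. Learning the basis by PCA over sampled example BTFs is an assumption for illustration; in the paper the coefficients are estimated jointly with the flow inside a variational model, which this sketch does not attempt.

        # Illustrative sketch: learn a BTF basis and apply a BTF to an image.
        import numpy as np

        def learn_btf_basis(training_btfs, n_basis=4):
            """training_btfs: (n_samples, n_gray_levels) array of sampled
            brightness transfer functions. Returns the mean BTF and the top
            principal components as basis functions."""
            mean = training_btfs.mean(axis=0)
            _, _, vt = np.linalg.svd(training_btfs - mean, full_matrices=False)
            return mean, vt[:n_basis]

        def apply_btf(image, mean, basis, coeffs, n_gray_levels=256):
            """Map brightness values through mean + sum_k coeffs[k] * basis[k]."""
            btf = mean + coeffs @ basis                    # (n_gray_levels,)
            levels = np.clip(image, 0, n_gray_levels - 1).astype(int)
            return btf[levels]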