
Ontology-Driven Food Category Classification in Images

Author: F Kong
G Ciocca
J Dehais
L Bossard
M Dragoni
P Pouladzadeh
R Maimone
S Mezgec
W Zhang

Self-managing chronic diseases related to dietary habits requires tracking what people eat. Most approaches proposed in the literature classify food pictures with labels describing the whole recipe. The main drawback of this strategy is that a wrong prediction of the recipe leads to a wrong prediction of every ingredient of that recipe. In this paper we present a multi-label food classification approach, exploiting deep neural networks, where each food picture is classified with labels describing the food categories of the ingredients in each recipe. The aim of our approach is to support the detection of food categories in order to identify which ones might be dangerous for a user affected by a chronic disease. Our approach relies on background knowledge in which recipes, food categories, and their relatedness to chronic diseases are modeled within a state-of-the-art ontology. Experiments conducted on a new publicly released dataset demonstrated the effectiveness of the proposed approach with respect to state-of-the-art classification strategies.
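The multi-label idea in this abstract can be illustrated with a minimal sketch. It assumes a network that emits one independent score (logit) per food category and a plain dictionary standing in for the ontology; the category names, logits, threshold, and disease mapping below are all hypothetical and do not come from the paper.

```python
import math

# Hypothetical food categories; the paper's actual label set is not given here.
CATEGORIES = ["dairy", "red_meat", "vegetables", "sweets"]

# Hypothetical stand-in for the ontology: disease -> food categories to flag.
RISKY = {"diabetes": {"sweets"}, "hypertension": {"red_meat"}}


def sigmoid(x: float) -> float:
    """Map a raw score to a probability-like value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_categories(logits, threshold=0.5):
    """Multi-label decision: keep every category whose own score reaches the threshold."""
    return {c for c, z in zip(CATEGORIES, logits) if sigmoid(z) >= threshold}


def flag_for_user(predicted, disease):
    """Intersect the predicted categories with those marked risky for the disease."""
    return predicted & RISKY.get(disease, set())


# Assumed logits for a single picture (illustrative values only).
predicted = predict_categories([2.0, -1.5, 0.3, 1.1])
flagged = flag_for_user(predicted, "diabetes")
print(sorted(predicted), sorted(flagged))
```

Because each category is decided independently, one wrong score affects only that category, in contrast to a single recipe label, where one mistake propagates to every ingredient.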

Crossref

Archivio della ricerca - Fondazione Bruno Kessler

Joint Video and Text Parsing for Understanding Events and Answering Queries

Author: Choe Tae Eun
Lee Mun Wai
Meng Meng
Tu Kewei
Zhu Song-Chun
Publication date: 21/02/2014

We propose a framework for parsing video and text jointly for understanding events and answering user queries. Our framework produces a parse graph that represents the compositional structures of spatial information (objects and scenes), temporal information (actions and events) and causal information (causalities between events and fluents) in the video and text. The knowledge representation of our framework is based on a spatial-temporal-causal And-Or graph (S/T/C-AOG), which jointly models possible hierarchical compositions of objects, scenes and events as well as their interactions and mutual contexts, and specifies the prior probabilistic distribution of the parse graphs. We present a probabilistic generative model for joint parsing that captures the relations between the input video/text, their corresponding parse graphs and the joint parse graph. Based on the probabilistic model, we propose a joint parsing system consisting of three modules: video parsing, text parsing and joint inference. Video parsing and text parsing produce two parse graphs from the input video and text respectively. The joint inference module produces a joint parse graph by performing matching, deduction and revision on the video and text parse graphs. The proposed framework has the following objectives: first, we aim at deep semantic parsing of video and text that goes beyond the traditional bag-of-words approaches; second, we perform parsing and reasoning across the spatial, temporal and causal dimensions based on the joint S/T/C-AOG representation; third, we show that deep joint parsing facilitates subsequent applications such as generating narrative text descriptions and answering queries in the forms of who, what, when, where and why. We empirically evaluated our system by comparison against ground truth and by the accuracy of query answering, and obtained satisfactory results.

arXiv.org e-Print Archive

CiteSeerX

Place Categorization and Semantic Mapping on a Mobile Robot

Author: Corke Peter
Dayoub Feras
McMahon Sean
Milford Michael
Schulz Ruth
Sünderhauf Niko
Talbot Ben
Upcroft Ben
Wyeth Gordon
Publication date: 09/07/2015

In this paper we focus on the challenging problem of place categorization and semantic mapping on a robot without environment-specific training. Motivated by the ongoing success of convolutional networks in various visual recognition tasks, we build our system upon a state-of-the-art convolutional network. We overcome its closed-set limitations by complementing the network with a series of one-vs-all classifiers that can learn to recognize new semantic classes online. Prior domain knowledge is incorporated by embedding the classification system into a Bayesian filter framework that also ensures temporal coherence. We evaluate the classification accuracy of the system on a robot that maps a variety of places on our campus in real time. We show how semantic information can boost robotic object detection performance and how the semantic map can be used to modulate the robot's behaviour during navigation tasks. The system is made available to the community as a ROS module.
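To illustrate how a Bayesian filter can enforce temporal coherence over per-frame classifier outputs, here is a minimal sketch of a discrete Bayes filter. It is not the system described in the abstract: the place names, the "stay" probability used as the transition prior, and the per-frame likelihoods are all assumed values chosen for illustration.

```python
# Hypothetical place categories; not the label set used in the paper.
PLACES = ["office", "corridor", "kitchen"]

# Assumed transition prior: the robot most likely remains in the same place type.
STAY = 0.8


def predict(belief):
    """Prediction step: spread belief according to the assumed transition prior."""
    n = len(belief)
    move = (1.0 - STAY) / (n - 1)
    total = sum(belief)
    return [STAY * b + move * (total - b) for b in belief]


def update(belief, likelihood):
    """Measurement step: weight belief by the classifier scores and renormalize."""
    unnorm = [b * l for b, l in zip(belief, likelihood)]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def bayes_filter(belief, observations):
    """Run predict/update once per frame and return the final belief."""
    for likelihood in observations:
        belief = update(predict(belief), likelihood)
    return belief


# Assumed per-frame classifier scores; the third frame is a noisy outlier.
frames = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1], [0.2, 0.7, 0.1], [0.7, 0.2, 0.1]]
final = bayes_filter([1 / 3, 1 / 3, 1 / 3], frames)
print(PLACES[final.index(max(final))])
```

The transition prior is where domain knowledge would enter in this sketch: a larger `STAY` value makes the belief harder to shift on a single outlier frame, at the cost of reacting more slowly to genuine place changes.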

arXiv.org e-Print Archive

Crossref

Queensland University of Technology ePrints Archive

Towards Bottom-Up Analysis of Social Food

Author: Alwy Fadhlun
Atuhairwe Susan
Hanson Claudia
HMS BAB study team
Kaharuza Frank
Leshabari Sebalda
Marrone Gaetano
Morris Jessica
Pembe Andrea B
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2016

in ACM Digital Health Conference 201

Crossref

LSHTM Research Online

Directory of Open Access Journals

Edinburgh Research Explorer

Queen Mary Research Online

FigShare