Variational recurrent sequence-to-sequence retrieval for stepwise illustration
We address and formalise the task of sequence-to-sequence (seq2seq) cross-modal retrieval. Given a sequence of text passages as a query, the goal is to retrieve a sequence of images that best describes and aligns with the query. This new task extends traditional cross-modal retrieval, where each image-text pair is treated independently, ignoring broader context. We propose a novel variational recurrent seq2seq (VRSS) retrieval model for this seq2seq task. Unlike most cross-modal methods, we generate an image vector corresponding to the latent topic obtained from combining the text semantics and context. This synthetic image embedding point associated with every text embedding point can then be employed for either image generation or image retrieval as desired. We evaluate the model for the application of stepwise illustration of recipes, where a sequence of relevant images is retrieved to best match the steps described in the text. To this end, we build and release a new Stepwise Recipe dataset for research purposes, containing 10K recipes (sequences of image-text pairs) with a total of 67K image-text pairs. To our knowledge, it is the first publicly available dataset to offer rich semantic descriptions in a focused category such as food or recipes. Our model is shown to outperform several competitive and relevant baselines in the experiments. We also provide a qualitative analysis of how semantically meaningful the results produced by our model are, through human evaluation and comparison with relevant existing methods.
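The retrieval step mentioned in this abstract (using a generated image vector per text step to look up images) can be illustrated with a minimal sketch. This is not the VRSS model: the embedding vectors, file names, and the choice of cosine similarity are all assumptions made for illustration, and the model that would produce the per-step vectors is omitted entirely.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def retrieve_sequence(predicted_vectors, gallery):
    """For each step's predicted image vector, return the closest gallery image."""
    return [
        max(gallery, key=lambda name: cosine(vec, gallery[name]))
        for vec in predicted_vectors
    ]

# Hypothetical gallery of image embeddings (names and values are made up).
gallery = {
    "chop.jpg": [1.0, 0.0, 0.1],
    "boil.jpg": [0.0, 1.0, 0.2],
    "serve.jpg": [0.1, 0.1, 1.0],
}

# Hypothetical per-step image vectors, as a seq2seq model might emit them.
predicted = [[0.9, 0.1, 0.0], [0.1, 0.8, 0.3], [0.0, 0.2, 0.9]]

print(retrieve_sequence(predicted, gallery))
```

Each step is matched independently here; the sequence-level context the abstract emphasises would live in how the predicted vectors are produced, not in this lookup.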
Hierarchical Attention Network for Visually-aware Food Recommendation
Food recommender systems play an important role in assisting users to
identify the desired food to eat. Deciding what food to eat is a complex and
multi-faceted process, which is influenced by many factors such as the
ingredients, appearance of the recipe, the user's personal preference on food,
and various contexts like what had been eaten in the past meals. In this work,
we formulate the food recommendation problem as predicting user preference on
recipes based on three key factors that determine a user's choice on food,
namely, 1) the user's (and other users') history; 2) the ingredients of a
recipe; and 3) the descriptive image of a recipe. To address this challenging
problem, we develop a dedicated neural network based solution Hierarchical
Attention based Food Recommendation (HAFR) which is capable of: 1) capturing
the collaborative filtering effect like what similar users tend to eat; 2)
inferring a user's preference at the ingredient level; and 3) learning user
preference from the recipe's visual images. To evaluate our proposed method, we
construct a large-scale dataset consisting of millions of ratings from
AllRecipes.com. Extensive experiments show that our method outperforms several
competing recommender solutions like Factorization Machine and Visual Bayesian
Personalized Ranking with an average improvement of 12%, offering promising
results in predicting user preference for food. Code and dataset will be
released upon acceptance.
Automated data reduction workflows for astronomy
Data from complex modern astronomical instruments often consist of a large
number of different science and calibration files, and their reduction requires
a variety of software tools. The execution chain of the tools represents a
complex workflow that needs to be tuned and supervised, often by individual
researchers who are not necessarily experts on any specific instrument. The
efficiency of data reduction can be improved by using automatic workflows to
organise data and execute the sequence of data reduction steps. To realize such
efficiency gains, we designed a system that allows intuitive representation,
execution and modification of the data reduction workflow, and has facilities
for inspection and interaction with the data. The European Southern Observatory
(ESO) has developed Reflex, an environment to automate data reduction
workflows. Reflex is implemented as a package of customized components for the
Kepler workflow engine. Kepler provides the graphical user interface to create
an executable flowchart-like representation of the data reduction process. Key
features of Reflex are a rule-based data organiser, infrastructure to re-use
results, thorough book-keeping, data progeny tracking, interactive user
interfaces, and a novel concept to exploit information created during data
organisation for the workflow execution. Reflex includes novel concepts to
increase the efficiency of astronomical data processing. While Reflex is a
specific implementation of astronomical scientific workflows within the Kepler
workflow engine, the overall design choices and methods can also be applied to
other environments for running automated science workflows.
Comment: 12 pages, 7 figures
Food Ingredients Recognition through Multi-label Learning
Automatically constructing a food diary that tracks the ingredients consumed
can help people follow a healthy diet. We tackle the problem of food
ingredients recognition as a multi-label learning problem. We propose a method
for adapting a high-performing, state-of-the-art CNN in order to act as a
multi-label predictor for learning recipes in terms of their list of
ingredients. We prove that our model is able to, given a picture, predict its
list of ingredients, even if the recipe corresponding to the picture has never
been seen by the model. We make public two new datasets suitable for this
purpose. Furthermore, we prove that a model trained with a high variability of
recipes and ingredients is able to generalize better on new data, and visualize
how it specializes each of its neurons to different ingredients.
Comment: 8 pages
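The multi-label formulation described in this abstract can be sketched in a few lines. This is a generic illustration, not the authors' network: it assumes one independent sigmoid output per ingredient, a binary cross-entropy loss, and a 0.5 decision threshold, and the ingredient names and logit values are invented.

```python
import math

def sigmoid(z):
    """Map a raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def multilabel_bce(logits, targets):
    """Mean binary cross-entropy over independent labels (targets are 0 or 1)."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for z, t in zip(logits, targets):
        p = sigmoid(z)
        total += -(t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps))
    return total / len(logits)

def predict_ingredients(logits, labels, threshold=0.5):
    """Return every label whose probability reaches the threshold."""
    return [name for z, name in zip(logits, labels) if sigmoid(z) >= threshold]

# Hypothetical label set and per-label scores, as a CNN head might output them.
labels = ["tomato", "basil", "chicken", "rice"]
logits = [2.0, 0.5, -3.0, -0.2]

print(predict_ingredients(logits, labels))
print(multilabel_bce(logits, [1, 1, 0, 0]))
```

Unlike a softmax classifier, each label is scored on its own, so a picture can be assigned any number of ingredients, including none.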