1,291 research outputs found

    Transferring Knowledge from Text to Video: Zero-Shot Anticipation for Procedural Actions

    Can we teach a robot to recognize and make predictions for activities that it has never seen before? We tackle this problem by learning models for video from text. This paper presents a hierarchical model that generalizes instructional knowledge from large-scale text corpora and transfers the knowledge to video. Given a portion of an instructional video, our model recognizes and predicts coherent and plausible actions multiple steps into the future, all in rich natural language. To demonstrate the capabilities of our model, we introduce the Tasty Videos Dataset V2, a collection of 4022 recipes for zero-shot learning, recognition and anticipation. Extensive experiments with various evaluation metrics demonstrate the potential of our method for generalization, given limited video data for training models. Comment: TPAMI 2022. arXiv admin note: text overlap with arXiv:1812.0250

    The Multimodal And Modular Ai Chef: Complex Recipe Generation From Imagery

    The AI community has embraced multi-sensory or multi-modal approaches to bring this generation of AI models closer to the intelligent understanding expected of them. Combining language and imagery is a familiar method for specific tasks like image captioning or generation from descriptions. This paper compares these monolithic approaches to a lightweight and specialized method that employs image models to label objects, then serially submits the resulting object list to a large language model (LLM). This use of multiple Application Programming Interfaces (APIs) enables better than 95% mean average precision for correct object lists, which serve as input to the latest OpenAI text generator (GPT-4). To demonstrate the API as a modular alternative, we solve the problem of a user taking a picture of ingredients available in a refrigerator, then generating novel recipe cards tailored to complex constraints on cost, preparation time, dietary restrictions, portion sizes, and multiple meal plans. The research concludes that monolithic multimodal models currently lack the coherent memory to maintain context and format for this task and that, until recently, language models like GPT-2/3 struggled to format similar problems without degenerating into repetitive or nonsensical combinations of ingredients. For the first time, an AI chef or cook seems not only possible but offers some enhanced capabilities to augment human recipe libraries in pragmatic ways. The work generates a 100-page recipe book featuring the thirty top ingredients using over 2000 refrigerator images as initializing lists.
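
The two-stage flow described in this abstract (an image model produces object labels, which are then passed as text to a language model) can be sketched roughly as below. This is only an illustration under stated assumptions: `RecipeConstraints`, `build_recipe_prompt`, and the prompt wording are hypothetical names invented here, not taken from the paper, and no real vision or LLM API is called.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RecipeConstraints:
    """Hypothetical container for the constraint types named in the abstract."""
    max_cost: Optional[float] = None
    max_minutes: Optional[int] = None
    dietary: List[str] = field(default_factory=list)
    portions: int = 2


def build_recipe_prompt(labels: List[str], c: RecipeConstraints) -> str:
    """Turn detected object labels into a plain-text prompt for a language model."""
    # Normalize labels and drop duplicates while keeping detection order.
    cleaned = (label.strip().lower() for label in labels)
    ingredients = list(dict.fromkeys(name for name in cleaned if name))
    lines = [
        "Ingredients available: " + ", ".join(ingredients),
        f"Portions: {c.portions}",
    ]
    if c.max_cost is not None:
        lines.append(f"Maximum cost: {c.max_cost:.2f}")
    if c.max_minutes is not None:
        lines.append(f"Maximum preparation time: {c.max_minutes} minutes")
    if c.dietary:
        lines.append("Dietary restrictions: " + ", ".join(c.dietary))
    lines.append("Write one recipe card that uses only these ingredients.")
    return "\n".join(lines)


if __name__ == "__main__":
    # The label list stands in for the output of an object-detection model.
    demo = build_recipe_prompt(
        ["Eggs", "milk", "eggs "],
        RecipeConstraints(max_minutes=30, dietary=["vegetarian"]),
    )
    print(demo)
```

Keeping the hand-off as plain text is what makes the stages swappable: either model can be replaced without touching the other, which is the modularity the abstract argues for.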

    Egocentric vision-based passive dietary intake monitoring

    Egocentric (first-person) perception captures and reveals how people perceive their surroundings. This unique perceptual view enables passive and objective monitoring of human-centric activities and behaviours. Egocentric visual data are captured with wearable cameras. Recent advances in wearable technologies have made wearable cameras lightweight and accurate, with long battery life, making long-term passive monitoring a promising solution for healthcare and human behaviour understanding. In addition, recent progress in deep learning has provided an opportunity to accelerate the development of passive methods to enable pervasive and accurate monitoring, as well as comprehensive modelling of human-centric behaviours. This thesis investigates and proposes innovative egocentric technologies for passive dietary intake monitoring and human behaviour analysis. Compared to conventional dietary assessment methods in nutritional epidemiology, such as 24-hour dietary recall (24HR) and food frequency questionnaires (FFQs), which rely heavily on subjects’ memory to recall dietary intake and on trained dietitians to collect, interpret, and analyse the dietary data, passive dietary intake monitoring can ease this burden and provide a more accurate and objective assessment of dietary intake. Egocentric vision-based passive monitoring uses wearable cameras to continuously record human-centric activities with a close-up view. This passive way of monitoring does not require active participation from the subject, and records rich spatiotemporal details for fine-grained analysis. Based on egocentric vision and passive dietary intake monitoring, this thesis proposes: 1) a novel network structure called PAR-Net to achieve accurate food recognition by mining discriminative food regions. 
PAR-Net has been evaluated with food intake images captured by wearable cameras as well as non-egocentric food images to validate its effectiveness for food recognition; 2) a deep learning-based solution for recognising consumed food items as well as counting the number of bites taken by the subjects from egocentric videos in an end-to-end manner; 3) in light of privacy concerns in egocentric data, a privacy-preserving solution for passive dietary intake monitoring, which uses image captioning techniques to summarise the image content and subsequently combines image captioning with 3D container reconstruction to report the actual food volume consumed. Furthermore, a novel framework that integrates food recognition, hand tracking and face recognition has also been developed to tackle the challenge of assessing individual dietary intake in food sharing scenarios with the use of a panoramic camera. Extensive experiments have been conducted. Tested with both laboratory data (captured in London) and field study data (captured in Africa), the proposed solutions have demonstrated the feasibility and accuracy of using egocentric camera technologies with deep learning methods for individual dietary assessment and human behaviour analysis.

    Canalisation and plasticity on the developmental manifold of Caenorhabditis elegans

    How do the same mechanisms that faithfully regenerate complex developmental programs in spite of environmental and genetic perturbations also permit responsiveness to environmental signals, adaptation, and genetic evolution? Using the nematode Caenorhabditis elegans as a model, we explore the phenotypic space of growth and development in various genetic and environmental contexts. Our data are growth curves and developmental parameters obtained by automated microscopy. Using these, we show that among the traits that make up the developmental space, correlations within a particular context are predictive of correlations among different contexts. Further, we find that the developmental variability of this animal can be captured on a relatively low-dimensional phenotypic manifold and that, on this manifold, genetic and environmental contributions to plasticity can be deconvolved independently. Our perspective offers a new way of understanding the relationship between robustness and flexibility in complex systems, suggesting that projection and concentration of dimension can naturally align these forces as complementary rather than competing.

    Information overload in structured data

    Information overload refers to the difficulty of making decisions caused by too much information. In this dissertation, we address the information overload problem in two separate structured domains, namely, graphs and text. Graph kernels have been proposed as an efficient and theoretically sound approach to compute graph similarity. They decompose graphs into certain sub-structures, such as subtrees or subgraphs. However, existing graph kernels suffer from a few drawbacks. First, the dimension of the feature space associated with the kernel often grows exponentially as the complexity of the sub-structures increases. One immediate consequence of this behavior is that small, non-informative sub-structures occur more frequently and cause information overload. Second, as the number of features increases, we encounter sparsity: only a few informative sub-structures will co-occur in multiple graphs. In the first part of this dissertation, we propose to tackle the above problems by exploiting the dependency relationship among sub-structures. First, we propose a novel framework that learns the latent representations of sub-structures by leveraging recent advancements in deep learning. Second, we propose a general smoothing framework that takes structural similarity into account, inspired by state-of-the-art smoothing techniques used in natural language processing. Both proposed frameworks are applicable to popular graph kernel families, and achieve significant performance improvements over state-of-the-art graph kernels. In the second part of this dissertation, we tackle information overload in text. We first focus on a popular social news aggregation website, Reddit, and design a submodular recommender system that tailors a personalized frontpage for individual users. Second, we propose a novel submodular framework to summarize videos, where both transcript and comments are available. 
Third, we demonstrate how to apply filtering techniques to select a small subset of informative features from virtual machine logs in order to predict resource usage.
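
The abstract does not state which submodular objective the recommender or the summarizer optimizes. As a generic illustration of the idea, the sketch below uses topic coverage, a standard monotone submodular function, with the classic greedy selection rule; for such functions under a cardinality constraint, greedy selection is known to achieve at least a (1 - 1/e) fraction of the optimal value. The function name and the toy data are assumptions made for this example.

```python
from typing import Dict, Hashable, List, Set


def greedy_max_coverage(items: Dict[Hashable, Set[str]], k: int) -> List[Hashable]:
    """Pick up to k items that greedily maximize the number of covered topics."""
    covered: Set[str] = set()
    chosen: List[Hashable] = []
    remaining = dict(items)  # shallow copy so the caller's dict is not modified
    for _ in range(k):
        # Marginal gain of an item = topics it adds beyond those already covered.
        best = max(remaining, key=lambda i: len(remaining[i] - covered), default=None)
        if best is None or not (remaining[best] - covered):
            break  # no remaining item adds new coverage
        chosen.append(best)
        covered |= remaining.pop(best)
    return chosen


if __name__ == "__main__":
    # Toy data: each post is tagged with the topics it touches.
    posts = {
        "a": {"ml", "nlp"},
        "b": {"ml"},
        "c": {"graphs", "kernels", "ml"},
        "d": {"nlp"},
    }
    print(greedy_max_coverage(posts, 2))
```

Diminishing returns is what discourages redundancy here: once "ml" is covered, another post about "ml" alone contributes nothing, so a front page or summary built this way tends to stay diverse.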

    Advanced oxidation process models for optimisation and decision making support in water management

    The objective of this thesis is to contribute to the development of a systematic modelling approach for more efficient and sustainable water management. The main aim is to introduce Chemical and Process Systems Engineering methods and tools into the investigation of AOPs (Advanced Oxidation Processes) by proposing process models that can be exploited to progress towards efficient management strategies for practical AOP operation and inclusion in wastewater treatment networks. First, different advanced oxidation processes, namely Fenton, photo-Fenton and VUV photo-oxidation, were investigated and compared for the treatment of paracetamol (PCT) aqueous solutions by evaluating a series of performance indicators. Among the selected AOPs, VUV photo-oxidation and photo-Fenton showed the most promising results. Both processes attained total removal of the target compound and high mineralization levels. The second and main part of the thesis focused on transforming “data into knowledge” by proposing different modelling approaches. The modelling effort focused on the Fenton/photo-Fenton processes, whose operating conditions were found to need improvement. Accordingly, two practical kinetic models for Fenton and photo-Fenton degradation of organic compounds have been proposed and validated: • A conventional First Principles Model, based on a line source radiation model with spherical and isotropic emission, developed for the prediction of Fenton and photo-Fenton degradation of PCT and the oxidant (H2O2) consumption; • A general non-conventional First Principles Model, based on a wide-ranging contaminant degradation mechanism considering a variable number of carbon atoms for the characterization of the intermediates. Both models were experimentally validated and were shown to reproduce the system behavior satisfactorily. 
In particular, the non-conventional First Principles Model proved to be a successful modelling approach for the Fenton/photo-Fenton degradation of different wastewater systems composed of single or multiple organic contaminants by means of lumped parameters (e.g. Total Organic Carbon, TOC). Thus, the approach offers a practical characterization of complex mixtures of chemicals. Once process models are proposed and validated, they can be systematically exploited to determine efficient operation modes or design alternatives. Accordingly, the thesis addressed two cases of practical interest: the optimization of a control recipe and the design of a treatment network. In the first case, a dynamic optimization framework was proposed that takes advantage of the available kinetic models to determine the best hydrogen peroxide dosage profile. Economic and environmental objectives and constraints were included to develop a dynamic optimization problem that was implemented in JModelica and solved using a direct simultaneous optimization method (IPOPT). Finally, the combination of cheaper conventional biological processes with more expensive AOPs was explored. A Mixed-Integer Nonlinear Programming (MINLP) model for the optimization of a general wastewater network was proposed based on a superstructure of alternative designs, which was implemented and solved in GAMS. The novel formulation includes the BOD5/COD ratio method, which describes the removal efficiency of BOD5 and COD of a treatment in order to model the variation of the biodegradability of the influents. This novel formulation allows determining the extent of the AOP treatments when combined with biological treatments, and paves the way for more complex models aimed at solving the trade-off between cost and treatment efficiency.
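
The BOD5/COD ratio idea mentioned in this abstract can be illustrated with a very small mass-balance sketch. It is not the thesis's MINLP formulation: the function name, the assumption that each treatment removes a fixed fraction of BOD5 and of COD, and the numbers in the usage example are all illustrative assumptions.

```python
from typing import Tuple


def effluent_biodegradability(
    bod5_in: float,
    cod_in: float,
    bod5_removal: float,
    cod_removal: float,
) -> Tuple[float, float, float]:
    """Return (BOD5, COD, BOD5/COD) leaving a treatment unit.

    Assumes the unit removes a fixed fraction of each lumped parameter,
    with removal efficiencies given as fractions in [0, 1].
    """
    if not (0.0 <= bod5_removal <= 1.0 and 0.0 <= cod_removal <= 1.0):
        raise ValueError("removal efficiencies must lie in [0, 1]")
    bod5_out = bod5_in * (1.0 - bod5_removal)
    cod_out = cod_in * (1.0 - cod_removal)
    if cod_out <= 0.0:
        raise ValueError("effluent COD must be positive to form the ratio")
    return bod5_out, cod_out, bod5_out / cod_out


if __name__ == "__main__":
    # Illustrative influent: BOD5 = 100 mg/L, COD = 500 mg/L (ratio 0.2).
    # A hypothetical oxidation step removing half the COD and no BOD5
    # raises the ratio, i.e. the stream becomes relatively more biodegradable.
    print(effluent_biodegradability(100.0, 500.0, 0.0, 0.5))
```

Tracking the ratio after each unit is one simple way to decide how far an expensive oxidation step has to go before a cheaper biological step can take over, which is the trade-off the abstract describes.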