10 research outputs found

    Variational recurrent sequence-to-sequence retrieval for stepwise illustration

    We address and formalise the task of sequence-to-sequence (seq2seq) cross-modal retrieval. Given a sequence of text passages as query, the goal is to retrieve a sequence of images that best describes and aligns with the query. This new task extends traditional cross-modal retrieval, where each image-text pair is treated independently, ignoring broader context. We propose a novel variational recurrent seq2seq (VRSS) retrieval model for this seq2seq task. Unlike most cross-modal methods, we generate an image vector corresponding to the latent topic obtained from combining the text semantics and context. This synthetic image embedding point associated with every text embedding point can then be employed for either image generation or image retrieval as desired. We evaluate the model for the application of stepwise illustration of recipes, where a sequence of relevant images is retrieved to best match the steps described in the text. To this end, we build and release a new Stepwise Recipe dataset for research purposes, containing 10K recipes (sequences of image-text pairs) having a total of 67K image-text pairs. To our knowledge, it is the first publicly available dataset to offer rich semantic descriptions in a focused category such as food or recipes. Our model is shown to outperform several competitive and relevant baselines in the experiments. We also provide qualitative analysis of how semantically meaningful the results produced by our model are through human evaluation and comparison with relevant existing methods.

    Experimental Investigations on the Instantaneous Flow Structure in Circulating Fluidized Beds

    Knowing the instantaneous flow structure is of great importance for the understanding of gas-particle interaction and for the prediction of reactor performances. In this thesis, a systematic and comprehensive study has been conducted on the instantaneous flow structure in a narrow rectangular riser (19 mm in thickness, 114 mm in width and 7.6 m in height), in a cylindrical riser (76 mm in diameter, 10.2 m in height) and in a cylindrical downer (50 mm in diameter, 4.9 m in height). A wider range of operating conditions has been achieved in risers and downer with superficial gas velocity from 3.0 to 9.0 m/s and solids circulation rate from 50 to 700 . With high-speed imaging and optical fiber sensing, it has been found that there are crest clusters, coalesced particles, trough clusters and dispersed particles in CFBs. Crest clusters are surrounded by a cloud of coalesced particles, while trough clusters are immersed in dispersed particles. Then, instantaneous flow dynamics are computed with the tracking of image blocks. The existence of aggregations influences both the particle velocity and the solids flux. After that, a physically meaningful threshold was proposed to characterize crest cluster and trough clusters in terms of solids holdup, size and shape. As the optical fiber probe can be used in high-density conditions, a discrimination method was also proposed for probe signals using wavelet transform. A thorough characterization of crest clusters, coalesced particles, trough clusters and dispersed particles is conducted in the rectangular CFB and their properties are consistent with those originated from the images. Moreover, how particle properties influence phase characteristics was also investigated by comparing phase information formed by FCC and glass beads. Finally, probe signals captured in the cylindrical riser and cylindrical downer are processed to investigate the phase properties in high-density conditions. 
Phases, including crest clusters, coalesced particles, trough clusters and dispersed particles, are characterized over the riser and downer in terms of length, frequency and time fraction. With the systematic comparison of flow information between the CFB riser and downer, it has been found that aggregation in the CFB downer is less severe than that in the CFB riser

    Beyond picturing

    Beyond Picturing is practice-led research aimed at determining whether horizontality can be deemed a medium in its own right, and further, whether it can establish a new set of conventions, enabling a cross-cultural dialogue between peoples of our region, particularly Aboriginal people and Maori and those of European heritage. I chart the course of horizontality across the art of the 20th century, identifying it as a medium for practice. My thesis examines examples in which horizontality as a methodology was a vehicle for meaning, based on the theories of structural linguistics and phenomenology. Furthermore, by acknowledging the axial shift, from the horizontal plane of process to the vertical plane of image, I discover a shared ground for cultural dialogue with painters of the central desert and the Kimberley.

    The few touch digital diabetes diary : user-involved design of mobile self-help tools for people with diabetes

    Papers 2, 4, 5 and 7 are not available in Munin due to publishers' restrictions:
    2. Årsand E, and Demiris G.: "User-Centered Methods for Designing Patient-Centric Self-Help Tools", Informatics for Health and Social Care, 2008 Vol. 33, No. 3, Pages 158-169 (Informa Healthcare). Available at http://dx.doi.org/10.1080/17538150802457562
    4. Årsand E, Olsen OA, Varmedal R, Mortensen W, and Hartvigsen G.: "A System for Monitoring Physical Activity Data Among People with Type 2 Diabetes", pages 173-178 in S.K. Andersen, et.al. (eds.) "eHealth Beyond the Horizon - Get IT There", Proceedings of MIE2008, Studies in Health Technology and Informatics, Volume 136, May 2008, ISBN: 978-1-58603-864-9
    5. Årsand E, Tufano JT, Ralston J, and Hjortdahl P.: "Designing Mobile Dietary Management Support Technologies for People with Diabetes", Journal of Telemedicine and Telecare, 2008 Volume 14, Number 7, Pp. 329-332 (Royal Society of Medicine Press). Available at http://dx.doi.org/10.1258/jtt.2008.007001
    7. Årsand E, Walseth OA, Andersson N, Fernando R, Granberg O, Bellika JG, and Hartvigsen G.: "Using Blood Glucose Data as an Indicator for Epidemic Disease Outbreaks", pages 199-204 in R. Engelbrecht et.al. (eds.): "Connecting Medical Informatics and Bio-Informatics", Proceedings of MIE2005, Studies in Health Technology and Informatics, Volume 116, August 2005, ISBN: 978-1-58603-549-5.
    Paradoxically, the technological revolution that has created a vast health problem due to a drastic change in lifestyle also holds great potential for individuals to take better care of their own health. The first consequence is not addressed in this dissertation, but the second represents the focus of the work presented, namely utilizing ICT to support self-management of individual health challenges. 
As long as only 35% of the patients in Norway achieve the International Diabetes Federation's goal for blood glucose (HbA1c), actions and activities to improve blood glucose control and related factors are needed. The presented work focuses on the development and integration of alternative sensor systems for blood glucose and physical activity, and a fast and effortless method for recording food habits. Various user-interface concepts running on a mobile terminal constitute a digital diabetes diary, and the total concept is referred to as the “Few Touch application”. The overall aim of this PhD project is to generate knowledge about how a mobile tool can be designed for supporting lifestyle changes among people with diabetes. Applying technologies and methods from the informatics field has contributed to improved insight into this issue. Conversely, addressing the concrete use cases for people with diabetes has resulted in the achievement of ICT designs that have been appreciated by the cohorts involved. Cooperation with three different groups of patients with diabetes over several years and various methods and theories founded in computer science, medical informatics, and telemedicine have been combined in design and research on patient-oriented aids. The blood glucose Bluetooth adapter, the step counter, and the nutrition habit registration system that have been developed were all novel and to my knowledge unique designs at the time they were first tested, and this still applies to the latter two. Whether it can be claimed that the total concept presented, the Few Touch application, will increase quality of life, is up to future research and large-scale tests of the system to answer. However, results from the Type 2 diabetes half-year study showed that several of the participants did adjust their medication, food habits and/or physical activity due to use of the application.

    Neural models for stepwise text illustration

    In this thesis, we investigate the task of sequence-to-sequence (seq2seq) retrieval: given a sequence (of text passages) as the query, retrieve a sequence (of images) that best describes and aligns with the query. This is a step beyond traditional cross-modal retrieval, which treats each image-text pair independently and ignores broader context. Since this is a difficult task, we break it into steps. We start with caption generation for images in news articles. Unlike the traditional image captioning task, where a text description is generated given an image, here a caption is generated conditional on both the image and the news article in which it appears. We propose a novel neural-network-based methodology that takes into account both news article content and image semantics to generate a caption best describing the image and its surrounding text context. Our results outperform existing approaches to image caption generation. We then introduce two new datasets, GutenStories and Stepwise Recipe, for the task of story picturing and sequential text illustration. GutenStories consists of around 90k text paragraphs, each accompanied by an image, aligned in around 18k visual stories. It covers a wide variety of images and story content styles. Stepwise Recipe is a similar dataset of sequenced image-text pairs, but with only domain-constrained images, namely food-related ones. It consists of 67k text paragraphs (cooking instructions), each accompanied by an image describing the step, aligned in 10k recipes. Both datasets are web-scrawled and systematically filtered and cleaned. We propose a novel variational recurrent seq2seq (VRSS) retrieval model. The model encodes two streams of information at every step: the contextual information from both text and images retrieved in previous steps, and the semantic meaning of the current input (text) as a latent vector. 
These together guide the retrieval of a relevant image from the repository to match the semantics of the given text. The model has been evaluated on both the Stepwise Recipe and GutenStories datasets. The results on several automatic evaluation measures show that our model outperforms several competitive and relevant baselines. We also qualitatively analyse the model both using human evaluation and by visualizing the representation space to judge the semantical meaningfulness. We further discuss the challenges faced on the more difficult GutenStories and outline possible solutions

    We Are What We See? – Aggression and Neurological Activation Towards Affective Imagery

    Violent and erotic media has been suggested to have a long-lasting negative effect on both the brain and behaviour (e.g. Anderson & Bushman, 2001; Grimes, Anderson & Bergen, 2008) and has been linked with increased aggression (Anderson & Bushman, 2001, 2002; Bartholow, Bushman, & Sestir, 2006; Engelhardt, Bartholow, & Saults, 2011; Greitemeyer, 2018). This thesis is the first comprehensive investigation into the effects of aggression and visual media content on early neurological response. Despite adopting gold-standard measures of aggression and contemporary EEG methodology, there was no evidence to support claims of a negative effect across a range of visual stimuli with differing content. However, participant sex was identified as a key defining factor in electrocortical response towards all stimuli categories. In general, females tended to respond with an early negativity bias and an increased overall response in comparison to males. This was especially found where the content was related to biological drives. Support was found for research and theory proposing that attention is motivated towards evolutionarily salient stimuli (e.g. Gur et al, 2002; Kim et al. 2013; Schupp, Junghofer, Weike and Hamm, 2003; Weinberg and Hajak, 2010; Wheaton et al, 2013), and preferred media content (Boheart, 2001; Nordstrom and Wiens, 2012). A variety of measures of aggression have been employed within the field, with inconsistencies across procedure, analysis method and reporting that have impacted objectivity and the validity of findings. Four methods of data processing were employed in order to analyze scores on trait aggression scales. Results showed that trait aggression appeared to modulate ERP response towards affective imagery. However, this finding was sex specific (for males only) and was dependent on the data processing method employed, and was thus inconsistent. 
This identified that minor modifications to simple data processing techniques have major implications on results and meaning. These findings have clearly demonstrated the need for standardization of methods and analysis across processes, measurement tools and techniques. Additional investigation found that there were numerous elements of stimuli content and context that influenced response. This included neutral stimuli. Taken together, these findings have made a clear case for the requirement of a valid stimuli collection that encompasses a stringent classification of appropriate content that can be widely adopted across research within multiple disciplines

    Objects extraction and recognition for camera-based interaction : heuristic and statistical approaches

    In this thesis, heuristic and probabilistic methods are applied to a number of problems for camera-based interactions. The goal is to provide solutions for a vision-based system that is able to extract and analyze objects of interest in camera images and to use that information for various interactions for mobile usage. New methods and new combinations of existing methods are developed for different applications, including text extraction from complex scene images, bar code reading performed by camera phones, and face/facial feature detection and facial expression manipulation. The application-driven problems of camera-based interaction cannot be modeled by a uniform and straightforward model that has very strong simplifications of reality. The solutions we found to be efficient were to first apply heuristic but easy-to-implement approaches to reduce the complexity of the problems and search for possible means, then use developed statistical learning approaches to deal with the remaining difficult but well-defined problems and get much better accuracy. The process can be evolved in some or all of the stages, and the combination of the approaches is problem-dependent. The contribution of this thesis resides in two aspects: firstly, new features and approaches are proposed either as heuristics or statistical means for concrete applications; secondly, engineering design combining several methods for system optimization is studied. Geometrical characteristics and the alignment of text, texture features of bar codes, and structures of faces can all be extracted as heuristics for object extraction and further recognition. The boosting algorithm is one of the proper choices to perform probabilistic learning and to achieve desired accuracy. New feature selection techniques are proposed for constructing the weak learner and applying the boosting output in concrete applications. 
Subspace methods such as manifold learning algorithms are introduced and tailored for facial expression analysis and synthesis. A modified generalized learning vector quantization method is proposed to deal with the blurring of bar code images. Efficient implementations that combine the approaches in a rational joint point are presented and the results are illustrated.

    Incorporating declared capacity uncertainty in optimizing airport slot allocation

    Slot allocation is the mechanism used to allocate capacity at congested airports. A number of models have been introduced in the literature aiming to produce airport schedules that optimize the allocation of slot requests to the available airport capacity. A critical parameter affecting the outcome of the slot allocation process is the airport’s declared capacity. Existing airport slot allocation models treat declared capacity as an exogenously defined deterministic parameter. In this presentation we propose a new robust optimization formulation based on the concept of stability radius. The proposed formulation considers endogenously the airport’s declared capacity and expresses it as a function of its throughput. We present results from the application of the proposed approach to a congested airport and we discuss the trade-off between the declared capacity of the airport and the efficiency of the slot allocation process