
    Impatient DNNs - Deep Neural Networks with Dynamic Time Budgets

    We propose Impatient Deep Neural Networks (DNNs) which deal with dynamic time budgets during application. They allow for individual budgets given a priori for each test example and for anytime prediction, i.e., a possible interruption at multiple stages during inference while still providing output estimates. Our approach can therefore tackle the computational costs and energy demands of DNNs in an adaptive manner, a property essential for real-time applications. Our Impatient DNNs are based on a new general framework of learning dynamic budget predictors using risk minimization, which can be applied to current DNN architectures by adding early prediction and additional loss layers. A key aspect of our method is that all of the intermediate predictors are learned jointly. In experiments, we evaluate our approach for different budget distributions, architectures, and datasets. Our results show a significant gain in expected accuracy compared to common baselines. Comment: British Machine Vision Conference (BMVC) 201

    Efficient multi-level scene understanding in videos

    Automatic video parsing is a key step towards human-level dynamic scene understanding, and a fundamental problem in computer vision. A core issue in video understanding is to infer multiple scene properties of a video in an efficient and consistent manner. This thesis addresses the problem of holistic scene understanding from monocular videos, jointly reasoning about semantic and geometric scene properties at multiple levels, including pixelwise annotation of video frames, object instance segmentation in the spatio-temporal domain, and/or scene-level description in terms of scene categories and layouts. We focus on four main issues in holistic video understanding: 1) what is the representation for consistent semantic and geometric parsing of videos? 2) how do we integrate high-level reasoning (e.g., objects) with pixel-wise video parsing? 3) how can we do efficient inference for multi-level video understanding? and 4) what is the representation learning strategy for efficient/cost-aware scene parsing? We discuss three multi-level video scene segmentation scenarios based on different aspects of scene properties and efficiency requirements. The first case addresses the problem of consistent geometric and semantic video segmentation for outdoor scenes. We propose a geometric scene layout representation, or a stage scene model, to efficiently capture the dependency between the semantic and geometric labels. We build a unified conditional random field for joint modeling of the semantic class, geometric label and the stage representation, and design an alternating inference algorithm to minimize the resulting energy function. The second case focuses on the problem of simultaneous pixel-level and object-level segmentation in videos. We propose to incorporate foreground object information into pixel labeling by jointly reasoning about semantic labels of supervoxels, object instance tracks and geometric relations between objects. In order to model objects, we take an exemplar approach based on a small set of object annotations to generate a set of object proposals. We then design a conditional random field framework that jointly models the supervoxel labels and object instance segments. To scale up our method, we develop an active inference strategy to improve the efficiency of multi-level video parsing, which adaptively selects an informative subset of object proposals and performs inference on the resulting compact model. The last case explores the problem of learning a flexible representation for efficient scene labeling. We propose a dynamic hierarchical model that allows us to achieve flexible trade-offs between efficiency and accuracy. Our approach incorporates the cost of feature computation and model inference, and optimizes the model performance for any given test-time budget. We evaluate all our methods on several publicly available video and image semantic segmentation datasets, and demonstrate superior performance in efficiency and accuracy. Keywords: Semantic video segmentation, Multi-level scene understanding, Efficient inference, Cost-aware scene parsing

    Distillation-based training for multi-exit architectures

    Multi-exit architectures, in which a stack of processing layers is interleaved with early output layers, allow the processing of a test example to stop early and thus save computation time and/or energy. In this work, we propose a new training procedure for multi-exit architectures based on the principle of knowledge distillation. The method encourages early exits to mimic later, more accurate exits, by matching their output probabilities. Experiments on CIFAR100 and ImageNet show that distillation-based training significantly improves the accuracy of early exits while maintaining state-of-the-art accuracy for late ones. The method is particularly beneficial when training data is limited, and it allows a straightforward extension to semi-supervised learning, i.e., making use of unlabeled data at training time. Moreover, it takes only a few lines to implement and incurs almost no computational overhead at training time, and none at all at test time.
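
The idea of early exits matching the output probabilities of a later exit can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of the last exit as the only teacher, the temperature of 2.0, and the weight `alpha` are all assumptions chosen for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(target_probs, pred_probs, eps=1e-12):
    """Cross-entropy H(target, pred) between two probability vectors."""
    return -sum(t * math.log(p + eps) for t, p in zip(target_probs, pred_probs))

def multi_exit_distillation_loss(exit_logits, label, temperature=2.0, alpha=0.5):
    """Hypothetical loss for one example.

    exit_logits: list of logit vectors, ordered from earliest to latest exit.
    Every exit gets a standard cross-entropy term against the true label.
    Every exit except the last gets an extra term (weighted by alpha) that
    pulls its temperature-softened probabilities towards those of the last
    exit. In a real training loop the teacher probabilities would normally
    be treated as constants (no gradient); that is an assumption here.
    """
    num_classes = len(exit_logits[0])
    one_hot = [1.0 if c == label else 0.0 for c in range(num_classes)]
    teacher = softmax(exit_logits[-1], temperature)
    total = 0.0
    for i, logits in enumerate(exit_logits):
        loss = cross_entropy(one_hot, softmax(logits))
        if i < len(exit_logits) - 1:
            student = softmax(logits, temperature)
            loss += alpha * cross_entropy(teacher, student)
        total += loss
    return total
```

With two exits, an early exit that agrees with the final exit yields a lower loss than one that disagrees, for example `multi_exit_distillation_loss([[2.0, 0.0, 0.0], [2.0, 0.0, 0.0]], label=0)` versus `multi_exit_distillation_loss([[0.0, 2.0, 0.0], [2.0, 0.0, 0.0]], label=0)`.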

    Geospatial Information Research: State of the Art, Case Studies and Future Perspectives

    Geospatial information science (GI science) is concerned with the development and application of geodetic and information science methods for modeling, acquiring, sharing, managing, exploring, analyzing, synthesizing, visualizing, and evaluating data on spatio-temporal phenomena related to the Earth. As an interdisciplinary scientific discipline, it focuses on developing and adapting information technologies to understand processes on the Earth and human-place interactions, to detect and predict trends and patterns in the observed data, and to support decision making. The authors – members of DGK, the Geoinformatics division, as part of the Committee on Geodesy of the Bavarian Academy of Sciences and Humanities, representing geodetic research and university teaching in Germany – have prepared this paper as a means to point out future research questions and directions in geospatial information science. For the different facets of geospatial information science, the state of the art is presented and illustrated mostly with the authors' own case studies. The paper thus shows which contributions the German GI community makes and which research perspectives arise in geospatial information science. The paper further demonstrates that GI science, with its expertise in data acquisition and interpretation, information modeling and management, integration, decision support, visualization, and dissemination, can help solve many of the grand challenges facing society today and in the future.

    Advances in detecting object classes and their semantic parts

    Object classes are central to computer vision and have been the focus of substantial research in the last fifteen years. This thesis addresses the tasks of localizing entire objects in images (object class detection) and localizing their semantic parts (part detection). We present four contributions, two for each task. The first two improve existing object class detection techniques by using context and calibration. The other two contributions explore semantic part detection in weakly-supervised settings. First, the thesis presents a technique for predicting properties of objects in an image based on its global appearance only. We demonstrate the method by predicting three properties: aspect of appearance, location in the image and class membership. Overall, the technique makes multi-component object detectors faster and improves their performance. The second contribution is a method for calibrating the popular Ensemble of Exemplar-SVM object detector. Unlike the standard approach, which calibrates each Exemplar-SVM independently, our technique optimizes their joint performance as an ensemble. We devise an efficient optimization algorithm to find the globally optimal solution of the calibration problem. This leads to better object detection performance compared to using independent calibration. The third contribution is a technique to train part-based models of object classes using data sourced from the web. We learn rich models incrementally. Our models encompass the appearance of parts and their spatial arrangement on the object, specific to each viewpoint. Importantly, the technique does not require any part location annotation, which is one of the main limits to training many part detectors. Finally, the last contribution is a study on whether semantic object parts emerge in Convolutional Neural Networks trained for higher-level tasks, such as image classification. While previous efforts studied this matter by visual inspection only, we perform an extensive quantitative analysis based on ground-truth part location annotations. This provides a more conclusive answer to the question.

    Explorando ferramentas de modelação digital, aumentada e orientada por dados em engenharia e design de produto (Exploring digital, augmented, and data-driven modelling tools in product engineering and design)

    Tools are indispensable for all diligent professional practice. New concepts and possibilities for paradigm shifts are emerging with recent computational developments in digital tools. However, new tools built on key concepts such as “Big Data”, “Accessibility” and “Algorithmic Design” are fundamentally changing the input and position of the Product Engineer and Designer. After introducing the context, this dissertation extracts three pivotal criteria from an analysis of the state of the art in Product Design Engineering. For each criterion, the most relevant and paradigmatic emerging concepts are explored and then positioned and compared within the Product Lifecycle Management wheel scheme, where potential risks and gaps are identified for exploration in the experimental part. There are two types of empirical experience: the first consists of case studies from Architecture and Urban Planning, drawn from the student's professional experience, which served as a pretext and inspiration for the experiments made directly for Product Design Engineering. These comprise first a set of isolated explorations and analyses, second a hypothetical experiment derived from them, and finally a deliberative section that culminates in a list of risks and changes concluded from all the previous work. The urgency of reflecting on what will change in that role and position, and on what kind of ethical and/or conceptual reformulations the profession needs in order to maintain its intellectual integrity and, ultimately, to survive, is plainly evident. Mestrado em Engenharia e Design de Produto (Master's in Product Engineering and Design)

    Simple identification tools in FishBase

    Simple identification tools for fish species were included in the FishBase information system from its inception. Early tools made use of the relational model and characters like fin ray meristics. Soon pictures and drawings were added as a further help, similar to a field guide. Later came the computerization of existing dichotomous keys, again in combination with pictures and other information, and the ability to restrict possible species by country, area, or taxonomic group. Today, www.FishBase.org offers four different ways to identify species. This paper describes these tools with their advantages and disadvantages, and suggests various options for further development. It explores the possibility of a holistic and integrated computer-aided strategy.

    Proceedings of the 1st Doctoral Consortium at the European Conference on Artificial Intelligence (DC-ECAI 2020)

    1st Doctoral Consortium at the European Conference on Artificial Intelligence (DC-ECAI 2020), 29-30 August 2020, Santiago de Compostela, Spain. The DC-ECAI 2020 provides a unique opportunity for PhD students who are close to finishing their doctoral research to interact with experienced researchers in the field. Senior members of the community are assigned as mentors for each group of students based on the students' research or similarity of research interests. The DC-ECAI 2020, which is held virtually this year, allows students from all over the world to present their research, to discuss their ongoing research and career plans with their mentor, to network with other participants, and to receive training and mentoring about career planning and career options.

    Machine Learning for Biomedical Application

    Biomedicine is a multidisciplinary branch of medical science that consists of many scientific disciplines, e.g., biology, biotechnology, bioinformatics, and genetics; moreover, it covers various medical specialties. In recent years, this field of science has developed rapidly. As a result, a large amount of data has been generated, due to (among other reasons) the processing, analysis, and recognition of a wide range of biomedical signals and images obtained through increasingly advanced medical imaging devices. The analysis of these data requires the use of advanced IT methods, including those based on artificial intelligence, and in particular machine learning. This is a summary of the Special Issue “Machine Learning for Biomedical Application”, briefly outlining selected applications of machine learning in the processing, analysis, and recognition of biomedical data, mostly regarding biosignals and medical images.