1,046 research outputs found

    Text visualization techniques: Taxonomy, visual survey, and community insights

    Full text link
    Figure 1: The web-based user interface of our visual survey called Text Visualization Browser. By using the interaction panel on the left hand side, researchers can look for specific visualization techniques and filter out entries with respect to a set of categories (cf. the taxonomy given in Sect. 3). Details for a selected entry are shown by clicking on a thumbnail image in the main view. The survey contains 141 categorized visualization techniques by January 19, 2015. Text visualization has become a growing and increasingly impor-tant subfield of information visualization. Thus, it is getting harder for researchers to look for related work with specific tasks or vi-sual metaphors in mind. In this paper, we present an interactive visual survey of text visualization techniques that can be used for the purposes of search for related work, introduction to the subfield and gaining insight into research trends. We describe the taxonomy used for categorization of text visualization techniques and com-pare it to approaches employed in several other surveys. Finally, we present results of analyses performed on the entries data

    Eulerian video magnification for revealing subtle changes in the world

    Get PDF
    Our goal is to reveal temporal variations in videos that are difficult or impossible to see with the naked eye and display them in an indicative manner. Our method, which we call Eulerian Video Magnification, takes a standard video sequence as input, and applies spatial decomposition, followed by temporal filtering to the frames. The resulting signal is then amplified to reveal hidden information. Using our method, we are able to visualize the flow of blood as it fills the face and also to amplify and reveal small motions. Our technique can run in real time to show phenomena occurring at the temporal frequencies selected by the user.United States. Defense Advanced Research Projects Agency (DARPA SCENICC program)National Science Foundation (U.S.) (NSF CGV-1111415)Quanta Computer (Firm)Nvidia Corporation (Graduate Fellowship

    Collision Avoidance on Unmanned Aerial Vehicles using Deep Neural Networks

    Get PDF
    Unmanned Aerial Vehicles (UAVs), although hardly a new technology, have recently gained a prominent role in many industries, being widely used not only among enthusiastic consumers but also in high demanding professional situations, and will have a massive societal impact over the coming years. However, the operation of UAVs is full of serious safety risks, such as collisions with dynamic obstacles (birds, other UAVs, or randomly thrown objects). These collision scenarios are complex to analyze in real-time, sometimes being computationally impossible to solve with existing State of the Art (SoA) algorithms, making the use of UAVs an operational hazard and therefore significantly reducing their commercial applicability in urban environments. In this work, a conceptual framework for both stand-alone and swarm (networked) UAVs is introduced, focusing on the architectural requirements of the collision avoidance subsystem to achieve acceptable levels of safety and reliability. First, the SoA principles for collision avoidance against stationary objects are reviewed. Afterward, a novel image processing approach that uses deep learning and optical flow is presented. This approach is capable of detecting and generating escape trajectories against potential collisions with dynamic objects. Finally, novel models and algorithms combinations were tested, providing a new approach for the collision avoidance of UAVs using Deep Neural Networks. The feasibility of the proposed approach was demonstrated through experimental tests using a UAV, created from scratch using the framework developed.Os veículos aéreos não tripulados (VANTs), embora dificilmente considerados uma nova tecnologia, ganharam recentemente um papel de destaque em muitas indústrias, sendo amplamente utilizados não apenas por amadores, mas também em situações profissionais de alta exigência, sendo expectável um impacto social massivo nos próximos anos. No entanto, a operação de VANTs está repleta de sérios riscos de segurança, como colisões com obstáculos dinâmicos (pássaros, outros VANTs ou objetos arremessados). Estes cenários de colisão são complexos para analisar em tempo real, às vezes sendo computacionalmente impossível de resolver com os algoritmos existentes, tornando o uso de VANTs um risco operacional e, portanto, reduzindo significativamente a sua aplicabilidade comercial em ambientes citadinos. Neste trabalho, uma arquitectura conceptual para VANTs autônomos e em rede é apresentada, com foco nos requisitos arquitetônicos do subsistema de prevenção de colisão para atingir níveis aceitáveis de segurança e confiabilidade. Os estudos presentes na literatura para prevenção de colisão contra objectos estacionários são revistos e uma nova abordagem é descrita. Esta tecnica usa técnicas de aprendizagem profunda e processamento de imagem, para realizar a prevenção de colisões em tempo real com objetos móveis. Por fim, novos modelos e combinações de algoritmos são propostos, fornecendo uma nova abordagem para evitar colisões de VANTs usando Redes Neurais Profundas. A viabilidade da abordagem foi demonstrada através de testes experimentais utilizando um VANT, desenvolvido a partir da arquitectura apresentada

    Geospatial Information Research: State of the Art, Case Studies and Future Perspectives

    Get PDF
    Geospatial information science (GI science) is concerned with the development and application of geodetic and information science methods for modeling, acquiring, sharing, managing, exploring, analyzing, synthesizing, visualizing, and evaluating data on spatio-temporal phenomena related to the Earth. As an interdisciplinary scientific discipline, it focuses on developing and adapting information technologies to understand processes on the Earth and human-place interactions, to detect and predict trends and patterns in the observed data, and to support decision making. The authors – members of DGK, the Geoinformatics division, as part of the Committee on Geodesy of the Bavarian Academy of Sciences and Humanities, representing geodetic research and university teaching in Germany – have prepared this paper as a means to point out future research questions and directions in geospatial information science. For the different facets of geospatial information science, the state of art is presented and underlined with mostly own case studies. The paper thus illustrates which contributions the German GI community makes and which research perspectives arise in geospatial information science. The paper further demonstrates that GI science, with its expertise in data acquisition and interpretation, information modeling and management, integration, decision support, visualization, and dissemination, can help solve many of the grand challenges facing society today and in the future

    A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community

    Full text link
    In recent years, deep learning (DL), a re-branding of neural networks (NNs), has risen to the top in numerous areas, namely computer vision (CV), speech recognition, natural language processing, etc. Whereas remote sensing (RS) possesses a number of unique challenges, primarily related to sensors and applications, inevitably RS draws from many of the same theories as CV; e.g., statistics, fusion, and machine learning, to name a few. This means that the RS community should be aware of, if not at the leading edge of, of advancements like DL. Herein, we provide the most comprehensive survey of state-of-the-art RS DL research. We also review recent new developments in the DL field that can be used in DL for RS. Namely, we focus on theories, tools and challenges for the RS community. Specifically, we focus on unsolved challenges and opportunities as it relates to (i) inadequate data sets, (ii) human-understandable solutions for modelling physical phenomena, (iii) Big Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and learning algorithms for spectral, spatial and temporal data, (vi) transfer learning, (vii) an improved theoretical understanding of DL systems, (viii) high barriers to entry, and (ix) training and optimizing the DL.Comment: 64 pages, 411 references. To appear in Journal of Applied Remote Sensin

    Learning efficient temporal information in deep networks: From the viewpoints of applications and modeling

    Get PDF
    With the introduction of deep learning, machine learning has dominated several technology areas, giving birth to high-performance applications that can even challenge human-level accuracy. However, the complexity of deep models is also exploding as a by-product of the revolution of machine learning. Such enormous model complexity has raised the new challenge of improving the efficiency in deep models to reduce deployment expense, especially for systems with high throughput demands or devices with limited power. The dissertation aims to improve the efficiency of temporal-sensitive deep models in four different directions. First, we develop a bandwidth extension mapping to avoid deploying multiple speech recognition systems corresponding to wideband and narrowband signals. Second, we apply a multi-modality approach to compensate for the performance of an excitement scoring system, where the input video sequences are aggressively down-sampled to reduce throughput. Third, we formulate the motion feature in the feature space by directly inducing the temporal information from intermediate layers of deep networks instead of relying on an additional optical flow stream. Finally, we model a spatiotemporal sampling network inspired by the human visual perception mechanism to reduce input frames and regions adaptively

    Scalable Methodologies and Analyses for Modality Bias and Feature Exploitation in Language-Vision Multimodal Deep Learning

    Get PDF
    Multimodal machine learning benchmarks have exponentially grown in both capability and popularity over the last decade. Language-vision question-answering tasks such as Visual Question Answering (VQA) and Video Question Answering (video-QA) have ---thanks to their high difficulty--- become a particularly popular means through which to develop and test new modelling designs and methodology for multimodal deep learning. The challenging nature of VQA and video-QA tasks leaves plenty of room for innovation at every component of the deep learning pipeline: from dataset to modelling methodology. Such circumstances are ideal for innovating in the space of language-vision multimodality. Furthermore, the wider field is currently undergoing an incredible period of growth and increasing interest. I therefore aim to contribute to multiple key components of the VQA and video-QA pipeline, but specifically in a manner such that my contributions remain relevant, ‘scaling’ with the revolutionary new benchmark models and datasets of the near future instead of being rendered obsolete by them. The work in this thesis: highlights and explores the disruptive and problematic presence of language bias in the popular TVQA video-QA dataset, and proposes a dataset-invariant method to identify subsets that respond to different modalities; thoroughly explores the suitability of bilinear pooling as a language-vision fusion technique in video-QA, offering experimental and theoretical insight, and highlighting the parallels in multimodal processing with neurological theories; explores the nascent visual equivalent of languague modelling (`visual modelling') in order to boost the power of visual features; and proposes a dataset-invariant neurolinguistically-inspired labelling scheme for use in multimodal question-answering. I explore the positive and negative results that my experiments across this thesis yield. I conclude by discussing the limitations of my contributions, and conclude with proposals for future directions of study in the areas I contribute to

    The role of time in video understanding

    Get PDF
    • …
    corecore