108 research outputs found

    Layered Controllable Video Generation

    Full text link
    We introduce layered controllable video generation, where we, without any supervision, decompose the initial frame of a video into foreground and background layers, with which the user can control the video generation process by simply manipulating the foreground mask. The key challenges are the unsupervised foreground-background separation, which is ambiguous, and ability to anticipate user manipulations with access to only raw video sequences. We address these challenges by proposing a two-stage learning procedure. In the first stage, with the rich set of losses and dynamic foreground size prior, we learn how to separate the frame into foreground and background layers and, conditioned on these layers, how to generate the next frame using VQ-VAE generator. In the second stage, we fine-tune this network to anticipate edits to the mask, by fitting (parameterized) control to the mask from future frame. We demonstrate the effectiveness of this learning and the more granular control mechanism, while illustrating state-of-the-art performance on two benchmark datasets. We provide a video abstract as well as some video results on https://gabriel-huang.github.io/layered_controllable_video_generationComment: This paper has been accepted to ECCV 2022 as an Oral pape

    ECOS 2012

    Get PDF
    The 8-volume set contains the Proceedings of the 25th ECOS 2012 International Conference, Perugia, Italy, June 26th to June 29th, 2012. ECOS is an acronym for Efficiency, Cost, Optimization and Simulation (of energy conversion systems and processes), summarizing the topics covered in ECOS: Thermodynamics, Heat and Mass Transfer, Exergy and Second Law Analysis, Process Integration and Heat Exchanger Networks, Fluid Dynamics and Power Plant Components, Fuel Cells, Simulation of Energy Conversion Systems, Renewable Energies, Thermo-Economic Analysis and Optimisation, Combustion, Chemical Reactors, Carbon Capture and Sequestration, Building/Urban/Complex Energy Systems, Water Desalination and Use of Water Resources, Energy Systems- Environmental and Sustainability Issues, System Operation/ Control/Diagnosis and Prognosis, Industrial Ecology

    Unsupervised clustering of IoT signals through feature extraction and self organizing maps

    Get PDF
    This thesis scope is to build a clustering model to inspect the structural properties of a dataset composed of IoT signals and to classify these through unsupervised clustering algorithms. To this end, a feature-based representation of the signals is used. Different feature selection algorithms are then used to obtain reduced feature spaces, so as to decrease the computational cost and the memory demand. Thus, the IoT signals are clustered using Self-Organizing Maps (SOM) and then evaluatedope

    Artificial Intelligence Technology

    Get PDF
    This open access book aims to give our readers a basic outline of today’s research and technology developments on artificial intelligence (AI), help them to have a general understanding of this trend, and familiarize them with the current research hotspots, as well as part of the fundamental and common theories and methodologies that are widely accepted in AI research and application. This book is written in comprehensible and plain language, featuring clearly explained theories and concepts and extensive analysis and examples. Some of the traditional findings are skipped in narration on the premise of a relatively comprehensive introduction to the evolution of artificial intelligence technology. The book provides a detailed elaboration of the basic concepts of AI, machine learning, as well as other relevant topics, including deep learning, deep learning framework, Huawei MindSpore AI development framework, Huawei Atlas computing platform, Huawei AI open platform for smart terminals, and Huawei CLOUD Enterprise Intelligence application platform. As the world’s leading provider of ICT (information and communication technology) infrastructure and smart terminals, Huawei’s products range from digital data communication, cyber security, wireless technology, data storage, cloud computing, and smart computing to artificial intelligence

    Pattern Recognition

    Get PDF
    Pattern recognition is a very wide research field. It involves factors as diverse as sensors, feature extraction, pattern classification, decision fusion, applications and others. The signals processed are commonly one, two or three dimensional, the processing is done in real- time or takes hours and days, some systems look for one narrow object class, others search huge databases for entries with at least a small amount of similarity. No single person can claim expertise across the whole field, which develops rapidly, updates its paradigms and comprehends several philosophical approaches. This book reflects this diversity by presenting a selection of recent developments within the area of pattern recognition and related fields. It covers theoretical advances in classification and feature extraction as well as application-oriented works. Authors of these 25 works present and advocate recent achievements of their research related to the field of pattern recognition

    Recent Advances in Signal Processing

    Get PDF
    The signal processing task is a very critical issue in the majority of new technological inventions and challenges in a variety of applications in both science and engineering fields. Classical signal processing techniques have largely worked with mathematical models that are linear, local, stationary, and Gaussian. They have always favored closed-form tractability over real-world accuracy. These constraints were imposed by the lack of powerful computing tools. During the last few decades, signal processing theories, developments, and applications have matured rapidly and now include tools from many areas of mathematics, computer science, physics, and engineering. This book is targeted primarily toward both students and researchers who want to be exposed to a wide variety of signal processing techniques and algorithms. It includes 27 chapters that can be categorized into five different areas depending on the application at hand. These five categories are ordered to address image processing, speech processing, communication systems, time-series analysis, and educational packages respectively. The book has the advantage of providing a collection of applications that are completely independent and self-contained; thus, the interested reader can choose any chapter and skip to another without losing continuity

    Increasing Accuracy Performance through Optimal Feature Extraction Algorithms

    Get PDF
    This research developed models and techniques to improve the three key modules of popular recognition systems: preprocessing, feature extraction, and classification. Improvements were made in four key areas: processing speed, algorithm complexity, storage space, and accuracy. The focus was on the application areas of the face, traffic sign, and speaker recognition. In the preprocessing module of facial and traffic sign recognition, improvements were made through the utilization of grayscaling and anisotropic diffusion. In the feature extraction module, improvements were made in two different ways; first, through the use of mixed transforms and second through a convolutional neural network (CNN) that best fits specific datasets. The mixed transform system consists of various combinations of the Discrete Wavelet Transform (DWT) and Discrete Cosine Transform (DCT), which have a reliable track record for image feature extraction. In terms of the proposed CNN, a neuroevolution system was used to determine the characteristics and layout of a CNN to best extract image features for particular datasets. In the speaker recognition system, the improvement to the feature extraction module comprised of a quantized spectral covariance matrix and a two-dimensional Principal Component Analysis (2DPCA) function. In the classification module, enhancements were made in visual recognition through the use of two neural networks: the multilayer sigmoid and convolutional neural network. Results show that the proposed improvements in the three modules led to an increase in accuracy as well as reduced algorithmic complexity, with corresponding reductions in storage space and processing time

    Validation of Neural Network Predictions for the Outcome of Refractive Surgery for Myopia

    Get PDF
    Background: Refractive surgery (RS) for myopia has made a very big progress regarding its safety and predictability of the outcome. Still, a small percentage of operations require retreatment. Therefore, both legally and ethically, patients should be informed that sometimes a corrective RS may be required. We addressed this issue using Neural Networks (NN) in RS for myopia. This was a recently developed validation study of a NN.  Methods: We anonymously searched the Ophthalmica Institute of Ophthalmology and Microsurgery database for patients who underwent RS with PRK, LASEK, Epi-LASIK or LASIK between 2010 and 2018. We used a total of 13 factors related to RS. Data was divided into four sets of successful RS outcomes used for training the NN, successful RS outcomes used for testing the NN performance, RS outcomes that required retreatment used for training the NN and RS outcomes that required retreatment used for testing the NN performance. We created eight independent Learning Vector Quantization (LVQ) networks, each one responding to a specific query with 0 (for the retreat class) or 1 (for the correct class). The results of the 8 LVQs were then averaged so we could obtain a best estimate of the NN performance. Finally, a voting procedure was used to reach to a conclusion. Results: There was a statistically significant agreement (Cohen’s Kappa = 0.7658) between the predicted and the actual results regarding the need for retreatment. Our predictions had good sensitivity (0.8836) and specificity (0.9186). Conclusion: We validated our previously published results and confirmed our expectations for the NN we developed. Our results allow us to be optimistic about the future of NNs in predicting the outcome and, eventually, in planning RS

    Dense light field coding: a survey

    Get PDF
    Light Field (LF) imaging is a promising solution for providing more immersive and closer to reality multimedia experiences to end-users with unprecedented creative freedom and flexibility for applications in different areas, such as virtual and augmented reality. Due to the recent technological advances in optics, sensor manufacturing and available transmission bandwidth, as well as the investment of many tech giants in this area, it is expected that soon many LF transmission systems will be available to both consumers and professionals. Recognizing this, novel standardization initiatives have recently emerged in both the Joint Photographic Experts Group (JPEG) and the Moving Picture Experts Group (MPEG), triggering the discussion on the deployment of LF coding solutions to efficiently handle the massive amount of data involved in such systems. Since then, the topic of LF content coding has become a booming research area, attracting the attention of many researchers worldwide. In this context, this paper provides a comprehensive survey of the most relevant LF coding solutions proposed in the literature, focusing on angularly dense LFs. Special attention is placed on a thorough description of the different LF coding methods and on the main concepts related to this relevant area. Moreover, comprehensive insights are presented into open research challenges and future research directions for LF coding.info:eu-repo/semantics/publishedVersio
    • …
    corecore