153 research outputs found

State of the Art on Neural Rendering

Author: Agrawala M.
Fanello S.
Fried O.
Goldman D.
Lombardi S.
Martin-Brualla R.
Nießner M.
Pandey R.
Saragih J.
Shechtman E.
Simon T.
Sitzmann V.
Sunkavalli K.
Tewari A.
Theobalt C.
Thies J.
Wetzstein G.
Zhu J.
Zollhöfer M.
Publication date: 01/01/2020

Efficient rendering of photo-realistic virtual worlds is a long-standing effort of computer graphics. Modern graphics techniques have succeeded in synthesizing photo-realistic images from hand-crafted scene representations. However, the automatic generation of shape, materials, lighting, and other aspects of scenes remains a challenging problem that, if solved, would make photo-realistic computer graphics more widely accessible. Concurrently, progress in computer vision and machine learning has given rise to a new approach to image synthesis and editing, namely deep generative models. Neural rendering is a new and rapidly emerging field that combines generative machine learning techniques with physical knowledge from computer graphics, e.g., by the integration of differentiable rendering into network training. With a plethora of applications in computer graphics and vision, neural rendering is poised to become a new area in the graphics community, yet no survey of this emerging field exists. This state-of-the-art report summarizes the recent trends and applications of neural rendering. We focus on approaches that combine classic computer graphics techniques with deep generative models to obtain controllable and photo-realistic outputs. Starting with an overview of the underlying computer graphics and machine learning concepts, we discuss critical aspects of neural rendering approaches. This state-of-the-art report is focused on the many important use cases for the described algorithms such as novel view synthesis, semantic photo manipulation, facial and body reenactment, relighting, free-viewpoint video, and the creation of photo-realistic avatars for virtual and augmented reality telepresence. Finally, we conclude with a discussion of the social implications of such technology and investigate open research problems.

MPG.PuRe

OmniForce: On Human-Centered, Large Model Empowered and Cloud-Edge Collaborative AutoML System

Author: Bian Rongcheng
Cai Bohua
Cao Qiong
Chen Guanpu
Chen Shixiang
Ding Liang
He Fengxiang
Li Chang
Li Jiaxing
Liu Daqing
Liu Dongkai
Liu Wei
Liu Xiangyang
Peng Xuyang
Shen Li
Tao Dacheng
Wang Chaoyue
Wang Zhenfang
Xie Shuai
Xue Chao
Yang Yibo
Zhan Yibing
Zhang Jing
Zhang Shijin
Zhang Yukang
Zhao Shanshan
Zhao Yiyan
Zheng Heliang
Publication date: 08/07/2023

Automated machine learning (AutoML) seeks to build ML models with minimal human effort. While considerable research has been conducted in the area of AutoML in general, aiming to take humans out of the loop when building artificial intelligence (AI) applications, scant literature has focused on how AutoML can work well in open-environment scenarios such as the process of training and updating large models, industrial supply chains or the industrial metaverse, where people often face open-loop problems during the search process: they must continuously collect data, update data and models, satisfy the requirements of the development and deployment environment, support massive devices, modify evaluation metrics, etc. Addressing the open-environment issue with pure data-driven approaches requires considerable data, computing resources, and effort from dedicated data engineers, making current AutoML systems and platforms inefficient and computationally intractable. Human-computer interaction is a practical and feasible way to tackle the problem of open-environment AI. In this paper, we introduce OmniForce, a human-centered AutoML (HAML) system that yields both human-assisted ML and ML-assisted human techniques, to put an AutoML system into practice and build adaptive AI in open-environment scenarios. Specifically, we present OmniForce in terms of ML version management; pipeline-driven development and deployment collaborations; a flexible search strategy framework; and widely provisioned and crowdsourced application algorithms, including large models. Furthermore, the (large) models constructed by OmniForce can be automatically turned into remote services in a few minutes; this process is dubbed model as a service (MaaS). Experimental results obtained in multiple search spaces and real-world use cases demonstrate the efficacy and efficiency of OmniForce.

arXiv.org e-Print Archive

Deep Learning-Based Action Recognition

Publication venue: MDPI AG
Publication date: 25/10/2022

The classification of human action or behavior patterns is very important for analyzing situations in the field and maintaining social safety. This book focuses on recent research findings on recognizing human action patterns. Technologies for recognizing human action patterns include processing human behavior data for learning, expressing the feature values of images, extracting spatiotemporal information from images, recognizing human posture, and recognizing gestures. Research on these technologies has recently been conducted using deep learning network models, and excellent research results have been included in this edition.

Directory of Open Access Books (DOAB)

Soft computing applied to optimization, computer vision and medicine

Author: Diaz Cortes Margarita Arimatea
Publication date: 01/01/2019

Artificial intelligence has permeated almost every area of life in modern society, and its significance continues to grow. As a result, in recent years, Soft Computing has emerged as a powerful set of methodologies that propose innovative and robust solutions to a variety of complex problems. Soft Computing methods, because of their broad range of application, have the potential to significantly improve human living conditions. The motivation for the present research emerged from this background and possibility. This research aims to accomplish two main objectives: On the one hand, it endeavors to bridge the gap between Soft Computing techniques and their application to intricate problems. On the other hand, it explores the hypothetical benefits of Soft Computing methodologies as novel effective tools for such problems. This thesis synthesizes the results of extensive research on Soft Computing methods and their applications to optimization, Computer Vision, and medicine. This work is composed of several individual projects, which employ classical and new optimization algorithms. The manuscript presented here intends to provide an overview of the different aspects of Soft Computing methods in order to enable the reader to reach a global understanding of the field. Therefore, this document is assembled as a monograph that summarizes the outcomes of these projects across 12 chapters. The chapters are structured so that they can be read independently. The key focus of this work is the application and design of Soft Computing approaches for solving problems in the following: Block Matching, Pattern Detection, Thresholding, Corner Detection, Template Matching, Circle Detection, Color Segmentation, Leukocyte Detection, and Breast Thermogram Analysis. One of the outcomes presented in this thesis involves the development of two evolutionary approaches for global optimization. 
These were tested over complex benchmark datasets and showed promising results, thus opening the debate for future applications. Moreover, the applications for Computer Vision and medicine presented in this work have highlighted the utility of different Soft Computing methodologies in the solution of problems in such subjects. A milestone in this area is the translation of the Computer Vision and medical issues into optimization problems. Additionally, this work also strives to provide tools for combating public health issues by expanding the concepts to automated detection and diagnosis aid for pathologies such as Leukemia and breast cancer. The application of Soft Computing techniques in this field has attracted great interest worldwide due to the exponential growth of these diseases. Lastly, the use of Fuzzy Logic, Artificial Neural Networks, and Expert Systems in many everyday domestic appliances, such as washing machines, cookers, and refrigerators is now a reality. Many other industrial and commercial applications of Soft Computing have also been integrated into everyday use, and this is expected to increase within the next decade. Therefore, the research conducted here contributes an important piece for expanding these developments. The applications presented in this work are intended to serve as technological tools that can then be used in the development of new devices.

Institutional Repository of the Freie Universität Berlin

A cross-modal investigation into the relationships between bistable perception and a global temporal mechanism

Author: Parker Amanda Louise
Publication venue: Faculty of Science, School of Psychology
Publication date: 01/01/2013

When the two eyes are presented with sufficiently different images, Binocular Rivalry (BR) occurs. BR is a form of bistable perception involving stochastic alternations in awareness between distinct images shown to each eye. It has been suggested that the dynamics of BR are due to the activity of a central temporal process and are linked to involuntary mechanisms of selective attention (aka exogenous attention). To test these ideas, stimuli designed to evoke exogenous attention and central temporal processes were employed during BR observation. These stimuli included auditory and visual looming motion and streams of transient events of varied temporal rate and pattern. Although these stimuli exerted a strong impact over some aspects of BR, they were unable to override its characteristic stochastic pattern of alternations completely. It is concluded that BR is subject to distributed influences, but ultimately, is achieved in neural processing areas specific to the binocular conflict.

Sydney eScholarship

Understanding Mode and Modality Transfer in Unistroke Gesture Input

Author: Henderson Jay
Publication venue: University of Waterloo
Publication date: 18/10/2021

Unistroke gestures are an attractive input method with an extensive research history, but one challenge with their usage is that the gestures are not always self-revealing. To obtain expertise with these gestures, interaction designers often deploy a guided novice mode -- where users can rely on recognizing visual UI elements to perform a gestural command. Once a user knows the gesture and associated command, they can perform it without guidance; thus, relying on recall. The primary aim of my thesis is to obtain a comprehensive understanding of why, when, and how users transfer from guided modes or modalities to potentially more efficient, or novel, methods of interaction -- through symbolic-abstract unistroke gestures. The goal of my work is not only to study user behaviour from novice to more efficient interaction mechanisms, but also to expand upon the concept of intermodal transfer to different contexts. We garner this understanding by empirically evaluating three different use cases of mode and/or modality transitions. Leveraging marking menus, the first piece investigates whether or not designers should force expertise transfer by penalizing use of the guided mode, in an effort to encourage use of the recall mode. Second, we investigate how well users can transfer skills between modalities, particularly when it is impractical to present guidance in the target or recall modality. Lastly, we assess how well users' pre-existing spatial knowledge of an input method (the QWERTY keyboard layout) transfers to performance in a new modality. Applying lessons from these three assessments, we segment intermodal transfer into three possible characterizations -- beyond the traditional novice to expert contextualization. This is followed by a series of implications and potential areas of future exploration stemming from our work.

University of Waterloo's Institutional Repository

Connected Attribute Filtering Based on Contour Smoothness

Author: Ouzounis Georgios
Urbach Erik R.
Wilkinson M.H.F.
Publication venue: The Russian Academy of Sciences
Publication date: 01/01/2013

A new attribute measuring the contour smoothness of 2-D objects is presented in the context of morphological attribute filtering. The attribute is based on the ratio of the circularity and non-compactness, and has a maximum of 1 for a perfect circle. It decreases as the object boundary becomes irregular. Computation on hierarchical image representation structures relies on five auxiliary data members and is rapid. Contour smoothness is a suitable descriptor for detecting and discriminating man-made structures from other image features. An example is demonstrated on a very-high-resolution satellite image using connected pattern spectra and the switchboard platform.

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Dissertations of the University of Groningen