71 research outputs found

Learn to Model Blurry Motion via Directional Similarity and Filtering

Author: Chen Da; Cosker Darren; Li Wenbin; Yan Yan; Zhihan Lv
Publication venue: Elsevier BV
Publication date: 01/03/2018

NPC: Neural Point Characters from Video

Author: Bagautdinov Timur; Rhodin Helge; Su Shih-Yang
Publication date: 01/09/2023

High-fidelity human 3D models can now be learned directly from videos, typically by combining a template-based surface model with neural representations. However, obtaining a template surface requires expensive multi-view capture systems, laser scans, or strictly controlled conditions. Previous methods avoid using a template but rely on a costly or ill-posed mapping from observation to canonical space. We propose a hybrid point-based representation for reconstructing animatable characters that does not require an explicit surface model, while being generalizable to novel poses. For a given video, our method automatically produces an explicit set of 3D points representing approximate canonical geometry, and learns an articulated deformation model that produces pose-dependent point transformations. The points serve both as a scaffold for high-frequency neural features and an anchor for efficiently mapping between observation and canonical space. We demonstrate on established benchmarks that our representation overcomes limitations of prior work operating in either canonical or in observation space. Moreover, our automatic point extraction approach enables learning models of human and animal characters alike, matching the performance of the methods using rigged surface templates despite being more general. Project website: https://lemonatsu.github.io/npc/

arXiv.org e-Print Archive

XMem++: Production-level Video Segmentation From Few Annotated Frames

Author: Bekuzarov Maksym; Bermudez Ariana; Lee Joon-Young; Li Hao
Publication date: 15/08/2023

Despite advancements in user-guided video segmentation, extracting complex objects consistently for highly complex scenes is still a labor-intensive task, especially for production. It is not uncommon that a majority of frames need to be annotated. We introduce a novel semi-supervised video object segmentation (SSVOS) model, XMem++, that improves existing memory-based models with a permanent memory module. Most existing methods focus on single-frame annotations, while our approach can effectively handle multiple user-selected frames with varying appearances of the same object or region. Our method can extract highly consistent results while keeping the required number of frame annotations low. We further introduce an iterative and attention-based frame suggestion mechanism, which computes the next best frame for annotation. Our method is real-time and does not require retraining after each user input. We also introduce a new dataset, PUMaVOS, which covers new challenging use cases not found in previous benchmarks. We demonstrate SOTA performance on challenging (partial and multi-class) segmentation scenarios as well as long videos, while ensuring significantly fewer frame annotations than any existing method. Project page: https://max810.github.io/xmem2-project-page/ Comment: Accepted to ICCV 2023. 18 pages, 16 figures

arXiv.org e-Print Archive

Fully convolutional architectures for multi-part body segmentation

Author: Borrego Carazo Juan
Publication date: 01/09/2018

Master's final projects, Master in Fundamentals of Data Science, Faculty of Mathematics, Universitat de Barcelona, Year: 2018, Supervisors: Meysam Madadi and Sergio Escalera Guerrero. [en] Since the appearance of the baseline Fully Convolutional Network (FCN), the use of convolutional architectures has spread widely among Deep Neural Networks: from classification tasks to object tracking, they are ubiquitous in the Deep Learning field. In this study, three different convolutional architectures are studied with regard to their application to the semantic segmentation of the human body: ICNet, a cascade network operating at different resolutions; SegNet, an encoder-decoder network; and Stacked Hourglass, a network designed specifically for the human body. For this purpose, the SURREAL (Synthetic hUmans foR REAL tasks) dataset, which consists of synthetically rendered but realistic images of people, is used. As a result, it is shown that the best-performing network for this task is the Stacked Hourglass. Due to its continuous refinement of the output and the use of the full network for inference, a 55.3% mIoU is achieved on the 24 body part dataset.

Diposit Digital de la Universitat de Barcelona

Increasing Occupational Participation of Older Adults with Low Vision Through an Occupation-Based Exercise Video

Author: DeRoos Valerie J.; Moon Skyler
Publication venue: Dominican Scholar
Publication date: 01/05/2016

With the increasingly large population of older adults with low vision, many older adults would benefit from having a guide dog as an assistive device. When walking with a guide dog, different upper extremity muscles and postures are adopted to handle the guide dog. However, older adults with low vision may not be in the proper physical condition to meet the strenuous demands of handling a guide dog due to the normal aging process and decreased mobility. To prevent pain and injury, stretching and strengthening muscles used when handling a guide dog may benefit older adults before entering the Guide Dogs for the Blind (GDB) training program. The objective of the project is to improve older adults’ strength and endurance through the use of an evidence-based, occupational exercise video. The exercises within the video are integrated into daily life activities to promote habituation and adherence to the exercises.

Dominican Scholar

Continuous Camera-Based Premature-Infant Monitoring Algorithms for NICU

Author: Dániel Terbe; Imre Jánoki; Judit Varga; Miklós Szabó; Máté Siket; Péter Földesy; Ádám Nagy; Ákos Zarándy
Publication venue: MDPI AG
Publication date: 01/01/2021

Non-contact visual monitoring of vital signs in neonatology has been demonstrated by several recent studies in ideal scenarios where the baby is calm and there is no medical or parental intervention. Similar to contact monitoring methods (e.g., ECG, pulse oximeter), camera-based solutions suffer from motion artifacts. Therefore, during care and the infants’ active periods, calculated values typically differ greatly from the real ones. Our main contribution to existing remote camera-based techniques is thus to detect and classify such situations with a high level of confidence. Our algorithms can not only evaluate quiet periods but also provide continuous monitoring. Altogether, our proposed algorithms can measure pulse rate and breathing rate and recognize situations such as medical intervention or very active subjects using only a single camera, while the system does not exceed the computational capabilities of average CPU-GPU-based hardware. The performance of the algorithms was evaluated on our database collected at the Ist Dept. of Neonatology of Pediatrics, Dept of Obstetrics and Gynecology, Semmelweis University, Budapest, Hungary.

SZTAKI Publication Repository

Directory of Open Access Journals