71 research outputs found
NPC: Neural Point Characters from Video
High-fidelity human 3D models can now be learned directly from videos,
typically by combining a template-based surface model with neural
representations. However, obtaining a template surface requires expensive
multi-view capture systems, laser scans, or strictly controlled conditions.
Previous methods avoid using a template but rely on a costly or ill-posed
mapping from observation to canonical space. We propose a hybrid point-based
representation for reconstructing animatable characters that does not require
an explicit surface model, while being generalizable to novel poses. For a
given video, our method automatically produces an explicit set of 3D points
representing approximate canonical geometry, and learns an articulated
deformation model that produces pose-dependent point transformations. The
points serve both as a scaffold for high-frequency neural features and an
anchor for efficiently mapping between observation and canonical space. We
demonstrate on established benchmarks that our representation overcomes
limitations of prior work operating in either canonical or in observation
space. Moreover, our automatic point extraction approach enables learning
models of human and animal characters alike, matching the performance of the
methods using rigged surface templates despite being more general. Project
website: https://lemonatsu.github.io/npc/
XMem++: Production-level Video Segmentation From Few Annotated Frames
Despite advancements in user-guided video segmentation, extracting complex
objects consistently for highly complex scenes is still a labor-intensive task,
especially for production. It is not uncommon that a majority of frames need to
be annotated. We introduce a novel semi-supervised video object segmentation
(SSVOS) model, XMem++, that improves existing memory-based models, with a
permanent memory module. Most existing methods focus on single frame
annotations, while our approach can effectively handle multiple user-selected
frames with varying appearances of the same object or region. Our method can
extract highly consistent results while keeping the required number of frame
annotations low. We further introduce an iterative and attention-based frame
suggestion mechanism, which computes the next best frame for annotation. Our
method is real-time and does not require retraining after each user input. We
also introduce a new dataset, PUMaVOS, which covers new challenging use cases
not found in previous benchmarks. We demonstrate SOTA performance on
challenging (partial and multi-class) segmentation scenarios as well as long
videos, while ensuring significantly fewer frame annotations than any existing
method. Project page: https://max810.github.io/xmem2-project-page/
Comment: Accepted to ICCV 2023. 18 pages, 16 figures
Fully convolutional architectures for multi-part body segmentation
Final projects of the Master's in Fundamentals of Data Science, Faculty of Mathematics, Universitat de Barcelona, Year: 2018, Supervisors: Meysam Madadi and Sergio Escalera Guerrero
Since the appearance of the baseline Fully Convolutional Network (FCN), the use of convolutional architectures has spread widely among Deep Neural Networks: from classification tasks to object tracking, they are found ubiquitously in the Deep Learning field. In this study, three different convolutional architectures are studied with regard to their application to the semantic segmentation of the human body: ICNet, a multi-resolution cascade network; SegNet, an encoder-decoder network; and Stacked Hourglass, a network designed specifically for the human body. For this purpose, the SURREAL (Synthetic hUmans foR REAL tasks) dataset, which consists of synthetically rendered but realistic images of people, is used. The results show that the best-performing network for this task is the Stacked Hourglass. Owing to its continuous refinement of the output and the use of the full network for inference, it achieves 55.3% mIoU on the 24-body-part dataset
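The mIoU figure quoted above is the mean intersection-over-union across classes. The following is a minimal sketch of how that metric is commonly computed from flat lists of per-pixel labels. The function name `mean_iou`, the toy labels, and the choice to skip classes absent from both prediction and ground truth are assumptions for illustration, not details taken from the thesis.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes that appear in the prediction or the ground truth.

    pred and target are equal-length sequences of integer class labels,
    one per pixel. Classes with an empty union are skipped (an assumption;
    some evaluations count them differently).
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if p == t:
            # Pixel correctly labelled: counts once in both intersection and union.
            intersection[p] += 1
            union[p] += 1
        else:
            # Mismatch: the pixel belongs to the union of both classes involved.
            union[p] += 1
            union[t] += 1
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with 3 classes and 6 pixels (hypothetical data).
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(f"mIoU = {score:.3f}")
```

In the toy example the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 0.5. A 24-part evaluation would average over 24 such per-class values in the same way.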
Increasing Occupational Participation of Older Adults with Low Vision Through an Occupation-Based Exercise Video
With the increasingly large population of older adults with low vision, many older adults would benefit from having a guide dog as an assistive device. When walking with a guide dog, different upper extremity muscles and postures are adopted to handle the guide dog. However, older adults with low vision may not be in the proper physical condition to meet the strenuous demands of handling a guide dog due to the normal aging process and decreased mobility. To prevent pain and injury, stretching and strengthening muscles used when handling a guide dog may benefit older adults before entering the Guide Dogs for the Blind (GDB) training program. The objective of the project is to improve older adults’ strength and endurance through the use of an evidence-based, occupational exercise video. The exercises within the video are integrated into daily life activities to promote habituation and adherence to the exercises
Continuous Camera-Based Premature-Infant Monitoring Algorithms for NICU
Non-contact visual monitoring of vital signs in neonatology has been demonstrated by several recent studies in ideal scenarios where the baby is calm and there is no medical or parental intervention. Similar to contact monitoring methods (e.g., ECG, pulse oximeter), the camera-based solutions suffer from motion artifacts. Therefore, during care and the infants' active periods, calculated values typically differ substantially from the real ones. Our main contribution to existing remote camera-based techniques is therefore to detect and classify such situations with a high level of confidence. Our algorithms can not only evaluate quiet periods, but can also provide continuous monitoring. Altogether, our proposed algorithms can measure pulse rate and breathing rate and recognize situations such as medical intervention or very active subjects using only a single camera, while the system does not exceed the computational capabilities of average CPU-GPU-based hardware. The performance of the algorithms was evaluated on our database collected at the Ist Dept. of Neonatology of Pediatrics, Dept of Obstetrics and Gynecology, Semmelweis University, Budapest, Hungary
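The abstract does not say how pulse rate is derived from the video. As a rough illustration only, one common approach in camera-based monitoring is to take a per-frame intensity trace from a skin region and pick the strongest frequency within a plausible heart-rate band. The sketch below assumes such a trace is already available. The function name `dominant_rate_bpm`, the 40 to 240 bpm search band, and the synthetic signal are all hypothetical and are not the authors' algorithm.

```python
import math


def dominant_rate_bpm(signal, fps, lo_bpm=40, hi_bpm=240):
    """Return the rate (beats per minute) with the highest spectral power.

    signal: per-frame scalar values (e.g. mean green-channel intensity);
    fps: frames per second. The band limits are illustrative assumptions.
    """
    mean = sum(signal) / len(signal)
    centered = [s - mean for s in signal]  # remove the DC component
    best_bpm, best_power = lo_bpm, -1.0
    for bpm in range(lo_bpm, hi_bpm + 1):  # 1 bpm search grid
        freq_hz = bpm / 60.0
        # Direct evaluation of the Fourier sum at this single frequency.
        re = sum(x * math.cos(2 * math.pi * freq_hz * n / fps)
                 for n, x in enumerate(centered))
        im = sum(x * math.sin(2 * math.pi * freq_hz * n / fps)
                 for n, x in enumerate(centered))
        power = re * re + im * im
        if power > best_power:
            best_bpm, best_power = bpm, power
    return best_bpm


# Synthetic 10-second trace at 30 fps with a 2 Hz (120 bpm) oscillation.
fps = 30
trace = [100.0 + 0.5 * math.sin(2 * math.pi * 2.0 * n / fps) for n in range(300)]
print(dominant_rate_bpm(trace, fps))
```

A spectral peak of this kind is only meaningful when the subject is still, which is why detecting motion and intervention periods, as the abstract describes, matters before trusting the estimate.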