54 research outputs found
Deep Learning based Real-time Recognition of Dynamic Finger Gestures using a Data Glove
This article presents a real-time dynamic finger gesture recognition system that uses a soft-sensor-embedded data glove, which measures the metacarpophalangeal (MCP) and proximal interphalangeal (PIP) joint angles of the five fingers. A challenging problem in gesture recognition is separating meaningful dynamic gestures from a continuous data stream. Unconscious hand motions and sudden tremors, which can easily lead to segmentation ambiguity, make this problem difficult. Furthermore, hand shapes and speeds differ between users performing the same dynamic gesture, and even gestures made by a single user often vary. To separate meaningful dynamic gestures, we propose a deep learning-based gesture spotting algorithm that detects the start and end of a gesture sequence in a continuous data stream. The gesture spotting algorithm takes window data and estimates a scalar value, named the gesture progress sequence (GPS), that represents gesture progress. To address the gesture variation problem, we propose a sequence simplification algorithm and a deep learning-based gesture recognition algorithm. The three proposed algorithms (gesture spotting, sequence simplification, and gesture recognition) are unified into a real-time gesture recognition system, which was tested with 11 dynamic finger gestures in real time. The proposed system took only 6 ms to estimate a GPS and no more than 12 ms to recognize a completed gesture.
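To make the gesture spotting idea concrete, here is a minimal Python sketch of how start/end detection could sit on top of a stream of GPS values. The abstract does not say how GPS is thresholded, so everything here is an assumption: the function name `spot_gestures`, the two thresholds, and the premise that GPS stays near 0 when the hand is idle and rises toward 1 as a gesture completes. The deep network that produces the GPS values is not modeled.

```python
from typing import List, Tuple


def spot_gestures(gps_stream: List[float],
                  start_thresh: float = 0.1,
                  end_thresh: float = 0.9) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs of gestures found in a GPS stream.

    Assumption (not from the abstract): GPS is near 0 when idle and
    approaches 1 when a gesture is complete.
    """
    segments: List[Tuple[int, int]] = []
    start = None   # index where the current candidate gesture began
    armed = True   # False until GPS returns to idle after a completed gesture
    for i, gps in enumerate(gps_stream):
        if not armed:
            if gps < start_thresh:
                armed = True
            continue
        if start is None:
            if gps >= start_thresh:
                start = i
        elif gps >= end_thresh:
            segments.append((start, i))
            start = None
            armed = False
        elif gps < start_thresh:
            # Progress collapsed before completion: treat as unintentional motion.
            start = None
    return segments


stream = [0.0, 0.05, 0.2, 0.5, 0.95, 1.0, 0.0, 0.3, 0.05, 0.0, 0.15, 0.6, 0.92]
segments = spot_gestures(stream)
```

In this toy stream the short rise to 0.3 is discarded because progress falls back to idle before reaching the end threshold, which is one simple way to ignore tremor-like motion.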
SyncDiffusion: Coherent Montage via Synchronized Joint Diffusions
The remarkable capabilities of pretrained image diffusion models have been
utilized not only for generating fixed-size images but also for creating
panoramas. However, naive stitching of multiple images often results in visible
seams. Recent techniques have attempted to address this issue by performing
joint diffusions in multiple windows and averaging latent features in
overlapping regions. However, these approaches, which focus on seamless montage
generation, often yield incoherent outputs by blending different scenes within
a single image. To overcome this limitation, we propose SyncDiffusion, a
plug-and-play module that synchronizes multiple diffusions through gradient
descent from a perceptual similarity loss. Specifically, we compute the
gradient of the perceptual loss using the predicted denoised images at each
denoising step, providing meaningful guidance for achieving coherent montages.
Our experimental results demonstrate that our method produces significantly
more coherent outputs compared to previous methods (66.35% vs. 33.65% in our
user study) while still maintaining fidelity (as assessed by GIQA) and
compatibility with the input prompt (as measured by CLIP score).
Comment: Project page: https://syncdiffusion.github.i
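The synchronization idea in the SyncDiffusion abstract, a gradient-descent step that reduces a similarity loss between windows, can be illustrated with a toy Python sketch. This is not the authors' implementation: the perceptual loss is replaced by a plain squared distance, each window is a short list of floats instead of a latent image, and the use of a single anchor window plus the names `sync_step`, `total_loss`, and `weight` are assumptions made for illustration. In the described method the loss is computed on the predicted denoised images at each denoising step.

```python
from typing import List


def total_loss(windows: List[List[float]], anchor: int = 0) -> float:
    """Sum of squared distances from every window to the anchor window.

    Stand-in for a perceptual similarity loss (an assumption of this sketch).
    """
    ref = windows[anchor]
    return sum((a - b) ** 2 for x in windows for a, b in zip(x, ref))


def sync_step(windows: List[List[float]],
              anchor: int = 0,
              weight: float = 0.25) -> List[List[float]]:
    """One gradient-descent step pulling every window toward the anchor.

    For window x the loss is sum_k (x[k] - ref[k])**2, so its gradient
    with respect to x is 2 * (x - ref).
    """
    ref = windows[anchor]
    updated = []
    for x in windows:
        grad = [2.0 * (a - b) for a, b in zip(x, ref)]
        updated.append([a - weight * g for a, g in zip(x, grad)])
    return updated


windows = [[0.0, 0.0], [4.0, 0.0], [0.0, 2.0]]
before = total_loss(windows)
after_windows = sync_step(windows)
after = total_loss(after_windows)
```

With `weight = 0.25` each non-anchor window moves halfway toward the anchor, so the toy loss drops by a factor of four per step while the anchor itself is unchanged.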
Im2Hands: Learning Attentive Implicit Representation of Interacting Two-Hand Shapes
We present Implicit Two Hands (Im2Hands), the first neural implicit
representation of two interacting hands. Unlike existing methods on two-hand
reconstruction that rely on a parametric hand model and/or low-resolution
meshes, Im2Hands can produce fine-grained geometry of two hands with high
hand-to-hand and hand-to-image coherency. To handle the shape complexity and
interaction context between two hands, Im2Hands models the occupancy volume of
two hands - conditioned on an RGB image and coarse 3D keypoints - by two novel
attention-based modules responsible for (1) initial occupancy estimation and
(2) context-aware occupancy refinement, respectively. Im2Hands first learns
per-hand neural articulated occupancy in the canonical space designed for each
hand using query-image attention. It then refines the initial two-hand
occupancy in the posed space to enhance the coherency between the two hand
shapes using query-anchor attention. In addition, we introduce an optional
keypoint refinement module to enable robust two-hand shape estimation from
predicted hand keypoints in a single-image reconstruction scenario. We
experimentally demonstrate the effectiveness of Im2Hands on two-hand
reconstruction in comparison to related methods, where ours achieves
state-of-the-art results. Our code is publicly available at
https://github.com/jyunlee/Im2Hands.
Comment: 6 figures, 14 pages, accepted to CVPR 2023, project page:
https://jyunlee.github.io/projects/implicit-two-hands
Fine-Grained Socioeconomic Prediction from Satellite Images with Distributional Adjustment
While measuring socioeconomic indicators is critical for local governments to
make informed policy decisions, such measurements are often unavailable at
fine-grained levels like municipality. This study employs deep learning-based
predictions from satellite images to close the gap. We propose a method that
assigns a socioeconomic score to each satellite image by capturing the
distributional behavior observed in larger areas based on the ground truth. We
train an ordinal regression scoring model and adjust the scores to follow the
common power law within and across regions. Evaluation based on official
statistics in South Korea shows that our method outperforms previous models in
predicting population and employment size at both the municipality and grid
levels. Our method also demonstrates robust performance in districts with
uneven development, suggesting its potential use in developing countries where
reliable, fine-grained data is scarce.
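As a rough illustration of adjusting scores to follow a power law, the Python sketch below remaps per-image scores within one region onto a rank-size curve and rescales them to a known regional total. The abstract does not give the adjustment formula, so the rank-size form `rank ** (-alpha)`, the exponent `alpha`, and the function name `power_law_adjust` are all assumptions; the ordinal regression model that produces the raw scores is not modeled.

```python
from typing import List


def power_law_adjust(scores: List[float],
                     total: float,
                     alpha: float = 1.0) -> List[float]:
    """Remap scores to rank**(-alpha) values that sum to `total`.

    The ordering of the input scores is preserved; only their magnitudes
    change. The rank-size form is an assumption of this sketch.
    """
    if not scores:
        return []
    # Indices sorted from highest score (rank 1) to lowest.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    raw = [0.0] * len(scores)
    for rank, idx in enumerate(order, start=1):
        raw[idx] = rank ** (-alpha)
    scale = total / sum(raw)
    return [r * scale for r in raw]


# Three grid cells in a region whose official total is 1100 (made-up numbers).
adjusted = power_law_adjust([0.2, 0.9, 0.5], total=1100.0, alpha=1.0)
```

Here the highest-scoring cell receives the largest share, and the adjusted values sum to the regional total, which mirrors the idea of anchoring fine-grained predictions to ground truth available for larger areas.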