324 research outputs found
3D Shape Knowledge Graph for Cross-domain and Cross-modal 3D Shape Retrieval
With the development of 3D modeling and fabrication, 3D shape retrieval has
become a hot topic. In recent years, several strategies have been put forth to
address this retrieval issue. However, it is difficult for them to handle
cross-modal 3D shape retrieval because of the natural differences between
modalities. In this paper, we propose an innovative concept, geometric words,
which serve as basic elements that can represent any 3D or 2D entity through
combination; with their help, we can handle cross-domain and cross-modal
retrieval problems simultaneously. First, to construct the
knowledge graph, we utilize the geometric word as the node, and then use the
category of the 3D shape as well as the attribute of the geometry to bridge the
nodes. Second, based on the knowledge graph, we provide a unique way for
learning each entity's embedding. Finally, we propose an effective similarity
measure to handle the cross-domain and cross-modal 3D shape retrieval.
Specifically, every 3D or 2D entity can locate its geometric words in the 3D
knowledge graph, and these words serve as a link between cross-domain and
cross-modal data. Thus, our approach achieves cross-domain and cross-modal 3D
shape retrieval at the same time. We evaluated the proposed method on the ModelNet40
dataset and ShapeNetCore55 dataset for both the 3D shape retrieval task and
cross-domain 3D shape retrieval task. The classic cross-modal dataset (MI3DOR)
is utilized to evaluate cross-modal 3D shape retrieval. Experimental results
and comparisons with state-of-the-art methods illustrate the superiority of our
approach.
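The geometric-word idea described above can be illustrated with a minimal sketch: each entity (3D shape or 2D image) is a bag of geometric words, each word has a learned embedding, and retrieval ranks entities by cosine similarity of their mean word embeddings. All names, sizes, and the random embeddings below are illustrative assumptions, not the paper's actual model.

```python
import numpy as np

# Hypothetical sketch: entities are bags of "geometric words" drawn from a
# shared vocabulary, which is what lets 2D and 3D data meet in one space.
rng = np.random.default_rng(0)
VOCAB = 128   # number of geometric words in the knowledge graph (assumed)
DIM = 32      # embedding dimension (assumed)
word_emb = rng.normal(size=(VOCAB, DIM))

def entity_embedding(word_ids):
    """Embed an entity as the mean of its geometric-word embeddings."""
    return word_emb[word_ids].mean(axis=0)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# A 2D sketch and a 3D shape sharing most geometric words score higher
# than an unrelated shape, enabling cross-modal matching.
query_2d = entity_embedding([3, 17, 42, 99])
shape_a = entity_embedding([3, 17, 42, 77])    # overlaps with the query
shape_b = entity_embedding([5, 60, 88, 120])   # unrelated words

print(cosine(query_2d, shape_a), cosine(query_2d, shape_b))
```

The shared vocabulary is the key design choice: because both modalities decompose into the same words, no modality-specific alignment step is needed at query time.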
Dynamic Causal Disentanglement Model for Dialogue Emotion Detection
Emotion detection is a critical technology extensively employed in diverse
fields. While the incorporation of commonsense knowledge has proven beneficial
for existing emotion detection methods, dialogue-based emotion detection
encounters numerous difficulties and challenges due to human agency and the
variability of dialogue content. In dialogues, human emotions tend to
accumulate in bursts, yet they are often expressed implicitly. This implies
that many genuine emotions remain concealed within a plethora of unrelated
words and dialogues. In this paper, we propose a Dynamic Causal
Disentanglement Model based on hidden variable separation. This model
effectively decomposes the content of dialogues
and investigates the temporal accumulation of emotions, thereby enabling more
precise emotion recognition. First, we introduce a novel Causal Directed
Acyclic Graph (DAG) to establish the correlation between hidden emotional
information and other observed elements. Subsequently, our approach utilizes
pre-extracted personal attributes and utterance topics as guiding factors for
the distribution of hidden variables, aiming to separate irrelevant ones.
Specifically, we propose a dynamic temporal disentanglement model to infer the
propagation of utterances and hidden variables, enabling the accumulation of
emotion-related information throughout the conversation. To guide this
disentanglement process, we leverage the ChatGPT-4.0 and LSTM networks to
extract utterance topics and personal attributes as observed information.
Finally, we test our approach on two popular dialogue emotion detection
datasets, and the experimental results verify the model's superiority.
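The temporal-accumulation idea above can be sketched schematically: an emotion-relevant hidden state decays slowly across utterance turns (so bursts accumulate), while an emotion-irrelevant state decays quickly (so content noise is forgotten), and only the relevant state would feed the classifier. This is a toy linear illustration, not the paper's variational model; all sizes and coefficients are assumptions.

```python
import numpy as np

T, D = 6, 8                        # turns and hidden dimension (assumed)
decay_rel, decay_irr = 0.9, 0.1    # slow decay -> accumulation; fast -> forgetting

z_rel = np.zeros(D)                # emotion-relevant hidden state
z_irr = np.zeros(D)                # emotion-irrelevant hidden state
for t in range(T):
    u = np.ones(D)                 # stand-in utterance representation at turn t
    z_rel = decay_rel * z_rel + 0.5 * u   # emotion bursts accumulate
    z_irr = decay_irr * z_irr + 0.5 * u   # content noise is quickly forgotten

# Only the disentangled relevant state would reach the emotion classifier.
print(np.linalg.norm(z_rel) > np.linalg.norm(z_irr))  # True: accumulation dominates
```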
Dimension Reduction Using Samples’ Inner Structure Based Graph for Face Recognition
Acknowledgments: This research is supported by (1) the Ph.D. Programs Foundation of Ministry of Education of China under Grant no. 20120061110045 and (2) the Natural Science Foundation of Jilin Province of China under Grant no. 201115022.
Point-PC: Point Cloud Completion Guided by Prior Knowledge via Causal Inference
Point cloud completion aims to recover raw point clouds captured by scanners
from partial observations caused by occlusion and limited view angles. Many
approaches utilize a partial-complete paradigm in which missing parts are
directly predicted by a global feature learned from partial inputs. This makes
it hard to recover details because the global feature is unlikely to capture
the full details of all missing parts. In this paper, we propose a novel
approach to point cloud completion called Point-PC, which uses a memory network
to retrieve shape priors and designs an effective causal inference model to
choose missing shape information as additional geometric information to aid
point cloud completion. Specifically, we propose a memory operating mechanism
where the complete shape features and the corresponding shapes are stored in
the form of ``key-value'' pairs. To retrieve similar shapes from the partial
input, we also apply a contrastive learning-based pre-training scheme to
transfer features of incomplete shapes into the domain of complete shape
features. Moreover, we use backdoor adjustment to get rid of the confounder,
which is a part of the shape prior that has the same semantic structure as the
partial input. Experimental results on the ShapeNet-55, PCN, and KITTI datasets
demonstrate that Point-PC performs favorably against the state-of-the-art
methods.
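The "key-value" memory described above can be sketched as a simple nearest-neighbor store: complete-shape features are the keys, the shapes themselves are the values, and a partial-input feature retrieves the best-matching priors by cosine similarity. The contrastive pre-training that maps partial features into the complete-feature space is omitted; all names and data here are illustrative stand-ins.

```python
import numpy as np

rng = np.random.default_rng(1)
N, DIM = 100, 64
keys = rng.normal(size=(N, DIM))           # complete-shape features (keys)
values = [f"shape_{i}" for i in range(N)]  # stored complete shapes (values)

def retrieve(query, k=3):
    """Return the k stored shapes whose keys best match the query feature."""
    q = query / np.linalg.norm(query)
    k_norm = keys / np.linalg.norm(keys, axis=1, keepdims=True)
    sims = k_norm @ q                      # cosine similarity to every key
    top = np.argsort(-sims)[:k]
    return [values[i] for i in top]

# A partial-input feature close to key 7 should retrieve shape_7 first.
query = keys[7] + 0.05 * rng.normal(size=DIM)
print(retrieve(query))
```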
Deep Reinforcement Learning Framework for Thoracic Diseases Classification via Prior Knowledge Guidance
The chest X-ray is often utilized for diagnosing common thoracic diseases. In
recent years, many approaches have been proposed to handle the problem of
automatic diagnosis based on chest X-rays. However, the scarcity of labeled
data for related diseases still poses a huge challenge to an accurate
diagnosis. In this paper, we focus on the thorax disease diagnostic problem and
propose a novel deep reinforcement learning framework that introduces prior
knowledge to direct the learning of diagnostic agents; the model parameters
can also be continuously updated as data increase, much like a person's
learning process. Specifically, 1) prior knowledge can be learned from the
pre-trained model based on old data or other domains' similar data, which can
effectively reduce the dependence on target domain data, and 2) the framework
of reinforcement learning can make the diagnostic agent as exploratory as a
human being and improve the accuracy of diagnosis through continuous
exploration. The method can also effectively solve the model learning problem
in the case of few-shot data and improve the generalization ability of the
model. Finally, our approach's performance was demonstrated using the
well-known NIH ChestX-ray 14 and CheXpert datasets, and we achieved competitive
results. The source code can be found here:
\url{https://github.com/NeaseZ/MARL}
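The warm-start-plus-exploration idea can be sketched with a toy linear agent: its weights are initialized from an imperfect "pre-trained" prior model and then refined online with epsilon-greedy exploration and a reward signal, mimicking continuous updates as new cases arrive. This illustrates the learning scheme only, not the paper's network; all names and data are synthetic assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N_FEAT, N_CLASS = 16, 4
true_w = rng.normal(size=(N_CLASS, N_FEAT))             # ground-truth mapping
prior_w = true_w + 0.5 * rng.normal(size=true_w.shape)  # imperfect prior model

w = prior_w.copy()   # prior knowledge initializes the agent
lr, eps = 0.01, 0.1
for _ in range(3000):
    x = rng.normal(size=N_FEAT)             # features of one case
    label = int(np.argmax(true_w @ x))      # true diagnosis
    # Epsilon-greedy: mostly exploit the current policy, sometimes explore.
    if rng.random() < eps:
        action = int(rng.integers(N_CLASS))
    else:
        action = int(np.argmax(w @ x))
    reward = 1.0 if action == label else -1.0
    # Push the chosen action's score up on positive reward, down on negative.
    w[action] += lr * reward * x

def accuracy(weights, n=1000):
    xs = rng.normal(size=(n, N_FEAT))
    return float(np.mean((xs @ weights.T).argmax(1) == (xs @ true_w.T).argmax(1)))

print(round(accuracy(prior_w), 2), round(accuracy(w), 2))
```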
Building recognition on subregion’s multi-scale gist feature extraction and corresponding columns information based dimensionality reduction
In this paper, we propose a new building recognition method named subregion’s multi-scale gist feature (SM-gist) extraction and corresponding columns information based dimensionality reduction (CCI-DR). The proposed method is a two-stage model: in the first stage, a building image is divided into 4 × 5 subregions, and gist vectors are extracted from these regions individually. These gist vectors are then combined into a matrix with relatively high dimensions. In the second stage, we propose CCI-DR to project the high-dimensional manifold matrix to a low-dimensional subspace. Compared with previous building recognition methods, the advantages of our proposed method are that (1) gist features extracted by SM-gist can adapt to nonuniform illumination, and (2) CCI-DR addresses a limitation of traditional dimensionality reduction methods, which convert gist matrices into vectors and thus mix the corresponding gist vectors from different feature maps. Our building recognition method is evaluated on the Sheffield buildings database, and experiments show that it achieves satisfactory performance.
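The two-stage pipeline above can be sketched as follows, with a trivial per-patch statistic standing in for the real multi-scale gist descriptor: stage one splits the image into a 4 × 5 grid and stacks one feature vector per subregion into a matrix; stage two reduces that matrix column-wise, so corresponding columns (feature maps) are never mixed, which is the idea behind CCI-DR. All sizes and data are illustrative assumptions.

```python
import numpy as np

def subregion_features(img, rows=4, cols=5):
    """Stage 1: one small feature vector per subregion -> (rows*cols, d) matrix."""
    h, w = img.shape
    feats = []
    for i in range(rows):
        for j in range(cols):
            patch = img[i*h//rows:(i+1)*h//rows, j*w//cols:(j+1)*w//cols]
            # Stand-in statistics; the paper uses multi-scale gist here.
            feats.append([patch.mean(), patch.std(), patch.max(), patch.min()])
    return np.array(feats)  # shape (20, d)

def columnwise_reduce(mats, k=2):
    """Stage 2: PCA applied per column across samples, keeping columns separate."""
    mats = np.stack(mats)                    # (n_samples, 20, d)
    reduced = []
    for c in range(mats.shape[2]):           # one feature map (column) at a time
        col = mats[:, :, c]                  # (n_samples, 20)
        col = col - col.mean(axis=0)
        _, _, vt = np.linalg.svd(col, full_matrices=False)
        reduced.append(col @ vt[:k].T)       # keep top-k directions of this column
    return np.concatenate(reduced, axis=1)   # (n_samples, d*k)

rng = np.random.default_rng(3)
imgs = [rng.random((64, 80)) for _ in range(10)]
mats = [subregion_features(im) for im in imgs]
low = columnwise_reduce(mats, k=2)
print(low.shape)  # (10, 8)
```

Reducing each column independently is what distinguishes this from flattening the matrix into one long vector before PCA, which would mix features from different maps.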
Excitatory nucleo-olivary pathway shapes cerebellar outputs for motor control
The brain generates predictive motor commands to control the spatiotemporal precision of high-velocity movements. Yet, how the brain organizes automated internal feedback to coordinate the kinematics of such fast movements is unclear. Here we unveil a unique nucleo-olivary loop in the cerebellum and its involvement in coordinating high-velocity movements. Activating the excitatory nucleo-olivary pathway induces well-timed internal feedback complex spike signals in Purkinje cells to shape cerebellar outputs. Anatomical tracing reveals extensive axonal collaterals from the excitatory nucleo-olivary neurons to downstream motor regions, supporting integration of motor output and internal feedback signals within the cerebellum. This pathway directly drives saccades and head movements with a converging direction, while curtailing their amplitude and velocity via the powerful internal feedback mechanism. Our finding challenges the long-standing dogma that the cerebellum inhibits the inferior olivary pathway and provides a new circuit mechanism for the cerebellar control of high-velocity movements.
- …