Emergent Communication in Interactive Sketch Question Answering
Vision-based emergent communication (EC) aims to learn to communicate through
sketches and demystify the evolution of human communication. However,
previous works neglect multi-round interaction, which is indispensable in
human communication. To fill this gap, we first introduce a novel Interactive Sketch
Question Answering (ISQA) task, where two collaborative players are interacting
through sketches to answer a question about an image in a multi-round manner.
To accomplish this task, we design a new and efficient interactive EC system,
which achieves an effective balance among three evaluation factors: question
answering accuracy, drawing complexity, and human interpretability. Our
experimental results, including a human evaluation, demonstrate that the
multi-round interactive mechanism facilitates targeted and efficient
communication between intelligent agents with decent human interpretability.
Comment: Accepted by NeurIPS 202
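The multi-round protocol this abstract describes can be sketched as a simple loop. Everything below is a hypothetical illustration: the stub agents stand in for the learned drawing and question-answering modules, and the confidence threshold and stroke representation are assumptions, not the paper's design.

```python
# Hypothetical sketch of a multi-round interactive sketch QA loop.
# Stub agents and the 0.9 confidence threshold are illustrative only.

def isqa_rounds(sender, receiver, image, question, max_rounds=3):
    """Run up to max_rounds of sketch-based communication.

    Each round the sender adds strokes conditioned on the receiver's
    feedback; the receiver answers once it is confident enough.
    """
    sketch, feedback, answer = [], None, None
    for _ in range(max_rounds):
        strokes = sender.draw(image, question, feedback)   # new strokes only
        sketch.extend(strokes)                             # sketch is cumulative
        answer, confidence, feedback = receiver.respond(sketch, question)
        if confidence >= 0.9:                              # confident: stop early
            break
    return answer, len(sketch)            # answer plus drawing complexity


class StubSender:
    def draw(self, image, question, feedback):
        return ["stroke"]                 # one stroke per round


class StubReceiver:
    def respond(self, sketch, question):
        # Grows more confident as the sketch accumulates detail.
        confidence = min(1.0, 0.4 * len(sketch))
        return "cat", confidence, {"need": "more detail"}


answer, complexity = isqa_rounds(StubSender(), StubReceiver(), image=None,
                                 question="What animal is this?")
```

Drawing complexity here is just the stroke count, which is one simple way to quantify the cost term the abstract balances against accuracy.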
Model-Theoretic Logic for Mathematical Theory of Semantic Information and Communication
In this paper, we propose an advancement to Tarskian model-theoretic
semantics, leading to a unified quantitative theory of semantic information and
communication. We start with a description of inductive logic and probabilities,
which serve as notable tools in the development of the proposed theory. Then, we
identify two disparate kinds of uncertainty in semantic communication, physical
uncertainty and content uncertainty; present refined interpretations of semantic
information measures; and conclude by proposing a new measure for semantic
content-information and entropy. Our proposition standardizes semantic
information across different universes and systems, hence bringing
measurability and comparability into semantic communication. We then introduce
conditional and mutual semantic cont-information measures and point out
their utility in formulating practical and optimizable lossless and lossy
semantic compression objectives. Finally, we experimentally demonstrate the
value of our theoretical propositions.
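For intuition, the classical model-theoretic measures that this line of work refines can be computed on a toy universe: given a logical probability m(s) of a sentence s (the fraction of models satisfying it), the content measure is cont(s) = 1 - m(s) and the information measure is inf(s) = -log2 m(s). The two-atom universe below is an illustrative assumption, not the paper's construction.

```python
# Carnap/Bar-Hillel-style semantic information measures on a toy universe
# of all truth assignments to two atoms p, q (four models, uniform weight).

import math
from itertools import product

MODELS = list(product([False, True], repeat=2))

def m(sentence):
    """Logical probability: fraction of models satisfying the sentence."""
    return sum(1 for model in MODELS if sentence(*model)) / len(MODELS)

def cont(sentence):
    return 1.0 - m(sentence)            # content: share of models excluded

def inf(sentence):
    return -math.log2(m(sentence))      # entropy-style information

p_and_q = lambda p, q: p and q          # satisfied by 1 of 4 models
p_or_q = lambda p, q: p or q            # satisfied by 3 of 4 models
```

The stronger sentence `p_and_q` excludes more models, so it carries more content and more information than `p_or_q`, which is the qualitative behavior any refined measure must preserve.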
TEILP: Time Prediction over Knowledge Graphs via Logical Reasoning
Conventional embedding-based models approach event time prediction in
temporal knowledge graphs (TKGs) as a ranking problem. However, they often fall
short in capturing essential temporal relationships such as order and distance.
In this paper, we propose TEILP, a logical reasoning framework that naturally
integrates such temporal elements into knowledge graph predictions. We first
convert TKGs into a temporal event knowledge graph (TEKG), which offers a more
explicit representation of time in terms of the nodes of the graph. The TEKG equips
us to develop a differentiable random walk approach to time prediction.
Finally, we introduce conditional probability density functions, associated
with the logical rules involving the query interval, using which we arrive at
the time prediction. We compare TEILP with state-of-the-art methods on five
benchmark datasets. We show that our model achieves a significant improvement
over baselines while providing interpretable explanations. In particular, we
consider several scenarios where training samples are limited, event types are
imbalanced, and forecasting the time of future events based on only past events
is desired. In all these cases, TEILP outperforms state-of-the-art methods in
terms of robustness.
Comment: AAAI24 (Oral)
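The final step described in this abstract, mixing rule-conditioned densities to score candidate times, can be sketched as follows. This is a deliberately simplified illustration: the Gaussian components, the fixed rule weights, and the grid search are assumptions, not TEILP's learned parameterization.

```python
# Simplified sketch: each grounded logical rule contributes a conditional
# probability density over candidate event times; densities are mixed by
# rule weights and the best-scoring candidate is returned.

import math

def gaussian_pdf(t, mean, std):
    return math.exp(-0.5 * ((t - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))

def predict_time(rules, candidates):
    """Score each candidate time under the rule-weighted density mixture."""
    total_w = sum(w for w, _, _ in rules)
    def density(t):
        return sum(w * gaussian_pdf(t, mu, sd) for w, mu, sd in rules) / total_w
    return max(candidates, key=density)

# Two grounded rules: (weight, density mean, density std) over event times.
rules = [(0.7, 12.0, 1.0),   # strong rule: event ~12 steps after anchor
         (0.3, 20.0, 4.0)]   # weaker, more diffuse rule
best = predict_time(rules, candidates=range(0, 31))
```

The prediction follows the heavily weighted, sharply peaked rule, which illustrates why modeling densities over intervals captures temporal distance in a way a pure ranking objective does not.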
Lemur: Harmonizing Natural Language and Code for Language Agents
We introduce Lemur and Lemur-Chat, openly accessible language models
optimized for both natural language and coding capabilities to serve as the
backbone of versatile language agents. The evolution from language chat models
to functional language agents demands that models not only master human
interaction, reasoning, and planning but also ensure grounding in the relevant
environments. This calls for a harmonious blend of language and coding
capabilities in the models. Lemur and Lemur-Chat are proposed to address this
necessity, demonstrating balanced proficiencies in both domains, unlike
existing open-source models that tend to specialize in either. Through
meticulous pre-training using a code-intensive corpus and instruction
fine-tuning on text and code data, our models achieve state-of-the-art averaged
performance across diverse text and coding benchmarks among open-source models.
Comprehensive experiments demonstrate Lemur's superiority over existing
open-source models and its proficiency across various agent tasks involving
human communication, tool usage, and interaction in fully and partially
observable environments. The harmonization between natural and programming
languages enables Lemur-Chat to significantly narrow the gap with proprietary
models on agent abilities, providing key insights into developing advanced
open-source agents adept at reasoning, planning, and operating seamlessly
across environments. https://github.com/OpenLemur/Lemu
Temporal Inductive Logic Reasoning
Inductive logic reasoning is one of the fundamental tasks on graphs, which
seeks to generalize patterns from the data. This task has been studied
extensively for traditional graph datasets such as knowledge graphs (KGs), with
representative techniques such as inductive logic programming (ILP). Existing
ILP methods typically assume learning from KGs with static facts and binary
relations. Beyond KGs, graph structures are widely present in other
applications such as video instructions, scene graphs and program executions.
While inductive logic reasoning is also beneficial for these applications,
applying ILP to the corresponding graphs is nontrivial: these graphs are more
complex than KGs, usually involving timestamps and n-ary relations, and are
effectively a type of hypergraph with temporal events.
In this work, we study two such applications and propose to represent them
as hypergraphs with time intervals. To reason over such graphs, we propose the
multi-start random B-walk that traverses the hypergraph. Combining it with a
path-consistency algorithm, we develop an efficient backward-chaining ILP
method that learns logic rules by generalizing from both the temporal and the
relational data.
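A traversal of the kind this abstract relies on can be sketched on a tiny temporal hypergraph: hyperedges are n-ary events with time intervals, and a walk only extends along events that respect temporal order. The data model and the deterministic enumeration below are illustrative assumptions; the paper's B-walk is randomized and multi-start.

```python
# Illustrative walk over a temporal hypergraph. Hyperedges are
# (relation, participating entities, time interval); a walk may only move
# through events that start no earlier than the previous event ended.

def walks(events, start_entity, max_len=2):
    """Enumerate temporally ordered relation paths from a start entity."""
    paths = []
    def extend(entity, t_min, path):
        if path:
            paths.append(tuple(path))
        if len(path) == max_len:
            return
        for rel, members, (t0, t1) in events:
            if entity in members and t0 >= t_min:   # enforce temporal order
                for nxt in members:
                    if nxt != entity:               # hop to any co-participant
                        extend(nxt, t1, path + [rel])
    extend(start_entity, float("-inf"), [])
    return paths

# Toy events: a meeting, then a three-party signing, then a departure.
events = [("meet",  ("a", "b"),      (0, 1)),
          ("sign",  ("b", "c", "d"), (2, 3)),
          ("leave", ("a",),          (5, 6))]
found = walks(events, "a")
```

Note that the n-ary `sign` event is handled naturally as a hyperedge: the walk can continue to any co-participant, which a binary-relation KG walk could not express directly.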
Hierarchical Spherical CNNs with Lifting-based Adaptive Wavelets for Pooling and Unpooling
Pooling and unpooling are two essential operations in constructing
hierarchical spherical convolutional neural networks (HS-CNNs) for
comprehensive feature learning in the spherical domain. Most existing models
employ downsampling-based pooling, which inevitably incurs information loss
and cannot adapt to different spherical signals and tasks. Moreover, the
preserved information after pooling cannot be well restored by the subsequent
unpooling to characterize the desirable features for a task. In this paper, we
propose a novel framework of HS-CNNs with a lifting structure to learn adaptive
spherical wavelets for pooling and unpooling, dubbed LiftHS-CNN, which ensures
a more efficient hierarchical feature learning for both image- and pixel-level
tasks. Specifically, adaptive spherical wavelets are learned with a lifting
structure that consists of trainable lifting operators (i.e., update and
predict operators). With this learnable lifting structure, we can adaptively
partition a signal into two sub-bands containing low- and high-frequency
components, respectively, and thus generate a better down-scaled representation
for pooling by preserving more information in the low-frequency sub-band. The
update and predict operators are parameterized with graph-based attention to
jointly consider the signal's characteristics and the underlying geometries. We
further show that the learned wavelets are guaranteed to satisfy particular
properties, ensuring spatial-frequency localization to better exploit the
signal's correlation in both the spatial and frequency domains. We then propose an unpooling
operation that is invertible to the lifting-based pooling, where an inverse
wavelet transform is performed by using the learned lifting operators to
restore an up-scaled representation. Extensive empirical evaluations on various
spherical domain tasks validate the superiority of the proposed LiftHS-CNN.
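The lifting idea described in this abstract can be illustrated in one dimension: split a signal into even and odd samples, predict the odd samples from the even ones (the high-frequency detail), then update the even samples with that detail (the low-frequency approximation kept for pooling). The fixed Haar-style predict/update steps below are a minimal sketch; in LiftHS-CNN these operators are trainable, graph-attention modules on the sphere.

```python
# One-dimensional lifting scheme: forward lifting yields a pooled
# (low-frequency) signal plus detail; inverse lifting restores the input
# exactly, which is the invertibility property the unpooling relies on.

def lift_pool(x):
    """Forward lifting: returns (approx, detail); approx is the pooled signal."""
    even, odd = x[0::2], x[1::2]
    detail = [o - e for o, e in zip(odd, even)]         # predict step
    approx = [e + d / 2 for e, d in zip(even, detail)]  # update step
    return approx, detail

def lift_unpool(approx, detail):
    """Inverse lifting: exactly restores the original signal."""
    even = [a - d / 2 for a, d in zip(approx, detail)]
    odd = [e + d for e, d in zip(even, detail)]
    out = []
    for e, o in zip(even, odd):
        out.extend([e, o])
    return out

x = [4.0, 6.0, 2.0, 8.0]
approx, detail = lift_pool(x)          # approx holds the pairwise means
restored = lift_unpool(approx, detail)
```

Because the update step folds the detail back into the even samples, the pooled signal preserves the local averages, and no information is lost across the pool/unpool pair, in contrast to plain downsampling.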
Task-Level Aware Scheduling of Energy-Constrained Applications on Heterogeneous Multi-Core System
Minimizing the schedule length of parallel applications that run on a
heterogeneous multi-core system under energy consumption constraints has
recently attracted much attention. The key point of this problem is the
strategy for pre-allocating energy to unscheduled tasks. Previous articles
used the minimum value, the average value, or a power-consumption-weighted
value as the pre-allocated energy of a task, but they all ignored task
levels. Tasks at different levels have a different impact on the overall
schedule length when they are allocated the same energy. Taking task levels
into account, we design a novel energy pre-allocation strategy that is
conducive to minimizing the schedule length and develop a new task scheduling
algorithm based on it. After obtaining the preliminary scheduling results, we
also propose a mechanism that re-adjusts the execution frequencies of tasks
to further reduce the overall schedule length. We carried out extensive
experiments with practical parallel application models. The results show that
our method achieves better performance than existing algorithms.
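The level-aware pre-allocation idea can be sketched as follows: instead of giving every unscheduled task the same minimum or average energy, weight each task's share of the remaining budget by its level in the task graph. The inverse-level weighting and all numbers below are illustrative assumptions, not the paper's exact strategy.

```python
# Hypothetical level-aware energy pre-allocation: tasks on earlier levels
# (further from the exit task) receive larger shares of the energy budget.

def preallocate(tasks, energy_budget):
    """tasks maps task name -> level (1 = entry level); returns energy shares."""
    max_level = max(tasks.values())
    # Weight inversely with level depth, so upstream tasks get larger shares.
    weights = {t: (max_level - lvl + 1) for t, lvl in tasks.items()}
    total = sum(weights.values())
    return {t: energy_budget * w / total for t, w in weights.items()}

# Toy DAG with three levels and a 60-unit energy budget.
tasks = {"t1": 1, "t2": 2, "t3": 2, "t4": 3}
alloc = preallocate(tasks, energy_budget=60.0)
```

The whole budget is distributed (shares sum to 60), and the entry-level task receives the largest share, which is the kind of level-dependent differentiation that uniform minimum- or average-value pre-allocation cannot express.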