GANN: Graph Alignment Neural Network for Semi-Supervised Learning
Graph neural networks (GNNs) have been widely investigated in the field of
semi-supervised graph machine learning. Most methods fail to exploit adequate
graph information when labeled data is limited, leading to the problem of
oversmoothing. To overcome this issue, we propose the Graph Alignment Neural
Network (GANN), a simple and effective graph neural architecture. A unique
learning algorithm with three alignment rules is proposed to thoroughly exploit
hidden information when labels are scarce. First, to better capture
attribute specifics, we propose the feature alignment rule, which aligns the inner
products of the attribute and embedding matrices. Second, to properly
utilize the higher-order neighbor information, we propose the cluster center
alignment rule, which involves aligning the inner product of the cluster center
matrix with the unit matrix. Finally, to get reliable prediction results with
few labels, we establish the minimum entropy alignment rule by lining up the
prediction probability matrix with its sharpened result. Extensive studies on
graph benchmark datasets demonstrate that GANN can achieve considerable
benefits in semi-supervised node classification and outperform state-of-the-art
competitors.
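The three alignment rules described above can be sketched as loss terms. This is a minimal NumPy illustration, not the authors' implementation: the abstract does not state the distance measure or the sharpening function, so the mean squared error and the temperature-based `sharpen` below are assumptions, as are all function and parameter names. "Unit matrix" is read here as the identity matrix.

```python
import numpy as np

def feature_alignment_loss(X, Z):
    """Gap between the attribute Gram matrix X X^T and the embedding Gram matrix Z Z^T.

    Assumption: the gap is measured with mean squared error.
    """
    return float(np.mean((X @ X.T - Z @ Z.T) ** 2))

def cluster_center_alignment_loss(C):
    """Gap between C C^T and the identity, where each row of C is one cluster center.

    Assumption: mean squared error; a zero loss means the centers are orthonormal.
    """
    k = C.shape[0]
    return float(np.mean((C @ C.T - np.eye(k)) ** 2))

def sharpen(P, temperature=0.5):
    """Temperature sharpening of row-wise class probabilities (assumed form)."""
    powered = P ** (1.0 / temperature)
    return powered / powered.sum(axis=1, keepdims=True)

def min_entropy_alignment_loss(P, temperature=0.5):
    """Gap between the prediction matrix P and its sharpened version."""
    return float(np.mean((P - sharpen(P, temperature)) ** 2))
```

Under these assumptions, orthonormal cluster centers and one-hot predictions both give a loss of zero, which is the behaviour the cluster center and minimum entropy rules appear to aim for.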
Revisiting Initializing Then Refining: An Incomplete and Missing Graph Imputation Network
With the development of various applications, such as social networks and
knowledge graphs, graph data has become ubiquitous in the real world.
Unfortunately, graph data are often partially absent because of
privacy-protection policies or copyright restrictions during data collection.
The absence of graph data can be roughly categorized into attribute-incomplete
and attribute-missing circumstances. Specifically, attribute-incomplete
indicates that parts of the attribute vectors of all nodes are missing,
while attribute-missing indicates that the entire attribute vectors of some
nodes are missing. Although much effort has been devoted to this problem, no
existing method is designed for the common situation where both types of graph
data absence exist simultaneously. To fill this gap, we develop a novel network termed
Revisiting Initializing Then Refining (RITR), where we complete both
attribute-incomplete and attribute-missing samples under the guidance of a
novel initializing-then-refining imputation criterion. Specifically, to
complete attribute-incomplete samples, we first initialize the incomplete
attributes using Gaussian noise before network learning, and then introduce a
structure-attribute consistency constraint to refine incomplete values by
approximating a structure-attribute correlation matrix to a high-order
structural matrix. To complete attribute-missing samples, we first adopt
structure embeddings of attribute-missing samples as the embedding
initialization, and then refine these initial values by adaptively aggregating
the reliable information of attribute-incomplete samples according to a dynamic
affinity structure. To the best of our knowledge, this newly designed method is
the first unsupervised framework dedicated to handling hybrid-absent graphs.
Extensive experiments on four datasets have verified that our method
consistently outperforms existing state-of-the-art competitors.
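The initialization step for attribute-incomplete samples can be illustrated with a short sketch. It covers only the "initialize with Gaussian noise" part; the refinement through the structure-attribute consistency constraint is not shown. The function name, the boolean `observed_mask` convention, and the zero-mean, unit-variance noise are assumptions made for illustration, not details given in the abstract.

```python
import numpy as np

def init_incomplete_attributes(X, observed_mask, seed=0):
    """Keep observed attribute entries and fill unobserved ones with Gaussian noise.

    X             : (num_nodes, num_features) attribute matrix
    observed_mask : boolean array of the same shape, True where the entry is known
    Assumption: the noise is drawn from a standard normal distribution.
    """
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(X.shape)
    return np.where(observed_mask, X, noise)
```

A network would then treat the noise-filled entries as starting values to be refined during training, while the observed entries stay fixed at their known values.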
Is ChatGPT A Good Translator? Yes With GPT-4 As The Engine
This report provides a preliminary evaluation of ChatGPT for machine
translation, including translation prompt, multilingual translation, and
translation robustness. We adopt the prompts advised by ChatGPT to trigger its
translation ability and find that the candidate prompts generally work well
with minor performance differences. By evaluating on a number of benchmark test
sets, we find that ChatGPT performs competitively with commercial translation
products (e.g., Google Translate) on high-resource European languages but lags
behind significantly on low-resource or distant languages. As for the
translation robustness, ChatGPT does not perform as well as the commercial
systems on biomedical abstracts or Reddit comments but exhibits good results on
spoken language. Further, we explore an interesting strategy
for distant languages, which asks ChatGPT to
translate the source sentence into a high-resource pivot language before
translating it into
the target language, improving the translation performance noticeably. With the
launch of the GPT-4 engine, the translation performance of ChatGPT is
significantly boosted, becoming comparable to commercial translation products,
even for distant languages. Human analysis on Google Translate and ChatGPT
suggests that ChatGPT with GPT-3.5 tends to generate more hallucinations and
mis-translation errors, while ChatGPT with GPT-4 makes the fewest errors. In other
words, ChatGPT has already become a good translator. Please refer to our GitHub
project for more details:
https://github.com/wxjiao/Is-ChatGPT-A-Good-Translator
Comment: Analyzed/compared the outputs between ChatGPT and Google Translate;
both automatic and human evaluation
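The pivot-language strategy described in the abstract above can be sketched as a prompt template. The wording of both templates, the function names, and the default pivot of English are hypothetical; the abstract only says that the model is asked to translate into a high-resource pivot language before the target language.

```python
def direct_prompt(text, source, target):
    """Hypothetical baseline prompt: translate straight from source to target."""
    return f"Please translate the following {source} sentence into {target}: {text}"

def pivot_prompt(text, source, target, pivot="English"):
    """Hypothetical pivot prompt: ask for the pivot language first, then the target.

    Assumption: English is used as the high-resource pivot by default.
    """
    return (
        f"Please translate the following {source} sentence into {pivot} first, "
        f"and then into {target}: {text}"
    )
```

Either string would be sent to the chat model as the user message; comparing the two outputs on a distant language pair is one way to examine the effect the abstract reports.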
GPT-4 Is Too Smart To Be Safe: Stealthy Chat with LLMs via Cipher
Safety lies at the core of the development of Large Language Models (LLMs).
There is ample work on aligning LLMs with human ethics and preferences,
including data filtering in pretraining, supervised fine-tuning, reinforcement
learning from human feedback, and red teaming. In this study, we discover
that chat in cipher can bypass the safety alignment techniques of LLMs, which
are mainly conducted in natural languages. We propose a novel framework
CipherChat to systematically examine the generalizability of safety alignment
to non-natural languages -- ciphers. CipherChat enables humans to chat with
LLMs through cipher prompts topped with system role descriptions and few-shot
enciphered demonstrations. We use CipherChat to assess state-of-the-art LLMs,
including ChatGPT and GPT-4, on different representative human ciphers across
11 safety domains in both English and Chinese. Experimental results show that
certain ciphers succeed almost 100% of the time in bypassing the safety alignment
of GPT-4 in several safety domains, demonstrating the necessity of developing
safety alignment for non-natural languages. Notably, we identify that LLMs seem
to have a "secret cipher", and propose a novel SelfCipher that uses only role
play and several demonstrations in natural language to evoke this capability.
SelfCipher surprisingly outperforms existing human ciphers in almost all cases.
Our code and data will be released at https://github.com/RobustNLP/CipherChat.
Comment: 13 pages, 4 figures, 9 tables
Self-Supervised Temporal Graph learning with Temporal and Structural Intensity Alignment
Temporal graph learning aims to generate high-quality representations for
graph-based tasks along with dynamic information, which has recently drawn
increasing attention. Unlike the static graph, a temporal graph is usually
organized in the form of node interaction sequences over continuous time
instead of an adjacency matrix. Most temporal graph learning methods model
current interactions by combining historical information over time. However,
such methods merely consider the first-order temporal information while
ignoring the important high-order structural information, leading to
sub-optimal performance. To solve this issue, by extracting both temporal and
structural information to learn more informative node representations, we
propose a self-supervised method termed S2T for temporal graph learning. Note
that the first-order temporal information and the high-order structural
information are combined in different ways by the initial node representations
to calculate two conditional intensities, respectively. Then the alignment loss
is introduced to optimize the node representations to be more informative by
narrowing the gap between the two intensities. Concretely, besides modeling
temporal information using historical neighbor sequences, we further consider
the structural information from both local and global levels. At the local
level, we generate structural intensity by aggregating features from the
high-order neighbor sequences. At the global level, a global representation is
generated based on all nodes to adjust the structural intensity according to
the active statuses on different nodes. Extensive experiments demonstrate that
the proposed method S2T achieves up to 10.13% performance improvement
over state-of-the-art competitors on several datasets.
Not All Countries Celebrate Thanksgiving: On the Cultural Dominance in Large Language Models
In this paper, we identify a cultural dominance issue within large language
models (LLMs), e.g. ChatGPT, due to the predominant use of English data in
model training. LLMs often provide inappropriate English-culture-related
answers that are not relevant to the expected culture when users ask in
non-English languages. To systematically evaluate the cultural dominance issue,
we build a benchmark that consists of both concrete (e.g. holidays and songs)
and abstract (e.g. values and opinions) cultural objects. Empirical results
show that the representative GPT models suffer from the cultural dominance
problem, where GPT-4 is the most affected while text-davinci-003 suffers the
least from this problem. Our study emphasizes the need for critical examination
of cultural dominance and for ethical consideration in the development and
deployment of LLMs. We show that two straightforward methods, one in model
development (i.e. pretraining on more diverse data) and one in deployment
(e.g. culture-aware prompting), can significantly mitigate the cultural
dominance issue in LLMs.