92 research outputs found
Password Cracking and Countermeasures in Computer Security: A Survey
With the rapid development of internet technologies, social networks, and other
related areas, user authentication is becoming increasingly important for
protecting users' data. Password authentication is one of the most widely used
methods for authenticating legitimate users and defending against intruders.
Many password cracking methods have been developed over the years, and
countermeasures against password cracking have been designed all along.
However, we find that little survey work has been done on password cracking
research. This paper gives a brief review of password cracking methods, the
important technologies used in password cracking, and the countermeasures
against password cracking, which are usually designed at two stages: the
password design stage (e.g. user education, dynamic passwords, use of tokens,
computer-generated passwords) and after the design (e.g. reactive password
checking, proactive password checking, password encryption, access control).
The main objective of this work is to offer novice IT security professionals
and general audiences some knowledge about computer security and password
cracking, and to promote the development of this area.
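As a concrete illustration of one of the post-design countermeasures listed above, proactive password checking, here is a minimal Python sketch. The length threshold, character-class rules, and the tiny blocklist are illustrative assumptions, not values recommended by the survey.

```python
# A minimal sketch of a proactive password checker: a candidate password is
# screened against simple rules before it is ever accepted. All thresholds
# and the blocklist below are hypothetical.
import re

BLOCKLIST = {"password", "123456", "qwerty", "letmein"}  # hypothetical wordlist

def check_password(candidate: str) -> list[str]:
    """Return the reasons the candidate should be rejected (empty list = accept)."""
    problems = []
    if len(candidate) < 10:
        problems.append("shorter than 10 characters")
    if candidate.lower() in BLOCKLIST:
        problems.append("appears in a common-password list")
    if not re.search(r"[a-z]", candidate):
        problems.append("no lowercase letter")
    if not re.search(r"[A-Z]", candidate):
        problems.append("no uppercase letter")
    if not re.search(r"\d", candidate):
        problems.append("no digit")
    return problems

print(check_password("Tr0ub4dor&Three"))  # [] -> accepted
print(check_password("password"))         # several reasons -> rejected
```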
Towards Bidirectional Hierarchical Representations for Attention-Based Neural Machine Translation
This paper proposes a hierarchical attentional neural translation model which
focuses on enhancing source-side hierarchical representations by covering both
local and global semantic information using a bidirectional tree-based encoder.
To maximize the predictive likelihood of target words, a weighted variant of an
attention mechanism is used to balance the attentive information between
lexical and phrase vectors. Using a tree-based rare word encoding, the proposed
model is extended to sub-word level to alleviate the out-of-vocabulary (OOV)
problem. Empirical results reveal that the proposed model significantly
outperforms sequence-to-sequence attention-based and tree-based neural
translation models in English-Chinese translation tasks.
Comment: Accepted for publication at EMNLP 201
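To make the weighted attention variant more concrete, the sketch below computes separate attentive contexts over lexical (word-level) and phrase (tree-node) annotations and balances them with a gate. The shapes, dot-product scoring, and sigmoid gate are assumptions for illustration, not the paper's exact parameterization.

```python
# A minimal numpy sketch of balancing attentive information between lexical
# and phrase vectors with a gated combination. Shapes and the gating form
# are illustrative assumptions.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def weighted_attention(query, lex_states, phr_states, w_gate):
    """query: (d,); lex_states: (n_words, d); phr_states: (n_phrases, d); w_gate: (2*d,)."""
    ctx_lex = softmax(lex_states @ query) @ lex_states   # lexical context (d,)
    ctx_phr = softmax(phr_states @ query) @ phr_states   # phrase context  (d,)
    gate = 1.0 / (1.0 + np.exp(-w_gate @ np.concatenate([ctx_lex, ctx_phr])))
    return gate * ctx_lex + (1.0 - gate) * ctx_phr       # balanced source context

d = 8
rng = np.random.default_rng(0)
ctx = weighted_attention(rng.normal(size=d), rng.normal(size=(5, d)),
                         rng.normal(size=(3, d)), rng.normal(size=2 * d))
print(ctx.shape)  # (8,)
```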
Unsupervised Chunking Based on Graph Propagation from Bilingual Corpus
This paper presents a novel approach to unsupervised shallow parsing (chunking), trained on the unannotated Chinese text of a parallel Chinese-English corpus. In this approach, no annotated information on the Chinese side is used. Graph-based label propagation is exploited for bilingual knowledge transfer, and the projected labels are then used as features in the unsupervised model, which contributes to better performance. Experimental comparisons with state-of-the-art algorithms show that the proposed approach achieves considerably higher accuracy in terms of F-score.
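The sketch below illustrates the general idea of graph-based label propagation used for bilingual knowledge transfer: labels projected from the English side seed some vertices, and the labels diffuse to the remaining vertices over a similarity graph. The toy graph, uniform edge weights, and fixed iteration count are assumptions for illustration only.

```python
# A minimal sketch of graph-based label propagation: seed nodes are clamped
# to their projected labels and other nodes repeatedly average their
# neighbours' label distributions.
import numpy as np

def propagate(adjacency, seed_labels, n_labels, iters=20):
    """adjacency: (n, n) similarity weights; seed_labels: dict {node: label id}."""
    n = adjacency.shape[0]
    dist = np.full((n, n_labels), 1.0 / n_labels)        # uniform start
    for node, lab in seed_labels.items():
        dist[node] = np.eye(n_labels)[lab]               # clamp seed nodes
    row_norm = adjacency / adjacency.sum(axis=1, keepdims=True)
    for _ in range(iters):
        dist = row_norm @ dist                           # average neighbours
        for node, lab in seed_labels.items():
            dist[node] = np.eye(n_labels)[lab]           # re-clamp seeds
    return dist.argmax(axis=1)

adj = np.array([[0, 1, 1, 0],
                [1, 0, 1, 0],
                [1, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
print(propagate(adj, seed_labels={0: 0, 3: 1}, n_labels=2))  # [0 0 0 1]
```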
Assessing the Ability of Self-Attention Networks to Learn Word Order
Self-attention networks (SAN) have attracted a lot of interest due to their
high parallelization and strong performance on a variety of NLP tasks, e.g.
machine translation. Because they lack the recurrence structure of recurrent
neural networks (RNN), SANs are commonly assumed to be weak at learning
positional information of words for sequence modeling. However, this
speculation has neither been empirically confirmed, nor have explanations been
explored for their strong performance on machine translation tasks despite
this "lack of positional information". To this end, we propose a novel word
reordering detection task to quantify how well word order information is
learned by SAN and RNN. Specifically, we randomly move one word to another
position and examine whether a trained model can detect both the original and
inserted positions. Experimental results reveal that: 1) SAN trained on word
reordering detection indeed has difficulty learning positional information,
even with position embeddings; and 2) SAN trained on machine translation
learns better positional information than its RNN counterpart, in which
position embeddings play a critical role. Although the recurrence structure
makes the model more universally effective at learning word order, learning
objectives matter more in downstream tasks such as machine translation.
Comment: ACL 201
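The following sketch shows one way a word reordering detection instance could be constructed from the description above: a random word is moved to another position, and the original and inserted positions are recorded as labels. The exact sampling and label format used in the paper may differ.

```python
# A minimal sketch of building one reordering-detection example: remove a
# random word and re-insert it at a different position.
import random

def make_reordering_instance(tokens, rng=random):
    """Move one random word; return (perturbed tokens, original position, inserted position)."""
    src = rng.randrange(len(tokens))                      # where the word was
    remaining = tokens[:src] + tokens[src + 1:]
    # choose an insertion slot in the shortened sentence that actually moves the word
    dst = rng.choice([i for i in range(len(remaining) + 1) if i != src])
    perturbed = remaining[:dst] + [tokens[src]] + remaining[dst:]
    return perturbed, src, dst

random.seed(1)
sent = "the quick brown fox jumps over the lazy dog".split()
perturbed, orig_pos, new_pos = make_reordering_instance(sent)
print(" ".join(perturbed), "| original:", orig_pos, "| inserted:", new_pos)
```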
Human-in-the-loop Machine Translation with Large Language Model
The large language model (LLM) has garnered significant attention due to its
in-context learning mechanisms and emergent capabilities. The research
community has conducted several pilot studies to apply LLMs to machine
translation tasks and evaluate their performance from diverse perspectives.
However, previous research has primarily focused on the LLM itself and has not
explored human intervention in the inference process of LLMs. The
characteristics of LLM, such as in-context learning and prompt engineering,
closely mirror human cognitive abilities in language tasks, offering an
intuitive solution for human-in-the-loop generation. In this study, we propose
a human-in-the-loop pipeline that guides LLMs to produce customized outputs
with revision instructions. The pipeline initiates by prompting the LLM to
produce a draft translation, followed by the utilization of automatic retrieval
or human feedback as supervision signals to enhance the LLM's translation
through in-context learning. The human-machine interactions generated in this
pipeline are also stored in an external database to expand the in-context
retrieval database, enabling us to leverage human supervision in an offline
setting. We evaluate the proposed pipeline using GPT-3.5-turbo API on five
domain-specific benchmarks for German-English translation. The results
demonstrate the effectiveness of the pipeline in tailoring in-domain
translations and improving translation performance compared to direct
translation. Additionally, we discuss the results from the following
perspectives: 1) the effectiveness of different in-context retrieval methods;
2) the construction of a retrieval database under low-resource scenarios; 3)
the observed domain differences; 4) the quantitative analysis of linguistic
statistics; and 5) the qualitative analysis of translation cases. The code and
data are available at https://github.com/NLP2CT/HIL-MT/.
Comment: Accepted to MT Summit 202
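A minimal sketch of the pipeline described above follows: a draft translation is produced first, then retrieved demonstrations and an optional human revision instruction are supplied as in-context supervision, and the resulting interaction is stored to grow the retrieval database. `call_llm` is a placeholder for a chat-completion API (the paper uses the GPT-3.5-turbo API); the prompt wording and the similarity measure are illustrative assumptions.

```python
# A minimal sketch of a human-in-the-loop translation loop with in-context
# retrieval. call_llm is a placeholder; prompts and retrieval are assumptions.
from difflib import SequenceMatcher

def call_llm(prompt: str) -> str:
    """Placeholder: plug in the chat-completion API of your choice here."""
    raise NotImplementedError

retrieval_db: list[dict] = []        # past interactions, used for in-context retrieval

def retrieve(source: str, k: int = 2) -> list[dict]:
    """Return the k stored interactions whose source side is most similar to `source`."""
    scored = sorted(retrieval_db,
                    key=lambda r: SequenceMatcher(None, r["source"], source).ratio(),
                    reverse=True)
    return scored[:k]

def translate(source: str, human_feedback: str | None = None) -> str:
    draft = call_llm(f"Translate the following German sentence into English:\n{source}")
    demos = "\n".join(f"{r['source']} => {r['revised']}" for r in retrieve(source))
    prompt = (f"Demonstrations:\n{demos}\n"
              f"Source: {source}\nDraft translation: {draft}\n")
    if human_feedback:                                   # optional human revision instruction
        prompt += f"Revision instruction: {human_feedback}\n"
    prompt += "Revised translation:"
    revised = call_llm(prompt)
    retrieval_db.append({"source": source, "revised": revised})   # grow the database
    return revised
```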
Context-Aware Self-Attention Networks
Self-attention models have shown their flexibility in parallel computation and
their effectiveness in modeling both long- and short-term dependencies.
However, they calculate the dependencies between representations without
considering contextual information, which has proven useful for modeling
dependencies among neural representations in various natural language tasks.
In this work, we focus on improving self-attention networks by capturing the
richness of context. To maintain the simplicity and flexibility of
self-attention networks, we propose to contextualize the transformations of
the query and key layers, which are used to calculate the relevance between
elements. Specifically, we leverage the internal representations that embed
both global and deep contexts, thus avoiding reliance on external resources.
Experimental results on WMT14 English-German and WMT17 Chinese-English
translation tasks demonstrate the effectiveness and universality of the
proposed methods. Furthermore, we conducted extensive analyses to quantify how
the context vectors participate in the self-attention model.
Comment: AAAI 201
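One simple way to contextualize the query and key transformations with an internal, mean-pooled global context is sketched below. The fixed scalar gates stand in for the learned gating described above, and the shapes and scoring are assumptions for illustration.

```python
# A minimal numpy sketch of context-aware self-attention: queries and keys
# are mixed with a global context vector derived from the layer input itself.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def context_aware_attention(X, Wq, Wk, Wv, Wc, lam_q=0.5, lam_k=0.5):
    """X: (n, d) layer input; W*: (d, d) projections; returns (n, d) outputs."""
    c = X.mean(axis=0, keepdims=True) @ Wc        # global context (1, d)
    Q = (1 - lam_q) * (X @ Wq) + lam_q * c        # contextualized queries
    K = (1 - lam_k) * (X @ Wk) + lam_k * c        # contextualized keys
    V = X @ Wv
    scores = Q @ K.T / np.sqrt(X.shape[1])
    return softmax(scores) @ V

rng = np.random.default_rng(0)
n, d = 5, 8
out = context_aware_attention(rng.normal(size=(n, d)),
                              *[rng.normal(size=(d, d)) for _ in range(4)])
print(out.shape)  # (5, 8)
```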