1,467 research outputs found

    A Hybrid Model for Sense Guessing of Chinese Unknown Words

    PACLIC 23 / City University of Hong Kong / 3-5 December 200

    A Survey on Password Guessing

    Text passwords have so far been the most popular method of user authentication and are unlikely to be fully replaced in the foreseeable future. Password authentication offers several desirable properties (e.g., low cost, high availability, ease of implementation, reusability). However, it suffers from a critical security issue caused mainly by humans' inability to memorize complicated strings. Users tend to choose easy-to-remember passwords, which are not uniformly distributed in the key space. Thus, user-selected passwords are susceptible to guessing attacks. To encourage and support users in choosing strong passwords, it is necessary to simulate automated password guessing methods to determine passwords' strength and identify weak passwords. A large number of password guessing models have been proposed in the literature. However, little attention has been paid to providing a systematic survey, which is necessary to review state-of-the-art approaches, identify gaps, and avoid duplicate studies. Motivated by this, we conduct a comprehensive survey of all password guessing studies presented in the literature from 1979 to 2022. We propose a generic methodology map to present an overview of existing methods. Then, we explain each representative approach in detail. We summarize the experimental procedures and available datasets used to evaluate password guessing models, and compare the reported performance of representative studies. Finally, we discuss current limitations and open problems as future research directions. We believe that this survey will be helpful to both experts and newcomers who are interested in password security.
    Comment: 35 pages, 5 figures, 5 tables
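The idea of simulating a guessing attack to rate password strength can be illustrated with a minimal sketch. It is not one of the models from the survey: it assumes a simple frequency-ordered dictionary attack, and the function names `build_guess_order` and `guess_number` and the toy training list are hypothetical.

```python
from collections import Counter


def build_guess_order(training_passwords):
    """Order candidate guesses from most to least frequent in the training list."""
    counts = Counter(training_passwords)
    # Sort by descending frequency; break ties alphabetically so the order is deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [password for password, _ in ranked]


def guess_number(password, guess_order):
    """Return the 1-based number of guesses needed, or None if the attack never tries it."""
    try:
        return guess_order.index(password) + 1
    except ValueError:
        return None


# Hypothetical toy training list; a real evaluation would use a much larger password dataset.
training = ["123456", "password", "123456", "qwerty", "123456", "password"]
order = build_guess_order(training)

print(guess_number("123456", order))
print(guess_number("Tr0ub4dor&3", order))
```

A low guess number marks a password as weak under this attacker model. A result of `None` means only that this particular dictionary never tries the password, not that the password is strong in general.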

    Korean Language Resources for Everyone

    Compilation of Malay criminological terms from online news

    A Malay language corpus has been established by Malaysia's Institute of Language and Literature (Dewan Bahasa dan Pustaka, DBP). Most past research on the Malay language corpus has focused on the description, lexicography, and translation of the Malay language. However, the existing literature contains no list of Malay words that categorizes crime terminology. This study aims to fill that linguistic gap. First, we aggregated the most frequently used crime terms from Malaysian online news sources, compiling five hundred crime-related words. No automated tools were used in the initial process, but they were subsequently used to verify the data. Four human coders validated the data to ensure that the original semantic meaning of the Malay text was preserved. Finally, major crime terms were outlined from a set of keywords to serve as taggers in our solution. The ultimate goal of this study is to provide a corpus for forensic linguistics, police investigations, and general crime research. This study has established the first corpus of criminological text in the Malay language.

    Hybrid tag-set for natural language processing.

    Leung Wai Kwong. Thesis (M.Phil.)--Chinese University of Hong Kong, 1999. Includes bibliographical references (leaves 90-95). Abstracts in English and Chinese.

    Chapter 1. Introduction
        1.1 Motivation
        1.2 Objective
        1.3 Organization of thesis
    Chapter 2. Background
        2.1 Chinese Noun Phrases Parsing
        2.2 Chinese Noun Phrases
        2.3 Problems with Syntactic Parsing
            2.3.1 Conjunctive Noun Phrases
            2.3.2 De-de Noun Phrases
            2.3.3 Compound Noun Phrases
        2.4 Observations
            2.4.1 Inadequacy in Part-of-Speech Categorization for Chinese NLP
            2.4.2 The Need of Semantic in Noun Phrase Parsing
        2.5 Summary
    Chapter 3. Hybrid Tag-set
        3.1 Objectives
            3.1.1 Resolving Parsing Ambiguities
            3.1.2 Investigation of Nominal Compound Noun Phrases
        3.2 Definition of Hybrid Tag-set
        3.3 Introduction to Cilin
        3.4 Problems with Cilin
            3.4.1 Unknown words
            3.4.2 Multiple Semantic Classes
        3.5 Introduction to Chinese Word Formation
            3.5.1 Disyllabic Word Formation
            3.5.2 Polysyllabic Word Formation
            3.5.3 Observation
        3.6 Automatic Assignment of Hybrid Tag to Chinese Word
        3.7 Summary
    Chapter 4. Automatic Semantic Assignment
        4.1 Previous Researches on Semantic Tagging
        4.2 SAUW - Automatic Semantic Assignment of Unknown Words
            4.2.1 POS-to-SC Association (Process 1)
            4.2.2 Morphology-based Deduction (Process 2)
            4.2.3 Di-syllabic Word Analysis (Process 3 and 4)
            4.2.4 Poly-syllabic Word Analysis (Process 5)
        4.3 Illustrative Examples
        4.4 Evaluation and Analysis
            4.4.1 Experiments
            4.4.2 Error Analysis
        4.5 Summary
    Chapter 5. Word Sense Disambiguation
        5.1 Introduction to Word Sense Disambiguation
        5.2 Previous Works on Word Sense Disambiguation
            5.2.1 Linguistic-based Approaches
            5.2.2 Corpus-based Approaches
        5.3 Our Approach
            5.3.1 Bi-gram Co-occurrence Probabilities
            5.3.2 Tri-gram Co-occurrence Probabilities
            5.3.3 Design consideration
            5.3.4 Error Analysis
        5.4 Summary
    Chapter 6. Hybrid Tag-set for Chinese Noun Phrase Parsing
        6.1 Resolving Ambiguous Noun Phrases
            6.1.1 Experiment
            6.1.2 Results
        6.2 Summary
    Chapter 7. Conclusion
        7.1 Summary
        7.2 Difficulties Encountered
            7.2.1 Lack of Training Corpus
            7.2.2 Features of Chinese word formation
            7.2.3 Problems with linguistic sources
        7.3 Contributions
            7.3.1 Enrichment to the Cilin
            7.3.2 Enhancement in syntactic parsing
        7.4 Further Researches
            7.4.1 Investigation into words that undergo semantic changes
            7.4.2 Incorporation of more information into the hybrid tag-set
    Chapter A. POS Tag-set by Tsinghua University (清華大學)
    Chapter B. Morphological Rules
    Chapter C. Syntactic Rules for Di-syllabic Words Formation