
    Self-adaptive GA, quantitative semantic similarity measures and ontology-based text clustering

    Because common clustering algorithms use the vector space model (VSM) to represent documents, they ignore the conceptual relationships between related terms that do not literally co-occur. To overcome this problem, this article proposes a genetic algorithm-based clustering technique, named GA clustering, used in conjunction with an ontology. In general, ontology measures can be partitioned into two categories: thesaurus-based methods and corpus-based methods. We take advantage of the hierarchical structure and broad-coverage taxonomy of WordNet as the thesaurus-based ontology. The corpus-based method, however, is rather complicated to handle in practical applications, so we propose a transformed latent semantic analysis (LSA) model as the corpus-based method in this paper. Moreover, two hybrid strategies, which combine the various similarity measures, are implemented in the clustering experiments. The results show that our GA clustering algorithm, in conjunction with the thesaurus-based and the LSA-based methods, evidently outperforms the same algorithm with other similarity measures. Moreover, improvements in clustering performance demonstrate the superiority of the proposed GA clustering algorithm over the commonly used k-means algorithm and the standard GA.
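The VSM limitation described in this abstract can be illustrated with a minimal sketch. It is not the authors' method: the function name `cosine_similarity`, the whitespace tokenization, and the example phrases are all assumptions made for illustration. It only shows that a plain term-frequency cosine similarity is zero for two related documents that share no literal terms.

```python
import math
from collections import Counter


def cosine_similarity(doc_a: str, doc_b: str) -> float:
    """Cosine similarity between raw term-frequency vectors (hypothetical helper)."""
    # Build bag-of-words vectors with naive whitespace tokenization (an assumption).
    vec_a = Counter(doc_a.lower().split())
    vec_b = Counter(doc_b.lower().split())
    # Only terms present in both documents contribute to the dot product.
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Related meaning, but no shared surface terms: the VSM sees no similarity.
related = cosine_similarity("car engine repair", "automobile motor maintenance")
# Two of three terms shared literally: the VSM sees high similarity.
overlap = cosine_similarity("car engine repair", "car engine noise")
print(related, overlap)
```

An ontology-based or LSA-based measure of the kind the abstract mentions would be expected to score the first pair above zero; how the paper computes such scores is not specified in the abstract.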

    Verbal Learning and Memory After Cochlear Implantation in Postlingually Deaf Adults: Some New Findings with the CVLT-II

    OBJECTIVES: Despite the importance of verbal learning and memory in speech and language processing, this domain of cognitive functioning has been virtually ignored in clinical studies of hearing loss and cochlear implants in both adults and children. In this article, we report the results of two studies that used a newly developed visually based version of the California Verbal Learning Test-Second Edition (CVLT-II), a well-known normed neuropsychological measure of verbal learning and memory. DESIGN: The first study established the validity and feasibility of a computer-controlled visual version of the CVLT-II, which eliminates the effects of audibility of spoken stimuli, in groups of young normal-hearing and older normal-hearing (ONH) adults. A second study was then carried out using the visual CVLT-II format with a group of older postlingually deaf experienced cochlear implant (ECI) users (N = 25) and a group of ONH controls (N = 25) who were matched to ECI users for age, socioeconomic status, and nonverbal IQ. In addition to the visual CVLT-II, subjects provided data on demographics, hearing history, nonverbal IQ, reading fluency, vocabulary, and short-term memory span for visually presented digits. ECI participants were also tested for speech recognition in quiet. RESULTS: The ECI and ONH groups did not differ on most measures of verbal learning and memory obtained with the visual CVLT-II, but deficits were identified in ECI participants that were related to recency recall, the buildup of proactive interference, and retrieval-induced forgetting. Within the ECI group, nonverbal fluid IQ, reading fluency, and resistance to the buildup of proactive interference from the CVLT-II consistently predicted better speech recognition outcomes. CONCLUSIONS: Results from this study suggest that several underlying foundational neurocognitive abilities are related to core speech perception outcomes after implantation in older adults. Implications of these findings for explaining individual differences and variability and predicting speech recognition outcomes after implantation are discussed.

    Log-log Convexity of Type-Token Growth in Zipf's Systems

    It is traditionally assumed that Zipf's law implies power-law growth of the number of different elements with the total number of elements in a system, the so-called Heaps' law. We show that a careful definition of Zipf's law leads to the violation of Heaps' law in random systems, and we obtain alternative growth curves. These curves fulfill universal data collapses that depend only on the value of the Zipf exponent. We observe that real books behave in very much the same way as random systems, despite the presence of burstiness in word occurrence. We advance an explanation for this unexpected correspondence.
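A type-token growth curve for a random system can be sketched as follows. This is only an illustrative simulation, not the paper's construction: the finite vocabulary size, the exponent value, the seed, and the names `type_token_curve` and `local_slope` are all assumptions. Tokens are drawn independently with probability proportional to rank to the power of minus the Zipf exponent, and the number of distinct types is recorded after each token.

```python
import math
import random
from itertools import accumulate


def type_token_curve(n_tokens, vocab_size=50_000, zipf_exponent=1.0, seed=0):
    """Distinct-type count after each token in an i.i.d. Zipf-distributed sample.

    The finite vocabulary is an assumption of this sketch; it truncates the
    Zipf distribution at rank `vocab_size`.
    """
    rng = random.Random(seed)
    # Unnormalized Zipf weights: p(rank) proportional to rank ** (-exponent).
    weights = [rank ** -zipf_exponent for rank in range(1, vocab_size + 1)]
    cumulative = list(accumulate(weights))
    tokens = rng.choices(range(vocab_size), cum_weights=cumulative, k=n_tokens)
    seen = set()
    curve = []
    for token in tokens:
        seen.add(token)
        curve.append(len(seen))  # number of types after this many tokens
    return curve


def local_slope(curve, n1, n2):
    """Slope of the curve between n1 and n2 tokens on log-log axes."""
    return (math.log(curve[n2 - 1]) - math.log(curve[n1 - 1])) / (
        math.log(n2) - math.log(n1)
    )


curve = type_token_curve(20_000)
# A pure Heaps' law would give the same log-log slope over every range.
early = local_slope(curve, 100, 1_000)
late = local_slope(curve, 1_000, 20_000)
print(early, late)
```

Comparing `early` and `late` gives a rough check of whether growth follows a single power law over the sampled range; with a truncated vocabulary the curve must eventually saturate, so any deviation seen here should not be read as reproducing the paper's result.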