527 research outputs found
Concept Extraction and Clustering for Topic Digital Library Construction
This paper introduces a new approach to building a topic
digital library using concept extraction and document
clustering. First, documents in a specific domain are
automatically obtained using a document classification
approach. Then, the keywords of each document are extracted
using a machine learning approach. The keywords are used to
cluster the document subset, and the clustering result serves
as the taxonomy of the subset. Lastly, the taxonomy is manually
adjusted into a hierarchical structure for user navigation.
The topic digital library is constructed by combining
full-text retrieval with the hierarchical navigation function
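The keyword-based clustering step described above can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm or similarity measure, so the Jaccard overlap, the greedy single-pass grouping, the `threshold` value, and the sample keyword sets below are all assumptions made for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity between two keyword sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_by_keywords(doc_keywords, threshold=0.25):
    """Greedy single-pass clustering (an assumed stand-in, not the paper's method).

    Each document joins the first cluster that already contains a document
    whose keyword set is at least `threshold` similar; otherwise it starts
    a new cluster.
    """
    clusters = []
    for doc_id, keywords in doc_keywords.items():
        for cluster in clusters:
            if any(jaccard(keywords, doc_keywords[other]) >= threshold
                   for other in cluster):
                cluster.append(doc_id)
                break
        else:
            clusters.append([doc_id])
    return clusters


# Hypothetical keyword sets, as if produced by the extraction step.
docs = {
    "d1": {"clustering", "taxonomy", "keywords"},
    "d2": {"clustering", "keywords", "retrieval"},
    "d3": {"sonnet", "pun", "translation"},
    "d4": {"translation", "pun", "chinese"},
}
clusters = cluster_by_keywords(docs)
print(clusters)
```

Each resulting group would correspond to one candidate node of the taxonomy, which the abstract says is then adjusted manually.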
The Translatability Of Puns In Selected Shakespeare’s Sonnets Into Chinese: From The Translator’ Perspectives
The translatability of puns has been settled in previous studies, but the translation strategies for puns can be explored further. The objectives of this study are to examine the translatability of puns and the strategies used to translate them. The source texts of these puns are taken from Ingram & Redpath’s (1978) edition of Shakespeare’s sonnets. The target texts of these puns are taken from the nine corresponding Chinese versions
Self-adaptive GA, quantitative semantic similarity measures and ontology-based text clustering
Because common clustering algorithms use the vector space model (VSM) to represent documents, the conceptual relationships between related terms that do not literally co-occur are ignored. To overcome this problem, this article proposes a genetic algorithm-based clustering technique, named GA clustering, used in conjunction with an ontology. In general, ontology measures can be partitioned into two categories: thesaurus-based methods and corpus-based methods. We take advantage of the hierarchical structure and the broad-coverage taxonomy of WordNet as the thesaurus-based ontology. The corpus-based method, however, is rather complicated to handle in practical applications, so we propose a transformed latent semantic analysis (LSA) model as the corpus-based method in this paper. Moreover, two hybrid strategies that combine the various similarity measures are implemented in the clustering experiments. The results show that our GA clustering algorithm, in conjunction with the thesaurus-based and the LSA-based method, apparently outperforms the same algorithm with other similarity measures. Moreover, the superiority of the proposed GA clustering algorithm over the commonly used k-means algorithm and the standard GA is demonstrated by the improvements in clustering performance
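The VSM limitation mentioned in this abstract can be seen in a small sketch. It assumes raw term-count vectors and cosine similarity, which are common choices but are not stated in the abstract; the three sample sentences are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical documents: doc1 and doc2 are about the same concept
# but share no literal terms; doc1 and doc3 share two terms.
doc1 = Counter("the car needs repair".split())
doc2 = Counter("an automobile was fixed".split())
doc3 = Counter("the car was fixed".split())

sim_12 = cosine(doc1, doc2)  # no shared terms
sim_13 = cosine(doc1, doc3)  # shares "the" and "car"
print(sim_12, sim_13)
```

Because "car" and "automobile" occupy different dimensions, a plain VSM scores the first pair as unrelated. This is the gap that the thesaurus-based and corpus-based similarity measures in the abstract are meant to close.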
The Impact of Byline Order of Corresponding Author - A Preliminary Study
The corresponding author (C-Au) holds an important position in the byline order. Some papers have analyzed the contribution of the C-Au, but they do not consider how it varies with byline position. Furthermore, some questionnaire-based studies have found that people's perception of other authors’ contributions is influenced by the byline position of the C-Au, but the real situation remains unclear. Thus, this poster addresses two questions: (1) What byline positions do C-Aus occupy, and is their contribution influenced by their byline position? (2) Are other authors’ contributions influenced by the byline position of the C-Au? Three main findings emerge: first, the last author is not always the C-Au; second, as the byline position of the C-Au declines, the contribution of the C-Au decreases; finally, as the byline position of the C-Au changes, other authors’ contributions change significantly. For instance, the second author has the lowest contribution when the last author is the C-Au
Does Attention Mechanism Possess the Feature of Human Reading? A Perspective of Sentiment Classification Task
[Purpose] To understand the meaning of a sentence, humans focus on the
important words in it, which is reflected in how long or how many times our
eyes rest on each word. Thus, some studies utilize eye-tracking values to
optimize the attention mechanism in deep learning models. But these studies
do not explain the rationale for this approach. Whether the attention
mechanism possesses this feature of human reading needs to be explored.
[Design/methodology/approach] We conducted experiments on a sentiment
classification task. Firstly, we obtained eye-tracking values from two
open-source eye-tracking corpora to describe the feature of human reading.
Then, the machine attention values of each sentence were learned from a
sentiment classification model. Finally, a comparison was conducted to analyze
machine attention values and eye-tracking values. [Findings] Through
experiments, we found the attention mechanism can focus on important words,
such as adjectives, adverbs, and sentiment words, which are valuable for
judging the sentiment of sentences in the sentiment classification task. It
thus possesses the feature of human reading, focusing on important words in
sentences when reading. Because the attention mechanism is insufficiently
trained, it wrongly focuses on some words. The eye-tracking values can help the
attention mechanism correct this error and improve the model performance.
[Originality/value] Our research not only provides a reasonable explanation for
the study of using eye-tracking values to optimize the attention mechanism, but
also provides new inspiration for the interpretability of attention mechanism
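One way to compare machine attention values with eye-tracking values for a sentence is a rank correlation, sketched below. The abstract does not say which comparison was used, so Spearman's coefficient is an assumption, and the sentence, attention weights, and gaze durations are made-up illustrative numbers, not data from the study.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; raises ValueError if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-word values for "the movie was absolutely wonderful".
attention = [0.05, 0.15, 0.05, 0.30, 0.45]  # assumed model attention weights
gaze_ms = [120, 210, 110, 260, 340]         # assumed gaze durations in ms
rho = spearman(attention, gaze_ms)
print(round(rho, 3))
```

A coefficient near 1 would indicate that the model and the human reader rank the words of the sentence in a similar order of importance.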
Abelian integrals of quadratic hamiltonian vector fields with an invariant straight line
We prove that the lowest upper bound for the number of isolated zeros of the Abelian integrals associated to quadratic Hamiltonian vector fields having a center and an invariant straight line after quadratic perturbations is on