Text Analysis and Text Created in Grammar Learning
Lecturers' complaints about English students' grammar, as reflected in the students' understanding of reading texts and in their written work, prompted grammar lecturers to look for ways to make students use the grammar they have learned. This paper explores the learning tasks of text analysis and text creation assigned to undergraduate students. This activity is a stepping stone toward grasping the rank scale of clause structure inductively. The topics discussed can be at the level of the phrase, clause, or clause complex. After comprehending the rank scale of the structure, students are directed to identify the rules they have learned by analyzing a text. To use what they have learned, students are then assigned to create a text related to the topics. Various activities can be used, such as individual, pair, or group work. Practice may take spoken or written form.
As implemented in my grammar class, the approach left students motivated and reflective in using the language. At the lower levels of structure, such as phrases and simple clauses, it worked quite well: students needed little time to analyze the text for the rules and to create the assigned text. For clause complex structures, however, the analysis took somewhat longer, but the results were good. Creating independent activities outside class is also a challenge for students, especially the brighter ones.
Multinomial Inverse Regression for Text Analysis
Text data, including speeches, stories, and other document forms, are often
connected to sentiment variables that are of interest for research in
marketing, economics, and elsewhere. They are also very high dimensional and
difficult to incorporate into statistical analyses. This article introduces a
straightforward framework of sentiment-preserving dimension reduction for text
data. Multinomial inverse regression is introduced as a general tool for
simplifying predictor sets that can be represented as draws from a multinomial
distribution, and we show that logistic regression of phrase counts onto
document annotations can be used to obtain low dimension document
representations that are rich in sentiment information. To facilitate this
modeling, a novel estimation technique is developed for multinomial logistic
regression with very high-dimension response. In particular, independent
Laplace priors with unknown variance are assigned to each regression
coefficient, and we detail an efficient routine for maximization of the joint
posterior over coefficients and their prior scale. This "gamma-lasso" scheme
yields stable and effective estimation for general high-dimension logistic
regression, and we argue that it will be superior to current methods in many
settings. Guidelines for prior specification are provided, algorithm
convergence is detailed, and estimator properties are outlined from the
perspective of the literature on non-concave likelihood penalization. Related
work on sentiment analysis from statistics, econometrics, and machine learning
is surveyed and connected. Finally, the methods are applied in two detailed
examples and we provide out-of-sample prediction studies to illustrate their
effectiveness. Comment: Published in the Journal of the American Statistical Association 108,
2013, with discussion (rejoinder is here: http://arxiv.org/abs/1304.4200).
Software is available in the textir package for
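The model summarized in this abstract can be written down roughly as follows. The notation is an assumption made here for illustration and is not taken from the abstract: x_i is the vector of phrase counts for document i, m_i its total count, y_i the sentiment annotation, and alpha_j, varphi_j the intercept and loading for phrase j. The projection z_i is the low-dimension document representation, and the Laplace prior rate lambda_j is the unknown scale that is estimated jointly with the coefficients.

```latex
% Sketch under assumed notation (see lead-in); p is the number of phrases.
\begin{align*}
x_i &\sim \mathrm{MN}(q_i, m_i), &
q_{ij} &= \frac{\exp(\alpha_j + \varphi_j y_i)}{\sum_{l=1}^{p} \exp(\alpha_l + \varphi_l y_i)}, \\
z_i &= \sum_{j=1}^{p} \varphi_j \, \frac{x_{ij}}{m_i}, &
p(\varphi_j \mid \lambda_j) &= \frac{\lambda_j}{2} \exp\!\left(-\lambda_j \lvert \varphi_j \rvert\right).
\end{align*}
```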
An Automata Based Text Analysis System
This report describes and implements an automata-based text analysis system. We collected a set of writing samples. For each sample we build a tree and use the ALERGIA algorithm to merge all compatible nodes, yielding a merged stochastic finite automaton. These automata, which capture the writing style of the sample texts, are stored on the hard drive. A new test piece can then be checked for similarity in writing style to the sample texts.
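The tree-building and compatibility steps can be sketched as below. This is a minimal illustration, not the report's implementation: the function names, the token-level representation, and the default `alpha` are assumptions, and only the local Hoeffding-bound check is shown (the full ALERGIA algorithm also compares successor states recursively before merging).

```python
import math
from collections import defaultdict


def build_prefix_tree(samples):
    """Build a frequency prefix tree from an iterable of token sequences.

    Returns (arrivals, ends, children) where
      arrivals[s]      = number of sequences passing through state s
      ends[s]          = number of sequences ending at state s
      children[s][sym] = [next_state, count]
    State 0 is the root. All names here are illustrative.
    """
    arrivals = defaultdict(int)
    ends = defaultdict(int)
    children = defaultdict(dict)
    next_id = 1
    for seq in samples:
        state = 0
        arrivals[state] += 1
        for sym in seq:
            if sym not in children[state]:
                children[state][sym] = [next_id, 0]
                next_id += 1
            children[state][sym][1] += 1
            state = children[state][sym][0]
            arrivals[state] += 1
        ends[state] += 1
    return arrivals, ends, children


def hoeffding_different(f1, n1, f2, n2, alpha):
    """True if the frequencies f1/n1 and f2/n2 differ beyond the Hoeffding bound."""
    if n1 == 0 or n2 == 0:
        return False  # no evidence of a difference
    bound = math.sqrt(0.5 * math.log(2.0 / alpha)) * (
        1.0 / math.sqrt(n1) + 1.0 / math.sqrt(n2)
    )
    return abs(f1 / n1 - f2 / n2) > bound


def locally_compatible(a, b, arrivals, ends, children, alpha=0.05):
    """Local check only: compare termination and per-symbol frequencies of a and b."""
    n1, n2 = arrivals[a], arrivals[b]
    if hoeffding_different(ends[a], n1, ends[b], n2, alpha):
        return False
    for sym in set(children[a]) | set(children[b]):
        f1 = children[a][sym][1] if sym in children[a] else 0
        f2 = children[b][sym][1] if sym in children[b] else 0
        if hoeffding_different(f1, n1, f2, n2, alpha):
            return False
    return True
```

A smaller `alpha` widens the bound, so more state pairs are judged compatible and the merged automaton becomes smaller.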
Text Classification For Authorship Attribution Analysis
Authorship attribution mainly deals with the undecided authorship of literary
texts. It is useful for resolving issues such as uncertain authorship,
recognizing the authorship of unknown texts, and spotting plagiarism.
Statistical methods can be used to characterize an author's style
numerically. The basic measures used in computational stylometry include
word length, sentence length, vocabulary richness, and frequencies.
Each author has an inherent style of writing that is particular to that
author. The problem can be broken down into three subproblems: author
identification, author characterization, and similarity detection. The steps
involved are pre-processing, feature extraction, classification, and author
identification. Different classifiers can be used for this; here a fuzzy
learning classifier and an SVM are used. In author identification, the SVM was
found to be more accurate than the fuzzy classifier. The classifiers were then
combined to obtain better accuracy than either the SVM or the fuzzy classifier
alone. Comment: 10 page
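The stylometric measures named in this abstract (word length, sentence length, vocabulary richness) could be extracted along the following lines. This is a rough sketch under simple assumptions: the function name, the regular-expression tokenization, and the use of type-token ratio as the richness measure are illustrative choices, not the paper's feature set, and the classification step (SVM or fuzzy classifier) is not shown.

```python
import re


def stylometric_features(text):
    """Return three simple style measures for a text (illustrative only)."""
    # Split on runs of sentence-ending punctuation; drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Lowercased alphabetic tokens (apostrophes kept inside words).
    words = re.findall(r"[A-Za-z']+", text.lower())
    if not words or not sentences:
        return {
            "avg_word_length": 0.0,
            "avg_sentence_length": 0.0,
            "type_token_ratio": 0.0,
        }
    return {
        "avg_word_length": sum(len(w) for w in words) / len(words),
        "avg_sentence_length": len(words) / len(sentences),
        # Distinct words divided by total words, a crude richness measure.
        "type_token_ratio": len(set(words)) / len(words),
    }
```

The resulting dictionary can be turned into a fixed-order feature vector and passed to whichever classifier is being compared.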
Research Collaboration Analysis Using Text and Graph Features
Patterns of scientific collaboration and their effect on scientific production have been the subject of many studies. In this paper we analyze the nature of ties between co-authors and study collaboration patterns in science from two perspectives: the semantic similarity of authors who wrote a paper together, and the strength of ties between these authors (i.e. how much they have previously collaborated). These two views of scientific collaboration are used to analyze publications in the TrueImpactDataset [11], a new dataset containing two types of publications: those regarded by field experts as seminal and those regarded as literature reviews. We show there are distinct differences between seminal publications and literature reviews in terms of author similarity and the strength of ties between their authors. In particular, we find that seminal publications tend to be written by authors who have previously worked on dissimilar problems (i.e. authors from different fields or even disciplines), and by authors who are not frequent collaborators. On the other hand, literature reviews in our dataset tend to be the result of an established collaboration within a discipline. This demonstrates that our method provides meaningful information about the potential future impact of a publication without requiring citation information.
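The two views described in this abstract, author similarity and tie strength, might be computed as in the sketch below. Everything here is an assumption for illustration: the `(authors, text)` input format, the bag-of-words author profiles, and cosine similarity as the similarity measure are stand-ins, not the paper's actual features.

```python
import math
from collections import Counter
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def author_profiles(papers):
    """Map each author to word counts over all of that author's prior papers.

    `papers` is a list of (authors, text) pairs; this format is hypothetical.
    """
    profiles = {}
    for authors, text in papers:
        tokens = text.lower().split()
        for author in authors:
            profiles.setdefault(author, Counter()).update(tokens)
    return profiles


def tie_strength(papers):
    """Count how many prior papers each pair of authors has written together."""
    ties = Counter()
    for authors, _ in papers:
        for pair in combinations(sorted(set(authors)), 2):
            ties[pair] += 1
    return ties
```

For a new paper, both quantities would be computed only from publications that predate it, so that the measures describe the collaboration before the paper was written.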
Text Coherence Analysis Based on Deep Neural Network
In this paper, we propose a novel deep coherence model (DCM) using a
convolutional neural network architecture to capture the text coherence. The
text coherence problem is investigated with a new perspective of learning
sentence distributional representation and text coherence modeling
simultaneously. In particular, the model captures the interactions between
sentences by computing the similarities of their distributional
representations. Further, it can be easily trained in an end-to-end fashion.
The proposed model is evaluated on a standard Sentence Ordering task. The
experimental results demonstrate its effectiveness and promise in coherence
assessment, showing a significant improvement over the state of the art by a
wide margin. Comment: 4 pages, 2 figures, CIKM 201
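The idea of scoring coherence from similarities between sentence representations can be illustrated with a much simpler stand-in. The sketch below is not the DCM: it replaces the learned convolutional sentence representations with plain bag-of-words counts and scores a text by the mean cosine similarity of adjacent sentences. All function names are hypothetical.

```python
import math
from collections import Counter


def sentence_vector(sentence):
    """Bag-of-words stand-in for a learned sentence representation."""
    return Counter(sentence.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def coherence_score(sentences):
    """Mean similarity of each sentence to the one that follows it."""
    vectors = [sentence_vector(s) for s in sentences]
    if len(vectors) < 2:
        return 0.0
    sims = [cosine(u, v) for u, v in zip(vectors, vectors[1:])]
    return sum(sims) / len(sims)
```

In a sentence-ordering setting, the original order of a text would be expected to score higher than a shuffled permutation of the same sentences.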
