    Table for text analysis

    This table is a guide to text analysis.

    Text Analysis and Text Created in Grammar Learning

    Lecturers' complaints about English students' grammar, as reflected in the students' reading comprehension and written work, prompted grammar lecturers to look for ways to make students use the grammar they have learned. This paper explores learning tasks in which undergraduate students analyze texts and then create texts of their own. The analysis task is a stepping stone toward grasping the rank scale of clause structure inductively. The topics discussed can be at the level of the phrase, the clause, or the clause complex. After comprehending the structural level in question, students are directed to identify the rules they have learned by analyzing a text. To apply what they have learned, they are then assigned to create a text related to the topics. The activities can be done individually, in pairs, or in groups, and practice may be spoken or written. When the approach was implemented in my grammar class, students were motivated and reflective in using the language. At the lower levels of structure, such as phrases and simple clauses, it worked quite well: students needed little time to analyze the text for the rules and to create the assigned text. For clause complex structure the analysis took somewhat longer, but the results were still good. The approach also challenges students, especially the brighter ones, to create independent activities outside class.

    Multinomial Inverse Regression for Text Analysis

    Text data, including speeches, stories, and other document forms, are often connected to sentiment variables that are of interest for research in marketing, economics, and elsewhere. Such data are also very high dimensional and difficult to incorporate into statistical analyses. This article introduces a straightforward framework of sentiment-preserving dimension reduction for text data. Multinomial inverse regression is introduced as a general tool for simplifying predictor sets that can be represented as draws from a multinomial distribution, and we show that logistic regression of phrase counts onto document annotations can be used to obtain low dimension document representations that are rich in sentiment information. To facilitate this modeling, a novel estimation technique is developed for multinomial logistic regression with very high-dimension response. In particular, independent Laplace priors with unknown variance are assigned to each regression coefficient, and we detail an efficient routine for maximization of the joint posterior over coefficients and their prior scale. This "gamma-lasso" scheme yields stable and effective estimation for general high-dimension logistic regression, and we argue that it will be superior to current methods in many settings. Guidelines for prior specification are provided, algorithm convergence is detailed, and estimator properties are outlined from the perspective of the literature on non-concave likelihood penalization. Related work on sentiment analysis from statistics, econometrics, and machine learning is surveyed and connected. Finally, the methods are applied in two detailed examples and we provide out-of-sample prediction studies to illustrate their effectiveness. Comment: Published in the Journal of the American Statistical Association 108, 2013, with discussion (rejoinder is here: http://arxiv.org/abs/1304.4200). Software is available in the textir package for
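The abstract above describes regressing phrase counts onto a document annotation and then using the fitted loadings to get a low-dimension document score. A minimal sketch of those two pieces follows, assuming a single scalar annotation `y` and already-fitted intercepts `alpha` and loadings `phi`. The function names and all numeric values are hypothetical illustrations, and the estimation step (the "gamma-lasso") is not shown.

```python
import math

def phrase_probabilities(alpha, phi, y):
    """Multinomial logistic phrase probabilities for a document with annotation y.

    alpha and phi are per-phrase intercepts and loadings (assumed already fitted).
    """
    logits = [a + y * p for a, p in zip(alpha, phi)]
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sr_projection(counts, phi):
    """Project a phrase-count vector onto the loadings, normalized by document length."""
    m = sum(counts)
    if m == 0:
        return 0.0  # empty document: no information about the annotation
    return sum(p * c for p, c in zip(phi, counts)) / m

# Hypothetical loadings for three phrases: positive, negative, neutral.
phi = [1.0, -1.0, 0.0]
alpha = [0.0, 0.0, 0.0]
score = sr_projection([3, 1, 0], phi)
probs = phrase_probabilities(alpha, phi, 0.0)
```

The score is a single number per document, which could then be used as a predictor in an ordinary forward regression for the annotation.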

    An Automata Based Text Analysis System

    This report describes and implements an automata-based text analysis system. We collected a set of writing samples. For each sample we build a tree and use the ALERGIA algorithm to merge all compatible nodes, yielding a merged stochastic finite automaton. We store these automata, which capture the writing style of the sample texts, on the hard drive. Given a new test piece, we can check whether its writing style is similar to that of the sample texts.
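The tree-building and node-compatibility steps mentioned above can be sketched as follows. This is a minimal illustration and not the report's implementation: it assumes each sample is a sequence of symbols (characters here, though word tokens would work the same way), builds a frequency prefix tree, and applies the Hoeffding-bound test that ALERGIA uses to decide whether two observed frequencies are statistically compatible. The function names and the default `alpha` are assumptions.

```python
import math

def build_prefix_tree(samples):
    """Build a frequency prefix tree from an iterable of symbol sequences.

    Each node records how many sequences pass through it ('count'),
    how many end there ('ends'), and its children keyed by symbol.
    """
    root = {"count": 0, "ends": 0, "children": {}}
    for seq in samples:
        node = root
        node["count"] += 1
        for sym in seq:
            node = node["children"].setdefault(
                sym, {"count": 0, "ends": 0, "children": {}}
            )
            node["count"] += 1
        node["ends"] += 1
    return root

def hoeffding_compatible(f1, n1, f2, n2, alpha=0.05):
    """Return True if frequencies f1/n1 and f2/n2 differ by less than the Hoeffding bound."""
    if n1 == 0 or n2 == 0:
        return True  # no evidence against compatibility
    bound = math.sqrt(0.5 * math.log(2.0 / alpha)) * (
        1.0 / math.sqrt(n1) + 1.0 / math.sqrt(n2)
    )
    return abs(f1 / n1 - f2 / n2) < bound

tree = build_prefix_tree(["ab", "ac", "ab"])
```

A full merge procedure would apply this test to the termination frequencies and to each outgoing symbol of two candidate nodes, recursively over their successors, before merging them.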

    Text Classification For Authorship Attribution Analysis

    Authorship attribution mainly deals with the undecided authorship of literary texts. It is useful for resolving uncertain authorship, recognizing the authorship of unknown texts, spotting plagiarism, and so on. Each author has an inherent writing style that is particular to that author, and statistical methods can be used to characterize this style numerically. The basic measures used in computational stylometry include word length, sentence length, vocabulary richness, and frequencies. The problem can be broken down into three subproblems: author identification, author characterization, and similarity detection. The steps involved are pre-processing, feature extraction, classification, and author identification. Different classifiers can be used for this; here a fuzzy learning classifier and an SVM are used. In author identification the SVM was found to be more accurate than the fuzzy classifier. The two classifiers were then combined, which gave better accuracy than either the SVM or the fuzzy classifier alone. Comment: 10 pages
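The stylometric measures named above (word length, sentence length, vocabulary richness) can be illustrated with a short feature extractor. This is a sketch under simple assumptions and not the paper's pipeline: sentences are split on `.`, `!`, and `?`, words are runs of letters and apostrophes, and vocabulary richness is approximated by the type-token ratio. The function name and feature keys are hypothetical.

```python
import re

def stylometric_features(text):
    """Compute three basic stylometric features for a piece of text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text.lower())
    if not words or not sentences:
        return {
            "avg_word_length": 0.0,
            "avg_sentence_length": 0.0,
            "type_token_ratio": 0.0,
        }
    return {
        # mean number of characters per word
        "avg_word_length": sum(len(w) for w in words) / len(words),
        # mean number of words per sentence
        "avg_sentence_length": len(words) / len(sentences),
        # distinct words divided by total words
        "type_token_ratio": len(set(words)) / len(words),
    }

features = stylometric_features("The cat sat. The cat ran fast.")
```

Feature vectors like this one would be the input to the classification step; the classifier itself (SVM, fuzzy, or a combination) is outside the scope of this sketch.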

    Text Coherence Analysis Based on Deep Neural Network

    In this paper, we propose a novel deep coherence model (DCM) that uses a convolutional neural network architecture to capture text coherence. The text coherence problem is investigated from a new perspective: learning distributional sentence representations and modeling text coherence simultaneously. In particular, the model captures the interactions between sentences by computing the similarities of their distributional representations. Further, it can be easily trained in an end-to-end fashion. The proposed model is evaluated on a standard Sentence Ordering task. The experimental results demonstrate its effectiveness and promise in coherence assessment, showing a significant improvement over the state of the art by a wide margin. Comment: 4 pages, 2 figures, CIKM 201
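The idea of scoring coherence through similarities between sentence representations can be illustrated with a deliberately simplified stand-in. The sketch below does not reproduce the DCM: it swaps the learned convolutional sentence representations for plain bag-of-words count vectors and scores a text by the mean cosine similarity of adjacent sentences. All function names are hypothetical.

```python
import math
from collections import Counter

def bow(sentence):
    """Bag-of-words count vector (a stand-in for a learned sentence representation)."""
    return Counter(sentence.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def adjacent_similarity(sentences):
    """Mean cosine similarity over consecutive sentence pairs."""
    vecs = [bow(s) for s in sentences]
    sims = [cosine(vecs[i], vecs[i + 1]) for i in range(len(vecs) - 1)]
    return sum(sims) / len(sims) if sims else 0.0

ordered = ["the cat sat", "the cat ran", "dogs bark loudly"]
shuffled = ["the cat sat", "dogs bark loudly", "the cat ran"]
ordered_score = adjacent_similarity(ordered)
shuffled_score = adjacent_similarity(shuffled)
```

In a sentence-ordering evaluation, a score of this kind would be compared between the original order and permuted orders of the same sentences; here the original order scores higher because its adjacent sentences share more words.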