Table for text analysis

Author: Lipson Maxine
Publication date: 08/10/2009

This table is a guide to text analysis.

Almae Matris Studiorum Campus

Text Analysis and Text Created in Grammar Learning

Author: Syarif H. (Hermawati)
Publication venue: Muhammadiyah University Purwokerto
Publication date: 01/01/2015

Lecturers' complaints about English students' grammar, as reflected in the students' reading comprehension and written work, prompted grammar lecturers to look for ways to get students to use the grammar they have learned. This paper explores the text-analysis and text-creation learning tasks assigned to undergraduate students. The activity is a stepping stone toward grasping the rank scale of clause structure inductively. Topics can be discussed at the level of the phrase, the clause, or the clause complex. After comprehending the rank scale, students are directed to identify the rules they have learned by analyzing a text. To apply what they have learned, students are then assigned to create a text related to the topics. Various activities are possible, including individual, pair, or group work, and practice may be spoken or written. When the approach was implemented in my grammar class, the results indicated that students were motivated and reflective in using the language. For lower-level structures, such as phrases and simple clauses, it worked quite well: students needed little time to analyze the text for the rules and to create the assigned text. For clause complex structures, however, the analysis took a little longer, but the results were good. The tasks also challenge students, especially the brighter ones, to create independent activities outside class.

Neliti

Multinomial Inverse Regression for Text Analysis

Author: Taddy Matt
Publication date: 01/01/2013

Text data, including speeches, stories, and other document forms, are often connected to sentiment variables that are of interest for research in marketing, economics, and elsewhere. Such data are also very high dimensional and difficult to incorporate into statistical analyses. This article introduces a straightforward framework of sentiment-preserving dimension reduction for text data. Multinomial inverse regression is introduced as a general tool for simplifying predictor sets that can be represented as draws from a multinomial distribution, and we show that logistic regression of phrase counts onto document annotations can be used to obtain low-dimensional document representations that are rich in sentiment information. To facilitate this modeling, a novel estimation technique is developed for multinomial logistic regression with a very high-dimensional response. In particular, independent Laplace priors with unknown variance are assigned to each regression coefficient, and we detail an efficient routine for maximization of the joint posterior over coefficients and their prior scale. This "gamma-lasso" scheme yields stable and effective estimation for general high-dimensional logistic regression, and we argue that it will be superior to current methods in many settings. Guidelines for prior specification are provided, algorithm convergence is detailed, and estimator properties are outlined from the perspective of the literature on non-concave likelihood penalization. Related work on sentiment analysis from statistics, econometrics, and machine learning is surveyed and connected. Finally, the methods are applied in two detailed examples and we provide out-of-sample prediction studies to illustrate their effectiveness. Comment: Published in the Journal of the American Statistical Association 108, 2013, with discussion (rejoinder is here: http://arxiv.org/abs/1304.4200). Software is available in the textir package for

arXiv.org e-Print Archive

CiteSeerX

An Automata Based Text Analysis System

Author: Lu Yue
Publication venue: SJSU ScholarWorks
Publication date: 01/01/2009

This report describes and implements an automata-based text analysis system. We collected a set of writing samples. A tree is built from each sample, and the ALERGIA algorithm merges all compatible nodes to produce a merged stochastic finite automaton. We store these automata, which capture the writing style of the sample texts, on the hard drive. A new test piece can then be checked for a writing style similar to that of the sample texts.

SJSU ScholarWorks
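
The tree-building step mentioned in the abstract above can be illustrated with a frequency prefix tree, which is the usual starting point for state-merging algorithms such as ALERGIA. This is only a minimal sketch under stated assumptions: the report does not say which symbols it uses, so each sample is assumed here to be a sequence of hashable symbols (characters or tokens), the function name `build_frequency_prefix_tree` is hypothetical, and the compatibility test and merge step of ALERGIA are deliberately left out.

```python
def build_frequency_prefix_tree(samples):
    """Build a prefix tree with visit and ending counts for each state.

    Assumption: every sample is a sequence of hashable symbols.
    State 0 is the root (the empty prefix).
    """
    transitions = [{}]  # transitions[state][symbol] -> next state
    arrivals = [0]      # how many samples passed through each state
    endings = [0]       # how many samples ended in each state

    for sample in samples:
        state = 0
        arrivals[0] += 1
        for symbol in sample:
            next_state = transitions[state].get(symbol)
            if next_state is None:
                # Unseen prefix: allocate a fresh state for it.
                next_state = len(transitions)
                transitions.append({})
                arrivals.append(0)
                endings.append(0)
                transitions[state][symbol] = next_state
            state = next_state
            arrivals[state] += 1
        endings[state] += 1

    return transitions, arrivals, endings


transitions, arrivals, endings = build_frequency_prefix_tree(["ab", "ab", "a"])
```

The counts stored per state are what a merging procedure would later compare when deciding whether two states behave alike; dividing `endings[s]` by `arrivals[s]` gives the empirical stopping probability of state `s`.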

Text Classification For Authorship Attribution Analysis

Authors: Elayidom M. Sudheep; Jose Chinchu; Puthussery Anitta; Sasi Neenu K
Publication venue: Academy and Industry Research Collaboration Center (AIRCC)
Publication date: 30/09/2013

Authorship attribution mainly deals with the undecided authorship of literary texts. It is useful for resolving uncertain authorship, recognizing the authorship of unknown texts, spotting plagiarism, and so on. Statistical methods can be used to characterize an author's approach numerically. The basic measures used in computational stylometry include word length, sentence length, vocabulary richness, and frequencies. Each author has an inborn style of writing that is particular to that author. The problem can be broken down into three subproblems: author identification, author characterization, and similarity detection. The steps involved are pre-processing, feature extraction, classification, and author identification. Different classifiers can be used for this; here a fuzzy learning classifier and an SVM are used. For author identification, the SVM was found to be more accurate than the fuzzy classifier. The classifiers were later combined to obtain better accuracy than either the SVM or the fuzzy classifier alone. Comment: 10 pages

arXiv.org e-Print Archive

Crossref
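
The stylometric measures named in the authorship attribution abstract above (word length, sentence length, and a vocabulary measure) can be computed in a few lines. The following is a minimal sketch, not the authors' feature extractor: the function name `stylometric_features`, the regex-based tokenization, and the use of the type-token ratio as the vocabulary measure are all assumptions made for illustration. The classification stage (fuzzy classifier, SVM) is not shown.

```python
import re


def stylometric_features(text):
    """Return three simple style features for a text.

    Assumptions: sentences end at '.', '!' or '?', and words are runs
    of ASCII letters or apostrophes, compared case-insensitively.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text.lower())
    if not words or not sentences:
        raise ValueError("text must contain at least one word")

    return {
        # Mean number of characters per word.
        "avg_word_length": sum(len(w) for w in words) / len(words),
        # Mean number of words per sentence.
        "avg_sentence_length": len(words) / len(sentences),
        # Distinct words divided by total words (type-token ratio).
        "type_token_ratio": len(set(words)) / len(words),
    }


features = stylometric_features("The cat sat. The dog ran fast.")
```

A feature dictionary like this, computed per document, is the kind of numeric vector that could be passed to any classifier for the identification step.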

Research Collaboration Analysis Using Text and Graph Features

Authors: Herrmannova Drahomira; Knoth Petr; Patton Robert; Stahl Christopher; Wells Jack
Publication date: 01/01/2018

Patterns of scientific collaboration and their effect on scientific production have been the subject of many studies. In this paper we analyze the nature of ties between co-authors and study collaboration patterns in science from two perspectives: the semantic similarity of authors who wrote a paper together, and the strength of ties between these authors (i.e., how much they have previously collaborated). These two views of scientific collaboration are used to analyze publications in the TrueImpactDataset [11], a new dataset containing two types of publications: those regarded by field experts as seminal and those regarded as literature reviews. We show that there are distinct differences between seminal publications and literature reviews in terms of author similarity and the strength of ties between their authors. In particular, we find that seminal publications tend to be written by authors who have previously worked on dissimilar problems (i.e., authors from different fields or even disciplines), and by authors who are not frequent collaborators. On the other hand, literature reviews in our dataset tend to be the result of an established collaboration within a discipline. This demonstrates that our method provides meaningful information about the potential future impact of a publication without requiring citation information.

Open Research Online (OU)

Text Coherence Analysis Based on Deep Neural Network

Authors: Cui Baiyun; Li Yingming; Zhang Yaqing; Zhang Zhongfei
Publication venue: Association for Computing Machinery (ACM)
Publication date: 21/10/2017

In this paper, we propose a novel deep coherence model (DCM) that uses a convolutional neural network architecture to capture text coherence. The text coherence problem is investigated from a new perspective: learning sentence distributional representations and modeling text coherence simultaneously. In particular, the model captures the interactions between sentences by computing the similarities of their distributional representations. Further, it can easily be trained in an end-to-end fashion. The proposed model is evaluated on a standard Sentence Ordering task. The experimental results demonstrate its effectiveness and promise in coherence assessment, improving on the state of the art by a wide margin. Comment: 4 pages, 2 figures, CIKM 201

arXiv.org e-Print Archive

Crossref
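
The idea in the coherence abstract above, scoring interactions between sentences by the similarity of their representations, can be illustrated without a neural network. The sketch below is not the DCM: it swaps the learned distributional representations for plain bag-of-words counts and uses cosine similarity between adjacent sentences. The function names `cosine` and `adjacent_similarities` are hypothetical, and whitespace tokenization is an assumption.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def adjacent_similarities(sentences):
    """Similarity of each sentence to the one that follows it.

    Assumption: bag-of-words counts stand in for learned sentence
    representations; tokens are lowercased whitespace-separated words.
    """
    vectors = [Counter(s.lower().split()) for s in sentences]
    return [cosine(vectors[i], vectors[i + 1])
            for i in range(len(vectors) - 1)]


scores = adjacent_similarities(["the cat sat", "the cat slept", "stocks fell"])
```

In a sentence-ordering setting, one crude use of such scores would be to prefer the ordering whose adjacent-sentence similarities are highest on average; a trained model replaces both the representation and the scoring function.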