Document Cohesion Flow: Striving towards Coherence
Text cohesion is an important element of discourse processing. This paper presents a new approach to modeling, quantifying, and visualizing text cohesion using automated cohesion flow indices that capture semantic links among paragraphs. Cohesion flow is calculated by applying Cohesion Network Analysis, a combination of semantic distances, Latent Semantic Analysis, and Latent Dirichlet Allocation, as well as Social Network Analysis. Experiments performed on 315 timed essays indicated that cohesion flow indices are significantly correlated with human ratings of text coherence and essay quality. Visualizations of the global cohesion indices are also included to support a more facile understanding of how cohesion flow impacts coherence in terms of semantic dependencies between paragraphs.
Investigative Report Writing Support System for Effective Knowledge Construction from the Web
Investigative reports plagiarized from the web should be eliminated because such reports result in ineffective knowledge construction. In this study, we developed an investigative report writing support system for effective knowledge construction from the web. The proposed system attempts to prevent plagiarism by restricting the copying and pasting of information from web pages. With this system, students can verify information through web browsing, externalize their constructed knowledge as notes for report materials, write reports using these notes, and remove inadequacies in the report through reflection. A comparative experiment showed that the proposed system can potentially prevent web page plagiarism and make knowledge construction from the web more effective than a conventional report writing environment.
A Data Mining Toolbox for Collaborative Writing Processes
Collaborative writing (CW) is an essential skill in academia and industry. Providing support during the process of CW can be useful not only for achieving better quality documents, but also for improving the CW skills of the writers. In order to properly support collaborative writing, it is essential to understand how ideas and concepts are developed during the writing process, which consists of a series of steps of writing activities. These steps can be considered as sequence patterns comprising both time events and the semantics of the changes made during those steps. Two techniques can be combined to examine those patterns: process mining, which focuses on extracting process-related knowledge from event logs recorded by an information system; and semantic analysis, which focuses on extracting knowledge about what the student wrote or edited. This thesis contributes (i) techniques to automatically extract process models of collaborative writing processes and (ii) visualisations to describe aspects of collaborative writing. These two techniques form a data mining toolbox for collaborative writing by using process mining, probabilistic graphical models, and text mining. First, I created a framework, WriteProc, for investigating collaborative writing processes, integrated with the existing cloud computing writing tools in Google Docs. Second, I created a new heuristic to extract the semantic nature of text edits that occur in the document revisions and automatically identify the corresponding writing activities. Third, based on sequences of writing activities, I propose methods to discover the writing process models and transitional state diagrams using a process mining algorithm, Heuristics Miner, and Hidden Markov Models, respectively. Finally, I designed three types of visualisations and made contributions to their underlying techniques for analysing writing processes.
All components of the toolbox are validated against annotated writing activities of real documents and a synthetic dataset. I also illustrate how the automatically discovered process models and visualisations are used in the process analysis with real documents written by groups of graduate students. I discuss how the analyses can be used to gain further insight into how students work and create their collaborative documents.
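As a rough illustration of what a transitional state diagram over writing activities might look like, the sketch below estimates first-order transition probabilities from labelled activity sequences. It is a simplified, fully observed Markov chain, not the Heuristics Miner or Hidden Markov Model approach named in the abstract. The function name and the activity labels are hypothetical and are not taken from the thesis.

```python
from collections import Counter, defaultdict


def transition_probabilities(sequences):
    """Estimate first-order transition probabilities between activity labels.

    `sequences` is a list of lists of activity labels, one list per document.
    """
    counts = defaultdict(Counter)
    for seq in sequences:
        # Count each adjacent (current, next) pair of activities.
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1

    probs = {}
    for current, following in counts.items():
        total = sum(following.values())
        probs[current] = {nxt: n / total for nxt, n in following.items()}
    return probs


# Hypothetical activity labels, for illustration only.
sequences = [
    ["brainstorm", "draft", "revise", "edit"],
    ["brainstorm", "draft", "draft", "revise"],
]
probs = transition_probabilities(sequences)
for state, outgoing in probs.items():
    print(state, outgoing)
```

Each key of the resulting dictionary is a state, and each inner dictionary gives the weighted outgoing edges of that state, which is the information a state diagram would draw.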