Learning Analytics for Academic Writing through Automatic Identification of Meta-discourse
Effective written communication is an essential skill that promotes educational success for undergraduates. Argumentation is a key requirement of successful essay writing, the most common genre that undergraduates have to produce, particularly in the social sciences. Therefore, when assessing student writing, academic tutors look for students’ ability to present and pursue well-reasoned and strong arguments through scholarly argumentation, which is articulated by meta-discourse.
Today, there are natural language processing systems that automatically detect authors’ rhetorical moves in scholarly texts. Hence, when assessing their students’ essays, educators could benefit from automated textual analysis that can detect meta-discourse. However, previous work has not shown whether these technologies can be used to analyse student writing reliably. The aim of this thesis has therefore been to understand how automated analysis of meta-discourse in student writing can be used to support tutors’ essay assessment practices. This thesis evaluates a particular language analysis tool, the Xerox Incremental Parser (XIP), as an exemplar of this type of automated technology.
The studies presented in this thesis investigate how tutors define the quality of undergraduate writing and suggest key elements that make for good-quality student writing in the social sciences, where the XIP seems to work best. This thesis also sets out the changes that need to be made to the XIP and proposes ways in which its output can be delivered to tutors so that they can use it to give feedback on student essays.
The findings reported also show problems that academic tutors experience in essay assessment, which could potentially be solved by automated support. However, tutors have preconceptions about the use of such support.
The study revealed that, to overcome these preconceptions, tutors want to be assured that they retain the ‘power’ themselves in any decision about using automated support.
La Salle University Evening Division Bulletin 1988-1989
Issued for La Salle University Evening Division 1988-1989
Knowledge Graph Extraction from Project Documentation
Title: Knowledge Graph Extraction from Project Documentation (Extrakce znalostních grafů z projektové dokumentace) Author: Bc. Tomáš Helešic Department: Department of Software Engineering Supervisor: Mgr. Martin Nečaský, Ph.D. Abstract: The goal of this thesis is to explore the possibilities of automatic information extraction from company project documentation using natural language processing tools, and to analyse the precision of the linguistic processing of these documents. It further proposes methods for acquiring key terms and the dependencies between them; from these terms and dependencies, knowledge graphs are created and stored in an appropriate repository with a search service. The work interconnects already existing technologies, implements them in a simple application, and tests their readiness for practical use.
The goal is to inspire future research in this field, identify critical points, and propose improvements. The main contribution lies in connecting natural language processing, information extraction methods, and semantic search with corporate documents. The contribution of the practical part lies in the way key information that uniquely describes each document is identified and used in search. Keywords: Knowledge graphs, Information extraction, Natural language processing, Resource Description Framework. Department of Software Engineering, Faculty of Mathematics and Physics
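The pipeline this abstract describes (extract terms and relations, assemble a knowledge graph, query it) can be sketched minimally. Everything below is hypothetical illustration: the pattern-based extractor, the relation names, and the example sentences are placeholders for the real NLP toolchain and documents the thesis works with.

```python
# Hypothetical sketch of a term/relation extraction pipeline feeding a
# knowledge graph; a real system would use a proper NLP parser instead
# of this naive regular expression.
import re
from collections import defaultdict

def extract_triples(sentence):
    """Naive subject-verb-object extraction from 'X <verb> Y' sentences."""
    m = re.match(r"(\w+) (uses|contains|requires) (\w+)", sentence)
    return [(m.group(1), m.group(2), m.group(3))] if m else []

# Knowledge graph as an adjacency map: term -> [(relation, term), ...]
graph = defaultdict(list)
for doc_sentence in ["Billing uses Postgres", "Postgres requires Storage"]:
    for subj, rel, obj in extract_triples(doc_sentence):
        graph[subj].append((rel, obj))
```

A search service would then index the graph's terms and edges so that queries over key concepts return the documents they describe.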
Towards a science of human stories: using sentiment analysis and emotional arcs to understand the building blocks of complex social systems
We can leverage data and complex systems science to better understand society and human nature on a population scale through language --- utilizing tools that include sentiment analysis, machine learning, and data visualization.
Data-driven science and the sociotechnical systems that we use every day are enabling a transformation from hypothesis-driven, reductionist methodology to complex systems sciences.
Namely, the emergence and global adoption of social media has rendered possible the real-time estimation of population-scale sentiment, with profound implications for our understanding of human behavior.
Advances in computing power, natural language processing, and digitization of text now make it possible to study a culture’s evolution through its texts using a big data lens.
Given the growing assortment of sentiment measuring instruments, it is imperative to understand which aspects of sentiment dictionaries contribute to both their classification accuracy and their ability to provide richer understanding of texts.
Here, we perform detailed, quantitative tests and qualitative assessments of 6 dictionary-based methods applied to 4 different corpora, and briefly examine a further 20 methods.
We show that while inappropriate for sentences, dictionary-based methods are generally robust in their classification accuracy for longer texts.
Most importantly, they can aid understanding of texts with reliable and meaningful word shift graphs if (1) the dictionary covers a sufficiently large portion of a given text’s lexicon when weighted by word usage frequency; and (2) words are scored on a continuous scale.
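The two conditions above can be made concrete in a minimal sketch of dictionary-based scoring: average the dictionary's continuous scores over a text, weighted by word frequency, and report what fraction of the text's word usage the dictionary covers. The tiny word list and its scores here are invented for illustration; real instruments score thousands of words.

```python
# Minimal dictionary-based sentiment sketch; the word scores below are
# hypothetical stand-ins for a real, continuously scored instrument.
from collections import Counter

happiness = {"love": 8.4, "happy": 8.3, "good": 7.5, "sad": 2.4, "hate": 2.2}

def text_sentiment(text, scores):
    """Frequency-weighted average score and frequency-weighted coverage."""
    counts = Counter(text.lower().split())
    covered = {w: c for w, c in counts.items() if w in scores}
    matched = sum(covered.values())
    if matched == 0:
        return None, 0.0  # dictionary covers none of the text
    avg = sum(scores[w] * c for w, c in covered.items()) / matched
    coverage = matched / sum(counts.values())
    return avg, coverage

score, cov = text_sentiment("I love this good good day even when sad", happiness)
```

Low coverage signals that the dictionary sees too little of the text's lexicon for the average, or a word shift graph built from it, to be trusted.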
Our ability to communicate relies in part upon a shared emotional experience, with stories often following distinct emotional trajectories, forming patterns that are meaningful to us.
By classifying the emotional arcs for a filtered subset of 4,803 stories from Project Gutenberg’s fiction collection, we find a set of six core trajectories which form the building blocks of complex narratives.
We strengthen our findings by separately applying optimization, linear decomposition, supervised learning, and unsupervised learning.
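One of the decomposition approaches mentioned above can be sketched with synthetic data: stack per-story sentiment time series as matrix rows, mean-center, and take the leading right singular vectors as the dominant arc shapes. The two sinusoidal "arcs" and the noise level here are fabricated for illustration, not the actual modes recovered from the Gutenberg corpus.

```python
# Sketch of recovering core emotional arcs by linear decomposition (SVD)
# from synthetic story time series; the generating modes are hypothetical.
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 50)
base_modes = np.array([np.sin(np.pi * t), np.cos(np.pi * t)])  # two arc shapes
weights = rng.normal(size=(100, 2))                            # 100 "stories"
arcs = weights @ base_modes + 0.05 * rng.normal(size=(100, 50))

# Mean-center, then SVD: rows of Vt are the dominant shared arc shapes.
centered = arcs - arcs.mean(axis=0)
U, S, Vt = np.linalg.svd(centered, full_matrices=False)
core_arcs = Vt[:2]                                  # leading "building blocks"
explained = (S[:2] ** 2).sum() / (S ** 2).sum()     # variance they capture
```

Because the synthetic series are built from two modes plus small noise, the leading two singular vectors capture nearly all the variance; on real corpora, the number of well-supported modes is itself the empirical question.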
For each of these six core emotional arcs, we examine the closest characteristic stories in publication today and find that particular emotional arcs enjoy greater success, as measured by downloads.
Within stories lie the core values of social behavior, rich with both strategies and proper protocol, which we can begin to study more broadly and systematically as a true reflection of culture.
Of profound scientific interest will be the degree to which we can eventually understand the full landscape of human stories, and data-driven approaches will play a crucial role.
Finally, we utilize web-scale data from Twitter to study the limits of what social data can tell us about public health, mental illness, discourse around the protest movement of #BlackLivesMatter, discourse around climate change, and hidden networks.
We conclude with a review of published works in complex systems that separately analyze charitable donations, the happiness of words in 10 languages, 100 years of daily temperature data across the United States, and Australian Rules Football games.
CLARIN. The infrastructure for language resources
CLARIN, the "Common Language Resources and Technology Infrastructure", has established itself as a major player in the field of research infrastructures for the humanities. This volume provides a comprehensive overview of the organization, its members, its goals and its functioning, as well as of the tools and resources hosted by the infrastructure. The many contributors representing various fields, from computer science to law to psychology, analyse a wide range of topics, such as the technology behind the CLARIN infrastructure, the use of CLARIN resources in diverse research projects, the achievements of selected national CLARIN consortia, and the challenges that CLARIN has faced and will face in the future.
The book will be published in 2022, 10 years after the establishment of CLARIN as a European Research Infrastructure Consortium by the European Commission (Decision 2012/136/EU).
CLARIN
The book provides a comprehensive overview of the Common Language Resources and Technology Infrastructure – CLARIN – for the humanities. It covers a broad range of CLARIN language resources and services, its underlying technological infrastructure, the achievements of national consortia, and challenges that CLARIN will tackle in the future. The book is published 10 years after the establishment of CLARIN as a European Research Infrastructure Consortium.