Translation Alignment Applied to Historical Languages: methods, evaluation, applications, and visualization
Translation alignment is an essential task in Digital Humanities and Natural
Language Processing, and it aims to link words/phrases in the source
text with their translation equivalents in the translation. In addition to
its importance in teaching and learning historical languages, translation
alignment builds bridges between ancient and modern languages through
which various linguistic annotations can be transferred. This thesis focuses
on word-level translation alignment applied to historical languages in general
and Ancient Greek and Latin in particular. As the title indicates, the thesis
addresses four interdisciplinary aspects of translation alignment.
The starting point was developing Ugarit, an interactive annotation tool
for manual alignment, with the aim of gathering training data for an
automatic alignment model. This effort resulted in more than 190k accurate
translation pairs that I later used for supervised training. Ugarit has been
used by many researchers and scholars, and also in the classroom at several
institutions for teaching and learning ancient languages, which resulted
in a large, diverse crowd-sourced aligned parallel corpus allowing us to
conduct experiments and qualitative analysis to detect recurring patterns in
annotators' alignment practice and the generated translation pairs.
Further, I employed recent advances in NLP and language modeling to
develop an automatic alignment model for low-resource historical languages,
experimenting with various training objectives and proposing a training
strategy for historical languages that combines supervised and unsupervised
training with mono- and multilingual texts. Then, I integrated this alignment
model into other development workflows to project cross-lingual annotations
and induce bilingual dictionaries from parallel corpora.
Evaluation is essential to assess the quality of any model. To follow best practice, I reviewed the current evaluation procedure, defined
its limitations, and proposed two new evaluation metrics. Moreover, I introduced a visual analytics framework to explore and inspect alignment gold
standard datasets and support quantitative and qualitative evaluation of
translation alignment models. Besides, I designed and implemented visual
analytics tools and reading environments for parallel texts and proposed
various visualization approaches to support different alignment-related tasks
employing the latest advances in information visualization and best practice.
Overall, this thesis presents a comprehensive study that includes manual and
automatic alignment techniques, evaluation methods and visual analytics
tools that aim to advance the field of translation alignment for historical
languages.
Mining Meaning from Wikipedia
Wikipedia is a goldmine of information; not just for its many readers, but
also for the growing community of researchers who recognize it as a resource of
exceptional scale and utility. It represents a vast investment of manual effort
and judgment: a huge, constantly evolving tapestry of concepts and relations
that is being applied to a host of tasks.
This article provides a comprehensive description of this work. It focuses on
research that extracts and makes use of the concepts, relations, facts and
descriptions found in Wikipedia, and organizes the work into four broad
categories: applying Wikipedia to natural language processing; using it to
facilitate information retrieval and information extraction; and as a resource
for ontology building. The article addresses how Wikipedia is being used as is,
how it is being improved and adapted, and how it is being combined with other
structures to create entirely new resources. We identify the research groups
and individuals involved, and how their work has developed in the last few
years. We provide a comprehensive list of the open-source software they have
produced.
Comment: An extensive survey of re-using information in Wikipedia in natural
language processing, information retrieval and extraction and ontology
building. Accepted for publication in International Journal of Human-Computer
Studie
Developing the scales on evaluation beliefs of student teachers
The purpose of the study reported in this paper was to investigate the validity and reliability of a newly developed questionnaire named "Teacher Evaluation Beliefs" (TEB). The framework for developing items was provided by two models. The first model focuses on Student-Centered and Teacher-Centered beliefs about evaluation, while the other centers on five dimensions (what/who/when/why/how). The validity and reliability of the new instrument were investigated using both exploratory and confirmatory factor analysis (n=446). Overall, the results indicate that the two-factor structure is more reasonable than the five-factor one. Further research needs additional items about the latent dimensions "what", "who", "when", "why", and "how" for each existing factor based on the Student-Centered and Teacher-Centered approaches.
Event Extraction: A Survey
Extracting the reported events from text is one of the key research themes in
natural language processing. This process includes several tasks such as event
detection, argument extraction, and role labeling. As one of the most important
topics in natural language processing and natural language understanding,
event extraction has applications that span a wide range of domains such as
newswire, the biomedical domain, history and humanity, and cyber security. This
report presents a comprehensive survey for event detection from textual
documents. In this report, we provide the task definition, the evaluation
method, as well as the benchmark datasets and a taxonomy of methodologies for
event extraction. We also present our vision of future research directions in
event detection.
Comment: 20 page
EXPLORING LITERACIES IN THE ASSEMBLAGE OF ADULT EDUCATION ENGLISH FOR SPEAKERS OF OTHER LANGUAGES CLASSROOMS
The purpose of this dissertation is to provide a posthuman perspective of adult second language and literacy learning using the philosophy of Gilles Deleuze and his collaborative work with Félix Guattari, Masny's (2005/6) multiple literacies theory or MLT, and DeLanda's (2016) assemblage theory. Thinking with these scholars, I employ a post-qualitative, posthuman MLT conceptual framework to study literacy as a process that flows through and connects with globally-diverse students, languages, worldviews, and texts in the assemblage of adult education, English for speakers of other languages (ESOL) classrooms. I posit this assemblage as a remarkable and important context for literacy research because of its heterogeneity and potential to produce creative expressions of multiple literacies. With the MLT framework, I explore expressions of multiple literacies as emergent multilingual subjectivities that deterritorialize commonsense worldviews about adult second language and literacy learning. I use observations and student work as data to map a posthuman perspective of adult education to address three research questions: (1) How might we use an MLT framework to explore multiple literacies in adult education ESOL classrooms? (2) With an MLT framework, how are multiple literacies expressed in adult education ESOL classrooms? (3) What are the benefits and implications of an MLT perspective for the field? This project offers a counter-story about the research context and problematizes qualitative inquiry by asking questions and raising problems that might otherwise be invisible. What emerges is a feminist practice of immanent ethics with important implications for the field of adult literacy and second language learning.
An exploration of online information spaces that support instructional design and teacher professional development
Members in online communities of practice (CoPs) take advantage of information and communication technologies (ICTs) to exchange practical or work-related knowledge in asynchronous online environments. Practical knowledge represents individuals' mental models allowing them to interact with the environment and perform tasks. With ICTs, practical knowledge accumulates over time and becomes an integral part of online CoPs. Due to ease of implementation, content management systems (CMSs) and social media platforms, primarily Facebook, have enabled the emergence of large online CoPs. However, research has shown that online CoPs are not conducive information spaces for seeking solutions independently, and hashtags used for topic organization are not representative of the wealth of practical knowledge. This three-article dissertation describes design recommendations for supporting the information needs of community members by analyzing the practical knowledge in instructional design and technology (IDT) CoPs that rely on a CMS and the Facebook platform and conducting usability testing to improve an existing teacher professional development CoP. By applying natural language processing (NLP) and usability testing, quantitative and qualitative approaches were implemented to examine the practical knowledge and help guide the design of information spaces that enable members to search for solutions through better topic representations or categories. The results of the first study showed that the e-learning development CoP emphasized producing online articles related to educational technology and the lack of transparency in evaluating such materials. The results of the second study showed that the four IDT CoPs on the Facebook platform were characterized by the lack of effective topic structures representative of the accumulated knowledge and the lack of community protocols for curating knowledge and taking corrective actions toward misinformation.
The third study relied on usability testing to design an information space to support educators' ability to align materials with Missouri teacher standards. This three-article dissertation suggests five design features that online CoPs can implement in addressing the shortcomings of asynchronous online environments, including (1) improving topic organization, (2) establishing community protocols, (3) increasing transparency, (4) improving search functions, and (5) leveraging NLP in future web technologies. Lastly, the dissertation discussed the results of the three published studies, offered recommendations for improving online CoPs as conducive information spaces, and provided future directions.
Includes bibliographical references
Design of a Controlled Language for Critical Infrastructures Protection
We describe a project for the construction of a controlled language for critical infrastructures protection (CIP). This project originates
from the need to coordinate and categorize the communications on CIP at the European level. These communications can be physically
represented by official documents, reports on incidents, informal communications and plain e-mail. We explore the application of
traditional library science tools for the construction of controlled languages in order to achieve our goal. Our starting point is an
analogous work done during the sixties in the field of nuclear science known as the Euratom Thesaurus.
JRC.G.6-Security technology assessmen
Visual Analytics for the Exploratory Analysis and Labeling of Cultural Data
Cultural data can come in various forms and modalities, such as text traditions, artworks, music, crafted objects, or even as intangible heritage such as biographies of people, performing arts, cultural customs and rites.
The assignment of metadata to such cultural heritage objects is an important task that people working in galleries, libraries, archives, and museums (GLAM) do on a daily basis.
These rich metadata collections are used to categorize, structure, and study collections, but can also be used to apply computational methods.
Such computational methods are in the focus of Computational and Digital Humanities projects and research.
For the longest time, the digital humanities community has focused on textual corpora, including text mining and other natural language processing techniques.
However, some disciplines of the humanities, such as art history and archaeology, have a long history of using visualizations.
In recent years, the digital humanities community has started to shift the focus to include other modalities, such as audio-visual data.
In turn, methods in machine learning and computer vision have been proposed for the specificities of such corpora.
Over the last decade, the visualization community has engaged in several collaborations with the digital humanities, often with a focus on exploratory or comparative analysis of the data at hand.
This includes both methods and systems that support classical Close Reading of the material and Distant Reading methods that give an overview of larger collections, as well as methods in between, such as Meso Reading.
Furthermore, a wider application of machine learning methods can be observed on cultural heritage collections.
But they are rarely applied together with visualizations to allow for further perspectives on the collections in a visual analytics or human-in-the-loop setting.
Visual analytics can help in the decision-making process by guiding domain experts through the collection of interest.
However, state-of-the-art supervised machine learning methods are often not applicable to the collection of interest due to missing ground truth.
One form of ground truth is class labels, e.g., of entities depicted in an image collection, assigned to the individual images.
Labeling all objects in a collection is an arduous task when performed manually, because cultural heritage collections contain a wide variety of different objects with plenty of details.
A problem that arises with these collections curated in different institutions is that a specific standard is not always followed, so the vocabularies used can drift apart from one another, making it difficult to combine the data from these institutions for large-scale analysis.
This thesis presents a series of projects that combine machine learning methods with interactive visualizations for the exploratory analysis and labeling of cultural data.
First, we define cultural data with regard to heritage and contemporary data, then we look at the state of the art of existing visualization, computer vision, and visual analytics methods and projects focusing on cultural data collections.
After this, we present the problems addressed in this thesis and their solutions, starting with a series of visualizations to explore different facets of rap lyrics and rap artists with a focus on text reuse.
Next, we engage in a more complex case of text reuse, the collation of medieval vernacular text editions.
For this, a human-in-the-loop process is presented that applies word embeddings and interactive visualizations to perform textual alignments on under-resourced languages supported by labeling of the relations between lines and the relations between words.
We then switch the focus from textual data to another modality of cultural data by presenting a Virtual Museum that combines interactive visualizations and computer vision in order to explore a collection of artworks.
With the lessons learned from the previous projects, we engage in the labeling and analysis of medieval illuminated manuscripts and so combine some of the machine learning methods and visualizations that were used for textual data with computer vision methods.
Finally, we give reflections on the interdisciplinary projects and the lessons learned, before we discuss existing challenges when working with cultural heritage data from the computer science perspective to outline potential research directions for machine learning and visual analytics of cultural heritage data
Theoretical and Empirical Models of Organizational Learning Processes in Knowledge Management
Introduction. This study undertakes a comprehensive exploration of existing instructional organizational models, spanning various disciplines within contemporary educational theory and knowledge management practice. The core objective is to propose an all-encompassing model tailored specifically to the preparation of future educational managers. This model places a significant emphasis on integrated educational strategies, further enriched by the integration of organizational learning processes in the context of knowledge management.
Aim and tasks. This study critically examines established models of organizational learning processes with the goal of developing a tailored model for training future educational managers. The goal is to equip aspiring educational managers with integrated didactic skills based on analyses of existing educational models that include concept analysis, model evaluation, and theoretical framework establishment.
Result. Organizational learning principles drive data-driven refinement, collaborative cross-disciplinary strategies, and leadership development. Sharing best practices enhances strength, whereas iterative feedback processes mitigate its limitations. This dynamic framework encourages adaptable education, fostering continuous improvement in teaching methods, curricula, and managerial training for a sustained educational evolution. Leveraging insights from existing models, the primary aim is to establish an instructional framework that seamlessly integrates a diverse range of content. Notably, the suggested model for training educational managers integrates teaching methodologies, character development, and methodological support for cultivating cultural learning skills, all underpinned by organizational learning processes within the domain of knowledge management. Furthermore, this integrated model incorporates learning objectives that progressively increase in complexity and span the methodologies and resources employed to ensure effective learning outcomes based on comprehensive feature assessment techniques that gauge understanding and competencies.
Conclusions. This study navigates the landscape of models, culminating in the proposal of an integrated framework that caters to the comprehensive training of aspiring educational managers. This model facilitates the harmonious amalgamation of various subjects and proficiencies and introduces organizational learning processes within the domain of knowledge management. By fostering a multidisciplinary and holistic approach, this model equips future educators to meet the multifaceted demands of modern primary education while adequately managing knowledge within their organizational contexts.