3,318 research outputs found

    Translation Alignment Applied to Historical Languages: methods, evaluation, applications, and visualization

    Translation alignment is an essential task in Digital Humanities and Natural Language Processing; it aims to link words and phrases in a source text with their equivalents in its translation. In addition to its importance in teaching and learning historical languages, translation alignment builds bridges between ancient and modern languages through which various linguistic annotations can be transferred. This thesis focuses on word-level translation alignment applied to historical languages in general and Ancient Greek and Latin in particular. As the title indicates, the thesis addresses four interdisciplinary aspects of translation alignment. The starting point was developing Ugarit, an interactive annotation tool for manual alignment, with the aim of gathering training data for an automatic alignment model. This effort resulted in more than 190k accurate translation pairs that I later used for supervised training. Ugarit has been used by many researchers and scholars, and also in the classroom at several institutions for teaching and learning ancient languages, which resulted in a large, diverse crowd-sourced aligned parallel corpus that allowed us to conduct experiments and qualitative analysis to detect recurring patterns in annotators’ alignment practice and in the generated translation pairs. Further, I employed recent advances in NLP and language modeling to develop an automatic alignment model for historical low-resourced languages, experimenting with various training objectives and proposing a training strategy for historical languages that combines supervised and unsupervised training with mono- and multilingual texts. Then, I integrated this alignment model into other development workflows to project cross-lingual annotations and induce bilingual dictionaries from parallel corpora. Evaluation is essential to assess the quality of any model. To ensure best practice, I reviewed the current evaluation procedure, identified its limitations, and proposed two new evaluation metrics. Moreover, I introduced a visual analytics framework to explore and inspect alignment gold standard datasets and to support quantitative and qualitative evaluation of translation alignment models. In addition, I designed and implemented visual analytics tools and reading environments for parallel texts and proposed various visualization approaches to support different alignment-related tasks, employing the latest advances in information visualization and best practice. Overall, this thesis presents a comprehensive study that includes manual and automatic alignment techniques, evaluation methods, and visual analytics tools that aim to advance the field of translation alignment for historical languages.

    Mining Meaning from Wikipedia

    Wikipedia is a goldmine of information, not just for its many readers, but also for the growing community of researchers who recognize it as a resource of exceptional scale and utility. It represents a vast investment of manual effort and judgment: a huge, constantly evolving tapestry of concepts and relations that is being applied to a host of tasks. This article provides a comprehensive description of this work. It focuses on research that extracts and makes use of the concepts, relations, facts and descriptions found in Wikipedia, and organizes the work into four broad categories: applying Wikipedia to natural language processing; using it to facilitate information retrieval; using it to facilitate information extraction; and using it as a resource for ontology building. The article addresses how Wikipedia is being used as is, how it is being improved and adapted, and how it is being combined with other structures to create entirely new resources. We identify the research groups and individuals involved, and how their work has developed in the last few years. We provide a comprehensive list of the open-source software they have produced. Comment: An extensive survey of re-using information in Wikipedia in natural language processing, information retrieval and extraction, and ontology building. Accepted for publication in International Journal of Human-Computer Studies.

    Developing the scales on evaluation beliefs of student teachers

    The purpose of the study reported in this paper was to investigate the validity and reliability of a newly developed questionnaire named ‘Teacher Evaluation Beliefs’ (TEB). The framework for developing items was provided by two models. The first model focuses on Student-Centered and Teacher-Centered beliefs about evaluation, while the other centers on five dimensions (what/who/when/why/how). The validity and reliability of the new instrument were investigated using both exploratory and confirmatory factor analysis (n=446). Overall, the results indicate that the two-factor structure is more reasonable than the five-factor one. Further research needs additional items about the latent dimensions “what”, “who”, “when”, “why”, and “how” for each existing factor based on Student-Centered and Teacher-Centered approaches.

    Event Extraction: A Survey

    Extracting reported events from text is one of the key research themes in natural language processing. This process includes several tasks, such as event detection, argument extraction, and role labeling. As one of the most important topics in natural language processing and natural language understanding, event extraction has applications that span a wide range of domains, such as newswire, the biomedical domain, history and humanity, and cyber security. This report presents a comprehensive survey of event detection from textual documents. In this report, we provide the task definition, the evaluation method, the benchmark datasets, and a taxonomy of methodologies for event extraction. We also present our vision of future research directions in event detection. Comment: 20 pages.

    EXPLORING LITERACIES IN THE ASSEMBLAGE OF ADULT EDUCATION ENGLISH FOR SPEAKERS OF OTHER LANGUAGES CLASSROOMS

    The purpose of this dissertation is to provide a posthuman perspective of adult second language and literacy learning using the philosophy of Gilles Deleuze and his collaborative work with Félix Guattari, Masny’s (2005/6) multiple literacies theory or MLT, and DeLanda’s (2016) assemblage theory. Thinking with these scholars, I employ a post-qualitative, posthuman MLT conceptual framework to study literacy as a process that flows through and connects with globally-diverse students, languages, worldviews, and texts in the assemblage of adult education, English for speakers of other languages (ESOL) classrooms. I posit this assemblage as a remarkable and important context for literacy research because of its heterogeneity and potential to produce creative expressions of multiple literacies. With the MLT framework, I explore expressions of multiple literacies as emergent multilingual subjectivities that deterritorialize commonsense worldviews about adult second language and literacy learning. I use observations and student work as data to map a posthuman perspective of adult education to address three research questions: (1) How might we use an MLT framework to explore multiple literacies in adult education ESOL classrooms? (2) With an MLT framework, how are multiple literacies expressed in adult education ESOL classrooms? (3) What are the benefits and implications of an MLT perspective for the field? This project offers a counter-story about the research context and problematizes qualitative inquiry by asking questions and raising problems that might otherwise be invisible. What emerges is a feminist practice of immanent ethics with important implications for the field of adult literacy and second language learning.

    An exploration of online information spaces that support instructional design and teacher professional development

    Members of online communities of practice (CoPs) take advantage of information and communication technologies (ICTs) to exchange practical or work-related knowledge in asynchronous online environments. Practical knowledge represents individuals' mental models, which allow them to interact with the environment and perform tasks. With ICTs, practical knowledge accumulates over time and becomes an integral part of online CoPs. Due to ease of implementation, content management systems (CMSs) and social media platforms, primarily Facebook, have enabled the emergence of large online CoPs. However, research has shown that online CoPs are not conducive information spaces for seeking solutions independently, and hashtags used for topic organization are not representative of the wealth of practical knowledge. This three-article dissertation describes design recommendations for supporting the information needs of community members by analyzing the practical knowledge in instructional design and technology (IDT) CoPs that rely on a CMS and the Facebook platform, and by conducting usability testing to improve an existing teacher professional development CoP. By applying natural language processing (NLP) and usability testing, quantitative and qualitative approaches were implemented to examine the practical knowledge and help guide the design of information spaces that enable members to search for solutions through better topic representations or categories. The results of the first study showed that the e-learning development CoP emphasized producing online articles related to educational technology and the lack of transparency in evaluating such materials. The results of the second study showed that the four IDT CoPs on the Facebook platform were characterized by a lack of effective topic structures representative of the accumulated knowledge and a lack of community protocols for curating knowledge and taking corrective actions toward misinformation. The third study relied on usability testing to design an information space to support educators' ability to align materials with Missouri teacher standards. This three-article dissertation suggests five design features that online CoPs can implement to address the shortcomings of asynchronous online environments: (1) improving topic organization, (2) establishing community protocols, (3) increasing transparency, (4) improving search functions, and (5) leveraging NLP in future web technologies. Lastly, the dissertation discusses the results of the three published studies, offers recommendations for improving online CoPs as conducive information spaces, and provides future directions. Includes bibliographical references.

    Design of a Controlled Language for Critical Infrastructures Protection

    We describe a project for the construction of a controlled language for critical infrastructures protection (CIP). This project originates from the need to coordinate and categorize the communications on CIP at the European level. These communications can be physically represented by official documents, reports on incidents, informal communications and plain e-mail. We explore the application of traditional library science tools for the construction of controlled languages in order to achieve our goal. Our starting point is an analogous work done during the sixties in the field of nuclear science, known as the Euratom Thesaurus. JRC.G.6-Security technology assessment

    Visual Analytics for the Exploratory Analysis and Labeling of Cultural Data

    Cultural data can come in various forms and modalities, such as text traditions, artworks, music, crafted objects, or even intangible heritage such as biographies of people, performing arts, and cultural customs and rites. The assignment of metadata to such cultural heritage objects is an important task that people working in galleries, libraries, archives, and museums (GLAM) do on a daily basis. These rich metadata collections are used to categorize, structure, and study collections, but can also be used to apply computational methods. Such computational methods are the focus of Computational and Digital Humanities projects and research. For the longest time, the digital humanities community has focused on textual corpora, including text mining and other natural language processing techniques, although some disciplines of the humanities, such as art history and archaeology, have a long history of using visualizations. In recent years, the digital humanities community has started to shift its focus to include other modalities, such as audio-visual data. In turn, methods in machine learning and computer vision have been proposed for the specificities of such corpora. Over the last decade, the visualization community has engaged in several collaborations with the digital humanities, often with a focus on exploratory or comparative analysis of the data at hand. This includes both methods and systems that support classical Close Reading of the material and Distant Reading methods that give an overview of larger collections, as well as methods in between, such as Meso Reading. Furthermore, a wider application of machine learning methods can be observed on cultural heritage collections, but these methods are rarely applied together with visualizations to allow for further perspectives on the collections in a visual analytics or human-in-the-loop setting. Visual analytics can help in the decision-making process by guiding domain experts through the collection of interest. However, state-of-the-art supervised machine learning methods are often not applicable to the collection of interest due to missing ground truth. One form of ground truth is class labels, e.g., of entities depicted in an image collection, assigned to the individual images. Labeling all objects in a collection is an arduous task when performed manually, because cultural heritage collections contain a wide variety of different objects with plenty of details. A problem that arises with collections curated in different institutions is that a specific standard is not always followed, so the vocabularies used can drift apart from one another, making it difficult to combine the data from these institutions for large-scale analysis. This thesis presents a series of projects that combine machine learning methods with interactive visualizations for the exploratory analysis and labeling of cultural data. First, we define cultural data with regard to heritage and contemporary data; then we look at the state of the art of existing visualization, computer vision, and visual analytics methods and projects focusing on cultural data collections. After this, we present the problems addressed in this thesis and their solutions, starting with a series of visualizations to explore different facets of rap lyrics and rap artists with a focus on text reuse. Next, we engage in a more complex case of text reuse, the collation of medieval vernacular text editions. For this, a human-in-the-loop process is presented that applies word embeddings and interactive visualizations to perform textual alignments on under-resourced languages, supported by labeling of the relations between lines and the relations between words. We then switch the focus from textual data to another modality of cultural data by presenting a Virtual Museum that combines interactive visualizations and computer vision in order to explore a collection of artworks. With the lessons learned from the previous projects, we engage in the labeling and analysis of medieval illuminated manuscripts and so combine some of the machine learning methods and visualizations that were used for textual data with computer vision methods. Finally, we reflect on the interdisciplinary projects and the lessons learned, before we discuss existing challenges when working with cultural heritage data from the computer science perspective to outline potential research directions for machine learning and visual analytics of cultural heritage data.

    DARIAH and the Benelux


    Theoretical and Empirical Models of Organizational Learning Processes in Knowledge Management

    Introduction. This study undertakes a comprehensive exploration of existing instructional organizational models, spanning various disciplines within contemporary educational theory and knowledge management practice. The core objective is to propose an all-encompassing model tailored specifically to the preparation of future educational managers. This model places significant emphasis on integrated educational strategies, further enriched by the integration of organizational learning processes in the context of knowledge management. Aim and tasks. This study critically examines established models of organizational learning processes with the goal of developing a tailored model for training future educational managers. The goal is to equip aspiring educational managers with integrated didactic skills based on analyses of existing educational models that include concept analysis, model evaluation, and theoretical framework establishment. Result. Organizational learning principles drive data-driven refinement, collaborative cross-disciplinary strategies, and leadership development. Sharing best practices reinforces strengths, whereas iterative feedback processes mitigate limitations. This dynamic framework encourages adaptable education, fostering continuous improvement in teaching methods, curricula, and managerial training for a sustained educational evolution. Leveraging insights from existing models, the primary aim is to establish an instructional framework that seamlessly integrates a diverse range of content. Notably, the suggested model for training educational managers integrates teaching methodologies, character development, and methodological support for cultivating cultural learning skills, all underpinned by organizational learning processes within the domain of knowledge management. Furthermore, this integrated model incorporates learning objectives that progressively increase in complexity, and it spans the methodologies and resources employed to ensure effective learning outcomes, based on comprehensive feature assessment techniques that gauge understanding and competencies. Conclusions. This study navigates the landscape of models, culminating in the proposal of an integrated framework that caters to the comprehensive training of aspiring educational managers. This model facilitates the harmonious amalgamation of various subjects and proficiencies and introduces organizational learning processes within the domain of knowledge management. By fostering a multidisciplinary and holistic approach, this model equips future educators to meet the multifaceted demands of modern primary education while adequately managing knowledge within their organizational contexts.