707 research outputs found

    Enhancing Context Modeling with a Query-Guided Capsule Network for Document-level Translation

    Full text link
    Context modeling is essential to generate coherent and consistent translation for Document-level Neural Machine Translations. The widely used method for document-level translation usually compresses the context information into a representation via hierarchical attention networks. However, this method neither considers the relationship between context words nor distinguishes the roles of context words. To address this problem, we propose a query-guided capsule networks to cluster context information into different perspectives from which the target translation may concern. Experiment results show that our method can significantly outperform strong baselines on multiple data sets of different domains.Comment: 11 pages, 7 figures, 2019 Conference on Empirical Methods in Natural Language Processin

    Dynamic Context-guided Capsule Network for Multimodal Machine Translation

    Full text link
    Multimodal machine translation (MMT), which mainly focuses on enhancing text-only translation with visual features, has attracted considerable attention from both computer vision and natural language processing communities. Most current MMT models resort to attention mechanism, global context modeling or multimodal joint representation learning to utilize visual features. However, the attention mechanism lacks sufficient semantic interactions between modalities while the other two provide fixed visual context, which is unsuitable for modeling the observed variability when generating translation. To address the above issues, in this paper, we propose a novel Dynamic Context-guided Capsule Network (DCCN) for MMT. Specifically, at each timestep of decoding, we first employ the conventional source-target attention to produce a timestep-specific source-side context vector. Next, DCCN takes this vector as input and uses it to guide the iterative extraction of related visual features via a context-guided dynamic routing mechanism. Particularly, we represent the input image with global and regional visual features, we introduce two parallel DCCNs to model multimodal context vectors with visual features at different granularities. Finally, we obtain two multimodal context vectors, which are fused and incorporated into the decoder for the prediction of the target word. Experimental results on the Multi30K dataset of English-to-German and English-to-French translation demonstrate the superiority of DCCN. Our code is available on https://github.com/DeepLearnXMU/MM-DCCN

    Advancement Auto-Assessment of Students Knowledge States from Natural Language Input

    Get PDF
    Knowledge Assessment is a key element in adaptive instructional systems and in particular in Intelligent Tutoring Systems because fully adaptive tutoring presupposes accurate assessment. However, this is a challenging research problem as numerous factors affect students’ knowledge state estimation such as the difficulty level of the problem, time spent in solving the problem, etc. In this research work, we tackle this research problem from three perspectives: assessing the prior knowledge of students, assessing the natural language short and long students’ responses, and knowledge tracing.Prior knowledge assessment is an important component of knowledge assessment as it facilitates the adaptation of the instruction from the very beginning, i.e., when the student starts interacting with the (computer) tutor. Grouping students into groups with similar mental models and patterns of prior level of knowledge allows the system to select the right level of scaffolding for each group of students. While not adapting instruction to each individual learner, the advantage of adapting to groups of students based on a limited number of prior knowledge levels has the advantage of decreasing the authoring costs of the tutoring system. To achieve this goal of identifying or clustering students based on their prior knowledge, we have employed effective clustering algorithms. Automatically assessing open-ended student responses is another challenging aspect of knowledge assessment in ITSs. In dialogue-based ITSs, the main interaction between the learner and the system is natural language dialogue in which students freely respond to various system prompts or initiate dialogue moves in mixed-initiative dialogue systems. Assessing freely generated student responses in such contexts is challenging as students can express the same idea in different ways owing to different individual style preferences and varied individual cognitive abilities. To address this challenging task, we have proposed several novel deep learning models as they are capable to capture rich high-level semantic features of text. Knowledge tracing (KT) is an important type of knowledge assessment which consists of tracking students’ mastery of knowledge over time and predicting their future performances. Despite the state-of-the-art results of deep learning in this task, it has many limitations. For instance, most of the proposed methods ignore pertinent information (e.g., Prior knowledge) that can enhance the knowledge tracing capability and performance. Working toward this objective, we have proposed a generic deep learning framework that accounts for the engagement level of students, the difficulty of questions and the semantics of the questions and uses a novel times series model called Temporal Convolutional Network for future performance prediction. The advanced auto-assessment methods presented in this dissertation should enable better ways to estimate learner’s knowledge states and in turn the adaptive scaffolding those systems can provide which in turn should lead to more effective tutoring and better learning gains for students. Furthermore, the proposed method should enable more scalable development and deployment of ITSs across topics and domains for the benefit of all learners of all ages and backgrounds

    A comparison of integration architectures

    Get PDF
    This paper presents GenSIF, a Generic Systems Integration Framework. GenSIF features a pre-planned development process on a domain-wide basis and facilitates system integration and project coordination for very large, complex and distributed systems. Domain analysis, integration architecture design and infrastructure design are identified as the three main components of GenSIF. In the next step we map Beilcore\u27s OSCA interoperability architecture, ANSA, IBM\u27s SAA and Bull\u27s DCM into GenSIF. Using the GenSIF concepts we compare each of these architectures. GenSIF serves as a general framework to evaluate and position specific architecture. The OSCA architecture is used to discuss the impact of vendor architectures on application development. All opinions expressed in this paper, especially with regard to the OSCA architecture, are the opinions of the author and do not necessarily reflect the point of view of any of the mentioned companies

    Hierarchical Text Classification: a review of current research

    Get PDF
    t is often the case that collections of documents are annotated with hierarchically-structured concepts. However, the benefits of this structure are rarely taken into account by commonly-used classification techniques. Conversely, Hierarchical Text Classification methods are devisedto take advantage of the labels’ organization to boost classification performance. With this work,we aim to deliver an updated overview of current research in this domain. We begin by definingthe task and framing it within the broader text classification area, examining important shared concepts such as text representation. Then, we dive into details regarding the specific task,providing a high-level description of its traditional approaches. We then summarize recentlyproposed methods, highlighting their main contributions. We additionally provide statisticsfor the most adopted datasets and describe the benefits of using evaluation metrics tailored to hierarchical settings. Finally, a selection of recent proposals is benchmarked against non-hierarchical baselines on five domain-specific datasets

    Building Customized Chatbots for Document Summarization and Question Answering using Large Language Models using a Framework with OpenAI, Lang chain, and Streamlit

    Get PDF
    This research presents a comprehensive framework for building customized chatbots empowered by large language models (LLMs) to summarize documents and answer user questions. Leveraging technologies such as OpenAI, LangChain, and Streamlit, the framework enables users to combat information overload by efficiently extracting insights from lengthy documents. This study discussed the framework's architecture, implementation, and practical applications, emphasizing its role in enhancing productivity and facilitating information retrieval. Through a step-by-step guide, this research has demonstrated how developers can utilize the framework to create end-to-end document summarization and question-answering application
    • …
    corecore