3 research outputs found

    Contextual Understanding in Neural Dialog Systems: The Integration of External Knowledge Graphs for Generating Coherent and Knowledge-rich Conversations

    The integration of external knowledge graphs has emerged as a powerful approach for enabling conversational AI systems to generate coherent and knowledge-rich conversations. This paper provides an overview of the integration process and highlights its benefits. Knowledge graphs are structured representations of information that capture relationships between entities as nodes and edges, offering an organized and efficient means of representing factual knowledge. External knowledge graphs, such as DBpedia, Wikidata, Freebase, and Google's Knowledge Graph, are pre-existing repositories covering a wide range of domains; they are compiled by aggregating data from diverse sources, including online encyclopedias, databases, and structured repositories. To integrate an external knowledge graph into a conversational AI system, a connection must be established between the system and the graph, either through APIs or by importing a copy of the knowledge graph into the system's internal storage. Once integrated, the system can query the knowledge graph to retrieve relevant information when a user poses a question or makes a statement. When analyzing user inputs, the system identifies entities or concepts that require additional knowledge and formulates queries against the integrated knowledge graph; these queries may search for specific entities, retrieve related entities, or access properties and attributes associated with those entities. The retrieved information is then used to generate coherent and knowledge-rich responses. By integrating external knowledge graphs, conversational AI systems can augment their internal knowledge base and provide more accurate and up-to-date responses: the retrieved facts allow the system to give detailed explanations or offer additional context, yielding more comprehensive and insightful responses that enhance the user experience. Because external knowledge graphs are regularly updated with new information and improvements, conversational AI systems should keep their integrated knowledge current, either by periodically synchronizing the system's internal representation with the external knowledge graph or by querying the external knowledge graph in real time.
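
    As a concrete illustration of the querying step described above, the following minimal Python sketch retrieves related facts from the public Wikidata SPARQL endpoint. The entity (Q937, Albert Einstein), property (P69, educated at), and the helper name fetch_related_facts are illustrative assumptions, not anything specified in the paper.

    import requests

    WIKIDATA_SPARQL = "https://query.wikidata.org/sparql"

    def fetch_related_facts(entity_qid: str, property_pid: str, limit: int = 5):
        """Return English labels of entities linked to entity_qid via property_pid."""
        query = f"""
        SELECT ?valueLabel WHERE {{
          wd:{entity_qid} wdt:{property_pid} ?value .
          SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
        }}
        LIMIT {limit}
        """
        resp = requests.get(
            WIKIDATA_SPARQL,
            params={"query": query, "format": "json"},
            headers={"User-Agent": "dialog-kg-demo/0.1 (example)"},
            timeout=30,
        )
        resp.raise_for_status()
        rows = resp.json()["results"]["bindings"]
        return [row["valueLabel"]["value"] for row in rows]

    if __name__ == "__main__":
        # e.g. the user asks "Where did Albert Einstein study?"; the retrieved
        # labels would then be handed to the response generator for verbalization.
        print(fetch_related_facts("Q937", "P69"))

    In a full dialog system, the entity and property identifiers would come from entity linking over the user's utterance rather than being hard-coded.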

    BEKG: A Built Environment Knowledge Graph

    Practices in the built environment have become more digitalized with the rapid development of modern design and construction technologies. However, the need of practitioners and scholars to gather complex professional knowledge about the built environment has not yet been met. In this paper, more than 80,000 paper abstracts in the built environment field were collected to build a knowledge graph, a knowledge base that stores entities and their relations in a graph-structured data model. To ensure the retrieval accuracy of the entities and relations in the knowledge graph, two well-annotated datasets were created, containing 2,000 instances for the named entity recognition task and 1,450 instances across 29 relations for the relation extraction task. These two tasks were addressed by two BERT-based models trained on the proposed datasets, both of which attained an accuracy above 85%. Applying these models to all of the abstract data yielded more than 200,000 high-quality entities and relations. Finally, the knowledge graph is presented through a self-developed visualization system that reveals the relations between entities in the domain. Both the source code and the annotated datasets can be found at https://github.com/HKUST-KnowComp/BEKG
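
    For readers unfamiliar with the modeling setup, the sketch below shows the general shape of a BERT-based token-classification (NER) model using the Hugging Face transformers library. The label set, base checkpoint, and example sentence are hypothetical; the authors' actual training code lives in the linked BEKG repository.

    import torch
    from transformers import AutoTokenizer, AutoModelForTokenClassification

    # Hypothetical entity types for built-environment abstracts (not the paper's label set).
    LABELS = ["O", "B-METHOD", "I-METHOD", "B-MATERIAL", "I-MATERIAL"]

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModelForTokenClassification.from_pretrained(
        "bert-base-uncased", num_labels=len(LABELS)
    )

    text = "Building information modelling improves precast concrete workflows."
    inputs = tokenizer(text, return_tensors="pt")

    with torch.no_grad():
        logits = model(**inputs).logits              # shape: (1, seq_len, num_labels)

    pred_ids = logits.argmax(dim=-1)[0].tolist()
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
    for token, label_id in zip(tokens, pred_ids):
        print(token, LABELS[label_id])               # classification head is untrained here, so labels are random

    In the paper's setting, such a head would be fine-tuned on the annotated NER instances (with an analogous sequence-classification head for relation extraction) before evaluation.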

    Exploiting knowledge graphs for entity-centric prediction

    As a special kind of "big data", text data can be regarded as data reported by human sensors. Since humans are far more intelligent than physical sensors, text data contains directly useful information and knowledge about the real world, making it possible to predict real-world phenomena from text. Because all application domains involve humans, text-based prediction has widespread applications, especially for optimizing decision making. While text-based prediction resembles text classification when formulated as a supervised learning problem, it is more challenging because the variable to be predicted is generally not mentioned directly in the text, so there is a "semantic gap" between the target variable and the surface features conventionally used to represent text data. How to bridge this gap is a key technical challenge that has not been well studied in existing work. In this thesis, we propose to leverage the increasingly available knowledge graphs on the Web to bridge the gap by focusing the text representation on the elements of a knowledge graph that are relevant to the prediction task. We mainly focus on a family of text-based prediction tasks, entity-centric classification and regression, in which the response variable can be treated as an attribute of a group of central entities. As a form of knowledge representation, knowledge graphs have widespread applications in information retrieval, text mining, and natural language processing, and many have been constructed and applied to diverse real-world applications. A knowledge graph can enhance the interpretability of textual information from the perspective of predictive analytics and hence help discover more effective features. Despite this success, a main deficiency of many existing works is that the knowledge graph is pre-constructed and remains unchanged when applied to very different application tasks. Such a static, task-independent knowledge graph, while useful, is non-optimal for any specific application because of the unnecessary cost of processing large amounts of irrelevant knowledge and the insufficient coverage of task-specific knowledge. To address this limitation, we propose to construct a task-aware knowledge graph (TAKG) that contains only the knowledge relevant to a particular task and absorbs additional relevant knowledge from the data used in that task. We present a general formal framework for constructing a task-aware knowledge graph, develop specific algorithms for constructing one for entity-centric prediction in both knowledge-based and task-dependent ways, and apply it to a movie review categorization task. We propose two methods to expand the knowledge graph. One is to discover new entities and relations with a joint embedding model that learns an embedding vector for each entity and relation; in this way, finer-grained relationships pre-defined by the knowledge graph can be identified between related entities. An alternative is to use more general word relations, e.g., paradigmatic and syntagmatic relations, to expand the knowledge graph with loosely related entities. Both methods work under certain circumstances, but the former is helpful in a wider range of applications. We also conduct a systematic study of knowledge-graph-assisted feature engineering. We propose several ways to construct knowledge-graph-based features and investigate their performance in multiple real applications. Our study shows that different types of application may favor different ways of constructing knowledge-graph-based features. We also find that the coverage of the knowledge graph is important: if it cannot provide sufficient background knowledge, the effectiveness of the knowledge-graph-based features suffers. Moreover, the generated features can sometimes be very noisy, especially when the correlation between the text and the response variable is weak. To distinguish signal features from noise, we propose a two-stage filtering method that further prunes the features. Our experimental results show that the pruned knowledge-graph-based features have strong predictive power, which again confirms that leveraging text data is promising for predicting real-world phenomena.
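
    The joint embedding model mentioned in the abstract is described only at a high level. As one common instantiation (an assumption, not necessarily the thesis's exact model), the sketch below scores candidate triples TransE-style, where a small value of ||h + r - t|| suggests a plausible new edge for expanding the knowledge graph; the entity and relation names are made up for the movie-review setting.

    import numpy as np

    rng = np.random.default_rng(0)
    DIM = 50

    # Hypothetical vocabularies drawn from a task-aware KG for movie reviews.
    entities = {"Inception": 0, "Christopher_Nolan": 1, "Interstellar": 2}
    relations = {"directed_by": 0}

    E = rng.normal(scale=0.1, size=(len(entities), DIM))   # entity embeddings
    R = rng.normal(scale=0.1, size=(len(relations), DIM))  # relation embeddings

    def score(head: str, rel: str, tail: str) -> float:
        """TransE-style plausibility: smaller ||h + r - t|| means a more likely triple."""
        h, r, t = E[entities[head]], R[relations[rel]], E[entities[tail]]
        return float(np.linalg.norm(h + r - t))

    # After training with a margin-based ranking loss (not shown), triples with
    # low scores would be candidates to add when expanding the knowledge graph.
    print(score("Inception", "directed_by", "Christopher_Nolan"))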
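
    Likewise, the two-stage filtering of noisy knowledge-graph-based features can be sketched generically as a cheap univariate screen followed by a multivariate, model-based pruning step. The concrete criteria and estimators below (scikit-learn's f_regression screen and a Lasso-based selector, with made-up data shapes) are illustrative assumptions, not the thesis's exact procedure.

    import numpy as np
    from sklearn.feature_selection import SelectKBest, f_regression, SelectFromModel
    from sklearn.linear_model import Lasso

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 500))            # documents x knowledge-graph-derived features
    y = 2.0 * X[:, 0] + rng.normal(size=200)   # entity-centric response variable

    # Stage 1: univariate screen of each feature against the response.
    stage1 = SelectKBest(score_func=f_regression, k=50).fit(X, y)
    X_screened = stage1.transform(X)

    # Stage 2: model-based pruning of the survivors, keeping only the features
    # the sparse model actually relies on.
    stage2 = SelectFromModel(Lasso(alpha=0.1)).fit(X_screened, y)
    X_pruned = stage2.transform(X_screened)

    print(X.shape, "->", X_screened.shape, "->", X_pruned.shape)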