139 research outputs found

    Beyond NED: Fast and Effective Search Space Reduction for Complex Question Answering over Knowledge Bases


    Towards a big data reference architecture


    Efficient Contextualization using Top-k Operators for Question Answering over Knowledge Graphs

    Answering complex questions over knowledge bases (KB-QA) faces huge input data with billions of facts, involving millions of entities and thousands of predicates. For efficiency, QA systems first reduce the answer search space by identifying a set of facts that is likely to contain all answers and relevant cues. The most common technique for doing this is to apply named entity disambiguation (NED) systems to the question, and retrieve KB facts for the disambiguated entities. This work presents CLOCQ, an efficient method that prunes irrelevant parts of the search space using KB-aware signals. CLOCQ uses a top-k query processor over score-ordered lists of KB items that combine signals about lexical matching, relevance to the question, coherence among candidate items, and connectivity in the KB graph. Experiments with two recent QA benchmarks for complex questions demonstrate the superiority of CLOCQ over state-of-the-art baselines with respect to answer presence, size of the search space, and runtimes.

    Attribute lattice: a graph-based conceptual modeling grammar for heterogeneous data

    One key characteristic of big data is variety. With massive and growing amounts of data existing in independent and heterogeneous (structured and unstructured) sources, assigning consistent and interoperable data semantics, which is essential for meaningful use of data, is an increasingly important challenge. I argue that conceptual models, in contrast to their traditional roles in information system development, can be used to represent data semantics as perceived by the user of the data. In this thesis, I use principles from philosophical ontology, human cognition (i.e., classification theory), and graph theory to offer a theory-based conceptual modeling grammar for this purpose. This grammar reflects data from the perspective of its users and is independent of the data source schema. I formally define the concept of attribute lattice as a graph-based, schema-free conceptual modeling grammar that represents attributes of instances in the domain of interest and precedence relations among them. Each node in an attribute lattice represents an attribute, a true statement (predicate) about some instances in the domain. Each directed arc represents a precedence relation indicating that possessing one attribute implies possessing another attribute. In this thesis, based on the premise that inherent classification is a barrier that hinders semantic interoperation of heterogeneous data sources, a human-cognition-based conceptual modeling grammar is introduced as an effective way to resolve semantic heterogeneity. This grammar represents the precedence relationship among attributes as perceived by the human user and provides a mechanism to infer classes based on the pattern of precedences. Hence, a key contribution of attribute lattice is semantic relativism; that is, the classification in this grammar relies on the pattern of precedence relationships among attributes rather than on fixed classes.
This modeling grammar uses the immediate and semantic neighbourhoods of an attribute to designate an attribute as either a category, a class or a property and to specify the expansion of an attribute, that is, the attributes which are semantically equal to the given attribute. The introduced conceptual modeling grammar is implemented as an artifact to store and manage attribute lattices, to graphically represent them, and to integrate lattices from various heterogeneous sources. With the ever-increasing amount of unstructured data (mostly text data) from various data sources such as social media, integrating text data with other data sources has gained considerable attention. This massive amount of data, however, makes finding the data relevant to a topic of interest a new challenge. I argue that the attribute lattice provides a robust semantic foundation to address this information retrieval challenge from unstructured data sources. Hence, a topic modeling approach based on the attribute lattice is proposed for Twitter. This topic model conceptualizes the topic structure of tweets related to the domain of interest and enhances information retrieval by improving the semantic interpretability of hashtags.

    A survey on the development status and application prospects of knowledge graph in smart grids

    With the advent of the electric power big data era, semantic interoperability and interconnection of power data have received extensive attention. Knowledge graph technology is a new method for describing the complex relationships between concepts and entities in the objective world, and it has attracted wide attention because of its robust knowledge inference ability. Especially with the proliferation of measurement devices and the exponential growth of electric power data, the electric power knowledge graph provides new opportunities to resolve the contradictions between the massive power resources and the continuously increasing demands for intelligent applications. In an attempt to fulfil the potential of the knowledge graph and deal with the various challenges faced, as well as to obtain insights to achieve business applications of smart grids, this work first presents a holistic study of knowledge-driven intelligent application integration. Specifically, a detailed overview of electric power knowledge mining is provided. Then, the overview of the knowledge graph in smart grids is introduced. Moreover, the architecture of the big knowledge graph platform for smart grids and critical technologies are described. Furthermore, this paper comprehensively elaborates on the application prospects leveraged by the knowledge graph oriented to smart grids, power consumer service, decision-making in dispatching, and operation and maintenance of power equipment. Finally, issues and challenges are summarised. Comment: IET Generation, Transmission & Distribution

    Technologies and Applications for Big Data Value

    This open access book explores cutting-edge solutions and best practices for big data and data-driven AI applications for the data-driven economy. It provides the reader with a basis for understanding how technical issues can be overcome to offer real-world solutions to major industrial areas. The book starts with an introductory chapter that provides an overview of the book by positioning the following chapters in terms of their contributions to technology frameworks which are key elements of the Big Data Value Public-Private Partnership and the upcoming Partnership on AI, Data and Robotics. The remainder of the book is then arranged in two parts. The first part, “Technologies and Methods”, contains horizontal contributions of technologies and methods that enable data value chains to be applied in any sector. The second part, “Processes and Applications”, details experience reports and lessons from using big data and data-driven approaches in processes and applications. Its chapters are co-authored with industry experts and cover domains including health, law, finance, retail, manufacturing, mobility, and smart cities. Contributions emanate from the Big Data Value Public-Private Partnership and the Big Data Value Association, which have acted as the European data community's nucleus to bring together businesses with leading researchers to harness the value of data to benefit society, business, science, and industry. The book is of interest to two primary audiences: first, undergraduate and postgraduate students and researchers in various fields, including big data, data science, data engineering, and machine learning and AI; and second, practitioners and industry experts engaged in data-driven systems, software design and deployment projects who are interested in employing these advanced methods to address real-world problems.