19 research outputs found

    Development of knowledge representation based on Markov logical networks in the business process management system

    This paper studies the problem of constructing a knowledge representation for a process management system by analyzing business process behavior recorded as event logs. Each event characterizes an action of the business process. The problem is relevant because, when managing complex knowledge-intensive business processes, performers may change the sequence of actions based on additional knowledge about the subject area. As a result, a discrepancy arises between the process and its model, which complicates further management of the business process. To eliminate this discrepancy, the additional knowledge must be formalized and applied in process management, which requires an appropriate knowledge representation. The proposed knowledge representation model takes into account both static and dynamic characteristics of the business process. Static characteristics are specified by facts and rules whose arguments are attributes of log events. Facts and rules are formed from corresponding templates. Attributes specify the property values of the objects the business process operates on. Dynamic features are determined through the current probability distribution over rule execution, taking into account the attributes of the current log event. The proposed model differs in that it accounts for constraints on the permissible sequences of business process actions, as well as constraints based on a priori knowledge of the subject area. Such constraints reduce the complexity of finding the probabilities of successful completion of the business process by shrinking the set of admissible traces when performers have changed the sequence of actions. In practical terms, the model supports decision-making in managing knowledge-intensive business processes by predicting the probability of reaching the final state of the process, taking into account the attributes of log events.
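
    For orientation, the standard definition of a Markov logic network can be written as below. This is the general textbook form, not necessarily the exact model proposed in the paper; the symbols are assumptions introduced here: each rule F_i carries a weight w_i, n_i(x) counts the true groundings of F_i in a possible world x, and Z is the normalizing constant.

```latex
P(X = x) = \frac{1}{Z} \exp\Big( \sum_i w_i \, n_i(x) \Big),
\qquad
Z = \sum_{x'} \exp\Big( \sum_i w_i \, n_i(x') \Big)
```

    In this form, a hard constraint such as a forbidden sequence of actions corresponds to a rule with a very large weight, so worlds (traces) that violate it receive negligible probability.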

    Reasoning with Annotations of Texts

    Linguistic and semantic annotations are important features for text-based applications. However, achieving and maintaining a good quality of a set of annotations is known to be a complex task. Many ad hoc approaches have been developed to produce various types of annotations, while comparing those annotations to improve their quality is still rare. In this paper, we propose a framework in which both linguistic and domain information can cooperate to reason with annotations. The underlying knowledge representation issues are carefully analyzed and solved by studying a higher order logic, which accounts for the cooperation of different sorts of knowledge. An algorithm, implemented in our prototype, is proposed to reduce this logic to classical description logics by preserving the semantics, which allows us to benefit from cutting-edge Semantic Web reasoners. An application scenario shows interesting merits of this framework on reasoning with annotations of texts.

    Facilitating Technology Transfer by Patent Knowledge Graph

    Technologies are among the most important driving forces of societal development, and realizing their value depends heavily on technology transfer. Given this importance, an increasingly large amount of money has been invested worldwide to encourage technological innovation and technology transfer. However, while numerous innovative technologies are invented, most of them remain latent and un-transferred. Comprehending technical documents and identifying appropriate technologies for given needs are challenging problems in technology transfer because of information asymmetry and information overload. There is no common knowledge base that can reveal the technical details of technical documents and assist with the identification of suitable technologies. To bridge this gap, this research proposes to construct a knowledge graph for facilitating technology transfer. A case study shows the construction of a patent knowledge graph and illustrates its benefit for finding relevant patents, the most common and important form of technologies.

    Joining Extractions of Regular Expressions

    Regular expressions with capture variables, also known as "regex formulas," extract relations of spans (interval positions) from text. These relations can be further manipulated via relational algebra as studied in the context of document spanners, Fagin et al.'s formal framework for information extraction. We investigate the complexity of querying text by Conjunctive Queries (CQs) and Unions of CQs (UCQs) on top of regex formulas. We show that the lower bounds (NP-completeness and W[1]-hardness) from the relational world also hold in our setting; in particular, hardness already holds for single-character text. Yet, the upper bounds from the relational world do not carry over. Unlike the relational world, acyclic CQs, and even gamma-acyclic CQs, are hard to compute. The source of hardness is that it may be intractable to instantiate the relation defined by a regex formula, simply because it has an exponential number of tuples. Yet, we are able to establish general upper bounds. In particular, UCQs can be evaluated with polynomial delay, provided that every CQ has a bounded number of atoms (while unions and projection can be arbitrary). Furthermore, UCQ evaluation is solvable with FPT (Fixed-Parameter Tractable) delay when the parameter is the size of the UCQ.
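
    The idea of span relations and joins over them can be illustrated with a minimal Python sketch. It is an illustration only, not the paper's algorithm: the helper names `extract` and `join` are hypothetical, Python named groups stand in for capture variables, and the brute-force scan over all windows records one assignment per window, which suffices only for unambiguous patterns like the ones used here.

```python
import re

def extract(pattern, text):
    """Return the span relation of `pattern` over `text` (hypothetical helper).

    Every window [i, j) of the text is tried with fullmatch, and the spans of
    the named groups (the capture variables) are recorded as one row.
    """
    rx = re.compile(pattern)
    names = sorted(rx.groupindex)
    rows = set()
    n = len(text)
    for i in range(n + 1):
        for j in range(i, n + 1):
            m = rx.fullmatch(text, i, j)
            if m:
                rows.add(tuple((name, m.span(name)) for name in names))
    return [dict(row) for row in sorted(rows)]

def join(left, right):
    """Natural join: combine rows that agree on all shared variables."""
    out = []
    for a in left:
        for b in right:
            shared = a.keys() & b.keys()
            if all(a[v] == b[v] for v in shared):
                out.append({**a, **b})
    return out

text = "k=12;v"
r = extract(r"(?P<x>[a-z]+)=(?P<y>\d+)", text)   # relation over x, y
s = extract(r"(?P<y>\d+);(?P<z>[a-z]+)", text)   # relation over y, z
print(join(r, s))  # one row: x=(0, 1), y=(2, 4), z=(5, 6)

# Even a one-variable formula yields quadratically many spans:
print(len(extract(r"(?P<x>a*)", "aaaa")))  # 15
```

    The second print hints at why materializing the relation can be costly: one variable over a text of length n can already produce on the order of n² spans, and the count grows further with each additional variable.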

    Joining extractions of regular expressions

    Regular expressions with capture variables, also known as “regex formulas,” extract relations of spans (interval positions) from text. These relations can be further manipulated via relational algebra as studied in the context of “document spanners,” Fagin et al.’s formal framework for information extraction. We investigate the complexity of querying text by Conjunctive Queries (CQs) and Unions of CQs (UCQs) on top of regex formulas. Such queries have been investigated in prior work on document spanners, but little is known about the (combined) complexity of their evaluation. We show that the lower bounds (NP-completeness and W[1]-hardness) from the relational world also hold in our setting; in particular, hardness already holds for single-character text. Yet, the upper bounds from the relational world do not carry over. Unlike the relational world, acyclic CQs, and even gamma-acyclic CQs, are hard to compute. The source of hardness is that it may be intractable to instantiate the relation defined by a regex formula, simply because it has an exponential number of tuples. Yet, we are able to establish general upper bounds. In particular, UCQs can be evaluated with polynomial delay, provided that every CQ has a bounded number of atoms (while unions and projection can be arbitrary). Furthermore, UCQ evaluation is solvable with FPT (Fixed-Parameter Tractable) delay when the parameter is the size of the UCQ.

    Query Optimization for On-Demand Information Extraction Tasks over Text Databases

    Many modern applications involve analyzing large amounts of data that come from unstructured text documents. In its original format, the data contains information that, if extracted, can give more insight and help in the decision-making process. The ability to answer structured SQL queries over unstructured data allows for more complex data analysis. Querying unstructured data can be accomplished with the help of information extraction (IE) techniques. The traditional way is the Extract-Transform-Load (ETL) approach, which performs all possible extractions over the document corpus and stores the extracted relational results in a data warehouse. Then, the extracted data is queried. The ETL approach produces results that are out of date and causes an explosion in the number of possible relations and attributes to extract. Therefore, new approaches that perform extraction on the fly were developed; however, previous efforts relied on specialized extraction operators, or particular IE algorithms, which limited the optimization opportunities of such queries. In this work, we propose an on-line approach that integrates the engine of the database management system with IE systems using a new type of view called extraction views. Queries on text documents are evaluated using these extraction views, which get populated at query time with newly extracted data. Our approach enables the optimizer to apply all well-defined optimization techniques. The optimizer selects the best execution plan using a defined cost model that considers a user-defined balance between the cost and quality of extraction, and we explain the trade-off between the two factors. The main contribution is the ability to run on-demand information extraction that reflects the latest changes in the data, while avoiding unnecessary extraction from irrelevant text documents.
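
    The notion of a view that is populated at query time can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the system described in the abstract: the class `ExtractionView`, the function `price_extractor`, and the sample documents are all hypothetical, and the sketch models neither the optimizer's cost model nor cache invalidation when documents change.

```python
import re

class ExtractionView:
    """Toy view whose rows are extracted lazily, only for documents a query touches."""

    def __init__(self, docs, extractor):
        self.docs = docs            # doc_id -> raw text
        self.extractor = extractor  # text -> list of row dicts
        self.cache = {}             # doc_id -> rows already extracted
        self.extract_calls = 0      # how many documents were actually processed

    def rows(self, doc_filter=lambda text: True):
        for doc_id, text in self.docs.items():
            if not doc_filter(text):
                continue            # skip irrelevant documents without extracting
            if doc_id not in self.cache:
                self.extract_calls += 1
                self.cache[doc_id] = self.extractor(text)
            for row in self.cache[doc_id]:
                yield {"doc_id": doc_id, **row}

def price_extractor(text):
    """Hypothetical IE step: pull (item, price) pairs out of free text."""
    return [{"item": m.group(1), "price": float(m.group(2))}
            for m in re.finditer(r"(\w+) costs \$(\d+(?:\.\d+)?)", text)]

docs = {
    1: "Tea costs $3.50 today.",
    2: "No prices here.",
    3: "Coffee costs $4 and cake costs $5.",
}
view = ExtractionView(docs, price_extractor)

# Roughly: SELECT item FROM prices WHERE price < 4.5
cheap = [r["item"] for r in view.rows(lambda t: "costs" in t) if r["price"] < 4.5]
print(cheap, view.extract_calls)
```

    The cheap keyword filter plays the role of a pushed-down predicate: document 2 is never passed to the extractor, which is the kind of saving an on-demand approach aims for compared with extracting everything up front.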