30 research outputs found

    Intelligent Information Access to Linked Data - Weaving the Cultural Heritage Web

    The subject of the dissertation is ALAP, an information alignment experiment involving two cultural heritage information systems: the Perseus Digital Library and Arachne. In modern societies, information integration is gaining importance for many tasks such as business decision making or even catastrophe management. It is beyond doubt that the information available in digital form can offer users new ways of interaction. In the humanities and cultural heritage communities, too, more and more information is being published online. But in many situations the way that information has been made publicly available disrupts the research process because it is heterogeneous and distributed. Therefore, integrated information will be a key factor in pursuing successful research, and the need for information alignment is widely recognized. ALAP is an attempt to integrate information from Perseus and Arachne, not only on a schema level, but also by performing entity resolution. To that end, technical peculiarities and philosophical implications of the concepts of identity and co-reference are discussed. Multiple approaches to information integration and entity resolution are discussed and evaluated. The methodology used to implement ALAP is mainly rooted in the fields of information retrieval and knowledge discovery. First, an exploratory analysis was performed on both information systems to get a first impression of the data. After that, (semi-)structured information from both systems was extracted and normalized. Then, a clustering algorithm was used to reduce the number of entity comparisons needed. Finally, a thorough matching was performed within the different clusters. ALAP helped to identify the challenges and highlighted the opportunities that arise during the attempt to align cultural heritage information systems.
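The idea of grouping records before matching, so that only records in the same group are compared, can be sketched as follows. This is a minimal illustration only: the abstract does not name the clustering algorithm, so simple key-based blocking stands in for it here, and the function names, the three-letter key, and the sample records are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def blocking_key(name):
    # Hypothetical cheap key: the first three letters of the lowercased name.
    return name.strip().lower()[:3]

def candidate_pairs(records):
    """Return the record-id pairs that share a blocking key.

    Only these pairs would be passed on to the expensive matching step.
    """
    blocks = defaultdict(list)
    for rec_id, name in records:
        blocks[blocking_key(name)].append(rec_id)
    pairs = set()
    for ids in blocks.values():
        for a, b in combinations(sorted(ids), 2):
            pairs.add((a, b))
    return pairs

# Invented sample records: (id, object name)
records = [
    ("p1", "Athena Parthenos"),
    ("a1", "Athena Parthenos statue"),
    ("p2", "Zeus of Olympia"),
    ("a2", "Zeus Olympia"),
    ("a3", "Hermes"),
]
pairs = candidate_pairs(records)
all_pairs = len(records) * (len(records) - 1) // 2
```

With five records, exhaustive comparison would need ten pairs, while the blocked version keeps only the pairs whose keys coincide. A real system would need a key (or clustering criterion) that tolerates spelling variation, which this sketch does not.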

    Concepts and Methods from Artificial Intelligence in Modern Information Systems – Contributions to Data-driven Decision-making and Business Processes

    Today, organizations are facing a variety of challenging, technology-driven developments, three of the most notable being the surge in uncertain data, the emergence of unstructured data and a complex, dynamically changing environment. These developments require organizations to transform in order to stay competitive. Artificial Intelligence, with its fields of decision-making under uncertainty, natural language processing and planning, offers valuable concepts and methods to address these developments. The dissertation at hand utilizes and furthers these contributions in three focal points to address research gaps in existing literature and to provide concrete concepts and methods that support organizations in transforming and improving data-driven decision-making, business processes and business process management. In particular, the focal points are the assessment of data quality, the analysis of textual data and the automated planning of process models. In regard to data quality assessment, probability-based approaches for measuring consistency and identifying duplicates as well as requirements for data quality metrics are suggested. With respect to the analysis of textual data, the dissertation proposes a topic modeling procedure to gain knowledge from CVs as well as a model based on sentiment analysis to explain ratings from customer reviews. Regarding automated planning of process models, concepts and algorithms for an automated construction of parallelizations in process models, an automated adaptation of process models and an automated construction of multi-actor process models are provided.

    Enrichment of Wind Turbine Health History for Condition-Based Maintenance

    This research develops a methodology for and shows the benefit of linking records of wind turbine maintenance. It analyses commercially sensitive real-world maintenance records with the aim of improving the productivity of offshore wind farms. The novel achievements of this research are that it applies multi-feature record linkage techniques to maintenance data, that it applies statistical techniques for the interval estimation of a binomial proportion to record linkage techniques and that it estimates the distribution of the coverage error of statistical techniques for the interval estimation of a binomial proportion. The main contribution of this research is a process for the enrichment of offshore wind turbine health history. The economic productivity of a wind farm depends on the price of electricity and on the suitability of the weather, both of which are beyond the control of a maintenance team, but also on the cost of operating the wind farm, on the cost of maintaining the wind turbines and on how much of the wind farm’s potential production of electricity is lost to outages. Improvements in maintenance scheduling, in condition-based maintenance, in troubleshooting and in the measurement of maintenance effectiveness all require knowledge of the health history of the plant. To this end, this thesis presents new techniques for linking together existing records of offshore wind turbine health history. Multi-feature record linkage techniques are used to link records of maintenance data together. Both the quality of record linkage and the uncertainty of that quality are assessed. The quality of record linkage was measured by comparing the generated set of linked records to a gold standard set of linked records identified in collaboration with offshore wind turbine maintenance experts. The process for the enrichment of offshore wind turbine health history developed in this research requires a vector of weights and thresholds. 
The agreement and disagreement weights for each feature indicate the importance of the feature to the quality of record linkage. This research uses differential evolution to globally optimise this vector of weights and thresholds. There is inevitably some uncertainty associated with the measurement of the quality of record linkage, and consequently with the optimum values for the weights and thresholds; this research not only measures the quality of record linkage but also identifies robust techniques for the estimation of its uncertainty.
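A rough sketch of how per-feature agreement and disagreement weights, thresholds, and a binomial interval estimate can fit together is given below. It is not the thesis's implementation: the feature names, weight values, two-threshold decision rule, and the choice of the Wilson score interval are all assumptions made for illustration, and the differential evolution step that tunes the weights is omitted.

```python
import math

def link_score(rec_a, rec_b, weights):
    """Sum the agreement or disagreement weight of every feature."""
    score = 0.0
    for feature, (agree_w, disagree_w) in weights.items():
        if rec_a.get(feature) == rec_b.get(feature):
            score += agree_w
        else:
            score += disagree_w
    return score

def classify(score, lower, upper):
    # Assumed two-threshold rule: link, non-link, or undecided in between.
    if score >= upper:
        return "link"
    if score <= lower:
        return "non-link"
    return "possible link"

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half

# Hypothetical weights: feature -> (agreement weight, disagreement weight)
weights = {"turbine_id": (4.0, -3.0), "date": (2.0, -1.0)}
a = {"turbine_id": "T07", "date": "2015-03-01"}
b = {"turbine_id": "T07", "date": "2015-03-02"}

score = link_score(a, b, weights)
decision = classify(score, lower=0.0, upper=5.0)

# Invented figures: 90 of 100 generated links agree with the gold standard.
low, high = wilson_interval(90, 100)
```

The interval expresses the uncertainty in a linkage-quality proportion measured against a gold standard; other interval estimators for a binomial proportion could be substituted in the same place.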

    Data-stream driven Fuzzy-granular approaches for system maintenance

    Intelligent systems are now inherent to society, supporting synergistic human-machine collaboration. Beyond economic and climate factors, energy consumption is strongly affected by the performance of computing systems. Poorly functioning software may invalidate any improvement attempt. In addition, data-driven machine learning algorithms are the basis for human-centered applications, and their interpretability is one of the most important features of computational systems. Software maintenance is a critical discipline for supporting automatic and life-long system operation. As most software registers its inner events by means of logs, log analysis is one approach to sustaining system operation. Logs are characterized as big data assembled in large-flow streams, being unstructured, heterogeneous, imprecise, and uncertain. This thesis addresses fuzzy and neuro-granular methods that provide maintenance solutions applied to anomaly detection (AD) and log parsing (LP), dealing with data uncertainty and identifying ideal time periods for detailed software analyses. LP provides a deeper semantic interpretation of the anomalous occurrences. The solutions evolve over time and are general-purpose, being highly applicable, scalable, and maintainable. Granular classification models, namely the Fuzzy set-Based evolving Model (FBeM), the evolving Granular Neural Network (eGNN), and the evolving Gaussian Fuzzy Classifier (eGFC), are compared on the AD problem. The evolving Log Parsing (eLP) method is proposed for the automatic parsing of system logs. All the methods use recursive mechanisms to create, update, merge, and delete information granules according to the data behavior. For the first time in the evolving intelligent systems literature, the proposed method, eLP, is able to process streams of words and sentences. 
Regarding AD accuracy, FBeM achieved (85.64 ± 3.69)%; eGNN reached (96.17 ± 0.78)%; eGFC obtained (92.48 ± 1.21)%; and eLP reached (96.05 ± 1.04)%. Besides being competitive, eLP generates a log grammar and presents a higher level of model interpretability.

    A Study on User Behavior in Online Games

    Doctoral dissertation -- Seoul National University Graduate School: College of Business Administration, Department of Business Administration, February 2018. 유병준. This dissertation consists of two essays on user behavior in online games. In the first essay, I identify multi-botting cheaters and measure their impact using basic database information such as user ID, playtime and item purchase records. I address the data availability issue and propose a method for companies with limited data and resources. The method also avoids large-scale transaction processing and complex development, which are fairly common in existing cheating detection methods. To identify cheaters, I use two algorithms, DTW (Dynamic Time Warping) and JWD (Jaro–Winkler distance). I also measure the effects of hacking tool usage by employing DID (Difference in Differences). The analysis yields some counter-intuitive results. Overall, cheaters constitute a minute share of users, only about 0.25%. However, they account for approximately 12% of revenue. Furthermore, the usage of hacking tools causes a 102% increase in playtime and a 79% increase in purchases right after users start to use them. Additional analysis shows that the positive effects of hacking tools are not just short-term. A Granger causality test also reveals that cheating users' activity does not affect other users' purchase or playtime trends. In the second essay, I propose a churn prediction methodology that serves two major purposes in the mobile casual game context: first, reducing the cost of data preparation, which is growing in importance in the big-data environment; second, providing an algorithm whose performance is comparable to that of the state of the art. As a result, I greatly lower the cost of the data preparation process by using the sequence structure of the log data as it is. 
In addition, the sequence classification model based on CNN-LSTM shows superior results compared to the models of previous studies.
    Contents:
    Essay 1. Is Cheating Always Bad? A study of cheating identification and measurement of the effect: 1. Introduction; 2. Literature Review; 3. Data; 4. Hypotheses; 5. Methodology (5.1 Cheating Identification; 5.2 Measurement of Cheating Tool Usage Effect); 6. Result (6.1 Cheating Identification; 6.2 Measurement of Cheating Tool Usage Effect); 7. Additional Analysis (7.1 Lifespan of Cheating Users; 7.2 Granger Causality Test); 8. Discussion and Conclusion; 9. References
    Essay 2. Churn Prediction in Mobile Casual Game: A Deep Sequence Classification Approach: 1. Introduction; 2. Definition of Churn; 3. Related Works; 4. Data; 5. Methodology (5.1 Data Preparation; 5.2 Prediction Model); 6. Result and Discussion; 7. References
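The first essay names Dynamic Time Warping (DTW) as one of the algorithms used to identify cheaters. As a minimal sketch of what DTW computes, the function below gives the classic dynamic-programming distance between two numeric sequences. How the dissertation applies DTW (which series are compared, which local cost and which decision threshold are used) is not stated in the abstract, so the playtime series here are invented and the absolute-difference cost is an assumption.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) DTW distance with absolute-difference cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances, b[j-1] is reused
                cost[i][j - 1],      # b advances, a[i-1] is reused
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]

# Hypothetical hourly playtime (minutes) for three accounts.
account_a = [60, 60, 0, 0, 45, 60]
account_b = [60, 60, 0, 0, 0, 45, 60]   # same pattern, one extra idle hour
account_c = [0, 10, 30, 5, 0, 0]

same_pattern = dtw_distance(account_a, account_b)
different = dtw_distance(account_a, account_c)
```

Because DTW may stretch one sequence against the other, two accounts with the same play pattern shifted or padded in time get a small distance, which is what makes it a plausible signal for accounts driven by the same bot.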

    Health of an aging America: issues on data for policy analysis

    The papers in this report were background to a study conducted by the Panel on Statistics for an Aging Population, of the Committee on National Statistics, focusing on data needed over the next decade for health policy analysis for an aging America. Includes bibliographies. 198

    Artificial Intelligence and Cognitive Computing

    Artificial intelligence (AI) is a subject garnering increasing attention in both academia and industry today. The understanding is that AI-enhanced methods and techniques create a variety of opportunities for improving basic and advanced business functions, including production processes, logistics, financial management and others. As this collection demonstrates, AI-enhanced tools and methods tend to offer more precise results in the fields of engineering, financial accounting, tourism, air-pollution management and many more. The objective of this collection is to bring these topics together to offer the reader a useful primer on how AI-enhanced tools and applications can be of use in today’s world. In the context of the frequently fearful, skeptical and emotion-laden debates on AI and its added value, this volume promotes a positive perspective on AI and its impact on society. AI is part of a broader ecosystem of sophisticated tools, techniques and technologies, and therefore it is not immune to developments in that ecosystem. It is thus imperative that inter- and multidisciplinary research on AI and its ecosystem is encouraged. This collection contributes to that