655 research outputs found

    Graph BI & analytics: current state and future challenges

    In an increasingly competitive market, making well-informed decisions requires the analysis of a wide range of heterogeneous, large and complex data. This paper focuses on the emerging field of graph warehousing. Graphs are widespread structures with great expressive power. They are used for modeling highly complex and interconnected domains, and for efficiently solving emerging big data applications. This paper presents the current status and open challenges of graph BI and analytics, and motivates the need for new warehousing frameworks aware of the topological nature of graphs. We survey the topics of graph modeling, management, processing and analysis in graph warehouses. Then we conclude by discussing future research directions and positioning them within a unified architecture of a graph BI and analytics framework. Peer Reviewed. Postprint (author's final draft)

    Mining Traversal Patterns from Weighted Traversals and Graph

    μ‹€μ„Έκ³„μ˜ λ§Žμ€ λ¬Έμ œλ“€μ€ κ·Έλž˜ν”„μ™€ κ·Έ κ·Έλž˜ν”„λ₯Ό μˆœνšŒν•˜λŠ” νŠΈλžœμž­μ…˜μœΌλ‘œ λͺ¨λΈλ§λ  수 μžˆλ‹€. 예λ₯Ό λ“€λ©΄, μ›Ή νŽ˜μ΄μ§€μ˜ μ—°κ²°κ΅¬μ‘°λŠ” κ·Έλž˜ν”„λ‘œ ν‘œν˜„λ  수 있고, μ‚¬μš©μžμ˜ μ›Ή νŽ˜μ΄μ§€ λ°©λ¬Έκ²½λ‘œλŠ” κ·Έ κ·Έλž˜ν”„λ₯Ό μˆœνšŒν•˜λŠ” νŠΈλžœμž­μ…˜μœΌλ‘œ λͺ¨λΈλ§λ  수 μžˆλ‹€. 이와 같이 κ·Έλž˜ν”„λ₯Ό μˆœνšŒν•˜λŠ” νŠΈλžœμž­μ…˜μœΌλ‘œλΆ€ν„° μ€‘μš”ν•˜κ³  κ°€μΉ˜ μžˆλŠ” νŒ¨ν„΄μ„ μ°Ύμ•„λ‚΄λŠ” 것은 의미 μžˆλŠ” 일이닀. μ΄λŸ¬ν•œ νŒ¨ν„΄μ„ μ°ΎκΈ° μœ„ν•œ μ§€κΈˆκΉŒμ§€μ˜ μ—°κ΅¬μ—μ„œλŠ” μˆœνšŒλ‚˜ κ·Έλž˜ν”„μ˜ κ°€μ€‘μΉ˜λ₯Ό κ³ λ €ν•˜μ§€ μ•Šκ³  λ‹¨μˆœνžˆ λΉˆλ°œν•˜λŠ” νŒ¨ν„΄λ§Œμ„ μ°ΎλŠ” μ•Œκ³ λ¦¬μ¦˜μ„ μ œμ•ˆν•˜μ˜€λ‹€. μ΄λŸ¬ν•œ μ•Œκ³ λ¦¬μ¦˜μ˜ ν•œκ³„λŠ” 보닀 μ‹ λ’°μ„± 있고 μ •ν™•ν•œ νŒ¨ν„΄μ„ νƒμ‚¬ν•˜λŠ” 데 어렀움이 μžˆλ‹€λŠ” 것이닀. λ³Έ λ…Όλ¬Έμ—μ„œλŠ” μˆœνšŒλ‚˜ κ·Έλž˜ν”„μ˜ 정점에 λΆ€μ—¬λœ κ°€μ€‘μΉ˜λ₯Ό κ³ λ €ν•˜μ—¬ νŒ¨ν„΄μ„ νƒμ‚¬ν•˜λŠ” 두 가지 방법듀을 μ œμ•ˆν•œλ‹€. 첫 번째 방법은 κ·Έλž˜ν”„λ₯Ό μˆœνšŒν•˜λŠ” 정보에 κ°€μ€‘μΉ˜κ°€ μ‘΄μž¬ν•˜λŠ” κ²½μš°μ— 빈발 순회 νŒ¨ν„΄μ„ νƒμ‚¬ν•˜λŠ” 것이닀. κ·Έλž˜ν”„ μˆœνšŒμ— 뢀여될 수 μžˆλŠ” κ°€μ€‘μΉ˜λ‘œλŠ” 두 λ„μ‹œκ°„μ˜ 이동 μ‹œκ°„μ΄λ‚˜ μ›Ή μ‚¬μ΄νŠΈλ₯Ό λ°©λ¬Έν•  λ•Œ ν•œ νŽ˜μ΄μ§€μ—μ„œ λ‹€λ₯Έ νŽ˜μ΄μ§€λ‘œ μ΄λ™ν•˜λŠ” μ‹œκ°„ 등이 될 수 μžˆλ‹€. λ³Έ λ…Όλ¬Έμ—μ„œλŠ” μ’€ 더 μ •ν™•ν•œ 순회 νŒ¨ν„΄μ„ λ§ˆμ΄λ‹ν•˜κΈ° μœ„ν•΄ ν†΅κ³„ν•™μ˜ μ‹ λ’° ꡬ간을 μ΄μš©ν•œλ‹€. 즉, 전체 순회의 각 간선에 λΆ€μ—¬λœ κ°€μ€‘μΉ˜λ‘œλΆ€ν„° μ‹ λ’° ꡬ간을 κ΅¬ν•œ ν›„ μ‹ λ’° κ΅¬κ°„μ˜ 내에 μžˆλŠ” μˆœνšŒλ§Œμ„ μœ νš¨ν•œ κ²ƒμœΌλ‘œ μΈμ •ν•˜λŠ” 방법이닀. μ΄λŸ¬ν•œ 방법을 μ μš©ν•¨μœΌλ‘œμ¨ λ”μš± μ‹ λ’°μ„± μžˆλŠ” 순회 νŒ¨ν„΄μ„ λ§ˆμ΄λ‹ν•  수 μžˆλ‹€. λ˜ν•œ μ΄λ ‡κ²Œ κ΅¬ν•œ νŒ¨ν„΄κ³Ό κ·Έλž˜ν”„ 정보λ₯Ό μ΄μš©ν•˜μ—¬ νŒ¨ν„΄ κ°„μ˜ μš°μ„ μˆœμœ„λ₯Ό κ²°μ •ν•  수 μžˆλŠ” 방법과 μ„±λŠ₯ ν–₯상을 μœ„ν•œ μ•Œκ³ λ¦¬μ¦˜λ„ μ œμ‹œν•œλ‹€. 두 번째 방법은 κ·Έλž˜ν”„μ˜ 정점에 κ°€μ€‘μΉ˜κ°€ λΆ€μ—¬λœ κ²½μš°μ— κ°€μ€‘μΉ˜κ°€ 고렀된 빈발 순회 νŒ¨ν„΄μ„ νƒμ‚¬ν•˜λŠ” 방법이닀. κ·Έλž˜ν”„μ˜ 정점에 뢀여될 수 μžˆλŠ” κ°€μ€‘μΉ˜λ‘œλŠ” μ›Ή μ‚¬μ΄νŠΈ λ‚΄μ˜ 각 λ¬Έμ„œμ˜ μ •λ³΄λŸ‰μ΄λ‚˜ μ€‘μš”λ„ 등이 될 수 μžˆλ‹€. 
이 λ¬Έμ œμ—μ„œλŠ” 빈발 순회 νŒ¨ν„΄μ„ κ²°μ •ν•˜κΈ° μœ„ν•˜μ—¬ νŒ¨ν„΄μ˜ λ°œμƒ λΉˆλ„λΏλ§Œ μ•„λ‹ˆλΌ λ°©λ¬Έν•œ μ •μ μ˜ κ°€μ€‘μΉ˜λ₯Ό λ™μ‹œμ— κ³ λ €ν•˜μ—¬μ•Ό ν•œλ‹€. 이λ₯Ό μœ„ν•΄ λ³Έ λ…Όλ¬Έμ—μ„œλŠ” μ •μ μ˜ κ°€μ€‘μΉ˜λ₯Ό μ΄μš©ν•˜μ—¬ ν–₯후에 빈발 νŒ¨ν„΄μ΄ 될 κ°€λŠ₯성이 μžˆλŠ” 후보 νŒ¨ν„΄μ€ 각 λ§ˆμ΄λ‹ λ‹¨κ³„μ—μ„œ μ œκ±°ν•˜μ§€ μ•Šκ³  μœ μ§€ν•˜λŠ” μ•Œκ³ λ¦¬μ¦˜μ„ μ œμ•ˆν•œλ‹€. λ˜ν•œ μ„±λŠ₯ ν–₯상을 μœ„ν•΄ 후보 νŒ¨ν„΄μ˜ 수λ₯Ό κ°μ†Œμ‹œν‚€λŠ” μ•Œκ³ λ¦¬μ¦˜λ„ μ œμ•ˆν•œλ‹€. λ³Έ λ…Όλ¬Έμ—μ„œ μ œμ•ˆν•œ 두 가지 방법에 λŒ€ν•˜μ—¬ λ‹€μ–‘ν•œ μ‹€ν—˜μ„ ν†΅ν•˜μ—¬ μˆ˜ν–‰ μ‹œκ°„ 및 μƒμ„±λ˜λŠ” νŒ¨ν„΄μ˜ 수 등을 비ꡐ λΆ„μ„ν•˜μ˜€λ‹€. λ³Έ λ…Όλ¬Έμ—μ„œλŠ” μˆœνšŒμ— κ°€μ€‘μΉ˜κ°€ μžˆλŠ” κ²½μš°μ™€ κ·Έλž˜ν”„μ˜ 정점에 κ°€μ€‘μΉ˜κ°€ μžˆλŠ” κ²½μš°μ— 빈발 순회 νŒ¨ν„΄μ„ νƒμ‚¬ν•˜λŠ” μƒˆλ‘œμš΄ 방법듀을 μ œμ•ˆν•˜μ˜€λ‹€. μ œμ•ˆν•œ 방법듀을 μ›Ή λ§ˆμ΄λ‹κ³Ό 같은 뢄야에 μ μš©ν•¨μœΌλ‘œμ¨ μ›Ή ꡬ쑰의 효율적인 λ³€κ²½μ΄λ‚˜ μ›Ή λ¬Έμ„œμ˜ μ ‘κ·Ό 속도 ν–₯상, μ‚¬μš©μžλ³„ κ°œμΈν™”λœ μ›Ή λ¬Έμ„œ ꡬ좕 등이 κ°€λŠ₯ν•  것이닀.Abstract β…Ά Chapter 1 Introduction 1.1 Overview 1.2 Motivations 1.3 Approach 1.4 Organization of Thesis Chapter 2 Related Works 2.1 Itemset Mining 2.2 Weighted Itemset Mining 2.3 Traversal Mining 2.4 Graph Traversal Mining Chapter 3 Mining Patterns from Weighted Traversals on Unweighted Graph 3.1 Definitions and Problem Statements 3.2 Mining Frequent Patterns 3.2.1 Augmentation of Base Graph 3.2.2 In-Mining Algorithm 3.2.3 Pre-Mining Algorithm 3.2.4 Priority of Patterns 3.3 Experimental Results Chapter 4 Mining Patterns from Unweighted Traversals on Weighted Graph 4.1 Definitions and Problem Statements 4.2 Mining Weighted Frequent Patterns 4.2.1 Pruning by Support Bounds 4.2.2 Candidate Generation 4.2.3 Mining Algorithm 4.3 Estimation of Support Bounds 4.3.1 Estimation by All Vertices 4.3.2 Estimation by Reachable Vertices 4.4 Experimental Results Chapter 5 Conclusions and Further Works Reference

    Web Mining for Web Personalization

    Web personalization is the process of customizing a Web site to the needs of specific users, taking advantage of the knowledge acquired from the analysis of the user's navigational behavior (usage data) in correlation with other information collected in the Web context, namely, structure, content, and user profile data. Due to the explosive growth of the Web, the domain of Web personalization has gained great momentum both in the research and commercial areas. In this article we present a survey of the use of Web mining for Web personalization. More specifically, we introduce the modules that comprise a Web personalization system, emphasizing the Web usage mining module. We review the most common methods used and the technical issues that arise, and give a brief overview of the most popular tools and applications available from software vendors. Moreover, the most important research initiatives in the Web usage mining and personalization areas are presented.

    Intelligent agents for matching information providers and consumers on the World-Wide-Web

    In this paper, we discuss the various issues in designing intelligent software systems to assist World-Wide-Web users in locating relevant information. We identify a number of key components in such intelligent systems. These include a web document database management system, a client-based goal-directed search engine, an intelligent learning agent which discovers users' topics of interest by studying their browsing behavior, and an intelligent agent which monitors 'hot' web sites. We give examples and suggestions on how these components are designed and implemented. We also describe the architecture of a prototype system that integrates the various components.

    Information Visualization and Visual Data Mining

    Data visualization is the graphical display of abstract information for two purposes: sense-making (also called data analysis) and communication. Important stories live in our data, and data visualization is a powerful means to discover and understand these stories, and then to present them to others. In this paper, we propose a classification of information visualization and visual data mining techniques based on the data type to be visualized, the visualization technique, and the interaction and distortion technique. We illustrate the classification with a few examples, most of them referring to techniques and systems presented in this special issue.