423 research outputs found
Mining Traversal Patterns from Weighted Traversals and Graph
Many real-world problems can be modeled as a graph and a set of transactions that traverse the graph. For example, the link structure of web pages can be represented as a graph, and users' page-visit paths can be modeled as transactions that traverse that graph. Discovering important and valuable patterns from such traversal transactions is a meaningful task. Previous studies proposed algorithms that find only frequent patterns, without considering weights on the traversals or on the graph. The limitation of these algorithms is that they have difficulty discovering more reliable and accurate patterns.
This thesis proposes two methods that discover patterns by taking into account weights assigned to traversals or to graph vertices. The first method discovers frequent traversal patterns when the traversal information carries weights. Weights that can be attached to a traversal include the travel time between two cities, or the time taken to move from one page to another while visiting a web site. To mine more accurate traversal patterns, the thesis uses statistical confidence intervals: a confidence interval is computed from the weights attached to each edge over all traversals, and only traversals that fall within the interval are accepted as valid. Applying this method makes it possible to mine more reliable traversal patterns. The thesis also proposes a method for determining the priority among patterns using the mined patterns and the graph information, along with an algorithm for improving performance.
The second method discovers weighted frequent traversal patterns when weights are assigned to the vertices of the graph. Such vertex weights can represent, for example, the amount of information in or the importance of each document in a web site. In this problem, determining the frequent traversal patterns requires considering not only how often a pattern occurs but also the weights of the vertices it visits. To this end, the thesis proposes an algorithm that uses the vertex weights to retain, rather than prune, candidate patterns that may become frequent at a later mining stage. It also proposes an algorithm that reduces the number of candidate patterns to improve performance.
The two proposed methods were compared and analyzed through a range of experiments, in terms of execution time and the number of patterns generated.
In summary, this thesis proposed new methods for discovering frequent traversal patterns when the traversals carry weights and when the graph vertices carry weights. Applying the proposed methods to fields such as web mining should enable efficient restructuring of web sites, faster access to web documents, and the construction of web documents personalized for each user.
Abstract
Chapter 1 Introduction
1.1 Overview
1.2 Motivations
1.3 Approach
1.4 Organization of Thesis
Chapter 2 Related Works
2.1 Itemset Mining
2.2 Weighted Itemset Mining
2.3 Traversal Mining
2.4 Graph Traversal Mining
Chapter 3 Mining Patterns from Weighted Traversals on Unweighted Graph
3.1 Definitions and Problem Statements
3.2 Mining Frequent Patterns
3.2.1 Augmentation of Base Graph
3.2.2 In-Mining Algorithm
3.2.3 Pre-Mining Algorithm
3.2.4 Priority of Patterns
3.3 Experimental Results
Chapter 4 Mining Patterns from Unweighted Traversals on Weighted Graph
4.1 Definitions and Problem Statements
4.2 Mining Weighted Frequent Patterns
4.2.1 Pruning by Support Bounds
4.2.2 Candidate Generation
4.2.3 Mining Algorithm
4.3 Estimation of Support Bounds
4.3.1 Estimation by All Vertices
4.3.2 Estimation by Reachable Vertices
4.4 Experimental Results
Chapter 5 Conclusions and Further Works
Reference
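The confidence-interval filtering described in the abstract of this entry can be illustrated with a minimal Python sketch. It is not the thesis's algorithm: the traversal representation (a list of `(u, v, weight)` edges), the function names, the use of mean ± z × standard deviation as the interval, and the choice to discard a whole traversal when any edge weight falls outside its interval are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean, pstdev


def edge_intervals(traversals, z=1.96):
    """Map each edge (u, v) to an interval mean ± z * std of its observed weights.

    Assumption: the interval is built per edge from the population standard
    deviation; the thesis may define its confidence interval differently.
    """
    weights = defaultdict(list)
    for traversal in traversals:
        for u, v, w in traversal:
            weights[(u, v)].append(w)

    intervals = {}
    for edge, ws in weights.items():
        m = mean(ws)
        s = pstdev(ws)
        intervals[edge] = (m - z * s, m + z * s)
    return intervals


def valid_traversals(traversals, z=1.96):
    """Keep only traversals whose every edge weight lies inside its edge's interval."""
    intervals = edge_intervals(traversals, z)
    kept = []
    for traversal in traversals:
        if all(intervals[(u, v)][0] <= w <= intervals[(u, v)][1]
               for u, v, w in traversal):
            kept.append(traversal)
    return kept


# Hypothetical data: six visits over the same link, one with an unusually long dwell time.
visits = [[("A", "B", w)] for w in (10, 11, 9, 10, 10, 100)]
filtered = valid_traversals(visits)
```

Frequent-pattern counting would then run on `filtered` rather than on the raw traversals, so that outlying weights do not contribute to the support of a pattern.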
Using Markov Chains for link prediction in adaptive web sites
The large number of Web pages on many Web sites has raised navigational problems. Markov chains have recently been used to model user navigational behavior on the World Wide Web (WWW). In this paper, we propose a method for constructing a Markov model of a Web site based on past visitor behavior. We use the Markov model to make link predictions that help new users navigate the Web site. A transition probability matrix compression algorithm clusters Web pages with similar transition behaviors and compresses the transition matrix to an optimal size for efficient probability calculation in link prediction. A maximal forward path method is used to further improve the efficiency of link prediction. Link prediction has been implemented in an online system called ONE (Online Navigation Explorer) to assist users' navigation in the adaptive Web site.
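As a rough illustration of the core idea in this abstract, the sketch below estimates first-order transition probabilities from past sessions and ranks the most likely next pages. It is a minimal sketch under stated assumptions: the session format (a list of page names), the function names, and the top-k ranking are hypothetical, and the matrix compression and maximal forward path steps mentioned in the abstract are not implemented.

```python
from collections import Counter, defaultdict


def build_transition_probs(sessions):
    """Estimate P(next page | current page) from observed page-to-page transitions."""
    counts = defaultdict(Counter)
    for session in sessions:
        for src, dst in zip(session, session[1:]):
            counts[src][dst] += 1

    probs = {}
    for src, dsts in counts.items():
        total = sum(dsts.values())
        probs[src] = {dst: n / total for dst, n in dsts.items()}
    return probs


def predict_links(probs, current_page, k=3):
    """Return up to k candidate next pages, most probable first (ties broken by name)."""
    candidates = probs.get(current_page, {})
    ranked = sorted(candidates.items(), key=lambda item: (-item[1], item[0]))
    return [page for page, _ in ranked[:k]]


# Hypothetical sessions from past visitors.
sessions = [
    ["home", "products", "cart"],
    ["home", "products", "reviews"],
    ["home", "about"],
    ["home", "products", "cart"],
]
probs = build_transition_probs(sessions)
suggestions = predict_links(probs, "home")
```

A dictionary of dictionaries keeps the sketch sparse; a dense matrix would be the natural representation if the compression step were added.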
JGraphT -- A Java library for graph data structures and algorithms
Mathematical software and graph-theoretical algorithmic packages to
efficiently model, analyze and query graphs are crucial in an era where
large-scale spatial, societal and economic network data are abundantly
available. One such package is JGraphT, a programming library which contains
very efficient and generic graph data-structures along with a large collection
of state-of-the-art algorithms. The library is written in Java with stability,
interoperability and performance in mind. A distinctive feature of this library
is the ability to model vertices and edges as arbitrary objects, thereby
permitting natural representations of many common networks including
transportation, social and biological networks. Besides classic graph
algorithms such as shortest-paths and spanning-tree algorithms, the library
contains numerous advanced algorithms: graph and subgraph isomorphism; matching
and flow problems; approximation algorithms for NP-hard problems such as
independent set and TSP; and several more exotic algorithms such as Berge graph
detection. Due to its versatility and generic design, JGraphT is currently used
in large-scale commercial, non-commercial and academic research projects. In
this work we describe in detail the design and underlying structure of the
library, and discuss its most important features and algorithms. A
computational study is conducted to evaluate the performance of JGraphT versus
a number of similar libraries. Experiments on a large number of graphs over a
variety of popular algorithms show that JGraphT is highly competitive with
other established libraries such as NetworkX or the BGL.
Co-Clustering Network-Constrained Trajectory Data
Clustering moving object trajectories has recently been gaining interest from
both the data mining and machine learning communities. This problem, however,
has been studied mainly in the setting where moving objects can move
freely in Euclidean space. In this paper, we study the problem of
clustering trajectories of vehicles whose movement is restricted by the
underlying road network. We model relations between these trajectories and road
segments as a bipartite graph and we try to cluster its vertices. We
demonstrate our approaches on synthetic data and show how they could be useful in
inferring knowledge about the flow dynamics and the behavior of the drivers
using the road network.
Graph Sample and Hold: A Framework for Big-Graph Analytics
Sampling is a standard approach in big-graph analytics; the goal is to
efficiently estimate the graph properties by consulting a sample of the whole
population. A perfect sample is assumed to mirror every property of the whole
population. Unfortunately, such a perfect sample is hard to collect in complex
populations such as graphs (e.g., web graphs, social networks, etc.), where an
underlying network connects the units of the population. Therefore, a good
sample will be representative in the sense that graph properties of interest
can be estimated with a known degree of accuracy. While previous work focused
particularly on sampling schemes used to estimate certain graph properties
(e.g. triangle count), much less is known for the case when we need to estimate
various graph properties with the same sampling scheme. In this paper, we
propose a generic stream sampling framework for big-graph analytics, called
Graph Sample and Hold (gSH). To begin, the proposed framework samples from
massive graphs sequentially in a single pass, one edge at a time, while
maintaining a small state. We then show how to produce unbiased estimators for
various graph properties from the sample. Given that the graph analysis
algorithms will run on a sample instead of the whole population, the runtime
complexity of these algorithms is kept under control. Moreover, given that the
estimators of graph properties are unbiased, the approximation error is kept
under control. Finally, we show the performance of the proposed framework (gSH)
on various types of graphs, such as social graphs, among others.
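The single-pass sampling and unbiased estimation described in this abstract can be illustrated with a much simpler scheme than the paper's framework. The sketch below keeps each arriving edge independently with a fixed probability `p` and rescales the sample size by `1/p` (a Horvitz-Thompson style estimator) to estimate the total edge count. The fixed probability, the function names, and the choice of edge count as the estimated property are assumptions for illustration, not the gSH procedure itself.

```python
import random


def sample_edge_stream(edges, p, seed=None):
    """Scan the edge stream once, keeping each edge independently with probability p."""
    rng = random.Random(seed)
    sample = []
    for edge in edges:
        if rng.random() < p:
            sample.append(edge)
    return sample


def estimate_edge_count(sample, p):
    """Unbiased estimate of the stream's edge count: each kept edge stands for 1/p edges."""
    return len(sample) / p


# Hypothetical stream of 1000 edges on a path graph.
stream = [(i, i + 1) for i in range(1000)]
sample = sample_edge_stream(stream, p=0.1, seed=42)
estimate = estimate_edge_count(sample, p=0.1)
```

Because every edge is kept with the same known probability, the expected value of the estimate equals the true edge count; estimating properties such as triangle counts would need inclusion probabilities for sets of edges rather than single edges.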