8 research outputs found

    Mining Traversal Patterns from Weighted Traversals and Graph

    Many real-world problems can be modeled as a graph together with transactions that traverse it. For example, the link structure of web pages can be represented as a graph, and a user's path through those pages can be modeled as a transaction traversing that graph. Discovering important and valuable patterns from such graph-traversing transactions is therefore worthwhile. Previous studies on this problem proposed algorithms that simply find frequent patterns without considering the weights of the traversals or of the graph, and such algorithms have difficulty mining reliable and accurate patterns. This thesis proposes two methods that mine patterns while taking into account weights assigned either to the traversals or to the vertices of the graph. The first method mines frequent traversal patterns when the traversal information carries weights, such as the travel time between two cities or the time taken to move from one web page to another during a site visit. To mine more accurate traversal patterns, statistical confidence intervals are used: a confidence interval is computed from the weights assigned to each edge over all traversals, and only traversals whose weights fall within the interval are accepted as valid, which yields more reliable traversal patterns. A method for prioritizing the discovered patterns using the graph information and an algorithm for improving performance are also presented. The second method mines weighted frequent traversal patterns when weights are assigned to the vertices of the graph, for example the amount of information in, or the importance of, each document within a web site. In this setting, both the occurrence frequency of a pattern and the weights of the visited vertices must be considered when deciding whether the pattern is frequent. To this end, the thesis proposes an algorithm that uses vertex weights to retain, rather than prune, candidate patterns that may become frequent in later mining steps, along with an algorithm that reduces the number of candidate patterns to improve performance. Both proposed methods are compared and analyzed through various experiments in terms of execution time and the number of generated patterns. Applying these methods to fields such as web mining should enable efficient restructuring of web sites, faster access to web documents, and the construction of personalized web documents for individual users.
    Table of contents:
    Abstract
    Chapter 1 Introduction: 1.1 Overview; 1.2 Motivations; 1.3 Approach; 1.4 Organization of Thesis
    Chapter 2 Related Works: 2.1 Itemset Mining; 2.2 Weighted Itemset Mining; 2.3 Traversal Mining; 2.4 Graph Traversal Mining
    Chapter 3 Mining Patterns from Weighted Traversals on Unweighted Graph: 3.1 Definitions and Problem Statements; 3.2 Mining Frequent Patterns (3.2.1 Augmentation of Base Graph; 3.2.2 In-Mining Algorithm; 3.2.3 Pre-Mining Algorithm; 3.2.4 Priority of Patterns); 3.3 Experimental Results
    Chapter 4 Mining Patterns from Unweighted Traversals on Weighted Graph: 4.1 Definitions and Problem Statements; 4.2 Mining Weighted Frequent Patterns (4.2.1 Pruning by Support Bounds; 4.2.2 Candidate Generation; 4.2.3 Mining Algorithm); 4.3 Estimation of Support Bounds (4.3.1 Estimation by All Vertices; 4.3.2 Estimation by Reachable Vertices); 4.4 Experimental Results
    Chapter 5 Conclusions and Further Works
    Reference
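    As a rough illustration of the first method, the sketch below shows confidence-interval filtering of weighted traversals in Python. The data, the normal-approximation interval, and the whole-traversal validity test are assumptions made for the example; they are not the thesis's exact algorithm.

```python
# Sketch of confidence-interval filtering for weighted traversals.
# Data and the 95% normal-approximation interval are illustrative only.
from collections import Counter
from statistics import mean, stdev

# Each traversal is a list of (edge, weight) pairs, e.g. edge = ("A", "B").
traversals = [
    [(("A", "B"), 1.2), (("B", "C"), 0.9)],
    [(("A", "B"), 1.1), (("B", "C"), 1.4)],
    [(("A", "B"), 1.3), (("B", "C"), 1.0)],
]

# 1. Collect the weights observed on each edge over all traversals.
weights_per_edge = {}
for trav in traversals:
    for edge, w in trav:
        weights_per_edge.setdefault(edge, []).append(w)

# 2. Build a per-edge interval: mean +/- z * standard deviation.
Z = 1.96  # roughly 95% under a normal approximation
bounds = {}
for edge, ws in weights_per_edge.items():
    spread = Z * (stdev(ws) if len(ws) > 1 else 0.0)
    bounds[edge] = (mean(ws) - spread, mean(ws) + spread)

# 3. Accept only traversals whose every edge weight lies inside its
#    interval, then count the edge sequences of the valid traversals.
pattern_counts = Counter()
for trav in traversals:
    if all(bounds[e][0] <= w <= bounds[e][1] for e, w in trav):
        pattern_counts[tuple(e for e, _ in trav)] += 1

print(pattern_counts.most_common())
```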

    Relationship between product based loyalty and clustering based on supermarket visit and spending patterns

    Loyalty of customers to a supermarket can be measured in a variety of ways. If a customer tends to buy from certain categories of products, it is likely that the customer is loyal to the supermarket. Another indication of loyalty is based on the tendency of customers to visit the supermarket over a number of weeks. Regular visitors and spenders are more likely to be loyal to the supermarket. Neither one of these two criteria can provide a complete picture of customers' loyalty. The decision regarding the loyalty of a customer will have to take into account the visiting pattern as well as the categories of products purchased. This paper describes results of experiments that attempted to identify customer loyalty using these two sets of criteria separately. The experiments were based on transactional data obtained from a supermarket data collection program. Comparisons of results from these parallel sets of experiments were useful in fine-tuning both schemes for estimating the degree of loyalty of a customer. The project also provides useful insights for the development of more sophisticated measures for studying customer loyalty. It is hoped that the understanding of loyal customers will be helpful in identifying better marketing strategies.
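    The clustering side of this comparison could be prototyped along the lines of the sketch below, which groups customers by visit frequency and weekly spend. The features, data, and choice of k-means (via scikit-learn) are assumptions for illustration, not the scheme actually used in the paper.

```python
# Sketch: grouping customers by visit and spending behaviour.
# Features, data, and the use of k-means are illustrative assumptions.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# One row per customer: [visits per week, average weekly spend].
customers = np.array([
    [4.5, 120.0],
    [0.5,  15.0],
    [3.0,  80.0],
    [5.0, 150.0],
    [1.0,  20.0],
])

# Scale so visit counts and spend contribute comparably.
X = StandardScaler().fit_transform(customers)

# Partition customers into tentative loyalty segments.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(labels)  # e.g. regular high spenders vs. occasional shoppers
```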

    Temporal and Contextual Dependencies in Relational Data Modeling

    Although a solid theoretical foundation of relational data modeling has existed for decades, a critical reassessment from the perspective of temporal requirements reveals shortcomings in its integrity constraints. We identify the need for this work by discussing how existing relational databases fail to ensure correctness of data when the data to be stored is time sensitive. The analysis presented in this work becomes particularly important in present times where, because of relational databases' inadequacy to cater to all requirements, new forms of database systems such as temporal databases, active databases, real-time databases, and NoSQL (non-relational) databases have been introduced. In relational databases, temporal requirements have been dealt with either at the application level using scripts or through manual assistance, but no attempts have been made to address them at the design level. These requirements are the ones that require metadata to change as time progresses, which remains unsupported by Relational Database Management Systems (RDBMSs) to date. Starting with the shortcomings of data, entity, and referential integrity in relational data modeling, we propose a new form of integrity that works at a more detailed level of granularity. We also present several important concepts including temporal dependency, contextual dependency, and cell-level integrity. We then introduce cellular-constraints to implement the proposed integrity and dependencies, and show how they can be incorporated into the relational data model to enable RDBMSs to handle temporal requirements in the future. Overall, we provide a formal description to address the problem of temporal requirements in the relational data model, and design a framework for solving this problem. We have supplemented our proposition with examples, experiments, and results.
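    To make the idea of constraints at cell-level granularity concrete, the speculative Python sketch below attaches a time-windowed check to a single column value. All names and the representation are hypothetical and do not reproduce the thesis's formal definition of cellular-constraints.

```python
# Speculative sketch of a cell-level, time-windowed constraint check.
# Class and field names are hypothetical, not the thesis's formalism.
from dataclasses import dataclass
from datetime import date
from typing import Any, Callable

@dataclass
class CellConstraint:
    """A constraint attached to one column value (cell) of a row,
    enforced only while the current date lies in its validity window."""
    column: str
    check: Callable[[Any], bool]
    valid_from: date
    valid_to: date

    def holds(self, row: dict, today: date) -> bool:
        if not (self.valid_from <= today <= self.valid_to):
            return True  # outside the window the constraint is inactive
        return self.check(row[self.column])

# Example: the discount cell may hold at most 10% during 2024.
discount_cap = CellConstraint(
    column="discount",
    check=lambda v: 0 <= v <= 0.10,
    valid_from=date(2024, 1, 1),
    valid_to=date(2024, 12, 31),
)

row = {"id": 7, "discount": 0.15}
print(discount_cap.holds(row, date(2024, 6, 1)))   # False: cap violated
print(discount_cap.holds(row, date(2025, 6, 1)))   # True: window has passed
```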

    Algorithms for Mining Parallel-Of-Serial Episodes

    Department of Computer Science

    Granite: A scientific database model and implementation

    The principal goal of this research was to develop a formal, comprehensive model for representing highly complex scientific data. An effective model should provide a conceptually uniform way to represent data, and it should serve as a framework for the implementation of an efficient and easy-to-use software environment that implements the model. The dissertation work presented here describes such a model and its contributions to the field of scientific databases. In particular, the Granite model encompasses a wide variety of datatypes used across many disciplines of science and engineering today. It is unique in that it defines dataset geometry and topology as separate conceptual components of a scientific dataset. We provide a novel classification of geometries and topologies that has important practical implications for a scientific database implementation. The Granite model also offers integrated support for multiresolution and adaptive-resolution data. Many of these ideas have been addressed by others, but no one has tried to bring them all together in a single comprehensive model. The datasource portion of the Granite model offers several further contributions. In addition to providing a convenient conceptual view of rectilinear data, it also supports multisource data: data can be taken from various sources and combined into a unified view. The rod storage model is an abstraction for file storage that has proven an effective platform upon which to develop efficient access to storage. Our spatial prefetching technique is built upon the rod storage model; it demonstrates very significant improvements in access to scientific datasets and allows machines to access data that is far too large to fit in main memory. These improvements bring the extremely large datasets now being generated in many scientific fields into the realm of tractability for the ordinary researcher. We validated the feasibility and viability of the model by implementing a significant portion of it in the Granite system. Extensive performance evaluations of the implementation indicate that the features of the model can be provided in a user-friendly manner with an efficiency that is competitive with more ad hoc systems and more specialized, application-specific solutions.
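    As a toy illustration of keeping geometry and topology as separate components of a dataset, consider the Python sketch below. The class and field names are hypothetical and are not Granite's actual API.

```python
# Toy dataset that stores geometry (sample positions), topology (cell
# connectivity), and field values separately; names are hypothetical.
import numpy as np

class Dataset:
    def __init__(self, geometry, topology, values):
        self.geometry = geometry   # (n_points, dim) coordinates
        self.topology = topology   # (n_cells, verts_per_cell) connectivity
        self.values = values       # (n_points,) field sampled at each point

    def cell_mean(self, cell_index):
        """Average the field over one cell using only the topology."""
        return float(self.values[self.topology[cell_index]].mean())

# A single quadrilateral cell over four points of a 2-D grid.
geometry = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
topology = np.array([[0, 1, 3, 2]])
values = np.array([1.0, 2.0, 3.0, 4.0])

ds = Dataset(geometry, topology, values)
print(ds.cell_mean(0))  # 2.5
```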

    A Method for Constructing Curricula for Higher Education from Knowledge Management Models

    The curriculum has been conceptualized in a number of ways that can be contradictory. The conception adopted in this thesis is to understand the curriculum as the structure of a formative process, determined by cultural contents, institutional conditions, and the curricular conception for putting it into practice. Its conception arises from the social need to develop, store, transmit, and apply knowledge in a higher education institution (HEI), and it is materialized in an academic program. Carrying out curricular improvements or changes in HEIs is complex because it requires, among other things, strong communication among stakeholders (teachers, students, and administrators), broad knowledge of the social environment in which the future professional will act, and deep knowledge of the fields of study involved. This research offers conceptual contributions on representing the most relevant aspects of a curriculum through knowledge management models, and it also contributes methodologically on the consistent articulation of these models, which makes it possible to maintain the coherence, currency, and purpose of the curriculum. The work began with a historical and conceptual review of the curriculum, tracing the origins of the term and its antecedents and following its evolution over time, arriving at a conceptualization that includes the set of elements currently considered to make up a contemporary curriculum. Knowledge and knowledge management were then conceptualized, with emphasis on two fundamental processes: organizational knowledge creation and knowledge representation. Building on these conceptualizations, a method is offered for constructing higher education curricula from knowledge management models, and a case of application is presented: the review of the social relevance of a higher education academic program holding high-quality accreditation in Colombia. The thesis thus provides a method that gives coherence to the proposed curriculum discourse, taking some knowledge management models and building others that support the curriculum from its design, in each of its domains, at both the macro and micro levels.

    An Architecture for Data Access and Integration in Context-Aware Systems

    Data integration systems have been developed to provide users with consolidated information regardless of distribution and storage format, and their presence has become crucial. In a ubiquitous computing scenario, applications manipulate new kinds of data, contextual and dynamic in nature, which makes the integration process more difficult and complex. This work proposes a conceptual architecture for data access and integration in mobile, context-aware environments. A case study in telemedicine illustrates the advantages and functionality of the proposed architecture.
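    One way such an architecture could expose context-aware integration is sketched below: a mediator consolidates records from several sources and consults the caller's context to decide which sources are relevant. The class names, context keys, and telemedicine-flavoured data are assumptions for illustration, not the architecture proposed in the work.

```python
# Speculative sketch of context-aware data integration: a mediator pulls
# records only from sources deemed relevant to the current context.
from typing import Callable, List

class Source:
    def __init__(self, name: str,
                 relevant: Callable[[dict], bool],
                 fetch: Callable[[], List[dict]]):
        self.name = name
        self.relevant = relevant   # decides relevance from the context
        self.fetch = fetch         # returns records in a unified format

class Mediator:
    def __init__(self, sources: List[Source]):
        self.sources = sources

    def query(self, context: dict) -> List[dict]:
        """Build a consolidated view from context-relevant sources only."""
        results: List[dict] = []
        for src in self.sources:
            if src.relevant(context):
                results.extend(src.fetch())
        return results

# Hypothetical telemedicine example: hospital records are consulted only
# when the user's context says they are on-site.
sources = [
    Source("hospital_records",
           relevant=lambda ctx: ctx.get("location") == "hospital",
           fetch=lambda: [{"patient": 42, "bp": "120/80"}]),
    Source("wearable_sensor",
           relevant=lambda ctx: True,
           fetch=lambda: [{"patient": 42, "heart_rate": 71}]),
]

mediator = Mediator(sources)
print(mediator.query({"location": "home"}))      # wearable data only
print(mediator.query({"location": "hospital"}))  # both sources combined
```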