
    Efficient Incremental Breadth-Depth XML Event Mining

    Many applications continuously log a large number of events. Extracting interesting knowledge from logged events is an emerging, active research area in data mining. In this context, we propose an approach for mining frequent events and association rules from logged events in XML format. The approach consists of two main phases: (i) constructing a novel tree structure called the Frequency XML-based Tree (FXT), which contains the frequency of the events to be mined; (ii) querying the constructed FXT using XQuery to discover frequent itemsets and association rules. The FXT is constructed in a single pass over the logged data. We implement the proposed algorithm and study various performance issues. The performance study shows that the algorithm is efficient both for constructing the FXT and for discovering association rules.
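
The single-pass counting idea in the abstract above can be illustrated with a minimal Python sketch. It is not the paper's FXT or its XQuery queries: it assumes a hypothetical log layout (`<session>` elements grouping `<event>` names), stores counts in a plain `Counter`, and only derives rules between pairs of events. The function names and thresholds are illustrative assumptions.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from itertools import combinations

# Hypothetical log layout (an assumption, not the paper's format):
# each <session> groups the <event> names that were logged together.
LOG = """<log>
  <session><event>login</event><event>search</event><event>buy</event></session>
  <session><event>login</event><event>search</event></session>
  <session><event>login</event><event>buy</event></session>
  <session><event>search</event></session>
</log>"""

def count_itemsets(xml_text, max_size=2):
    """Count every event combination up to max_size in one pass over the sessions."""
    counts = Counter()
    n_sessions = 0
    for session in ET.fromstring(xml_text).iter("session"):
        n_sessions += 1
        events = sorted({e.text for e in session.iter("event")})
        for size in range(1, max_size + 1):
            counts.update(combinations(events, size))
    return counts, n_sessions

def association_rules(counts, n_sessions, min_support, min_confidence):
    """Derive rules a -> b from frequent pairs, using only the stored counts."""
    rules = []
    for itemset, count in counts.items():
        if len(itemset) != 2 or count / n_sessions < min_support:
            continue
        for a, b in (itemset, itemset[::-1]):
            confidence = count / counts[(a,)]
            if confidence >= min_confidence:
                rules.append((a, b, count / n_sessions, confidence))
    return sorted(rules)

counts, n = count_itemsets(LOG)
rules = association_rules(counts, n, min_support=0.5, min_confidence=0.6)
for a, b, support, confidence in rules:
    print(f"{a} -> {b}  support={support:.2f}  confidence={confidence:.2f}")
```

Once the counts exist, rule discovery never re-reads the log, which is the property the abstract attributes to querying the FXT after construction.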

    Building XML data warehouse based on frequent patterns in user queries

    With the proliferation of XML-based data sources available across the Internet, it is increasingly important to provide users with a data warehouse of XML data sources to facilitate decision-making processes. Due to the extremely large amount of XML data available on the web, unguided warehousing of XML data turns out to be highly costly and usually cannot adequately accommodate users' needs in XML data acquisition. In this paper, we propose an approach to materialize XML data warehouses based on frequent query patterns discovered from historical queries issued by users. The schemas of the integrated XML documents in the warehouse are built using these frequent query patterns, represented as Frequent Query Pattern Trees (FreqQPTs). Using a hierarchical clustering technique, the integration approach in the data warehouse is flexible with respect to obtaining and maintaining XML documents. Experiments show that processing the same queries against the global schema of the resulting XML data warehouse is much more efficient than directly searching the multiple data sources.

    Analysis of the NIST database towards the composition of vulnerabilities in attack scenarios

    The composition of vulnerabilities in attack scenarios has traditionally been performed based on detailed pre- and post-conditions. Although very precise, this approach depends on human analysis, is time consuming, and is not at all scalable. We investigate the NIST National Vulnerability Database (NVD) with three goals: (i) to understand the associations among vulnerability attributes related to impact, exploitability, privilege, type of vulnerability and clues derived from plaintext descriptions, (ii) to validate our initial composition model, which is based on required access and resulting effect, and (iii) to investigate the maturity of XML database technology for performing statistical analyses like this directly on the XML data. In this report, we analyse 27,273 vulnerability entries (CVE) from the NVD. Using only nominal information, we are able, for example, to identify clusters in the class of vulnerabilities with no privilege which represent 52% of the entries.

    Web and Semantic Web Query Languages

    A number of techniques have been developed to facilitate powerful data retrieval on the Web and Semantic Web. Three categories of Web query languages can be distinguished, according to the format of the data they can retrieve: XML, RDF and Topic Maps. This article introduces the spectrum of languages falling into these categories and summarises their salient aspects. The languages are introduced using common sample data and query types. Key aspects of the query languages considered are stressed in a conclusion.

    Mining XML documents with association rule algorithms

    Thesis (Master)--Izmir Institute of Technology, Computer Engineering, Izmir, 2008. Includes bibliographical references (leaves: 59-63). Text in English; abstract in Turkish and English. x, 63 leaves. Following the increasing use of XML technology for data storage and data exchange between applications, mining XML documents has become a more active and important research topic. In this study, we consider the problem of mining association rules between items in XML documents. The principal purpose of this study is to apply association rule algorithms directly to XML documents using XQuery, a functional expression language that can be used to query or process XML data. We used three different algorithms: Apriori, AprioriTid and High Efficient AprioriTid. We compare the mining times of these three Apriori-like algorithms on XML documents using different support levels, different datasets and different dataset sizes.
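
For readers unfamiliar with Apriori, the level-wise search the thesis builds on can be sketched as follows. This is a plain in-memory Python illustration, not the thesis's XQuery implementation and not AprioriTid; the `apriori` function, the absolute `min_support` count, and the sample baskets are assumptions made for the example.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {frozenset: support count} for all itemsets with count >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1: every single item is a candidate.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Count each candidate by scanning the transactions.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Join step: unions of frequent k-itemsets that contain exactly k+1 items.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent

# Illustrative data only.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk"},
]
result = apriori(baskets, min_support=2)
for itemset, count in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

The prune step is what keeps the candidate set small: `{bread, butter, milk}` is never counted here because its subset `{butter, milk}` is not frequent.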

    Reasoning & Querying – State of the Art

    Various query languages for Web and Semantic Web data, both for practical use and as an area of research in the scientific community, have emerged in recent years. At the same time, the broad adoption of the internet, where keyword search is used in many applications, e.g. search engines, has familiarized casual users with using keyword queries to retrieve information on the internet. Unlike this easy-to-use querying, traditional query languages require knowledge of the language itself as well as of the data to be queried. Keyword-based query languages for XML and RDF bridge the gap between the two, aiming at enabling simple querying of semi-structured data, which is relevant e.g. in the context of the emerging Semantic Web. This article presents an overview of the field of keyword querying for XML and RDF.

    Production and evaluation of an instructional video (CD) for the subject Principles of Economics (BPA 1013) on the topic of demand and supply at KUITTHO

    This study was conducted to evaluate the effectiveness of an instructional video (CD) for the subject Principles of Economics (BPA 1013) on the topic of Demand and Supply. For this purpose, an instructional video was produced to help students understand the subject during the teaching and learning process. The video was then evaluated in terms of the teaching and learning process, interest, and respondents' perceptions of the video's characteristics (audio and visual). A total of 60 second-semester students of the Bachelor of Management Science programme at Kolej Universiti Teknologi Tun Hussein Onn were selected to evaluate the usability of this product as a teaching aid in the classroom. All the data obtained were then collected and analysed using the Statistical Package for the Social Sciences (SPSS) software. The findings clearly show that the instructional video produced and evaluated is highly suitable for meeting the needs of the teaching and learning process for this subject in the classroom.

    Deriving Conceptual Schema from XML Databases

    In this paper, two concepts from different research areas are addressed together, namely functional dependency (FD) and multidimensional association rule (MAR). FD is a class of integrity constraints that has gained fundamental importance in relational database design. MAR is a class of patterns that has been studied rigorously in data mining. We employ MAR to mine interesting rules from XML databases. Mined rules whose confidence is 100% are considered candidate FDs. To prune the weak rules, we take support and correlation into account. The final strong rules are used to generate an Object-Role Model conceptual schema diagram.
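
The link between 100%-confidence rules and candidate FDs can be made concrete with a small Python sketch. It is an illustration under simplifying assumptions, not the paper's method: records are flat dictionaries rather than XML, only single-attribute left-hand sides are tested, pruning uses a hypothetical `min_support` threshold, and the correlation measure is omitted.

```python
from collections import defaultdict
from itertools import permutations

def candidate_fds(rows, min_support=1):
    """Return attribute pairs (X, Y) where every rule X=x -> Y=y has confidence 100%."""
    attributes = sorted(rows[0])
    found = []
    for lhs, rhs in permutations(attributes, 2):
        seen = defaultdict(set)     # lhs value -> rhs values observed with it
        support = defaultdict(int)  # lhs value -> number of supporting rows
        for row in rows:
            seen[row[lhs]].add(row[rhs])
            support[row[lhs]] += 1
        # Confidence is 100% exactly when each lhs value maps to a single rhs value.
        holds = all(len(values) == 1 for values in seen.values())
        if holds and min(support.values()) >= min_support:
            found.append((lhs, rhs))
    return found

# Illustrative records only.
rows = [
    {"isbn": "111", "title": "XML Basics", "publisher": "Acme"},
    {"isbn": "111", "title": "XML Basics", "publisher": "Acme"},
    {"isbn": "222", "title": "XQuery", "publisher": "Acme"},
    {"isbn": "333", "title": "Data Mining", "publisher": "Orbit"},
]
fds = candidate_fds(rows)
print(fds)
```

Here `publisher -> isbn` is rejected because the value `Acme` co-occurs with two different ISBNs, so the corresponding rules have confidence below 100%.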

    The Hidden Web, XML and Semantic Web: A Scientific Data Management Perspective

    The World Wide Web no longer consists just of HTML pages. Our work sheds light on a number of trends on the Internet that go beyond simple Web pages. The hidden Web provides a wealth of data in semi-structured form, accessible through Web forms and Web services. These services, as well as numerous other applications on the Web, commonly use XML, the eXtensible Markup Language. XML has become the lingua franca of the Internet that allows customized markups to be defined for specific domains. On top of XML, the Semantic Web grows as a common structured data source. In this work, we first explain each of these developments in detail. Using real-world examples from scientific domains of great interest today, we then demonstrate how these new developments can assist in managing, harvesting, and organizing data on the Web. Along the way, we also illustrate the current research avenues in these domains. We believe that this effort will help bridge multiple database tracks, thereby attracting researchers with a view to extending database technology. Comment: EDBT - Tutorial (2011)