1,561 research outputs found

    Content-Aware DataGuides for Indexing Large Collections of XML Documents

    XML is well suited to modelling structured data with textual content. However, most indexing approaches perform structure and content matching independently, combining the retrieved path and keyword occurrences in a third step. This paper shows that retrieval in XML documents can be accelerated significantly by processing text and structure simultaneously during all retrieval phases. To this end, the Content-Aware DataGuide (CADG) enhances the well-known DataGuide with (1) simultaneous keyword and path matching and (2) a precomputed content/structure join. Extensive experiments prove the CADG to be 50-90% faster than the DataGuide for various kinds of queries and documents, including difficult cases such as poorly structured queries and recursive document paths. A new query classification scheme identifies the precise query characteristics that predominantly influence the performance of the individual indices. The experiments show that the CADG is applicable to many real-world applications, in particular large collections of heterogeneously structured XML documents.

    A Framework for Developing Real-Time OLAP algorithm using Multi-core processing and GPU: Heterogeneous Computing

    The rapidly growing volume of stored data has spurred researchers to seek methods for exploiting it efficiently, most of which have run into response-time problems because of the sheer size of the data. Most proposed solutions favour materialization. However, such a solution cannot deliver real-time answers. In this paper we propose a framework that illustrates the barriers to achieving real-time OLAP answers, which are widely used in decision support systems and data warehouses, and suggests solutions for overcoming them.

    A Survey on Web Usage Mining, Applications and Tools

    The World Wide Web is a vast collection of unstructured web documents such as text, images, audio, video, and other multimedia content. As the web grows rapidly by millions of documents, mining data from it is a difficult task. Mining patterns from the web is known as web mining, which is further classified into content mining, structure mining, and web usage mining. Web usage mining is the data mining technique for extracting knowledge about how web data is used. It extracts useful information from web logs, i.e. users' usage history, which helps in understanding users better and serving them with better web applications. Web usage mining is useful not only for people who access documents on the World Wide Web but also for many applications: personalized marketing in e-commerce, e-services, threat classification and counter-terrorism by government agencies, fraud detection, and the identification of criminal activities. Companies can also establish better customer relationships and improve their businesses by analyzing people's buying strategies. This paper explains web usage mining in detail and shows how it is helpful. Web usage mining has seen a rapid increase in interest from the research community and the wider public.