15,328 research outputs found

    Personalization by Partial Evaluation.

    Get PDF
    The central contribution of this paper is to model personalization by the programmatic notion of partial evaluation.Partial evaluation is a technique used to automatically specialize programs, given incomplete information about their input.The methodology presented here models a collection of information resources as a program (which abstracts the underlying schema of organization and flow of information),partially evaluates the program with respect to user input,and recreates a personalized site from the specialized program.This enables a customizable methodology called PIPE that supports the automatic specialization of resources,without enumerating the interaction sequences beforehand .Issues relating to the scalability of PIPE,information integration,sessioniz-ling scenarios,and case studies are presented

    Assessment of hydrological and seasonal controls over the nitrate flushing from a forested watershed using a data mining technique

    Get PDF
    A data mining, regression tree algorithm M5 was used to review the role of mutual hydrological and seasonal settings which control the streamwater nitrate flushing during hydrological events within a forested watershed in the southwestern part of Slovenia, characterized by distinctive flushing, almost torrential hydrological regime. The basis for the research was an extensive dataset of continuous, high frequency measurements of seasonal meteorological conditions, watershed hydrological responses and streamwater nitrate concentrations. The dataset contained 16 recorded hydrographs occurring in different seasonal and hydrological conditions. Based on predefined regression tree pruning criteria, a comprehensible regression tree model was obtained in the sense of the domain knowledge, which was able to adequately describe most of the streamwater nitrate concentration variations (RMSE=1.02mg/l-N; r=0.91). The attributes which were found to be the most descriptive in the sense of streamwater nitrate concentrations were the antecedent precipitation index (API) and air temperatures in the preceding periods. The model was most successful in describing streamwater concentrations in the range 1-4 mg/l-N, covering large proportion of the dataset. The model performance was little worse in the periods of high streamwater nitrate concentration peaks during the summer hydrographs (up to 7 mg/l-N) but poor during the autumn hydrograph (up to 14 mg/l-N) related to highly variable hydrological conditions, which would require a less robust regression tree model based on the extended dataset

    Efficient Incremental Breadth-Depth XML Event Mining

    Full text link
    Many applications log a large amount of events continuously. Extracting interesting knowledge from logged events is an emerging active research area in data mining. In this context, we propose an approach for mining frequent events and association rules from logged events in XML format. This approach is composed of two-main phases: I) constructing a novel tree structure called Frequency XML-based Tree (FXT), which contains the frequency of events to be mined; II) querying the constructed FXT using XQuery to discover frequent itemsets and association rules. The FXT is constructed with a single-pass over logged data. We implement the proposed algorithm and study various performance issues. The performance study shows that the algorithm is efficient, for both constructing the FXT and discovering association rules

    A unified view of data-intensive flows in business intelligence systems : a survey

    Get PDF
    Data-intensive flows are central processes in today’s business intelligence (BI) systems, deploying different technologies to deliver data, from a multitude of data sources, in user-preferred and analysis-ready formats. To meet complex requirements of next generation BI systems, we often need an effective combination of the traditionally batched extract-transform-load (ETL) processes that populate a data warehouse (DW) from integrated data sources, and more real-time and operational data flows that integrate source data at runtime. Both academia and industry thus must have a clear understanding of the foundations of data-intensive flows and the challenges of moving towards next generation BI environments. In this paper we present a survey of today’s research on data-intensive flows and the related fundamental fields of database theory. The study is based on a proposed set of dimensions describing the important challenges of data-intensive flows in the next generation BI setting. As a result of this survey, we envision an architecture of a system for managing the lifecycle of data-intensive flows. The results further provide a comprehensive understanding of data-intensive flows, recognizing challenges that still are to be addressed, and how the current solutions can be applied for addressing these challenges.Peer ReviewedPostprint (author's final draft

    Automated speech and audio analysis for semantic access to multimedia

    Get PDF
    The deployment and integration of audio processing tools can enhance the semantic annotation of multimedia content, and as a consequence, improve the effectiveness of conceptual access tools. This paper overviews the various ways in which automatic speech and audio analysis can contribute to increased granularity of automatically extracted metadata. A number of techniques will be presented, including the alignment of speech and text resources, large vocabulary speech recognition, key word spotting and speaker classification. The applicability of techniques will be discussed from a media crossing perspective. The added value of the techniques and their potential contribution to the content value chain will be illustrated by the description of two (complementary) demonstrators for browsing broadcast news archives

    A Survey on Actionable Knowledge

    Full text link
    Actionable Knowledge Discovery (AKD) is a crucial aspect of data mining that is gaining popularity and being applied in a wide range of domains. This is because AKD can extract valuable insights and information, also known as knowledge, from large datasets. The goal of this paper is to examine different research studies that focus on various domains and have different objectives. The paper will review and discuss the methods used in these studies in detail. AKD is a process of identifying and extracting actionable insights from data, which can be used to make informed decisions and improve business outcomes. It is a powerful tool for uncovering patterns and trends in data that can be used for various applications such as customer relationship management, marketing, and fraud detection. The research studies reviewed in this paper will explore different techniques and approaches for AKD in different domains, such as healthcare, finance, and telecommunications. The paper will provide a thorough analysis of the current state of AKD in the field and will review the main methods used by various research studies. Additionally, the paper will evaluate the advantages and disadvantages of each method and will discuss any novel or new solutions presented in the field. Overall, this paper aims to provide a comprehensive overview of the methods and techniques used in AKD and the impact they have on different domains
    corecore