
    Random rules from data streams

    Existing works suggest that random inputs and random features produce good results in classification. In this paper we study the problem of generating random rule sets from data streams. One of the most interpretable and flexible models for prediction tasks in data stream mining is the Very Fast Decision Rules (VFDR) learner. In this work we extend the VFDR algorithm to learn random rules from data streams. The proposed algorithm generates several rule sets, each associated with a set of N_att randomly chosen attributes, and maintains all properties required when learning from stationary data streams: online and any-time classification, processing each example only once. Copyright 2013 ACM
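    As an illustration of the idea only, here is a minimal sketch assuming a simple voting scheme over rule sets restricted to random attribute subsets; the class and parameter names (RandomRuleSets, n_sets, n_att) are hypothetical, and the actual VFDR extension grows rules using Hoeffding-bound statistics rather than the toy count tables used here.

```python
# Sketch: several rule sets over random attribute subsets, updated online.
# Names (RandomRuleSets, n_sets, n_att) are illustrative, not from the paper.
import random
from collections import Counter, defaultdict

class RandomRuleSets:
    def __init__(self, n_attributes, n_sets=5, n_att=3, seed=0):
        rng = random.Random(seed)
        # Each rule set sees only a random subset of n_att attributes.
        self.subsets = [tuple(sorted(rng.sample(range(n_attributes), n_att)))
                        for _ in range(n_sets)]
        # Toy "rule": class counts keyed by the projected attribute values.
        self.tables = [defaultdict(Counter) for _ in self.subsets]

    def _key(self, x, subset):
        return tuple(x[i] for i in subset)

    def learn_one(self, x, y):
        # Single pass: each example is processed only once.
        for subset, table in zip(self.subsets, self.tables):
            table[self._key(x, subset)][y] += 1

    def predict_one(self, x):
        # Any-time prediction: vote across the rule sets learned so far.
        votes = Counter()
        for subset, table in zip(self.subsets, self.tables):
            counts = table.get(self._key(x, subset))
            if counts:
                votes[counts.most_common(1)[0][0]] += 1
        return votes.most_common(1)[0][0] if votes else None

model = RandomRuleSets(n_attributes=4)
model.learn_one((1, 0, 2, 1), "a")
model.learn_one((1, 1, 2, 0), "b")
print(model.predict_one((1, 0, 2, 1)))  # -> "a"
```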

    Development of a heuristic methodology for designing measurement networks for precise metal accounting

    This thesis investigates the development of a heuristic-based methodology for designing measurement networks, with application to the precise accounting of metal flows in mineral beneficiation operations. The term 'measurement network' refers to the system of sampling and weight measurement equipment from which process measurements are routinely collected. Metal accounting is defined as the estimation of saleable metal in the mine and subsequent process streams over a defined time period. One of the greatest challenges facing metal accounting is the uncertainty caused by random errors, and sometimes gross errors, present in process measurements. While gross errors can be eliminated through correct measurement practices, random errors are an inherent property of measured data and can only be minimised. Two types of rules for designing measurement networks were considered. The first type, referred to as 'expert heuristics', consists of (i) Code of Practice guidelines from the AMIRA P754 Code, and (ii) prevailing accounting practices in the mineral and metallurgical processing industry, obtained through a questionnaire survey campaign. It was hypothesised that experts in the industry design measurement networks using rules or guidelines that ensure the requisite quality in metal accounting. The second type, referred to as 'mathematical heuristics', was derived from symbolic manipulation of the general steady-state linear data reconciliation solution, as well as from an intensive numerical study, conducted as part of this work, on the variance reduction response of measurements after data reconciliation. These heuristics are based on the general principle of variance reduction through data reconciliation. It was hypothesised that data reconciliation can be used to target variance reduction for selected measurements by exploiting characteristics of entire measurement networks as well as of individual measurements.
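    The variance-reduction principle behind the 'mathematical heuristics' can be sketched numerically with the textbook steady-state linear reconciliation solution; the one-node flowsheet, measurement values, and covariances below are invented for illustration and are not taken from the thesis.

```python
# Sketch of steady-state linear data reconciliation and the variance
# reduction it yields (illustrative numbers; not from the thesis).
import numpy as np

# One node with two feeds (x1, x2) and one product (x3): x1 + x2 - x3 = 0.
A = np.array([[1.0, 1.0, -1.0]])          # incidence (constraint) matrix
y = np.array([100.0, 52.0, 149.0])        # raw measurements (t/h, say)
V = np.diag([4.0, 1.0, 9.0])              # measurement error covariance

# Classical weighted least-squares solution subject to A x = 0:
#   x_hat = y - V A^T (A V A^T)^-1 A y
K = V @ A.T @ np.linalg.inv(A @ V @ A.T)
x_hat = y - K @ (A @ y)
V_hat = V - K @ A @ V                     # covariance after reconciliation

print("reconciled:", x_hat)               # balances exactly: x1 + x2 = x3
print("variance reduction:", np.diag(V) - np.diag(V_hat))
```

    In this toy example the least precise measurement (the product stream, variance 9) receives the largest variance reduction; it is this kind of network-level response that heuristics based on reconciliation can exploit.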

    GreedyDual-Join: Locality-Aware Buffer Management for Approximate Join Processing Over Data Streams

    We investigate adaptive buffer management techniques for approximate evaluation of sliding window joins over multiple data streams. In many applications, data stream processing systems have limited memory or must handle very high speed data streams. In both cases, computing the exact results of joins between these streams may not be feasible, mainly because the buffers used to compute the joins hold far fewer tuples than the sliding windows themselves. A stream buffer management policy is therefore needed. We show that the buffer replacement policy is an important determinant of the quality of the produced results. To that end, we propose GreedyDual-Join (GDJ), an adaptive and locality-aware buffering technique for managing these buffers. GDJ exploits the temporal correlations (at both long and short time scales) that we found to be prevalent in many real data streams. Our algorithm is readily applicable to multiple data streams and multiple joins and requires almost no additional system resources. We report the results of an experimental study using both synthetic and real-world data sets. Our results demonstrate the superiority and flexibility of our approach when contrasted with other recently proposed techniques.
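    A minimal sketch of a GreedyDual-style replacement policy adapted to a join buffer follows; the utility function (recent match count) and the class layout are assumptions for illustration and do not reproduce the paper's exact GDJ formulation.

```python
# Sketch of a GreedyDual-style replacement policy for a join buffer
# (the benefit function here is illustrative, not the paper's exact GDJ).
import heapq

class GreedyDualBuffer:
    def __init__(self, capacity):
        self.capacity = capacity
        self.L = 0.0                  # global inflation ("aging") value
        self.credit = {}              # tuple key -> current credit H
        self.heap = []                # (H, key) min-heap with lazy deletion

    def _benefit(self, matches):
        # Utility of keeping a tuple: here, its recent join match count.
        return float(matches)

    def touch(self, key, matches=1):
        # Insert a tuple, or refresh it on a join match (temporal locality).
        if key not in self.credit and len(self.credit) >= self.capacity:
            self._evict()
        self.credit[key] = self.L + self._benefit(matches)
        heapq.heappush(self.heap, (self.credit[key], key))

    def _evict(self):
        while self.heap:
            h, key = heapq.heappop(self.heap)
            if self.credit.get(key) == h:   # skip stale heap entries
                del self.credit[key]
                self.L = h                  # implicitly ages all others
                return

buf = GreedyDualBuffer(capacity=2)
buf.touch("a", matches=3)
buf.touch("b", matches=1)
buf.touch("a", matches=3)     # "a" stays hot
buf.touch("c", matches=1)     # evicts "b" (lowest credit)
print(sorted(buf.credit))     # -> ['a', 'c']
```

    Tuples that keep producing matches have their credit refreshed above the global value L, while cold tuples sink toward L and are evicted first; this is how a GreedyDual-style policy captures temporal locality.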

    iTeleScope: Intelligent Video Telemetry and Classification in Real-Time using Software Defined Networking

    Video continues to dominate network traffic, yet operators today have poor visibility into the number, duration, and resolutions of the video streams traversing their domain. Current approaches are inaccurate, expensive, or unscalable, as they rely on statistical sampling, middle-box hardware, or packet inspection software. We present iTeleScope, the first intelligent, inexpensive, and scalable SDN-based solution for identifying and classifying video flows in real-time. Our solution is novel in combining dynamic flow rules with telemetry and machine learning, and is built on commodity OpenFlow switches and open-source software. We develop a fully functional system, train it in the lab using multiple machine learning algorithms, and validate its performance, showing over 95% accuracy in identifying and classifying video streams from many providers including YouTube and Netflix. Lastly, we conduct tests to demonstrate its scalability to tens of thousands of concurrent streams, and deploy it live on a campus network serving several hundred real users. Our system gives operators of enterprise and carrier networks unprecedented fine-grained real-time visibility into video streaming performance at very low cost.
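    As a toy sketch of the flow-classification step only: assume per-flow byte and packet counters polled from OpenFlow switches are fed to a supervised classifier. The feature set, numbers, and model choice below are invented and are not the paper's.

```python
# Toy sketch: per-flow telemetry features -> video / non-video label.
# Features and data are synthetic; the paper's feature set is not reproduced.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

def flow_features(n, video):
    # [mean bytes/s, std bytes/s, mean packets/s] per polling window.
    base = (600_000, 150_000, 300) if video else (40_000, 60_000, 80)
    return np.column_stack([rng.normal(m, m * 0.2, n) for m in base])

X = np.vstack([flow_features(500, True), flow_features(500, False)])
y = np.array([1] * 500 + [0] * 500)       # 1 = video stream

clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
print(clf.predict(flow_features(3, True)))   # expect [1 1 1]
```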

    Adaptive Process Control with Fuzzy Logic and Genetic Algorithms

    Researchers at the U.S. Bureau of Mines have developed adaptive process control systems in which genetic algorithms (GAs) are used to augment fuzzy logic controllers (FLCs). GAs are search algorithms that rapidly locate near-optimum solutions to a wide spectrum of problems by modeling the search procedures of natural genetics. FLCs are rule-based systems that efficiently manipulate a problem environment by modeling the 'rule-of-thumb' strategy used in human decision-making. Together, GAs and FLCs possess the capabilities necessary to produce powerful, efficient, and robust adaptive control systems. To perform efficiently, such a control system requires a control element to manipulate the problem environment, an analysis element to recognize changes in the problem environment, and a learning element to adjust to those changes. Details of an overall adaptive control system are discussed, and a laboratory acid-base pH system is used to demonstrate the ideas presented.
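    A minimal sketch of the GA-tunes-FLC idea, assuming a three-rule fuzzy controller with triangular membership functions on the pH error and a toy first-order plant; all dynamics and parameters below are invented, and the Bureau of Mines system was far more detailed.

```python
# Minimal sketch: a GA tunes the rule outputs of a fuzzy logic controller
# on a toy acid-base mixing model (plant and parameters are invented).
import random

def memberships(e):
    # Triangular fuzzy sets on the pH error: negative / zero / positive.
    neg = max(min(-e / 2.0, 1.0), 0.0)
    zero = max(1.0 - abs(e) / 2.0, 0.0)
    pos = max(min(e / 2.0, 1.0), 0.0)
    return neg, zero, pos

def flc(e, rule_outputs):
    # Weighted-average defuzzification over the three rules.
    w = memberships(e)
    return sum(wi * ri for wi, ri in zip(w, rule_outputs)) / (sum(w) or 1.0)

def cost(rule_outputs, setpoint=7.0, steps=50):
    # Toy plant: base dosing u nudges pH upward against an acid drift.
    ph, err = 4.0, 0.0
    for _ in range(steps):
        u = flc(setpoint - ph, rule_outputs)
        ph += 0.5 * u - 0.05 * (ph - 4.0)   # dosing effect + acid drift
        err += (setpoint - ph) ** 2
    return err

def ga(pop_size=30, gens=40, seed=1):
    # Elitist GA: keep the best half, breed children by blend crossover
    # plus Gaussian mutation, and minimize the accumulated squared error.
    rng = random.Random(seed)
    pop = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=cost)
        elite = pop[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            children.append([(x + y) / 2 + rng.gauss(0, 0.1)
                             for x, y in zip(a, b)])
        pop = elite + children
    return min(pop, key=cost)

best = ga()
print("tuned rule outputs:", best, "cost:", round(cost(best), 2))
```

    The GA here plays the learning element: it adjusts the controller's rule outputs as the plant model defines the problem environment, mirroring the control/analysis/learning split described above.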