8,814 research outputs found

    Garbage collection auto-tuning for Java MapReduce on Multi-Cores

    MapReduce has been widely accepted as a simple programming pattern that can form the basis for efficient, large-scale, distributed data processing. The success of the MapReduce pattern has led to a variety of implementations for different computational scenarios. In this paper we present MRJ, a MapReduce Java framework for multi-core architectures. We evaluate its scalability on a four-core, hyperthreaded Intel Core i7 processor, using a set of standard MapReduce benchmarks. We investigate the significant impact that Java runtime garbage collection has on the performance and scalability of MRJ. We propose the use of memory management auto-tuning techniques based on machine learning. With our auto-tuning approach, we are able to achieve MRJ performance within 10% of optimal on 75% of our benchmark tests.

    Data Mining to Uncover Heterogeneous Water Use Behaviors From Smart Meter Data

    Knowledge on the determinants and patterns of water demand for different consumers supports the design of customized demand management strategies. Smart meters coupled with big data analytics tools create a unique opportunity to support such strategies. Yet, at present, the information content of smart meter data is not fully mined and usually needs to be complemented with water fixture inventory and survey data to achieve detailed customer segmentation based on end use water usage. In this paper, we developed a data-driven approach that extracts information on heterogeneous water end use routines, main end use components, and temporal characteristics, only via data mining existing smart meter readings at the scale of individual households. We tested our approach on data from 327 households in Australia, each monitored with smart meters logging water use readings every 5 s. As part of the approach, we first disaggregated the household-level water use time series into different end uses via Autoflow. We then adapted a customer segmentation based on eigenbehavior analysis to discriminate among heterogeneous water end use routines and identify clusters of consumers presenting similar routines. Results revealed three main water end use profile clusters, each characterized by a primary end use: shower, clothes washing, and irrigation. Time-of-use and intensity-of-use differences exist within each class, as well as different characteristics of regularity and periodicity over time. Our customer segmentation analysis approach provides utilities with a concise snapshot of recurrent water use routines from smart meter data and can be used to support customized demand management strategies. TU Berlin, Open-Access-Mittel - 201

    Mining typical load profiles in buildings to support energy management in the smart city context

    Mining typical load profiles in buildings to drive energy management strategies is a fundamental task to be addressed in a smart city environment. In this work, a general framework on load profiles characterisation in buildings based on the recent scientific literature is proposed. The process relies on the combination of different pattern recognition and classification algorithms in order to provide a robust insight into the energy usage patterns at different levels and at different scales (from single building to stock of buildings). Several implications related to energy profiling in buildings, including tariff design, demand side management and advanced energy diagnosis, are discussed. Moreover, a robust methodology to mine typical energy patterns to support advanced energy diagnosis in buildings is introduced by analysing the monitored energy consumption of a cooling/heating mechanical room.

    Identifying clusters of anomalous payments in the Salvadorian payment system

    We develop an unsupervised methodology to group payments and identify possible anomalies. With our methodology, we identify clusters based on a set of network features, using transactional (unlabeled) information from a systemically important payment system of El Salvador. We first preprocess network features, such as degree and strength, and reduce the dimensionality of the newly defined data through a principal components analysis; we then feed the main variables into clustering algorithms (k-means and DBSCAN) to analyze anomalous payments. We then analyze these clusters using a random forest to obtain the main network feature. Our results suggest that the proposed methodology works very well to detect anomalous payments. The case of El Salvador is particularly important to study because of the recent restructuring of the Massive Payment System (SPM), promoted by the Transfer365 project, through which the authorities want to increase financial inclusion. This change will make the SPM available to the public in order to diversify services and incorporate more participants; historically, it has operated with only three active participants. We expect that Transfer365 will interconnect the LBTR participants' systems with their banking core, the systems of the Ministry of Finance, and other authorized participants to channel large payment flows. Identifying possible anomalies through our methodology will therefore enhance risk monitoring and management by payment systems overseers.