4 research outputs found

    Rover: An online Spark SQL tuning service via generalized transfer learning

    Distributed data analytics engines like Spark are common choices for processing massive data in industry. However, the performance of Spark SQL depends heavily on the choice of configurations, and the optimal configuration varies with the executed workload. Among the alternatives for Spark SQL tuning, Bayesian optimization (BO) is a popular framework that finds near-optimal configurations given a sufficient budget, but it suffers from the re-optimization issue and is not practical in real production. When applying transfer learning to accelerate the tuning process, we notice two domain-specific challenges: 1) most previous work focuses on transferring tuning history, while expert knowledge from Spark engineers has great potential to improve tuning performance but has not been well studied so far; 2) history tasks should be used carefully, since using dissimilar ones leads to deteriorated performance in production. In this paper, we present Rover, a deployed online Spark SQL tuning service for efficient and safe search on industrial workloads. To address these challenges, we propose generalized transfer learning to boost tuning performance based on external knowledge, including expert-assisted Bayesian optimization and controlled history transfer. Experiments on public benchmarks and real-world tasks show the superiority of Rover over competitive baselines. Notably, Rover saves an average of 50.1% of the memory cost on 12k real-world Spark SQL tasks in 20 iterations, among which 76.2% of the tasks achieve a significant memory reduction of over 60%. Comment: Accepted by KDD 202

    A data mining-based method for revealing occupant behavior patterns in using mechanical ventilation systems of Dutch dwellings

    Occupant behaviors significantly influence the energy consumption of dwelling mechanical ventilation systems. There is still a lack of effective methods for analyzing how occupants adjust the mechanical ventilation systems in buildings. Therefore, this study proposes a data mining-based method to reveal occupant behavior patterns and the motivations behind them. A first derivative Gaussian filter-based approach is developed to detect when an occupant increases or decreases the mechanical ventilation flowrate without direct measurements. A logistic regression-based statistical analysis approach is developed to find the crucial factors influencing the behaviors of increasing and decreasing the ventilation flowrate. A K-means clustering-based analysis approach is introduced to further find the motivations behind the behaviors. The proposed data mining-based method successfully discovers the ventilation behaviors, and the crucial factors influencing them, for the occupants of 10 dwellings located in a Dutch community. The motivation patterns of the ventilation flowrate adjustment behaviors are further revealed based on the discovered crucial factors. The discovered insights are useful for providing more accurate assumptions and inputs for mechanical ventilation system models. They are also helpful for generating tailored design, refurbishment and control strategies.

    A data mining approach to analyze occupant behavior motivation

    Occupants' behavior can have a significant impact on the performance of the built environment. Methods for analyzing people's behavior have not been adequately developed, and traditional methods such as surveys or interviews are not efficient. This study proposes a data-driven method to analyze occupants' behavior, supported by a specific case of analyzing people's adjustments to the ventilation system in a Dutch community. At the individual level, to analyze the motivation of a single person, a logistic regression-based approach is proposed to classify occupants' behavior of increasing or decreasing the ventilation flowrate and then reveal the motivations behind it. At the community level, the behavior motivations derived from different occupants are compared. Three motivational behavior patterns are summarized: the environment-driven type, the time-driven type and the mixed type. The proposed mining method is useful for discovering and developing occupant behavior models.

    A data mining-based method for revealing occupant behavior patterns in using mechanical ventilation systems of Dutch dwellings

    Occupant behaviors significantly influence the energy consumption of dwelling mechanical ventilation systems. There is still a lack of effective methods for analyzing how occupants adjust the mechanical ventilation systems in buildings. Therefore, this study proposes a data mining-based method to reveal occupant behavior patterns and the motivations behind them. A first derivative Gaussian filter-based approach is developed to detect when an occupant increases or decreases the mechanical ventilation flowrate without direct measurements. A logistic regression-based statistical analysis approach is developed to find the crucial factors influencing the behaviors of increasing and decreasing the ventilation flowrate. A K-means clustering-based analysis approach is introduced to further find the motivations behind the behaviors. The proposed data mining-based method successfully discovers the ventilation behaviors, and the crucial factors influencing them, for the occupants of 10 dwellings located in a Dutch community. The motivation patterns of the ventilation flowrate adjustment behaviors are further revealed based on the discovered crucial factors. The discovered insights are useful for providing more accurate assumptions and inputs for mechanical ventilation system models. They are also helpful for generating tailored design, refurbishment and control strategies.