5 research outputs found

    Assessing the impact of employing machine learning-based baseline load prediction pipelines with sliding-window training scheme on offered flexibility estimation for different building categories

    This study assesses how the performance of baseline load prediction pipelines affects the accuracy with which the grid operator can estimate the flexibility offered by different categories of buildings. To that end, different machine learning (ML) algorithms, with sliding-window and offline training schemes, are investigated and compared for hour-ahead baseline load prediction. Using a dataset of smart meter measurements, the training window sizes and the most promising pipeline for each building category are first identified. Next, the consumption profiles of five buildings (belonging to each category), under regular operation (baseline load) and while offering flexibility, are physically simulated. Finally, the identified pipelines are used to predict the baseline loads, and the resulting error in estimating the provided flexibility is determined. The results show that the most promising prediction pipeline identified (extra trees algorithm with a sliding window of 5 weeks) performs notably better than offline training (average R² score of 0.91 vs. 0.87). These pipelines allow the provided flexibility to be estimated with acceptable accuracy (mean relative error of the flexibility index between -2.45% and +2.79%), enabling the grid operator to guarantee fair compensation for the flexibility that buildings offer.
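
As a rough illustration of the sliding-window training scheme mentioned in this abstract, the sketch below generates the training window and target hour for each hour-ahead prediction. It assumes hourly samples and a 5-week window (5 × 168 = 840 hours); the function name and parameters are hypothetical and are not taken from the paper.

```python
from typing import Iterator, Tuple

HOURS_PER_WEEK = 7 * 24


def sliding_window_splits(
    n_samples: int, window_weeks: int = 5, horizon: int = 1
) -> Iterator[Tuple[range, int]]:
    """Yield (training index range, target index) pairs.

    Assumes one sample per hour. The training range always holds the
    most recent `window_weeks` weeks that end `horizon` hours before
    the target hour.
    """
    window = window_weeks * HOURS_PER_WEEK
    # The first target is the earliest hour with a full window of history.
    for target in range(window + horizon - 1, n_samples):
        train_end = target - horizon + 1  # exclusive upper bound
        yield range(train_end - window, train_end), target
```

In a sliding-window scheme the model (for example an extra trees regressor) would be refitted on the rows of each training range before predicting the target hour, whereas offline training fits once on a fixed historical set.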

    Estimação de energia e qualidade de dados em condições de fina segmentação e alto ruído de empilhamento (Energy estimation and data quality under conditions of fine segmentation and high pile-up noise)

    The hadronic calorimeter (TileCal) of ATLAS (A Toroidal LHC ApparatuS), one of the major experiments at the LHC (Large Hadron Collider) particle accelerator at CERN, consists of more than 10,000 readout channels operating at a 40 MHz event rate. The quality of the results obtained in this experiment depends on correctly estimating the energy of the particles that interact with its material. Energy estimation can be compromised by several factors, such as noisy channels, the method chosen for online or offline energy estimation and, above all, electronic and pile-up noise. This work presents a method that uses a minimum variance estimator to mitigate noise in groupings of readout channels of a calorimeter built with readout redundancy. It is also shown that this method can be used to identify and mask noisy channels of a calorimeter. In addition, measures for evaluating energy estimation algorithms using real particle collision data are presented. The results show that the proposed method improves the precision of energy estimation by up to 41% without compromising its accuracy and, in some cases, improves accuracy by bringing the estimate closer to the true value. The method is also independent of the estimation algorithm used for each channel and is effective in several pile-up noise scenarios. The evaluation measures proved effective in assessing an online estimation algorithm in TileCal.
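
The abstract does not spell out the estimator itself. As a minimal sketch of the general idea, the textbook minimum variance combination of redundant readings is inverse-variance weighting, which assumes independent, unbiased channel readings with known noise variances. The function below is a hypothetical illustration of that textbook case, not the method of the thesis, which may for instance account for correlated pile-up noise.

```python
from typing import Sequence, Tuple


def inverse_variance_combine(
    values: Sequence[float], variances: Sequence[float]
) -> Tuple[float, float]:
    """Combine redundant readings of the same quantity.

    Returns (estimate, variance of the estimate). Assumes independent,
    unbiased readings whose noise variances are known and positive.
    """
    if not values or len(values) != len(variances):
        raise ValueError("values and variances must be non-empty and equal in length")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")
    # Each reading is weighted by the inverse of its noise variance.
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    estimate = sum(w * x for w, x in zip(weights, values)) / total
    return estimate, 1.0 / total
```

The combined variance, 1 / Σ(1/σᵢ²), is never larger than the smallest individual variance, which is why reading the same cell through redundant channels can improve precision. A channel with a very large estimated variance receives almost no weight, which hints at how such a scheme could also flag noisy channels.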

    Analytical Techniques for the Improvement of Mass Spectrometry Protein Profiling

    Bioinformatics is rapidly advancing through the "post-genomic" era following the sequencing of the human genome. In preparation for studying the inner workings of genes, proteins and even smaller biological elements, several subdivisions of bioinformatics have developed. Proteomics, the subdivision concerned with the structure and function of proteins, has been aided by mass spectrometry as a data source. Biofluid or tissue samples are rapidly assayed for their protein composition. The resulting mass spectra are analyzed using machine learning techniques to discover reliable patterns that discriminate samples from two populations, for example healthy versus diseased, or treatment responders versus non-responders. However, this data source is imperfect and faces several challenges: unwanted variability arising from the data collection process, obtaining a robust discriminative model that generalizes well to future data, and validating a predictive pattern statistically and biologically. This thesis presents several techniques that attempt to deal intelligently with the problems facing each stage of the analytical process. First, an automatic preprocessing method selection system is demonstrated. This system learns from data and selects the combination of preprocessing methods most appropriate for the task at hand, which reduces the noise affecting potential predictive patterns. Our results suggest that this method can help adapt to data from different technologies, improving downstream predictive performance. Next, the issues of feature selection and predictive modeling are revisited with respect to the unique challenges posed by proteomic profile data. Approaches to model selection through kernel learning are also investigated. Key insights are obtained for designing the feature selection and predictive modeling portion of the analytical framework. Finally, methods for interpreting the results of predictive modeling are demonstrated. These methods are used to assure the user of various desirable properties: validation of the strength of a predictive model, validation of reproducible signal across multiple data generation sessions, and generalizability of predictive models to future data. A method for labeling profile features with biological identities is also presented, which aids in the interpretation of the data. Overall, these novel techniques give the protein profiling community additional support and leverage to improve the predictive capability of the technology.