
    Facilitating and Enhancing the Performance of Model Selection for Energy Time Series Forecasting in Cluster Computing Environments

    Applying Machine Learning (ML) manually to a given problem setting is a tedious and time-consuming process that brings many challenges, especially in the context of Big Data. In such a context, gaining insight, finding patterns, and extracting knowledge from large datasets are complex tasks. Additionally, the configuration of the underlying Big Data infrastructure adds complexity to configuring and running ML tasks. With the growing interest in ML over the last few years, people without extensive ML expertise in particular have a high demand for frameworks that assist them in applying the right ML algorithm to their problem setting. This is especially true in the field of smart energy system applications, where more and more ML algorithms are used, e.g. for time series forecasting. Generally, two groups of non-expert users who perform energy time series forecasting are distinguished. The first group includes users who are familiar with statistics and ML but are not able to write the programming code needed to train and evaluate ML models using the well-known trial-and-error approach. Such an approach is time-consuming and wastes resources on constructing multiple models. The second group is even less experienced in programming and is not knowledgeable in statistics and ML, but wants to apply given ML solutions to its problem settings. The goal of this thesis is to scientifically explore, in the context of concrete use cases in the energy domain, how such non-expert users can be optimally supported in creating and performing ML tasks in practice on cluster computing environments. To support the first group of non-expert users, an easy-to-use, modular, extendable, microservice-based ML solution for instrumenting and evaluating ML algorithms on top of a Big Data technology stack is conceptualized and evaluated.
    Our proposed solution facilitates applying the trial-and-error approach by hiding the low-level complexities from the users and provides the best conditions for performing ML tasks efficiently in cluster computing environments. To support the second group of non-expert users, the first solution is extended to realize meta learning approaches for automated model selection. We evaluate how meta learning technology can be efficiently applied to the problem space of data analytics for smart energy systems to assist energy system experts who are not data analytics experts in applying the right ML algorithms to their data analytics problems. To enhance the predictive performance of meta learning, an efficient characterization of energy time series datasets is required. To this end, Descriptive Statistics Time based Meta Features (DSTMF), a new kind of meta features, are designed to accurately capture the deep characteristics of energy time series datasets. We find that DSTMF outperforms the other state-of-the-art meta feature sets introduced in the literature to characterize energy time series datasets, in terms of both the accuracy of the meta learning models and the time needed to extract the features. Further enhancement in the predictive performance of the meta learning classification model is achieved by training the meta learner on new efficient meta examples. To this end, we propose two new approaches to generate new energy time series datasets to be used as training meta examples by the meta learner, depending on the type of time series dataset (i.e. generation or energy consumption time series). We find that extending the original training sets with new meta examples generated by our approaches outperforms the case in which the original is extended by new simulated energy time series datasets.
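The general idea of meta-feature-based model selection described in this abstract can be illustrated with a minimal sketch. It is not the thesis's method: the abstract does not define the DSTMF set, so the four descriptive statistics below, the 1-nearest-neighbour meta-learner, the toy series, and the labels `model_A` and `model_B` are all hypothetical placeholders chosen for illustration.

```python
import math
import statistics


def meta_features(series):
    """Describe a time series by a few descriptive statistics.

    Hypothetical stand-in for a meta-feature set; NOT the DSTMF definition.
    """
    return (
        statistics.fmean(series),   # mean level
        statistics.pstdev(series),  # population standard deviation
        min(series),
        max(series),
    )


def recommend(series, meta_examples):
    """1-nearest-neighbour meta-learner (an assumed, simple choice).

    Each meta example is a pair (meta-feature tuple, label of the model
    that is assumed to have performed best on that dataset).
    """
    target = meta_features(series)
    closest = min(meta_examples, key=lambda ex: math.dist(ex[0], target))
    return closest[1]


# Toy meta examples with made-up labels, for illustration only.
meta_examples = [
    (meta_features([10, 12, 11, 13, 12, 11]), "model_A"),  # smooth series
    (meta_features([0, 40, 5, 60, 2, 55]), "model_B"),     # spiky series
]

# A new, spiky series is matched to the spiky meta example.
recommendation = recommend([1, 50, 3, 45, 4, 58], meta_examples)
```

In practice the meta-features would usually be normalised before distances are computed, and the meta-learner would be trained on many more meta examples.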

    Detecting modules in dense weighted networks with the Potts method

    We address the problem of multiresolution module detection in dense weighted networks, where the modular structure is encoded in the weights rather than topology. We discuss a weighted version of the q-state Potts method, which was originally introduced by Reichardt and Bornholdt. This weighted method can be directly applied to dense networks. We discuss the dependence of the resolution of the method on its tuning parameter and network properties, using sparse and dense weighted networks with built-in modules as example cases. Finally, we apply the method to data on stock price correlations, and show that the resulting modules correspond well to known structural properties of this correlation network. Comment: 14 pages, 6 figures. v2: 1 figure added, 1 reference added, minor changes. v3: 3 references added, minor changes.
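For orientation, a weighted Potts Hamiltonian of the Reichardt-Bornholdt type is commonly written as below. The abstract itself does not state the formula, so this is a sketch under assumptions: the null-model term $p_{ij}$ (here a configuration-model choice based on node strengths) and the symbols $\gamma$, $s_i$, and $W$ are introduced for illustration.

```latex
\[
  \mathcal{H}(\{\sigma\})
    = -\sum_{i<j} \bigl( w_{ij} - \gamma\, p_{ij} \bigr)\,
      \delta(\sigma_i, \sigma_j),
  \qquad
  p_{ij} = \frac{s_i s_j}{2W},
\]
% w_{ij}   : weight of the link between nodes i and j
% \sigma_i : module (spin state) assigned to node i
% s_i      = \sum_j w_{ij}            (node strength)
% W        = \tfrac{1}{2} \sum_i s_i  (total link weight)
% \gamma   : tuning parameter that sets the resolution
```

Modules correspond to groups of nodes sharing the same spin state in a low-energy configuration. Under this form, larger values of $\gamma$ penalise grouping more strongly and therefore favour smaller modules.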

    Modular invariants and subfactors

    In this lecture we explain the intimate relationship between modular invariants in conformal field theory and braided subfactors in operator algebras. Our analysis is based on an approach to modular invariants using braided sector induction ("$\alpha$-induction") arising from the treatment of conformal field theory in the Doplicher-Haag-Roberts framework. Many properties of modular invariants which have so far been noticed empirically and considered mysterious can be rigorously derived in a very general setting in the subfactor context. For example, the connection between modular invariants and graphs (cf. the A-D-E classification for $SU(2)_k$) finds a natural explanation and interpretation. We try to give an overview of the current state of affairs concerning the expected equivalence between the classifications of braided subfactors and modular invariant two-dimensional conformal field theories. Comment: 25 pages, AMS LaTeX, epic, eepic, doc-class fic-1.cl

    Modular generalized Springer correspondence III: exceptional groups

    We complete the construction of the modular generalized Springer correspondence for an arbitrary connected reductive group, with a uniform proof of the disjointness of induction series that avoids the case-by-case arguments for classical groups used in previous papers in the series. We show that the induction series containing the trivial local system on the regular nilpotent orbit is determined by the Sylow subgroups of the Weyl group. Under some assumptions, we give an algorithm for determining the induction series associated to the minimal cuspidal datum with a given central character. We also provide tables and other information on the modular generalized Springer correspondence for quasi-simple groups of exceptional type, including a complete classification of cuspidal pairs in the case of good characteristic, and a full determination of the correspondence in type $G_2$. Comment: 40 pages. Version 2: added section 7.5, modified Table 5.2 to match current conventions of GAP3. Version 3 has minor edits suggested by the referee, including a slight strengthening of Proposition 3.2; final version, to appear in Math. Annalen.

    3d Modularity

    We find and propose an explanation for a large variety of modularity-related symmetries in problems of 3-manifold topology and physics of 3d $\mathcal{N}=2$ theories where such structures a priori are not manifest. These modular structures include: mock modular forms, $SL(2,\mathbb{Z})$ Weil representations, quantum modular forms, non-semisimple modular tensor categories, and chiral algebras of logarithmic CFTs. Comment: 119 pages, 10 figures and 20 tables.

    Computationally efficient induction of classification rules with the PMCRI and J-PMCRI frameworks

    In order to gain knowledge from large databases, scalable data mining technologies are needed. Data are captured on a large scale and thus databases are increasing at a fast pace. This leads to the utilisation of parallel computing technologies in order to cope with large amounts of data. In the area of classification rule induction, parallelisation of classification rules has focused on the divide and conquer approach, also known as the Top Down Induction of Decision Trees (TDIDT). An alternative approach to classification rule induction is separate and conquer which has only recently been in the focus of parallelisation. This work introduces and evaluates empirically a framework for the parallel induction of classification rules, generated by members of the Prism family of algorithms. All members of the Prism family of algorithms follow the separate and conquer approach.
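The separate and conquer strategy mentioned in this abstract can be sketched as a small serial covering loop. This is a simplified illustration in the spirit of Prism, not the PMCRI or J-PMCRI framework and not a parallel implementation: the function name `learn_rules_for_class`, the tie-breaking rule, and the toy weather-style data are assumptions made for the example.

```python
def learn_rules_for_class(rows, target):
    """Serial separate-and-conquer sketch for one target class.

    `rows` is a list of (features_dict, label) pairs with categorical
    features. Returns a list of rules, each a dict {attribute: value}.
    """
    remaining = list(rows)  # instances not yet covered by any rule
    rules = []
    while any(label == target for _, label in remaining):
        rule, covered = {}, remaining
        # Conquer: specialise the rule until it covers only `target`.
        while any(label != target for _, label in covered):
            candidates = {
                (a, v)
                for feats, _ in covered
                for a, v in feats.items()
                if a not in rule
            }
            best, best_score = None, (-1.0, 0)
            for a, v in sorted(candidates):
                subset = [(f, l) for f, l in covered if f[a] == v]
                hits = sum(1 for _, l in subset if l == target)
                # Prefer high precision, then high coverage (assumed tie-break).
                score = (hits / len(subset), hits)
                if score > best_score:
                    best, best_score = (a, v), score
            if best is None:
                break  # no attributes left: accept an impure rule
            rule[best[0]] = best[1]
            covered = [(f, l) for f, l in covered if f[best[0]] == best[1]]
        rules.append(rule)
        # Separate: drop every instance the new rule covers.
        remaining = [
            (f, l)
            for f, l in remaining
            if not all(f[a] == v for a, v in rule.items())
        ]
    return rules


# Toy data, made up for illustration.
rows = [
    ({"outlook": "sunny", "windy": "no"}, "play"),
    ({"outlook": "sunny", "windy": "yes"}, "stay"),
    ({"outlook": "rainy", "windy": "no"}, "stay"),
    ({"outlook": "rainy", "windy": "yes"}, "stay"),
    ({"outlook": "overcast", "windy": "no"}, "play"),
    ({"outlook": "overcast", "windy": "yes"}, "play"),
]
rules = learn_rules_for_class(rows, "play")
```

Unlike divide and conquer (TDIDT), which recursively splits the whole training set into a tree, this loop learns one rule at a time and removes the instances that rule covers before learning the next.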