
107,605 research outputs found

Facilitating and Enhancing the Performance of Model Selection for Energy Time Series Forecasting in Cluster Computing Environments

Author: Shahoud Shadi
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 14/01/2023

Applying Machine Learning (ML) manually to a given problem setting is a tedious and time-consuming process that brings many challenges with it, especially in the context of Big Data. In such a context, gaining insightful information, finding patterns, and extracting knowledge from large datasets are quite complex tasks. Additionally, the configurations of the underlying Big Data infrastructure introduce more complexity for configuring and running ML tasks. With the growing interest in ML in the last few years, people without extensive ML expertise in particular have a high demand for frameworks that assist them in applying the right ML algorithm to their problem setting. This is especially true in the field of smart energy system applications, where more and more ML algorithms are used, e.g. for time series forecasting. Generally, two groups of non-expert users who perform energy time series forecasting are distinguished. The first one includes the users who are familiar with statistics and ML but are not able to write the necessary programming code for training and evaluating ML models using the well-known trial-and-error approach. Such an approach is time-consuming and wastes resources on constructing multiple models. The second group is even more inexperienced in programming and not knowledgeable in statistics and ML but wants to apply given ML solutions to their problem settings. The goal of this thesis is to scientifically explore, in the context of more concrete use cases in the energy domain, how such non-expert users can be optimally supported in creating and performing ML tasks in practice on cluster computing environments. To support the first group of non-expert users, an easy-to-use modular extendable microservice-based ML solution for instrumenting and evaluating ML algorithms on top of a Big Data technology stack is conceptualized and evaluated.
Our proposed solution facilitates applying the trial-and-error approach by hiding the low-level complexities from the users and introduces the best conditions to efficiently perform ML tasks in cluster computing environments. To support the second group of non-expert users, the first solution is extended to realize meta learning approaches for automated model selection. We evaluate how meta learning technology can be efficiently applied to the problem space of data analytics for smart energy systems to assist energy system experts who are not data analytics experts in applying the right ML algorithms to their data analytics problems. To enhance the predictive performance of meta learning, an efficient characterization of energy time series datasets is required. To this end, Descriptive Statistics Time based Meta Features (DSTMF), a new kind of meta features, is designed to accurately capture the deep characteristics of energy time series datasets. We find that DSTMF outperforms the other state-of-the-art meta feature sets introduced in the literature to characterize energy time series datasets in terms of the accuracy of meta learning models and the time needed to extract them. Further enhancement in the predictive performance of the meta learning classification model is achieved by training the meta learner on new efficient meta examples. To this end, we propose two new approaches to generate new energy time series datasets to be used as training meta examples by the meta learner depending on the type of time series dataset (i.e. generation or energy consumption time series). We find that extending the original training sets with new meta examples generated by our approaches outperforms the case in which the original is extended by new simulated energy time series datasets.

KITopen


Single-Route and Dual-Route Approaches to Reading Aloud Difficulties Associated with Dysphasia.

Author: Mack Siobhan Kathryn
Publication date: 01/01/1999

The study of reading aloud is currently informed by two main types of theory: modular dual-route and connectionist single-route. One difference between the theories is the type of word classification system which they favour. Dual-route theory employs the regular-irregular dichotomy of classification, whereas single-route theory considers body neighbourhoods to be a more informative approach. This thesis explores the reading aloud performance of a group of people with dysphasia from the two theoretical standpoints by employing a specifically prepared set of real and pseudoword stimuli. As well as being classified according to regularity and body neighbourhood, all the real word stimuli were controlled for frequency. The pseudowords were divided into two groups, common pseudowords and pseudohomophones, and classified according to body neighbourhood. There were two main phases to the study. In the first phase, the stimuli were piloted and the response time performances of a group of people with dysphasia and a group of matched control people were compared. In the second phase, a series of tasks was developed to investigate which means of word classification best explained the visual lexical decision and reading aloud performance of people with dysphasia. The influence of word knowledge was also considered. The data was analysed both quantitatively and qualitatively. The quantitative analysis of the number of errors made indicated that classification of items by body neighbourhood and frequency provided the more comprehensive explanation of the data. Investigation of the types of errors that were made did not find a significant relationship between word type and error type, but again the results indicated that the influence of frequency and body neighbourhood was stronger than that of regularity. The findings are discussed both in terms of their implications for the two theories of reading aloud and their relevance to clinical practice.

Open Research Online (The Open University)

Queen Margaret University eResearch

OpenGrey Repository

Detecting modules in dense weighted networks with the Potts method

Author: Arenas A
Blondel V D Guillaume J L Lambiotte R Lefebvre E
Dorogovtsev S N
Erdös P
Farkas I
Fortunato S Castellano C
Jari Saramäki
Jussi M Kumpula
Kimmo Kaski
Lambiotte R Blondel V D de Kerchove C Huens E Prieur C Smoreda Z Van Dooren P
Lancichinetti A Fortunato S Kertesz J
Newman M E J
Onnela J P
Ronhovde P Nussinov Z
Tapio Heimo
Publication venue: 'IOP Publishing'
Publication date: 01/01/2008

We address the problem of multiresolution module detection in dense weighted networks, where the modular structure is encoded in the weights rather than topology. We discuss a weighted version of the q-state Potts method, which was originally introduced by Reichardt and Bornholdt. This weighted method can be directly applied to dense networks. We discuss the dependence of the resolution of the method on its tuning parameter and network properties, using sparse and dense weighted networks with built-in modules as example cases. Finally, we apply the method to data on stock price correlations, and show that the resulting modules correspond well to known structural properties of this correlation network. Comment: 14 pages, 6 figures. v2: 1 figure added, 1 reference added, minor changes. v3: 3 references added, minor change
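
As a rough illustration of the kind of quantity the abstract refers to, the sketch below evaluates a weighted Potts energy of the Reichardt-Bornholdt form, H = -sum over pairs i<j of (w_ij - gamma * p_ij) when nodes i and j share a module. The function name `potts_energy`, the tuning parameter name `gamma`, and the null model p_ij = s_i s_j / (2W) are assumptions made for this example; the paper's own choices may differ.

```python
import numpy as np

def potts_energy(weights, labels, gamma=1.0):
    """Energy of a partition: lower values indicate a better module assignment."""
    w = np.asarray(weights, dtype=float)
    labels = np.asarray(labels)
    strength = w.sum(axis=1)                 # node strengths s_i
    total = w.sum() / 2.0                    # total weight W, each edge counted once
    # Assumed null model: expected weight between i and j given their strengths.
    null = np.outer(strength, strength) / (2.0 * total)
    same = labels[:, None] == labels[None, :]            # module indicator
    upper = np.triu(np.ones(w.shape, dtype=bool), k=1)   # pairs with i < j only
    return -float(((w - gamma * null) * same * upper).sum())
```

Raising `gamma` penalises large modules more strongly, which is one way to read the abstract's remark about the resolution depending on the tuning parameter.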

arXiv.org e-Print Archive

CiteSeerX

Crossref

Modular invariants and subfactors

Author: Böckenhauer J.
Evans D. E.
Publication date: 01/01/2000

In this lecture we explain the intimate relationship between modular invariants in conformal field theory and braided subfactors in operator algebras. Our analysis is based on an approach to modular invariants using braided sector induction ("$\alpha$-induction") arising from the treatment of conformal field theory in the Doplicher-Haag-Roberts framework. Many properties of modular invariants which have so far been noticed empirically and considered mysterious can be rigorously derived in a very general setting in the subfactor context. For example, the connection between modular invariants and graphs (cf. the A-D-E classification for $SU(2)_k$) finds a natural explanation and interpretation. We try to give an overview on the current state of affairs concerning the expected equivalence between the classifications of braided subfactors and modular invariant two-dimensional conformal field theories. Comment: 25 pages, AMS LaTeX, epic, eepic, doc-class fic-1.cl

arXiv.org e-Print Archive

Online Research @ Cardiff

Modular generalized Springer correspondence III: exceptional groups

Author: Achar Pramod N.
Henderson Anthony
Juteau Daniel
Riche Simon
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 07/10/2016

We complete the construction of the modular generalized Springer correspondence for an arbitrary connected reductive group, with a uniform proof of the disjointness of induction series that avoids the case-by-case arguments for classical groups used in previous papers in the series. We show that the induction series containing the trivial local system on the regular nilpotent orbit is determined by the Sylow subgroups of the Weyl group. Under some assumptions, we give an algorithm for determining the induction series associated to the minimal cuspidal datum with a given central character. We also provide tables and other information on the modular generalized Springer correspondence for quasi-simple groups of exceptional type, including a complete classification of cuspidal pairs in the case of good characteristic, and a full determination of the correspondence in type $G_2$. Comment: 40 pages. Version 2: added section 7.5, modified Table 5.2 to match current conventions of GAP3. Version 3 has minor edits suggested by the referee, including a slight strengthening of Proposition 3.2; final version, to appear in Math. Annale

arXiv.org e-Print Archive

HAL - Normandie Université

CiteSeerX

HAL Clermont Université

Louisiana State University

3d Modularity

Author: Cheng Miranda C. N.
Chun Sungbong
Ferrari Francesca
Gukov Sergei
Harrison Sarah M.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/10/2019

We find and propose an explanation for a large variety of modularity-related symmetries in problems of 3-manifold topology and physics of 3d $\mathcal{N}=2$ theories where such structures a priori are not manifest. These modular structures include: mock modular forms, $SL(2,\mathbb{Z})$ Weil representations, quantum modular forms, non-semisimple modular tensor categories, and chiral algebras of logarithmic CFTs. Comment: 119 pages, 10 figures and 20 table

arXiv.org e-Print Archive

Caltech Authors

Computationally efficient induction of classification rules with the PMCRI and J-PMCRI frameworks

Author: Berrar
Bramer
Cendrowska
Cohen
Corkill
Frederic Stahl
Hennessy
Hunt
Hwang
Jiang
Max Bramer
Michalski
Mutlu
Nolle
Pham
Provost
Quinlan
Smyth
Stahl
Szalay
Witten
Xavier
Publication venue: 'Elsevier BV'
Publication date: 01/01/2012

In order to gain knowledge from large databases, scalable data mining technologies are needed. Data are captured on a large scale and thus databases are increasing at a fast pace. This leads to the utilisation of parallel computing technologies in order to cope with large amounts of data. In the area of classification rule induction, parallelisation of classification rules has focused on the divide and conquer approach, also known as the Top Down Induction of Decision Trees (TDIDT). An alternative approach to classification rule induction is separate and conquer which has only recently been in the focus of parallelisation. This work introduces and evaluates empirically a framework for the parallel induction of classification rules, generated by members of the Prism family of algorithms. All members of the Prism family of algorithms follow the separate and conquer approach.
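
For readers unfamiliar with separate and conquer, the following is a minimal serial sketch in the spirit of the basic Prism algorithm. It is not the PMCRI or J-PMCRI framework and has no parallelism. The function name `prism`, the list-of-dicts data layout, and the tie-breaking order are assumptions made for this example.

```python
def prism(rows, target):
    """Induce rules from `rows` (a list of dicts); returns (conditions, label) pairs."""
    rules = []
    for label in sorted({r[target] for r in rows}):
        remaining = list(rows)  # each class starts again from the full training set
        while any(r[target] == label for r in remaining):
            covered, conditions = remaining, {}
            # Conquer: specialise the rule until it covers only `label`
            # or no unused attributes are left.
            while any(r[target] != label for r in covered):
                candidates = {(a, r[a]) for r in covered for a in r
                              if a != target and a not in conditions}
                if not candidates:
                    break  # clashing instances: accept an impure rule

                def precision(term):
                    attr, value = term
                    subset = [r for r in covered if r[attr] == value]
                    hits = sum(r[target] == label for r in subset)
                    return (hits / len(subset), hits)

                # Pick the attribute-value term with the highest precision
                # for `label`, preferring terms that cover more instances.
                attr, value = max(sorted(candidates), key=precision)
                conditions[attr] = value
                covered = [r for r in covered if r[attr] == value]
            rules.append((conditions, label))
            # Separate: remove the instances covered by the new rule.
            remaining = [r for r in remaining if r not in covered]
    return rules
```

Unlike TDIDT, which splits the whole training set on one attribute at a time, this loop builds one rule, removes what it covers, and repeats on the rest.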

Central Archive at the University of Reading

Crossref

Bournemouth University Research Online