3,677 research outputs found
1. Helgoland Power and Energy Conference - 24. Dresdener Kreis 2023
The proceedings volume "1. Helgoland Power and Energy Conference" contains, alongside a short report on the 24th meeting of the Dresdener Kreis in 2023, scientific contributions on electrical power supply by doctoral candidates from the participating university institutes. The Dresdener Kreis comprises the Chair of Electrical Power Supply (Professur für Elektroenergieversorgung) at Technische Universität Dresden, the Department of Electrical Installations and Networks (Fachgebiet Elektrische Anlagen und Netze) at Universität Duisburg-Essen, the Department of Electrical Power Supply (Fachgebiet Elektrische Energieversorgung) at Leibniz Universität Hannover, and the Chair of Electrical Networks and Renewable Energy (Lehrstuhl Elektrische Netze und Erneuerbare Energie) at Otto-von-Guericke Universität Magdeburg. It meets once a year at one of the participating universities for a professional exchange.
Monitoring carbon emissions using deep learning and statistical process control: a strategy for impact assessment of governments' carbon reduction policies.
Across the globe, governments are developing policies and strategies to reduce carbon emissions to address climate change. Monitoring the impact of governments' carbon reduction policies can significantly enhance our ability to combat climate change and meet emissions reduction targets. One promising area in this regard is the role of artificial intelligence (AI) in carbon reduction policy and strategy monitoring. While researchers have explored applications of AI on data from various sources, including sensors, satellites, and social media, to identify areas for carbon emissions reduction, AI applications in tracking the effect of governments' carbon reduction plans have been limited. This study presents an AI framework based on long short-term memory (LSTM) and statistical process control (SPC) for the monitoring of variations in carbon emissions, using UK annual CO2 emission (per capita) data, covering a period between 1750 and 2021. This paper used LSTM to develop a surrogate model for the UK's carbon emissions characteristics and behaviours. As observed in our experiments, LSTM has better predictive abilities than ARIMA, Exponential Smoothing and feedforward artificial neural networks (ANN) in predicting CO2 emissions on a yearly prediction horizon. Using the deviation of the recorded emission data from the surrogate process, the variations and trends in these behaviours are then analysed using SPC, specifically Shewhart individual/moving range control charts. The result shows several assignable variations between the mid-1990s and 2021, which correlate with some notable UK government commitments to lower carbon emissions within this period. The framework presented in this paper can help identify periods of significant deviations from a country's normal CO2 emissions, which can potentially result from the government's carbon reduction policies or activities that can alter the amount of CO2 emissions
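The control-chart step described in the abstract above can be sketched as follows. This is a minimal illustration of Shewhart individuals/moving-range limits using the standard constants 2.66 and 3.267 for a moving range of span two. The residual values are hypothetical, the function names are invented for this sketch, and the split into a baseline period and a monitored period is an assumption, not the paper's stated procedure.

```python
from statistics import mean


def imr_limits(values):
    """Control limits for a Shewhart individuals / moving-range (I-MR) chart."""
    if len(values) < 2:
        raise ValueError("need at least two observations")
    # Moving range of span two: absolute difference between neighbours.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    centre = mean(values)
    mr_bar = mean(moving_ranges)
    return {
        "centre": centre,
        "lcl": centre - 2.66 * mr_bar,  # 2.66 = 3 / d2, with d2 = 1.128
        "ucl": centre + 2.66 * mr_bar,
        "mr_bar": mr_bar,
        "mr_ucl": 3.267 * mr_bar,       # D4 for span-two moving ranges
    }


def out_of_control(values, limits):
    """Indices of points that fall outside the individuals-chart limits."""
    return [i for i, v in enumerate(values)
            if v < limits["lcl"] or v > limits["ucl"]]


# Hypothetical residuals (recorded emission minus surrogate prediction).
baseline = [0.1, -0.2, 0.0, 0.15, -0.1, 0.05, -0.05, 0.1]
monitored = [0.0, -0.1, -0.9, -1.2]

limits = imr_limits(baseline)
signals = out_of_control(monitored, limits)
print(limits)
print("out-of-control indices:", signals)
```

Points flagged this way are candidates for assignable causes; deciding whether a policy explains them still needs domain knowledge.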
Statistical analysis of grouped text documents
The topic of this thesis is statistical models for the analysis of textual data, emphasizing contexts in which text samples are grouped.
When dealing with text data, the first issue is to process it so that it is computationally and methodologically compatible with the mathematical and statistical methods produced and continually developed by the scientific community. The thesis therefore first reviews existing methods for analytically representing and processing textual datasets, including Vector Space Models, distributed representations of words and documents, and contextualized embeddings. The review standardizes a notation that, even within the same representation approach, appears highly heterogeneous in the literature.
Two domains of application are then explored: social media and cultural tourism. For the former, the thesis proposes a study of self-presentation among diverse groups of individuals on the StockTwits platform, where finance and stock markets are the dominant topics. The proposed methodology integrated various types of data, including textual and categorical data. The study offered insight into how people present themselves online and found recurring patterns within groups of users.
For the latter, the thesis delves into a study conducted as part of the "Data Science for Brescia - Arts and Cultural Places" project, in which a language model was trained to classify online reviews written in Italian into four distinct semantic areas related to cultural attractions in the Italian city of Brescia. The proposed model allows attractions to be identified in text documents even when they are not explicitly mentioned in document metadata, thus opening possibilities for expanding the database of these cultural attractions with new sources, such as social media platforms, forums, and other online spaces.
Lastly, the thesis presents a methodological study examining the group-specificity of words, analyzing various group-specificity estimators proposed in the literature. The study considered grouped text documents with both outcome and group variables. Its contribution is a proposal to model the corpus of documents as a multivariate distribution, enabling the simulation of corpora of text documents with predefined characteristics. The simulation provided valuable insights into the relationship between groups of documents and words. Furthermore, all its results can be freely explored through a web application, whose components are also described in this manuscript.
In conclusion, this thesis has been conceived as a collection of papers. It aims to contribute to the field with both applications and methodological proposals, and each study presented here suggests paths for future research to address the challenges in the analysis of grouped textual data.
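To make the idea of group-specificity concrete, here is a minimal sketch that builds per-group bag-of-words counts (a simple Vector Space Model) and scores a word with a smoothed log-ratio of relative frequencies. The abstract does not name the estimators the thesis compares, so this particular score, the function names, and the toy documents are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def group_term_counts(docs, groups):
    """Aggregate whitespace-tokenised word counts per group label."""
    counts = {}
    for text, group in zip(docs, groups):
        counts.setdefault(group, Counter()).update(text.lower().split())
    return counts


def log_ratio(word, counts, group, smoothing=1.0):
    """Smoothed log-ratio of a word's relative frequency inside vs outside a group.

    Positive values suggest the word is more typical of `group`.
    This is an illustrative estimator, not one taken from the thesis.
    """
    vocab_size = len(set().union(*counts.values()))
    inside = counts[group]
    outside = Counter()
    for other, c in counts.items():
        if other != group:
            outside.update(c)
    p_in = (inside[word] + smoothing) / (sum(inside.values()) + smoothing * vocab_size)
    p_out = (outside[word] + smoothing) / (sum(outside.values()) + smoothing * vocab_size)
    return math.log(p_in / p_out)


# Hypothetical toy corpus with a group variable per document.
docs = ["museum castle museum", "stock market stock", "castle theatre"]
groups = ["tourism", "finance", "tourism"]

counts = group_term_counts(docs, groups)
print(log_ratio("museum", counts, "tourism"))
print(log_ratio("stock", counts, "tourism"))
```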
UMSL Bulletin 2023-2024
The 2023-2024 Bulletin and Course Catalog for the University of Missouri St. Louis.
Analysis and Design of Non-Orthogonal Multiple Access (NOMA) Techniques for Next Generation Wireless Communication Systems
The current surge in wireless connectivity, anticipated to amplify significantly in future wireless technologies, brings a new wave of users. Given the impracticality of endlessly expanding bandwidth, there is a pressing need for communication techniques that efficiently serve this growing user base with limited resources. Multiple Access (MA) techniques, notably Orthogonal Multiple Access (OMA), have long addressed bandwidth constraints. However, as user numbers escalate, OMA’s orthogonality becomes limiting for emerging wireless technologies. Non-Orthogonal Multiple Access (NOMA), which employs superposition coding, serves more users than OMA within the same bandwidth by allocating different power levels to users, whose signals can then be detected using the power gap between them, thus offering superior spectral efficiency and massive connectivity. This thesis examines the integration of NOMA techniques with cooperative relaying, EXtrinsic Information Transfer (EXIT) chart analysis, and deep learning for enhancing 6G and beyond communication systems. The adopted methodology aims to optimize system performance, from bit-error rate (BER) versus signal-to-noise ratio (SNR) to overall system efficiency and data rates. In the cooperative relaying context, NOMA notably improved diversity gains, demonstrating the superiority of combining NOMA with cooperative relaying over NOMA alone. With EXIT chart analysis, NOMA achieved low BER at mid-range SNR and optimal user fairness in the power allocation stage. In the deep learning scenario, a trained neural network enhanced signal detection for NOMA, yielding a simpler detector that addresses NOMA’s complex receiver problem.
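The power-domain idea in the abstract above can be illustrated with a noise-free two-user sketch. It assumes BPSK symbols, a fixed 0.8/0.2 power split, and successive interference cancellation (SIC) at the stronger user. SIC is the detection method commonly paired with superposition coding, but the abstract does not name it, and all function names and parameter values here are hypothetical.

```python
import math


def superpose(s_far, s_near, a_far=0.8, a_near=0.2):
    """Superposition coding: scale each user's symbol by the square root of its power share."""
    return [math.sqrt(a_far) * f + math.sqrt(a_near) * n
            for f, n in zip(s_far, s_near)]


def bpsk_decide(y):
    """Hard decision for a BPSK symbol in {-1, +1}."""
    return 1 if y >= 0 else -1


def sic_receive(rx, a_far=0.8):
    """Near-user receiver: decode the high-power symbol, cancel it, decode the remainder."""
    far_hat = [bpsk_decide(y) for y in rx]
    residual = [y - math.sqrt(a_far) * f for y, f in zip(rx, far_hat)]
    near_hat = [bpsk_decide(r) for r in residual]
    return far_hat, near_hat


# Hypothetical symbol streams; no channel noise or fading is modelled.
s_far = [1, -1, 1, -1]
s_near = [1, 1, -1, -1]

tx = superpose(s_far, s_near)
far_hat, near_hat = sic_receive(tx)
print(far_hat, near_hat)
```

The far user would simply apply `bpsk_decide` to the received signal and treat the low-power component as interference, which works here because its own share of the power dominates.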
Sound Event Detection by Exploring Audio Sequence Modelling
Everyday sounds in real-world environments are a powerful source of information by which humans can interact with their environments. Humans can infer what is happening around them by listening to everyday sounds. At the same time, it is a challenging task for a computer algorithm in a smart device to automatically recognise, understand, and interpret everyday sounds. Sound event detection (SED) is the process of transcribing an audio recording into sound event tags with onset and offset time values. This involves classification and segmentation of sound events in the given audio recording. SED has numerous applications in everyday life, including security and surveillance, automation, healthcare monitoring, multimedia information retrieval, and assisted living technologies. SED is to everyday sounds what automatic speech recognition (ASR) is to speech and automatic music transcription (AMT) is to music. The fundamental questions in designing a sound recognition system are which portion of a sound event the system should analyse, and what proportion of a sound event the system should process in order to claim a confident detection of that particular sound event. While the classification of sound events has improved considerably in recent years, the temporal segmentation of sound events is considered not to have improved to the same extent. The aim of this thesis is to propose and develop methods to improve the segmentation and classification of everyday sound events in SED models. In particular, this thesis explores the segmentation of sound events by investigating audio sequence encoding-based and audio sequence modelling-based methods, in an effort to improve the overall sound event detection performance. In the first phase of this thesis, efforts are directed towards improving sound event detection by explicitly conditioning the audio sequence representations of an SED model using sound activity detection (SAD) and onset detection.
To achieve this, we propose multi-task learning-based SED models in which SAD and onset detection are used as auxiliary tasks for the SED task. The next part of this thesis explores self-attention-based audio sequence modelling, which aggregates audio representations based on temporal relations within and between sound events, scored on the basis of the similarity of sound event portions in audio event sequences. We propose SED models that include memory-controlled, adaptive, dynamic, and source separation-induced self-attention variants, with the aim of improving overall sound recognition.
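The segmentation side of SED, turning frame-level outputs into tagged events with onset and offset times, can be sketched as below. This is a generic post-processing step under simple assumptions (a fixed threshold and a fixed hop between frames), not the thesis's models. The probabilities, label, and hop length are hypothetical.

```python
def frames_to_events(probs, label, threshold=0.5, hop_seconds=0.02):
    """Convert per-frame activity probabilities into (label, onset, offset) tuples."""
    events = []
    start = None  # index of the first frame of the currently open event
    for i, p in enumerate(probs):
        active = p >= threshold
        if active and start is None:
            start = i
        elif not active and start is not None:
            events.append((label, start * hop_seconds, i * hop_seconds))
            start = None
    if start is not None:
        # The event is still active when the recording ends.
        events.append((label, start * hop_seconds, len(probs) * hop_seconds))
    return events


# Hypothetical frame probabilities for one class, one frame every 0.5 s.
probs = [0.1, 0.7, 0.9, 0.2, 0.6]
events = frames_to_events(probs, "dog_bark", hop_seconds=0.5)
print(events)
```

In practice the thresholded sequence is often smoothed (for example with a median filter) before runs are extracted; that step is omitted here to keep the sketch short.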
GPT models in construction industry: Opportunities, limitations, and a use case validation
Large Language Models (LLMs) trained on large data sets came into prominence in 2018 after Google introduced BERT. Subsequently, different LLMs such as GPT models from OpenAI have been released. These models perform well on diverse tasks and have been gaining widespread applications in fields such as business and education. However, little is known about the opportunities and challenges of using LLMs in the construction industry. Thus, this study aims to assess GPT models in the construction industry. A critical review, expert discussion and case study validation are employed to achieve the study's objectives. The findings revealed opportunities for GPT models throughout the project lifecycle. The challenges of leveraging GPT models are highlighted and a use case prototype is developed for materials selection and optimization. The findings should benefit researchers, practitioners and stakeholders, as the study presents research vistas for LLMs in the construction industry.
Predicting Paid Certification in Massive Open Online Courses
Massive open online courses (MOOCs) have been proliferating because of the free or low-cost offering of content for learners, attracting the attention of many stakeholders across the entire educational landscape. Since 2012, dubbed “the Year of the MOOCs”, several platforms have gathered millions of learners in just a decade. Nevertheless, the certification rate of both free and paid courses has been low, and only about 4.5–13% and 1–3%, respectively, of the total number of enrolled learners obtain a certificate at the end of their courses. Still, most research concentrates on completion, ignoring the certification problem, and especially its financial aspects. Thus, the research described in the present thesis aimed to investigate paid certification in MOOCs, for the first time, in a comprehensive way, and as early as the first week of the course, by exploring its various levels. First, the latent correlation between learner activities and their paid certification decisions was examined by (1) statistically comparing the activities of non-paying learners with those of course purchasers and (2) predicting paid certification using different machine learning (ML) techniques. Our temporal (weekly) analysis showed statistical significance at various levels when comparing the activities of non-paying learners with those of the certificate purchasers across the five courses analysed. Furthermore, we used the learners’ activities (number of step accesses, attempts, correct and wrong answers, and time spent on learning steps) to build our paid certification predictor, which achieved promising balanced accuracies (BAs), ranging from 0.77 to 0.95. Having employed simple predictions based on a few clickstream variables, we then analysed in more depth what other information can be extracted from MOOC interaction (namely discussion forums) for paid certification prediction.
However, to better explore the learners’ discussion forums, we built, as an original contribution, MOOCSent, a cross-platform review-based sentiment classifier, using over 1.2 million MOOC sentiment-labelled reviews. MOOCSent addresses various limitations of current sentiment classifiers, including (1) using a single source of data (previous literature on sentiment classification in MOOCs was based on single platforms only, and hence less generalisable, with a relatively low number of instances compared to our dataset); (2) fewer model outputs, where most current models are 2-polar classifiers (positive or negative only); (3) disregarding important sentiment indicators, such as emojis and emoticons, during text embedding; and (4) reporting average performance metrics only, preventing the evaluation of model performance at the level of class (sentiment). Finally, with the help of MOOCSent, we used the learners’ discussion forums to predict paid certification after annotating learners’ comments and replies with sentiment. This multi-input model contains raw data (learner textual inputs), sentiment classification generated by MOOCSent, computed features (number of likes received for each textual input), and several features extracted from the texts (character counts, word counts, and part-of-speech (POS) tags for each textual instance). This experiment adopted various deep predictive approaches, specifically ones that allow a multi-input architecture, to investigate early (i.e., weekly) whether data obtained from MOOC learners’ interaction in discussion forums can predict learners’ purchase decisions (certification). Considering the staggeringly low rate of paid certification in MOOCs, the present thesis contributes to the knowledge and field of MOOC learner analytics by predicting paid certification, for the first time, at such a comprehensive (with data from over 200 thousand learners from 5 courses in different disciplines), actionable (analysing learners’ decisions from the first week of the course) and longitudinal (with 23 runs from 2013 to 2017) scale.
The present thesis contributes by (1) investigating various conventional and deep ML approaches for predicting paid certification in MOOCs using learner clickstreams (Chapter 5) and course discussion forums (Chapter 7); (2) building the largest MOOC sentiment classifier (MOOCSent), which is based on learners’ reviews of courses from the leading MOOC platforms, namely Coursera, FutureLearn and Udemy, and which handles emojis and emoticons using dedicated lexicons that contain over three thousand corresponding explanatory words/phrases; and (3) proposing and developing, for the first time, a multi-input model for predicting certification based on data from discussion forums, which synchronously processes the textual (comments and replies) and numerical (number of likes posted and received, sentiments) data from the forums, adapting a suitable classifier for each type of data, as explained in detail in Chapter 7.
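The abstract above reports balanced accuracy (BA) rather than plain accuracy, which matters when only a few percent of learners pay for a certificate. A minimal sketch of the metric, defined as the mean of per-class recall, is given below. The labels are hypothetical and the helper name is an assumption for illustration, not code from the thesis.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall; with two classes, (sensitivity + specificity) / 2."""
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == cls]
        correct = sum(1 for i in idx if y_pred[i] == cls)
        recalls.append(correct / len(idx))
    return sum(recalls) / len(recalls)


# Hypothetical cohort: 2 purchasers (1) among 20 learners, and a model
# that predicts "no purchase" (0) for everyone.
y_true = [1] * 2 + [0] * 18
y_pred = [0] * 20

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
ba = balanced_accuracy(y_true, y_pred)
print(plain_accuracy, ba)
```

The trivial model scores 0.9 on plain accuracy but only 0.5 on balanced accuracy, which is why the latter is the more informative figure for such imbalanced data.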
An ensemble model for predictive energy performance: Closing the gap between actual and predicted energy use in residential buildings
The design stage of a building plays a pivotal role in influencing its life cycle and overall performance. Accurate predictions of a building's performance are crucial for informed decision-making, particularly in terms of energy performance, given the escalating global awareness of climate change and the imperative to enhance energy efficiency in buildings. However, a well-documented energy performance gap persists between actual and predicted energy consumption, primarily attributed to the unpredictable nature of occupant behavior. Existing methodologies for predicting and simulating occupant behavior in buildings frequently neglect or exclusively concentrate on particular behaviors, resulting in uncertainties in energy performance predictions. Machine learning approaches have exhibited increased accuracy in predicting occupant energy behavior, yet the majority of extant studies focus on specific behavior types rather than investigating the interactions among all contributing factors. This dissertation delves into the building energy performance gap, with a particular emphasis on the influence of occupants on energy performance. A comprehensive literature review scrutinizes machine learning models employed for predicting occupants' behavior in buildings and assesses their performance. The review uncovers knowledge gaps, as most studies are case-specific and lack a consolidated database to examine diverse behaviors across various building types. An ensemble model integrating occupant behavior parameters is devised to enhance the accuracy of energy performance predictions in residential buildings. Multiple algorithms are examined, with the selection of algorithms contingent upon evaluation metrics.
The ensemble model is validated through a case study that compares actual energy consumption with the predictions of the ensemble model and an EnergyPlus simulation that takes occupant behavior factors into account. The findings demonstrate that the ensemble model provides considerably more accurate predictions of actual energy consumption compared to the EnergyPlus simulation. This dissertation also addresses the research limitations, including the reusability of the model and the requirement for additional datasets to bolster confidence in the model's applicability across diverse building types and occupant behavior patterns. In summary, this dissertation presents an ensemble model that endeavors to bridge the gap between actual and predicted energy usage in residential buildings by incorporating occupant behavior parameters, leading to more precise energy performance predictions and promoting superior energy management strategies.
- …