7,392 research outputs found

    Exploiting Structural Properties in the Analysis of High-dimensional Dynamical Systems

    Get PDF
    The physical and cyber domains with which we interact are filled with high-dimensional dynamical systems. In machine learning, for instance, the evolution of overparametrized neural networks can be seen as a dynamical system. In networked systems, numerous agents or nodes dynamically interact with each other. A deep understanding of these systems can enable us to predict their behavior, identify potential pitfalls, and devise effective solutions for optimal outcomes. In this dissertation, we will discuss two classes of high-dimensional dynamical systems with specific structural properties that aid in understanding their dynamic behavior. In the first scenario, we consider the training dynamics of multi-layer neural networks. The high dimensionality comes from overparametrization: a typical network has a large depth and hidden layer width. We are interested in the following question regarding convergence: Do network weights converge to an equilibrium point corresponding to a global minimum of our training loss, and how fast is the convergence rate? The key to those questions is the symmetry of the weights, a critical property induced by the multi-layer architecture. Such symmetry leads to a set of time-invariant quantities, called weight imbalance, that restrict the training trajectory to a low-dimensional manifold defined by the weight initialization. A tailored convergence analysis is developed over this low-dimensional manifold, showing improved rate bounds for several multi-layer network models studied in the literature, leading to novel characterizations of the effect of weight imbalance on the convergence rate. In the second scenario, we consider large-scale networked systems with multiple weakly-connected groups. Such a multi-cluster structure leads to a time-scale separation between the fast intra-group interaction due to high intra-group connectivity, and the slow inter-group oscillation, due to the weak inter-group connection. We develop a novel frequency-domain network coherence analysis that captures both the coherent behavior within each group, and the dynamical interaction between groups, leading to a structure-preserving model-reduction methodology for large-scale dynamic networks with multiple clusters under general node dynamics assumptions

    Tensor representation-based transferability analytics and selective transfer learning of prognostic knowledge for remaining useful life prediction across machines

    Get PDF
    In recent years, deep transfer learning techniques have been successfully applied to solve RUL prediction across different working conditions. However, for RUL prediction across different machines in which the data distribution and fault evolution characteristics vary largely, the extraction and transition of prognostic knowledge become more challenging. Even if fault mode information can assist in the knowledge transfer, model bias will inevitably exist on the target machine with mixed or unknown faults. To address this issue from a transferability perspective, this paper proposes a novel selective transfer learning approach for RUL prediction across machines. First, the paper utilizes the tensor representation to construct the meta-degradation trend of each fault mode and evaluates the transferability of source domain data from fault mode and degradation characteristics through a new cross-machine transfer degree indicator (M-TDI). Second, a Long Short-Term Memory (LSTM)-based selective transfer strategy is proposed using the M-TDIs. The paper designs a training algorithm with an alternating optimization scheme to seek the optimal tensor decomposition and knowledge transfer effect. Theoretical analysis proves that the proposed approach significantly reduces the upper bound of prediction error. Furthermore, experimental results on three benchmark datasets prove the effectiveness of the proposed approach

    Deep ensemble model-based moving object detection and classification using SAR images

    Get PDF
    In recent decades, image processing and computer vision models have played a vital role in moving object detection on the synthetic aperture radar (SAR) images. Capturing of moving objects in the SAR images is a difficult task. In this study, a new automated model for detecting moving objects is proposed using SAR images. The proposed model has four main steps, namely, preprocessing, segmentation, feature extraction, and classification. Initially, the input SAR image is pre-processed using a histogram equalization technique. Then, the weighted Otsu-based segmentation algorithm is applied for segmenting the object regions from the pre-processed images. When using the weighted Otsu, the segmented grayscale images are not only clear but also retain the detailed features of grayscale images. Next, feature extraction is carried out by gray-level co-occurrence matrix (GLCM), median binary patterns (MBPs), and additive harmonic mean estimated local Gabor binary pattern (AHME-LGBP). The final step is classification using deep ensemble models, where the objects are classified by employing the ensemble deep learning technique, combining the models like the bidirectional long short-term memory (Bi-LSTM), recurrent neural network (RNN), and improved deep belief network (IDBN), which is trained with the features extracted previously. The combined models increase the accuracy of the results significantly. Furthermore, ensemble modeling reduces the variance and modeling method bias, which decreases the chances of overfitting. Compared to a single contributing model, ensemble models perform better and make better predictions. Additionally, an ensemble lessens the spread or dispersion of the model performance and prediction accuracy. Finally, the performance of the proposed model is related to the conventional models with respect to different measures. In the mean-case scenario, the proposed ensemble model has a minimum error value of 0.032, which is better related to other models. In both median- and best-case scenario studies, the ensemble model has a lower error value of 0.029 and 0.015

    Statistical analysis of grouped text documents

    Get PDF
    L'argomento di questa tesi sono i modelli statistici per l'analisi dei dati testuali, con particolare attenzione ai contesti in cui i campioni di testo sono raggruppati. Quando si ha a che fare con dati testuali, il primo problema è quello di elaborarli, per renderli compatibili dal punto di vista computazionale e metodologico con i metodi matematici e statistici prodotti e continuamente sviluppati dalla comunità scientifica. Per questo motivo, la tesi passa in rassegna i metodi esistenti per la rappresentazione analitica e l'elaborazione di campioni di dati testuali, compresi i "Vector Space Models", le "rappresentazioni distribuite" di parole e documenti e i "contextualized embeddings". Questa rassegna comporta la standardizzazione di una notazione che, anche all'interno dello stesso approccio di rappresentazione, appare molto eterogenea in letteratura. Vengono poi esplorati due domini di applicazione: i social media e il turismo culturale. Per quanto riguarda il primo, viene proposto uno studio sull'autodescrizione di gruppi diversi di individui sulla piattaforma StockTwits, dove i mercati finanziari sono gli argomenti dominanti. La metodologia proposta ha integrato diversi tipi di dati, sia testuali che variabili categoriche. Questo studio ha agevolato la comprensione sul modo in cui le persone si presentano online e ha trovato stutture di comportamento ricorrenti all'interno di gruppi di utenti. Per quanto riguarda il turismo culturale, la tesi approfondisce uno studio condotto nell'ambito del progetto "Data Science for Brescia - Arts and Cultural Places", in cui è stato addestrato un modello linguistico per classificare le recensioni online scritte in italiano in quattro aree semantiche distinte relative alle attrazioni culturali della città di Brescia. Il modello proposto permette di identificare le attrazioni nei documenti di testo, anche quando non sono esplicitamente menzionate nei metadati del documento, aprendo così la possibilità di espandere il database relativo a queste attrazioni culturali con nuove fonti, come piattaforme di social media, forum e altri spazi online. Infine, la tesi presenta uno studio metodologico che esamina la specificità di gruppo delle parole, analizzando diversi stimatori di specificità di gruppo proposti in letteratura. Lo studio ha preso in considerazione documenti testuali raggruppati con variabile di "outcome" e variabile di gruppo. Il suo contributo consiste nella proposta di modellare il corpus di documenti come una distribuzione multivariata, consentendo la simulazione di corpora di documenti di testo con caratteristiche predefinite. La simulazione ha fornito preziose indicazioni sulla relazione tra gruppi di documenti e parole. Inoltre, tutti i risultati possono essere liberamente esplorati attraverso un'applicazione web, i cui componenti sono altresì descritti in questo manoscritto. In conclusione, questa tesi è stata concepita come una raccolta di studi, ognuno dei quali suggerisce percorsi di ricerca futuri per affrontare le sfide dell'analisi dei dati testuali raggruppati.The topic of this thesis is statistical models for the analysis of textual data, emphasizing contexts in which text samples are grouped. When dealing with text data, the first issue is to process it, making it computationally and methodologically compatible with the existing mathematical and statistical methods produced and continually developed by the scientific community. Therefore, the thesis firstly reviews existing methods for analytically representing and processing textual datasets, including Vector Space Models, distributed representations of words and documents, and contextualized embeddings. It realizes this review by standardizing a notation that, even within the same representation approach, appears highly heterogeneous in the literature. Then, two domains of application are explored: social media and cultural tourism. About the former, a study is proposed about self-presentation among diverse groups of individuals on the StockTwits platform, where finance and stock markets are the dominant topics. The methodology proposed integrated various types of data, including textual and categorical data. This study revealed insights into how people present themselves online and found recurring patterns within groups of users. About the latter, the thesis delves into a study conducted as part of the "Data Science for Brescia - Arts and Cultural Places" Project, where a language model was trained to classify Italian-written online reviews into four distinct semantic areas related to cultural attractions in the Italian city of Brescia. The model proposed allows for the identification of attractions in text documents, even when not explicitly mentioned in document metadata, thus opening possibilities for expanding the database related to these cultural attractions with new sources, such as social media platforms, forums, and other online spaces. Lastly, the thesis presents a methodological study examining the group-specificity of words, analyzing various group-specificity estimators proposed in the literature. The study considered grouped text documents with both outcome and group variables. Its contribution consists of the proposal of modeling the corpus of documents as a multivariate distribution, enabling the simulation of corpora of text documents with predefined characteristics. The simulation provided valuable insights into the relationship between groups of documents and words. Furthermore, all its results can be freely explored through a web application, whose components are also described in this manuscript. In conclusion, this thesis has been conceived as a collection of papers. It aimed to contribute to the field with both applications and methodological proposals, and each study presented here suggests paths for future research to address the challenges in the analysis of grouped textual data

    Explainable Early Prediction of Gestational Diabetes Biomarkers by Combining Medical Background and Wearable Devices: A Pilot Study with a Cohort Group in South Africa

    Get PDF
    This study aims to explore the potential of Internet of Things (IoT) devices and explainable Artificial Intelligence (AI) techniques in predicting biomarker values associated with GDM when measured 13 - 16 weeks prior to diagnosis. We developed a system that forecasts biomarkers such as LDL, HDL, triglycerides, cholesterol, HbA1c, and results from the Oral Glucose Tolerance Test (OGTT) including fasting glucose, 1-hour, and 2-hour post- load glucose values. These biomarker values are predicted based on sensory measurements collected around week 12 of pregnancy, including continuous glucose levels, short physical movement recordings, and medical background information. To the best of our knowledge, this is the first study to forecast GDM-associated biomarker values 13 to 16 weeks prior to the GDM screening test, using continuous glucose monitoring devices, a wristband for activity detection, and medical background data. We applied machine learning models, specifically Decision Tree and Random Forest Regressors, along with Coupled-Matrix Tensor Factorisation (CMTF) and Elastic Net techniques, examining all possible combinations of these methods across different data modalities. The results demonstrated good performance for most biomarkers. On average, the models achieved Mean Squared Error (MSE) between 0.29 and 0.42 and Mean Absolute Error (MAE) between 0.23 and 0.45 for biomarkers like HDL, LDL, cholesterol, and HbA1c. For the OGTT glucose values, the average MSE ranged from 0.95 to 2.44, and the average MAE ranged from 0.72 to 0.91. Additionally, the utilisation of CMTF with Alternating Least Squares technique yielded slightly better results (0.16 MSE and 0.07 MAE on average) compared to the well-known Elastic Net feature se- lection technique. While our study was conducted with a limited cohort in South Africa, our findings offer promising indications regarding the potential for predicting biomarker values in pregnant women through the integration of wearable devices and medical background data in the analysis. Nevertheless, further validation on a larger, more diverse cohort is imperative to substantiate these encouraging results

    Integrated Spatio-Temporal Deep Clustering (ISTDC) for cognitive workload assessment

    Get PDF
    Traditional high-dimensional electroencephalography (EEG) features (spectral or temporal) may not always attain satisfactory results in cognitive workload estimation. In contrast, deep representation learning (DRL) transforms high-dimensional data into cluster-friendly low-dimensional feature space. Therefore, this paper proposes an Integrated Spatio-Temporal Deep Clustering (ISTDC) model that uses DRL followed by a clustering method to achieve better clustering performance. The proposed model is illustrated using four Algorithms and Variational Bayesian Gaussian Mixture Model (VBGMM) clustering method. Temporal and spatial Variational Auto Encoder (VAE) models (mentioned in Algorithm 2 and Algorithm 3) learn temporal and spatial latent features from sequence-wise EEG signals and scalp topographical maps using the Long short-term memory and Convolutional Neural Network models. The concatenated spatio-temporal latent feature (mentioned in Algorithm 4) is passed to the VBGMM clustering method to efficiently estimate workload levels of -back task. For the 0-back vs. 2-back task, the proposed model achieves the maximum mean clustering accuracy of 98.0%, and it improves by 11.0% over the state-of-the-art method. The results also indicate that the proposed multimodal approach outperforms temporal and spatial latent feature-based unimodal models in workload assessment

    An ensemble model for predictive energy performance:Closing the gap between actual and predicted energy use in residential buildings

    Get PDF
    The design stage of a building plays a pivotal role in influencing its life cycle and overall performance. Accurate predictions of a building's performance are crucial for informed decision-making, particularly in terms of energy performance, given the escalating global awareness of climate change and the imperative to enhance energy efficiency in buildings. However, a well-documented energy performance gap persists between actual and predicted energy consumption, primarily attributed to the unpredictable nature of occupant behavior.Existing methodologies for predicting and simulating occupant behavior in buildings frequently neglect or exclusively concentrate on particular behaviors, resulting in uncertainties in energy performance predictions. Machine learning approaches have exhibited increased accuracy in predicting occupant energy behavior, yet the majority of extant studies focus on specific behavior types rather than investigating the interactions among all contributing factors. This dissertation delves into the building energy performance gap, with a particular emphasis on the influence of occupants on energy performance. A comprehensive literature review scrutinizes machine learning models employed for predicting occupants' behavior in buildings and assesses their performance. The review uncovers knowledge gaps, as most studies are case-specific and lack a consolidated database to examine diverse behaviors across various building types.An ensemble model integrating occupant behavior parameters is devised to enhance the accuracy of energy performance predictions in residential buildings. Multiple algorithms are examined, with the selection of algorithms contingent upon evaluation metrics. The ensemble model is validated through a case study that compares actual energy consumption with the predictions of the ensemble model and an EnergyPlus simulation that takes occupant behavior factors into account.The findings demonstrate that the ensemble model provides considerably more accurate predictions of actual energy consumption compared to the EnergyPlus simulation. This dissertation also addresses the research limitations, including the reusability of the model and the requirement for additional datasets to bolster confidence in the model's applicability across diverse building types and occupant behavior patterns.In summary, this dissertation presents an ensemble model that endeavors to bridge the gap between actual and predicted energy usage in residential buildings by incorporating occupant behavior parameters, leading to more precise energy performance predictions and promoting superior energy management strategies

    LIPIcs, Volume 251, ITCS 2023, Complete Volume

    Get PDF
    LIPIcs, Volume 251, ITCS 2023, Complete Volum

    Smart Gas Sensors: Materials, Technologies, Practical ‎Applications, and Use of Machine Learning – A Review

    Get PDF
    The electronic nose, popularly known as the E-nose, that combines gas sensor arrays (GSAs) with machine learning has gained a strong foothold in gas sensing technology. The E-nose designed to mimic the human olfactory system, is used for the detection and identification of various volatile compounds. The GSAs develop a unique signal fingerprint for each volatile compound to enable pattern recognition using machine learning algorithms. The inexpensive, portable and non-invasive characteristics of the E-nose system have rendered it indispensable within the gas-sensing arena. As a result, E-noses have been widely employed in several applications in the areas of the food industry, health management, disease diagnosis, water and air quality control, and toxic gas leakage detection. This paper reviews the various sensor fabrication technologies of GSAs and highlights the main operational framework of the E-nose system. The paper details vital signal pre-processing techniques of feature extraction, feature selection, in addition to machine learning algorithms such as SVM, kNN, ANN, and Random Forests for determining the type of gas and estimating its concentration in a competitive environment. The paper further explores the potential applications of E-noses for diagnosing diseases, monitoring air quality, assessing the quality of food samples and estimating concentrations of volatile organic compounds (VOCs) in air and in food samples. The review concludes with some challenges faced by E-nose, alternative ways to tackle them and proposes some recommendations as potential future work for further development and design enhancement of E-noses

    Data-assisted modeling of complex chemical and biological systems

    Get PDF
    Complex systems are abundant in chemistry and biology; they can be multiscale, possibly high-dimensional or stochastic, with nonlinear dynamics and interacting components. It is often nontrivial (and sometimes impossible), to determine and study the macroscopic quantities of interest and the equations they obey. One can only (judiciously or randomly) probe the system, gather observations and study trends. In this thesis, Machine Learning is used as a complement to traditional modeling and numerical methods to enable data-assisted (or data-driven) dynamical systems. As case studies, three complex systems are sourced from diverse fields: The first one is a high-dimensional computational neuroscience model of the Suprachiasmatic Nucleus of the human brain, where bifurcation analysis is performed by simply probing the system. Then, manifold learning is employed to discover a latent space of neuronal heterogeneity. Second, Machine Learning surrogate models are used to optimize dynamically operated catalytic reactors. An algorithmic pipeline is presented through which it is possible to program catalysts with active learning. Third, Machine Learning is employed to extract laws of Partial Differential Equations describing bacterial Chemotaxis. It is demonstrated how Machine Learning manages to capture the rules of bacterial motility in the macroscopic level, starting from diverse data sources (including real-world experimental data). More importantly, a framework is constructed though which already existing, partial knowledge of the system can be exploited. These applications showcase how Machine Learning can be used synergistically with traditional simulations in different scenarios: (i) Equations are available but the overall system is so high-dimensional that efficiency and explainability suffer, (ii) Equations are available but lead to highly nonlinear black-box responses, (iii) Only data are available (of varying source and quality) and equations need to be discovered. For such data-assisted dynamical systems, we can perform fundamental tasks, such as integration, steady-state location, continuation and optimization. This work aims to unify traditional scientific computing and Machine Learning, in an efficient, data-economical, generalizable way, where both the physical system and the algorithm matter
    • …
    corecore