576 research outputs found

    University of Windsor Graduate Calendar 2023 Spring


    University of Windsor Graduate Calendar 2023 Winter


    Novel Mixture Allocation Models for Topic Learning

    Unsupervised learning has been an interesting area of research in recent years. Novel algorithms are being built on the basis of unsupervised learning methodologies to solve many real-world problems. Topic modelling is one such methodology, identifying patterns in data as topics. The introduction of latent Dirichlet allocation (LDA) has bolstered research on topic modelling approaches with modifications specific to the application. However, the basic assumption in LDA of a Dirichlet prior for topic proportions may not hold in certain real-world scenarios. Hence, in this thesis we explore the use of the generalized Dirichlet (GD) and Beta-Liouville (BL) distributions as alternative priors for topic proportions. In addition, we assume a mixture of distributions over topic proportions, which provides a better fit to the data. To accommodate real-time streaming data, we also provide an online learning solution for the models. A supervised version of the learning framework is also provided and is shown to be advantageous when labelled data are available. The topics derived in this way may still be inaccurate, so we integrate an interactive approach that uses inputs from the user to improve the quality of the identified topics. We have also adapted our models to applications such as parallel topic extraction from multilingual texts and content-based recommendation systems, demonstrating their adaptability. For multilingual topic extraction, we use global topic proportions sampled from a Dirichlet process (DP), and for recommendation systems we exploit word co-occurrences. For inference, we use a variational approach, which keeps the computation of the solutions tractable. The applications used to validate our models demonstrate their efficiency.
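    The contrast with the standard Dirichlet prior can be made concrete with a small sketch. The Python snippet below draws topic-proportion vectors from a generalized Dirichlet prior using its standard stick-breaking construction (independent Beta variates). It illustrates the prior only, not the thesis's inference code; the parameter values and function names are invented for the example.

```python
# Illustrative only: sampling topic proportions from a generalized Dirichlet
# (GD) prior via its stick-breaking construction, to contrast with the single
# symmetric Dirichlet prior used in standard LDA. Values are made up.
import numpy as np

rng = np.random.default_rng(0)

def sample_generalized_dirichlet(alpha, beta, size=1):
    """Draw proportions over K = len(alpha) + 1 topics.

    theta_1 = x_1, theta_i = x_i * prod_{j<i} (1 - x_j), x_i ~ Beta(alpha_i, beta_i);
    the last component takes the remaining mass, so each draw sums to 1.
    """
    alpha, beta = np.asarray(alpha), np.asarray(beta)
    x = rng.beta(alpha, beta, size=(size, len(alpha)))   # independent Beta "sticks"
    remaining = np.cumprod(1.0 - x, axis=1)              # unused mass after each stick
    theta = np.empty((size, len(alpha) + 1))
    theta[:, 0] = x[:, 0]
    theta[:, 1:-1] = x[:, 1:] * remaining[:, :-1]
    theta[:, -1] = remaining[:, -1]                      # leftover mass
    return theta

# Example: a 4-topic prior. Unlike the Dirichlet, each component has its own
# (alpha_i, beta_i) pair, giving a more flexible covariance structure.
theta = sample_generalized_dirichlet(alpha=[2.0, 1.0, 0.5], beta=[5.0, 2.0, 1.0], size=3)
print(theta, theta.sum(axis=1))
```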

    Biclustering random matrix partitions with an application to classification of forensic body fluids

    Classification of unlabeled data is usually achieved by supervised learning from labeled samples. Although many sophisticated supervised machine learning methods can predict the missing labels with a high level of accuracy, they often lack the required transparency in situations where it is important to provide interpretable results and meaningful measures of confidence. Body fluid classification of forensic casework data is a case in point. We develop a new Biclustering Dirichlet Process (BDP), with a three-level hierarchy of clustering, and a model-based approach to classification which adapts to block structure in the data matrix. As the class labels of some observations are missing, the number of rows in the data matrix for each class is unknown. The BDP handles this and extends existing biclustering methods by simultaneously biclustering multiple matrices, each having a randomly variable number of rows. We demonstrate our method by applying it to the motivating problem, the classification of body fluids based on mRNA profiles taken from crime scenes. The analyses of casework-like data show that our method is interpretable and produces well-calibrated posterior probabilities. Our model can be applied more generally to other types of data with a structure similar to the forensic data.
    Comment: 45 pages, 10 figures
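    As background for the biclustering construction, the snippet below sketches the Chinese-restaurant-process view of a Dirichlet process prior, the basic clustering mechanism that the BDP extends to a multi-level row/column hierarchy. It is a minimal illustration of DP clustering under invented settings, not the authors' BDP sampler.

```python
# Minimal Chinese-restaurant-process (CRP) draw from a Dirichlet process prior.
# This illustrates the clustering prior the BDP builds on, nothing more.
import numpy as np

def crp_partition(n_items, concentration, rng=None):
    """Assign n_items to clusters: item i joins an existing cluster with
    probability proportional to its size, or opens a new cluster with
    probability proportional to `concentration`."""
    rng = rng or np.random.default_rng()
    counts = []           # current cluster sizes
    labels = []
    for _ in range(n_items):
        probs = np.array(counts + [concentration], dtype=float)
        probs /= probs.sum()
        k = rng.choice(len(probs), p=probs)
        if k == len(counts):          # open a new cluster
            counts.append(1)
        else:                         # join cluster k
            counts[k] += 1
        labels.append(int(k))
    return labels

print(crp_partition(20, concentration=1.0, rng=np.random.default_rng(1)))
```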

    The Democratization of News - Analysis and Behavior Modeling of Users in the Context of Online News Consumption

    The invention of the Internet paved the way for the democratization of information. The fact that news became more accessible to the general public carried important political promises, such as reaching previously uninformed, and therefore often politically inactive, citizens, who could now use the Internet to follow daily political events and become politically engaged themselves. While many politicians and journalists were content with this development for a decade, the situation changed with the rise of online social networks (OSNs). These OSNs are now nearly ubiquitous; 67% of Americans obtain at least part of their news via social media. This trend has further lowered the cost of publishing content. What at first appeared to be a positive development has meanwhile become a serious problem for democracies. Instead of a nearly unlimited amount of easily accessible information making us wiser, the sheer volume of content becomes a burden. A balanced selection of news gives way to a flood of posts and topics filtered through the user's digital social environment, which fosters political polarization and ideological segregation. Moreover, more than half of OSN users no longer trust the news they read (54% worry about fake news). Consistent with this picture, studies report that OSN users are more exposed to the populism of far-left and far-right political actors than people without access to social media. To mitigate the negative effects of this development, my work contributes to the understanding of the problem and addresses fundamental research in behavior modeling. Finally, we examine the danger of Internet users being influenced by social bots and present a solution based on behavior modeling. To better understand the news consumption of German-speaking users in OSNs, we analyzed their behavior on Twitter and compared reactions to controversial, in part anti-constitutional, and non-controversial content. In addition, we investigated the existence of echo chambers and similar phenomena. With regard to user behavior, we concentrated on networks that permit more complex user behavior and developed probabilistic behavior-modeling solutions for clustering and time-series segmentation. Beyond these contributions to understanding the problem, we developed solutions for detecting automated accounts, which play an important role in the early phase of the spread of fake news. Our expert model, based on current deep-learning methods, identifies automated accounts by their behavior, for example. My work raises awareness of this negative development, engages in fundamental research on behavior modeling, addresses the danger of influence by social bots, and presents a behavior-modeling-based solution.

    Mixture-Based Clustering and Hidden Markov Models for Energy Management and Human Activity Recognition: Novel Approaches and Explainable Applications

    In recent times, the rapid growth of data in various fields has created an immense need for powerful tools to extract useful information from data. This has motivated researchers to explore and devise new ideas and methods in the field of machine learning. Mixture models have gained substantial attention due to their ability to handle high-dimensional data efficiently and effectively. However, when adopting mixture models in such spaces, several crucial issues must be addressed: the selection of probability density functions, the estimation of mixture parameters, the automatic determination of the number of components, the identification of the features that best discriminate the different components, and the incorporation of temporal information. The primary objective of this thesis is to propose a unified model that addresses these interrelated problems. Moreover, this thesis proposes a novel approach that incorporates explainability. This thesis presents innovative mixture-based modelling approaches tailored for diverse applications, such as household energy consumption characterization, energy demand management, fault detection and diagnosis, and human activity recognition. The primary contributions of this thesis encompass the following aspects. Initially, we propose an unsupervised feature selection approach embedded within a finite bounded asymmetric generalized Gaussian mixture model. This model is adept at handling synthetic and real-life smart meter data, utilizing three distinct feature extraction methods. By employing the expectation-maximization algorithm in conjunction with the minimum message length criterion, we are able to concurrently estimate the model parameters, perform model selection, and execute feature selection. This unified optimization process facilitates the identification of household electricity consumption profiles along with the optimal subset of attributes defining each profile. Furthermore, we investigate the impact of household characteristics on electricity usage patterns to pinpoint households that are ideal candidates for demand reduction initiatives. Subsequently, we introduce a semi-supervised learning approach for the mixture of mixtures of bounded asymmetric generalized Gaussian and uniform distributions. The integration of the uniform distribution within the inner mixture bolsters the model's resilience to outliers. In the unsupervised learning approach, the minimum message length criterion is utilized to ascertain the optimal number of mixture components. The proposed models are validated through a range of applications, including chiller fault detection and diagnosis, occupancy estimation, and energy consumption characterization. Additionally, we incorporate explainability into our models and establish a moderate trade-off between prediction accuracy and interpretability. Finally, we devise four novel models for human activity recognition (HAR): a bounded asymmetric generalized Gaussian mixture-based hidden Markov model with feature selection (BAGGM-FSHMM), a bounded asymmetric generalized Gaussian mixture-based hidden Markov model (BAGGM-HMM), an asymmetric generalized Gaussian mixture-based hidden Markov model with feature selection (AGGM-FSHMM), and an asymmetric generalized Gaussian mixture-based hidden Markov model (AGGM-HMM). We develop an innovative method for the simultaneous estimation of feature saliencies and model parameters in BAGGM-FSHMM and AGGM-FSHMM, while integrating the bounded-support asymmetric generalized Gaussian distribution (BAGGD) and the asymmetric generalized Gaussian distribution (AGGD) into the BAGGM-HMM and AGGM-HMM, respectively. The proposed models are validated on video-based and sensor-based HAR applications, showcasing their superiority over several mixture-based hidden Markov models (HMMs) across various performance metrics. We demonstrate that incorporating feature selection and a bounded-support distribution independently in a HAR system each yields benefits, while combining both concepts results in the most effective of the proposed models.
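    A minimal sketch of the EM-plus-model-selection loop described above is given below, under simplifying assumptions: a plain Gaussian mixture stands in for the bounded asymmetric generalized Gaussian mixture, the Bayesian information criterion (BIC) stands in for the minimum message length criterion, and the "consumption" data are synthetic.

```python
# Simplified analogue: fit mixtures with EM for a range of component counts
# and keep the one with the lowest information criterion. Not the thesis's
# BAGGM/MML procedure; a plain GMM scored by BIC is used as a stand-in.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(42)
# Toy "consumption profile" data: two synthetic clusters in 3 features.
X = np.vstack([rng.normal(0.0, 1.0, size=(200, 3)),
               rng.normal(4.0, 0.5, size=(200, 3))])

best_k, best_bic, best_model = None, np.inf, None
for k in range(1, 7):
    gmm = GaussianMixture(n_components=k, covariance_type="full",
                          random_state=0).fit(X)      # EM fit for k components
    bic = gmm.bic(X)                                  # lower is better
    if bic < best_bic:
        best_k, best_bic, best_model = k, bic, gmm

print(f"selected {best_k} components (BIC = {best_bic:.1f})")
```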

    Sequential Inference with the Mallows Model

    The Mallows model is a widely used probabilistic model for analysing rank data. It assumes that a collection of n items can be ranked by each assessor and the result summarised as a permutation of size n. The associated probability distribution is defined on the permutation space of these items. A hierarchical Bayesian framework for the Mallows model, named the Bayesian Mallows model, has been developed recently to perform inference and to provide uncertainty estimates of the model parameters. This framework typically uses Markov chain Monte Carlo (MCMC) methods to simulate from the target posterior distribution. However, MCMC can be considerably slow when new ranking data arrive and demand additional computation, so it can be difficult to update the Bayesian Mallows model in real time. This thesis extends the Bayesian Mallows model to allow for sequential updates of its posterior estimates each time a collection of new preference data is observed. The posterior is updated over a sequence of discrete time steps with fixed computational complexity, which can be achieved using Sequential Monte Carlo (SMC) methods. SMC offers a standard alternative to MCMC by constructing a sequence of posterior distributions using a set of weighted samples; the samples are propagated via a combination of importance sampling, resampling and move steps. We propose an SMC framework that can perform sequential updates of the posterior distribution for both a single Mallows model and a Mallows mixture each time we observe new full rankings in an online setting. We also construct a framework to conduct SMC with partial rankings for a single Mallows model, and propose an alternative proposal distribution for data augmentation in partial rankings that incorporates the current posterior estimates of the Mallows model parameters in each SMC iteration. We further extend the framework to consider how the posterior is updated when known assessors provide additional information in their partial rankings, and show how these corrections to the latent information are performed to account for the changes in the posterior.
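    The reweight-resample-move cycle described above can be sketched generically. In the snippet below, a placeholder Gaussian likelihood stands in for the Mallows likelihood and a random-walk Metropolis step stands in for the model-specific move; it shows only the shape of the sequential update, not the thesis's SMC framework.

```python
# Schematic sequential Monte Carlo update: reweight by the new batch,
# resample, then rejuvenate with one MCMC move. Placeholder likelihood only.
import numpy as np

rng = np.random.default_rng(0)

def log_lik(theta, data):
    # Placeholder: i.i.d. Gaussian observations with mean theta (not Mallows).
    return -0.5 * np.sum((data[None, :] - theta[:, None]) ** 2, axis=1)

def smc_update(particles, seen, batch, step=0.2):
    # 1) reweight particles by the likelihood of the newly observed batch
    logw = log_lik(particles, batch)
    w = np.exp(logw - logw.max())
    w /= w.sum()
    # 2) resample proportionally to the weights (weights reset to uniform)
    particles = particles[rng.choice(len(particles), size=len(particles), p=w)]
    # 3) move step: one Metropolis rejuvenation targeting the posterior of all
    #    data seen so far (flat prior assumed for simplicity)
    seen = np.concatenate([seen, batch])
    proposal = particles + step * rng.standard_normal(len(particles))
    accept = np.log(rng.random(len(particles))) < log_lik(proposal, seen) - log_lik(particles, seen)
    return np.where(accept, proposal, particles), seen

particles, seen = rng.standard_normal(500), np.empty(0)          # prior draws, no data yet
for batch in (rng.normal(1.0, 1.0, size=20) for _ in range(5)):  # batches arriving online
    particles, seen = smc_update(particles, seen, batch)
print(f"posterior mean ~ {particles.mean():.2f}, sd ~ {particles.std():.2f}")
```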

    CFD modeling of biomass combustion and gasification in fluidized bed reactors

    Biomass is an environmentally friendly renewable energy source and a carbon-neutral fuel alternative. Direct combustion/gasification of biomass in a dense particle-fluid system is an important pathway for biomass energy utilization. To efficiently utilize biomass for energy conversion, a full understanding of biomass thermal conversion in lab- and industrial-scale equipment is essential. This thesis aims to gain a deeper understanding of the physical and chemical mechanisms of biomass combustion/gasification in fluidized bed (FB) furnaces using computational fluid dynamics (CFD) simulations. A three-dimensional reactive CFD model based on the Eulerian-Lagrangian method is developed to investigate the hydrodynamics, heat transfer, and gasification/combustion characteristics of biomass in multiple-scale FB furnaces. The CFD model considered here is based on the multi-phase particle-in-cell (MP-PIC) collision model and the coarse grain method (CGM). CGM is computationally efficient; however, it can cause numerical instability if clustered parcels pass through small computational cells, resulting in the overloading of solid particles in those cells. To address this issue, a distribution kernel method (DKM) is proposed, which spreads the solid volume and source terms of a parcel over the surrounding domain; the numerical stiffness caused by CGM clustering can thereby be remedied. Validation of the model is performed using experimental data from various lab-scale reactors, and the validated model is employed to further investigate the heat transfer and the biomass combustion/gasification process. Biomass pyrolysis produces a large variety of species in the products, which poses great challenges to the modeling of biomass gasification. A conventional single-step pyrolysis model is widely employed in FB simulations due to its low computational cost; however, its prediction of pyrolysis products under varying operating temperatures needs to be improved. To address this issue, an empirical pyrolysis model based on the law of element conservation is developed, with empirical parameters drawn from a number of experiments in the literature. The simulation results agree well with the experimental data under different operating conditions, and the pyrolysis model improves the sensitivity of gasification product yields to the operating temperature. Furthermore, the mixing characteristics of the biomass and sand particles and the effect of the operating conditions on the yields of gasification products are analyzed. The validated CFD model is employed to investigate the fluidization, combustion, and emission processes in industrial-scale FB furnaces. A major challenge in the CFD simulation of industrial-scale FB furnaces is the enormous computational time and memory required to track quadrillions of particles in these systems. The CFD model coupling MP-PIC and CGM greatly reduces the computational cost, and the DKM overcomes the otherwise unavoidable particle-overloading issue caused by the refined mesh in complex geometry. The CFD predictions agree well with on-site temperature measurements in the furnace. The CFD results are used to understand the granular flow and the impact of operating conditions on the physical and chemical processes in biomass-fired FB furnaces.
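    The idea behind the distribution kernel method can be illustrated with a toy one-dimensional deposition step: rather than assigning a coarse-grained parcel's entire solid volume to the single cell containing it, the volume is spread over neighbouring cells with a normalised kernel, so volume is conserved and no small cell is overloaded. The sketch below uses a Gaussian kernel on a uniform 1-D grid with made-up values; it is not the three-dimensional implementation developed in the thesis.

```python
# Toy 1-D illustration of kernel-based deposition of a parcel's solid volume.
# Grid size, kernel width, and parcel properties are invented for illustration.
import numpy as np

n_cells, dx = 50, 0.02                       # uniform 1-D grid, cell width 0.02 m
centres = (np.arange(n_cells) + 0.5) * dx    # cell-centre coordinates
cell_volume = dx * 1.0 * 1.0                 # unit cross-sectional area assumed

def deposit_parcel(solid_volume, x_parcel, bandwidth=1.5 * dx):
    """Return the per-cell solid volume fraction contributed by one parcel."""
    w = np.exp(-0.5 * ((centres - x_parcel) / bandwidth) ** 2)
    w /= w.sum()                             # weights sum to one -> volume conserved
    return solid_volume * w / cell_volume    # volume fraction in each cell

# One parcel (representing many real particles) sitting near a cell boundary.
eps_s = deposit_parcel(solid_volume=5e-6, x_parcel=0.305)
print(eps_s.max(), eps_s.sum() * cell_volume)   # peak fraction, total volume recovered
```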

    Explain what you see: argumentation-based learning and robotic vision

    In this thesis, we have introduced new techniques for the problems of open-ended learning, online incremental learning, and explainable learning. These methods have applications in the classification of tabular data, 3D object category recognition, and 3D object part segmentation. We have utilized argumentation theory and probability theory to develop these methods. The first proposed open-ended online incremental learning approach is Argumentation-Based online incremental Learning (ABL). ABL works with tabular data and can learn from a small number of instances using an abstract argumentation framework and a bipolar argumentation framework. It has a higher learning speed than state-of-the-art online incremental techniques; however, it has high computational complexity. We have addressed this problem by introducing Accelerated Argumentation-Based Learning (AABL), which uses only an abstract argumentation framework and employs two strategies to accelerate the learning process and reduce the complexity. The second proposed open-ended online incremental learning approach is the Local Hierarchical Dirichlet Process (Local-HDP). Local-HDP addresses two problems: open-ended category recognition of 3D objects and segmentation of 3D object parts. We have utilized Local-HDP for object part segmentation in combination with AABL to achieve an interpretable model that explains why a certain 3D object belongs to a certain category. The explanations of this model tell a user that a certain object has specific parts that resemble the typical parts of certain categories. Moreover, integrating AABL and Local-HDP leads to a model that can handle a high degree of occlusion.
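    For readers unfamiliar with abstract argumentation, the snippet below computes the grounded extension of a toy argumentation framework by iterating its characteristic function: an argument is accepted when every one of its attackers is itself attacked by an already-accepted argument. The arguments and attack relation are invented for illustration; this is the underlying formalism only, not the ABL/AABL learning algorithm.

```python
# Grounded extension of an abstract argumentation framework (toy example).
def grounded_extension(arguments, attacks):
    """`attacks` is a set of (attacker, target) pairs."""
    attackers_of = {a: {x for x, y in attacks if y == a} for a in arguments}

    def defended(candidate, accepted):
        # every attacker of `candidate` is attacked by some accepted argument
        return all(any((d, b) in attacks for d in accepted) for b in attackers_of[candidate])

    extension = set()
    while True:
        new = {a for a in arguments if defended(a, extension)}
        if new == extension:
            return extension
        extension = new

args = {"A", "B", "C", "D"}
atk = {("B", "A"), ("C", "B"), ("D", "C")}    # D attacks C, C attacks B, B attacks A
print(sorted(grounded_extension(args, atk)))  # -> ['B', 'D']
```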

    Predictive embodied concepts: an exploration of higher cognition within the predictive processing paradigm

    Predictive processing (PP), an increasingly popular paradigm in the cognitive sciences, has focused primarily on giving accounts of perception, motor control, and a host of psychological phenomena, including consciousness. But higher cognitive processes, like conceptual thought, language, and logic, have received only limited attention to date, and PP still stands disconnected from a huge body of research in those areas. In this thesis, I aim to address this gap and attempt to go some way towards developing and defending a cognitive-computational approach to higher cognition within the predictive processing paradigm. To test its explanatory potential, I apply it to a range of linguistic and conceptual phenomena. I proceed in three steps. First, I lay out an account of concepts and suggest how concepts are represented, how they can be processed in a context-sensitive way, and how their apparent diversity of formats arises. Secondly, I propose how paradigmatic higher cognitive competencies, like language and logical reasoning, could fit into the PP picture. Thirdly, I apply the PP account of concepts and language to a range of linguistic-conceptual phenomena as test cases, namely metaphor, semantic paradox (specifically the Liar Paradox), and copredication. Finally, I discuss some challenges and objections to the PP framework as applied to higher cognition and in general.