
    Least square multi-class kernel machines with prior knowledge and applications.

    In this study, the problem of discriminating between objects of two or more classes with (or without) prior knowledge is investigated. We show how a two-class discrimination model, with or without prior knowledge, can be extended to multi-categorical discrimination. The prior knowledge of interest takes the form of multiple polyhedral sets belonging to one or more categories, classes, or labels, and it is introduced as additional constraints in the classification model formulation. The solution of the knowledge-based support vector machine (KBSVM) model for two-class discrimination is characterized by a linear programming (LP) problem, owing to the specific norm (L1 or L-infinity) used to compute the distance between the two classes. We propose solutions to classification problems expressed as a single unconstrained optimization problem, with (or without) prior knowledge, via a regularized least squares cost function; this yields a linear system of equations in input space and/or in the dual space induced by a kernel function, which can be solved using matrix or iterative methods. Advantages of this formulation include explicit expressions for the classification weights of the classifier(s); the ability to incorporate and handle prior knowledge directly in the classifiers; and the ability to handle several classes in a single formulation, providing fast solutions for the optimal classification weights in multicategorical separation. Comparisons with other learning techniques, such as the least squares SVM and MSVM developed by Suykens and Vandewalle (1999b, 1999c) and the knowledge-based SVM developed by Fung et al. (2002), indicate that the regularized least squares methods are more efficient in terms of misclassification testing error and computational time.
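    As a rough illustration of the regularized least squares formulation described in this abstract, the sketch below fits a multi-class kernel classifier by solving a single linear system in the dual space. The Gaussian kernel, the one-vs-rest target encoding and all names are illustrative assumptions rather than the author's exact model, and the prior-knowledge constraints are omitted.

        import numpy as np

        def rbf_kernel(A, B, gamma=1.0):
            # Gaussian (RBF) kernel matrix between the rows of A and B.
            sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
            return np.exp(-gamma * sq)

        def fit_rls_classifier(X, y, lam=1e-2, gamma=1.0):
            # Regularized least squares in the kernel-induced dual space:
            # solve (K + lam*I) alpha = Y for a one-vs-rest target matrix Y,
            # so the classification weights have an explicit closed-form expression.
            classes = np.unique(y)
            Y = np.where(y[:, None] == classes[None, :], 1.0, -1.0)
            K = rbf_kernel(X, X, gamma)
            alpha = np.linalg.solve(K + lam * np.eye(len(X)), Y)
            return alpha, classes

        def predict(X_train, alpha, classes, X_test, gamma=1.0):
            # Assign each test point to the class with the highest decision score.
            scores = rbf_kernel(X_test, X_train, gamma) @ alpha
            return classes[np.argmax(scores, axis=1)]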

    Unconstrained Learning Machines

    With the use of information technology in industry, a new need has arisen for analyzing large-scale data sets and automating data analysis that was once performed by human intuition and simple analog processing machines. The new generation of computer programs now has to outperform its predecessors in detecting complex and non-trivial patterns buried in data warehouses. Machine Learning (ML) techniques such as Neural Networks (NNs) and Support Vector Machines (SVMs) have shown remarkable performance on supervised learning problems over the past couple of decades (e.g. anomaly detection, classification and identification, interpolation and extrapolation). Nevertheless, many such techniques have ill-conditioned structures that lack adaptability for processing exotic data or very large amounts of data, and some cannot process data in an on-line fashion at all. Furthermore, as the processing power of computers increases, there is a pressing need for ML algorithms to perform supervised learning tasks in less time over even larger data sets, which means that the time and memory complexities of these algorithms must be improved. The aim of this research is to construct an improved type of SVM-like algorithm for tasks such as nonlinear classification and interpolation that is more scalable, error-tolerant and accurate. Additionally, this family of algorithms must be able to compute solutions within a controlled time, preferably small with respect to modern computational technologies, and should be versatile enough to have useful applications in engineering, meteorology or quality control. This dissertation introduces a family of SVM-based algorithms named Unconstrained Learning Machines (ULMs), which attempt to solve the robustness, scalability and timing issues of traditional supervised learning algorithms. ULMs are not based on geometrical analogies (e.g. SVMs) or on the replication of biological models (e.g. NNs); their construction is based strictly on statistical considerations taken from the recently developed statistical learning theory. Like SVMs, ULMs make extensive use of kernel methods in order to process exotic and/or non-numerical objects stored in databases and to search for hidden patterns in data with tailored measures of similarity. ULMs are applied to a variety of problems in manufacturing engineering and in meteorology. The robust nonlinear nonparametric interpolation abilities of ULMs allow for the representation of sub-millimetric deformations on the surface of manufactured parts, the selection of conforming objects, and the diagnosis and modeling of manufacturing processes. ULMs also play a role in assimilating the system states of computational weather models, removing intrinsic noise without any knowledge of the underlying mathematical models and helping to establish more accurate forecasts.
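    The exact ULM formulation is developed in the dissertation itself; as a stand-in, the short sketch below uses ordinary kernel ridge regression, an unconstrained regularized kernel least squares fit, to interpolate noisy samples. It conveys the flavour of robust nonlinear nonparametric interpolation without any constrained quadratic program; the data and hyperparameters are assumptions.

        import numpy as np
        from sklearn.kernel_ridge import KernelRidge

        # Noisy samples of a smooth function (a stand-in for surface-deformation data).
        rng = np.random.default_rng(0)
        x = np.sort(rng.uniform(0, 10, 200))[:, None]
        y = np.sin(x).ravel() + 0.1 * rng.standard_normal(200)

        # Unconstrained, regularized kernel least squares: no QP constraints,
        # the fit reduces to solving a single linear system.
        model = KernelRidge(alpha=1e-2, kernel="rbf", gamma=0.5)
        model.fit(x, y)
        y_hat = model.predict(x)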

    A Human-Centric Approach to Data Fusion in Post-Disaster Management: The Development of a Fuzzy Set Theory Based Model

    It is critical to provide an efficient and accurate information system in the post-disaster phase so that individuals can access and obtain the necessary resources in a timely manner; however, current map-based post-disaster management systems present all emergency resource lists without filtering them, which usually leads to a high computational load. An effective post-disaster management system (PDMS) also distributes emergency resources such as hospitals, storage and transportation much more reasonably, to the greater benefit of individuals in the post-disaster period. In this dissertation, a semi-supervised learning (SSL) based graph system was first constructed for the PDMS. The graph-based PDMS resource map was converted to a directed graph represented by an adjacency matrix, and decision information was then derived from the PDMS in two ways: a clustering operation and a graph-based semi-supervised optimization process. In this study, the PDMS was applied to emergency resource distribution in the post-disaster (response) phase; a path optimization algorithm based on ant colony optimization (ACO) was used to minimize cost after a disaster, and simulation results show the effectiveness of the proposed methodology. The analysis compares it with clustering-based algorithms under improved ACO variants, namely the tour improvement algorithm (TIA) and the Min-Max Ant System (MMAS), and the results also show that the SSL-based graph is more effective for computing the optimal path in the PDMS. This research further improved the map by combining the disaster map with the initial GIS-based map, locating the target area while accounting for the influence of the disaster. First, the initial map and the disaster map were put through a Gaussian transformation and the histograms of all map images were acquired. All images were then put through a discrete wavelet transform (DWT), and a Gaussian fusion algorithm was applied to the DWT images. Second, the inverse DWT (iDWT) was applied to generate a new map for the post-disaster management system. Finally, simulations were carried out, and the results showed the effectiveness of the proposed method compared to other fusion algorithms, such as mean-mean fusion and max-UD fusion, through evaluation indices including entropy, spatial frequency (SF) and the image quality index (IQI). A fuzzy set model was also proposed to improve the representation capacity of nodes in this GIS-based PDMS.
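    As one hedged sketch of the graph-based semi-supervised step described above, the code below runs standard label propagation (in the style of label spreading) over an adjacency matrix. It is a generic stand-in for the dissertation's optimization process; the adjacency matrix, labels and parameters are assumed inputs.

        import numpy as np

        def propagate_labels(A, labels, alpha=0.9, tol=1e-6, max_iter=1000):
            # A      : (n, n) adjacency matrix of the resource graph (directed edges allowed).
            # labels : length-n integer array, -1 for unlabeled nodes, class index otherwise.
            n = A.shape[0]
            classes = np.unique(labels[labels >= 0])
            Y = np.zeros((n, len(classes)))
            for j, c in enumerate(classes):
                Y[labels == c, j] = 1.0

            # Symmetrize and normalize the adjacency matrix: S = D^{-1/2} W D^{-1/2}.
            W = (A + A.T) / 2.0
            d = np.maximum(W.sum(axis=1), 1e-12)
            S = W / np.sqrt(d[:, None] * d[None, :])

            # Iterate F <- alpha*S*F + (1 - alpha)*Y until the class scores stop changing.
            F = Y.copy()
            for _ in range(max_iter):
                F_new = alpha * S @ F + (1 - alpha) * Y
                if np.abs(F_new - F).max() < tol:
                    F = F_new
                    break
                F = F_new
            return classes[np.argmax(F, axis=1)]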

    Uncovering the Potential of Federated Learning: Addressing Algorithmic and Data-driven Challenges under Privacy Restrictions

    Federated learning is a groundbreaking distributed machine learning paradigm that allows for the collaborative training of models across various entities without directly sharing sensitive data, ensuring privacy and robustness. This Ph.D. dissertation delves into the intricacies of federated learning, investigating the algorithmic and data-driven challenges of deep learning models in the presence of additive noise in this framework. The main objective is to provide strategies to measure the generalization, stability, and privacy-preserving capabilities of these models and further improve them. To this end, five noise infusion mechanisms at varying noise levels within centralized and federated learning settings are explored. As model complexity is a key component of the generalization and stability of deep learning models during training and evaluation, a comparative analysis of three Convolutional Neural Network (CNN) architectures is provided. A key contribution of this study is introducing specific metrics for training with noise. The Signal-to-Noise Ratio (SNR) is introduced as a quantitative measure of the trade-off between privacy and training accuracy of noise-infused models, aiming to find the noise level that yields optimal privacy and accuracy. Moreover, the Price of Stability and Price of Anarchy are defined in the context of privacy-preserving deep learning, contributing to the systematic investigation of noise infusion mechanisms that enhance privacy without compromising performance. This research sheds light on the delicate balance between these critical factors, fostering a deeper understanding of the implications of noise-based regularization in machine learning. The present study also explores a real-world application of federated learning to weather prediction, which suffers from imbalanced datasets. Utilizing data from multiple sources combined with advanced data augmentation techniques improves the accuracy and generalization of weather prediction models, even when dealing with imbalanced datasets. Overall, federated learning is pivotal in harnessing decentralized datasets for real-world applications while safeguarding privacy. By leveraging noise as a tool for regularization and privacy enhancement, this research study aims to contribute to the development of robust, privacy-aware algorithms, ensuring that AI-driven solutions prioritize both utility and privacy.
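    A minimal sketch of the noise-infusion and SNR measurement idea described above, assuming Gaussian additive noise applied to a flattened model update on the client side; the arrays and noise levels are illustrative, not the dissertation's actual experimental setup.

        import numpy as np

        def add_gaussian_noise(update, sigma, rng):
            # Infuse additive Gaussian noise into a model update before it leaves the client.
            noise = rng.normal(0.0, sigma, size=update.shape)
            return update + noise, noise

        def snr_db(update, noise):
            # Signal-to-noise ratio of a noise-infused update, in decibels.
            return 10.0 * np.log10(np.mean(update**2) / np.mean(noise**2))

        # Sweep noise levels to expose the privacy/accuracy trade-off discussed above.
        rng = np.random.default_rng(0)
        update = 0.05 * rng.standard_normal(10_000)   # stand-in for a flattened weight update
        for sigma in (0.01, 0.05, 0.1):
            noisy, noise = add_gaussian_noise(update, sigma, rng)
            print(f"sigma={sigma:.2f}  SNR={snr_db(update, noise):.1f} dB")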

    Machine Learning-Based Data and Model Driven Bayesian Uncertainty Quantification of Inverse Problems for Suspended Non-structural System

    Inverse problems involve extracting the internal structure of a physical system from noisy measurement data. In many fields, Bayesian inference is used to address the ill-conditioned nature of the inverse problem by incorporating prior information through an initial distribution. In the nonparametric Bayesian framework, surrogate models such as Gaussian Processes or Deep Neural Networks are used as flexible and effective probabilistic modeling tools to overcome the curse of dimensionality and reduce computational costs. In practical systems and computer models, uncertainties can be addressed through parameter calibration, sensitivity analysis, and uncertainty quantification, leading to improved reliability and robustness of decision and control strategies based on simulation or prediction results. However, preventing overfitting in the surrogate model while incorporating reasonable prior knowledge of the embedded physics and models remains a challenge. Suspended Nonstructural Systems (SNS) pose a significant challenge for the inverse problem: research on their seismic performance and mechanical models, particularly on the inverse problem and uncertainty quantification, is still lacking. To address this, the author conducts full-scale shaking table dynamic experiments, monotonic and cyclic tests, and simulations of different types of SNS to investigate their mechanical behavior. To quantify the uncertainty of the inverse problem, the author proposes a new framework that adopts machine learning-based, data- and model-driven stochastic Gaussian process model calibration, quantifying the uncertainty via a new black-box variational inference that accounts for a geometric complexity measure, the Minimum Description Length (MDL), through Bayesian inference. The framework is validated on SNS and yields optimal generalizability and computational scalability.
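    As a small illustration of the Gaussian process surrogate that underlies this kind of calibration, the sketch below computes a standard GP posterior mean and variance with a squared-exponential kernel. The black-box variational inference and MDL-based complexity term from the abstract are beyond this sketch, and all names and hyperparameters are assumptions.

        import numpy as np

        def sqexp(A, B, ell=1.0, sf=1.0):
            # Squared-exponential covariance between parameter settings in A and B.
            d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
            return sf**2 * np.exp(-0.5 * d2 / ell**2)

        def gp_posterior(X_train, y_train, X_test, noise=1e-2, ell=1.0, sf=1.0):
            # Gaussian process surrogate: posterior mean and variance at the test inputs.
            K = sqexp(X_train, X_train, ell, sf) + noise**2 * np.eye(len(X_train))
            Ks = sqexp(X_train, X_test, ell, sf)
            L = np.linalg.cholesky(K)
            alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
            mean = Ks.T @ alpha
            v = np.linalg.solve(L, Ks)
            var = np.diag(sqexp(X_test, X_test, ell, sf)) - np.sum(v**2, axis=0)
            return mean, var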

    Knowledge and pre-trained language models inside and out: a deep-dive into datasets and external knowledge

    Pre-trained Language Models (PLMs) have greatly advanced the performance of various NLP tasks and have undoubtedly been serving as foundation models for this field. These pre-trained models are able to capture rich semantic patterns from large-scale text corpora and learn high-quality representations of texts. However, such models still have shortcomings: they underperform on tasks that require implicit external knowledge to be understood, which is difficult to learn with commonly employed pre-training objectives. Moreover, a comprehensive understanding of PLMs' behaviour in learning knowledge during the fine-tuning phase is lacking. To address these challenges, we propose a set of approaches to inject external knowledge into PLMs and present experiments investigating their behaviour in learning knowledge during fine-tuning, focusing primarily on Sentiment Analysis, Question Answering and Video Question Answering. Specifically, we introduce novel approaches that explicitly use textual historical reviews of users and products to improve sentiment analysis. To overcome the problem of context-question lexical overlap and data scarcity for question generation, we propose a novel method making use of linguistic and semantic knowledge with heuristics. Additionally, we explore how to utilise multimodal (visual and acoustic) information/knowledge to improve Video Question Answering. Experiments conducted on benchmark datasets show that our proposed approaches achieve superior performance compared to state-of-the-art models, demonstrating the effectiveness of our methods for injecting external knowledge. Furthermore, we conduct a set of experiments investigating the learning of knowledge by PLMs for question answering under various scenarios. Results reveal that the internal characteristics of QA datasets can introduce strong bias when PLMs learn from downstream task datasets. Finally, we present an in-depth discussion of future directions for improving PLMs with external knowledge.
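    One simple way to inject textual external knowledge, such as a user's historical reviews, is to encode it as an auxiliary segment alongside the input text. The sketch below shows this with a Hugging Face tokenizer purely as an assumed illustration; the model name, example texts and pairing scheme are not the architecture used in the thesis.

        from transformers import AutoTokenizer

        # Hypothetical example: treat a user's historical reviews as external knowledge and
        # encode them as a second segment next to the review being classified.
        tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

        review = "The battery life is shorter than advertised."
        user_history = " ".join([
            "Great phone, the camera exceeded my expectations.",
            "Shipping was slow but support resolved it quickly.",
        ])

        # Sentence-pair encoding lets the PLM attend to the injected knowledge during fine-tuning.
        inputs = tokenizer(review, user_history, truncation=True, padding=True,
                           return_tensors="pt")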