
    Two-Stage Fuzzy Multiple Kernel Learning Based on Hilbert-Schmidt Independence Criterion

    Multiple kernel learning (MKL) is a principled approach to kernel combination and selection for a variety of learning tasks, such as classification, clustering, and dimensionality reduction. In this paper, we develop a novel fuzzy multiple kernel learning model based on the Hilbert-Schmidt independence criterion (HSIC) for classification, which we call HSIC-FMKL. In this model, we first propose an HSIC Lasso-based MKL formulation, which not only has a clear statistical interpretation that minimally redundant kernels with maximum dependence on the output labels are found and combined, but also enables the globally optimal solution to be computed efficiently by solving a Lasso optimization problem. Since the traditional support vector machine (SVM) is sensitive to outliers or noise in the dataset, a fuzzy SVM (FSVM) is used to select the prediction hypothesis once the optimal kernel has been obtained. The main advantage of the FSVM is that we can associate a fuzzy membership with each data point, so that different data points can have different effects on the training of the learning machine. We propose a new fuzzy membership function using a heuristic strategy based on the HSIC. The proposed HSIC-FMKL is a two-stage kernel learning approach in which the HSIC is applied at both stages. We perform extensive experiments on real-world datasets from the UCI benchmark repository and from the application domain of computational biology, which validate the superiority of the proposed model in terms of prediction accuracy.
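    The HSIC-Lasso objective itself is not reproduced in the abstract, but the empirical HSIC estimator both stages rely on is standard: HSIC(K, L) = tr(KHLH)/(n-1)^2 with the centering matrix H. A minimal Python sketch of scoring candidate kernels against the labels (the RBF bandwidth and the simple label kernel are illustrative assumptions, not the paper's choices):

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    # Gaussian (RBF) Gram matrix from pairwise squared distances.
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * X @ X.T
    return np.exp(-gamma * d2)

def hsic(K, L):
    # Empirical HSIC estimator: tr(K H L H) / (n - 1)^2,
    # where H = I - (1/n) 11^T centers the Gram matrices.
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

# Toy usage: rank per-feature kernels by their dependence on the labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = (X[:, 0] > 0).astype(float)      # labels driven by feature 0 only
Ky = np.outer(y, y)                  # simple linear label kernel
scores = [hsic(rbf_kernel(X[:, [j]]), Ky) for j in range(X.shape[1])]
print(scores)                        # the kernel on feature 0 should score highest
```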

    A Hybrid Sailfish Whale Optimization and Deep Long Short-Term Memory (SWO-DLSTM) Model for Energy Efficient Autonomy in India by 2048

    In order to formulate the long-term and short-term development plans needed to meet energy needs, there is great demand for accurate energy forecasting. Most existing energy demand forecasting models predict the amount of energy at a regional or national scale and fail to forecast the demand for power generation in small-scale decentralized energy systems, such as microgrids, buildings, and energy communities. Deep learning models play a vital role in accurately forecasting energy demand. A novel model called Sail Fish Whale Optimization-based Deep Long Short-Term Memory (SFWO-based Deep LSTM) is proposed to forecast electricity demand in distribution systems. The proposed SFWO is designed by integrating the Sail Fish Optimizer (SFO) with the Whale Optimization Algorithm (WOA). The Hilbert-Schmidt Independence Criterion Lasso (HSIC Lasso) is applied to the dataset, collected from the Central Electricity Authority, Government of India, to select the optimal features using the technical indicators. The proposed algorithm was implemented in the MATLAB software package and the study was carried out using real-time data. The feature selection process improves the accuracy of the proposed model by training the selected features with the Deep LSTM. The results of the proposed model in terms of installed-capacity prediction, villages-electrified prediction, length of R & D lines prediction, and hydro, coal, diesel, and nuclear prediction, among others, are compared with existing models. The proposed model achieves good accuracy, with an average normalized Root Mean Squared Error (RMSE) of 4.4559. The hybrid approach provides improved accuracy for the prediction of energy demand in India by the year 2047.
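    The abstract describes a two-step pipeline: HSIC Lasso selects features, and a Deep LSTM is trained on them. The SFWO hyperparameter search is not sketched here; below is only a minimal sliding-window LSTM forecaster in Python/Keras (the lookback length, layer size, and synthetic series are illustrative assumptions):

```python
import numpy as np
import tensorflow as tf

def make_windows(series, lookback=12):
    # Supervised pairs: the past `lookback` values predict the next one.
    X = np.stack([series[i:i + lookback] for i in range(len(series) - lookback)])
    y = series[lookback:]
    return X[..., None], y            # add a feature axis for the LSTM

series = np.sin(np.linspace(0, 20, 300)).astype("float32")  # stand-in demand curve
X, y = make_windows(series)

model = tf.keras.Sequential([
    tf.keras.layers.LSTM(32, input_shape=X.shape[1:]),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=5, verbose=0)
rmse = float(np.sqrt(model.evaluate(X, y, verbose=0)))  # in-sample RMSE, the paper's metric
print(rmse)
```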

    Affective Speech Recognition

    Speech, as a medium of interaction, carries two different streams of information. Whereas one stream carries explicit messages, the other contains implicit information about the speakers themselves. Affective speech recognition is a set of theories and tools that intend to automate the unfolding of the part of the implicit stream that has to do with human emotion. One application of affective speech recognition is human-computer interaction: a machine that is able to recognize human emotion could engage the user in a more effective interaction. This thesis proposes a set of analyses and methodologies that advance automatic recognition of affect from speech. The proposed solution spans two dimensions of the problem: speech signal processing and statistical learning. On the speech signal processing dimension, extraction of speech low-level descriptors is discussed, and a set of descriptors that exploit the spectrum of the signal are proposed, which have been shown to be particularly practical for capturing affective qualities of speech. Moreover, considering the non-stationary property of the speech signal, a measure of dynamicity is further proposed that captures that property by quantifying changes in the signal over time. Furthermore, based on the proposed set of low-level descriptors, it is shown that individual human beings differ in conveying emotions, and that the parts of the spectrum that hold the affective information differ from one person to another. Therefore, the concept of an emotion profile is proposed, which formalizes those differences by taking into account factors such as cultural and gender-specific differences, as well as distinctions between individual human beings. On the statistical learning dimension, variable selection is performed to identify the speech features that are most important for extracting affective information. In doing so, low-level descriptors are distinguished from statistical functionals, and the effectiveness of each of the two is studied both independently and jointly. The major importance of variable selection as a standalone component of a solution lies in real-time applications of affective speech recognition: although thousands of speech features are commonly used to tackle this problem in theory, extracting that many features in real time is unrealistic, especially for mobile applications. Results of the conducted investigations show that the required number of speech features is far smaller than the number commonly used in the literature on the problem. At the core of an affective speech recognition solution is a statistical model that uses speech features to recognize emotions. Such a model comes with a set of parameters that are estimated through a learning process. Proposed in this thesis is a learning algorithm, developed based on the notion of the Hilbert-Schmidt independence criterion and named max-dependence regression, that maximizes the dependence between predicted and actual values of affective qualities. Pearson's correlation coefficient is commonly used as the measure of goodness of fit in the affective computing literature; max-dependence regression is therefore proposed to make the learning and hypothesis-testing criteria consistent with one another. Results of this research show that doing so yields higher prediction accuracy. Lastly, sparse representation for affective speech datasets is considered in this thesis.
    For this purpose, the application of a dictionary learning algorithm based on the Hilbert-Schmidt independence criterion is proposed. Dictionary learning is used to identify the most important bases of the data in order to improve the generalization capability of the proposed solution to affective speech recognition. Based on the dictionary learning approach of choice, fusion of feature vectors is proposed. It is shown that sparse representation leads to higher generalization capability for affective speech recognition.
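    The abstract does not spell out max-dependence regression, but with linear kernels on scalar predictions the empirical HSIC between predictions Xw and targets y reduces to the squared empirical covariance, which under ||w|| = 1 is maximized in closed form. A hedged sketch of that special case (the general, nonlinear-kernel form would require iterative optimization):

```python
import numpy as np

def max_dependence_fit(X, y):
    # With linear kernels, HSIC(Xw, y) = ((Xw)_c^T y_c)^2 / (n-1)^2, the
    # squared empirical covariance; under ||w|| = 1 the maximizer is
    # w proportional to Xc^T yc.
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    w = Xc.T @ yc
    return w / np.linalg.norm(w)

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))
y = X @ rng.normal(size=10) + 0.1 * rng.normal(size=200)
w = max_dependence_fit(X, y)
print(np.corrcoef(X @ w, y)[0, 1])  # Pearson's r, the thesis's test criterion
```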

    Multi-graph learning with positive and unlabeled bags

    In this paper, we formulate a new multi-graph learning task with only positive and unlabeled bags, where labels are available only for bags, not for the individual graphs inside each bag. This problem setting raises significant challenges because the bag-of-graphs setting has no features to directly represent graph data, and no negative bags exist for deriving discriminative classification models. To solve this challenge, we propose a puMGL learning framework which relies on two iteratively combined processes for multi-graph learning: (1) deriving features to represent graphs for learning; and (2) deriving discriminative models with only positive and unlabeled graph bags. For the former, we derive a subgraph scoring criterion to select a set of informative subgraphs to convert each graph into a feature space. To handle unlabeled bags, we assign a weight value to each bag and use the adjusted weight values to select the most promising unlabeled bags as negative bags. A margin graph pool (MGP), which contains representative graphs from positive bags and identified negative bags, is used for selecting subgraphs and training graph classifiers. The iterative subgraph scoring, bag weight updating, and MGP-based graph classification form a closed loop to find optimal subgraphs and the most suitable unlabeled bags for multi-graph learning. Experiments and comparisons on real-world multi-graph data demonstrate the algorithm's performance.
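    The closed loop described above can be made concrete with a toy stand-in. In the sketch below, each "graph" is assumed to already be a feature vector (the subgraph-mining step is abstracted away) and a centroid-based scorer stands in for the graph classifier; only the iterate of taking the lowest-weighted bags as provisional negatives and then re-weighting follows the abstract:

```python
import numpy as np

rng = np.random.default_rng(2)

def make_bag(mean, n=5):
    # Stand-in for a bag of graphs: each "graph" is already a feature vector.
    return rng.normal(loc=mean, scale=0.5, size=(n, 3))

pos_bags = [make_bag(+1.0) for _ in range(15)]
unl_bags = [make_bag(+1.0) for _ in range(10)] + [make_bag(-1.0) for _ in range(10)]

pos_c = np.mean([b.mean(axis=0) for b in pos_bags], axis=0)
# Initial bag weights: similarity of each unlabeled bag to the positive centroid.
weights = np.array([-np.linalg.norm(b.mean(axis=0) - pos_c) for b in unl_bags])

for _ in range(5):
    # 1. The lowest-weighted unlabeled bags act as provisional negatives.
    neg_ids = np.argsort(weights)[:8]
    neg_c = np.mean([unl_bags[i].mean(axis=0) for i in neg_ids], axis=0)
    # 2. Re-weight each unlabeled bag by how "positive" it looks under the
    #    current (toy, centroid-based) model.
    for i, b in enumerate(unl_bags):
        m = b.mean(axis=0)
        weights[i] = np.linalg.norm(m - neg_c) - np.linalg.norm(m - pos_c)

print(np.argsort(weights)[:10])  # bags 10-19 should be flagged as negatives
```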

    Stochastic and deterministic algorithms for continuous black-box optimization

    Continuous optimization is never easy: the exact solution is always a luxury and the theory behind it is not always analytical and elegant. In practice, continuous optimization is essentially about efficiency: how to obtain a solution of the same quality using as few resources (e.g., CPU time or memory) as possible? In this thesis, the number of function evaluations is considered the most important resource to save. To achieve this goal, various techniques have been implemented and applied successfully. One research stream focuses on the so-called stochastic variation (mutation) operator, which conducts a (local) exploration of the search space. The efficiency of these operators has been investigated closely, showing that a good stochastic variation should generate good coverage of the local neighbourhood around the current search point. This thesis contributes to this issue by formulating a novel stochastic variation that yields good space coverage.
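    The abstract leaves the novel variation operator unspecified; as a generic point of reference, here is a (1+1) evolution strategy in Python with an isotropic Gaussian mutation and a simple success-based step-size rule (all constants are illustrative assumptions):

```python
import numpy as np

def sphere(x):
    # Toy black-box objective: only function evaluations are available.
    return float(np.sum(x ** 2))

def one_plus_one_es(f, x0, sigma=0.5, budget=200, seed=0):
    # (1+1)-ES: mutate the incumbent, keep the candidate if it is no worse,
    # and adapt the step size based on success.
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    fx = f(x)
    for _ in range(budget):               # each loop costs one evaluation
        cand = x + sigma * rng.standard_normal(x.shape)
        fc = f(cand)
        if fc <= fx:
            x, fx = cand, fc
            sigma *= 1.2                  # success: widen the exploration
        else:
            sigma *= 0.95                 # failure: contract toward the incumbent
    return x, fx

print(one_plus_one_es(sphere, np.ones(5)))
```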

    A Bayes Interpretation of Stacking for M-Complete and M-Open Settings

    In M-open problems, where no true model can be conceptualized, it is common to back off from modeling and merely seek good prediction. Even in M-complete problems, taking a predictive approach can be very useful. Stacking is a model averaging procedure that gives a composite predictor by combining individual predictors from a list of models using weights that optimize a cross-validation criterion. We show that the stacking weights also asymptotically minimize a posterior expected loss; hence we formally provide a Bayesian justification for cross-validation. Often the weights are constrained to be positive and to sum to one. For greater generality, we omit the positivity constraint and relax the ‘sum to one’ constraint.
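    With the positivity and sum-to-one constraints dropped, squared-error stacking weights reduce to an ordinary least-squares fit of the held-out targets on the models' out-of-fold predictions. A minimal sketch (the synthetic predictors stand in for cross-validated model outputs):

```python
import numpy as np

def stacking_weights(cv_preds, y):
    # cv_preds: (n_samples, n_models) out-of-fold predictions per model.
    # Unconstrained squared-error stacking = OLS against the targets.
    w, *_ = np.linalg.lstsq(cv_preds, y, rcond=None)
    return w

rng = np.random.default_rng(3)
y = rng.normal(size=100)
cv_preds = np.column_stack([
    y + 0.3 * rng.normal(size=100),   # accurate model
    y + 1.0 * rng.normal(size=100),   # noisier model
    rng.normal(size=100),             # uninformative model
])
print(stacking_weights(cv_preds, y))  # weight should concentrate on column 0
```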

    On the intelligent management of sepsis in the intensive care unit

    The management of the Intensive Care Unit (ICU) in a hospital has its own, very specific requirements that involve, among others, issues of risk-adjusted mortality and average length of stay; nurse turnover and communication with physicians; technical quality of care; the ability to meet patients' families' needs; and the avoidance of medical error due to rapidly changing circumstances and work overload. In the end, good ICU management should lead to an improvement in patient outcomes. Decision making in the ICU environment is a real-time challenge that works according to very tight guidelines, which relate to often complex and sensitive research ethics issues. Clinicians in this context must act upon as much available information as possible and could therefore, in general, benefit from at least partially automated computer-based decision support based on qualitative and quantitative information. Those taking executive decisions in ICUs will require methods that are not only reliable but also, and this is a key issue, readily interpretable; otherwise, any decision tool, regardless of its sophistication and accuracy, risks being rendered useless. This thesis addresses this through the design and development of computer-based decision-making tools to assist clinicians in the ICU. It focuses on one of the main problems that they must face: the management of the Sepsis pathology. Sepsis is one of the main causes of death for non-coronary ICU patients. Its mortality rate can reach almost one out of two patients for septic shock, its most acute manifestation. It is a transversal condition affecting people of all ages. Surprisingly, its definition was only standardized two decades ago, as a systemic inflammatory response syndrome with confirmed infection. The research reported in this document deals with the problem of Sepsis data analysis in general and, more specifically, with the problem of survival prediction for patients affected by Severe Sepsis. The tools at the core of the investigated data analysis procedures stem from the fields of multivariate and algebraic statistics, algebraic geometry, machine learning and computational intelligence. Beyond data analysis itself, this thesis makes contributions from a clinical point of view, as it provides substantial evidence to the debate about the impact of the preadmission use of statin drugs on ICU outcome. It also sheds light on the dependence between Septic Shock and Multiple Organ Dysfunction Syndrome. Moreover, it defines a latent set of Sepsis descriptors to be used as prognostic factors for the prediction of mortality, and achieves an improvement in predictive capability over indicators currently in use.

    From fuzzy-rough to crisp feature selection

    A central problem in machine learning and pattern recognition is the process of recognizing the most important features in a dataset. This process plays a decisive role in big data processing by reducing the size of datasets. One major drawback of existing feature selection methods is the high chance of redundant features appearing in the final subset, where in most cases finding and removing them can greatly improve the resulting classification accuracy. To tackle this problem on two different fronts, we employed fuzzy-rough sets and perturbation theory. On one front, we used three strategies to improve the performance of fuzzy-rough set-based feature selection methods. The first strategy was to encode both features and samples in one binary vector and use a shuffled frog leaping algorithm to choose the best combination, using the fuzzy dependency degree as the fitness function. In the second strategy, we designed a measure to evaluate features based on the fuzzy-rough dependency degree in such a way that redundant features are given less priority for selection. In the last strategy, we designed a new binary version of the shuffled frog leaping algorithm that employs a fuzzy positive region as its similarity measure, working in complete harmony with the fitness function (i.e. the fuzzy-rough dependency degree). To extend the applicability of fuzzy-rough set-based feature selection to multi-party medical datasets, we designed a privacy-preserving version of the original method. In addition, we studied the feasibility and applicability of perturbation theory for feature selection, which to the best of our knowledge has never been researched before. We introduce a new feature selection method based on perturbation theory that is not only capable of detecting and discarding redundant features but is also very fast and flexible in accommodating the special needs of an application. It employs a clustering algorithm to group similarly behaving features based on the sensitivity of each feature to perturbation, the angle of each feature to the outcome, and the effect of removing each feature on the outcome; it then chooses the feature closest to the centre of each cluster and returns those features as the final subset. To assess the effectiveness of the proposed methods, we compared the results of each method with well-known feature selection methods on a series of artificially generated datasets, as well as biological, medical and cancer datasets obtained from the University of California Irvine machine learning repository, the Arizona State University repository and the Gene Expression Omnibus repository.
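    A hedged sketch of the perturbation-based selector described above: each feature is summarized by its sensitivity to perturbation, its angle to the outcome, and the effect of its removal; the features are then clustered, and the one nearest each cluster centre is kept. The linear model, descriptors, and cluster count are illustrative assumptions, not the thesis's exact choices:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression

def perturbation_select(X, y, k=3, eps=1e-2, seed=0):
    base = LinearRegression().fit(X, y)
    r2_full = base.score(X, y)
    desc = np.zeros((X.shape[1], 3))
    for j in range(X.shape[1]):
        Xp = X.copy()
        Xp[:, j] += eps * X[:, j].std()                       # perturb feature j
        desc[j, 0] = np.mean(np.abs(base.predict(Xp) - base.predict(X)))  # sensitivity
        xc, yc = X[:, j] - X[:, j].mean(), y - y.mean()
        desc[j, 1] = xc @ yc / (np.linalg.norm(xc) * np.linalg.norm(yc))  # angle to outcome
        Xr = np.delete(X, j, axis=1)                          # drop feature j
        desc[j, 2] = r2_full - LinearRegression().fit(Xr, y).score(Xr, y) # removal effect
    km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(desc)
    return sorted({int(np.argmin(np.linalg.norm(desc - c, axis=1)))
                   for c in km.cluster_centers_})

rng = np.random.default_rng(4)
X = rng.normal(size=(150, 8))
X[:, 7] = X[:, 0] + 0.01 * rng.normal(size=150)   # feature 7 is redundant with 0
y = X[:, 0] + 2 * X[:, 1] + 0.1 * rng.normal(size=150)
print(perturbation_select(X, y))                  # the redundant pair should yield one pick
```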