10 research outputs found

    Integrated Retinal Information System for Analyzing Kidney Condition

    Iridology is a practice that infers the state of the body from analysis of the iris structure. Changes or disturbances caused by disease in the body are relayed by nerve fibers to the brain; the brain transmits this information to the eye, where it is recorded and fixed around the pupil. These recorded fixations become data trails from which the disturbance or disease affecting a body organ can be detected. Previous research applied the Learning Vector Quantization (LVQ) method to analyze kidney condition through iridology, but its accuracy fell short of 100%. In this research, a Support Vector Machine (SVM) is implemented in Matlab R2007b to classify kidney condition in place of LVQ. On the training set, the classification accuracy is 100% for both right and left eyes. Compared with LVQ, which achieves only 96% accuracy for right eyes and 92% for left eyes, the SVM performs substantially better.
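    As a rough illustration of the classification step, the sketch below trains a scikit-learn SVM on synthetic features and reports training-set accuracy, mirroring how the abstract evaluates the model. The feature matrix is an invented stand-in: the study's actual iris-image features and its Matlab R2007b pipeline are not described in the abstract.

```python
# Minimal sketch: SVM classification evaluated on the training set itself,
# as the abstract reports. Features are synthetic stand-ins.
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
# Hypothetical features for 50 "normal" and 50 "disturbed" kidney samples.
X = np.vstack([rng.normal(0.0, 1.0, (50, 8)), rng.normal(1.5, 1.0, (50, 8))])
y = np.array([0] * 50 + [1] * 50)

clf = SVC(kernel="rbf")  # SVM classifier, as in the study
clf.fit(X, y)
print("training accuracy:", accuracy_score(y, clf.predict(X)))
```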

    An Overview of the Algorithm Selection Problem

    Users of machine learning algorithms need methods that help them identify the algorithms, or combinations of algorithms (workflows), that achieve the best possible performance. Selecting the best algorithm for a given problem has been the subject of many studies over the past four decades. This survey presents an overview of the contributions made to the algorithm selection problem. We describe different methods for solving it and identify some of the future research challenges in this domain.
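    One concrete strategy in the family this survey covers is to estimate each candidate algorithm's performance by cross-validation and recommend the winner. The candidate set below is an illustrative assumption, not one drawn from the survey:

```python
# Minimal sketch of algorithm selection via cross-validated comparison.
from sklearn.datasets import load_digits
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier

X, y = load_digits(return_X_y=True)
candidates = {
    "logreg": LogisticRegression(max_iter=2000),
    "tree": DecisionTreeClassifier(),
    "knn": KNeighborsClassifier(),
}
# Mean 5-fold accuracy per candidate; the best is the recommendation.
scores = {name: cross_val_score(m, X, y, cv=5).mean()
          for name, m in candidates.items()}
best = max(scores, key=scores.get)
print(scores, "->", best)
```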

    Analysis of kernel matrices and their relation to SVM performance

    The thesis explores the relationship between kernel matrix properties and SVM model performance in terms of generalization power. It pinpoints desirable qualities of a kernel matrix and provides a heuristic for kernel parameter choice that increases the likelihood of obtaining a good SVM model.
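    The abstract does not state which kernel matrix properties the heuristic relies on, but the eigenvalue spectrum of the Gram matrix is one standard quality to inspect when choosing a kernel parameter. A minimal sketch, assuming an RBF kernel:

```python
# Compute RBF Gram matrices for several bandwidths and inspect their spectra.
import numpy as np

def rbf_gram(X, gamma):
    """Gram matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2)."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))

for gamma in (0.01, 0.1, 1.0, 10.0):
    K = rbf_gram(X, gamma)
    eig = np.linalg.eigvalsh(K)  # ascending eigenvalues
    # A spectrum that is neither flat (K ~ identity, gamma too large) nor
    # rank-one (K ~ all-ones, gamma too small) suggests a usable kernel.
    print(f"gamma={gamma:5}: top eigenvalue={eig[-1]:8.2f}, "
          f"effective rank={(eig > 1e-6 * eig[-1]).sum()}")
```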

    Meta-level learning for the effective reduction of model search space.

    The exponential growth of the volume, variety and velocity of data is raising the need to investigate intelligent ways of extracting useful patterns from it. Finding the mapping of learning methods that leads to optimized performance on a given task requires deep expert knowledge and extensive computational resources, and the numerous configurations of these learning algorithms add another level of complexity. This triggers the need for an intelligent recommendation engine that can advise the best learning algorithm and its configuration for a given task. The techniques commonly used by experts, such as trial-and-error or drawing on prior experience in the specific domain, sometimes work for less complex tasks that require thousands of parameters to learn. However, state-of-the-art models, e.g. deep learning models, require well-tuned hyper-parameters to learn millions of parameters, which demands specialized skills and numerous computationally expensive, time-consuming trials. In that scenario, Meta-level learning is a potential solution that can recommend the most appropriate options efficiently and effectively, regardless of the complexity of the data. At the same time, Meta-learning raises several challenges of its own, the most critical being model selection and hyper-parameter optimization.

    The goal of this research is to investigate the model selection and hyper-parameter optimization approaches of automatic machine learning in general, and the challenges associated with them. In a machine learning pipeline there are several phases where Meta-learning can effectively facilitate the best recommendations, including 1) pre-processing steps, 2) the learning algorithm or combination of algorithms, 3) adaptivity mechanism parameters, 4) recurring concept extraction, and 5) concept drift detection. The scope of this research is limited to feature engineering for problem representation, and to a learning strategy for recommending an algorithm and its hyper-parameters at the Meta-level.

    Three studies were conducted around these two approaches of automatic machine learning: model selection using Meta-learning, and hyper-parameter optimization. The first study evaluates whether additional data from a different domain can improve the performance of a meta-learning system for time-series forecasting, with a focus on cross-domain Meta-knowledge transfer. Although the experiments revealed limited room for improvement over the overall best base-learner, the meta-learning approach turned out to be a safe choice, minimizing the risk of selecting the least appropriate base-learner: in only 2% of cases did meta-learning recommend the worst-performing base-learning method. The second study proposes another efficient and accurate domain adaptation approach, using a different meta-learning strategy. It empirically confirms the intuition that there is a relationship between the similarity of two tasks and the depth of network that must be fine-tuned to achieve accuracy comparable with that of a model trained from scratch. However, this approach is limited to a single hyper-parameter: the fine-tuning depth of the network, chosen according to task similarity. The final study expands the set of hyper-parameters while implicitly considering task similarity through the intrinsic dynamics of the training process.

    This final study presents a framework that automatically finds a good set of hyper-parameters, yielding reasonably good accuracy, by framing hyper-parameter selection and tuning within the reinforcement learning regime. The effectiveness of a recommended tuple can be tested very quickly rather than waiting for the network to converge. The approach produces accuracy close to the state of the art and is roughly 20% less computationally expensive than previous approaches. The proposed methods, belonging to different areas of automatic machine learning, have been thoroughly evaluated on a number of benchmark datasets, which confirmed their great potential.
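    A minimal sketch of the general idea of framing hyper-parameter selection as a reinforcement-learning-style loop with cheap evaluations; the candidate grid, the proxy_score surrogate and the epsilon-greedy policy below are all illustrative assumptions, not the thesis's actual framework:

```python
# Epsilon-greedy bandit over hyper-parameter tuples, scored by a fast proxy
# instead of waiting for full convergence.
import random

# Hypothetical candidate tuples (learning rate, batch size).
CANDIDATES = [(lr, bs) for lr in (1e-1, 1e-2, 1e-3) for bs in (32, 64, 128)]

def proxy_score(lr, bs):
    """Stand-in for a cheap evaluation: in practice, train a few steps and
    return an early validation score."""
    # Toy surrogate that peaks near lr=1e-2, bs=64 (purely for demonstration).
    return 1.0 - abs(lr - 1e-2) * 5 - abs(bs - 64) / 256 + random.gauss(0, 0.02)

def epsilon_greedy_search(rounds=50, epsilon=0.2):
    values = {c: 0.0 for c in CANDIDATES}  # running mean reward per tuple
    counts = {c: 0 for c in CANDIDATES}
    for _ in range(rounds):
        if random.random() < epsilon:      # explore a random tuple
            choice = random.choice(CANDIDATES)
        else:                              # exploit the best tuple so far
            choice = max(CANDIDATES, key=values.get)
        reward = proxy_score(*choice)
        counts[choice] += 1
        values[choice] += (reward - values[choice]) / counts[choice]
    return max(CANDIDATES, key=values.get)

if __name__ == "__main__":
    print("recommended (lr, batch_size):", epsilon_greedy_search())
```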

    Classification de textes : de nouvelles pondérations adaptées aux petits volumes (Text classification: new term weightings adapted to small volumes)

    Classification is omnipresent in everyday life, and largely unconscious. In decision-making, for example, when faced with something (an object, an event, a person), we instinctively relate it to similar elements in order to adapt our choices and behaviours. This assignment to a particular category rests on past experience and on the characteristics of the element: the more numerous the experiences and the more detailed the characteristics, the more relevant the decision. The same holds when we need to categorize a document based on its content, for example detecting whether it is a children's story or a philosophical treatise. This works all the better when we have a large number of works from the two categories and when the document to classify contains a large number of words.

    In this thesis we address precisely the problem of decision-making when few training documents are available and when the documents contain a limited number of words. To this end we propose a new approach based on new term weightings, which allows us to determine accurately the weight to give to the words composing a document. To optimize the processing, the approach is configurable: five parameters make it adaptable to any given classification problem. Numerous experiments have been conducted on various types of documents, in different languages and under different configurations. Depending on the corpus, they show that our proposal achieves results superior to the best approaches in the literature on small-dataset problems.

    The use of parameters adds complexity, since their optimal values must then be determined. Detecting the best settings and best algorithms is a complicated task whose difficulty is theorized by the No-Free-Lunch theorem. We treat this second problem by proposing a new meta-classification approach based on notions of distance and semantic similarity. Specifically, we propose new meta-features suited to the context of document classification. This original approach achieves results comparable to the best approaches in the literature while providing additional qualities.

    In conclusion, the work presented in this manuscript has been integrated into several technical implementations: one in the Weka software, one in an industrial prototype, and a third in the product of the company that funded this work.
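    The abstract does not detail the proposed weightings or their five parameters; the sketch below computes the standard TF-IDF weights that such schemes refine, on a toy corpus of short documents:

```python
# Baseline TF-IDF term weighting on a tiny corpus of tokenized documents.
import math
from collections import Counter

docs = [["wolf", "forest", "child"], ["king", "child", "crown"],
        ["ethics", "reason", "virtue"], ["logic", "reason", "truth"]]

def tfidf(docs):
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))  # document frequency
    weights = []
    for d in docs:
        tf = Counter(d)
        weights.append({t: (tf[t] / len(d)) * math.log(n / df[t])
                        for t in tf})
    return weights

for w in tfidf(docs):
    print(w)
```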

    Efficient piecewise linear classifiers and applications

    Supervised learning has become an essential part of data mining for industry, military, science and academia. Classification, a type of supervised learning, allows a machine to learn from data in order to predict certain behaviours, variables or outcomes. Classification can be used to solve many problems, including the detection of malignant cancers and potentially bad creditors, and even enabling autonomy in robots. The ability to collect and store large amounts of data has increased significantly over the past few decades; however, the ability of classification techniques to deal with large-scale data has not kept pace. Many data transformation and reduction schemes have been tried, with mixed success. The problem is further exacerbated in real-time classification on embedded systems, where the classifier must work with only limited processing, memory and power resources. Piecewise linear boundaries are known to provide efficient real-time classifiers: they have low memory requirements, require little processing effort, are parameterless and classify in real time. Piecewise linear functions are used to approximate non-linear decision boundaries between pattern classes. Finding these piecewise linear boundaries is a difficult optimization problem that can require a long training time. Multiple optimization approaches have been used for real-time classification, but these can lead to suboptimal piecewise linear boundaries. This thesis develops three real-time piecewise linear classifiers that deal with large-scale data. Each classifier uses a single optimization algorithm in conjunction with an incremental approach that reduces the number of points as the decision boundaries are built. Two of the classifiers further reduce complexity by augmenting the incremental approach with additional schemes: one uses hyperboxes to identify points inside the so-called "indeterminate" regions; the other uses a polyhedral conic set to identify data points lying on or close to the boundary. All other points are excluded from the process of building the decision boundaries. The three classifiers are applied to real-time data classification problems, and the results of numerical experiments on real-world data sets are reported. These results demonstrate that the new classifiers require a reasonable training time and that their test set accuracy is consistently good on most data sets compared with current state-of-the-art classifiers.
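    To illustrate the general technique of piecewise linear boundaries (not the thesis's incremental algorithm or its hyperbox and polyhedral conic schemes), the sketch below splits each class into pieces, fits one linear function per piece, and classifies by the best-scoring piece:

```python
# Piecewise linear classifier: per-class linear pieces combined by max score.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, (400, 2))
y = (X[:, 0] * X[:, 1] > 0).astype(int)  # XOR-like, not linearly separable

def fit_pieces(X, y, k=2):
    """For each class, cluster its points into k pieces and fit one linear
    model per piece (piece vs. the opposite class)."""
    models = {0: [], 1: []}
    for label in (0, 1):
        own, other = X[y == label], X[y != label]
        assign = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(own)
        for c in range(k):
            piece = own[assign == c]
            Xp = np.vstack([piece, other])
            yp = np.r_[np.ones(len(piece)), np.zeros(len(other))]
            models[label].append(LogisticRegression().fit(Xp, yp))
    return models

def predict(models, X):
    """Class score = max linear score over that class's pieces."""
    scores = {lbl: np.max([m.decision_function(X) for m in ms], axis=0)
              for lbl, ms in models.items()}
    return (scores[1] > scores[0]).astype(int)

models = fit_pieces(X, y)
print("training accuracy:", (predict(models, X) == y).mean())
```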