
    Wavelet feature extraction and genetic algorithm for biomarker detection in colorectal cancer data

    Biomarkers that predict a patient's survival can play an important role in medical diagnosis and treatment. Selecting the significant biomarkers from hundreds of protein markers is a key step in survival analysis. In this paper, a novel method is proposed to detect prognostic biomarkers of survival in colorectal cancer patients using wavelet analysis, a genetic algorithm, and a Bayes classifier. The one-dimensional discrete wavelet transform (DWT) is normally used to reduce the dimensionality of biomedical data. In this study, the one-dimensional continuous wavelet transform (CWT) was proposed to extract the features of colorectal cancer data. The one-dimensional CWT cannot reduce the dimensionality of the data, but it captures features that the DWT misses and is therefore complementary to the DWT. A genetic algorithm was run on the extracted wavelet coefficients to select an optimized feature set, with a Bayes classifier providing its fitness function. The corresponding protein markers were located from the positions of the optimized features. Kaplan-Meier curves and a Cox regression model were used to evaluate the performance of the selected biomarkers. Experiments were conducted on a colorectal cancer dataset and several significant biomarkers were detected. A new protein biomarker, CD46, was found to be significantly associated with survival time.
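    The DWT step described above halves the dimensionality of a signal by splitting it into approximation (low-pass) and detail (high-pass) coefficients. A minimal sketch of one such level, using the Haar wavelet in pure Python; the function name and marker intensities are illustrative, not taken from the paper:

```python
import math

def haar_dwt(signal):
    """One-level Haar DWT: returns approximation and detail coefficients,
    each half the length of the (even-length) input signal."""
    approx, detail = [], []
    for i in range(0, len(signal) - 1, 2):
        a, b = signal[i], signal[i + 1]
        approx.append((a + b) / math.sqrt(2))  # local average (low-pass)
        detail.append((a - b) / math.sqrt(2))  # local difference (high-pass)
    return approx, detail

# Hypothetical protein-marker intensities for one patient (made-up values):
markers = [4.0, 2.0, 6.0, 6.0, 1.0, 3.0, 5.0, 7.0]
approx, detail = haar_dwt(markers)
print(len(approx))  # 4 — dimensionality halved
```

    In the paper's pipeline, coefficients like these (and the CWT coefficients, which keep the full length) would then be fed to the genetic algorithm for feature selection.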

    Data Science: Measuring Uncertainties

    With the increase in data processing and storage capacity, a large amount of data is available. Data without analysis does not have much value. Thus, the demand for data analysis is increasing daily, and the consequence is the appearance of a large number of jobs and published articles. Data science has emerged as a multidisciplinary field to support data-driven activities, integrating and developing ideas, methods, and processes to extract information from data. This includes methods built from different knowledge areas: Statistics, Computer Science, Mathematics, Physics, Information Science, and Engineering. This mixture of areas has given rise to what we call Data Science. New problems, and new solutions to them, are multiplying rapidly as large volumes of data are generated. Current and future challenges require greater care in creating solutions that suit the rationale of each type of problem. Labels such as Big Data, Data Science, Machine Learning, Statistical Learning, and Artificial Intelligence demand more sophistication in their foundations and in how they are applied. This highlights the importance of building the foundations of Data Science. This book is dedicated to solutions for, and discussions of, measuring uncertainties in data analysis problems.

    Fuzzy Logic

    Fuzzy Logic is becoming an essential method of solving problems in all domains. It has a tremendous impact on the design of autonomous intelligent systems. The purpose of this book is to introduce Hybrid Algorithms, Techniques, and Implementations of Fuzzy Logic. The book consists of thirteen chapters highlighting models and principles of fuzzy logic and issues in its techniques and implementations. The intended readers of this book are engineers, researchers, and graduate students interested in fuzzy logic systems.
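    The basic construct of the fuzzy logic systems the book introduces is the membership function, which assigns partial degrees of set membership instead of a crisp yes/no. A minimal sketch; the triangular shape and the "warm temperature" example are invented for illustration:

```python
def triangular_membership(x, a, b, c):
    """Degree of membership in a triangular fuzzy set that is 0 outside
    [a, c], rises linearly to 1 at the peak b, and falls back to 0."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# How "warm" is 22 degrees C on a fuzzy set spanning 15-30, peaking at 25?
print(triangular_membership(22, 15, 25, 30))  # 0.7
```

    A fuzzy controller combines several such sets ("cold", "warm", "hot") with rules, and defuzzifies the aggregated result into a crisp output.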

    A COMPARATIVE STUDY ON PERFORMANCE OF BASIC AND ENSEMBLE CLASSIFIERS WITH VARIOUS DATASETS

    Classification plays a critical role in machine learning (ML) systems for processing images, text, and high-dimensional data. Predicting class labels from training data is the primary goal of classification. An optimal model for a particular classification problem is chosen on the basis of the model's performance and execution time. This paper compares and analyses the performance of basic as well as ensemble classifiers using 10-fold cross-validation and also discusses their essential concepts, advantages, and disadvantages. In this study, five basic classifiers, namely Naïve Bayes (NB), Multi-layer Perceptron (MLP), Support Vector Machine (SVM), Decision Tree (DT), and Random Forest (RF), along with the ensemble of all five classifiers and a few more combinations, are compared on five University of California Irvine (UCI) ML Repository datasets and a Diabetes Health Indicators dataset from the Kaggle repository. To analyse and compare the performance of the classifiers, evaluation metrics such as Accuracy, Recall, Precision, Area Under Curve (AUC), and F-Score are used. Experimental results showed that SVM performs best on two of the six datasets (Diabetes Health Indicators and Waveform), RF performs best on the Arrhythmia, Sonar, and Tic-tac-toe datasets, and the best ensemble combination is found to be DT+SVM+RF on the Ionosphere dataset, with respective accuracies of 72.58%, 90.38%, 81.63%, 73.59%, 94.78%, and 94.01%; the proposed ensemble combinations outperformed the conventional models on a few datasets.
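    The simplest way to combine base classifiers like those above into an ensemble is majority voting over their predicted labels. A minimal sketch in pure Python; the three classifiers and their outputs below are made up for illustration, not results from the study:

```python
from collections import Counter

def majority_vote(predictions_per_model):
    """Combine per-model label predictions by majority vote.
    predictions_per_model: one list of predicted labels per classifier."""
    n_samples = len(predictions_per_model[0])
    combined = []
    for i in range(n_samples):
        votes = [preds[i] for preds in predictions_per_model]
        combined.append(Counter(votes).most_common(1)[0][0])  # modal label
    return combined

# Three hypothetical classifiers (e.g. DT, SVM, RF) voting on 4 samples:
dt  = ["pos", "neg", "pos", "neg"]
svm = ["pos", "pos", "pos", "neg"]
rf  = ["neg", "pos", "pos", "neg"]
print(majority_vote([dt, svm, rf]))  # ['pos', 'pos', 'pos', 'neg']
```

    Ties are broken by `Counter.most_common`'s ordering; a production ensemble would instead weight votes by per-model accuracy or average predicted probabilities.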

    Supervised Classification of Healthcare Text Data Based on Context-Defined Categories

    Achieving a good success rate in supervised classification of a text dataset, where the relationship between the text and its label can be extracted from the context but not from isolated words in the text, is still an important challenge facing the fields of statistics and machine learning. For this purpose, we present a novel mathematical framework. We then conduct a comparative study between established classification methods for the case where the relationship between the text and the corresponding label is clearly depicted by specific words in the text. In particular, we use logistic LASSO, artificial neural networks, support vector machines, and decision-tree-like procedures. This methodology is applied to a real case study involving mapping Consolidated Framework for Implementation Research (CFIR) constructs to health-related text data, and it achieves a prediction success rate of over 80% when the first 55% of the text, or more, is used for training and the remainder for testing. The results indicate that the methodology can be useful for accelerating the CFIR coding process. A.N.-R. is supported by Grant MTM2017-86061-C2-2-P funded by “ERDF A way of making Europe” and MCIN/AEI/10.13039/501100011033. For H.L.R., this study was funded by Instituto de Salud Carlos III through the project “PI17/02070” (co-funded by the European Regional Development Fund/European Social Fund “A way to make Europe”/“Investing in your future”) and the Basque Government Department of Health project “2017111086”. The funding bodies had no role in the design of the study; the collection, analysis, or interpretation of data; or the writing of the manuscript. The APC was paid by PI17/02070.
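    The word-based baselines mentioned above (logistic LASSO, SVMs, and similar) typically start from a bag-of-words representation, which is exactly what discards the context the paper targets. A minimal sketch of that representation; the texts and vocabulary are invented for illustration:

```python
def bag_of_words(texts):
    """Map each text to a vector of word counts over a shared vocabulary.
    Word order (and hence context) is deliberately lost."""
    vocab = sorted({w for t in texts for w in t.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    vectors = []
    for t in texts:
        v = [0] * len(vocab)
        for w in t.lower().split():
            v[index[w]] += 1
        vectors.append(v)
    return vocab, vectors

# Two hypothetical health-related snippets:
texts = ["patient reports pain", "clinic reports no pain"]
vocab, vecs = bag_of_words(texts)
print(vocab)  # ['clinic', 'no', 'pain', 'patient', 'reports']
```

    Because "no pain" and "pain" contribute the same `pain` count, label information carried only by context is invisible to models trained on these vectors, which motivates the contextual framework the abstract describes.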

    Intelligent Systems

    This book is dedicated to intelligent systems with broad-spectrum applications, such as personal and social biosafety, or the use of intelligent sensory micro- and nanosystems such as the "e-nose", "e-tongue", and "e-eye". In addition, effective information acquisition, knowledge management, and improved knowledge transfer in any medium, as well as the modeling of information content using meta- and hyper-heuristics and semantic reasoning, all benefit from the systems covered in this book. Intelligent systems can also be applied in education, in generating intelligent distributed eLearning architectures, and in a large number of technical fields, such as industrial design, manufacturing, and utilization, e.g., in precision agriculture, cartography, electric power distribution systems, intelligent building management systems, and drilling operations. Furthermore, decision making using fuzzy logic models, computational recognition of comprehension uncertainty, the joint synthesis of goals and means for intelligent behavior in biosystems, and diagnostics and human support in the healthcare environment have also been made easier.

    Decision Support Systems for Risk Assessment in Credit Operations Against Collateral

    With the global economic crisis, which reached its peak in the second half of 2008, and facing a market shaken by economic instability, financial institutions took steps to protect themselves against default risk, which directly affected the way credit analysis is performed for individuals and for corporate entities. To mitigate risk in credit operations, most banks use a graded scale of customer risk, which determines the provision that banks must make according to the default risk level of each credit transaction. Credit analysis involves the ability to make a credit decision in a scenario of uncertainty, constant change, and incomplete information. This ability depends on the capacity to logically analyze situations that are often complex and to reach a clear, practical, and implementable conclusion. Credit scoring models are used to predict the probability that a customer applying for credit will default at a given time, based on personal and financial information that may influence the client's ability to pay the debt. This estimated probability, called the score, is an estimate of the risk of default of a customer in a given period. This increased concern has been caused in no small part by the weaknesses of existing risk management techniques revealed by the recent financial crisis and by the growing demand for consumer credit. The constant change affects several banking sections because it hampers the analysis of the data produced and stored in computers, which too often depends on manual techniques. Among the many alternatives used worldwide to balance this risk, the provision of collateral in the formalization of credit agreements stands out. In theory, the collateral does not ensure the return of the credit, as it is not computed as payment of the obligation within the project.
There is also the fact that it will only be effective if triggered, which involves the legal department of the banking institution. The truth is that collateral is a mitigating element of credit risk. Collaterals are divided into two types: the personal guarantee (surety) and the asset guarantee (fiduciary). Both aim to increase security in credit operations, serving as a payment alternative for the holder of the credit if the borrower is unable to meet its obligations on time. For the creditor, it provides liquidity security for the receiving operation. The measurement of credit recoverability is a system that evaluates the efficiency of the mechanism for recovering the capital invested against collateral. In an attempt to identify the sufficiency of collateral in credit operations, this thesis presents an assessment of intelligent classifiers that use contextual information to assess whether collaterals allow the recovery of granted credit to be predicted in the decision-making process, before the credit transaction becomes insolvent. The results, when compared with other approaches in the literature, together with a comparative analysis of the most relevant artificial intelligence solutions, show that classifiers that use collateral as a parameter to calculate risk contribute to the advance of the state of the art, increasing commitment to the financial institutions.
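    The credit score described above is an estimated default probability, conventionally produced by a logistic model over borrower features. A minimal sketch; the coefficients and the borrower's features are invented numbers, not fitted to any real data:

```python
import math

def default_probability(features, weights, bias):
    """Logistic credit-scoring model: P(default) = 1 / (1 + exp(-(w.x + b)))."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical borrower: [debt-to-income ratio, years employed, prior defaults]
borrower = [0.45, 3.0, 1.0]
weights  = [2.0, -0.3, 1.5]   # made-up coefficients
score = default_probability(borrower, weights, bias=-1.0)
print(round(score, 3))  # 0.622 — estimated probability of default
```

    A classifier that also takes collateral into account, as the thesis proposes, would extend the feature vector with collateral type and coverage ratio, letting the model estimate recoverability rather than default alone.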

    Reliability assessment of manufacturing systems: A comprehensive overview, challenges and opportunities

    Reliability assessment refers to the process of evaluating the reliability of components or systems during their lifespan or prior to their implementation. In the manufacturing industry, the reliability of systems is directly linked to production efficiency, product quality, energy consumption, and other crucial performance indicators. Therefore, reliability plays a critical role in every aspect of manufacturing. In this review, we provide a comprehensive overview of the most significant advancements and trends in the assessment of manufacturing system reliability. To this end, we also consider the three main facets of reliability analysis of cyber–physical systems, i.e., hardware, software, and human-related reliability. Beyond the literature overview, we derive challenges and opportunities for the reliability assessment of manufacturing systems based on the reviewed literature. Identified challenges encompass aspects like failure data availability and quality, fast-paced technological advancements, and the increasing complexity of manufacturing systems. In turn, the opportunities include the potential for integrating various assessment methods and leveraging data to automate the assessment process and to increase the accuracy of derived reliability models.
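    A basic building block of the hardware-reliability assessments surveyed above is the series-system model: a production line works only if every station works. A minimal sketch; the three station reliabilities are invented numbers:

```python
def series_reliability(component_reliabilities):
    """Reliability of a series system with independent failures:
    R_system = R1 * R2 * ... * Rn. Adding components can only lower it."""
    r = 1.0
    for ri in component_reliabilities:
        r *= ri
    return r

# Hypothetical three-station manufacturing line:
print(series_reliability([0.99, 0.95, 0.98]))  # 0.92169
```

    Note how the system reliability (about 92%) falls below that of its weakest station (95%), which is why line length and redundancy figure so heavily in manufacturing reliability models.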