Statistical methods for tissue array images - algorithmic scoring and co-training
Recent advances in tissue microarray technology have allowed
immunohistochemistry to become a powerful medium-to-high throughput analysis
tool, particularly for the validation of diagnostic and prognostic biomarkers.
However, as study size grows, the manual evaluation of these assays becomes a
prohibitive limitation; it vastly reduces throughput and greatly increases
variability and expense. We propose an algorithm - Tissue Array Co-Occurrence
Matrix Analysis (TACOMA) - for quantifying cellular phenotypes based on
textural regularity summarized by local inter-pixel relationships. The
algorithm can be easily trained for any staining pattern, has no sensitive
tuning parameters, and can report the salient pixels in an image that
contribute to its score. Pathologists' input via informative training patches
is an important aspect of the algorithm, allowing training for any specific
marker or cell type. With co-training, the error rate
of TACOMA can be reduced substantially for a very small training sample (e.g.,
with size 30). We give theoretical insights into the success of co-training via
thinning of the feature set in a high-dimensional setting when there is
"sufficient" redundancy among the features. TACOMA is flexible, transparent and
provides a scoring process that can be evaluated with clarity and confidence.
In a study based on an estrogen receptor (ER) marker, we show that TACOMA is
comparable to, or outperforms, pathologists' performance in terms of accuracy
and repeatability.
Comment: Published at http://dx.doi.org/10.1214/12-AOAS543 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org).
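TACOMA scores images from co-occurrence statistics of nearby pixel values. As a rough illustration of that idea, and not the authors' implementation, the Python sketch below computes a gray-level co-occurrence matrix (GLCM) and a few standard texture summaries with scikit-image; the random patch, 16-level quantization and pixel offsets are illustrative assumptions.

```python
# Generic GLCM texture features in the spirit of TACOMA's local
# inter-pixel statistics; NOT the paper's actual algorithm.
import numpy as np
from skimage.feature import graycomatrix, graycoprops

rng = np.random.default_rng(0)
patch = rng.integers(0, 16, size=(64, 64), dtype=np.uint8)  # stand-in for a TMA image patch

# Count pixel pairs at distance 1 in four directions, quantized to 16 gray levels.
glcm = graycomatrix(patch, distances=[1],
                    angles=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
                    levels=16, symmetric=True, normed=True)

# Summarize textural regularity; a classifier trained on pathologist-labeled
# patches could then score whole images from such feature vectors.
features = {prop: graycoprops(glcm, prop).mean()
            for prop in ("contrast", "homogeneity", "energy", "correlation")}
print(features)
```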
Predictive Maintenance of an External Gear Pump using Machine Learning Algorithms
Predictive Maintenance is critical for engineering industries such as manufacturing, aerospace and energy. Unexpected failures cause unpredictable downtime, which can be disruptive and incur high costs due to reduced productivity. This forces industries to ensure the reliability of their equipment. To increase the reliability of equipment, maintenance actions such as repairs, replacements, equipment updates and corrective actions are employed. These actions affect flexibility, quality of operation and manufacturing time. It is therefore essential to plan maintenance before failure occurs.

Traditional maintenance techniques rely on checks conducted routinely based on the running hours of the machine. The drawback of this approach is that maintenance is sometimes performed before it is required. Conducting maintenance based on the actual condition of the equipment is therefore the optimal solution. This requires collecting real-time data on the condition of the equipment using sensors (to detect events and send information to a computer processor). Predictive Maintenance uses such techniques and analytics to inform about the current and future state of the equipment. In the last decade, with the introduction of the Internet of Things (IoT), Machine Learning (ML), cloud computing and Big Data Analytics, the manufacturing industry has moved towards implementing Predictive Maintenance, resulting in increased uptime and quality control, optimisation of maintenance routes, improved worker safety and greater productivity.

The present thesis describes a novel computational strategy for Predictive Maintenance (fault diagnosis and fault prognosis) with ML and Deep Learning applications for an FG304 series external gear pump, also known as a domino pump. In the absence of a comprehensive set of experimental data, synthetic data generation techniques are implemented by perturbing the frequency content of time series generated using high-fidelity computational techniques. In addition, various feature extraction methods are considered to extract the most discriminatory information from the data. For fault diagnosis, three ML classification algorithms are employed, namely Multilayer Perceptron (MLP), Support Vector Machine (SVM) and Naive Bayes (NB). For prognosis, ML regression algorithms, such as MLP and SVM, are utilised. Although significant work has been reported by previous authors, it remains difficult to optimise the choice of hyper-parameters (important parameters whose values control the learning process) for each specific ML algorithm, for instance the type of SVM kernel function, or the selection of the MLP activation function and the optimum number of hidden layers (and neurons).

It is widely understood that the reliability of ML algorithms depends strongly on the existence of a sufficiently large quantity of high-quality training data. In the present thesis, due to the unavailability of experimental data, a novel high-fidelity in-silico dataset is generated via a Computational Fluid Dynamics (CFD) model and used for training the underlying ML metamodel. In addition, a large number of scenarios are recreated, ranging from healthy to faulty (e.g. clogging, radial gap variations, axial gap variations, viscosity variations, speed variations).
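As a hedged sketch of this frequency-content perturbation for synthetic data generation, the following Python snippet jitters the magnitude and phase spectrum of a time series; the perturbation scale and the toy stand-in for a high-fidelity CFD trace are assumptions, not the thesis's actual pipeline.

```python
# Synthetic-sample generation by perturbing the frequency content of a
# clean time series (illustrative sketch, not the thesis's code).
import numpy as np

def perturb_spectrum(signal, scale=0.05, rng=None):
    """Return a new sample with jittered spectral magnitudes and phases."""
    rng = rng or np.random.default_rng()
    spec = np.fft.rfft(signal)
    mag = np.abs(spec) * (1.0 + scale * rng.standard_normal(spec.shape))  # multiplicative noise
    phase = np.angle(spec) + scale * rng.standard_normal(spec.shape)      # phase jitter
    return np.fft.irfft(mag * np.exp(1j * phase), n=len(signal))

# Stand-in for one high-fidelity pressure/flow trace from the CFD model.
t = np.linspace(0, 1, 2048, endpoint=False)
clean = np.sin(2 * np.pi * 50 * t) + 0.3 * np.sin(2 * np.pi * 120 * t)
synthetic = [perturb_spectrum(clean, rng=np.random.default_rng(i)) for i in range(100)]
```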
Furthermore, the high-fidelity dataset is augmented using degradation functions to predict the remaining useful life (fault prognosis) of an external gear pump. The thesis explores and compares the performance of the MLP, SVM and NB algorithms for fault diagnosis, and of MLP and SVM for fault prognosis. To enable fast training and reliable testing of the MLP algorithm, predefined network architectures, such as 2^n neurons per hidden layer, are used to speed up the identification of a suitable number of neurons (shown to be useful when the sample data set is sufficiently large). Finally, a series of benchmark tests is presented, from which it is concluded that, for fault diagnosis, the combination of wavelet features and an MLP algorithm provides the best accuracy, while the MLP algorithm also provides the best prediction results for fault prognosis. In addition, benchmark examples are simulated to demonstrate mesh convergence for the CFD model, and quantification analysis and the influence of noise on training data are examined for the ML algorithms.
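A minimal sketch of the diagnosis benchmark described above, assuming scikit-learn implementations of the three classifiers and generic feature vectors in place of the wavelet features; the 2^n hidden-layer grid mirrors the predefined MLP architectures mentioned, while the synthetic data and hyperparameter grids are illustrative.

```python
# Compare MLP, SVM and Naive Bayes on a toy fault-diagnosis task
# (illustrative sketch; data and hyperparameter grids are assumptions).
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=600, n_features=20, n_classes=3,
                           n_informative=8, random_state=0)  # healthy vs. two fault types
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Predefined architectures with 2^n neurons per hidden layer narrow
# down the network size quickly, as suggested above.
mlp = GridSearchCV(MLPClassifier(max_iter=2000, random_state=0),
                   {"hidden_layer_sizes": [(2 ** n,) for n in range(3, 8)]}, cv=3)

for name, clf in [("MLP", mlp), ("SVM", SVC(kernel="rbf")), ("NB", GaussianNB())]:
    clf.fit(X_tr, y_tr)
    print(name, f"test accuracy = {clf.score(X_te, y_te):.3f}")
```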
Robust classification of advanced power quality disturbances in smart grids
The insertion of new devices, increased data flow, intermittent generation and massive computerization have considerably increased the complexity of current electrical systems. This increase has made changes necessary, such as the need for more intelligent electrical networks that can adapt to this different reality. Artificial Intelligence (AI) plays an important role in society, especially through techniques based on the learning process, and this role extends to power systems. In the context of Smart Grids (SG), where information and innovative monitoring solutions are a primary concern, AI-based techniques find several applications. This dissertation investigates the use of advanced signal processing and ML algorithms to create a robust classifier of advanced Power Quality (PQ) disturbances in SG. For this purpose, known models of PQ disturbances were generated with random elements to approximate real applications, and from these models thousands of signals exhibiting these disturbances were produced. Signal processing techniques using the Discrete Wavelet Transform (DWT) were applied to extract the signals' main characteristics, and ML algorithms were used to classify these data according to their respective features. The ML algorithms were trained, validated and tested, and the accuracy and confusion matrix of each were analyzed, relating the logic behind the results. The stages of data generation, feature extraction and optimization were performed in the MATLAB software. The Classification Learner toolbox was used to train, validate and test 27 different ML algorithms and to assess the performance of each. All stages of the work were planned in advance, enabling their correct development and execution. The results show that the Cubic Support Vector Machine (SVM) classifier achieved the maximum accuracy of all the algorithms, indicating the effectiveness of the proposed method for classification. The results are discussed with considerations explaining the performance of each technique, the relations among the techniques and their respective justifications.

The insertion of new devices into the network, increased data flow, intermittent generation and massive computerization have considerably increased the complexity of today's electrical systems. This increase has made changes necessary, such as the need for smarter electrical networks that can adapt to this different reality. The new generation of Artificial Intelligence techniques, represented by Big Data, Machine Learning, Deep Learning and Pattern Recognition, marks a new era in society and in global development based on information and knowledge. With the latest Smart Grids, the use of techniques employing this type of intelligence will be even more necessary. This dissertation investigates the use of advanced signal processing together with Machine Learning algorithms to develop a robust classifier of power quality disturbances in the context of Smart Grids. To this end, known models of several power quality problems were generated together with random noise so that the system would resemble real applications. From these models, thousands of signals were generated, and the Discrete Wavelet Transform was used to extract the main characteristics of these disturbances. This dissertation aims to use algorithms based on the concept of Machine Learning to classify the generated data according to their classes. All of these algorithms were trained, validated and finally tested. In addition, the accuracy and confusion matrix of each model were presented and analyzed. The stages of data generation, extraction of the main characteristics and data optimization were performed in the MATLAB software. A specific toolbox of this program was used to train, validate and test the 27 different algorithms and evaluate the performance of each. All stages of the work were planned in advance, enabling their correct development and execution. The results show that the Cubic Support Vector Machine classifier achieved the highest accuracy among all the algorithms, indicating the effectiveness of the proposed method for classification. Considerations about the results are interpreted, for example by explaining the performance of each technique, their relations and respective justifications.
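As a rough illustration of the DWT feature extraction described above, and not the dissertation's MATLAB pipeline, the Python sketch below uses PyWavelets to decompose a synthetic voltage-sag signal and form normalized band-energy features; the wavelet family, decomposition depth and sampling rate are assumptions.

```python
# DWT band-energy features for a synthetic PQ disturbance (illustrative).
import numpy as np
import pywt

fs = 3840                                   # 64 samples per cycle at 60 Hz (assumed)
t = np.arange(0, 0.2, 1 / fs)
v = np.sin(2 * np.pi * 60 * t)
v[(t > 0.05) & (t < 0.15)] *= 0.5           # model a voltage sag
v += 0.01 * np.random.default_rng(0).standard_normal(t.shape)  # random noise element

# Multi-level DWT; the energy in each band is a common compact feature
# vector fed to PQ disturbance classifiers.
coeffs = pywt.wavedec(v, "db4", level=5)    # [cA5, cD5, ..., cD1]
features = np.array([np.sum(c ** 2) for c in coeffs])
features /= features.sum()                  # normalized band energies
print(features.round(4))
```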
HYPOTHYROID DISEASE ANALYSIS BY USING MACHINE LEARNING
Thyroid illness frequently manifests as hypothyroidism, and people with hypothyroidism are predominantly female. Because the majority of people are unaware of the illness, it is quickly becoming more serious. It is crucial to catch it early so that medical professionals can treat it more effectively and prevent it from getting worse. Disease prediction is a challenging task, but machine learning aids it greatly, and dedicated feature selection strategies have further eased the process. To properly monitor and treat this illness, accurate detection is essential. In order to build models that can forecast the development of hypothyroidism, this project utilized machine learning approaches such as Logistic Regression, Decision Trees, and Naive Bayes, applied to thyroid function-related measures and characteristics from a UCI Machine Learning Repository dataset. The main goals were to properly assess each machine learning model's performance and fine-tune its hyperparameters. With an accuracy rate of 99.87%, the findings of this study demonstrate the models' ability to predict hypothyroidism remarkably well. This high degree of accuracy shows how useful these machine learning algorithms can be as early diagnostic and therapeutic tools for hypothyroid patients, and it demonstrates the potential of machine learning in healthcare and its impact on diagnosis.
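A minimal sketch of the pipeline described above, assuming scikit-learn implementations, a hypothetical local CSV copy of the UCI thyroid data with a "target" column, and illustrative hyperparameter grids rather than the project's tuned values.

```python
# Hypothyroidism prediction with Logistic Regression, a Decision Tree and
# Naive Bayes (illustrative sketch; file name and grids are assumptions).
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

df = pd.read_csv("hypothyroid.csv")              # hypothetical preprocessed copy of the UCI data
X, y = df.drop(columns="target"), df["target"]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

models = {
    "logistic regression": GridSearchCV(
        make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
        {"logisticregression__C": [0.1, 1, 10]}, cv=5),
    "decision tree": GridSearchCV(
        DecisionTreeClassifier(random_state=0),
        {"max_depth": [3, 5, 10, None]}, cv=5),
    "naive bayes": GaussianNB(),
}
for name, model in models.items():              # tune, then report held-out accuracy
    model.fit(X_tr, y_tr)
    print(name, f"accuracy = {model.score(X_te, y_te):.4f}")
```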
A survey of machine learning techniques applied to self organizing cellular networks
In this paper, a survey of the literature of the past fifteen years on Machine Learning (ML) algorithms applied to self-organizing cellular networks is presented. For future networks to overcome current limitations and address the issues of today's cellular systems, it is clear that more intelligence needs to be deployed so that a fully autonomous and flexible network can be enabled. This paper focuses on the learning perspective of Self Organizing Networks (SON) solutions and provides not only an overview of the most common ML techniques encountered in cellular networks, but also a classification of each paper in terms of its learning solution, together with examples. The authors also classify each paper in terms of its self-organizing use case and discuss how each proposed solution performed. In addition, the most commonly found ML algorithms are compared in terms of certain SON metrics, and general guidelines on when to choose each ML algorithm for each SON function are proposed. Lastly, this work provides future research directions and describes the new paradigms that the use of more robust and intelligent algorithms, together with data gathered by operators, can bring to the cellular networks domain, fully enabling the concept of SON in the near future.
Data Driven Sample Generator Model with Application to Classification
Despite rapidly growing interest, progress in the study of relations between physiological abnormalities and mental disorders is hampered by the complexity of the human brain and the high cost of data collection. The complexity can be captured by machine learning approaches, but these may still require significant amounts of data. In this thesis, we seek to mitigate the latter challenge by developing a data-driven sample generator model that produces synthetic, realistic training data. Our method greatly improves generalization in the classification of schizophrenia patients and healthy controls from their structural magnetic resonance images. A feed-forward neural network trained exclusively on continuously generated synthetic data produces the best area under the curve compared to classifiers trained on real data alone.
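The "train on continuously generated synthetic data" idea can be sketched as follows, with a class-conditional Gaussian sampler standing in for the learned generator and toy feature vectors standing in for structural MRI data; none of this is the author's actual model.

```python
# Feed-forward classifier trained on a stream of fresh synthetic batches
# (illustrative sketch; generator and data are stand-ins).
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

def synthetic_batch(n=128):
    """Stand-in generator: class-conditional Gaussians in feature space."""
    y = rng.integers(0, 2, n)                      # 0 = control, 1 = patient
    X = rng.standard_normal((n, 50)) + 0.8 * y[:, None]
    return X, y

X_real, y_real = synthetic_batch(500)              # toy stand-in for held-out real data

clf = MLPClassifier(hidden_layer_sizes=(64,), random_state=0)
for step in range(200):                            # each step sees a brand-new batch
    X, y = synthetic_batch()
    clf.partial_fit(X, y, classes=[0, 1])
print(f"held-out accuracy: {clf.score(X_real, y_real):.3f}")
```

Because every batch is freshly sampled, the classifier never sees the same example twice, which plausibly underlies the improved generalization reported above.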