1,933 research outputs found

    Online Tool Condition Monitoring Based on Parsimonious Ensemble+

    Full text link
    Accurate diagnosis of tool wear in the metal turning process remains an open challenge for both scientists and industrial practitioners because of inhomogeneities in workpiece material, nonstationary machining settings to suit production requirements, and nonlinear relations between measured variables and tool wear. Common methodologies for tool condition monitoring still rely on batch approaches which cannot cope with the fast sampling rate of the metal cutting process. Furthermore, they require a retraining process to be completed from scratch when dealing with a new set of machining parameters. This paper presents an online tool condition monitoring approach based on Parsimonious Ensemble+ (pENsemble+). The unique feature of pENsemble+ lies in its highly flexible principle, where both the ensemble structure and the base-classifier structure can automatically grow and shrink on the fly based on the characteristics of data streams. Moreover, an online feature selection scenario is integrated to actively sample relevant input attributes. The paper presents an advancement of the newly developed ensemble learning algorithm pENsemble+, in which an online active learning scenario is incorporated to reduce operator labelling effort. An ensemble merging scenario is proposed which allows a reduction of ensemble complexity while retaining its diversity. Experimental studies utilising real-world manufacturing data streams and comparisons with well-known algorithms were carried out. Furthermore, the efficacy of pENsemble was examined using benchmark concept drift data streams. It has been found that pENsemble+ incurs low structural complexity and results in a significant reduction of operator labelling effort. Comment: this paper has been published in IEEE Transactions on Cybernetics
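
    The abstract gives no implementation details, but the grow-and-shrink idea can be illustrated with a minimal sketch: an online ensemble that adds a new base classifier when its recent error rate rises (a crude drift signal) and drops the oldest member once a size budget is exceeded. The class name, the drift rule, the thresholds, and the synthetic stream below are illustrative assumptions, not the pENsemble+ algorithm itself.

        # Minimal grow-and-shrink online ensemble sketch (illustrative only; not pENsemble+).
        # Assumes numpy and scikit-learn are available.
        from collections import deque
        import numpy as np
        from sklearn.linear_model import SGDClassifier

        class GrowShrinkEnsemble:
            def __init__(self, classes, max_members=5, window=200, error_threshold=0.35):
                self.classes = classes
                self.max_members = max_members          # size budget before shrinking
                self.window = deque(maxlen=window)      # recent 0/1 prediction errors
                self.error_threshold = error_threshold  # crude drift signal
                self.members = [self._new_member()]

            def _new_member(self):
                return SGDClassifier(random_state=0)

            def predict(self, x):
                votes = [m.predict(x.reshape(1, -1))[0] for m in self.members
                         if hasattr(m, "coef_")]
                if not votes:
                    return self.classes[0]
                return max(set(votes), key=votes.count)  # simple majority vote

            def partial_fit(self, x, y):
                # Track the ensemble's recent error rate as a proxy for concept drift.
                self.window.append(int(self.predict(x) != y))
                if len(self.window) == self.window.maxlen and np.mean(self.window) > self.error_threshold:
                    self.members.append(self._new_member())  # grow on suspected drift
                    self.window.clear()
                if len(self.members) > self.max_members:
                    self.members.pop(0)                      # shrink: drop the oldest member
                for m in self.members:
                    m.partial_fit(x.reshape(1, -1), np.array([y]), classes=self.classes)

        # Tiny synthetic stream with an abrupt concept change halfway through.
        rng = np.random.default_rng(0)
        X = rng.normal(size=(1000, 3))
        y = (X[:, 0] > 0).astype(int)
        y[500:] = (X[500:, 1] > 0).astype(int)   # the decision rule changes: concept drift
        ens = GrowShrinkEnsemble(classes=np.array([0, 1]))
        for xi, yi in zip(X, y):
            ens.partial_fit(xi, yi)
        print("final ensemble size:", len(ens.members))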

    Learning, Categorization, Rule Formation, and Prediction by Fuzzy Neural Networks

    Full text link
    National Science Foundation (IRI 94-01659); Office of Naval Research (N00014-91-J-4100, N00014-92-J-4015); Air Force Office of Scientific Research (90-0083, N00014-92-J-4015)

    Context-Specific Preference Learning of One Dimensional Quantitative Geospatial Attributes Using a Neuro-Fuzzy Approach

    Get PDF
    Change detection is a topic of great importance for modern geospatial information systems. Digital aerial imagery provides an excellent medium to capture geospatial information. Rapidly evolving environments and the availability of increasing amounts of diverse, multiresolutional imagery bring forward the need for frequent updates of these datasets. Analysis and query of spatial data using potentially outdated data may yield results that are sometimes invalid. Due to measurement errors (systematic, random) and incomplete knowledge of information (uncertainty), it is ambiguous whether a change in a spatial dataset has really occurred. Therefore, we need to develop reliable, fast, and automated procedures that will effectively report, based on information from a new image, whether a change has actually occurred or is simply the result of uncertainty. This thesis introduces a novel methodology for change detection in spatial objects using aerial digital imagery. The uncertainty of the extraction is used as a quality estimate in order to determine whether change has occurred. For this goal, we develop a fuzzy-logic system to estimate uncertainty values from the results of automated object extraction using active contour models (a.k.a. snakes). The differential snakes change detection algorithm is an extension of traditional snakes that incorporates previous information (i.e., shape of object and uncertainty of extraction) as energy functionals. This process is followed by a procedure in which we examine the improvement of the uncertainty in the absence of change (versioning). Also, we introduce a post-extraction method for improving the object extraction accuracy. In addition to linear objects, in this thesis we extend differential snakes to track deformations of areal objects (e.g., lake flooding, oil spills). From the polygonal description of a spatial object we can track its trajectory and areal changes. Differential snakes can also be used as the basis for similarity indices for areal objects. These indices are based on areal moments that are invariant under general affine transformation. Experimental results of the differential snakes change detection algorithm demonstrate its performance. More specifically, we show that differential snakes minimize false positives in change detection and reliably track object deformations
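
    As a rough illustration of the core idea of using extraction uncertainty to decide whether change has occurred, the sketch below combines two hypothetical fuzzy memberships (weak image edges and deviation from the prior shape) into an uncertainty score, and reports change only when the discrepancy between old and new outlines exceeds what that uncertainty could explain. The membership shapes, the decision margin, and the example inputs are assumptions for illustration; they are not the fuzzy system developed in the thesis.

        # Illustrative sketch (not the thesis's fuzzy system): combine two uncertainty
        # cues about an extracted outline into a fuzzy-style score, then report change
        # only when the observed discrepancy exceeds what uncertainty could explain.
        # All inputs are normalized to [0, 1]; membership shapes and the margin are assumed.
        import numpy as np

        def membership_weak_edges(edge_strength):
            # Higher uncertainty when the supporting image edges are weak.
            return float(np.clip(1.0 - edge_strength, 0.0, 1.0))

        def membership_shape_deviation(deviation):
            # Higher uncertainty when the outline deviates strongly from the prior shape.
            return float(np.clip(deviation, 0.0, 1.0))

        def extraction_uncertainty(edge_strength, deviation):
            # Simple fuzzy aggregation: the stronger of the two cues dominates (max operator).
            return max(membership_weak_edges(edge_strength), membership_shape_deviation(deviation))

        def change_detected(discrepancy, uncertainty, margin=0.2):
            # Declare change only if the old/new outline discrepancy clearly exceeds
            # what the extraction uncertainty alone could account for.
            return discrepancy - uncertainty > margin

        u = extraction_uncertainty(edge_strength=0.9, deviation=0.1)   # confident extraction
        print(u, change_detected(discrepancy=0.6, uncertainty=u))      # prints roughly 0.1 and True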

    Dynamic Data Mining: Methodology and Algorithms

    No full text
    Supervised data stream mining has become an important and challenging data mining task in modern organizations. The key challenges are threefold: (1) a possibly infinite number of streaming examples and time-critical analysis constraints; (2) concept drift; and (3) skewed data distributions. To address these three challenges, this thesis proposes the novel dynamic data mining (DDM) methodology by effectively applying supervised ensemble models to data stream mining. DDM can be loosely defined as categorization-organization-selection of supervised ensemble models. It is inspired by the idea that although the underlying concepts in a data stream are time-varying, their distinctions can be identified. Therefore, the models trained on the distinct concepts can be dynamically selected in order to classify incoming examples of similar concepts. First, following the general paradigm of DDM, we examine the different concept-drifting stream mining scenarios and propose corresponding effective and efficient data mining algorithms.
    • To address concept drift caused merely by changes of variable distributions, which we term pseudo concept drift, base models built on categorized streaming data are organized and selected in line with their corresponding variable distribution characteristics.
    • To address concept drift caused by changes of variable and class joint distributions, which we term true concept drift, an effective data categorization scheme is introduced. A group of working models is dynamically organized and selected for reacting to the drifting concept.
    Second, we introduce an integration stream mining framework, enabling the paradigm advocated by DDM to be widely applicable to other stream mining problems. We are therefore able to easily introduce six effective algorithms for mining data streams with skewed class distributions. In addition, we also introduce a new ensemble model approach for batch learning, following the same methodology. Both theoretical and empirical studies demonstrate its effectiveness. Future work would be targeted at improving the effectiveness and efficiency of the proposed algorithms. In the meantime, we would explore the possibilities of using the integration framework to solve other open stream mining research problems
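
    To make the categorization-organization-selection idea concrete, here is a minimal, hypothetical sketch: each trained model is stored together with a signature of the chunk it was trained on (here simply the feature mean), and an incoming chunk is classified by the pooled model whose signature is nearest. This only illustrates dynamic model selection in the spirit of DDM; it is not one of the algorithms proposed in the thesis.

        # Sketch of dynamic model selection: keep a pool of models keyed by a concept
        # signature (feature mean of the training chunk) and pick the nearest one.
        # The signature choice and distance measure are illustrative assumptions.
        import numpy as np
        from sklearn.tree import DecisionTreeClassifier

        class ConceptModelPool:
            def __init__(self):
                self.models = []      # list of (signature, fitted model)

            def add_chunk(self, X, y):
                model = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
                self.models.append((X.mean(axis=0), model))

            def predict(self, X):
                signature = X.mean(axis=0)
                # Select the model trained on the most similar concept (closest signature).
                _, model = min(self.models, key=lambda sm: np.linalg.norm(sm[0] - signature))
                return model.predict(X)

        rng = np.random.default_rng(1)
        pool = ConceptModelPool()
        # Two distinct "concepts": the class depends on feature 0 in one, feature 1 in the other.
        Xa = rng.normal(0, 1, (300, 2))
        pool.add_chunk(Xa, (Xa[:, 0] > 0).astype(int))
        Xb = rng.normal(3, 1, (300, 2))
        pool.add_chunk(Xb, (Xb[:, 1] > 3).astype(int))
        Xnew = rng.normal(3, 1, (5, 2))           # new chunk resembling the second concept
        print(pool.predict(Xnew))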

    Connectionist Taxonomy Learning

    Get PDF
    The thesis presents an unsupervised connectionist network using a spreading activation mechanism. By means of self-organization, the network is capable of creating a taxonomy of concepts which serves as a backbone for a respective ontology. The system is a biologically inspired constructivist hybrid between connectionist networks using distributed and localist data representation. Unlike most currently developed models, it is capable of dealing with analog signals and displays cognitive properties of the categorization process. The thesis presents a general overview of the system’s architecture and the method of network build-up, and shows the results of several experiments exploring the nature of categorization performed with the described network
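
    A toy sketch of the spreading-activation mechanism the network relies on (not the thesis's architecture): activation injected at a source concept propagates over weighted links with decay, so that closely connected concepts end up with related activation levels. The concept graph, weights, and decay value are made up for illustration.

        # Toy spreading activation over a small weighted concept graph (illustrative only).
        import numpy as np

        concepts = ["animal", "bird", "canary", "fish"]
        # Symmetric link weights between concepts (rows/columns follow the list above).
        W = np.array([
            [0.0, 0.6, 0.0, 0.5],
            [0.6, 0.0, 0.8, 0.0],
            [0.0, 0.8, 0.0, 0.0],
            [0.5, 0.0, 0.0, 0.0],
        ])

        def spread(source, steps=3, decay=0.5):
            a = np.zeros(len(concepts))
            a[concepts.index(source)] = 1.0
            for _ in range(steps):
                a = a + decay * W @ a          # each step passes decayed activation to neighbours
                a = a / a.max()                # normalise so activation stays bounded
            return dict(zip(concepts, a.round(2)))

        print(spread("canary"))   # "bird" ends up far more activated than "fish"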

    Responding to the COVID-19 crisis: a principled or pragmatist approach?

    Get PDF
    Uncertainties run deep during a crisis. Yet, leaders will have to make critical decisions in the absence of information they would like to have. How do political leaders cope with this challenge? One way to deal with crisis-induced uncertainty is to base all decisions on a core principle or value. This is what we call a principled approach. The pragmatist approach offers an alternative: an experimental, trial-and-error strategy based on quick feedback. In this paper, we consider both approaches in light of the COVID-19 experience in four European countries. We conclude that the pragmatist approach may be superior in theory, but is hard to effectuate in practice. We discuss implications for the practice of strategic crisis management

    Solving the challenges of concept drift in data stream classification.

    Get PDF
    The rise of network-connected devices and applications leads to a significant increase in the volume of data that are continuously generated over time, called data streams. In real-world applications, storing the entirety of a data stream for later analysis is often not practical, due to the data stream’s potentially infinite volume. Data stream mining techniques and frameworks are therefore created to analyze streaming data as they arrive. However, compared to traditional data mining techniques, challenges unique to data stream mining also emerge, due to the high arrival rate of data streams and their dynamic nature. In this dissertation, an array of techniques and frameworks are presented to improve the solutions to some of these challenges. First, this dissertation acknowledges that a “no free lunch” theorem exists for data stream mining, where no silver-bullet solution can solve all problems of data stream mining. The dissertation focuses on detecting changes of data distribution in data stream mining. These changes are called concept drift. Concept drift can be categorized into many types. A detection algorithm often works only on some types of drift, but not all of them. Because of this, the dissertation finds specific techniques to solve specific challenges, instead of looking for a general solution. Then, this dissertation considers improving solutions for the challenge of the high arrival rate of data streams. Data stream mining frameworks often need to process vast amounts of data samples in limited time. Some data mining activities, notably data sample labeling for classification, are too costly or too slow at such a large scale. This dissertation presents two techniques that reduce the amount of labeling needed for data stream classification. The first technique presents a grid-based label selection process that applies to highly imbalanced data streams. Such data streams have one class of data samples vastly outnumbering another class. Many majority class samples need to be labeled before a minority class sample can be found due to the imbalance. The presented technique divides the data samples into groups, called grids, and actively searches for minority class samples that are close by within a grid. Experiment results show the technique can reduce the total number of data samples needed to be labeled. The second technique presents a smart preprocessing technique that reduces the number of times a new learning model needs to be trained due to concept drift. Less model training means fewer data labels are required, and thus costs less. Experiment results show that in some cases the reduced performance of learning models is the result of improper preprocessing of the data, not of concept drift. By adapting preprocessing to the changes in data streams, models can retain high performance without retraining. Acknowledging the high cost of labeling, the dissertation then considers the scenario where labels are unavailable when needed. The framework Sliding Reservoir Approach for Delayed Labeling (SRADL) is presented to explore solutions to this problem. SRADL tries to solve the delayed labeling problem where concept drift occurs and no labels are immediately available. SRADL uses semi-supervised learning by employing a sliding-window approach to store historical data, which is combined with new unlabeled data to train new models. Experiments show that SRADL performs well in some cases of delayed labeling.
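
    The grid-based idea described in the first technique can be sketched roughly as follows: quantize samples into grid cells and, once a labeled minority sample turns up in a cell, spend the remaining labeling budget on other unlabeled samples in that same cell. The cell size, budget, and synthetic data below are assumptions for illustration, not the dissertation's procedure.

        # Illustrative sketch of grid-based label selection for a highly imbalanced stream.
        import numpy as np

        def cell_of(x, cell_size=1.0):
            return tuple(np.floor(x / cell_size).astype(int))

        rng = np.random.default_rng(2)
        X = rng.uniform(0, 10, size=(2000, 2))
        true_y = ((X[:, 0] < 1.5) & (X[:, 1] < 1.5)).astype(int)   # rare minority corner (~2%)

        budget, labeled, minority_found = 200, [], 0
        hot_cells = set()                              # cells where a minority sample was seen
        remaining = list(rng.permutation(len(X)))
        while budget > 0 and remaining:
            # Prefer an unlabeled sample from a "hot" cell; otherwise take the next random one.
            idx = next((i for i in remaining if cell_of(X[i]) in hot_cells), remaining[0])
            remaining.remove(idx)
            budget -= 1
            labeled.append(idx)                        # this is where the operator would be asked
            if true_y[idx] == 1:
                minority_found += 1
                hot_cells.add(cell_of(X[idx]))         # focus future queries on this cell
        print("labels used:", len(labeled), "minority samples found:", minority_found)
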
    Next, the dissertation considers improving solutions for the challenge of dynamism within data streams, most notably concept drift. The complex nature of concept drift means that most existing detection algorithms can only detect limited types of concept drift. To detect more types of concept drift, an ensemble approach that employs various algorithms, called the Heuristic Ensemble Framework for Concept Drift Detection (HEFDD), is presented. The occurrence of each type of concept drift is voted on by the detection results of each algorithm in the ensemble; types of concept drift that receive a majority of votes are declared detected. Experiment results show that HEFDD is able to improve detection accuracy significantly while reducing false positives. With the ability to detect various types of concept drift provided by HEFDD, the dissertation tries to improve the delayed labeling framework SRADL. A new combined framework, SRADL-HEFDD, is presented, which produces synthetic labels to handle the unavailability of labels from human experts. SRADL-HEFDD employs different synthetic labeling techniques based on the different types of drift detected by HEFDD. Experimental results show that, compared to the default SRADL, the combined framework improves prediction performance when only a small amount of labeled samples is available. Finally, as machine learning applications are increasingly used in critical domains such as medical diagnostics, the accountability, explainability and interpretability of machine learning algorithms need to be considered. Explainable machine learning aims to use a white-box approach for data analytics, which enables learning models to be explained and interpreted by human users. However, few studies have been done on explaining what has changed in a dynamic data stream environment. This dissertation thus presents the Data Stream Explainability (DSE) framework. DSE visualizes changes in data distribution and model classification boundaries between chunks of streaming data. The visualizations can then be used by a data mining researcher to generate explanations of what has changed within the data stream. To show that DSE can help average users better understand data stream mining, a survey was conducted with an expert group and a non-expert group of users. Results show DSE can reduce the gap in understanding of what has changed in data stream mining between the two groups
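
    The voting idea behind HEFDD can be illustrated with a much-simplified stand-in that votes on whether drift occurred at all (rather than per drift type): several simple detectors each cast a vote per data chunk, and drift is declared when a majority agree. The detectors, window sizes, thresholds, and synthetic data below are assumptions, not the actual HEFDD ensemble.

        # Minimal majority-vote drift detection over three simplified detectors.
        import numpy as np

        def mean_shift_vote(reference, chunk, z_threshold=3.0):
            # Vote "drift" if the chunk mean is far from the reference mean (z-test-like check).
            z = abs(chunk.mean() - reference.mean()) / (reference.std() / np.sqrt(len(chunk)) + 1e-9)
            return z > z_threshold

        def variance_shift_vote(reference, chunk, ratio_threshold=2.0):
            # Vote "drift" if the chunk variance changed by more than the given ratio.
            r = chunk.var() / (reference.var() + 1e-9)
            return r > ratio_threshold or r < 1.0 / ratio_threshold

        def range_shift_vote(reference, chunk, tolerance=0.2):
            # Vote "drift" if the chunk's interquartile range moved noticeably.
            ref_iqr = np.subtract(*np.percentile(reference, [75, 25]))
            chk_iqr = np.subtract(*np.percentile(chunk, [75, 25]))
            return abs(chk_iqr - ref_iqr) > tolerance * (ref_iqr + 1e-9)

        def ensemble_drift(reference, chunk):
            votes = [mean_shift_vote(reference, chunk),
                     variance_shift_vote(reference, chunk),
                     range_shift_vote(reference, chunk)]
            return sum(votes) >= 2                      # a majority of detectors must agree

        rng = np.random.default_rng(3)
        reference = rng.normal(0.0, 1.0, 500)           # data seen before the suspected drift
        stable    = rng.normal(0.0, 1.0, 200)
        drifted   = rng.normal(2.0, 1.5, 200)           # both mean and spread change
        print(ensemble_drift(reference, stable), ensemble_drift(reference, drifted))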

    Processing of Graded Signaling Systems

    Get PDF

    Developing wastewater treatment plant process control and operation with data analytics: examples from industry and international treatment plants

    Get PDF
    Instrumentation, control and automation are central to the operation of municipal wastewater treatment plants. Treatment performance can be further improved and secured by processing and analyzing the collected process and equipment data. New challenges from resource efficiency, climate change and aging infrastructure increase the demand for understanding and controlling plant-wide interactions. This study aims to review what needs, barriers, incentives and opportunities Finnish wastewater treatment plants have for developing current process control and operation systems with data analytics. The study is conducted through interviews, thematic analysis and case studies of real-life applications in process industries and international utilities. Results indicate that for many utilities, additional measures for quality assurance of instruments, equipment and controllers are necessary before advanced control strategies can be applied. Readily available data could be used to improve the operational reliability of the process. Fourteen case studies of advanced data processing, analysis and visualization methods used in Finnish and international wastewater treatment plants, as well as in Finnish process industries, are reviewed. Examples include process optimization and quality assurance solutions that have proven benefits in operational use. The applicability of these solutions to the identified development needs is initially evaluated. Some of the examples are estimated to have direct potential for application in Finnish WWTPs. For other case studies, further piloting or research efforts to assess the feasibility and cost-benefits for WWTPs are suggested. As plant operation becomes more centralized and outsourced in the future, the need for applying data analytics is expected to increase
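
    As one hedged example of how readily available process data could support quality assurance of instruments, the sketch below screens a sensor time series for out-of-range readings and flatlined (stuck) periods. The limits, window length, and synthetic signal are assumptions for illustration; they are not drawn from the reviewed case studies.

        # Illustrative sensor sanity checks on a process measurement series:
        # flag out-of-range values and "stuck" (flatlined) windows.
        import numpy as np

        def range_check(values, low, high):
            return (values < low) | (values > high)          # True where a reading is implausible

        def flatline_check(values, window=12, tolerance=1e-3):
            flagged = np.zeros(len(values), dtype=bool)
            for i in range(len(values) - window + 1):
                seg = values[i:i + window]
                if seg.max() - seg.min() < tolerance:        # sensor barely moves -> possibly stuck
                    flagged[i:i + window] = True
            return flagged

        # Synthetic dissolved-oxygen-like signal with a stuck period and one outlier.
        rng = np.random.default_rng(4)
        do_mgl = 2.0 + 0.3 * np.sin(np.linspace(0, 6, 200)) + rng.normal(0, 0.05, 200)
        do_mgl[80:100] = do_mgl[80]       # sensor freezes for 20 samples
        do_mgl[150] = 25.0                # implausible spike
        bad = range_check(do_mgl, low=0.0, high=15.0) | flatline_check(do_mgl)
        print("flagged samples:", int(bad.sum()))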

    Condition Assessment Models for Sewer Pipelines

    Get PDF
    Underground pipeline systems are complex infrastructure systems that have a significant impact on social, environmental and economic aspects. Sewer pipeline networks are considered to be an extremely expensive asset. This study aims to develop condition assessment models for sewer pipeline networks. Seventeen factors affecting the condition of the sewer network were considered for gravity pipelines, in addition to the operating pressure for pressurized pipelines. Two different methodologies were adopted for the models’ development: the first uses an integrated Fuzzy Analytic Network Process (FANP) and Monte Carlo simulation, and the second uses FANP, fuzzy set theory (FST) and Evidential Reasoning (ER). The models’ output is the assessed pipeline condition. In order to collect the necessary data for developing the models, questionnaires were distributed among experts in sewer pipelines in the State of Qatar. In addition, actual data for an existing sewage network in the State of Qatar were used to validate the models’ outputs. The “Ground Disturbance” factor was found to be the most influential, followed by the “Location” factor, with weights of 10.6% and 9.3% for gravity pipelines and 8.8% and 8.6% for pressurized pipelines, respectively. On the other hand, the least influential factor was “Length”, followed by “Diameter”, with weights of 2.2% and 2.5% for gravity pipelines and 2.5% and 2.6% for pressurized pipelines. The developed models were able to satisfactorily assess the conditions of deteriorating sewer pipelines, with an average validity of approximately 85% for the first approach and 86% for the second approach. The developed models are expected to be a useful tool for decision makers to properly plan their inspections and provide effective rehabilitation of sewer networks. Acknowledgements: 1) NPRP grant # NPRP6-357-2-150 from the Qatar National Research Fund (a member of Qatar Foundation); 2) Tarek Zayed, Professor of Civil Engineering at Concordia University, for his support in the analysis part; 3) the Public Works Authority of Qatar (ASHGAL) for their support in the data collection
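
    The Monte Carlo part of the first approach (combining factor weights with uncertain factor scores) can be sketched roughly as follows: each factor gets a weight and an uncertain score sampled repeatedly, yielding a distribution of the overall condition index. The Ground Disturbance and Location weights reuse the figures reported above for gravity pipelines; the lumped remainder, all score ranges, and the 1-5 scale are purely hypothetical, and this is not the FANP or ER model developed in the study.

        # Rough Monte Carlo sketch of a weighted condition index for a sewer pipeline.
        import numpy as np

        rng = np.random.default_rng(5)
        factors = {
            # factor: (weight, (low, high) plausible score on an assumed 1-5 condition scale)
            "ground_disturbance": (0.106, (3.0, 5.0)),
            "location":           (0.093, (2.0, 4.0)),
            "other_factors":      (0.801, (1.0, 4.0)),   # lumped remainder, purely hypothetical
        }

        def simulate_condition(n_runs=10_000):
            total = np.zeros(n_runs)
            for weight, (low, high) in factors.values():
                total += weight * rng.uniform(low, high, n_runs)   # sample each uncertain score
            return total

        scores = simulate_condition()
        print("mean condition index: %.2f, 90%% interval: %.2f-%.2f"
              % (scores.mean(), np.percentile(scores, 5), np.percentile(scores, 95)))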